- Type: Table. Table: Input table
This component calculates the Variance Inflation Factor (VIF) for every numeric column in the input data table. It can be used to identify and remove collinear features before fitting a regression. Multicollinearity occurs when two or more columns are correlated with each other and therefore provide redundant information when used together as predictors in a model. VIF diagnoses the extent of multicollinearity among the predictors of a model. For instance, a VIF of 3 means that the variance of the estimated regression coefficient for that column is 3 times larger than it would be if the column were completely uncorrelated with all other predictors. As a rule of thumb, columns with a VIF higher than 5 should be considered for removal as predictors, which reduces dimensionality while limiting collinearity (James et al., 2014). The interactive view of the component shows the VIF values and highlights those above the threshold.

References: James, Gareth, Daniela Witten, Trevor Hastie, and Robert Tibshirani. 2014. An Introduction to Statistical Learning: With Applications in R. Springer Publishing Company, Incorporated.

This component is free to use and modify. Author: Andrea De Mauro, aboutbigdata.net
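For readers who want to see the calculation behind the description above, the VIF of column j is 1 / (1 - R_j^2), where R_j^2 is the R-squared obtained by regressing column j on all the other columns. Equivalently, the VIFs are the diagonal entries of the inverse of the correlation matrix. The following is a minimal pure-Python sketch of that definition, not the component's actual implementation, which is not shown here. The function names (`correlation_matrix`, `invert`, `vif`, `above_threshold`), the sample table, and the default threshold of 5 (taken from the rule of thumb above) are illustrative assumptions.

```python
from statistics import mean, pstdev


def correlation_matrix(columns):
    """Pearson correlation matrix for a list of equal-length numeric columns.

    Assumes no column is constant (a constant column has zero standard deviation).
    """
    n = len(columns[0])
    standardized = []
    for col in columns:
        mu, sigma = mean(col), pstdev(col)
        standardized.append([(x - mu) / sigma for x in col])
    return [
        [sum(a * b for a, b in zip(ci, cj)) / n for cj in standardized]
        for ci in standardized
    ]


def invert(matrix):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting."""
    size = len(matrix)
    # Augment the matrix with the identity: [M | I] becomes [I | M^-1].
    aug = [list(row) + [float(i == j) for j in range(size)]
           for i, row in enumerate(matrix)]
    for col in range(size):
        pivot = max(range(col, size), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("perfectly collinear columns: VIF is infinite")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(size):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    return [row[size:] for row in aug]


def vif(table):
    """Return {column name: VIF} for a dict mapping column names to number lists."""
    names = list(table)
    inverse = invert(correlation_matrix([table[name] for name in names]))
    # The j-th diagonal entry of the inverse correlation matrix is 1 / (1 - R_j^2).
    return {name: inverse[i][i] for i, name in enumerate(names)}


def above_threshold(vifs, threshold=5.0):
    """Names of the columns whose VIF exceeds the threshold (assumed default: 5)."""
    return [name for name, value in vifs.items() if value > threshold]


# Hypothetical sample data: x2 closely tracks x1, x3 is only weakly related.
sample = {
    "x1": [1, 2, 3, 4],
    "x2": [1, 2, 3, 5],
    "x3": [1, -1, -1, 1],
}
values = vif(sample)
print(values)
print(above_threshold(values))
```

Inverting the correlation matrix avoids fitting one regression per column, but it fails when columns are perfectly collinear, which is why the sketch raises an error in that case instead of returning an infinite VIF.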
- Type: Table. VIF values: Table including VIF values for each column in the input table
Created with KNIME Analytics Platform version 4.3.2