This component provides several important linear regression diagnostics that are missing from the standard KNIME nodes.
Use the dialog to select the target variable of the linear regression model.
Attention: The component does not scale well to large datasets (more than 5,6 k rows).
Running it on large datasets can take a long time.
Connecting this node to the "Linear Regression Predictor" lets you create a dashboard that helps you:
- Identify influential rows in OLS estimates using Cook's distance;
- Inspect the distribution of errors (it should be approximately normal);
- Visualize the scatterplot of the predicted values vs. the true values;
- Visualize the scatterplot of the errors vs. the predicted values;
- Visualize the scatterplot of the errors vs. the true values;
- View the ANOVA table and the F statistic with its p-value;
- Check the residual standard error.
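
To make the quantities in the list above concrete, here is a minimal sketch of how Cook's distance, the F statistic, and the residual standard error can be computed. It is not the component's implementation: it assumes a regression with a single predictor and an intercept so that it needs only the Python standard library, and the function name `simple_ols_diagnostics` and the sample data are hypothetical. The p-value is left out because it requires the F distribution, which the standard library does not provide.

```python
import math


def simple_ols_diagnostics(x, y):
    """Fit y = intercept + slope * x by OLS and return basic diagnostics.

    Illustrative only: assumes one predictor plus an intercept (p = 2).
    """
    n = len(x)
    p = 2  # estimated coefficients: intercept and slope

    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))

    slope = sxy / sxx
    intercept = y_mean - slope * x_mean

    fitted = [intercept + slope * xi for xi in x]
    residuals = [yi - fi for yi, fi in zip(y, fitted)]

    rss = sum(e * e for e in residuals)           # residual sum of squares
    tss = sum((yi - y_mean) ** 2 for yi in y)     # total sum of squares
    mse = rss / (n - p)                           # residual mean square

    # Leverage of each row (diagonal of the hat matrix) for simple regression.
    leverage = [1 / n + (xi - x_mean) ** 2 / sxx for xi in x]

    # Cook's distance: D_i = e_i^2 / (p * MSE) * h_i / (1 - h_i)^2
    cooks = [
        e * e / (p * mse) * h / (1 - h) ** 2
        for e, h in zip(residuals, leverage)
    ]

    # ANOVA F statistic: regression mean square over residual mean square.
    f_statistic = ((tss - rss) / (p - 1)) / mse

    return {
        "intercept": intercept,
        "slope": slope,
        "residuals": residuals,
        "leverage": leverage,
        "cooks_distance": cooks,
        "f_statistic": f_statistic,
        "residual_standard_error": math.sqrt(mse),
    }


# Hypothetical data: the last row is far from the trend of the others.
x = [1, 2, 3, 4, 5]
y = [1, 2, 3, 4, 10]
diag = simple_ols_diagnostics(x, y)

# Row with the largest Cook's distance, i.e. the most influential row.
most_influential = max(range(len(x)), key=lambda i: diag["cooks_distance"][i])
print("most influential row index:", most_influential)
print("F statistic:", round(diag["f_statistic"], 4))
print("residual standard error:", round(diag["residual_standard_error"], 4))
```

A common rule of thumb flags rows whose Cook's distance exceeds 1 or 4/n, but such thresholds are conventions rather than something this component prescribes.
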
- PMML Linear Regression (Type: PMML): Connect the estimated regression model using the PMML port.
- Coefficients table (Type: Table): Connect to the "Coefficients and Statistics" data output of the Linear Regression Learner node.
- Linear Regression input table (Type: Table): Connect to the same table that feeds the Linear Regression Learner node.