This workflow shows an example of parameter optimization for a logistic regression model. Two parameters of the logistic regression are optimized: step size and variance.
The candidate parameter values are output as flow variables by the Parameter Optimization Loop Start node. These flow variables overwrite the parameter settings of the logistic regression algorithm, so a model with different settings is trained in each iteration.
Since this is a binary classification problem, we can use the ROC Curve node and create a flow variable holding the AUC in each iteration. This value is then fed into the Parameter Optimization Loop End node. The end node compares the AUC values across iterations and supplies the best result at its first output. As the search strategy we use "Hill Climbing".
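The hill climbing search described above can be illustrated outside KNIME with a short Python sketch. Everything here is an assumption made for illustration: the function `hill_climb`, the candidate value lists, the starting point, and the toy objective `toy_auc` (a stand-in for the AUC that the workflow measures) are hypothetical, and the node's actual implementation may differ, for instance in how the starting point is chosen.

```python
def hill_climb(score, grid, start_index, max_iter=100):
    """Greedy hill climbing over a discrete parameter grid.

    score       -- callable mapping a parameter dict to a number to maximize
    grid        -- dict: parameter name -> ordered list of candidate values
    start_index -- dict: parameter name -> index of the starting value
    """
    names = list(grid)
    current = dict(start_index)

    def params(index):
        # Translate grid indices into actual parameter values.
        return {name: grid[name][index[name]] for name in names}

    best_score = score(params(current))
    for _ in range(max_iter):
        # Direct neighbours: change one parameter by one grid step.
        neighbours = []
        for name in names:
            for delta in (-1, 1):
                i = current[name] + delta
                if 0 <= i < len(grid[name]):
                    neighbour = dict(current)
                    neighbour[name] = i
                    neighbours.append(neighbour)
        if not neighbours:
            break
        top = max(neighbours, key=lambda nb: score(params(nb)))
        top_score = score(params(top))
        if top_score <= best_score:
            break  # no neighbour improves the objective: local optimum
        current, best_score = top, top_score
    return params(current), best_score


def toy_auc(p):
    # Hypothetical objective standing in for the cross-validated AUC.
    # It peaks at step_size=0.1 and variance=1.0 by construction.
    return 1.0 - (p["step_size"] - 0.1) ** 2 - 0.01 * (p["variance"] - 1.0) ** 2


grid = {
    "step_size": [0.01, 0.05, 0.1, 0.5, 1.0],
    "variance": [0.1, 0.5, 1.0, 5.0, 10.0],
}
best_params, best_score = hill_climb(toy_auc, grid, {"step_size": 0, "variance": 0})
print(best_params, best_score)
```

Hill climbing only finds a local optimum, so on a real objective the result can depend on the starting point; the toy objective here has a single peak, which hides that limitation.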
Please note that cross validation is applied in each iteration, so that every combination of parameters is tested over different folds and its performance is averaged. This helps avoid selecting a set of parameter values that performs well only on the training data but does not work for new data. In other words, we avoid overfitting the parameters to a single training partition.
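The idea of averaging the AUC over folds can be sketched in plain Python. This is a minimal illustration under stated assumptions, not a reproduction of the KNIME nodes: the helpers `auc`, `k_fold_indices`, and `cross_validated_auc` are hypothetical names, the data is made up, and `fit_direction` is a trivial stand-in for the logistic regression learner.

```python
def auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("AUC needs both classes in the test fold")
    # A tie between a positive and a negative score counts as half a win.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def k_fold_indices(n, k):
    """Return k (train, test) index pairs; row i goes to test fold i % k."""
    folds = []
    for fold in range(k):
        test = [i for i in range(n) if i % k == fold]
        train = [i for i in range(n) if i % k != fold]
        folds.append((train, test))
    return folds


def cross_validated_auc(fit, X, y, k):
    """Train on each training split, score the held-out fold, average the AUCs."""
    fold_aucs = []
    for train, test in k_fold_indices(len(X), k):
        model = fit([X[i] for i in train], [y[i] for i in train])
        scores = [model(X[i]) for i in test]
        fold_aucs.append(auc([y[i] for i in test], scores))
    return sum(fold_aucs) / len(fold_aucs)


def fit_direction(X_train, y_train):
    # Hypothetical stand-in for a real learner: scores a row by its single
    # feature, oriented so that the positive class gets higher scores.
    pos = [x[0] for x, l in zip(X_train, y_train) if l == 1]
    neg = [x[0] for x, l in zip(X_train, y_train) if l == 0]
    sign = 1.0 if sum(pos) / len(pos) >= sum(neg) / len(neg) else -1.0
    return lambda x: sign * x[0]


# Made-up, perfectly separable data: 12 rows with one feature each.
X = [[float(i)] for i in range(12)]
y = [0 if i < 6 else 1 for i in range(12)]
mean_auc = cross_validated_auc(fit_direction, X, y, k=3)
print(mean_auc)
```

In a parameter search, a function like `cross_validated_auc` would play the role of the objective: it is evaluated once per parameter combination, and the averaged value is what the optimizer compares.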
Workflow
Parameter Optimization Loop with Cross Validation
Used extensions & nodes
Created with KNIME Analytics Platform version 4.5.2
Legal
By using or downloading the workflow, you agree to our terms and conditions.