For feature selection and elimination, this component calculates the Information Value (IV) of each variable over its optimal categories. It also calculates the Weight of Evidence (WOE) of the categorized variables.
Step By Step Guide:
1- First, install the Python Integration extensions, which this component requires in order to run.
2- For better Python node performance, install the pyarrow library.
3- After installing pyarrow, select Apache Arrow as the serialization library under Preferences. This option performs much better than Flatbuffers Column Serialization.
4- Then specify the desired IV threshold, the target (label) column, and its bad category in the dialog window. The target must be of string type for this component to run.
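To illustrate how the three dialog settings interact, here is a minimal Python sketch of a WOE/IV calculation. It is not the component's actual code. The function names (woe_iv, select_by_iv), the 0.5 smoothing for empty cells, the WOE sign convention ln(good share / bad share), and the rule of keeping variables whose IV is at or above the threshold are all assumptions made for the example. The sketch also assumes the variables are already categorized, so it does not cover the optimal categorization step.

```python
import math
from collections import Counter


def woe_iv(categories, target, bad_label):
    """Return ({category: WOE}, IV) for one already-categorized variable.

    Assumed convention: WOE = ln(share of goods / share of bads).
    """
    bad, good = Counter(), Counter()
    for cat, label in zip(categories, target):
        if label == bad_label:
            bad[cat] += 1
        else:
            good[cat] += 1

    total_bad, total_good = sum(bad.values()), sum(good.values())
    if total_bad == 0 or total_good == 0:
        raise ValueError("target must contain both bad and non-bad rows")

    woe, iv = {}, 0.0
    for cat in sorted(set(bad) | set(good)):
        # Assumption: replace an empty cell with 0.5 to avoid log(0).
        bad_share = (bad[cat] or 0.5) / total_bad
        good_share = (good[cat] or 0.5) / total_good
        woe[cat] = math.log(good_share / bad_share)
        iv += (good_share - bad_share) * woe[cat]
    return woe, iv


def select_by_iv(columns, target, bad_label, iv_threshold):
    """Return {column name: IV} for columns whose IV meets the threshold."""
    kept = {}
    for name, values in columns.items():
        _, iv = woe_iv(values, target, bad_label)
        if iv >= iv_threshold:
            kept[name] = iv
    return kept


# Hypothetical data: the target is a string column and "1" is the bad category.
target = ["1", "0", "0", "0", "1", "1", "1", "0"]
columns = {
    "segment": ["A", "A", "A", "A", "B", "B", "B", "B"],
    "noise": ["X", "Y", "Y", "X", "X", "Y", "Y", "X"],
}

segment_woe, segment_iv = woe_iv(columns["segment"], target, "1")
selected = select_by_iv(columns, target, "1", iv_threshold=0.02)
print(segment_woe, segment_iv, sorted(selected))
```

In this toy data, "noise" has the same share of goods and bads in each category, so its IV is zero and it falls below the threshold, while "segment" is kept.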
- Type: Table, Data: Raw Data