s_401 - prepare label encoding with spark
Prepare data for machine learning in a big data environment:
- label encode string variables
- transform numbers into Double format (Spark ML likes that)
- remove highly correlated data
- remove NaN variables
- remove continuous variables
- optional: normalize the data
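
The logic of the steps above can be sketched in plain Python. This is only an illustration under stated assumptions, not the workflow's actual Spark nodes: the column-dictionary layout, the function names (`label_encode`, `pearson`, `prepare`) and the `corr_threshold` default of 0.9 are all hypothetical. Ordering labels by descending frequency is likewise an assumption. The "remove continuous variables" step is left out because the description does not say which criterion it uses.

```python
import math
from collections import Counter


def label_encode(values):
    """Map each distinct string to an index, most frequent first (ties alphabetical)."""
    counts = Counter(values)
    ordered = sorted(counts, key=lambda v: (-counts[v], v))
    mapping = {v: float(i) for i, v in enumerate(ordered)}
    return [mapping[v] for v in values], mapping


def pearson(xs, ys):
    """Pearson correlation of two equally long numeric lists (0.0 if one is constant)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy) if sx and sy else 0.0


def prepare(table, corr_threshold=0.9, normalize=False):
    """table: dict of column name -> list of values (hypothetical layout)."""
    out = {}
    for name, col in table.items():
        if all(isinstance(v, str) for v in col):
            col, _ = label_encode(col)        # label encode string columns
        else:
            col = [float(v) for v in col]     # cast numbers to double precision
        if any(math.isnan(v) for v in col):   # drop columns that contain NaN
            continue
        out[name] = col

    # Keep a column only if it is not highly correlated with one already kept.
    kept = []
    for name in out:
        if all(abs(pearson(out[name], out[k])) < corr_threshold for k in kept):
            kept.append(name)
    out = {k: out[k] for k in kept}

    if normalize:                             # optional min-max scaling to [0, 1]
        for name, col in out.items():
            lo, hi = min(col), max(col)
            out[name] = [(v - lo) / (hi - lo) if hi > lo else 0.0 for v in col]
    return out


table = {
    "color": ["red", "blue", "red", "green"],
    "x": [1, 2, 3, 4],
    "x2": [2, 4, 6, 8],                       # perfectly correlated with "x"
    "bad": [1.0, float("nan"), 3.0, 4.0],     # contains NaN
}
result = prepare(table, normalize=True)
```

With this input, `x2` is dropped as a duplicate of `x` and `bad` is dropped for its NaN, leaving `color` and `x`. In Spark the same ideas would run on distributed DataFrames rather than in-memory lists.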
Used extensions & nodes
Created with KNIME Analytics Platform version 4.2.0