This exercise covers basic statistics, data distributions, outliers, and different scales.

Steps:
- Read the Excel file of Boston house prices, available in the folder data/, and get rid of the first index column.
- Check the data distribution of the prices (column MEDV). Are there any outliers?
- Check the distribution of values for all columns, and check whether there are any missing values.
- Learn about the basic statistical indicators.
- Remove the rows containing outliers of MEDV and check the result (to avoid leaving any outliers, the IQR multiplier should be set to 1).
- Use the box plot to check for outliers.
- Normalize your data using the z-score.
- Check the box plots and statistics again (the resulting standard deviation should be 1 for all columns).
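For readers who want to see the arithmetic behind the outlier and normalization steps outside the workflow, here is a minimal Python sketch. It is not the workflow's implementation. It assumes that outliers are the values outside Q1 - k*IQR and Q3 + k*IQR with k = 1, that quartiles use linear interpolation, and that the z-score divides by the sample standard deviation; the nodes in the workflow may be configured differently. The helper names and the toy rows are hypothetical; only the column name MEDV comes from the exercise.

```python
import statistics


def iqr_bounds(values, k=1.0):
    """Return the (lower, upper) fences Q1 - k*IQR and Q3 + k*IQR."""
    # Assumption: quartiles by linear interpolation ("inclusive" method).
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def remove_outlier_rows(rows, column, k=1.0):
    """Keep only the rows whose value in `column` lies inside the fences."""
    lower, upper = iqr_bounds([row[column] for row in rows], k)
    return [row for row in rows if lower <= row[column] <= upper]


def z_score(values):
    """Scale values to mean 0 and sample standard deviation 1."""
    mean = statistics.mean(values)
    std = statistics.stdev(values)  # assumption: sample standard deviation
    return [(v - mean) / std for v in values]


# Hypothetical toy rows; the values are made up and are not the real data.
rows = [
    {"MEDV": 21.0, "RM": 6.1},
    {"MEDV": 22.5, "RM": 6.4},
    {"MEDV": 19.8, "RM": 5.9},
    {"MEDV": 24.0, "RM": 6.8},
    {"MEDV": 20.4, "RM": 6.0},
    {"MEDV": 50.0, "RM": 8.3},  # far above the rest, so it gets removed
]

kept = remove_outlier_rows(rows, "MEDV", k=1.0)
medv_z = z_score([row["MEDV"] for row in kept])

print(len(rows), "rows before,", len(kept), "rows after outlier removal")
print("std of normalized MEDV:", round(statistics.stdev(medv_z), 6))
```

A smaller multiplier k narrows the fences, so more rows are treated as outliers and removed. After the z-score step, the standard deviation of each normalized column is 1 by construction, which is what the final check in the exercise looks for.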
Used extensions & nodes
Created with KNIME Analytics Platform version 4.2.1