Discover relationship between three categorical features
========================================================
Draws a mosaic plot of three categorical variables. Variable values (labels) are abbreviated to four letters in the graph. All three columns must be of string type.
The output plot shows both the p-value of the chi-square test and the Pearson residuals. The null hypothesis is that there is no relationship between the features, i.e. that all three are independent. As a rule of thumb, a p-value less than 0.05 indicates a relationship.
Pearson residual = (obs - exp) / sqrt(exp), where obs is the observed count in a cell and exp is the count expected under the null hypothesis.
Pearson residuals are calculated for each cell. As a rule of thumb, a cell whose Pearson residual is 3 or greater in absolute value contributes strongly to the relationship. The intensity of colour in each cell also indicates the extent to which observed values deviate from expected values.
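As a rough illustration of the formula above, the following Python sketch computes Pearson residuals for a three-way table under the null hypothesis of mutual independence, where the expected count of a cell is n times the product of the three marginal proportions. It is not the component's implementation (the component relies on R's vcd package); the function names pearson_residuals and chi_square and the sample data are hypothetical and chosen only for this example.

```python
from collections import Counter
from itertools import product
from math import sqrt


def pearson_residuals(rows):
    """Return {(a, b, c): residual} for an iterable of (a, b, c) label triples.

    Assumes mutual independence as the null model:
    exp = n * P(a) * P(b) * P(c), estimated from the marginal counts.
    """
    rows = list(rows)
    n = len(rows)
    observed = Counter(rows)
    # One marginal count table per variable (position 0, 1, 2 in each triple).
    margins = [Counter(row[i] for row in rows) for i in range(3)]

    residuals = {}
    # Visit every combination of levels, including cells with zero observations.
    for cell in product(*(sorted(m) for m in margins)):
        expected = n
        for level, margin in zip(cell, margins):
            expected *= margin[level] / n
        residuals[cell] = (observed[cell] - expected) / sqrt(expected)
    return residuals


def chi_square(residuals):
    """Pearson chi-square statistic: the sum of squared Pearson residuals."""
    return sum(r * r for r in residuals.values())


# Hypothetical data: two perfectly associated groups of four rows each.
sample = [("a", "x", "p")] * 4 + [("b", "y", "q")] * 4
result = pearson_residuals(sample)
print(result[("a", "x", "p")])  # 3.0: observed 4, expected 1
print(chi_square(result))       # 24.0
```

In this toy table every cell has an expected count of 1, so the two populated cells reach a residual of 3 while the six empty cells have a residual of -1. Turning the statistic into a p-value additionally requires the chi-square distribution, which is omitted here.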
Note that mosaic() splits the data in the order in which the variables are provided: first on the first categorical feature, then on the second, and finally on the third.
For more about mosaic plots, see Wikipedia (https://en.wikipedia.org/wiki/Mosaic_plot). For more about Pearson residuals, see https://www.statology.org/pearson-residuals/ and, for a more thorough overview, https://www.datavis.ca/courses/VCD/vcd-tutorial.pdf. The component requires R's vcd package.
- Port 1 (Type: Table): Input data may be a KNIME data frame.