Calculates a correlation coefficient for each pair of selected columns, i.e. a measure of how strongly the two variables are correlated.
All measures are based on the ranks of the cells, where the rank of a cell value is its position in a sorted list of all entries. All correlation measures can be calculated on any kind of DataColumn. Note, however, that the default ordering of the values is used; if the column defines no ordering, the string representation of the values is used instead.
Spearman's rank correlation coefficient is a statistical measure of the strength of a monotonic relationship between paired data. A monotonic relationship is a relationship between ordered sets that preserves the given order, i.e., the dependent variable either never decreases or never increases as the independent variable increases. The value of this measure ranges from −1 (strong negative correlation) to +1 (strong positive correlation). A perfect Spearman correlation of +1 or −1 occurs when each of the variables is a perfect monotone function of the other.
Goodman and Kruskal's gamma and Kendall's tau rank correlation coefficient are used to measure the strength of association between two measured quantities. Both are based on the number of concordant and discordant pairs. Kendall's Tau A and Tau B coefficients can be considered standardized forms of gamma. The difference between them is that Tau A makes no adjustment for tied values, while Tau B does. Tied observations are two or more observations that have the same value. Both Goodman and Kruskal's gamma and Kendall's Tau A are mostly suitable for square tables, whereas Tau B is most appropriately used for rectangular tables. The coefficients range from −1 (100% negative association, or perfect inversion) to +1 (100% positive association, or perfect agreement). A value of zero indicates the absence of association.
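The measures described above can be sketched in a few lines of plain Python. This is only an illustration of the textbook definitions, not the node's implementation: all function names are hypothetical, tied values are assumed to receive the average of their positions when ranking, and constant columns (which would cause a division by zero) are not handled.

```python
from itertools import combinations
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions.

    Assumption: average ranks for ties (the node's tie handling may differ).
    """
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value equals the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equally long numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman's rho: the Pearson correlation of the two rank vectors."""
    return pearson(average_ranks(x), average_ranks(y))


def pair_counts(x, y):
    """Count concordant, discordant, x-only-tied and y-only-tied pairs."""
    conc = disc = tie_x = tie_y = 0
    for i, j in combinations(range(len(x)), 2):
        dx = (x[i] > x[j]) - (x[i] < x[j])  # sign of the difference in x
        dy = (y[i] > y[j]) - (y[i] < y[j])  # sign of the difference in y
        if dx * dy > 0:
            conc += 1
        elif dx * dy < 0:
            disc += 1
        elif dx == 0 and dy != 0:
            tie_x += 1
        elif dy == 0 and dx != 0:
            tie_y += 1
    return conc, disc, tie_x, tie_y


def gamma(x, y):
    """Goodman and Kruskal's gamma: ties are left out of the denominator."""
    c, d, _, _ = pair_counts(x, y)
    return (c - d) / (c + d)


def kendall_tau_a(x, y):
    """Kendall's Tau A: divides by the number of all pairs, no tie adjustment."""
    c, d, _, _ = pair_counts(x, y)
    n = len(x)
    return (c - d) / (n * (n - 1) / 2)


def kendall_tau_b(x, y):
    """Kendall's Tau B: the denominator is adjusted for ties in x and in y."""
    c, d, tx, ty = pair_counts(x, y)
    return (c - d) / sqrt((c + d + tx) * (c + d + ty))


x, y = [1, 2, 2, 3], [1, 2, 3, 3]   # one tie in each column
print(gamma(x, y))                    # 1.0
print(round(kendall_tau_a(x, y), 3))  # 0.667
print(kendall_tau_b(x, y))            # 0.8
```

On the tied example, the three concordance-based measures differ only in their denominators: gamma ignores tied pairs entirely, Tau A divides by all six pairs, and Tau B sits in between because it corrects for the ties.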
Rows containing missing values are ignored, i.e. they are not used in the calculations. If a different treatment is required, handle the missing values before this node.
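The effect of ignoring such rows can be pictured as dropping every incomplete row before the coefficients are computed. The sketch below is a minimal illustration under stated assumptions, not the node's code: the function name is hypothetical, each row is assumed to hold only the selected columns, and a missing cell is represented as `None` or NaN.

```python
import math


def drop_incomplete_rows(rows):
    """Keep only rows in which no cell is missing (None or NaN)."""

    def is_missing(cell):
        return cell is None or (isinstance(cell, float) and math.isnan(cell))

    return [row for row in rows if not any(is_missing(cell) for cell in row)]


rows = [(1, 2.0), (None, 3.0), (4, float("nan")), (5, 6.0)]
print(drop_incomplete_rows(rows))  # [(1, 2.0), (5, 6.0)]
```

Because whole rows are removed, a single missing cell also removes that row's values in the other columns from every pairwise calculation.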
- Type: Data. Numeric input data to evaluate.
- Type: Data. Correlation values arranged in a square matrix.
- Type: Correlation. A model containing the correlation measures. This model can be read by the Correlation Filter node.
- Type: Data. A table containing the ranks of the columns, where the rank of a value is its position in the sorted column.
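As a rough picture of the square correlation matrix listed among the ports above, the following sketch fills a symmetric table by evaluating one measure for every pair of columns. It is an assumption-laden illustration: the function names are hypothetical, Kendall's Tau A stands in for whichever measure is selected, and the diagonal is simply set to 1.0.

```python
from itertools import combinations


def tau_a(x, y):
    """Kendall's Tau A, used here only as an example measure."""
    total = 0
    for i, j in combinations(range(len(x)), 2):
        dx = (x[i] > x[j]) - (x[i] < x[j])
        dy = (y[i] > y[j]) - (y[i] < y[j])
        total += dx * dy  # +1 concordant, -1 discordant, 0 tied
    n = len(x)
    return total / (n * (n - 1) / 2)


def correlation_matrix(columns, measure):
    """Build a symmetric {column: {column: value}} table for all column pairs."""
    names = list(columns)
    # Assumption: each column is perfectly correlated with itself.
    matrix = {name: {name: 1.0} for name in names}
    for a, b in combinations(names, 2):
        value = measure(columns[a], columns[b])
        matrix[a][b] = value
        matrix[b][a] = value  # the measure is symmetric in its arguments
    return matrix


columns = {"a": [1, 2, 3], "b": [3, 2, 1], "c": [1, 3, 2]}
matrix = correlation_matrix(columns, tau_a)
print(matrix["a"]["b"])  # -1.0
```

Each pair is evaluated once and mirrored, so the table has one row and one column per selected input column.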
Analytics > Statistics
Make sure to have this extension installed:
Update site for KNIME Analytics Platform 3.7: KNIME Analytics Platform 3.7 Update Site