Correlation Pairs To Matrix
The Correlation Pairs to Matrix node is designed to take a list of Input Correlation Pairs and convert it into an equivalent Correlation Matrix.
The Correlation Matrix represents the degree of Horizontal Differentiation between Features, Benefits, Attributes, Levels, and Products. The Correlation Matrix may be used by a downstream node (such as the Matrix Distributions node or the Feature Generation node) to generate a set of Customer Distributions comprising the Willingness To Pay (WTP) of individual Virtual Customers.
For example, the list of Input Correlation Pairs would individually list the correlations between the 'A', 'B', and 'C' Customer Distributions as A:B, A:C, and B:C pairs. The Output Correlation Matrix would then be a 3x3 matrix of the same correlation values (doubles between -1.0 and +1.0) with row names and column names of A, B, and C. The matrix describes all the correlations between Customer Distribution A, Customer Distribution B, and Customer Distribution C.
The Input Correlation Pairs are first converted into a clean and symmetrical Correlation Matrix. That means: (a) the diagonal A:A, B:B, C:C correlations are set to 1.0; (b) correlation values are range-limited to between -1.0 and +1.0; (c) missing correlations are set to 0.0; and (d) the correlation for A:B is set the same as the correlation for B:A. Correlation values at the bottom of the Input Correlation Pairs table supersede correlation values at the top of the input table.
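The cleaning rules above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the node's actual implementation: the function name `pairs_to_matrix`, the tuple layout `(from, to, correlation)`, and the ordering of names by first appearance are all hypothetical.

```python
def pairs_to_matrix(pairs):
    """Build a clean, symmetrical correlation matrix from (from, to, value) rows.

    Hypothetical helper: the row/column order (first appearance) is an assumption.
    """
    # Collect distribution names in order of first appearance.
    names = []
    for first, second, _ in pairs:
        for name in (first, second):
            if name not in names:
                names.append(name)

    # (c) Missing correlations default to 0.0.
    matrix = {row: {col: 0.0 for col in names} for row in names}

    # Rows are applied top to bottom, so later rows supersede earlier ones.
    for first, second, value in pairs:
        clipped = max(-1.0, min(1.0, float(value)))  # (b) range-limit
        matrix[first][second] = clipped
        matrix[second][first] = clipped              # (d) A:B equals B:A

    # (a) Set the diagonal last so that an A:A input row cannot override it.
    for name in names:
        matrix[name][name] = 1.0
    return names, matrix


pairs = [("A", "B", 0.5), ("A", "C", 1.7), ("B", "A", 0.2)]
names, matrix = pairs_to_matrix(pairs)
for row in names:
    print(row, [matrix[row][col] for col in names])
```

In this sketch the later B:A row replaces the earlier A:B value, the out-of-range A:C value is clipped to +1.0, and the missing B:C correlation stays at 0.0.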
The purpose of this node is to give the user flexibility when setting and managing the Horizontal Differentiation (correlations) between Customer Distributions. A downstream Feature Generation node or Matrix Distributions node requires an Input Correlation Matrix to generate a set of Customer Distributions. With this node, the user can edit the Correlation Pairs list, scale the Correlation Pairs list so that each Customer Distribution is more or less correlated with the other Customer Distributions, or concatenate the Correlation Pairs list with another set of correlations developed elsewhere. Working with a list can be easier than working with a matrix.
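As a rough illustration of why a list is convenient, the following Python sketch scales one pairs list and appends a second list of overrides. The helper `scale_pairs` and the sample values are hypothetical. In a real workflow these steps would be done with upstream table-manipulation nodes rather than code.

```python
def scale_pairs(pairs, factor):
    """Multiply every correlation in a pairs list by a common factor (hypothetical helper)."""
    return [(first, second, value * factor) for first, second, value in pairs]


base = [("A", "B", 0.8), ("A", "C", -0.4)]
overrides = [("B", "A", 0.1)]  # correlations developed elsewhere

# Halve all base correlations, then append the overrides at the bottom
# so that they would supersede the scaled A:B value during conversion.
combined = scale_pairs(base, 0.5) + overrides
print(combined)
```

Because the overrides sit at the bottom of the combined list, the B:A row would take precedence over the scaled A:B row when the list is converted into a matrix.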
More Help: Examples and sample workflows can be found at the Scientific Strategy website: www.scientificstrategy.com.
Input ports
- Type: Data Input Correlation Pairs: The input set of correlations as a list of pairs. Each pair should quantify the correlation between a single row and a single column for all unique row-column combinations for the Output Correlation Matrix. The Input Correlation Pairs should include the following columns:
- From Distribution (string): The name of the first Customer Distribution for a row/column within the Output Correlation Matrix
- To Distribution (string): The name of the second Customer Distribution for a column/row within the Output Correlation Matrix
- Correlation (double): The degree of correlation between the first Customer Distribution and the second Customer Distribution. If multiple correlations are provided for A:B or B:A, then the correlation found furthest down the input table will be used. This allows the user to append new rows to the bottom of the Input Correlation Pairs table and be confident that the new values will override the old values.
Output ports
- Type: Data Output Correlation Matrix: The output set of correlations that define the relationship between Customer Distributions. The Correlation Matrix will be square and symmetrical: the number of data rows matches the number of correlation columns. Each row Distribution Name will be unique and correspond to a column of the same name. The Output Correlation Matrix will contain these columns:
- Distribution: The row name of the first Customer Distribution within the Output Correlation Matrix.
- Correlated Distributions: The column name of the second Customer Distribution within the Output Correlation Matrix, along with the degree of correlation to the row Customer Distribution. Output correlations will be symmetrical and range-limited to between -1.0 and +1.0.
- Type: Data Output Correlation Repaired Matrix: The repaired output set of correlations that define the relationship between Customer Distributions. Repairing is required when the correlations are mutually inconsistent. For example, if A is highly correlated with B (for example, A:B = +0.99) and A is highly correlated with C (for example, A:C = +0.99), then B must be highly correlated with C (that is, B:C >> 0.0). More precisely, the Correlation Matrix must be positive definite, meaning that all of its Eigenvalues are positive. Note that downstream nodes that generate Customer Distributions (such as the Matrix Distributions node or the Feature Generation node) do not need to use this Correlation Repaired Matrix, as these downstream nodes will always first self-repair the Input Correlation Matrix. The Output Correlation Repaired Matrix will contain the same columns as the Output Correlation Matrix:
- Distribution: The row name of the first Customer Distribution within the Output Correlation Repaired Matrix.
- Correlated Distributions: The column name of the second Customer Distribution within the Output Correlation Repaired Matrix, along with the repaired degree of correlation to the row Customer Distribution. Output correlations will be symmetrical and range-limited to between -1.0 and +1.0.
- Type: Data Output Correlation Error Matrix: The difference between the Output Correlation Matrix and the Output Correlation Repaired Matrix. This is a convenience output to show how the Correlation Matrix needs to be repaired before Customer Distributions can be generated. The Output Correlation Error Matrix will contain the same columns as the Output Correlation Matrix:
- Distribution: The row name of the first Customer Distribution within the Output Correlation Error Matrix.
- Correlated Distributions: The column name of the second Customer Distribution within the Output Correlation Error Matrix, along with the difference between the output correlation and the repaired correlation.
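The relationship between the three outputs can be illustrated with a small Python sketch. The repair algorithm used by the node is not described here, so the sketch substitutes a simple, well-known alternative: shrinking the off-diagonal correlations toward zero until a Cholesky factorization succeeds. The function names, the tolerance, and the sign convention of the error matrix (output minus repaired) are all assumptions.

```python
import math


def is_positive_definite(m, tol=1e-12):
    """Attempt a Cholesky factorization; it succeeds only for a positive definite matrix."""
    n = len(m)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = m[i][j] - sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                if s <= tol:  # non-positive pivot: not positive definite
                    return False
                lower[i][j] = math.sqrt(s)
            else:
                lower[i][j] = s / lower[j][j]
    return True


def shrink(m, weight):
    """Scale off-diagonal entries by weight (1.0 keeps m, 0.0 gives the identity)."""
    n = len(m)
    return [[1.0 if i == j else weight * m[i][j] for j in range(n)] for i in range(n)]


def repair(m, steps=40):
    """Find, by bisection, the largest shrink weight that is still positive definite."""
    if is_positive_definite(m):
        return [row[:] for row in m]
    low, high = 0.0, 1.0  # low is always valid (identity), high is invalid
    for _ in range(steps):
        mid = (low + high) / 2
        if is_positive_definite(shrink(m, mid)):
            low = mid
        else:
            high = mid
    return shrink(m, low)


# A:B = +0.99 and A:C = +0.99, yet B:C = -0.5: an inconsistent set of correlations.
matrix = [[1.0, 0.99, 0.99],
          [0.99, 1.0, -0.5],
          [0.99, -0.5, 1.0]]
repaired = repair(matrix)
# Assumed sign convention: output correlation minus repaired correlation.
error = [[matrix[i][j] - repaired[i][j] for j in range(3)] for i in range(3)]
```

Shrinkage keeps the diagonal at 1.0 and preserves symmetry, but it changes every off-diagonal value by the same proportion. A production repair would more likely adjust only as much as needed, for example by working on the Eigenvalues directly.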