Correlation Matrix To Pairs
The Correlation Matrix to Pairs node is designed to take an Input Correlation Matrix and convert it into an equivalent list of Correlation Pairs.
The Correlation Matrix represents the degree of Horizontal Differentiation between Features, Benefits, Attributes, Levels, and Products. The Correlation Matrix may be used by a downstream node (such as the Matrix Distributions node or the Feature Generation node) to generate a set of Customer Distributions comprising the Willingness To Pay (WTP) of individual Virtual Customers.
For example, the Input Correlation Matrix may be a 3x3 matrix of correlation values (doubles between -1.0 and +1.0) with row names and column names of 'A', 'B', and 'C'. The matrix describes all the correlations between Customer Distribution A, Customer Distribution B, and Customer Distribution C. The Output Correlation Pairs would then individually list the same correlations between A:B, A:C, and B:C.
The Input Correlation Matrix will first be converted into a clean and symmetrical Correlation Matrix. That means: (a) the diagonal A:A, B:B, C:C correlations will be set to 1.0; (b) correlation values will be clamped to the range -1.0 to +1.0; (c) missing correlations will be set to 0.0; and (d) the correlation for A:B will be set equal to the correlation for B:A (hence lower-left-triangle and upper-right-triangle correlation matrices can be input).
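The cleaning steps (a) to (d) can be sketched in Python as follows. This is only an illustrative sketch, not the node's actual implementation: the function name `clean_correlation_matrix` and the dictionary layout are hypothetical, and the rule for reconciling A:B with B:A assumes that the larger of the provided non-zero values wins.

```python
import math


def clean_correlation_matrix(names, raw):
    """Return a clean, symmetrical matrix as a dict keyed by (row, column).

    names: ordered list of distribution names (hypothetical layout).
    raw:   dict mapping (row, column) to a float, None, or NaN;
           absent keys are treated as missing correlations.
    """
    def value(a, b):
        v = raw.get((a, b))
        if v is None or math.isnan(v):
            return None                      # missing correlation
        return max(-1.0, min(1.0, v))        # (b) clamp to [-1.0, +1.0]

    clean = {}
    for i, a in enumerate(names):
        clean[(a, a)] = 1.0                  # (a) diagonal is always 1.0
        for b in names[i + 1:]:
            # Assumption: when both A:B and B:A are given, keep the
            # larger non-zero value; (c) fall back to 0.0 if none exist.
            candidates = [v for v in (value(a, b), value(b, a))
                          if v is not None and v != 0.0]
            c = max(candidates) if candidates else 0.0
            clean[(a, b)] = c                # (d) mirror across the diagonal
            clean[(b, a)] = c
    return clean


# Lower-left-triangle input with an out-of-range value and a missing value.
raw = {("A", "A"): 0.9, ("B", "A"): 0.4, ("C", "A"): 1.7, ("C", "B"): float("nan")}
clean = clean_correlation_matrix(["A", "B", "C"], raw)
```

In this sketch the A:A value of 0.9 is overwritten with 1.0, the C:A value of 1.7 is clamped to 1.0, and the missing B:C correlation becomes 0.0.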
The purpose of this node is to provide the user with flexibility when setting and managing the Horizontal Differentiation (correlations) between Customer Distributions. A downstream Feature Generation node and a downstream Matrix Distributions node both require an Input Correlation Matrix to generate a set of Customer Distributions. With this node, the user can edit the Correlation Pairs list, scale the Correlation Pairs list so that each Customer Distribution is more or less correlated with the other Customer Distributions, or concatenate the Correlation Pairs list with another set of correlations developed elsewhere. Working with a list can be easier than working with a matrix.
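As a rough illustration of why a list can be easier to work with, the following Python sketch scales and concatenates a Correlation Pairs list. It is not part of the node itself: the tuple layout `(from, to, correlation)` and the helper name `scale_pairs` are assumptions made for this example.

```python
def scale_pairs(pairs, factor):
    """Multiply each correlation by factor, keeping results in [-1.0, +1.0].

    pairs: list of (from_name, to_name, correlation) tuples (assumed layout).
    """
    return [(a, b, max(-1.0, min(1.0, c * factor))) for a, b, c in pairs]


pairs = [("A", "B", 0.5), ("A", "C", -0.2), ("B", "C", 0.8)]

# Make every distribution more strongly correlated with the others.
stronger = scale_pairs(pairs, 2.0)

# Concatenate with a second list of correlations developed elsewhere.
other_pairs = [("C", "D", 0.1)]
combined = stronger + other_pairs
```

Each of these operations is a one-line change on the list, whereas the same change on a matrix would have to be applied to both triangles to keep it symmetrical.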
More Help: Examples and sample workflows can be found at the Scientific Strategy website: www.scientificstrategy.com.
Input ports
- Type: Data Input Correlation Matrix: The input set of correlations that define the relationship between Customer Distributions of the same name. The Correlation Matrix must be square, such that the number of data rows matches the number of correlation columns. Each row Distribution Name should be unique and correspond to a column of the same name. The Input Correlation Matrix should include the following columns:
- Distribution (string): The name of the Customer Distribution. This name should correspond to a column of the same name in the same Input Correlation Matrix. The Distribution column can have any name. If multiple string columns are found then the first column is treated as the Distribution name column and the other string columns are ignored. If no string columns are found then the RowID column is treated as the Distribution name column.
- Correlation Values (double): The correlation value between each Customer Distribution row and each Customer Distribution column. As the Correlation Matrix is expected to be symmetrical, each row-column value should be the same as each column-row value. If different correlations are provided for A:B and B:A then the highest non-zero correlation will be used. Lower-left or upper-right triangle matrices can also be used. The diagonal values should all be equal to 1.0.
Output ports
- Type: Data Output Correlation Pairs: The output set of correlations as a list of pairs. Each pair shows the correlation between a single row and a single column for all unique row-column combinations from the Input Correlation Matrix. The Output Correlation Pairs will contain these columns:
- From Distribution: The name of the first Customer Distribution from a row/column within the Input Correlation Matrix.
- To Distribution: The name of the second Customer Distribution from a column/row within the Input Correlation Matrix.
- Correlation: The degree of correlation between the first Customer Distribution and the second Customer Distribution. Output correlations will be range-limited to between -1.0 and +1.0.
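The conversion from matrix to pairs can be sketched in Python as follows, using the 3x3 example of 'A', 'B', and 'C'. This is a minimal sketch under stated assumptions rather than the node's actual code: the function name `matrix_to_pairs` is hypothetical, the input is assumed to be already clean and symmetrical, and the diagonal pairs (A:A and so on) are assumed to be omitted from the output.

```python
from itertools import combinations


def matrix_to_pairs(names, matrix):
    """List each unique off-diagonal correlation once.

    names:  ordered list of distribution names.
    matrix: square list of lists, assumed clean and symmetrical.
    Returns (from_distribution, to_distribution, correlation) tuples.
    """
    index = {name: i for i, name in enumerate(names)}
    return [(a, b, matrix[index[a]][index[b]])
            for a, b in combinations(names, 2)]


names = ["A", "B", "C"]
matrix = [
    [1.0, 0.5, -0.2],
    [0.5, 1.0, 0.3],
    [-0.2, 0.3, 1.0],
]
pairs = matrix_to_pairs(names, matrix)
# pairs == [("A", "B", 0.5), ("A", "C", -0.2), ("B", "C", 0.3)]
```

A matrix with n distributions yields n(n-1)/2 pairs under these assumptions, so the 3x3 example produces three rows.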