- Type: Table. Validation Data: data to be summarized, containing all the features SHAP Loop Start needs. Supported domains: Numeric (double), Numeric (integer), String.
This Component can be placed before the bottom input port of SHAP Loop Start. It uses k-means to summarize the validation set and creates a sampling table to use when building coalitions. The sampling table has n rows, and each row is a different prototype of the data. The value of n can be adjusted in the configuration dialog of the Component; the default is 100. For each of the n clusters created by k-means, the output sampling table contains a prototype row and a column "SHAP Summarizer Sampling weight" that can be used by the SHAP Loop Start node. This Component can summarize data of the following domains: Number (integer), Number (double) and String. DISCLAIMER: the statistical soundness of the sampling is not always guaranteed when the input table contains String columns. Current computer science research is still looking for a more solid solution than training k-means via one-hot encoding and decoding of categorical columns.
- Type: Table. Summarized Sampling Table: a table with one prototype row per cluster, in which each feature holds the average value over the rows belonging to that cluster. Additionally, it contains a column called "SHAP Summarizer Sampling weight" that SHAP Loop Start can use as the sampling weight.
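The summarization described above can be illustrated outside KNIME with a minimal Python sketch. It is not the Component's implementation: the function name `summarize`, its parameters, and the choice of the cluster row count as the sampling weight are assumptions made for illustration, and only numeric columns are handled (the one-hot treatment of String columns is left out).

```python
import random

def summarize(rows, n_clusters=100, n_iter=20, seed=0):
    """Summarize numeric rows into (prototype, weight) pairs with plain k-means.

    Hypothetical helper: the weight is assumed to be the number of rows
    in the cluster, which the source description does not confirm.
    """
    rng = random.Random(seed)
    k = min(n_clusters, len(rows))
    # Initialize the centers with k distinct rows picked at random.
    centers = [list(r) for r in rng.sample(rows, k)]
    assignment = [0] * len(rows)
    for _ in range(n_iter):
        # Assignment step: attach each row to its nearest center
        # (squared Euclidean distance).
        for i, row in enumerate(rows):
            assignment[i] = min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(row, centers[c])),
            )
        # Update step: each center becomes the column-wise mean of its members.
        for c in range(k):
            members = [rows[i] for i in range(len(rows)) if assignment[i] == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    counts = [assignment.count(c) for c in range(k)]
    # One prototype per non-empty cluster, paired with its assumed weight.
    return [(centers[c], counts[c]) for c in range(k) if counts[c] > 0]

# Two well-separated groups of rows summarized into two prototypes.
data = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]]
sampling_table = summarize(data, n_clusters=2)
```

Each returned pair corresponds to one row of the sampling table: the prototype holds the per-feature averages of its cluster, and the weights sum to the number of input rows.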
Created with KNIME Analytics Platform version 4.1.3