This is an implementation of K-LIME, a model explanation technique developed by H2O.ai, built with the KNIME H2O Machine Learning Integration. For more information about the K-LIME machine learning interpretability technique, please refer to the H2O.ai documentation:
h2o.ai/wp-content/uploads/2017/09/driverlessai/interpreting.html#k-lime
The component clusters the input data and builds local linear models to explain the predictions of a complex black-box model. The number of clusters is chosen to optimize the R^2 value summed over the linear models of all clusters. The number of linear models built is much lower than in the LIME algorithm, where a model is built in the neighbourhood of each explanation instance. Thus, the algorithm is expected to be faster for large data samples.
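The cluster-then-fit idea described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the component's actual workflow: it assumes k-means as the clustering method, ordinary least squares as the local linear model, and a plain sum of per-cluster R^2 values as the selection score. All function names (`kmeans`, `fit_linear`, `r2`, `klime`) are hypothetical.

```python
import numpy as np

def kmeans(X, k, n_iter=50, seed=0):
    """Plain Lloyd's k-means; returns one cluster label per row (assumed clustering method)."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # Squared distance of every row to every center, shape (n_rows, k).
        dist = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels

def fit_linear(X, y):
    """Least-squares fit with intercept; returns [intercept, coefficients...]."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def r2(X, y, coef):
    """Coefficient of determination of a fitted local model."""
    A = np.column_stack([np.ones(len(X)), X])
    ss_res = ((y - A @ coef) ** 2).sum()
    ss_tot = ((y - y.mean()) ** 2).sum()
    return 1.0 - ss_res / ss_tot if ss_tot > 0 else 0.0

def klime(X, preds, k_values):
    """Try each k, fit one linear surrogate per cluster, keep the k with the highest summed R^2."""
    best = None
    for k in k_values:
        labels = kmeans(X, k)
        models, total = {}, 0.0
        for j in np.unique(labels):
            mask = labels == j
            coef = fit_linear(X[mask], preds[mask])
            models[int(j)] = coef
            total += r2(X[mask], preds[mask], coef)
        if best is None or total > best[0]:
            best = (total, k, labels, models)
    return best

# Toy data: black-box predictions that are linear on each side of x = 0.
x = np.linspace(-1.0, 1.0, 40)
X = x.reshape(-1, 1)
preds = np.where(x < 0, 2 * x + 1, -3 * x + 1)
total_r2, best_k, labels, models = klime(X, preds, k_values=(1, 2, 3))
```

Each entry of `models` holds the intercept and coefficients of one cluster's surrogate; the coefficients serve as the local explanation for every row in that cluster, which is why far fewer models are needed than with one model per explained instance.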
Note that this implementation of K-LIME cannot handle missing values: all rows with missing values are dropped, so no explanations will be available for them. If you want explanations for all input rows, please fill in missing values before feeding the data into the app.
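The difference between dropping and filling rows can be illustrated outside KNIME with a short Python sketch. Mean imputation is used here purely as an assumed example of a fill strategy; the description above does not prescribe one, and the variable names are hypothetical.

```python
import numpy as np

# Toy feature table with one missing value (NaN) in the second row.
table = np.array([
    [1.0, 2.0],
    [np.nan, 4.0],
    [3.0, 6.0],
])

# Rows containing any missing value are dropped, so they get no explanation.
complete_rows = ~np.isnan(table).any(axis=1)
dropped = table[complete_rows]

# Assumed alternative: fill each missing value with its column mean beforehand,
# so every input row is kept and can be explained.
column_means = np.nanmean(table, axis=0)
filled = np.where(np.isnan(table), column_means, table)
```

After filling, the table keeps all three rows, whereas the dropped version keeps only two.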
Also note that all columns except the selected prediction column will be used to interpret the predictions. Therefore, you have to filter out all irrelevant columns, as well as the original target column, prior to feeding a table to the input.
Categorical features will be converted into a numerical representation using One-Hot Encoding (OHE), also called "dummy" or "binary" encoding. Numerical features will be scaled as required by linear models. The only pre-processing steps that might be required from the user are removing outliers and filling in missing values, as those might bias the surrogate models used by the algorithm.
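To make the two automatic pre-processing steps concrete, here is a minimal Python sketch. It assumes z-score standardization as the scaling method, which the description above does not specify, and the helper names `one_hot` and `standardize` are hypothetical.

```python
import numpy as np

def one_hot(values):
    """Encode a categorical column as one 0/1 indicator column per category."""
    categories = sorted(set(values))
    index = {c: i for i, c in enumerate(categories)}
    encoded = np.zeros((len(values), len(categories)))
    for row, value in enumerate(values):
        encoded[row, index[value]] = 1.0
    return encoded, categories

def standardize(column):
    """Scale a numerical column to zero mean and unit variance (assumed scaling)."""
    column = np.asarray(column, dtype=float)
    std = column.std()
    if std == 0:
        return np.zeros_like(column)
    return (column - column.mean()) / std

# Toy input: one categorical and one numerical feature.
colors = ["red", "green", "red", "blue"]
ages = [20.0, 30.0, 40.0, 50.0]

color_ohe, color_names = one_hot(colors)
age_scaled = standardize(ages)

# Final numeric feature matrix for the linear surrogate models.
features = np.column_stack([color_ohe, age_scaled])
```

An extreme value in `ages` would shift both the mean and the standard deviation used for scaling, which is one way outliers can bias the surrogate models.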
- Type: Table. Input data: table with features and predictions.