Rule-based Row Splitter


This node takes a list of user-defined rules and tries to match them against each row of the input table, in the order in which they are defined. The first matching rule decides where the row goes: if its outcome is TRUE, the row is included in the first output table; if its outcome is FALSE, the row is included in the second output table. If no rule matches, the row is also put into the second output table (as if the last rule were TRUE => FALSE). The roles of the first and second output tables can be exchanged; see the option below.

Each rule is represented by a line. Comments start with //; anything after // on a line is not interpreted as part of a rule. A rule consists of a condition part (antecedent), which must evaluate to true or false, and an outcome (consequent, after the => symbol), which is either TRUE or FALSE.

If no rule matches, the outcome is treated as if it were FALSE.
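
The matching order can be illustrated outside of KNIME. The following is a minimal Python sketch, not the node's implementation: rules are modelled as (condition, outcome) pairs, and the names split_rows and swap_outputs are hypothetical.

```python
from typing import Any, Callable, Dict, List, Tuple

Row = Dict[str, Any]
# A rule is (condition, outcome): the condition is a predicate over a row,
# the outcome is the TRUE/FALSE consequent written after "=>".
Rule = Tuple[Callable[[Row], bool], bool]


def split_rows(rows: List[Row], rules: List[Rule],
               swap_outputs: bool = False) -> Tuple[List[Row], List[Row]]:
    """Split rows by the outcome of the first rule whose condition matches."""
    first, second = [], []
    for row in rows:
        outcome = False  # no matching rule is treated like a FALSE outcome
        for condition, consequent in rules:
            if condition(row):
                outcome = consequent
                break  # later rules are ignored once one matches
        # swap_outputs (hypothetical name) exchanges the roles of the tables
        (first if outcome != swap_outputs else second).append(row)
    return first, second


rules = [
    (lambda r: r["Col0"] > 10, False),  # $Col0$ > 10 => FALSE
    (lambda r: r["Col0"] > 0, True),    # $Col0$ > 0 => TRUE
]
rows = [{"Col0": 5}, {"Col0": 20}, {"Col0": -1}]
matched, rest = split_rows(rows, rules)
print(matched)  # [{'Col0': 5}]
print(rest)     # [{'Col0': 20}, {'Col0': -1}]
```

The row with Col0 = 20 also satisfies the second condition, but it ends up in the second table because the first rule matched earlier with a FALSE outcome.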

Columns are given by their name surrounded by $; numbers are given in the usual decimal representation. Note that strings must not contain (double) quotes; for such cases use the following syntax instead: /Oscar Wilde's wisdom: "Experience is simply the name we give our mistakes."/. Flow variables are represented by $${TypeCharacterAndFlowVarName}$$. The TypeCharacter should be 'D' for double (real) values, 'I' for integer values and 'S' for strings.
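
To make the flow variable notation concrete, here is a small Python sketch that extracts the type character and name from each reference in a rule. The regular expression is an assumption derived from the description above, and the variable names Threshold and Name are hypothetical.

```python
import re

# Assumed pattern: "$${", one type character (D, I or S), the name, "}$$".
FLOW_VAR = re.compile(r"\$\$\{([DIS])([^}]+)\}\$\$")
TYPE_NAMES = {"D": "double", "I": "integer", "S": "string"}


def flow_variables(rule: str):
    """Return (type, name) pairs for every flow variable referenced in a rule."""
    return [(TYPE_NAMES[t], name) for t, name in FLOW_VAR.findall(rule)]


rule = "$Col0$ > $${DThreshold}$$ AND $Col1$ = $${SName}$$ => TRUE"
print(flow_variables(rule))  # [('double', 'Threshold'), ('string', 'Name')]
```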

The logical expressions can be grouped with parentheses. Their precedence is as follows: NOT binds most tightly, followed by AND, then XOR, and finally OR, which binds least. Comparison operators always take precedence over logical connectives. All operators (and their names) are case-sensitive.
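
The effect of these precedence rules can be made visible by adding the implied parentheses explicitly. The Python sketch below does this with a small precedence-climbing parser; it is an illustration only, not KNIME's parser. It assumes space-separated tokens and left-associative connectives, and the operands a, b, c and d stand in for already evaluated comparisons.

```python
# Binding strength of the binary connectives: a higher number binds tighter.
PRECEDENCE = {"OR": 1, "XOR": 2, "AND": 3}


def parenthesize(tokens):
    """Return the expression with the grouping implied by precedence made explicit."""
    pos = 0

    def parse_operand():
        nonlocal pos
        token = tokens[pos]
        pos += 1
        if token == "NOT":  # NOT binds tighter than any binary connective
            return f"(NOT {parse_operand()})"
        if token == "(":
            inner = parse_expression(1)
            pos += 1  # skip the closing ")"
            return inner
        return token  # an atomic condition such as a comparison

    def parse_expression(min_prec):
        nonlocal pos
        left = parse_operand()
        while pos < len(tokens) and PRECEDENCE.get(tokens[pos], 0) >= min_prec:
            op = tokens[pos]
            pos += 1
            right = parse_expression(PRECEDENCE[op] + 1)  # assumed left-associative
            left = f"({left} {op} {right})"
        return left

    return parse_expression(1)


print(parenthesize("a OR b XOR c AND NOT d".split()))
# (a OR (b XOR (c AND (NOT d))))
```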

The ROWID represents the row key string, the ROWINDEX is the index of the row (the first row has index 0), while ROWCOUNT stands for the number of rows in the table.

Some example rules (each rule should be on a single line):

// This is a comment
$Col0$ > 0 => TRUE
When the value in Col0 is greater than 0, the row is sent to the first output port (if no previous rule matched with a FALSE outcome).
$Col0$ = "Active" AND $Col1$ <= 5 => TRUE
You can combine conditions.
$Col0$ LIKE "Market Street*" AND 
    ($Col1$ IN ("married", "divorced") 
        OR $Col2$ > 40) => FALSE
With parentheses you can combine multiple conditions.
$Col0$ MATCHES $${SFlowVar0}$$ OR $$ROWINDEX$$ < $${IFlowVar1}$$ =>
Flow variables and table constants can also appear in conditions.

You can use either Ctrl+Space to insert predefined parts, or select them from the upper controls.

The following comparisons evaluate to true, where ? denotes a missing value (other values are neither less than, nor greater than, nor equal to missing and NaN values):

  • ? =,<=,>= ?
  • NaN =,<=,>= NaN
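
These special cases can be summarised in a short Python sketch. It is an illustration only: None stands in for a missing value, the function name compare is hypothetical, and treating a comparison between a missing value and NaN as false is an assumption that the description above does not settle.

```python
import math
import operator

MISSING = None  # stand-in for a missing cell

OPS = {"=": operator.eq, "<": operator.lt, "<=": operator.le,
       ">": operator.gt, ">=": operator.ge}


def is_nan(value):
    return isinstance(value, float) and math.isnan(value)


def compare(left, op, right):
    """Evaluate `left op right` with the special cases for missing and NaN."""
    for special in (lambda v: v is MISSING, is_nan):
        if special(left) or special(right):
            # Only missing-vs-missing and NaN-vs-NaN can be true, and only
            # for the operators that include equality.
            return special(left) and special(right) and op in ("=", "<=", ">=")
    return OPS[op](left, right)


print(compare(MISSING, "<=", MISSING))            # True
print(compare(float("nan"), "=", float("nan")))   # True
print(compare(1, "<", float("nan")))              # False
```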

Input Ports

  1. Type: Data. Any data table that should be split.

Output Ports

  1. Type: Data. First output table, with the rows that evaluated to TRUE.
  2. Type: Data. Second output table, with the rows that evaluated to FALSE.

Find here

Manipulation > Row > Filter

Update site for KNIME Analytics Platform 3.7:
KNIME Analytics Platform 3.7 Update Site
