Provides Rule Engine functionality across multiple selected columns.
Rules are based on those written for the core Rule Engine node.
There is a small change to the handling of flow variables (see below).
In addition to referring to columns by name, you can refer to the column currently being processed as $CURRENTCOLUMN$.
You can also write rules that apply only to a specific column by testing
"<CURRENTCOLUMNNAME>"
e.g.
"<CURRENTCOLUMNNAME>" = "InvoiceNumber" AND MISSING $CURRENTCOLUMN$ => $NewInvoiceNumber$
"<CURRENTCOLUMNNAME>" = "Amount" AND MISSING $CURRENTCOLUMN$ => $NewAmount$
which would replace missing invoice numbers with the value of the NewInvoiceNumber column, while missing amounts would be replaced with the value of the NewAmount column.
Updating all MISSING VALUES to ZERO except on specific named columns:
MISSING $CURRENTCOLUMN$ AND NOT "<CURRENTCOLUMNNAME>" = "InvoiceDate" => 0
Setting all ZERO values to MISSING: use the following rule. A zero value matches no rule, so the Rule Engine returns a missing value for it:
NOT $CURRENTCOLUMN$ = 0 => $CURRENTCOLUMN$
Flow variables are referenced in a similar manner to the existing Rule Engine, but because this is a component, they need to be distinguished from the component's internal flow variables, so they require a prefix.
e.g. a String variable "variable_1" is referenced as
$${Svar@:variable_1}$$
instead of
$${Svariable_1}$$
As an extension beyond the regular Rule Engine, it is possible to reference the column's data type and index position.
This means that it is possible to write a series of rules, and have some rules only apply to appropriate columns.
e.g.
// handling missing values according to data type:
"<CURRENTCOLUMNTYPE>" = "Number (integer)" AND MISSING $CURRENTCOLUMN$ => 0
"<CURRENTCOLUMNTYPE>" = "String" AND MISSING $CURRENTCOLUMN$ => "N"
"<CURRENTCOLUMNTYPE>" = "Boolean" AND MISSING $CURRENTCOLUMN$ => FALSE
TRUE => $CURRENTCOLUMN$
// Rule that applies only to columns with index 5 or greater:
<CURRENTCOLUMNINDEX> >=5 AND $CURRENTCOLUMN$="X" => "Y"
TRUE => $CURRENTCOLUMN$
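One way to picture the placeholders is as text that gets filled in separately for each selected column before the rule reaches the underlying Rule Engine. The Python sketch below is only a mental model under that assumption, not the component's actual implementation; the function name expand_rule and its parameters are hypothetical.

```python
def expand_rule(rule: str, name: str, col_type: str, index: int) -> str:
    """Fill in the per-column placeholders of one rule (illustrative only)."""
    return (
        rule.replace("<CURRENTCOLUMNNAME>", name)
        .replace("<CURRENTCOLUMNTYPE>", col_type)
        .replace("<CURRENTCOLUMNINDEX>", str(index))
        # $CURRENTCOLUMN$ becomes an ordinary column reference.
        .replace("$CURRENTCOLUMN$", f"${name}$")
    )


rule = '"<CURRENTCOLUMNTYPE>" = "String" AND MISSING $CURRENTCOLUMN$ => "N"'
print(expand_rule(rule, "Comment", "String", 3))
```

Under this model, a rule whose type or name test does not match the current column reduces to a comparison of two different literals, so it simply never fires for that column.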
As a further extension to the original Rule Engine, a rule may be split across lines. To indicate that a rule continues from the previous line, start the new line with ... at the very beginning of the line.
e.g.
$MyColumn$ = "N" AND
... $SomeOtherColumn$="X" => "Y"
This is a purely aesthetic change to the rule and does not change its behaviour. At runtime the component adapts the rule for compatibility with the underlying Rule Engine by concatenating the lines and removing the "...".
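The joining step can be sketched in a few lines of Python. This is a minimal illustration of the described behaviour, not the component's own code; the function name join_continued_rules is hypothetical, and inserting a single space at the join is an assumption.

```python
def join_continued_rules(rules_text: str) -> list[str]:
    """Merge each line that starts with '...' into the rule before it."""
    joined: list[str] = []
    for line in rules_text.splitlines():
        if line.startswith("...") and joined:
            # Continuation line: drop the marker, append to the previous rule.
            joined[-1] = joined[-1].rstrip() + " " + line[3:].strip()
        else:
            joined.append(line)
    return joined


rules = '$MyColumn$ = "N" AND\n... $SomeOtherColumn$="X" => "Y"'
for rule in join_continued_rules(rules):
    print(rule)
```

The marker is only recognised at the very start of a line, matching the requirement above, so a "..." appearing later in a rule is left alone.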
To assist with writing rules and remembering syntax, the dialog displays lists of flow variables, column names, some other literal values, and operators. Each item is shown in the format in which it should be referenced. If the lists are not populated, close the configuration dialog, execute the node, and then open the configuration again. This allows the lists to be populated.
Although it is not possible within the component to provide interactive features in the same way that a bespoke node can, you can highlight an item in the list boxes and then use the keyboard shortcuts for your Operating System to copy and then paste the item into the Rules box.
e.g. Copy / Paste:
Windows: Ctrl C / Ctrl V
Mac: Cmd C / Cmd V
Linux: Ctrl Shift C / Ctrl Shift V
For feedback or problems, please tag me @takbb on the KNIME Community Forum.
29 Aug 2024 - correction to v7, and make compatible with v4.7.8 (previously 5.x only)
06 Oct 2024 - update to better handle booleans
@takbb Brian Bates