This workflow shows how to perform a SumIf and a CountIf function in KNIME Analytics Platform.
Of course, there are different ways to get this done using KNIME. In the following example, we show you one of the possible ones.
With the Athletes' information as input data, we want to calculate the total of medals won and count the appearances at the Olympic Games with certain conditions.
We will see how to do it node by node:
To view the configuration for each node, either Double-click or right-click > "configuration" to open the configuration dialogue.
Excel Reader node:
1 - In the "File and Sheet" tab, the file name and the sheet to read from are already set up (when you download the workflow, it's also included the folder with the data)
2 - Within the "Data Area" tab, we have selected read the "whole area" since the spreadsheet already has a standard format (non-empty rows or columns, etc.).
Rule-based Row Filter node:
1 - It is the node that implements the IF logic, in our case: retains rows that contain "Swimming" in the "sport" column and "Female" in the "sex" column. That's all.
2 - This is the expression that we used in the configuration dialogue of the node: $sport$ = "Swimming" AND $sex$ = "Female" => TRUE
3 - We can add multiple conditions and build complex rules using the expression builder.
Duplicate Row filter node (lowest branch):
1 - We need to exclude all those columns that can have different values for each swimmer: "event", "medal", and "isTeamSport".
2 - After running the node, we will have one row by Athlete and Olympic Games editions they have participated in.
Row Aggregator node (lowest branch):
1 - As an aggregation method, we use "Occurrence count" and, as a category column, the athlete's "name".
2 - The result is the count of each athlete's appearances in an Olympic Games edition.
3- Countif by Athlete is done, but also, in the second port, we have the Countif total that we will use later.
Row Aggregator node (top branch):
1 - The aggregation method is Sum, including only the "medal" column to calculate the total medals, and the category column is the athlete "name" column.
2 - We are done with our SumIf: total medals of swimmers (female).
3 - The result is the total medals by each athlete and the total.
Value Lookup node:
Its function is to append the total medal value using the athlete's name as a key value.
Excel writer node:
We have configured the node to generate two different sheets in a new file, so we have two active input ports.
1 - Top port: we are passing the total medals and count of appearances in the Olympic Games by swimmer (female).
2- Low port: total medals and total appearance.
Used extensions & nodes
Created with KNIME Analytics Platform version 5.0.0
Legal
By using or downloading the workflow, you agree to our terms and conditions.