Concatenate Tables and Combine String Cells
This workflow shows how to concatenate data tables from two different sheets and how to combine string-type columns using KNIME Analytics Platform.
The aim is to access two different sheets of the same Excel file (Olympic_Athlete_Event_Results.xlsx), concatenate the data of the two sheets into one table, and combine the values of the two columns "sport" and "event" into one column.
💡 To view each node's configuration, select the node and see the configuration pane on the right side of the workflow editor.
Let's walk through the different nodes involved in this operation:
Excel Reader nodes:
The folder with the data is already included when you download the workflow, so in the "File and Sheet" tab we choose to "Read from" the "Current workflow data area" and select the dataset. Each Excel Reader node reads one of the two sheets.
In the "Data Area" tab, we select to read the "Whole sheet" since the spreadsheet already has a standard format (e.g., no empty rows or columns) and we intend to keep its original structure.
Concatenate node:
In the configuration window, we control how the two input tables are concatenated. Columns with the same name are stacked into a single column. If one input table contains columns that the other table does not, these columns can either be filled with missing values (union - all columns are retained in the output table) or filtered out (intersection - only columns that are contained in both tables are kept).
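The difference between union and intersection can be sketched in plain Python. This is only an illustration of the logic, not KNIME's implementation: the function names `column_names` and `concatenate`, the use of `None` for a missing value, and the sample rows (including the "medal" column) are assumptions made up for this example.

```python
def column_names(table):
    """Collect the column names of a table (a list of dict rows) in first-seen order."""
    names = []
    for row in table:
        for name in row:
            if name not in names:
                names.append(name)
    return names


def concatenate(first, second, mode="union"):
    """Stack two tables row-wise, matching columns by name."""
    first_cols = column_names(first)
    second_cols = column_names(second)
    if mode == "union":
        # Keep every column; cells without a value become None (missing).
        columns = first_cols + [c for c in second_cols if c not in first_cols]
    elif mode == "intersection":
        # Keep only the columns present in both tables.
        columns = [c for c in first_cols if c in second_cols]
    else:
        raise ValueError("mode must be 'union' or 'intersection'")
    return [{c: row.get(c) for c in columns} for row in first + second]


# Hypothetical sample rows, not taken from the actual Excel file.
sheet_one = [{"athlete": "A", "sport": "Athletics", "event": "100 metres"}]
sheet_two = [{"athlete": "B", "sport": "Swimming", "medal": "Gold"}]

union = concatenate(sheet_one, sheet_two, mode="union")
intersection = concatenate(sheet_one, sheet_two, mode="intersection")
print(union)
print(intersection)
```

With union, the output has four columns and the gaps are filled with missing values. With intersection, only "athlete" and "sport" survive because they are the only columns both inputs share.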
For the Row IDs, we choose to "Reuse existing" Row IDs and "Append suffix" to handle duplicate Row IDs. This means that if two rows share the ID "Row0", one of them is renamed to "Row0_dup".
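The suffix handling can be pictured with a short Python sketch. The helper `deduplicate_row_ids` is hypothetical, and the choice to keep appending the suffix until the ID is unique is an assumption made for this illustration rather than a statement about KNIME's internals.

```python
def deduplicate_row_ids(row_ids, suffix="_dup"):
    """Return row IDs in order, appending `suffix` to any ID that was already used."""
    seen = set()
    result = []
    for row_id in row_ids:
        # Assumption: keep appending the suffix until the ID is unique.
        while row_id in seen:
            row_id += suffix
        seen.add(row_id)
        result.append(row_id)
    return result


# IDs of the first table followed by the IDs of the second table.
ids = deduplicate_row_ids(["Row0", "Row1", "Row0", "Row1"])
print(ids)
```

The first occurrence of each ID is left untouched, and only the later duplicates coming from the second table receive the suffix.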
Column Combiner node:
We want to combine the values of the two columns "sport" and "event" into one column called "discipline".
In the include/exclude panel, we select only the columns we want to combine ("sport" and "event"). As the delimiter, we define a colon (":"), and we choose to remove the input columns from the output table.
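As a rough sketch of what this configuration does to each row, the following plain Python joins "sport" and "event" with a colon and drops the two input columns. The function `combine_columns` and the sample row are assumptions for illustration only. The sketch also assumes there are no missing values and ignores the node's quoting options.

```python
def combine_columns(table, columns, new_name, delimiter=":", remove_inputs=True):
    """Join the values of `columns` into one new string column per row."""
    combined = []
    for row in table:
        # Copy the row, optionally leaving out the columns being combined.
        new_row = {
            key: value
            for key, value in row.items()
            if not (remove_inputs and key in columns)
        }
        new_row[new_name] = delimiter.join(str(row[c]) for c in columns)
        combined.append(new_row)
    return combined


# Hypothetical sample row, not taken from the actual Excel file.
rows = [{"athlete": "A", "sport": "Athletics", "event": "100 metres"}]
result = combine_columns(rows, ["sport", "event"], "discipline")
print(result)
```

Setting `remove_inputs=False` in this sketch corresponds to keeping "sport" and "event" alongside the new "discipline" column.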
Excel Writer node:
We append the dataset to a new sheet called "Combined_Sport_Event" in the existing Excel file located in the workflow data area.
After executing the node, the file will open automatically.
As you can see from the output, we concatenated the two input tables and replaced the "sport" and "event" columns with the combined column "discipline".