Use R to list Excel sheet names, extract the data, and keep only the columns that are present in all sheets

This workflow uses the R package readxl to list all Excel files in a folder, determine their sheets and columns, and guess the column types. In the end, only the columns and data that are present in all files are kept.

I built a solution; please check whether it works for you. R lists all the sheets of the Excel files in a folder. The sheets are imported and read back into KNIME, and each column's type is guessed from the first 50k rows. I then determine which combinations of column name and type occur most often (currently only those that occur every time; you may want to adapt that), and only those columns are kept. Initially, however, all the data is loaded into KNIME, so you can still use it later. The file name and sheet name are stored for later use.
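The steps described above can be sketched in plain R. This is a minimal illustration under stated assumptions, not the code embedded in the workflow: the function names (list_sheets, read_schema, common_columns) and the folder argument are hypothetical, and it treats a column as "common" only if the same name and guessed type appear in every sheet. It relies on readxl::excel_sheets() and readxl::read_excel() with guess_max = 50000 to mirror the 50k-row type guess.

```r
# Requires the readxl package to be installed; calls are namespaced,
# so no library() call is needed.

# List every sheet of every Excel file in a folder (folder is hypothetical).
list_sheets <- function(folder) {
  files <- list.files(folder, pattern = "\\.xlsx?$", full.names = TRUE)
  rows <- lapply(files, function(f) {
    sheets <- readxl::excel_sheets(f)
    data.frame(file = rep(f, length(sheets)), sheet = sheets,
               stringsAsFactors = FALSE)
  })
  do.call(rbind, rows)
}

# Read one sheet and describe its columns: name plus guessed type.
# guess_max = 50000 lets readxl guess types from up to 50k rows.
read_schema <- function(file, sheet) {
  df <- readxl::read_excel(file, sheet = sheet, guess_max = 50000)
  cols <- names(df)
  types <- unname(vapply(df, function(x) class(x)[1], character(1)))
  data.frame(file = rep(file, length(cols)),
             sheet = rep(sheet, length(cols)),
             column = cols, type = types,
             stringsAsFactors = FALSE)
}

# Keep only the column/type combinations that occur in every sheet.
# Assumption: "present everywhere" means count == number of sheets;
# relax the comparison if a lower threshold is wanted.
common_columns <- function(schema) {
  schema <- unique(schema[, c("file", "sheet", "column", "type")])
  n_sheets <- nrow(unique(schema[, c("file", "sheet")]))
  key <- paste(schema$column, schema$type, sep = "\r")
  counts <- table(key)
  keep <- names(counts)[counts == n_sheets]
  unique(schema[key %in% keep, c("column", "type")])
}
```

A possible usage, assuming a folder named "data" (hypothetical): call list_sheets("data"), apply read_schema() to each file and sheet pair, rbind the results into one schema table, and pass it to common_columns(). The resulting column names can then be used to subset each imported sheet while the full data stays available.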
Used extensions & nodes
Created with KNIME Analytics Platform version 4.0.2