Use R to list Excel sheet names, extract the data, and keep only columns that are present in all sheets
Use the R package readxl to go through all Excel files in a folder, list their sheets and columns, and guess the column types. In the end, keep only those columns and data that are present in all files.
I built a solution, but you will want to check whether it works for you. With R, I go through all the sheets of the Excel files in a folder. The sheets are imported and passed back into KNIME; each column's type is guessed from the first 50k rows.
Then I determine which combinations of column name and type occur most often (currently that means in all of them; you might adapt that), and only those are kept. Initially, however, all the data is loaded into KNIME, so you can still use it later. The file name and sheet name are stored for later use.
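The selection step can be sketched roughly as follows. This is a minimal illustration in plain Python, not the R code or the KNIME nodes used in the workflow. The input structure `sheet_schemas`, the function `common_columns`, the parameter `min_share`, and the file, sheet, and type names are all hypothetical and only serve the example.

```python
from collections import Counter

# Hypothetical input: for each (file, sheet) pair, the guessed type of every column.
sheet_schemas = {
    ("a.xlsx", "Sheet1"): {"id": "numeric", "name": "text", "date": "date"},
    ("a.xlsx", "Sheet2"): {"id": "numeric", "name": "text", "comment": "text"},
    ("b.xlsx", "Data"): {"id": "text", "name": "text", "date": "date"},
}


def common_columns(schemas, min_share=1.0):
    """Return the (column name, type) pairs found in at least min_share of the sheets."""
    # Count how many sheets contain each combination of column name and type.
    counts = Counter(
        (column, col_type)
        for columns in schemas.values()
        for column, col_type in columns.items()
    )
    # min_share=1.0 means "present in every sheet"; lower it to be less strict.
    needed = min_share * len(schemas)
    return sorted(pair for pair, n in counts.items() if n >= needed)


# Only "name" has the same type in all three sheets ("id" is text in one of them).
kept = common_columns(sheet_schemas)
print(kept)
```

Lowering `min_share` corresponds to the "you might adapt that" remark above: with `min_share=0.6`, a combination only has to appear in two of the three sheets to be kept.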
Used extensions & nodes
Created with KNIME Analytics Platform version 4.0.2