Challenge 15: Extracting a Table from a PDF
Level: Hard
Description: Given a text-based PDF document that contains a table, can you extract part of that table into a KNIME data table for further analysis? In this challenge we will extract the table from this PDF document and partially reconstruct it within KNIME. The resulting KNIME table should contain the following columns: Day, Max, Min, Norm, Depart, Heat, and Cool.
Note 1: Your final output should be a table, not a single row with all the relevant data.
Note 2: The Tika Parser node is better suited for this task than the PDF Parser node.
We completed this task without components, regular expressions, or code-snippet nodes. Our solution has a total of 10 nodes, although labeling the columns required a bit of manual effort.
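For readers who want to see the target shape outside KNIME, here is a minimal Python sketch of the reshaping step. It is not the workflow's method: it assumes the PDF text has already been extracted into a plain string, that each table row sits on its own line with whitespace-separated cells, and that data rows start with a day number. The function name `rows_from_text` and the sample text are hypothetical and made up for illustration; the sample is not taken from the challenge PDF.

```python
# Target column names, as listed in the challenge description.
COLUMNS = ["Day", "Max", "Min", "Norm", "Depart", "Heat", "Cool"]


def rows_from_text(text):
    """Turn extracted text into a list of row dicts, one per table line.

    Assumption: data rows start with a day number and hold at least
    as many whitespace-separated cells as there are target columns.
    """
    rows = []
    for line in text.splitlines():
        cells = line.split()
        # Skip titles, header lines, and anything too short to be a data row.
        if len(cells) >= len(COLUMNS) and cells[0].isdigit():
            # Keep only the first seven cells (a partial extraction).
            rows.append(dict(zip(COLUMNS, cells[:len(COLUMNS)])))
    return rows


# Made-up sample text for illustration only (not from the challenge PDF).
SAMPLE_TEXT = """Some Report Title
Day Max Min Norm Depart Heat Cool
1 70 50 60 0 5 0
2 72 51 61 1 4 0
Summary line
"""

table = rows_from_text(SAMPLE_TEXT)
for row in table:
    print(row)
```

Each data line becomes its own row, which is the point of Note 1: the output is a table rather than one long row. Cell values stay as strings here; converting them to numbers would be a separate step.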
Author: Victor Palacios
Workflow
KNIME_challenge15_solution
Used extensions & nodes
Created with KNIME Analytics Platform version 4.5.2