Node / Source

PDF Parser

Other Data Types Text Processing IO

This node reads PDF documents and creates one document per file. Each document's title and authors are extracted from the PDF's metadata. The full text of the PDF is extracted; the structure of the PDF is not taken into account. Text extraction uses the PDFBox library (see http://pdfbox.apache.org/ for details).
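
The behaviour described above (one document per file, title and authors taken from metadata, full text flattened without regard to structure) can be sketched as follows. This is only an illustration under stated assumptions, not the node's implementation: the `Document` fields, the `parse_pdfs` function, the `read_pdf` callable, and the rule of splitting authors on `;` are all hypothetical. In a real setup, `read_pdf` would wrap a PDF library such as PDFBox; here a fake reader stands in so the sketch runs on its own.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Callable, Iterable, List, Tuple

# Hypothetical result of a PDF backend: (title, raw author string, text per page).
RawPdf = Tuple[str, str, List[str]]


@dataclass
class Document:
    """One output row: a single document built from a single PDF file."""
    source: str
    title: str
    authors: List[str]
    text: str


def split_authors(raw: str) -> List[str]:
    # Assumption: several authors are separated by ';' in the metadata field.
    return [name.strip() for name in raw.split(";") if name.strip()]


def parse_pdfs(paths: Iterable[Path],
               read_pdf: Callable[[Path], RawPdf]) -> List[Document]:
    """Create exactly one Document per input file."""
    documents = []
    for path in paths:
        title, author_field, pages = read_pdf(path)
        documents.append(Document(
            source=str(path),
            title=title,                        # taken from the PDF metadata
            authors=split_authors(author_field),
            text="\n".join(pages),              # flat text, structure ignored
        ))
    return documents


def fake_reader(path: Path) -> RawPdf:
    # Stand-in for a real PDF library call; returns fixed illustrative values.
    return ("Annual Report", "A. Author; B. Writer", ["Page one.", "Page two."])


docs = parse_pdfs([Path("report.pdf")], fake_reader)
```

Passing the reader in as a callable keeps the row-building logic separate from the extraction library, so the same loop works whichever backend supplies the metadata and text.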

Node details

Output ports
  1. Type: Table
    Documents output table
    An output table containing the parsed document data.

Related workflows & nodes

  1. Chinese text pdf parsing
    This workflow parses a PDF containing Chinese text.
    marten_kose > Public > forum_topics > chinese_text_pdf > pdf_parsing
  2. Simple PDF Text Extraction
    victor_palacios > Public > Simple PDF Text Extraction
  3. Filter Bunch of PDF Documents By Keywords Matches
    Pdf reader Compliance dept
    This workflow enables the machine to read all the PDF files in a folder, search for speci…
    bedy_kharisma > Public > Read Multiple PDF and Find Keywords
  4. text2str
    Chemical name text mining. Details on MyExperiment site.
    sauberns > Public > text2str
  5. Regex with Widgets
    Data app Refresh button Dynamic
    Dynamically update your regex searches with this mini data-app built for experimentation …
    victor_palacios > Public > Regex with Widgets
  6. Reading PDF and extracting information
    jyotendra > Public > Reading PDF and extracting information
  7. Outlier Dection / Fraud Detection in Contracts
    Fraud detection Anomaly detection Text processing
    Discover anomalies / irregularities / Frauds(?) in contracts payment amounts via: - data …
    rs1 > Public > Contracts_Fraud_Detection_Usecase_example

© 2022 KNIME AG. All rights reserved.