Connecting to Amazon EMR

Amazon, AWS, EMR, S3, Apache Livy (+2 more)
Version v1.0 (latest), created on Feb 21, 2024 2:29 PM
This workflow demonstrates how to create a Spark context via Apache Livy and execute a simple Spark job on an Amazon EMR cluster. The example uses the NYC taxi dataset from the AWS Registry of Open Data to build a simple prediction model with Random Forest. The workflow also shows how to configure Amazon Athena to query a dataset stored in an Amazon S3 bucket.
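
For readers who want to see what creating a Spark context via Livy looks like outside of KNIME, here is a minimal sketch in Python. It is not how the workflow itself is implemented (the workflow uses KNIME nodes). It only builds the two HTTP requests that Livy's REST API expects: `POST /sessions` to start a session and `POST /sessions/{id}/statements` to run code in it. The host name, the helper names `build_session_request` and `build_statement_request`, and the example statement are assumptions made for illustration; 8998 is Livy's default port.

```python
import json
from urllib.request import Request

# Hypothetical endpoint: the host name is made up; 8998 is Livy's default port.
LIVY_URL = "http://emr-master.example.com:8998"


def build_session_request(livy_url, kind="pyspark"):
    """Build (but do not send) the request that asks Livy to start a session."""
    body = json.dumps({"kind": kind}).encode("utf-8")
    return Request(
        f"{livy_url.rstrip('/')}/sessions",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def build_statement_request(livy_url, session_id, code):
    """Build (but do not send) the request that runs code in an existing session."""
    body = json.dumps({"code": code}).encode("utf-8")
    return Request(
        f"{livy_url.rstrip('/')}/sessions/{session_id}/statements",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    session_req = build_session_request(LIVY_URL)
    print(session_req.get_method(), session_req.full_url)

    # Session id 0 is a placeholder; a real id comes from Livy's response.
    stmt_req = build_statement_request(LIVY_URL, 0, "sc.parallelize(range(10)).count()")
    print(stmt_req.get_method(), stmt_req.full_url)
```

To actually submit these requests, one would pass each `Request` to `urllib.request.urlopen` against a reachable Livy endpoint and read the session id from the JSON response. On a real EMR cluster this typically also requires network access to the master node and possibly authentication, neither of which is covered here.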

External resources

  • KNIME on Amazon EMR
  • KNIME Amazon EMR Documentation

Used extensions & nodes

Created with KNIME Analytics Platform version 4.2.0. Note: Not all extensions may be displayed.
  • KNIME Amazon Athena Connector (trusted extension)
    KNIME AG, Zurich, Switzerland
    Version 4.2.0
  • KNIME Database (trusted extension)
    KNIME AG, Zurich, Switzerland
    Versions 4.1.1, 4.2.0
  • KNIME Extension for Apache Spark (trusted extension)
    KNIME AG, Zurich, Switzerland
    Versions 4.1.1, 4.2.0
  • KNIME JavaScript Views (trusted extension)
    KNIME AG, Zurich, Switzerland
    Version 4.1.2

Legal

By using or downloading the workflow, you agree to our terms and conditions.
