Connecting to Amazon EMR

Workflow

Version: v1.0 (latest)

This workflow demonstrates how to create a Spark context via Apache Livy and execute a simple Spark job on an Amazon EMR cluster. The example uses the NYC taxi dataset from the AWS Registry of Open Data to build a simple prediction model with Random Forest. The workflow also shows how to configure Amazon Athena to query a dataset stored in an Amazon S3 bucket.
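
In the workflow itself, KNIME nodes take care of the Livy connection. For readers curious about what happens underneath, the following is a minimal sketch of the Livy REST calls involved in creating a session and submitting a statement. It is not the workflow's implementation: the host name, the helper function names, and the submitted statement are hypothetical placeholders, and the port is Livy's default of 8998.

```python
import json
import urllib.request

# Hypothetical endpoint: the host name is a placeholder; 8998 is Livy's default port.
LIVY_URL = "http://emr-master.example.com:8998"


def build_session_request(base_url, kind="pyspark"):
    """Return (url, payload) for creating a Livy interactive session."""
    return f"{base_url}/sessions", {"kind": kind}


def build_statement_request(base_url, session_id, code):
    """Return (url, payload) for running a code snippet in an existing session."""
    return f"{base_url}/sessions/{session_id}/statements", {"code": code}


def post_json(url, payload):
    """POST a JSON payload and return the decoded JSON response."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


def run_simple_job(base_url=LIVY_URL):
    """Create a session and submit one statement (needs a reachable Livy server)."""
    session = post_json(*build_session_request(base_url))
    # A real client would poll GET /sessions/{id} until the session state
    # is "idle" before submitting; that loop is omitted in this sketch.
    url, payload = build_statement_request(
        base_url, session["id"], "print(sc.parallelize(range(100)).count())"
    )
    return post_json(url, payload)
```

Calling `run_simple_job()` only works against a running Livy server that the client can reach, for example through an SSH tunnel to the EMR master node. The two `build_*` helpers are kept separate from the network call so the request shapes can be inspected without a cluster.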

External resources

KNIME on Amazon EMR
KNIME Amazon EMR Documentation


Legal

By using or downloading the workflow, you agree to our terms and conditions.