This workflow demonstrates how to create a Spark context via Apache Livy and execute a simple Spark job on an Amazon EMR cluster.
This example uses the NYC taxi dataset from the AWS Registry of Open Data to build a simple prediction model with Random Forest.
The workflow also shows how to configure Amazon Athena to query a dataset stored in an Amazon S3 bucket.
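Outside of KNIME, the Livy step described above corresponds to two calls against the Livy REST API: one that starts an interactive session (which gives you a Spark context) and one that submits code to it. The following is a minimal sketch of those two requests, not the way the KNIME nodes are implemented. The host name and the helper function names are hypothetical, and port 8998 is assumed because it is Livy's default. The requests are only built, not sent, so the sketch runs without a cluster.

```python
import json
import urllib.request

# Hypothetical Livy endpoint (assumption); 8998 is Livy's default port.
LIVY_URL = "http://emr-master.example.com:8998"


def livy_post(path, payload, base_url=LIVY_URL):
    """Build (but do not send) a JSON POST request for the Livy REST API."""
    return urllib.request.Request(
        url=base_url.rstrip("/") + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def create_session_request(kind="pyspark"):
    """POST /sessions asks Livy to start an interactive Spark session."""
    return livy_post("/sessions", {"kind": kind})


def run_statement_request(session_id, code):
    """POST /sessions/{id}/statements runs a code snippet in that session."""
    return livy_post(f"/sessions/{session_id}/statements", {"code": code})


if __name__ == "__main__":
    req = create_session_request()
    # Show what would be sent; pass the request to urllib.request.urlopen
    # to actually submit it to a reachable Livy server.
    print(req.get_method(), req.full_url, req.data.decode("utf-8"))
```

To submit a request for real, pass it to `urllib.request.urlopen` and read the session `id` from the JSON response before sending statements. Authentication and network access to the EMR master node are not covered here.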
Workflow
Connecting to Amazon EMR
Used extensions & nodes
Created with KNIME Analytics Platform version 4.2.0