Spark RDD Java Snippet


This node allows you to execute arbitrary Java code to create or manipulate a Spark RDD. Enter the Java code in the text area.
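As a rough illustration of the kind of per-record logic such a snippet might contain, the sketch below trims and upper-cases each record and drops empty ones. It is written against plain Java collections so that it runs without a cluster; it is not the node's template or its method signature, which are not described here. The class name RowLogic and its methods are hypothetical. In an actual Spark job the same functions would be applied to the input RDD, for example through JavaRDD.map and JavaRDD.filter.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.function.Function;
import java.util.stream.Collectors;

// Hypothetical illustration: RowLogic is not part of KNIME or Spark.
class RowLogic {

    // Per-record transformation: trim the value and convert it to upper case.
    static final Function<String, String> NORMALIZE =
            s -> s.trim().toUpperCase(Locale.ROOT);

    // Per-record predicate: keep only non-empty values.
    static boolean keep(String s) {
        return !s.isEmpty();
    }

    // Applies the logic to an in-memory list, which stands in for
    // an RDD's map and filter steps in this sketch.
    static List<String> transform(List<String> records) {
        return records.stream()
                .map(NORMALIZE)
                .filter(RowLogic::keep)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList(" spark ", "", "knime");
        System.out.println(transform(input));
    }
}
```

Keeping the transformation in small, side-effect-free functions like these makes it possible to check the logic locally before running it on the cluster.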

Note that this node also supports flow variables as input to your Spark job. To use a flow variable, double-click the variable in the "Flow Variable List".

It is also possible to use external Java libraries. To include external jar or zip files, add their location in the "Additional Libraries" tab using the control buttons. For details, see the "Additional Libraries" tab description below.
These libraries must be present on your cluster and added to the class path of your Spark job server. They are not uploaded automatically!

You can define reusable templates with the "Create templates..." button. Templates are stored in the user's workspace by default and can be accessed via the "Templates" tab. For details, see the "Templates" tab description below.

Input Ports

  1. Type: Spark Data. First input Spark RDD.
  2. Type: Spark Data. Optional second input Spark RDD.

Output Ports

  1. Type: Spark Data. Result Spark RDD.

Find here

Tools & Services > Apache Spark > Misc > Java Snippet

Make sure to have this extension installed:

KNIME Extension for Apache Spark

Update site for KNIME Analytics Platform 3.7:
KNIME Analytics Platform 3.7 Update Site
