This use case uses the NYC taxi dataset and a Random Forest to train a simple time series model that predicts taxi demand in the next hour from the demand observed in past hours. Because the dataset is large, the model is trained and deployed on a Spark cluster. The KNIME Big Data Extension lets you run a KNIME workflow on the big data platform of your choice, either via in-database processing or via Spark.
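To illustrate what "predicting the next hour from past hours" means for the training data, here is a minimal plain-Python sketch of building lagged features. It is not the workflow's actual implementation, which uses KNIME nodes on Spark. The function name `make_lagged_rows`, the parameter `n_lags`, and the sample counts are all hypothetical and chosen only for illustration.

```python
from typing import List, Tuple


def make_lagged_rows(hourly_counts: List[int], n_lags: int) -> List[Tuple[List[int], int]]:
    """Turn an hourly demand series into (past n_lags values, next-hour value) pairs."""
    if n_lags < 1:
        raise ValueError("n_lags must be at least 1")
    rows = []
    for t in range(n_lags, len(hourly_counts)):
        features = hourly_counts[t - n_lags:t]  # demand in the n_lags hours before hour t
        target = hourly_counts[t]               # demand in hour t, the value to predict
        rows.append((features, target))
    return rows


# Made-up hourly trip counts, for illustration only (not taken from the NYC taxi dataset)
demand = [120, 135, 150, 90, 60, 80]
rows = make_lagged_rows(demand, n_lags=3)
for features, target in rows:
    print(features, "->", target)
```

Each resulting row pairs a window of past demand with the value that follows it; a regression model such as a Random Forest could then be trained on the feature columns to predict the target column.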
Created with KNIME Analytics Platform version 4.2.3.