Big Data Analytics - Model Selection to Predict Flight Departure Delays on Hive & Spark
This workflow trains several data analytics models on Hadoop and Spark and automatically selects the best one to predict departure delays from a selected airport. The data is the airline dataset, downloadable from http://stat-computing.org/dataexpo/2009/the-data.html. A departure counts as delayed when its delay exceeds 15 minutes. The default airport is ORD. The workflow covers data reading, data blending, ETL, guided analytics, dimensionality reduction, advanced data mining models, and model selection. To speed up computationally intensive operations it uses Hadoop, Spark, in-memory processing, parallelization, grid computing, multithreading, and/or in-database processing. The data is available in knime://knime.workflow/data/1_Input
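The labeling rule and the selection step described above can be illustrated outside KNIME with a minimal plain-Python sketch. It is not the workflow's implementation. The column names `Origin`, `DepDelay`, and `DepTime` are assumed to match the airline dataset's CSV header. The two candidate "models" are hypothetical toy rules standing in for the trained models, and accuracy is assumed as the selection metric, since the description does not name one.

```python
import csv
import io

DELAY_THRESHOLD_MIN = 15   # a departure is "delayed" if its delay exceeds 15 minutes
DEFAULT_AIRPORT = "ORD"    # default airport named in the description


def label_departures(rows, airport=DEFAULT_AIRPORT, threshold=DELAY_THRESHOLD_MIN):
    """Keep flights leaving `airport` and attach a boolean `delayed` label."""
    labeled = []
    for row in rows:
        if row["Origin"] != airport:
            continue
        raw = row["DepDelay"]
        if raw in ("", "NA"):  # skip rows with a missing delay value
            continue
        labeled.append({**row, "delayed": float(raw) > threshold})
    return labeled


def accuracy(model, labeled_rows):
    """Fraction of rows where the model's prediction equals the label."""
    if not labeled_rows:
        return 0.0
    hits = sum(model(row) == row["delayed"] for row in labeled_rows)
    return hits / len(labeled_rows)


def select_best(models, validation_rows):
    """Return the name of the candidate with the highest accuracy."""
    return max(models, key=lambda name: accuracy(models[name], validation_rows))


# Tiny made-up sample, for illustration only (not real flight records).
SAMPLE = """Origin,DepDelay,DepTime
ORD,30,1800
ORD,5,900
ORD,NA,1200
JFK,45,1700
ORD,16,1900
ORD,15,700
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE)))
labeled = label_departures(rows)

# Hypothetical stand-ins for trained models: each maps a row to a prediction.
models = {
    "never_delayed": lambda row: False,
    "evening_delayed": lambda row: int(row["DepTime"]) >= 1700,
}

best = select_best(models, labeled)
print(best)
```

In a real run the candidates would be scored on held-out data rather than on the rows used to build them, and the filtering and labeling would be pushed down to Hive or Spark instead of running in local memory.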
Workflow
02_Scaling_Analytics_w_BigData
Created with KNIME Analytics Platform version 5.2.3