This workflow reads CENSUS data from a Hive database in HDInsight, moves the data to Spark to perform some ETL operations, and finally trains a Spark decision tree model to predict the COW (class of worker) value from all other attributes. The data for this example come from the new CENSUS dataset, which is publicly available and can be downloaded from: http://www.census.gov/programs-surveys/acs/data/pums.html A full explanation of all attributes can be found in: http://www2.census.gov/programs-surveys/acs/tech_docs/pums/data_dict/PUMSDataDict15.pdf
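The three stages described above (read rows, clean them, fit a tree on COW) can be illustrated in miniature. The sketch below is not the KNIME workflow and does not use Hive or Spark: it is a plain-Python stand-in that only shows the idea. The sample rows, the helper names clean, gini and best_split, and the choice of the AGEP and SCHL columns as predictors are assumptions made for illustration. Gini impurity is used here as one common split criterion, and the result is a single split (a one-level tree) rather than the full model that Spark would train.

```python
from collections import Counter

# Hypothetical stand-in for rows returned by a Hive query. Values arrive as
# strings; the numbers are made up and are not taken from the CENSUS data.
RAW_ROWS = [
    {"AGEP": "25", "SCHL": "16", "COW": "1"},
    {"AGEP": "31", "SCHL": "21", "COW": "1"},
    {"AGEP": "47", "SCHL": "22", "COW": "3"},
    {"AGEP": "52", "SCHL": "19", "COW": "3"},
    {"AGEP": "12", "SCHL": "08", "COW": ""},  # missing target value
    {"AGEP": "60", "SCHL": "21", "COW": "3"},
]


def clean(rows):
    """ETL step: drop rows with any empty field, then cast every field to int."""
    complete = [r for r in rows if all(v != "" for v in r.values())]
    return [{k: int(v) for k, v in r.items()} for r in complete]


def gini(labels):
    """Gini impurity of a list of class labels (0.0 means a pure node)."""
    n = len(labels)
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_split(rows, target="COW"):
    """Return (feature, threshold, weighted_gini) for the best binary split.

    Every observed value of every non-target column is tried as a threshold;
    rows with value <= threshold go left, the rest go right.
    """
    best = None
    features = [k for k in rows[0] if k != target]
    for feature in features:
        for threshold in sorted({r[feature] for r in rows}):
            left = [r[target] for r in rows if r[feature] <= threshold]
            right = [r[target] for r in rows if r[feature] > threshold]
            if not left or not right:
                continue  # a split must send rows to both sides
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(rows)
            if best is None or score < best[2]:
                best = (feature, threshold, score)
    return best


if __name__ == "__main__":
    cleaned = clean(RAW_ROWS)
    print(len(cleaned), "rows kept after cleaning")
    print("best split:", best_split(cleaned))
```

A real decision tree repeats this split search recursively on each side until a stopping rule is met; in the workflow itself that work is done by the Spark nodes on the full dataset.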
Created with KNIME Analytics Platform version 3.4.0