Create a Big Data Hive/Parquet table with a partition based on an existing KNIME table and add more partitions later
You can create a Hive table in Parquet format with the DB Table Creator node, using additional options to specify the PARQUET storage format and a PARTITION. Leave the partition column out of the regular column list when creating the table, and later use that column as the partition when you insert the KNIME table into your newly created Hive table. You can add more partitions afterwards simply by uploading data through the "DB Loader" node - new partitions will be created automatically, and existing partitions will be appended to.
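The pattern behind the nodes can be sketched in HiveQL. This is a minimal, hypothetical example - table, schema, and column names (my_schema.sales_data, load_date, staging_sales) are placeholders, not taken from the workflow itself:

```sql
-- 1) Create the Hive table in Parquet format. The partition column
--    (here: load_date) is NOT part of the regular column list.
CREATE TABLE IF NOT EXISTS my_schema.sales_data (
    customer_id INT,
    amount      DOUBLE
)
PARTITIONED BY (load_date STRING)
STORED AS PARQUET;

-- 2) Allow Hive to derive partition values from the data itself,
--    instead of requiring a fixed PARTITION (load_date='...') clause.
SET hive.exec.dynamic.partition = true;
SET hive.exec.dynamic.partition.mode = nonstrict;

-- 3) Insert data with the partition column listed last. Rows with a
--    new load_date value create a new partition; rows with an existing
--    value are appended to that partition.
INSERT INTO TABLE my_schema.sales_data PARTITION (load_date)
SELECT customer_id, amount, load_date
FROM   staging_sales;
```

The DB Loader node hides these steps behind its configuration dialog, but the partitioning behaviour it exposes (create-or-append per partition value) matches this dynamic-partition insert.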
=> please download the whole workflow group "kn_example_bigdata_hive_partitions"
External resources
- Hive - upload data in several ORC files to HDFS and bring them together as an EXTERNAL table
- KNIME and Hive - load multiple Parquet files at once via external table
- KN workflow Group: Create a Big Data Hive/Parquet table with a partition based on an existing KNIME table and add more partitions later
- META: A meta collection of KNIME and databases (SQL, Big Data/Hive/Impala and Spark/PySpark)
- HUB: Create a Big Data Hive/Parquet table with a partition based on an existing KNIME table and add more partitions later
- KNIME Big Data Extensions User Guide
- School of Hive - with KNIME's local Big Data environment (SQL for Big Data)
Used extensions & nodes
Created with KNIME Analytics Platform version 4.7.0