Hive/Big Data - a simple "self-healing" (automated) ETL or analytics system on a big data cluster

The scenario: you have a table on a big data system ("default.db_main_table") holding daily data, partitioned by d_date, and you want to store a simple report (the number of rows per day) in a new table, "default.db_analytics". Both tables are partitioned Hive tables. The main workflow can be run several times per day; on each run it checks which days have already been reported and only processes the ones that are still missing. If a day is missing (for example because an earlier run failed), the job is retried on subsequent runs until it succeeds. You can trigger the workflow by hand or schedule it on the KNIME Server.
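The core "self-healing" idea - compare the partitions in the source table against those already in the analytics table, then (re)build only the missing ones with an idempotent per-partition write - can be sketched in Python. The table and column names come from the scenario above; the helper names and the way the query would be executed are illustrative assumptions, not part of the workflow itself:

```python
from datetime import date

def missing_partitions(source_days, reported_days):
    """Days present in the main table but not yet in the analytics table."""
    return sorted(set(source_days) - set(reported_days))

def report_query(day):
    # INSERT OVERWRITE replaces the whole partition, so re-running a
    # previously failed day is safe (idempotent per-partition write).
    return (
        "INSERT OVERWRITE TABLE default.db_analytics "
        f"PARTITION (d_date='{day.isoformat()}') "
        "SELECT COUNT(*) AS n_rows "
        "FROM default.db_main_table "
        f"WHERE d_date='{day.isoformat()}'"
    )

# Example: day 2022-03-01 is already reported, days 2 and 3 are not.
source = [date(2022, 3, d) for d in (1, 2, 3)]
reported = [date(2022, 3, 1)]
for day in missing_partitions(source, reported):
    print(report_query(day))
```

In the actual workflow the two partition lists would come from `SHOW PARTITIONS` (or a metadata query) against each Hive table, and the generated statement would be sent through KNIME's database nodes; because each run only touches missing partitions, running it several times per day is harmless.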
Created with KNIME Analytics Platform version 4.5.1