This workflow performs a number of ETL operations on the sales2008-2011.csv dataset. Besides demonstrating typical ETL operations, its goal is to move from a series of contracts with different customers in different countries to a one-row summary for each customer.
The one-row description includes:
1. the customer's unique ID
2. the total amount of money paid by the customer to the company
3. the countries the customer has been active in
4. the date of the first contract (useful for estimating customer loyalty)
5. the number of days between the first and the last purchase, that is, the number of days the customer has been with the company
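The aggregation behind this one-row description can be sketched outside KNIME in plain Python. This is only an illustration of the grouping logic, not the workflow itself: the field names (`customer_id`, `country`, `date`, `amount`) and the sample rows are assumptions, since the actual columns of sales2008-2011.csv are not listed here.

```python
from collections import defaultdict
from datetime import date

# Hypothetical contract rows; the field names are assumptions, not the
# actual columns of sales2008-2011.csv.
contracts = [
    {"customer_id": "C1", "country": "Germany", "date": date(2008, 3, 1), "amount": 100.0},
    {"customer_id": "C1", "country": "France", "date": date(2009, 3, 1), "amount": 50.0},
    {"customer_id": "C2", "country": "Italy", "date": date(2010, 5, 10), "amount": 75.0},
]


def summarize(rows):
    """Aggregate contract rows into one summary row per customer."""
    # Collect all contract rows belonging to the same customer.
    groups = defaultdict(list)
    for row in rows:
        groups[row["customer_id"]].append(row)

    summaries = {}
    for customer_id, group in groups.items():
        dates = [r["date"] for r in group]
        first, last = min(dates), max(dates)
        summaries[customer_id] = {
            "customer_id": customer_id,
            "total_amount": sum(r["amount"] for r in group),
            "countries": sorted({r["country"] for r in group}),
            "first_contract": first,
            "days_active": (last - first).days,
        }
    return summaries


summaries = summarize(contracts)
```

Each key of `summaries` holds the five items listed above for one customer. A customer with a single contract gets `days_active` equal to 0.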
Finally, each customer's one-row summary is joined with every contract row from the original file, and the resulting table is written to a CSV file in a "data" folder located in the workflow folder.
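The join and export step can be sketched in plain Python as follows. It assumes an inner join on a customer ID. The field names, the sample values, and the helper names `join_on_customer` and `write_csv` are hypothetical and are not taken from the workflow.

```python
import csv
from pathlib import Path

# Hypothetical data; field names and values are assumptions for illustration.
contracts = [
    {"customer_id": "C1", "country": "Germany", "amount": 100.0},
    {"customer_id": "C1", "country": "France", "amount": 50.0},
    {"customer_id": "C2", "country": "Italy", "amount": 75.0},
]
summaries = {
    "C1": {"total_amount": 150.0, "days_active": 365},
    "C2": {"total_amount": 75.0, "days_active": 0},
}


def join_on_customer(rows, summary_by_id):
    """Inner join: attach each customer's summary to every contract row."""
    return [
        {**row, **summary_by_id[row["customer_id"]]}
        for row in rows
        if row["customer_id"] in summary_by_id
    ]


def write_csv(rows, path):
    """Write the joined rows to a CSV file, creating the folder if needed."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("w", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=list(rows[0].keys()))
        writer.writeheader()
        writer.writerows(rows)


joined = join_on_customer(contracts, summaries)
```

Calling `write_csv(joined, Path("data") / "joined.csv")` would then write the table into a "data" folder relative to the current directory. The file name `joined.csv` is a placeholder.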
Workflow
Transform Data using GroupBy and Joiner nodes
External resources
- KNIME Self Paced Course
- KNIME Cheat Sheet: Building a KNIME Workflow for Beginners
- What's data aggregation? - KNIME TV - YouTube
- Basic Aggregations with the GroupBy node - KNIME TV - YouTube
- ETL with KNIME. What is a Join operation - KNIME TV - YouTube
- ETL with KNIME. The Joiner Node - Part I - KNIME TV - YouTube
- ETL with KNIME. The Joiner Node - Part II - KNIME TV - YouTube
Used extensions & nodes
Created with KNIME Analytics Platform version 4.5.2