Snowflake Partitioning

Component


The input table is split row-wise into two partitions, e.g. train and test data. The two partitions are available as database queries at the two output ports. To perform the partitioning, the component appends a new partitioning column, filled with a random number, to the entered table. This is necessary because the random number function is not deterministic, even with a given seed, so the random values must be stored once for both partition queries to see the same split. Once the partition queries have been consumed, use the Snowflake Partitioning Cleanup component to remove the added partitioning column.
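The approach described above can be sketched as a handful of SQL statements. The following Python snippet is only an illustration under stated assumptions, not the component's actual implementation: the helper `build_partition_sql`, the table name `SALES`, the column name `PART_RAND`, and the 0.8 split fraction are all hypothetical. It relies on Snowflake's `UNIFORM` and `RANDOM` functions to fill the helper column.

```python
def build_partition_sql(table: str, rand_col: str, first_fraction: float) -> dict:
    """Return SQL strings for a random two-way, row-wise split of `table`.

    Hypothetical helper for illustration; the names and the exact SQL
    are assumptions, not taken from the component itself.
    """
    if not 0.0 <= first_fraction <= 1.0:
        raise ValueError("first_fraction must be between 0 and 1")
    return {
        # Store one random number per row so that both partition queries
        # see the same values every time they are executed.
        "add_column": f"ALTER TABLE {table} ADD COLUMN {rand_col} FLOAT",
        "fill_column": (
            f"UPDATE {table} SET {rand_col} = "
            "UNIFORM(0::FLOAT, 1::FLOAT, RANDOM())"
        ),
        # The two partitions are plain queries over the stored column.
        "first_partition": (
            f"SELECT * FROM {table} WHERE {rand_col} < {first_fraction}"
        ),
        "second_partition": (
            f"SELECT * FROM {table} WHERE {rand_col} >= {first_fraction}"
        ),
        # Cleanup step: drop the helper column once the queries are consumed.
        "cleanup": f"ALTER TABLE {table} DROP COLUMN {rand_col}",
    }


statements = build_partition_sql("SALES", "PART_RAND", 0.8)
for name, sql in statements.items():
    print(f"{name}: {sql}")
```

Because the random values are written to the table once, the two queries are complementary: every row satisfies exactly one of the two conditions. Had the queries called the random function directly, a row could land in both partitions or in neither.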

Component details


Input ports

Type: DB Session
DB Session
DB Session

Output ports

Type: DB Data
First partition
First data partition of the entered table.
Type: DB Data
Second partition
Second data partition of the entered table.
Type: Flow Variable
Flow variables
The names of the entered table and of the random partitioning column, for use by the Snowflake Partitioning Cleanup component.


Legal

By using or downloading the component, you agree to our terms and conditions.