Equal Size Sampling


Removes rows from the input data set so that the values in a categorical column are equally distributed. This can be useful, for instance, if a learning algorithm is sensitive to unequal class distributions and you want to downsize the data set so that every class value occurs equally often.

The node removes randomly chosen rows belonging to the majority classes. The output contains all rows from the minority class(es) and a random sample from each of the majority classes, where each sample contains as many rows as the minority class.
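
The sampling described above can be illustrated outside the node with a minimal Python sketch. It is not the node's actual implementation: the function name `equal_size_sample`, the `class_of` accessor, and the `seed` parameter are hypothetical, and keeping the retained rows in their original order is an assumption made here for readability.

```python
import random
from collections import defaultdict


def equal_size_sample(rows, class_of, seed=None):
    """Downsample rows so that every class occurs equally often.

    rows     -- list of records (any type)
    class_of -- function returning the categorical value of a record
    seed     -- optional seed for a reproducible random sample
    """
    rng = random.Random(seed)

    # Group row indices by their class value.
    groups = defaultdict(list)
    for index, row in enumerate(rows):
        groups[class_of(row)].append(index)
    if not groups:
        return []

    # The smallest class determines the size of every sample.
    target = min(len(indices) for indices in groups.values())

    # Keep all rows of the smallest class(es) and a random sample
    # of the same size from each larger class.
    keep = []
    for indices in groups.values():
        keep.extend(rng.sample(indices, target))

    # Assumption: return the kept rows in their original order.
    keep.sort()
    return [rows[i] for i in keep]


# Example: 5 rows of class "a", 2 of class "b", 3 of class "c".
data = [("a", i) for i in range(5)] + [("b", i) for i in range(2)] + [("c", i) for i in range(3)]
balanced = equal_size_sample(data, class_of=lambda row: row[0], seed=42)
print(len(balanced))  # 6 rows: two per class
```

In this example the smallest class, "b", has two rows, so both are kept and two rows are drawn at random from each of "a" and "c".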

Input Ports

  1. Type: Data. Arbitrary input data.

Output Ports

  1. Type: Data. The input data with fewer rows.

Find here

Manipulation > Row > Transform

Make sure to have this extension installed: