Equal Size Sampling


Removes rows from the input data set so that the values in a categorical column are equally distributed. This can be useful, for instance, if a learning algorithm is sensitive to unequal class distributions and you want to downsize the data set so that each class value occurs equally often.

The node removes randomly chosen rows belonging to the majority classes. The output contains all rows from the minority class(es) and a random sample from each majority class, where each sample contains as many rows as the minority class.
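The behaviour described above can be illustrated with a minimal Python sketch. This is not the node's actual implementation: the function name `equal_size_sample`, its parameters, and the example rows are hypothetical, and keeping the surviving rows in their original order is an assumption of the sketch rather than documented behaviour.

```python
import random
from collections import defaultdict


def equal_size_sample(rows, class_column, seed=None):
    """Downsample rows so every class in class_column occurs equally often.

    Hypothetical helper for illustration only.
    """
    rng = random.Random(seed)

    # Group row positions by their class value.
    groups = defaultdict(list)
    for index, row in enumerate(rows):
        groups[row[class_column]].append(index)
    if not groups:
        return []

    # The smallest class determines how many rows each class keeps.
    target = min(len(indices) for indices in groups.values())

    # Draw `target` rows at random from every class; classes that already
    # have exactly `target` rows are kept in full.
    keep = set()
    for indices in groups.values():
        keep.update(rng.sample(indices, target))

    # Assumption: surviving rows stay in their original order.
    return [row for index, row in enumerate(rows) if index in keep]


# Example: class A has 6 rows, B has 3, C has 2 (the minority class).
rows = (
    [{"id": i, "label": "A"} for i in range(6)]
    + [{"id": i, "label": "B"} for i in range(6, 9)]
    + [{"id": i, "label": "C"} for i in range(9, 11)]
)
sampled = equal_size_sample(rows, "label", seed=42)
```

In this example the result holds two rows per class: both rows of class C, plus two randomly chosen rows each from A and B. Passing a fixed `seed` makes the random selection reproducible.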

Input ports

  1. Input Type: Data
    Arbitrary input data.

Output ports

  1. Downsampled input Type: Data
    The input data with fewer rows.