Conditional Box Plot

Visualizer

A box plot displays robust statistical parameters: minimum, lower quartile, median, upper quartile, and maximum. These parameters are called robust, since they are not sensitive to extreme outliers.

The conditional box plot partitions the data of a numeric column into classes according to another nominal column and creates a box plot for each of the classes.
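The partitioning step can be sketched in Python as follows. This is only an illustration of the idea, not the node's implementation; the function name partition_by_class, the column names, and the sample rows are hypothetical.

```python
from collections import defaultdict


def partition_by_class(rows, class_key, value_key):
    """Group the numeric values by the nominal class of their row."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[class_key]].append(row[value_key])
    return dict(groups)


# Hypothetical sample table: one nominal column and one numeric column.
rows = [
    {"species": "a", "length": 1.0},
    {"species": "b", "length": 5.0},
    {"species": "a", "length": 2.0},
]

# Each class gets its own list of values, i.e. its own box plot.
groups = partition_by_class(rows, "species", "length")
```

Each list in the result would then be summarized and drawn as a separate box.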

A box plot for one numerical attribute is constructed as follows. The box extends from the lower quartile (Q1) to the upper quartile (Q3), and the median is drawn as a horizontal bar inside the box. The distance between Q1 and Q3 is called the interquartile range (IQR).

Above and below the box are the so-called whiskers. They are drawn as horizontal bars at the minimum and the maximum value and are connected to the box by a dotted line. The whiskers never extend more than 1.5 * IQR beyond the box. If some data points lie below Q1 - (1.5 * IQR) or above Q3 + (1.5 * IQR), then the whiskers are drawn at the most extreme values that still lie within these limits, and the data points beyond them are drawn separately as outliers.

Outliers are divided into mild and extreme outliers. A data point p is a mild outlier if Q1 - (3 * IQR) < p < Q1 - (1.5 * IQR) or Q3 + (1.5 * IQR) < p < Q3 + (3 * IQR). In other words, mild outliers lie between 1.5 * IQR and 3 * IQR away from the box. A data point p is an extreme outlier if p < Q1 - (3 * IQR) or p > Q3 + (3 * IQR). Thus, three times the box width (IQR) marks the boundary between "mild" and "extreme" outliers. Mild outliers are painted as dots, while extreme outliers are displayed as crosses. To identify the outliers, they can be selected and hilited. This provides a quick overview of the extreme characteristics of a dataset.
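The construction above can be sketched in Python. This is a minimal sketch under stated assumptions, not the node's actual code: the function name box_plot_stats is hypothetical, the quartiles are computed with the "inclusive" method of Python's statistics.quantiles (the node may use a different quartile definition), and a point lying exactly on the 3 * IQR boundary is counted as extreme.

```python
import statistics


def box_plot_stats(values):
    """Return quartiles, whisker positions, and mild/extreme outliers."""
    data = sorted(values)
    # Assumption: "inclusive" quartiles; other definitions give other cuts.
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1

    # Limits for the whiskers (1.5 * IQR) and for extreme outliers (3 * IQR).
    lo_fence, hi_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    lo_extreme, hi_extreme = q1 - 3 * iqr, q3 + 3 * iqr

    inside, mild, extreme = [], [], []
    for v in data:
        if lo_fence <= v <= hi_fence:
            inside.append(v)   # covered by the box and whiskers
        elif lo_extreme < v < hi_extreme:
            mild.append(v)     # between 1.5 * IQR and 3 * IQR from the box
        else:
            extreme.append(v)  # at or beyond 3 * IQR (boundary: assumption)

    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "iqr": iqr,
        # Whiskers sit at the most extreme values inside the fences.
        "whiskers": (inside[0], inside[-1]),
        "mild": mild,
        "extreme": extreme,
    }


stats = box_plot_stats([1, 2, 3, 4, 5, 6, 7, 8, 9, 20, 40])
```

For this sample, Q1 is 3.5 and Q3 is 8.5, so the IQR is 5. The whiskers end at 1 and 9, the value 20 is a mild outlier, and 40 is an extreme outlier.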

The node supports custom CSS styling. Put the CSS rules into a single string and set it as the flow variable 'customCSS' in the node configuration dialog. The available classes and their descriptions are listed on our documentation page.

Input Ports

  1. Port Type: Data
    Data table containing the categories and values to be plotted in a box plot.
  2. Port Type: Data
    Data table containing the category names with colors applied. (optional)

Output Ports

  1. Port Type: Image
    SVG image of the box plot.