# IDF

Learner

Computes the inverse document frequency (idf) of each term over the given set of documents and appends a column containing the idf value. Three variants are available: smooth, normalized, and probabilistic. In the formulas below, f(D) is the total number of documents and f(d, t) is the number of documents containing term t.

The smooth idf, which is the default, is defined by: idf(t) = log(1 + (f(D) / f(d, t))).

The normalized idf is defined by: idf(t) = log(f(D) / f(d, t)).

The probabilistic idf is defined by: idf(t) = log((f(D) - f(d, t)) / f(d, t)).
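
The three formulas can be illustrated with a small Python sketch. This is not the node's implementation: the function name `idf_variants` and the list-of-token-lists input are hypothetical, the natural logarithm is assumed because the description does not state a base, and returning `None` for the probabilistic idf of a term that occurs in every document (where the formula would take log(0)) is a choice made for this sketch only.

```python
import math
from collections import Counter


def idf_variants(documents):
    """Return {term: {"smooth": ..., "normalized": ..., "probabilistic": ...}}.

    documents is a list of token lists (hypothetical input format).
    """
    n_docs = len(documents)  # f(D): total number of documents

    # f(d, t): number of documents containing term t.
    # set() ensures each document counts a term at most once.
    doc_freq = Counter()
    for tokens in documents:
        doc_freq.update(set(tokens))

    result = {}
    for term, df in doc_freq.items():
        result[term] = {
            "smooth": math.log(1 + n_docs / df),
            "normalized": math.log(n_docs / df),
            # Assumption: log(0) is undefined, so report None when the
            # term occurs in every document.
            "probabilistic": math.log((n_docs - df) / df) if df < n_docs else None,
        }
    return result


if __name__ == "__main__":
    docs = [["a", "b"], ["a", "c"], ["a", "b", "d"]]
    for term, values in sorted(idf_variants(docs).items()):
        print(term, values)
```

In this example, a term that appears in every document gets a normalized idf of 0, while its smooth idf stays positive because of the added 1 inside the logarithm.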

### Input Ports

- Type: Data. The input table, which contains terms and documents.

### Output Ports

- Type: Data. The output table, which contains terms, documents, and the corresponding idf value.

## Find here

Other Data Types > Text Processing > Frequencies

Make sure to have this extension installed:

## KNIME Textprocessing

Update site for KNIME Analytics Platform 3.7:

KNIME Analytics Platform 3.7 Update Site