The Deferred Frequency Index (DFI) is a tool for string mining under frequency constraints, i.e., predicates that depend solely on the frequency of a pattern in the data. The frequency of a pattern is defined as the number of distinct sequences in a database that contain the pattern at least once. The implementation currently provides three predicates and can easily be extended with user-defined frequency predicates. Frequencies are calculated during the construction of a suffix tree over all databases, which makes it possible to limit index construction to a problem-specific minimum referred to as the optimal monotonic hull.
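To make the frequency definition above concrete, here is a minimal Python sketch. It is not DFI's code or interface: it counts by naive substring search rather than with a suffix tree, and the names `frequency` and `make_min_max_predicate`, the thresholds, and the example sequences are all hypothetical. The predicate shown is only an illustration of a frequency predicate and is not claimed to be one of the three that ship with the tool.

```python
from typing import Callable, Sequence


def frequency(pattern: str, database: Sequence[str]) -> int:
    """Count the sequences in `database` that contain `pattern` at least once.

    Repeated occurrences inside one sequence are counted only once.
    Naive substring search; DFI itself derives frequencies from a suffix tree.
    """
    return sum(1 for seq in database if pattern in seq)


def make_min_max_predicate(min_pos: int, max_neg: int) -> Callable[[int, int], bool]:
    """Build a hypothetical frequency predicate over two databases.

    The returned function sees only the two frequencies, never the pattern
    itself, which is what makes it a frequency constraint.
    """
    def predicate(freq_pos: int, freq_neg: int) -> bool:
        return freq_pos >= min_pos and freq_neg <= max_neg
    return predicate


# Two toy databases (made-up data for illustration).
positive = ["ACGTAC", "TTACGA", "ACACAC"]
negative = ["GGGTTT", "TACGTT"]

# "AC" occurs three times in "ACACAC" but that sequence is counted once.
freq_ac_pos = frequency("AC", positive)    # 3
freq_acg_pos = frequency("ACG", positive)  # 2
freq_acg_neg = frequency("ACG", negative)  # 1

accept = make_min_max_predicate(min_pos=2, max_neg=1)
is_frequent = accept(freq_acg_pos, freq_acg_neg)  # True
```

A user-defined predicate in this sketch is just another function of the per-database frequencies, so it can be swapped in without touching the counting step.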
(c) Copyright 2010 by David Weese and Marcel H. Schulz