The Document Similarity Learner builds a model that identifies a new document's most similar matches within an existing corpus of documents. It takes already preprocessed documents as input (see the Document Preprocessing Component) and outputs both the corpus of documents and a model for use with the Document Similarity Predictor Component.
- Preprocessed Documents (Type: Table): Documents which have already been preprocessed (via Document Preprocessing).
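
To make the learner/predictor split concrete, here is a minimal sketch of the general idea. It assumes TF-IDF weighting with cosine similarity, which is an illustrative choice and not necessarily the technique this component uses. The function names `learn` and `most_similar`, and the representation of each preprocessed document as a list of tokens, are hypothetical.

```python
import math
from collections import Counter


def _vectorize(tokens, idf):
    """Turn a token list into a unit-length TF-IDF vector (dict of term -> weight)."""
    counts = Counter(t for t in tokens if t in idf)  # ignore terms unseen in the corpus
    vec = {t: c * idf[t] for t, c in counts.items()}
    norm = math.sqrt(sum(w * w for w in vec.values()))
    return {t: w / norm for t, w in vec.items()} if norm else {}


def learn(corpus):
    """Hypothetical stand-in for the learner: fit on a corpus of token lists.

    Returns a model holding the IDF weights and one vector per corpus document.
    """
    n_docs = len(corpus)
    doc_freq = Counter(term for doc in corpus for term in set(doc))
    # Smoothed inverse document frequency (an assumption of this sketch).
    idf = {t: math.log((1 + n_docs) / (1 + df)) + 1.0 for t, df in doc_freq.items()}
    return {"idf": idf, "vectors": [_vectorize(doc, idf) for doc in corpus]}


def most_similar(model, tokens, top_n=3):
    """Hypothetical stand-in for the predictor: rank corpus documents for a new one.

    Returns (corpus_index, cosine_similarity) pairs, best match first.
    """
    query = _vectorize(tokens, model["idf"])
    scores = [
        (idx, sum(w * vec.get(t, 0.0) for t, w in query.items()))
        for idx, vec in enumerate(model["vectors"])
    ]
    scores.sort(key=lambda pair: pair[1], reverse=True)
    return scores[:top_n]


# Example: three already preprocessed (tokenized) documents.
corpus = [
    ["cat", "sat", "mat"],
    ["dog", "barked", "loudly"],
    ["cat", "chased", "dog"],
]
model = learn(corpus)
matches = most_similar(model, ["cat", "sat"], top_n=2)
```

In this sketch the model and the corpus vectors are returned together, mirroring how the learner hands both a corpus and a model to the predictor.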