Autofeat Generator

ashokharnal
Draft. Latest edits on Oct 29, 2021 5:53 AM
This component uses the 'autofeat' Python library to generate new features. The generated features are intended for building linear models. With these features, linear models can perform comparably to non-linear models, with the added benefit of being transparent and easy to explain and interpret.

Inputs to the component are train and test DataFrames. Missing values must be filled in before the data is fed in. Feature engineering is possible only on numeric features, and the target column must also be numeric. The component builds the model on the train data and then applies the built model to the test data. The model itself is saved to disk in pickle format under the name 'autofeat_model.pkl'.

Feature generation takes time because a feature selection process is also involved. The number of feature generation steps is an important parameter that determines the number of features: the more steps, the more features, and the greater the risk of overfitting.

Outputs from the component are the train and test data with the newly created features, plus the autofeat model built on the train data. Given the model output, you can also use the component 'Autofeat Apply' for feature generation on test data.

The component uses the Python autofeat library along with numpy and pandas. For more about the 'autofeat' library, see this paper: https://arxiv.org/pdf/1901.07329.pdf or the GitHub site: https://github.com/cod3licious/autofeat . The autofeat project is Copyright (c) 2016 by its authors and released under the MIT License (https://github.com/cod3licious/autofeat/blob/master/LICENSE).
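The fit-on-train, apply-on-test, save-to-pickle flow described above might look roughly like the following outside KNIME. This is a minimal sketch, not the component's actual script: it assumes autofeat's AutoFeatRegressor (a regressor is a guess based on the numeric target), and the helper names split_target and fit_and_apply are hypothetical.

```python
import pickle

import pandas as pd


def split_target(df: pd.DataFrame, target: str):
    """Separate the feature columns from the target column."""
    return df.drop(columns=[target]), df[target]


def fit_and_apply(train_df, test_df, target, feateng_steps=2,
                  model_path="autofeat_model.pkl"):
    """Fit autofeat on the train table, apply it to the test table, save the model."""
    # Imported lazily so split_target stays usable without autofeat installed.
    from autofeat import AutoFeatRegressor

    X_train, y_train = split_target(train_df, target)
    X_test, y_test = split_target(test_df, target)

    # Assumption: a regressor, because the target column is numeric.
    # More feateng_steps means more candidate features and longer run time.
    model = AutoFeatRegressor(feateng_steps=feateng_steps, verbose=0)
    train_out = model.fit_transform(X_train, y_train.to_numpy())  # original + new columns
    test_out = model.transform(X_test)  # reuses the fitted features, no refitting

    # Put the target back so both outputs mirror the input tables.
    train_out[target] = y_train.to_numpy()
    test_out[target] = y_test.to_numpy()

    # Persist the fitted model so it can be reused later on other data.
    with open(model_path, "wb") as fh:
        pickle.dump(model, fh)
    return train_out, test_out, model
```

A saved model can later be restored with pickle.load and its transform method called on new data with the same feature columns, which is presumably close to what 'Autofeat Apply' does.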

Component details

Input ports
  1. Type: Table
    trainData
    train data: Feed in the data used to train the feature generator. Normalized data is preferable. Missing values must be filled in beforehand. The data must also include the target column.
  2. Type: Table
    testData
    test data: Feed in the test data. Normalized data is preferable. Missing values must be filled in beforehand. The data should also include the target column.
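
The input requirements above (target column present, numeric columns only, no missing values, preferably normalized data) can be checked before the tables reach the component. The following is a minimal sketch under stated assumptions: check_inputs and standardize are hypothetical helpers, and z-score scaling with train-set statistics is only one possible reading of "normalized".

```python
import pandas as pd
from pandas.api.types import is_numeric_dtype


def check_inputs(df: pd.DataFrame, target: str) -> None:
    """Raise ValueError if the table violates the component's input requirements."""
    if target not in df.columns:
        raise ValueError(f"target column {target!r} is missing")
    non_numeric = [c for c in df.columns if not is_numeric_dtype(df[c])]
    if non_numeric:
        raise ValueError(f"non-numeric columns: {non_numeric}")
    if df.isna().any().any():
        raise ValueError("missing values must be filled in first")


def standardize(train_df: pd.DataFrame, test_df: pd.DataFrame, target: str):
    """Z-score the feature columns using statistics from the train table only."""
    features = [c for c in train_df.columns if c != target]
    mean = train_df[features].mean()
    # Constant columns have zero spread; divide by 1 to avoid NaN.
    std = train_df[features].std(ddof=0).replace(0, 1.0)
    train_out, test_out = train_df.copy(), test_df.copy()
    train_out[features] = (train_df[features] - mean) / std
    test_out[features] = (test_df[features] - mean) / std
    return train_out, test_out
```

Using the train statistics for both tables keeps information from the test data out of the preprocessing step.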
Output ports
  1. Type: Table
    TrainEngineeredFeatures
    Train data containing the generated features together with the features already present in the DataFrame.
  2. Type: Python
    Trained Model
    The autofeat model built on the train data.
  3. Type: Table
    TestEngineeredFeatures
    Test data containing the generated features together with the features already present in the DataFrame.

Used extensions & nodes

Created with KNIME Analytics Platform version 4.4.2
  • KNIME Base nodes (Trusted extension)
    KNIME AG, Zurich, Switzerland
    Version 4.4.2
    knime
  • KNIME Python Integration
    KNIME AG, Zurich, Switzerland
    Version 4.4.2
    knime
  • KNIME Quick Forms (Trusted extension)
    KNIME AG, Zurich, Switzerland
    Version 4.4.2
    knime

