FeatureTools

by Community

An open source Python library for automated feature engineering

Added 1 June 2026

#automated-feature-engineering #automated-machine-learning #automl #data-science #feature-engineering #machine-learning #python #scikit-learn

Overview

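FeatureTools is an open source Python library for automated feature engineering. It uses Deep Feature Synthesis (DFS) to turn relational and time series data into a feature matrix for machine learning: DFS follows the relationships between tables and stacks reusable transformation and aggregation primitives (for example sum, mean, or count) along them, so common features do not have to be written by hand.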

Best for
Data scientists and ML engineers working with structured tabular data

Use cases

  • Building predictive features from transactional data
  • Automating time-based feature creation from event logs
  • Transforming multiple relational tables into a single feature matrix
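To make the event-log use case concrete, the sketch below computes time-windowed features by hand with pandas, using only events at or before a cutoff so that later data cannot leak into the features. It does not use FeatureTools itself (whose `dfs` function accepts a `cutoff_time` argument for the same purpose); the `events` table and the `features_as_of` helper are hypothetical names for illustration.

```python
import pandas as pd

# Hypothetical event log: one row per purchase.
events = pd.DataFrame(
    {
        "user_id": [1, 1, 1, 2, 2],
        "amount": [10.0, 20.0, 40.0, 5.0, 7.0],
        "time": pd.to_datetime(
            ["2024-01-01", "2024-01-05", "2024-01-20", "2024-01-02", "2024-01-09"]
        ),
    }
)


def features_as_of(events, cutoff, window_days=7):
    """Aggregate each user's events in the window (cutoff - window_days, cutoff]."""
    cutoff = pd.Timestamp(cutoff)
    start = cutoff - pd.Timedelta(days=window_days)
    # Events after the cutoff are excluded so the features never see the future.
    recent = events[(events["time"] > start) & (events["time"] <= cutoff)]
    out = recent.groupby("user_id")["amount"].agg(["count", "sum", "mean"])
    out.columns = [f"{name}_amount_last_{window_days}d" for name in out.columns]
    # Users with no events in the window are kept, with NaN feature values.
    all_users = pd.Index(events["user_id"].unique(), name="user_id")
    return out.reindex(all_users)


fm = features_as_of(events, "2024-01-10", window_days=10)
print(fm)
```

The 2024-01-20 purchase by user 1 falls after the cutoff, so it is left out of every aggregate.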

Notes

7,655 stars on GitHub. Last updated 2026-02-03. Licensed BSD-3-Clause.

Pros

  • Open source with strong community support
  • Reduces manual feature engineering time
  • Integrates well with pandas and scikit-learn

Cons

  • Can generate many irrelevant features requiring pruning
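  • Runtime and memory use can grow quickly on very large datasets, since the number of generated features increases with each added table and primitive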
  • Complex to configure for non-standard data schemas
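The pruning mentioned in the first point can be sketched in plain pandas. This is one simple approach under stated assumptions, not the library's own method: it drops constant columns and then, for each pair of highly correlated numeric columns, keeps the first and drops the second. FeatureTools also ships a `featuretools.selection` module with helpers for this task, which the sketch does not use; `prune_features`, the threshold, and the sample columns are hypothetical.

```python
import pandas as pd


def prune_features(fm, corr_threshold=0.95):
    """Drop constant columns and near-duplicate (highly correlated) numeric columns."""
    numeric = fm.select_dtypes("number")
    # A column with a single distinct value carries no signal.
    constant = [c for c in numeric.columns if numeric[c].nunique(dropna=True) <= 1]
    kept = numeric.drop(columns=constant)
    corr = kept.corr().abs()
    redundant = []
    cols = list(kept.columns)
    for i, col in enumerate(cols):
        if col in redundant:
            continue
        for other in cols[i + 1:]:
            # Keep the earlier column of each highly correlated pair.
            if other not in redundant and corr.loc[col, other] > corr_threshold:
                redundant.append(other)
    return fm.drop(columns=constant + redundant)


# Hypothetical generated feature matrix.
fm = pd.DataFrame(
    {
        "sum_amount": [30.0, 5.0, 12.0, 8.0],
        "sum_amount_cents": [3000.0, 500.0, 1200.0, 800.0],  # rescaled duplicate
        "n_tables": [1, 1, 1, 1],  # constant
        "mean_gap_days": [2.0, 7.0, 3.0, 9.0],
    }
)
pruned = prune_features(fm)
print(list(pruned.columns))
```

Correlation-based pruning ignores the prediction target, so it is usually followed by a model-based importance check before features are finally discarded.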

Indexed from the awesome-llmops list and enriched with publicly available facts about the project.