PyRank

Feature Engineering Python Packages

Python packages with the GitHub topic feature-engineering. Sorted by relevance, with stars and monthly downloads.
feature-engine
feature-engine

Feature engineering and selection open-source Python library compatible with sklearn.

315K 2K 342
alteryx
featuretools

An open source Python library for automated feature engineering

245K 8K 908
apache
sf-hamilton

Apache Hamilton helps data scientists and engineers define testable, modular, self-documenting dataflows that encode lineage/tracing and metadata. Runs and scales everywhere Python does.

186K 2K 188
upgini
upgini

Data search & enrichment library for Machine Learning → Easily find and add relevant features to your ML & AI pipeline from hundreds of public and premium external data sources, including open & commercial LLMs

38K 349 26
apache
apache-hamilton

Apache Hamilton helps data scientists and engineers define testable, modular, self-documenting dataflows that encode lineage/tracing and metadata. Runs and scales everywhere Python does.

36K 2K 188
Microsoft
nni

An open source AutoML toolkit for automating the machine learning lifecycle, including feature engineering, neural architecture search, model compression, and hyper-parameter tuning.

31K 14K 2K
EpistasisLab
tpot

A Python Automated Machine Learning tool that optimizes machine learning pipelines using genetic programming.

30K 10K 2K
fraunhoferportugal
tsfel

An intuitive library to extract features from time series.

27K 1K 156
winedarksea
autots

Automated Time Series Forecasting

24K 1K 123
mdefrance
autocarver

Automatic optimal discretization pipeline

16K 10 0
dagworks-inc
sf-hamilton-sdk

Apache Hamilton helps data scientists and engineers define testable, modular, self-documenting dataflows that encode lineage/tracing and metadata. Runs and scales everywhere Python does.

15K 2K 188
predict-idlab
tsflex

Flexible time series feature extraction & processing

13K 438 28
dagworks-inc
sf-hamilton-ui

Apache Hamilton helps data scientists and engineers define testable, modular, self-documenting dataflows that encode lineage/tracing and metadata. Runs and scales everywhere Python does.

11K 2K 188
alteryx
evalml

EvalML is an AutoML library written in Python.

10K 848 93
AutoViML
featurewiz

Use advanced feature engineering strategies and select best features from your data set with a single line of code. Created by Ram Seshadri. Collaborators welcome.

10K 678 99
ThomasBury
arfs

All Relevant Feature Selection

9K 143 15
MatsMoll
aligned

The dbt of ML: Aligned describes data dependencies in ML systems and reduces technical data debt.

9K 61 2
mljar
mljar-supervised

Python package for AutoML on Tabular Data with Feature Engineering, Hyper-Parameters Tuning, Explanations and Automatic Documentation

9K 3K 434
dagworks-inc
sf-hamilton-lsp

Apache Hamilton helps data scientists and engineers define testable, modular, self-documenting dataflows that encode lineage/tracing and metadata. Runs and scales everywhere Python does.

8K 2K 188
NVIDIA-Merlin
nvtabular

NVTabular is a feature engineering and preprocessing library for tabular data designed to quickly and easily manipulate terabyte scale datasets used to train deep learning based recommender systems.

8K 1K 149
pixeltable
pixeltable

Declarative and Incremental Backend for Multimodal AI Applications

7K 2K 213
gmrukwa
divik

Divisive Intelligent K-Means algorithm (DiviK) for joint feature selection and clustering of heavily multidimensional data.

7K 14 6
alibaba
feathub-nightly

FeatHub - A stream-batch unified feature store for real-time machine learning

6K 348 60
SimonBlanke
hyperactive

A unified interface for optimization algorithms and experiments

5K 552 74
    • Data from PyPI, GitHub, ClickHouse, and BigQuery