PyRank

Data Transformation Python Packages

Python packages tagged with the GitHub topic data-transformation, sorted by relevance and listed with stars and monthly downloads.
mahmoud
glom

☄️ Python's nested data operator (and CLI), for all your declarative restructuring needs. Got data? Glom it! ☄️

17.2M 2K 72
daq-tools
commons-codec

Data decoding, encoding, conversion, and translation utilities.

7K 2 2
bruin-data
bruin-sdk

Bruin Python SDK — eliminate boilerplate in Bruin Python assets

7K 6 0
productml
blurr-dev

Data aggregation pipeline for running real-time predictive models

5K 4 0
ironmussa
optimuspyspark

🚚 Agile Data Preparation Workflows made easy with Pandas, Dask, cuDF, Dask-cuDF, Vaex and PySpark

4K 2K 232
panodata
tikray

A compact data transformation engine.

4K 1 0
kmatarese
glide

Easy ETL

4K 17 2
scottroberts140
dsr-feature-eng-ml

Machine learning model evaluation and feature engineering framework with hyperparameter tuning, data balancing, and feature importance analysis.

4K 1 0
azukds
tubular

Python package implementing ML feature engineering and pre-processing for polars or pandas dataframes.

3K 100 27
jhd3197
tukuy

Tukuy is a robust, extensible data transformation library that leverages a flexible plugin system. It simplifies the manipulation, validation, and extraction of data across multiple formats (text, HTML, JSON, dates, numbers, and more), making it an ideal tool for building data pipelines and cleaning workflows.

1K 3 0
globaldothealth
adtl

Another data transformation language

1K 2 1
MatheusGiacomo
dataforge-dfg

Data Forge is a high-performance, CLI-first data integration tool designed to streamline the lifecycle of data from ingestion to transformation. Built with Python, it provides a robust framework for handling both ETL and ELT workflows with a focus on automation, reliability, and developer experience.

981 1 0
jinyoung4478
xtl-py

Three Excels, one transform — the Excel-to-Excel template language where the spreadsheet itself is the template.

870 1 2
Org-EthereaLogic
etherealogic-aetheriaforge

Databricks-native intelligent data transformation engine — coherence-scored Bronze/Silver/Gold with entity resolution and temporal reconciliation in a single deployable product.

665 1 0
hi-primus
pyoptimus

🚚 Agile Data Preparation Workflows made easy with Pandas, Dask, cuDF, Dask-cuDF, Vaex and PySpark

620 2K 233
artemlops
customer-segmentation-toolkit

Data transformations toolkit made from jupyter notebook: https://www.kaggle.com/fabiendaniel/customer-segmentation

556 1 0
Cydra-Tech
smelt-ai

LLM-powered structured data transformation. Batch process rows through any LLM, get back strictly typed Pydantic models.

502 2 0
productml
blurr

Data aggregation pipeline for running real-time predictive models

469 4 0
ityutin
df-and-order

Using df-and-order your interactions with dataframes become very clean and predictable.

392 3 2
mikeAdamss
tidychef

Python framework for transforming tabulated data with visual relationships into tidy data

391 1 1
brotherzhafif
pythistic

Frequency Table Conversion, Descriptive Statistics and Data Transformation Calculation Tool in Python

244 3 0
enram
vptstools

Tools to work with vertical profile time series.

237 4 1
jameshuh
dataflow-converter

A powerful data format conversion tool - CSV/JSON/XML/Excel/YAML/TSV converter with batch processing and field mapping

173 0 0
chigwell
data-convertible

A new package that provides a structured and reliable way to process user input related to common developer utilities such as JSON, Base64, URL, and hash operations. It uses an LLM to interpret user r

166 1 0
Data from PyPI, GitHub, ClickHouse, and BigQuery