PyRank

Data Pipeline Python Packages

Python packages with the GitHub topic data-pipeline. Sorted by relevance, with stars and monthly downloads.
olirice
flupy

Fluent data pipelines for python and your shell

1.4M 195 15
elementary-data
elementary-data

The dbt-native data observability solution for data & analytics engineers. Monitor your data pipelines in minutes. Available as self-hosted or cloud service with premium features.

1.3M 2K 217
pydoit
doit

CLI task management & automation tool

743K 2K 190
airbytehq
airbyte-source-declarative-manifest

Open-source data movement for ELT pipelines and AI agents — from APIs, databases & files to warehouses, lakes, and AI applications. Both self-hosted and Cloud.

324K 21K 5K
pipeline-tools
gusty

Making DAG construction easier

52K 285 13
xorq-labs
xorq

Composable expressions for data

52K 510 27
bruin-data
ingestr

ingestr is a CLI tool to copy data between any databases with a single command seamlessly.

51K 3K 119
airbytehq
airbyte-source-facebook-marketing

Open-source data movement for ELT pipelines and AI agents — from APIs, databases & files to warehouses, lakes, and AI applications. Both self-hosted and Cloud.

23K 21K 5K
airbytehq
airbyte-source-google-ads

Open-source data movement for ELT pipelines and AI agents — from APIs, databases & files to warehouses, lakes, and AI applications. Both self-hosted and Cloud.

20K 21K 5K
airbytehq
airbyte-source-s3

Open-source data movement for ELT pipelines and AI agents — from APIs, databases & files to warehouses, lakes, and AI applications. Both self-hosted and Cloud.

19K 21K 5K
airbytehq
airbyte-source-salesforce

Open-source data movement for ELT pipelines and AI agents — from APIs, databases & files to warehouses, lakes, and AI applications. Both self-hosted and Cloud.

16K 21K 5K
rocky-data
dagster-rocky

Rust SQL transformation engine with branches, replay, column-level lineage, compile-time type safety, and per-model cost attribution. Single static binary; adapters for Databricks, Snowflake, BigQuery, DuckDB. Apache 2.0.

15K 252 10
InfuseAI
piperider-nightly

Code review for data in dbt

15K 494 23
airbytehq
airbyte-source-github

Open-source data movement for ELT pipelines and AI agents — from APIs, databases & files to warehouses, lakes, and AI applications. Both self-hosted and Cloud.

14K 21K 5K
airbytehq
airbyte-source-shopify

Open-source data movement for ELT pipelines and AI agents — from APIs, databases & files to warehouses, lakes, and AI applications. Both self-hosted and Cloud.

12K 21K 5K
airbytehq
airbyte-source-google-sheets

Open-source data movement for ELT pipelines and AI agents — from APIs, databases & files to warehouses, lakes, and AI applications. Both self-hosted and Cloud.

11K 21K 5K
ryan-evans-git
ematix-flow

Move data between databases, files, and streams from Python. 5.87× faster than PySpark.

10K 0 0
airbytehq
airbyte-source-zendesk-support

Open-source data movement for ELT pipelines and AI agents — from APIs, databases & files to warehouses, lakes, and AI applications. Both self-hosted and Cloud.

10K 21K 5K
airbytehq
airbyte-source-faker

Open-source data movement for ELT pipelines and AI agents — from APIs, databases & files to warehouses, lakes, and AI applications. Both self-hosted and Cloud.

9K 21K 5K
airbytehq
airbyte-source-google-drive

Open-source data movement for ELT pipelines and AI agents — from APIs, databases & files to warehouses, lakes, and AI applications. Both self-hosted and Cloud.

9K 21K 5K
airbytehq
airbyte-source-bing-ads

Open-source data movement for ELT pipelines and AI agents — from APIs, databases & files to warehouses, lakes, and AI applications. Both self-hosted and Cloud.

8K 21K 5K
airbytehq
airbyte-source-google-analytics-data-api

Open-source data movement for ELT pipelines and AI agents — from APIs, databases & files to warehouses, lakes, and AI applications. Both self-hosted and Cloud.

8K 21K 5K
airbytehq
airbyte-source-gcs

Source implementation for Gcs.

8K 21K 5K
sparkfish
augraphy

Augmentation pipeline for rendering synthetic paper printing, faxing, scanning and copy machine processes

8K 536 61
Data from PyPI, GitHub, ClickHouse, and BigQuery