PyRank

ETL Framework Python Packages

Python packages with the GitHub topic etl-framework. Sorted by relevance, with stars and monthly downloads.
apache
sf-hamilton

Apache Hamilton helps data scientists and engineers define testable, modular, self-documenting dataflows that encode lineage/tracing and metadata. Runs and scales everywhere Python does.

188K 2K 188
apache
apache-hamilton

Apache Hamilton helps data scientists and engineers define testable, modular, self-documenting dataflows that encode lineage/tracing and metadata. Runs and scales everywhere Python does.

34K 2K 188
legout
flowerpower

Simple Workflow Framework based on Hamilton

15K 24 1
dagworks-inc
sf-hamilton-sdk

Apache Hamilton helps data scientists and engineers define testable, modular, self-documenting dataflows that encode lineage/tracing and metadata. Runs and scales everywhere Python does.

15K 2K 188
pathwaycom
pathway

Python ETL framework for stream processing, real-time analytics, LLM pipelines, and RAG.

12K 63K 2K
dagworks-inc
sf-hamilton-ui

Apache Hamilton helps data scientists and engineers define testable, modular, self-documenting dataflows that encode lineage/tracing and metadata. Runs and scales everywhere Python does.

11K 2K 188
dagworks-inc
sf-hamilton-lsp

Apache Hamilton helps data scientists and engineers define testable, modular, self-documenting dataflows that encode lineage/tracing and metadata. Runs and scales everywhere Python does.

8K 2K 188
dotflow-io
dotflow

🎲 Dotflow turns an idea into flow! — Lightweight Python library for execution pipelines

4K 7 8
amsdal
amsdal-glue-connections

A Python library for querying multiple databases simultaneously through a unified interface, enabling data virtualization without moving data.

4K 4 0
Mmodarre
lakehouse-plumber

The metadata-driven framework for Databricks Lakeflow Declarative Pipelines (formerly Delta Live Tables). Generates production-ready PySpark code for Lakeflow Declarative Pipelines.

3K 59 10
quintoandar
butterfree

A tool for building feature stores.

3K 318 38
amsdal
amsdal-glue-core

A Python library for querying multiple databases simultaneously through a unified interface, enabling data virtualization without moving data.

3K 4 0
amsdal
amsdal-glue

AMSDAL Glue is a Python interface providing high-level abstraction for interacting with multiple databases simultaneously, simplifying the development and maintenance process.

2K 4 0
crate
cratedb-fivetran-destination

CrateDB Fivetran Destination connector, for loading data into CrateDB.

1K 0 0
socialpoint-labs
sqlbucket

Lightweight library to write, orchestrate and test your SQL ETL. Writing ETL with data integrity in mind.

1K 74 9
RLado
canonada

Canonada is a data science framework that helps you build production-ready streaming pipelines for data processing in Python

1K 1 2
usc-isi-i2
kgtk

Knowledge Graph Toolkit

1K 419 62
datacoolie
datacoolie

Metadata-driven ETL framework for portable data pipelines across Polars, Spark, Fabric, Databricks, and AWS.

947 8 0
ContextData
vector-etl

Build super simple end-to-end data & ETL pipelines for your vector databases and Generative AI applications

713 108 19
amsdal
amsdal-glue-sql-parser

AMSDAL Glue is a Python interface providing high-level abstraction for interacting with multiple databases simultaneously, simplifying the development and maintenance process.

700 4 0
dataforgelabs
dataforge-core

Command line compiler for dataforge core projects

645 58 2
geopython
stetl

Stetl, Streaming ETL, is a lightweight geospatial processing and ETL framework written in Python.

632 88 33
dagworks-inc
sf-hamilton-contrib

Apache Hamilton helps data scientists and engineers define testable, modular, self-documenting dataflows that encode lineage/tracing and metadata. Runs and scales everywhere Python does.

411 2K 188
pyprogrammerblog
tiny-blocks

Tiny Blocks to build large and complex data pipelines!

396 3 0
Data from PyPI, GitHub, ClickHouse, and BigQuery