PyRank

Data Lineage Python Packages

Python packages with the GitHub topic data-lineage. Sorted by relevance, with stars and monthly downloads.
reata
sqllineage

SQL Lineage Analysis Tool powered by Python

1.6M 2K 276
elementary-data
elementary-data

The dbt-native data observability solution for data & analytics engineers. Monitor your data pipelines in minutes. Available as self-hosted or cloud service with premium features.

1.3M 2K 217
open-metadata
openmetadata-ingestion

OpenMetadata is a unified metadata platform for data discovery, data observability, and data governance powered by a central metadata repository, in-depth column level lineage, and seamless team collaboration.

426K 14K 2K
laminlabs
lamindb

Open-source data lakehouse for biology. Context and memory for datasets and models at scale, across infrastructure. Query, trace & validate with a lineage-native lakehouse that supports bio-formats, registries & ontologies. 🍊YC S22

96K 264 25
open-metadata
openmetadata-managed-apis

OpenMetadata is a unified metadata platform for data discovery, data observability, and data governance powered by a central metadata repository, in-depth column level lineage, and seamless team collaboration.

28K 14K 2K
datajoint
datajoint

Relational Workflows: where database schemas define executable data pipelines.

22K 192 96
laminlabs
lamindb-core

Open-source data lakehouse for biology. Context and memory for datasets and models at scale, across infrastructure. Query, trace & validate with a lineage-native lakehouse that supports bio-formats, registries & ontologies. 🍊YC S22

22K 264 25
rocky-data
dagster-rocky

Rust SQL transformation engine with branches, replay, column-level lineage, compile-time type safety, and per-model cost attribution. Single static binary; adapters for Databricks, Snowflake, BigQuery, DuckDB. Apache 2.0.

15K 252 10
vmware
quickstart-vdk

One framework to develop, deploy and operate data workflows with Python and SQL.

10K 481 67
vmware
vdk-core

Versatile Data Kit SDK Core

3K 481 67
manasdutta04
openblame

Local-first AI investigation CLI for OpenMetadata data pipelines.

3K 2 0
grai-io
grai-schemas

No description available

2K 314 20
grai-io
grai-client

No description available

2K 314 20
data-drift
driftdb

Metrics Observability & Troubleshooting

2K 331 12
vmware
vdk-jupyterlab-extension

One framework to develop, deploy and operate data workflows with Python and SQL.

2K 481 67
open-metadata
openmetadata-ingestion-core

OpenMetadata is a unified metadata platform for data discovery, data observability, and data governance powered by a central metadata repository, in-depth column level lineage, and seamless team collaboration.

2K 14K 2K
open-metadata
openmetadata-airflow-managed-apis

Airflow REST APIs to create and manage DAGS

1K 14K 2K
kishanraj41
autolineage

Automatic ML data lineage tracking — zero manual logging

1K 3 0
slidoapp
dbt-superset-lineage

Make dbt docs and Apache Superset talk to one another

1K 157 22
grai-io
grai-source-dbt

No description available

1K 314 20
data-drift
datagit

Metrics Observability & Troubleshooting

1K 331 12
vmware
vdk-control-cli

One framework to develop, deploy and operate data workflows with Python and SQL.

937 481 67
vmware
vdk-lineage-model

The VDK Lineage Model plugin defines a common lineage model and classes used for managing lineage information in other VDK plugins.

879 481 67
tokern
data-lineage

Generate and Visualize Data Lineage from query history

768 327 45
Data from PyPI, GitHub, ClickHouse, and BigQuery