PyRank

Data Quality Checks Python Packages

Python packages with the GitHub topic data-quality-checks. Sorted by relevance, with stars and monthly downloads.
open-metadata
openmetadata-ingestion

OpenMetadata is a unified metadata platform for data discovery, data observability, and data governance powered by a central metadata repository, in-depth column level lineage, and seamless team collaboration.

426K 14K 2K
polyaxon
traceml

Engine for ML/Data tracking, visualization, explainability, drift detection, and dashboards for Polyaxon.

134K 533 47
mouradmourafiq
pandas-summary

Engine for ML/Data tracking, visualization, explainability, drift detection, and dashboards for Polyaxon.

112K 533 47
polyaxon
datatile

Engine for ML/Data tracking, visualization, explainability, drift detection, and dashboards for Polyaxon.

112K 533 47
canimus
cuallee

Possibly the fastest DataFrame-agnostic quality check library in town.

103K 246 22
open-metadata
openmetadata-managed-apis

OpenMetadata is a unified metadata platform for data discovery, data observability, and data governance powered by a central metadata repository, in-depth column level lineage, and seamless team collaboration.

28K 14K 2K
re-data
re-data

re_data - fix data issues before your users & CEO would discover them 😊

5K 2K 125
open-metadata
openmetadata-ingestion-core

OpenMetadata is a unified metadata platform for data discovery, data observability, and data governance powered by a central metadata repository, in-depth column level lineage, and seamless team collaboration.

2K 14K 2K
open-metadata
openmetadata-airflow-managed-apis

Airflow REST APIs to create and manage DAGs

1K 14K 2K
socialpoint-labs
sqlbucket

Lightweight library to write, orchestrate and test your SQL ETL. Writing ETL with data integrity in mind.

1K 74 9
dqops
dqops

DQOps Data Quality Operations Center

1K 192 36
vahid110
dbt-dqlens

Data quality for dbt, without writing tests. Auto-generates dbt tests by profiling your database.

1K 3 0
maltzsama
sumeh

Sumeh is a unified data quality validation framework supporting multiple backends (PySpark, Dask, Polars, DuckDB, Pandas) with centralized rule configuration.

897 4 0
scienxlab
redflag

Safety net for machine learning pipelines.

665 21 6
arpitg1304
forge-robotics

Convert between robotics dataset formats (RLDS, LeRobot v2/v3, Zarr, HDF5, Rosbag). Inspect, visualize, and analyze datasets. Works with HuggingFace Hub. Built for OpenVLA, Octo, LeRobot, and Diffusion Policy workflows.

625 115 12
weiser-ai
weiser-ai

Data Quality made simple.

451 2 0
ecmwf
grib-check

A tool that validates project-specific conventions of GRIB files

389 0 2
sumanthprabhu
dqc-toolkit

Data Quality Check for Machine Learning

360 7 0
acracker
data-watchtower

A data quality checking tool for diagnosing problems in data.

321 0 1
open-metadata
openmetadata-sqlalchemy-bigquery

SQLAlchemy dialect for BigQuery by OpenMetadata

264 14K 2K
realdatadriven
etlx-wrapper

Python wrapper for ETLX CLI to run ETL workflows from Python

246 43 3
Ezzaldin97
qprofiler

Profile tabular datasets, manage automatic validation for new datasets, and automatically handle quality issues.

220 0 0
litedatum
validatelite

ValidateLite: A lightweight CLI for database schema validation and data quality checks. Ideal for CI/CD, ETL, and data pipelines.

160 3 0
AmmarYasser455
geoqa

GeoQA: Geospatial Data Quality Assessment. One-liner profiling, quality scoring, interactive reports.

147 3 0
    • Data from PyPI, GitHub, ClickHouse, and BigQuery