PyRank

Big Data Analytics Python Packages

Python packages tagged with the GitHub topic big-data-analytics, sorted by relevance and shown with stars and monthly downloads.

ydataai
ydata-profiling

1 Line of code data quality profiling & exploratory data analysis for Pandas and Spark DataFrames.

2M 14K 2K
ydataai
pandas-profiling

1 Line of code data quality profiling & exploratory data analysis for Pandas and Spark DataFrames.

648K 14K 2K
lithops-cloud
lithops

A multi-cloud framework for big data analytics and embarrassingly parallel jobs that provides a universal API for building parallel applications in the cloud ☁️🚀

181K 365 122
v6d-io
vineyard

vineyard (v6d): an in-memory immutable data manager. (Project under CNCF, TAG-Storage)

33K 951 133
v6d-io
vineyard-bdist

vineyard (v6d): an in-memory immutable data manager. (Project under CNCF, TAG-Storage)

21K 951 133
v6d-io
vineyard-ml

vineyard (v6d): an in-memory immutable data manager. (Project under CNCF, TAG-Storage)

18K 951 133
Data-Centric-AI-Community
fg-data-profiling

1 Line of code data quality profiling & exploratory data analysis for Pandas and Spark DataFrames.

10K 14K 2K
v6d-io
airflow-provider-vineyard

Vineyard provider for apache-airflow

3K 951 133
v6d-io
vineyard-io

vineyard (v6d): an in-memory immutable data manager. (Project under CNCF, TAG-Storage)

2K 951 133
OwenOrcan
yirabot

YiraBot: Simplifying Web Scraping for All. A user-friendly tool for developers and enthusiasts, offering command-line ease and Python integration. Ideal for research, SEO, and data collection.

1K 17 0
v6d-io
vineyard-dask

Vineyard integration with Dask

968 951 133
v6d-io
vineyard-llm

vineyard (v6d): an in-memory immutable data manager. (Project under CNCF, TAG-Storage)

834 951 133
pywren
pywren-ibm-cloud

A multi-cloud framework for big data analytics and embarrassingly parallel jobs that provides a universal API for building parallel applications in the cloud ☁️🚀

827 365 122
Data-Centric-AI-Community
datakit-profiling

1 Line of code data quality profiling & exploratory data analysis for Pandas and Spark DataFrames.

820 14K 2K
Data-Centric-AI-Community
test-new-data-profiling

1 Line of code data quality profiling & exploratory data analysis for Pandas and Spark DataFrames.

748 14K 2K
pandas-profiling
haiqv-profiling

1 Line of code data quality profiling & exploratory data analysis for Pandas and Spark DataFrames.

391 14K 2K
v6d-io
vineyard-kedro

vineyard (v6d): an in-memory immutable data manager. (Project under CNCF, TAG-Storage)

343 951 133
v6d-io
vineyard-migrate

Object migration drivers for vineyard

303 951 133
rapticore
rapticoressvc

Rapticore SSVC

257 9 1
v6d-io
vineyard-pyspark

vineyard (v6d): an in-memory immutable data manager. (Project under CNCF, TAG-Storage)

246 951 133
theoliverlear
crypto-trader-analysis

Analysis and ML components for the Crypto Trader platform (news sentiment, training, utilities).

207 1 0
v6d-io
vineyard-ray

vineyard (v6d): an in-memory immutable data manager. (Project under CNCF, TAG-Storage)

41 951 133
    • Data from PyPI, GitHub, ClickHouse, and BigQuery