PyRank

Spark Streaming Python Packages

Python packages with the GitHub topic spark-streaming. Sorted by relevance, with stars and monthly downloads.
databrickslabs
databricks-labs-dqx

Databricks framework to validate the data quality of PySpark DataFrames and tables

5.5M 414 113
databrickslabs
dbldatagen

Generate relevant synthetic data quickly for your projects. The Databricks Labs synthetic data generator (aka `dbldatagen`) can generate large simulated or synthetic data sets for testing, POCs, and other uses in Databricks environments, including Delta Live Tables pipelines.

287K 465 94
HashLoad
freeza-offset

Spark stream consumption commit in a Kafka consumer group

1K 16 1
prophecy-io
prophecy-spark-ai

High-performance AI/ML library for Spark to build and deploy your LLM applications in production.

197 51 15
Data from PyPI, GitHub, ClickHouse, and BigQuery