PyRank

Near Duplicate Python Packages

Python packages with the GitHub topic near-duplicate. Sorted by relevance, with stars and monthly downloads.
KenObata/distributed-curator

Partition-aware MinHash LSH deduplication for large-scale text data curation on Apache Spark
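
For readers unfamiliar with the technique named in the description above, the following is a minimal single-machine sketch of MinHash with LSH banding in pure Python. It is not the distributed-curator API and does not cover Spark or partition awareness; every function name and parameter here (shingles, minhash_signature, lsh_candidate_pairs, estimated_jaccard, num_perm, bands) is hypothetical and chosen only for illustration.

```python
import hashlib
from collections import defaultdict
from itertools import combinations


def shingles(text, k=3):
    """Return the set of k-word shingles of a text (hypothetical helper)."""
    words = text.lower().split()
    if len(words) < k:
        return {" ".join(words)}
    return {" ".join(words[i:i + k]) for i in range(len(words) - k + 1)}


def _hash(seed, token):
    # Seeded 64-bit hash; each seed stands in for one random permutation.
    data = f"{seed}:{token}".encode("utf-8")
    return int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "big")


def minhash_signature(tokens, num_perm=64):
    """MinHash signature: the minimum hash of the token set under each seed."""
    return tuple(min(_hash(seed, t) for t in tokens) for seed in range(num_perm))


def estimated_jaccard(sig_a, sig_b):
    """Fraction of matching signature positions approximates Jaccard similarity."""
    return sum(x == y for x, y in zip(sig_a, sig_b)) / len(sig_a)


def lsh_candidate_pairs(signatures, bands=16):
    """Split each signature into bands; documents sharing any band are candidates."""
    if not signatures:
        return set()
    rows = len(next(iter(signatures.values()))) // bands
    buckets = defaultdict(set)
    for doc_id, sig in signatures.items():
        for b in range(bands):
            buckets[(b, sig[b * rows:(b + 1) * rows])].add(doc_id)
    pairs = set()
    for ids in buckets.values():
        for a, b in combinations(sorted(ids), 2):
            pairs.add((a, b))
    return pairs
```

In a typical workflow the candidate pairs are then checked with estimated_jaccard (or an exact comparison) against a similarity threshold before one document of each pair is dropped. More bands with fewer rows each make the candidate step more permissive; fewer bands with more rows make it stricter.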

2K 1 0
Data from PyPI, GitHub, ClickHouse, and BigQuery.