PyRank

Semantic Deduplication Python Packages

Python packages with the GitHub topic semantic-deduplication. Sorted by relevance, with stars and monthly downloads.
MinishLab
semhash

Fast Multimodal Semantic Deduplication & Filtering

73K 924 56
NVIDIA
invisible-rabbit

Scalable data preprocessing and curation toolkit for LLMs

176 2K 267
NVIDIA
invisible-unicorn

Scalable data preprocessing and curation toolkit for LLMs

85 2K 267
NVIDIA
lava-ray

Scalable data preprocessing and curation toolkit for LLMs

1 2K 267
    • Data from PyPI, GitHub, ClickHouse, and BigQuery