PyRank

Data Matching Python Packages

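Python packages tagged with the GitHub topic data-matching, sorted by relevance and shown with their GitHub stars and monthly downloads. Each entry lists the GitHub owner, the package name, its description, and a line of counts.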
J535D165
recordlinkage

A powerful and modular toolkit for record linkage and duplicate detection in Python

4.6M 1K 153
moj-analytical-services
splink

Fast, accurate and scalable probabilistic data linkage with support for multiple SQL backends

782K 2K 236
RobinL
fuzzymatcher

Record linking package that fuzzy matches two Python pandas dataframes using sqlite3 fts4

11K 286 60
maxharlow
csvmatch

🔎 Finds fuzzy matches between CSV files

8K 191 21
AI-team-UoA
pyjedai

An open-source library that leverages Python’s data science ecosystem to build powerful end-to-end Entity Resolution workflows.

2K 94 13
vintasoftware
entity-embed

PyTorch library for transforming entities like companies, products, etc. into vectors to support scalable Record Linkage / Entity Resolution using Approximate Nearest Neighbors.

908 161 16
maxharlow
textmatch

🔎 Finds fuzzy matches between datasets

361 17 0
ihmeuw
person-linkage-case-study

Person linkage case study for PyPI.

218 3 0
    • Data from PyPI, GitHub, ClickHouse, and BigQuery