PyRank

Entity Resolution Python Packages

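Python packages tagged with the GitHub topic entity-resolution, sorted by relevance. Each entry lists the repository owner, the package name, and a description, followed by three figures: monthly downloads, GitHub stars, and a third repository count (likely forks).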
J535D165
recordlinkage

A powerful and modular toolkit for record linkage and duplicate detection in Python

4.6M 1K 153
moj-analytical-services
splink

Fast, accurate and scalable probabilistic data linkage with support for multiple SQL backends

782K 2K 236
dedupeio
dedupe

A Python library for accurate and scalable fuzzy matching, record deduplication and entity-resolution.

76K 4K 570
maxharlow
csvmatch

🔎 Finds fuzzy matches between CSV files

8K 191 21
benzsevern
goldenmatch

Polyglot entity-resolution + data-quality toolkit. Zero-config auto-config (negative-evidence + Path Y) hits DQbench composite 91.04 (T3 53.8% → 85.5%). Holds 0.96 DBLP-ACM, 0.94 Febrl3, 0.97 NCVR. GoldenCheck → GoldenFlow → GoldenMatch → GoldenPipe. MCP per package, multi-arch containers, Airflow DAGs, browser workbench.

7K 44 6
data61
anonlink

Python implementation of anonymous linkage using cryptographic linkage keys

7K 74 8
SkyeAv
tablassert

Extract knowledge assertions from tabular data into NCATS Translator-compliant KGX NDJSON, declaratively, with entity resolution and quality control built in.

5K 5 0
zinggAI
zingg

Scalable identity resolution, entity resolution, data mastering and deduplication using ML

5K 1K 168
cangyuanli
floof

Fuzzy matching made easy

4K 5 0
Picovoice
pvrhino

On-device Speech-to-Intent engine powered by deep learning

4K 701 95
raphschlatt
ads-and

NAND-based author name disambiguation for SAO/NASA ADS publication metadata

2K 1 0
fritshermans
deduplipy

End-to-end deduplication solution

2K 82 8
AI-team-UoA
pyjedai

An open-source library that leverages Python's data science ecosystem to build powerful end-to-end Entity Resolution workflows.

2K 94 13
benzsevern
goldenpipe

Polyglot entity-resolution + data-quality toolkit. Zero-config auto-config (negative-evidence + Path Y) hits DQbench composite 91.04 (T3 53.8% → 85.5%). Holds 0.96 DBLP-ACM, 0.94 Febrl3, 0.97 NCVR. GoldenCheck → GoldenFlow → GoldenMatch → GoldenPipe. MCP per package, multi-arch containers, Airflow DAGs, browser workbench.

2K 0 0
benzsevern
goldenflow

Polyglot entity-resolution + data-quality toolkit. Zero-config auto-config (negative-evidence + Path Y) hits DQbench composite 91.04 (T3 53.8% → 85.5%). Holds 0.96 DBLP-ACM, 0.94 Febrl3, 0.97 NCVR. GoldenCheck → GoldenFlow → GoldenMatch → GoldenPipe. MCP per package, multi-arch containers, Airflow DAGs, browser workbench.

2K 1 0
pmart123
cymbology

Identifies and validates financial security ids such as Sedol, Cusip, Isin numbers.

2K 15 1
NickCrews
mismo

The SQL/Ibis powered sklearn of record linkage.

1K 23 4
ihmeuw
easylink

A tool that allows users to build and run highly configurable record linkage/entity resolution pipelines.

933 11 0
vintasoftware
entity-embed

PyTorch library for transforming entities like companies, products, etc. into vectors to support scalable Record Linkage / Entity Resolution using Approximate Nearest Neighbors.

908 161 16
usc-isi-i2
rltk

Record Linkage ToolKit

818 112 22
Picovoice
pvrhinodemo

On-device Speech-to-Intent engine powered by deep learning

752 701 95
Org-EthereaLogic
etherealogic-aetheriaforge

Databricks-native intelligent data transformation engine: coherence-scored Bronze/Silver/Gold with entity resolution and temporal reconciliation in a single deployable product.

665 1 0
DerwenAI
strwythura

Strwythura: construct an entity-resolved knowledge graph from structured data sources and unstructured content sources, implementing an ontology pipeline, plus context engineering for optimizing AI application outcomes within a specific domain. This produces a Streamlit app, with MLOps instrumentation.

602 225 24
databricks-industry-solutions
databricks-arc

Low effort linking and easy de-duplication. Databricks ARC provides a simple, automated, lakehouse integrated entity resolution solution for intra and inter data linking.

559 53 22

Data from PyPI, GitHub, ClickHouse, and BigQuery