near-duplicate-detection
ISCC: International Standard Content Code
Python library for detecting near-duplicate texts in a corpus at scale.
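
To illustrate the general idea of near-duplicate detection with similarity hashes, here is a minimal sketch using only the Python standard library. It is not this library's API and not the ISCC algorithm: the function names (`shingles`, `simhash`, `hamming`, `near_duplicates`), the SimHash-style fingerprint, the 64-bit size, and the `max_distance` threshold are all assumptions chosen for illustration.

```python
import hashlib
from itertools import combinations


def shingles(text, n=3):
    """Return the set of word n-grams of a lowercased text (hypothetical helper)."""
    words = text.lower().split()
    if len(words) < n:
        return {" ".join(words)}
    return {" ".join(words[i:i + n]) for i in range(len(words) - n + 1)}


def simhash(text, bits=64):
    """Compute a SimHash-style fingerprint: similar texts tend to share most bits."""
    counts = [0] * bits
    for sh in shingles(text):
        digest = hashlib.blake2b(sh.encode("utf-8"), digest_size=bits // 8).digest()
        h = int.from_bytes(digest, "big")
        # Each shingle votes +1 or -1 on every bit position.
        for i in range(bits):
            counts[i] += 1 if (h >> i) & 1 else -1
    # A bit is set in the fingerprint when the positive votes win.
    return sum(1 << i for i in range(bits) if counts[i] > 0)


def hamming(a, b):
    """Number of differing bits between two integer fingerprints."""
    return bin(a ^ b).count("1")


def near_duplicates(corpus, max_distance=16):
    """Return pairs of keys whose fingerprints differ by at most max_distance bits.

    `corpus` maps a document key to its text. The threshold is an assumed
    value and would need tuning on real data.
    """
    fingerprints = {key: simhash(text) for key, text in corpus.items()}
    return [
        (a, b)
        for a, b in combinations(sorted(fingerprints), 2)
        if hamming(fingerprints[a], fingerprints[b]) <= max_distance
    ]
```

This sketch compares every pair of fingerprints, which is quadratic in the number of documents. Working at scale would typically require an index over the fingerprints rather than brute-force comparison.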