shingling
Remove duplicate and near-duplicate documents using popular algorithms such as SimHash, SpotSig, and Shingling.
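
As a rough illustration of the shingling approach named above, the following minimal sketch breaks each document into overlapping word k-grams (shingles) and compares the resulting sets with Jaccard similarity. The function names (`shingles`, `jaccard`, `deduplicate`), the shingle size `k=3`, and the `threshold=0.8` cutoff are assumptions chosen for illustration, not part of this project's API. The pairwise comparison is quadratic in the number of documents; larger collections typically add MinHash or locality-sensitive hashing on top.

```python
def shingles(text, k=3):
    """Return the set of k-word shingles (contiguous word windows) in text."""
    words = text.lower().split()
    if len(words) < k:
        # Short texts yield a single shingle holding all their words.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets; two empty sets count as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def deduplicate(docs, k=3, threshold=0.8):
    """Keep a document only if it is below the threshold against all kept ones.

    k and threshold are illustrative defaults, not tuned values.
    """
    kept, kept_shingles = [], []
    for doc in docs:
        s = shingles(doc, k)
        if all(jaccard(s, t) < threshold for t in kept_shingles):
            kept.append(doc)
            kept_shingles.append(s)
    return kept
```

Calling `deduplicate(docs)` on a list of strings returns the documents in their original order with later duplicates dropped; lowering `threshold` makes the filter more aggressive.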