PyRank

Language Identification Python Packages

Python packages with the GitHub topic language-identification. Sorted by relevance, with stars and monthly downloads.
pemistahl
lingua-language-detector

The most accurate natural language detection library for Python, suitable for short text and mixed-language text

1.7M 2K 60
searxng
fasttext-predict

fasttext with wheels and no external dependency, but only the predict method (<1MB)

1M 19 9
adbar
py3langid

Faster, modernized fork of the language identification tool langid.py

241K 62 8
adbar
simplemma

Simple multilingual lemmatizer for Python, especially useful for speed and efficiency

120K 199 15
DoodleBears
split-lang

✨ Split text by languages (e.g. 你喜欢看アニメ吗 -> 你喜欢看 | アニメ | 吗) for NLP tasks (e.g. parse, TTS). Powered by fasttext and budoux

24K 74 11
nitotm
eld

Fast and accurate natural language detection. Detector written in Python. Nito-ELD, ELD.

6K 21 3
currentsapi
fastlangid

fastlangid, the only language identification package that supports Cantonese (zh-yue), Simplified Chinese (zh-hans), and Traditional Chinese (zh-hant)

3K 43 9
rosette-api
rosette-api

Babel Street Analytics Client Library for Python

2K 38 37
cisnlp
glotscript

[LREC 2024] 🖋 Resource and Tool for Writing System Identification

2K 21 2
DoodleBears
langdetect-py

Port of Google's language-detection library to Python.

2K 1 0
mbanon
fastspell

Targeted language identifier, based on FastText and Hunspell.

1K 38 5
textpipe
textpipe

textpipe: clean and extract metadata from text

752 302 25
mbanon
fastspell-dictionaries

Targeted language identifier, based on FastText and Hunspell.

663 38 5
rmalouf
polars-whichlang

Language identification plugin for polars

274 2 0
py-lidbox
lidbox

End-to-end spoken language identification out of the box.

240 48 13
hiredscorelabs
seqtolang

Language identification for multi-language documents

219 28 3
sagorbrur
codeswitch

CodeSwitch is an NLP tool that can be used for language identification, POS tagging, named entity recognition, and sentiment analysis of code-mixed data.

204 37 5
bung87
whatlangid

This project is built on top of whatthelang and langid

147 3 0
fievelk
pylade

PyLaDe - Language Detection tool.

133 6 1
UBC-NLP
afrolid

AfroLID, a powerful neural toolkit for African language identification that covers 517 African languages.

97 39 11
jonathandunn
geolid

Geographically-informed language identification

73 7 1
    • Data from PyPI, GitHub, ClickHouse, and BigQuery