PyRank

OCR Recognition Python Packages

Python packages with the GitHub topic ocr-recognition. Sorted by relevance, with stars and monthly downloads.
opendataloader-project
opendataloader-pdf

PDF Parser for AI-ready data. Automate PDF accessibility. Open-source.

134K 21K 2K
run-llama
liteparse

A fast, helpful, and open-source document parser

37K 5K 341
reactor-no8
neots

NeoTextSynthesizer is a high-performance OCR training data generator.

14K 1 0
bropines
chrome-lens-py

Library for using Google Lens OCR for free from Python, via the API used in Chromium.

5K 62 8
rtr46
meikiocr

High-speed, high-accuracy, local OCR for Japanese video games

5K 77 3
StabRise
scaledp

ScaleDP is an Open-Source extension of Apache Spark for Document Processing

3K 18 1
opendataloader-project
langchain-opendataloader-pdf

A LangChain integration for OpenDataLoader PDF

3K 33 4
LATIS-DocumentAI-Group
documentai-std

The main standards for Latis Document AI project

2K 3 0
clovaai
synthtiger

Official Implementation of SynthTIGER (Synthetic Text Image Generator), ICDAR 2021

1K 573 109
gnana70
ocr-tamil

OCR Tamil is a powerful tool that can detect and recognize text in Tamil images with high accuracy on Natural Scenes

989 88 16
emedvedev
aocr

A TensorFlow model for text recognition (CNN + seq2seq with visual attention) available as a Python package and compatible with Google Cloud ML Engine.

731 1K 251
Anish-M-code
pdftotext3

A simple pdftotext conversion tool for Windows 8.1/10/11 and Fedora-, Ubuntu-, Debian-, and Arch-based Linux distros, using poppler-utils and Google's tesseract-ocr.

469 22 2
Joshuaatanu
ocr-genai-beta

This Python package allows you to perform OCR on images using Tesseract and interpret the results using generative AI models such as OpenAI's GPT, Anthropic's Claude, and Google's Gemini.

331 4 0
ianzhao05
textshot

Python tool for grabbing text via screenshot

278 2K 255
digidigital
ocrtestdata

Generate large amounts of image-based PDF test data for file-based OCR and Document Management Solutions.

265 0 0
StabRise
pyspark-pdf

PDF DataSource for Apache Spark that reads PDF files directly into a DataFrame and runs OCR on them

235 81 4
nuhmanpk
pyplatex

Simple, scalable, and ready-to-use ANPR package for Automatic Number Plate Recognition

229 6 0
microsoft
genalog

Genalog is an open-source, cross-platform Python package for generating synthetic document images with custom degradations and text alignment capabilities.

229 355 34
VerisimilitudeX
ocr-pdf2txt

Use Optical Character Recognition technology to convert scanned PDFs into TXT files locally.

207 1 0
M4cs
twohundrediq

HQ Trivia Bot for Windows Using LonelyScreen

177 4 1
tjkessler
tesseract-positional

Tool to save positional OCR data to a text file

126 0 0
snakers4
silero-ocr

Simple optical character recognition (OCR) by Silero

109 0 0
hanifabd
pisahkan-ktp

Python package for information extraction and segmentation of the Indonesian ID card (Segmentasi KTP Indonesia)

62 10 3
    • Data from PyPI, GitHub, ClickHouse, and BigQuery