PyRank

OCR Python Packages

Python packages with the GitHub topic ocr. Sorted by relevance, with stars and monthly downloads.
pymupdf
pymupdf

PyMuPDF is a high performance Python library for data extraction, analysis, conversion & manipulation of PDF (and other) documents.

79.5M 10K 726
run-llama
llama-cloud

Python SDK for OCR and document parsing in the cloud with LlamaParse

21.3M 29 7
Unstructured-IO
unstructured

Convert documents to structured data effortlessly. Unstructured is an open-source ETL solution for transforming complex documents into clean, structured formats for language models. Visit our website to learn more about our enterprise-grade Platform product for production-grade workflows, partitioning, enrichments, chunking and embedding.

5.4M 15K 1K
pymupdf
pymupdfb

PyMuPDF is a high performance Python library for data extraction, analysis, conversion & manipulation of PDF (and other) documents.

4.8M 10K 726
jaidedai
easyocr

Ready-to-use OCR with 80+ supported languages and all popular writing scripts, including Latin, Chinese, Arabic, Devanagari, Cyrillic, etc.

3M 29K 4K
RapidAI
rapidocr

📄 Awesome OCR toolkits for multiple programming languages, based on ONNX Runtime, OpenVINO, MNN, PaddlePaddle, TensorRT and PyTorch.

2.5M 7K 634
PaddlePaddle
paddleocr

Turn any PDF or image document into structured data for your AI. A powerful, lightweight OCR toolkit that bridges the gap between images/PDFs and LLMs. Supports 100+ languages.

2.2M 78K 10K
RapidAI
rapidocr-onnxruntime

📄 Awesome OCR toolkits for multiple programming languages, based on ONNX Runtime, OpenVINO, MNN, PaddlePaddle, TensorRT and PyTorch.

1.1M 7K 634
ocrmypdf
ocrmypdf

OCRmyPDF adds an OCR text layer to scanned PDF files, allowing them to be searched

783K 34K 2K
robocorp
rpaframework

Collection of open-source libraries and tools for Robotic Process Automation (RPA), designed to be used with both Robot Framework and Python

673K 2K 270
robocorp
rpaframework-core

Collection of open-source libraries and tools for Robotic Process Automation (RPA), designed to be used with both Robot Framework and Python

517K 2K 270
robocorp
rpaframework-pdf

Collection of open-source libraries and tools for Robotic Process Automation (RPA), designed to be used with both Robot Framework and Python

490K 2K 270
sirfz
tesserocr

A Python wrapper for the tesseract-ocr API

367K 2K 259
mindee
python-doctr

docTR (Document Text Recognition) - a seamless, high-performing & accessible library for OCR-related tasks powered by Deep Learning.

318K 6K 644
opendatalab
mineru

Transforms complex documents like PDFs and Office docs into LLM-ready markdown/JSON for your Agentic workflows.

305K 64K 5K
sml2h3
ddddocr

带带弟弟 (ddddocr): general-purpose CAPTCHA recognition OCR, PyPI version

268K 14K 2K
Layout-Parser
layoutparser

A Unified Toolkit for Deep Learning Based Document Image Analysis

170K 6K 534
opendataloader-project
opendataloader-pdf

PDF Parser for AI-ready data. Automate PDF accessibility. Open-source.

134K 21K 2K
felixdittrich92
onnxtr

OnnxTR: a docTR (Document Text Recognition) ONNX pipeline wrapper for seamless, high-performing & accessible OCR

93K 180 18
bpwhelan
gamesentenceminer

An immersion toolkit for learning Languages through games and other visual media.

79K 631 35
opendatalab
magic-pdf

Transforms complex documents like PDFs and Office docs into LLM-ready markdown/JSON for your Agentic workflows.

76K 64K 5K
breezedeus
cnocr

CnOCR: Awesome Chinese/English OCR Python toolkits based on PyTorch. It comes with 20+ well-trained models for different application scenarios and can be used directly after installation. (A Chinese/English OCR Python package based on PyTorch/MXNet.)

74K 4K 538
faustomorales
keras-ocr

A packaged and flexible version of the CRAFT text detector and Keras CRNN recognition model.

61K 1K 372
breezedeus
cnstd

CnSTD: a Python 3 package based on PyTorch/MXNet for Chinese/English Scene Text Detection (STD), Mathematical Formula Detection (MFD), and Layout Analysis

57K 791 115
Data from PyPI, GitHub, ClickHouse, and BigQuery