PyRank

Pdf2markdown Python Packages

Python packages with the GitHub topic pdf2markdown. Sorted by relevance, with stars and monthly downloads.
PaddlePaddle
paddleocr

Turn any PDF or image document into structured data for your AI. A powerful, lightweight OCR toolkit that bridges the gap between images/PDFs and LLMs. Supports 100+ languages.

2.2M 78K 10K
AdemBoukhris457
doctra

📄🔍 Parse, extract, and analyze documents with ease 📄🔍

897 205 33
PaddlePaddle
fadoudou2

Awesome OCR toolkits based on PaddlePaddle (8.6M ultra-lightweight pre-trained model; supports training and deployment across server, mobile, embedded, and IoT devices)

481 78K 10K
OpenDCAI
flash-mineru

Fast Inference Architecture for MinerU

465 53 7
PaddlePaddle
paddleocrwordleveldetection

Awesome OCR toolkits based on PaddlePaddle (8.6M ultra-lightweight pre-trained model; supports training and deployment across server, mobile, embedded, and IoT devices)

296 78K 10K
PaddlePaddle
langchain-paddleocr

Turn any PDF or image document into structured data for your AI. A powerful, lightweight OCR toolkit that bridges the gap between images/PDFs and LLMs. Supports 100+ languages.

231 78K 10K
PaddlePaddle
je-paddleocr

Turn any PDF or image document into structured data for your AI. A powerful, lightweight OCR toolkit that bridges the gap between images/PDFs and LLMs. Supports 100+ languages.

231 78K 10K
PaddlePaddle
ppocrlabel-japan

PPOCRLabelv2 is a semi-automatic graphical annotation tool for the OCR field, with a built-in PP-OCR model that automatically detects and re-recognizes data. It is written in Python 3 and PyQt5 and supports rectangular box, table, irregular text, and key information annotation modes. Annotations can be used directly for training PP-OCR detection and recognition models.

163 78K 10K
svretina
nougat-mcp

MCP server for Meta's nougat-ocr. Instruct your agent to convert academic papers to Markdown files with high mathematical accuracy.

126 3 0
PaddlePaddle
paddleocr-fagougou

Turn any PDF or image document into structured data for your AI. A powerful, lightweight OCR toolkit that bridges the gap between images/PDFs and LLMs. Supports 100+ languages.

1 78K 10K
PaddlePaddle
fadoudou

Turn any PDF or image document into structured data for your AI. A powerful, lightweight OCR toolkit that bridges the gap between images/PDFs and LLMs. Supports 100+ languages.

1 78K 10K
Data from PyPI, GitHub, ClickHouse, and BigQuery