PyRank

PDF to JSON Python Packages

Python packages with the GitHub topic pdf-to-json. Sorted by relevance, with stars and monthly downloads.
docling-project
docling

Get your documents ready for gen AI

7.2M 60K 4K
Unstructured-IO
unstructured

Convert documents to structured data effortlessly. Unstructured is an open-source ETL solution for transforming complex documents into clean, structured formats for language models. Visit our website to learn more about our enterprise-grade Platform product for production-grade workflows, partitioning, enrichments, chunking, and embedding.

5.4M 15K 1K
docling-project
docling-slim

Get your documents ready for gen AI

1.1M 60K 4K
graphlit
graphlit-client

Python client library for Graphlit Platform

25K 20 3
NameetP
pdfmux

PDF extraction that checks its own work. #2 reading order accuracy — zero AI, zero GPU, zero cost.

3K 63 7
Unstructured-IO
unstructured-cpu

Convert documents to structured data effortlessly. Unstructured is an open-source ETL solution for transforming complex documents into clean, structured formats for language models. Visit our website to learn more about our enterprise-grade Platform product for production-grade workflows, partitioning, enrichments, chunking, and embedding.

3K 15K 1K
nanonets
docstrange

Extract and convert data from any document (images, PDFs, Word docs, PPT) or URL into multiple formats (Markdown, JSON, CSV, HTML) with intelligent structured data extraction and advanced OCR.

2K 1K 132
Hugues-DTANKOUO
olgadoc

Four formats. One engine. PDF, DOCX, XLSX, HTML → Markdown and typed JSON, 15–40× faster than equivalent-quality OSS. Rust core with strictly-typed Python bindings.

1K 8 0
DS4SD
docling-google-ocr

Get your documents ready for gen AI

232 60K 4K
docling-project
docling-enhanced

SDK and CLI for parsing PDF, DOCX, HTML, and more, to a unified document representation for powering downstream workflows such as gen AI applications.

95 60K 4K
DS4SD
extended-docling

Get your documents ready for gen AI

88 60K 4K
docling-project
mseep-docling

Get your documents ready for gen AI

77 60K 4K
    • Data from PyPI, GitHub, ClickHouse, and BigQuery