PyRank

Document Conversion Python Packages

Python packages with the GitHub topic document-conversion. Sorted by relevance, with stars and monthly downloads.
reclamador
document-clipper

A set of utility classes and functions to process documents with Python

2K 4 3
iyulab
unhwp

High-performance Rust library for converting Korean HWP/HWPX documents to Markdown, plain text, and JSON with streaming API and .NET/Python bindings.

2K 9 2
nanonets
llm-data-converter

Convert any document format into LLM-ready data format (markdown) with advanced intelligent document processing capabilities powered by pre-trained models.

1K 7 1
psjinx
html2latex

Convert WYSIWYG HTML to LaTeX with typed ASTs, full table support, and 100% test coverage

992 17 13
herrkaefer
anything2md

Convert documents to Markdown using Cloudflare Workers AI toMarkdown.

374 1 0
nanonets
document-data-extractor

Convert any document format into LLM-ready data format (markdown) with advanced intelligent document processing capabilities powered by pre-trained models.

363 7 1
faizkhairi
file2md

Convert PDF and DOCX to clean, grep-friendly Markdown for AI/IDE workflows

151 0 0
scivision
loutils

Headless document conversion and printing using LibreOffice or Microsoft Office

151 28 0
gonzalopezgil
docx2md-cli

High-fidelity Word (.docx) to Markdown converter. Preserves tables (vMerge), footnotes, field codes, bibliography, lists, and images where Pandoc and others fall short.

150 1 0
markdownbridge
markdownbridge

Python SDK for the MarkdownBridge OCR API — convert documents, images, and PDFs to Markdown with one line of code

115 0 0
YashKasare21
docstream

Professional document conversion library (PDF ↔ LaTeX)

104 2 1
marimo-marine23
xlmelt

Convert complex Excel files into AI-readable JSON/HTML

78 0 0
Data from PyPI, GitHub, ClickHouse, and BigQuery