PyRank

PDF Extractor Pretrain Python Packages

Python packages with the GitHub topic pdf-extractor-pretrain. Sorted by relevance, with stars and monthly downloads.
opendatalab
mineru

Transforms complex documents like PDFs and Office docs into LLM-ready markdown/JSON for your Agentic workflows.

305K 64K 5K
opendatalab
magic-pdf

Transforms complex documents like PDFs and Office docs into LLM-ready markdown/JSON for your Agentic workflows.

76K 64K 5K
opendatalab
mineru-selfhosted-mcp

MCP bridge for a self-hosted MinerU API

5K 64K 5K
Kubenew
pdf2struct

`pdf2struct` extracts structured JSON from PDF documents.

392 1 0
opendatalab
xh-pdf-parser

Transforms complex documents like PDFs and Office docs into LLM-ready markdown/JSON for your Agentic workflows.

141 64K 5K
opendatalab
lazyllm-magic-pdf

Transforms complex documents like PDFs and Office docs into LLM-ready markdown/JSON for your Agentic workflows.

124 64K 5K

Data from PyPI, GitHub, ClickHouse, and BigQuery.