PyRank

Docx Python Packages

Python packages tagged with the GitHub topic docx, sorted by relevance and shown with stars and monthly downloads.
nolze
msoffcrypto-tool

Python tool and library for decrypting and encrypting MS Office files using passwords or other keys

8.8M 616 91
docling-project
docling

Get your documents ready for gen AI

7.2M 60K 4K
Unstructured-IO
unstructured

Convert documents to structured data effortlessly. Unstructured is an open-source ETL solution for transforming complex documents into clean, structured formats for language models. Visit our website to learn more about our enterprise-grade Platform product for production-grade workflows, partitioning, enrichments, chunking and embedding.

5.4M 15K 1K
docling-project
docling-slim

Get your documents ready for gen AI

1.1M 60K 4K
pqzx
htmldocx

Convert html to docx

954K 87 60
opendatalab
mineru

Transforms complex documents like PDFs and Office docs into LLM-ready markdown/JSON for your Agentic workflows.

305K 64K 5K
dfop02
html-for-docx

Convert html to docx

222K 61 15
opendatalab
magic-pdf

Transforms complex documents like PDFs and Office docs into LLM-ready markdown/JSON for your Agentic workflows.

76K 64K 5K
yobix-ai
extractous

Fast and efficient unstructured data extraction. Written in Rust with bindings for many languages.

34K 2K 95
shloktech
md2docx-python

Simple and straightforward Python utility that converts a Microsoft Word document (`.docx`) to a Markdown file (`.md`) and vice versa. It supports multiple Markdown elements, including headings, bold and italic text, both unordered and ordered lists, and many more.

14K 48 5
shcherbak-ai
contextgem

ContextGem: Effortless LLM extraction from documents

13K 2K 156
Luizhcrs
template-engine-ia

Document normalization engine — learn a template from examples and convert any document automatically via LLM

12K 1 0
u9401066
asset-aware-mcp

Asset-Aware MCP Server — AI Agent precisely accesses tables, figures, sections from PDFs + .docx round-trip editing (DFM) with 46 tools / 13 resources, segmentation export, layout overlay, OCR preprocessing, knowledge graph (LightRAG)

11K 0 1
BramAlkema
openxml-audit

Validate Office files in pure Python with Open XML SDK parity, pytest fixtures, and CI hooks.

8K 2 0
badbye
docxpy

A pure-Python utility to extract text and images from docx files.

6K 5 4
explosion
spacy-layout

📚 Process PDFs, Word documents and more with spaCy

5K 901 64
opendatalab
mineru-selfhosted-mcp

MCP bridge for a self-hosted MinerU API

5K 64K 5K
rocklambros
any2md

Convert PDF, DOCX, HTML, and TXT files — or web pages by URL — to clean, LLM-optimized Markdown with YAML frontmatter.

5K 18 4
SecurityRonin
docx-mcp-server

MCP server for reading and editing Word (.docx) documents with track changes, comments, footnotes, and structural validation

5K 11 5
ykarapazar
word-mcp-live

The only MCP server that edits Word documents while they're open — 114 tools, live editing, tracked changes, per-action undo

4K 86 23
sunholo-data
ailang-parse

Universal document parsing and generation in AILANG. Deterministic Office (DOCX/PPTX/XLSX) extraction, AI-powered PDF/image parsing, 9-format document generation.

4K 0 1
henrihapponen
docxedit

Edit Word (.docx) documents effortlessly without changing the original formatting.

3K 23 3
turulomio
unogenerator

Generate LibreOffice files programmatically with Python and LibreOffice server instances

3K 15 0
Unstructured-IO
unstructured-cpu

Convert documents to structured data effortlessly. Unstructured is an open-source ETL solution for transforming complex documents into clean, structured formats for language models. Visit our website to learn more about our enterprise-grade Platform product for production-grade workflows, partitioning, enrichments, chunking and embedding.

3K 15K 1K
Data from PyPI, GitHub, ClickHouse, and BigQuery