PyRank

Structured Data Python Packages

Python packages with the GitHub topic structured-data. Sorted by relevance, with stars and monthly downloads.

promplate
partial-json-parser

Parse partial JSON generated by LLM

6.7M 130 9
autogluon
autogluon-tabular

Fast and Accurate ML in 3 Lines of Code

1.2M 10K 1K
autogluon
autogluon-core

Fast and Accurate ML in 3 Lines of Code

1.2M 10K 1K
autogluon
autogluon-features

Fast and Accurate ML in 3 Lines of Code

1.1M 10K 1K
autogluon
autogluon-common

Fast and Accurate ML in 3 Lines of Code

1.1M 10K 1K
autogluon
autogluon-timeseries

Fast and Accurate ML in 3 Lines of Code

512K 10K 1K
autogluon
autogluon

Fast and Accurate ML in 3 Lines of Code

504K 10K 1K
autogluon
autogluon-multimodal

Fast and Accurate ML in 3 Lines of Code

447K 10K 1K
BoundaryML
baml-py

The AI framework that adds the engineering to prompt engineering (Python/TS/Ruby/Java/C#/Rust/Go compatible)

400K 8K 423
google
langextract

A Python library for extracting structured information from unstructured text using LLMs with precise source grounding and interactive visualization.

265K 37K 3K
lfoppiano
streamlit-pdf-viewer

Streamlit PDF viewer

230K 194 21
autogluon
autogluon-vision

Fast and Accurate ML in 3 Lines of Code

95K 10K 1K
autogluon
autogluon-text

Fast and Accurate ML in 3 Lines of Code

71K 10K 1K
awslabs
autogluon-extra

Fast and Accurate ML in 3 Lines of Code

29K 10K 1K
awslabs
autogluon-mxnet

Fast and Accurate ML in 3 Lines of Code

29K 10K 1K
auriti-labs
geo-optimizer-skill

GEO (Generative Engine Optimization) toolkit — audit, optimize, and make websites visible to AI search engines (ChatGPT, Perplexity, Claude, Gemini). Based on Princeton KDD 2024 research.

8K 424 47
autogluon
autogluon-eda

AutoML for Image, Text, and Tabular Data

8K 10K 1K
AIMLPM
markcrawl

Fast Python web crawler for RAG and AI ingestion. Extracts clean Markdown from any site for LLMs and vector stores.

7K 2 0
harumiWeb
exstruct

Conversion from Excel to structured JSON (tables, shapes, charts) for LLM/RAG pipelines, and autonomous Excel reading/writing by AI agents via CLI and MCP integration.

4K 145 22
cleanlab
cleanlab-studio

Client interface for all things Cleanlab Studio

4K 32 10
argrelay
argrelay

A data server to CLI tools with attribute search & Tab-completion in Bash shell

4K 23 1
osllmai
indox

The Indox Ecosystem offers integrated AI tools for data workflows. Our four components (IndoxArcg, IndoxMiner, IndoxJudge, and IndoxGen) enhance AI applications with advanced retrieval, extraction, evaluation, and generation capabilities, supporting multiple document formats and LLM providers.

2K 19 2
emcf
thepipe-api

Get clean data from tricky documents, powered by VLMs.

2K 2K 99
BigFoot3
pythia-geo

The oracle that tells you how AIs see your site — GEO/AEO audit CLI for Python

2K 0 0

Data from PyPI, GitHub, ClickHouse, and BigQuery