PyRank

Structured Data Extraction Python Packages

Python packages with the GitHub topic structured-data-extraction. Sorted by relevance, with stars and monthly downloads.
nanonets
llm-data-converter

Convert any document format into LLM-ready data format (markdown) with advanced intelligent document processing capabilities powered by pre-trained models.

1K 7 1
abdo-Mansour
axetract

Low-Cost Cross-Domain Web Structured Information Extraction using specialized LoRA adapters.

364 15 0
nanonets
document-data-extractor

Convert any document format into LLM-ready data format (markdown) with advanced intelligent document processing capabilities powered by pre-trained models.

363 7 1
chigwell
financial-parser

A new package designed to analyze financial news headlines and extract key structured information such as company names, financial targets, timeframes, and goal updates from text inputs. It simplif

340 1 0
chigwell
text-snippet-summarizer

A new package facilitates extracting a concise, structured summary from user-provided news headlines or brief texts by utilizing pattern matching and LLM interactions. This tool aims to help researche

172 1 0
chigwell
textract-io

A new package designed to facilitate structured extraction of key information from scientific or factual text inputs, enabling precise summaries, data extraction, or categorization based on user promp

141 1 0
chigwell
dns-insight-extractor

This new package facilitates extracting structured insights from text-based content related to domain-specific issues, such as analyzing DNS blocking reports. Given unstructured text describing networ

131 1 0
chigwell
resume-yaml-builder

A new package that leverages language models to transform structured YAML data into well-formatted resume PDFs. Users provide their resume details in YAML format, and the package extracts key informat

108 1 0
chigwell
iac-summarizer

A new package that analyzes technical arguments and extracts structured summaries from text discussions about infrastructure-as-code practices. It takes user-provided text (such as forum posts, articl

98 1 0
msoedov
validex

Simplifies the retrieval, extraction, and training of structured data from various unstructured sources.

64 144 14
Data from PyPI, GitHub, ClickHouse, and BigQuery