PyRank

Robots Txt Python Packages

Python packages with the GitHub topic robots-txt. Sorted by relevance, with stars and monthly downloads.
scrapy
protego

A pure-Python robots.txt parser with support for modern conventions.

7.4M 87 30
GateNLP
ultimate-sitemap-parser

Ultimate Website Sitemap Parser

191K 251 76
eliasdabbas
advertools

advertools - online marketing productivity and analysis tools

168K 1K 242
simonw
datasette-block-robots

Datasette plugin that blocks robots and crawlers using robots.txt

4K 7 0
nzrsky
fast-robotstxt

Fast and modern robots.txt parser and matcher (C++20). Fork of Google's library with zero-copy parsing (~30% faster) and RFC 9309 compliance fixes.

3K 5 0
OwenOrcan
yirabot

YiraBot: Simplifying Web Scraping for All. A user-friendly tool for developers and enthusiasts, offering command-line ease and Python integration. Ideal for research, SEO, and data collection.

1K 17 0
benwebber
texting-robots-py

Python binding for Texting Robots

1K 0 0
jwmorley73
jwm-robotstxt

Provides Python access to Google's parser for robots.txt files, as used by their Googlebot web scraper.

1K 1 0
jatsu
django-cs-robots

A Django app to change robots.txt from the admin panel without using the database.

594 0 0
sspoisk
agent-readiness-cli

Score any URL for AI-agent readiness — llms.txt, JSON-LD, AI-bot robots.txt, canonical, MCP, meta, sitemap

426 0 0
meysam81
sitemap-harvester

Crawl sitemap of a given website and export metadata of its pages recursively into CSV format.

339 5 0
beb7
greenflare

SEO Web Crawler and Analysis Tool

313 195 21
hanselhansel
context-linter

LLM readiness linter for websites. Audits robots.txt, llms.txt, Schema.org, and content density on a 0-100 scale. Includes MCP server. Published on PyPI: pip install context-cli.

312 3 1
KnuckleheadsClub
rbdt

rbdt is a Python library (written in Rust) for parsing robots.txt files for large-scale batch processing.

240 0 0
serpwings
pyrobotstxt

pyrobotstxt: Python Package for robots.txt Files

198 4 0
alexjc
weboptout

Opt-Out tool to check Copyright reservations in a way that even machines can understand.

174 194 1
simplecto
sitemap-grabber

A Python library to recursively crawl every sitemap.xml for a website. Also handles robots.txt and other well-knowns.

164 1 0
William-Fernandes252
astel

An asynchronous web crawling library for Python.

116 0 0
hanselhansel
aeo-cli

Agentic Engine Optimization CLI — audit URLs for AI crawler readiness

2 3 1
Data from PyPI, GitHub, ClickHouse, and BigQuery