PyRank

Web Crawling Python Packages

Python packages with the GitHub topic web-crawling. Sorted by relevance, with stars and monthly downloads.
apify
crawlee

Crawlee—A web scraping and browser automation library for Python to build reliable crawlers. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, PDF, JPG, PNG, and other files from websites. Works with Parsel, BeautifulSoup, Playwright, and raw HTTP. Both headful and headless mode. With proxy rotation.

536K 9K 742
omkarcloud
botasaurus

The All in One Framework to Build Undefeatable Scrapers

37K 5K 424
omkarcloud
bota

The All in One Framework to Build Undefeatable Scrapers

30K 5K 424
omkarcloud
javascript-fixes

The All in One Framework to Build Undefeatable Scrapers

28K 5K 424
omkarcloud
botasaurus-humancursor

The All in One Framework to Build Undefeatable Scrapers

28K 5K 424
omkarcloud
botasaurus-server

The All in One Framework to Build Undefeatable Scrapers

3K 5K 424
omkarcloud
bose

The Ultimate Web Scraping Framework

648 4K 385
Changwanseo
genmine

GenBank Record downloader for taxonomists

596 8 0
INNOVINATI
microwler

A micro-framework for asynchronous deep crawls and web scraping with Python

417 13 1
michellepellon
pricetag

Extract price and currency information from unstructured text. Pure Python library for e-commerce data extraction and web scraping.

405 0 0
thesp0nge
nightcrawler-mitm

A Python program that crawls a website and tries to stress it, polluting forms with bogus data

294 26 1
r0botsorg
agentsearchcli

Give any AI agent the ability to search, crawl, and extract the web.

265 2 0
mike-gee
webtranspose

Web scraping API for building AI applications.

230 40 2
alyakhtar
katastrophe

Command Line Tool to download torrents

220 82 12
Thordata
thordata-firecrawl

Thordata Crawl – Turn any website into AI-ready data with a single API.

200 2 0
heleusbrands
insite

A lightning fast tool for crawling websites and compiling PDFs of their pages

148 1 0
sadiuysal
iflow-mcp-sadiuysal-crawl4ai-mcp-server

🕷️ A lightweight Model Context Protocol (MCP) server that exposes Crawl4AI web scraping and crawling capabilities as tools for AI agents. Similar to Firecrawl's API but self-hosted and free. Perfect for integrating web scraping into your AI workflows with OpenAI Agents SDK, Cursor, Claude Code, and other MCP-compatible tools.

147 88 11
innovinati
scrapy-googlechat

Send crawl reports from Scrapy spiders to Google Chat

131 1 1
omkarcloud
pg-cache-storage

The All in One Framework to Build Undefeatable Scrapers

126 5K 424
William-Fernandes252
astel

An asynchronous web crawling library for Python.

116 0 0
HuberTRoy
seen

A web crawling framework with JavaScript support, for everyone.

98 13 3
ZeroCool940711
new-frontera

A scalable frontier for web crawlers

72 0 0
omkarcloud
sqlite-cache-storage

The All in One Framework to Build Undefeatable Scrapers

62 5K 424
Data from PyPI, GitHub, ClickHouse, and BigQuery