PyRank

Web Crawler Python Packages

Python packages tagged with the GitHub topic web-crawler, sorted by relevance and shown with GitHub stars and monthly downloads.
firecrawl
firecrawl-py

🔥 Search, scrape, and clean the web for AI agents.

6.8M 121K 7K
firecrawl
firecrawl

🔥 Search, scrape, and clean the web for AI agents.

971K 121K 7K
apify
crawlee

Crawlee—A web scraping and browser automation library for Python to build reliable crawlers. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, PDF, JPG, PNG, and other files from websites. Works with Parsel, BeautifulSoup, Playwright, and raw HTTP. Both headful and headless mode. With proxy rotation.

536K 9K 742
ScrapeGraphAI
scrapegraph-py

Official Python SDK for the ScrapeGraph AI API. Smart scraping, search, crawling, markdownify, agentic browser automation, scheduled jobs, and structured data extraction

318K 76 14
scrapfly
scrapfly-sdk

Official Python SDK for the Scrapfly platform: web scraping, screenshots, AI extraction, crawling, and a remote anti-bot browser. Integrates with Scrapy, LlamaIndex, and LangChain.

256K 55 15
jpjacobpadilla
stealth-requests

Undetected web-scraping & seamless HTML parsing in Python!

45K 470 48
kreuzberg-dev
kreuzcrawl

High-performance web crawling engine with bindings for 11 languages

10K 97 12
jpjacobpadilla
search-ai-core

Search the web with advanced filters and LLM-friendly output formats!

8K 57 2
us
crw

Fast, lightweight Firecrawl alternative in Rust. Web scraper, crawler & search API with MCP server for AI agents. Drop-in Firecrawl-compatible API (/v1/scrape, /v1/crawl, /v1/search). 2.3x faster than Tavily, 1.5x faster than Firecrawl in 1K-URL benchmarks. 6 MB RAM, single binary. Self-host or use managed cloud.

6K 89 5
spider-rs
spider-rs

Spider ported to Python

5K 107 17
abo123456789
redis-queue-tool

Distributed task redisqueue (the simplest Python distributed function scheduling framework)

2K 65 19
MoonyFringers
ladon-crawl

A Python framework for building structured, resumable web crawlers — designed for domains where data quality matters.

1K 5 1
abo123456789
leek

Distributed task redisqueue (the simplest Python distributed function scheduling framework)

1K 65 19
MehmetYukselSekeroglu
hivewebcrawler

Python 3.x Web Crawler, Images, Urls, Emails, Phone numbers

1K 3 1
Algebra-FUN
wereadscan

A crawler that scans your purchased books on WeRead (微信读书) and downloads them as local PDFs

876 992 173
dotnetpower
infomesh

🕸️ Fully decentralized P2P search engine for LLMs — no API key, no billing, forever free. MCP-native web search via Kademlia DHT + libp2p + FTS5. Production stable ✅ → pip install infomesh

857 3 0
GeiserX
wayback-archive

Download complete websites from the Wayback Machine with full asset preservation for offline viewing

776 10 6
rivermont
spidy-web-crawler

The simple, easy to use command line web crawler.

659 354 69
BHM-Bob
mbapy

BA_PY: Optimize Your Workflow with Python!

658 3 1
imyourboyroy
web-scraper-toolkit

A powerful, standalone web scraping toolkit using Playwright and various parsers.

519 5 2
Kochat-framework
kochat

Open-source Korean chatbot framework

429 462 186
MoonyFringers
ladon-hackernews

Hacker News adapter for the Ladon crawler framework.

409 0 1
GoncaloMark
cobweb-lnx

CobWeb is a Python library for web scraping. The library consists of two classes: Spider and Scraper.

380 39 2
gicornachini
bolsa

A Python library intended to make it easier to access data on your stock exchange investments (B3/CEI).

341 65 18
    • Data from PyPI, GitHub, ClickHouse, and BigQuery