PyRank

Wayback Machine Python Packages

Python packages with the GitHub topic wayback-machine. Sorted by relevance, with stars and monthly downloads.
akamhy
waybackpy

Wayback Machine API interface & a command-line tool

2.5M 579 40
ArchiveBox
archivebox

🗃 Open source self-hosted web archiving. Takes URLs/browser history/bookmarks/Pocket/Pinboard/etc., saves HTML, JS, PDFs, media, and more...

7K 28K 2K
webis-de
archive-query-log

Mining Millions of Search Result Pages of Hundreds of Search Engines from 25 Years of Web Archives.

3K 34 0
pjsier
scrapy-wayback-middleware

Scrapy middleware for submitting URLs to the Internet Archive Wayback Machine

3K 11 3
muchdogesec
history4feed

Creates a complete full text historical archive for an RSS or ATOM feed.

2K 133 5
alonebeast002
beastcrypt

JS & Source Map Secret Scanner - hunt exposed API keys, tokens & internal paths from live targets ☕

2K 0 0
bitdruid
pywaybackup

Query and download archive.org as simple as possible.

1K 112 23
Barabazs
archivooor

Archivooor is a Python package for interacting with the archive.org API.

1K 3 1
agude
wayback-machine-archiver

A Python script to submit web pages to the Wayback Machine for archiving.

1K 85 12
eggplants
wbsv

Throw all URIs in a page onto the Wayback Machine from the CLI.

793 14 5
GeiserX
wayback-archive

Download complete websites from the Wayback Machine with full asset preservation for offline viewing

776 10 6
melon-dog
wayback-utils

Wayback Machine utils (web.archive.org)

590 0 0
BGforgeNet
yawbdl

A tool to download pages from Internet Archive.

546 21 4
sbaack
archive-md-urls

Turn URLs in Markdown files into archive.org snapshots

544 7 2
claromes
waybacktweets

Archived tweets from the Wayback Machine

498 201 50
Own-Data-Privateer
hoardy-web

Passively capture, archive, and hoard your web browsing history, including the contents of the pages you visit, for later offline viewing, replay, mirroring, data scraping, and/or indexing. Your own personal private Wayback Machine that can also archive HTTP POST requests and responses, as well as most other HTTP-level data.

496 127 10
sangaline
scrapy-wayback-machine

A Scrapy middleware for scraping time series data from Archive.org's Wayback Machine.

460 122 32
sangaline
wayback-machine-scraper

A command-line utility for scraping Wayback Machine snapshots from archive.org.

398 477 81
kenlhlui
pyarchiveit

A Python library to interact with the Archive-It API

381 0 0
jfilter
get-wayback-machine

Fetch a URL via the latest Wayback Machine snapshot

240 5 1
connor-marchand
gau-python

This library gets urls from AlienVault's Open Threat Exchange, the Wayback Machine, and Common Crawl. Inspired by Corbin Leo's gau

222 3 0
shuuji3
twilog-web-archiver

Archive twilog website month list pages into Internet Archive Wayback Machine

219 3 0
GeiserX
wayback-diff

Intelligent web page comparison tool with Wayback Machine support and visual regression testing

208 1 0
Own-Data-Privateer
hoardy-web-sas

A simple archiving server for the `Hoardy-Web` Web Extension browser add-on.

199 127 10
Data from PyPI, GitHub, ClickHouse, and BigQuery