Wayback Machine Python Packages

waybackpy

Wayback Machine API interface and command-line tool

5M 592 40

archivebox

🗃 Open source self-hosted web archiving. Takes URLs/browser history/bookmarks/Pocket/Pinboard/etc., saves HTML, JS, PDFs, media, and more...

72K 28K 2K

history4feed

Creates a complete full text historical archive for an RSS or ATOM feed.

3K 137 5

archive-query-log

📜 The Archive Query Log.

3K 35 1

scrapy-wayback-middleware

Scrapy middleware for submitting URLs to the Internet Archive Wayback Machine

2K 11 3

hoardy-web

Passively capture, archive, and hoard your web browsing history, including the contents of the pages you visit, for later offline viewing, replay, mirroring, data scraping, and/or indexing. Your own personal private Wayback Machine that can also archive HTTP POST requests and responses, as well as most other HTTP-level data.

1K 132 10

pywaybackup

Query and download from archive.org as simply as possible.

989 119 25

wayback-machine-archiver

A Python script to submit web pages to the Wayback Machine for archiving.

927 86 13

waybacktweets

Retrieves CDX data for archived tweets from the Wayback Machine, parses it, and saves the results.

917 202 54

hoardy-web-sas

798 132 10

archivooor

Archivooor is a Python package for interacting with the archive.org API.

647 3 1

archive-md-urls

Turn URLs in Markdown files into archive.org snapshots

609 7 2

pyarchiveit

A Python library to interact with the Internet Archive's Archive-It Account API (https://support.archive-it.org/hc/en-us/articles/360032747311-Access-your-account-with-the-Archive-It-Partner-API)

583 0 0