Hunt down the secrets from the WebArchives for Fun and Profit
Updated Dec 8, 2022 - Python
Passively capture, archive, and hoard your web browsing history, including the contents of the pages you visit, for later offline viewing, replay, mirroring, data scraping, and/or indexing. Your own personal private Wayback Machine that can also archive HTTP POST requests and responses, as well as most other HTTP-level data.
Summarize web archive capture index (CDX) files.
🗄 Save an archived copy of websites from Pocket/Pinboard/Bookmarks/RSS. Outputs HTML, PDFs, and more...
A tool for collecting archival slivers of the web and web archives
A Tool to Summarize Web Archive Holdings
Miscellaneous utility scripts
A mirror of The Huddle magazine
Crawls the web to generate a huge dataset for training