Bookmark Archiver: Python script that archives all of your bookmarks on the Internet Archive

Jan 22, 2022 1 min read

bookmarkarchiver

Python script that archives all of your bookmarks on the Internet Archive. Supports all major browsers.

bookmarkarchiver uses the official Save Page Now API. Anonymous users are limited to 4,000 requests per day, which should be enough to save around 200 websites. If you create a free account to the Internet Archive and log in with Chrome, Chromium, or Firefox, your single-day request limit increases to 100,000 and you should be able to save approximately 5000 websites.

As for dependencies, bookmarkarchiver uses Richard Penman’s browsercookie module with this patch applied. It also uses the Python requests library.

Usage

To use bookmarkarchiver, you need a bookmark file. You can get one by exporting them from a browser—instructions are online.

$ pip3 -r requirements.txt
$ python3 bookmarkarchiver.py --help
usage: bookmarkarchiver.py [-h] [--capture_all] [--capture_outlinks] [--capture_screenshot] [--delay_wb_availability] [--force_get]
                           [--skip_first_archive] [--email_result]
                           bookmark_file

Archives your bookmarks with the Wayback Machine.

positional arguments:
  bookmark_file         A Netscape format bookmarks file

optional arguments:
  -h, --help            show this help message and exit
  --capture_all, -a     Don't capture error pages
  --capture_outlinks, -o
                        Capture all outlinks
  --capture_screenshot, -s
                        Capture a screenshot
  --delay_wb_availability, -d
                        Delay uploading capture
  --force_get, -g       Force a GET request
  --skip_first_archive, -f
                        Don't find old captures
  --email_result, -e    Email results to user

To-Do

publish as a pip package
summary of capture status
archive to other archivers
track down mysterious crashes

GitHub

View Github

John was the first writer to have joined pythonawesome.com. He has since then inculcated very effective writing and reviewing culture at pythonawesome which rivals have found impossible to imitate.

Bookmark Archiver: Python script that archives all of your bookmarks on the Internet Archive

bookmarkarchiver

Usage

To-Do

GitHub

John

TelegramDB: A library that uses your telegram account as a database

How to TDD and testing using aws cdk for python

bookmarkarchiver

Usage

To-Do

GitHub

TelegramDB: A library that uses your telegram account as a database

How to TDD and testing using aws cdk for python

You might also like...