Save all URIs found in a page to the Wayback Machine from the CLI.


Keywords
cli, wayback-machine, webpage
License
MIT
Install
pip install wbsv==0.3.4

Documentation

wbsv


wbsv (short for "WayBack machine SavepageNow") is a CLI tool that saves a webpage, together with every URI found in it, to the Wayback Machine so that they stay archived.

Install

pip install wbsv

DEMO

demo.gif

Run & Examples

Help

$ wbsv -h
usage: wbsv [-h] [-r times] [-t] [-l level] [-O] [-V] [url ...]

CLI tool for save webpage on Wayback Machine forever.
Save webpage and one 's all URI(s) on Wayback Machine.

positional arguments:
  url                      Saving pages in order. (default: None)

optional arguments:
  -h, --help               show this help message and exit
  -r times, --retry times  Set a retry limit on failed save. (default: 3)
  -t, --only_target        Save just same domain of target (default: False)
  -l level, --level level  Set maximum recursion depth (default: 1)
  -O, --own                Only URLs with the same domain as target (default: False)
  -V, --version            show program's version number and exit

If you don't give the URL,
interactive mode will be launched.
(To quit interactive mode,
type "end", "exit", "exit()",
"break", "bye", ":q" or "finish".)

Interactive mode

$ wbsv
[[Input a target url (ex: https://google.com)]]
>>> https://www.u.tsukuba.ac.jp
[+]Target: ['https://www.u.tsukuba.ac.jp']
[+]61 URI(s) found.
[01/60]: <NOW> https://web.archive.org/web/20200412020015/https://www.u.tsukuba.ac.jp/password/
[02/60]: <FAIL> https://www.u.tsukuba.ac.jp/info_lit/tebiki.html
[03/60]: <NOW> https://web.archive.org/web/20200412020026/https://www.u.tsukuba.ac.jp/account/
...
[58/60]: <NOW> https://web.archive.org/web/20200412022608/https://www.u.tsukuba.ac.jp/phishing/
[59/60]: <FAIL> https://www.u.tsukuba.ac.jp/wordpress/wp-content/uploads/note_usingcomputerrooms.png
[60/60]: <NOW> https://web.archive.org/web/20200412022640/https://www.u.tsukuba.ac.jp/
[+]FIN!: ['https://www.u.tsukuba.ac.jp']
[+]ALL: 60, SAVE: 57, PAST: 0, FAIL: 3
>>>
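
The help text lists the words that end interactive mode. The following is a minimal sketch of how such an input loop could be written; it is not taken from wbsv's source. The names `QUIT_WORDS`, `is_quit_word` and `read_targets` are hypothetical, and stripping surrounding whitespace before comparing is an assumption.

```python
# Hypothetical sketch; wbsv's real input handling may differ.
QUIT_WORDS = {"end", "exit", "exit()", "break", "bye", ":q", "finish"}


def is_quit_word(line: str) -> bool:
    """Return True if the line is one of the quit words listed in the help text."""
    # Assumption: surrounding whitespace is ignored.
    return line.strip() in QUIT_WORDS


def read_targets(lines):
    """Yield target URLs from an iterable of input lines until a quit word appears."""
    for line in lines:
        if is_quit_word(line):
            break
        url = line.strip()
        if url:  # skip empty input
            yield url
```

Passing `sys.stdin` as `lines` would turn this into a simple prompt loop; passing a list makes it easy to test.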

From command-line arguments

$ wbsv https://tsumanne.net
[+]Target: ['https://tsumanne.net']
[+]4 URI(s) found.
[1/4]: <NOW> https://web.archive.org/web/20200412022931/https://tsumanne.net/si/
[2/4]: <NOW> https://web.archive.org/web/20200412022935/https://tsumanne.net/
[3/4]: <NOW> https://web.archive.org/web/20200412022938/https://tsumanne.net/my/
[4/4]: <NOW> https://web.archive.org/web/20200412022949/https://tsumanne.net/ct/
[+]FIN!: ['https://tsumanne.net']
[+]ALL: 4, SAVE: 4, PAST: 0, FAIL: 0
$
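
When wbsv is run from a script, the final summary line is a convenient place to check the result. The sketch below parses that line with a regular expression. It assumes the output format shown in the examples above stays the same; `SUMMARY_RE` and `parse_summary` are hypothetical names and are not part of wbsv.

```python
import re

# Matches the summary line as it appears in the example output above.
SUMMARY_RE = re.compile(r"\[\+\]ALL: (\d+), SAVE: (\d+), PAST: (\d+), FAIL: (\d+)")


def parse_summary(output: str):
    """Return the four counters as a dict, or None if no summary line is present."""
    match = SUMMARY_RE.search(output)
    if match is None:
        return None
    keys = ("all", "save", "past", "fail")
    return dict(zip(keys, map(int, match.groups())))
```

A wrapper script could, for example, exit with a non-zero status whenever the `fail` counter is greater than zero.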

Search links recursively

wbsv https://programming-place.net/ppp/contents/c/index.html -l 2
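
To illustrate what a maximum recursion depth means, here is a minimal sketch of depth-limited link collection. It is not wbsv's implementation. It assumes that level 1 covers only the links on the target page and that level 2 also follows those links once more. `collect_links` and the `get_links` callback are hypothetical names; `get_links` stands in for whatever fetches a page and extracts its links.

```python
def collect_links(start, get_links, max_level=1):
    """Collect URIs reachable from `start` by following links at most `max_level` pages deep."""
    found = {start}
    frontier = [start]  # pages whose links are expanded at the current level
    for _ in range(max_level):
        next_frontier = []
        for page in frontier:
            for link in get_links(page):
                if link not in found:  # never visit the same URI twice
                    found.add(link)
                    next_frontier.append(link)
        frontier = next_frontier
    return found


# Toy link graph used instead of real HTTP requests.
graph = {"a": ["b", "c"], "b": ["d"], "c": [], "d": ["e"]}
level1 = collect_links("a", lambda page: graph.get(page, []), max_level=1)
level2 = collect_links("a", lambda page: graph.get(page, []), max_level=2)
```

With this toy graph, `level1` contains `a`, `b` and `c`, while `level2` also contains `d`. The number of URIs can grow quickly with each extra level.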

Increase the retry limit

wbsv https://tsumanne.net -r 10
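
A retry limit can be pictured as the loop below. This is only a sketch under stated assumptions, not wbsv's code: it assumes that a failed save raises an exception and that the limit counts retries after the first attempt. `save_with_retry` and the `save` callback are hypothetical names.

```python
def save_with_retry(save, url, retries=3):
    """Call `save(url)` once, then retry up to `retries` more times if it raises."""
    last_error = None
    for _ in range(1 + retries):  # first attempt plus the allowed retries
        try:
            return save(url)
        except Exception as error:  # assumption: any exception means the save failed
            last_error = error
    raise RuntimeError(f"giving up on {url} after {retries} retries") from last_error
```

A higher limit makes a run more tolerant of temporary failures at the cost of a longer run time when a URI cannot be saved at all.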

LICENSE

MIT

Author

eggplants (haruna)