Chandere2

An asynchronous image/file downloader and thread archiver for Futaba-styled imageboards, such as 4chan and 8chan.


Keywords
4chan, archiver, downloader, imageboard
License
GPL-3.0+
Install
pip install Chandere2==2.4.1

Documentation

Chandere2

A utility programmed and maintained by Jakob.

A better image/file downloader and thread archiver for Futaba-styled imageboards, such as 4chan.

Chandere2 is an asynchronous rewrite of Chandere 1.0. It runs on all versions of Python newer than 3.5.

Chandere2 is free software, licensed under the GNU General Public License.


Primary Features

  • Able to scrape from multiple boards and threads at once.
  • Offers official support for 4chan, 8chan and Lainchan.
  • Capable of archiving to a SQLite3 database, as well as plaintext.

Installation

Currently, the most reliable way to install Chandere2 is through pip.

# It is recommended that you use the latest versions of pip and setuptools when installing Chandere2.
$ pip install --upgrade pip setuptools

$ pip install --upgrade chandere2

Tutorial

Chandere2 requires only one argument to run. The following command will attempt to make a connection to http://boards.4chan.org/g/ and show the response headers.

$ chandere2 /g/
CONNECTED: a.4cdn.org/g/threads.json
...

Accessing multiple boards at once is just as simple: add each additional board as another argument.

$ chandere2 /g/ /3/
...

A specific thread can also be specified by placing the thread number after the board.

$ chandere2 /g/51971506
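
To make the target syntax concrete, here is a minimal Python sketch of how a string such as "/g/51971506" could be split into a board and an optional thread number, and then mapped to a JSON endpoint. It is an illustration only, not Chandere2's actual code: the names parse_target and api_url are hypothetical, and the per-thread URL is an assumption based on 4chan's public JSON API (only the threads.json address appears in the output above).

```python
import re
from typing import Optional, Tuple

# Hypothetical pattern: "/board/" optionally followed by a thread number.
TARGET_RE = re.compile(r"^/?([A-Za-z0-9]+)/?(\d+)?$")


def parse_target(target: str) -> Tuple[str, Optional[int]]:
    """Split a target such as "/g/51971506" into (board, thread)."""
    match = TARGET_RE.match(target.strip())
    if match is None:
        raise ValueError(f"not a valid target: {target!r}")
    board, thread = match.groups()
    # No thread number means the whole board is the target.
    return board, int(thread) if thread else None


def api_url(board: str, thread: Optional[int] = None, ssl: bool = False) -> str:
    """Build an endpoint URL (assumed layout, modelled on 4chan's JSON API)."""
    scheme = "https" if ssl else "http"
    if thread is None:
        return f"{scheme}://a.4cdn.org/{board}/threads.json"
    return f"{scheme}://a.4cdn.org/{board}/thread/{thread}.json"


if __name__ == "__main__":
    for raw in ("/g/", "/3/", "/g/51971506"):
        print(raw, "->", api_url(*parse_target(raw)))
```

Keeping the thread optional mirrors the behaviour described above: a bare board scrapes everything, while a board followed by a number narrows the scrape to one thread.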

Now that we know how to specify where to scrape from, we can put the tool to use. The default mode of operation is "test connection", but Chandere2 can do more than that. To download every file in a board or thread, use the "-d" or "--download" argument.

$ chandere2 /g/51971506 -d
...

By default, this downloads everything into the current working directory. Maybe we don't want that. We can specify the output path with the "-o" or "--output" parameter.

$ chandere2 /g/51971506 -d -o Stallman

Pretty neat, but maybe we're a lainon and don't care much for 4chan. The imageboard can be specified with "-i" or "--imageboard". An alias can be used if it is listed by the "--list-imageboards" parameter.

$ chandere2 --list-imageboards
Available Imageboard Aliases: lainchan, 4chan
$ chandere2 /cyb/ -d -o Cyberpunk -i lainchan

Options

Documentation

  • -h, --help | Display a list of available command-line flags.
  • -v, --version | Display the version of Chandere2 that is currently installed.
  • --list-imageboards | List available imageboard aliases.

Scraping

  • targets | Pairs of a board and optionally a thread to connect to. If a thread is not specified, Chandere2 will attempt to scrape the entire board.
  • -i, --imageboard | Specify the imageboard to connect to. Aliases are listed with "--list-imageboards".
  • -d, --download | Crawl for and download all of the files in a board/thread.
  • -a, --archive | Crawl for and archive all of the posts in a board/thread.
  • --continuous | If Chandere2 is run with this flag, it will attempt to continuously refresh and check for new posts until a SIGINT is received, rather than quitting as soon as the task is done.
  • --ssl | Use HTTPS if available. Chandere2 does not attempt to verify the certificate of the server it is connecting to.
  • --nocap | Do not attempt to limit the number of concurrent connections. Please do not use this unless you know what you're doing.
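
For readers wondering what a cap on concurrent connections looks like in asynchronous Python, the sketch below bounds the number of simultaneous fetches with asyncio.Semaphore. It is a generic illustration, not Chandere2's implementation: fetch_all, its fetch callback and the default limit of 8 are all hypothetical, and passing max_connections=None only imitates the idea behind "--nocap".

```python
import asyncio


async def fetch_all(urls, fetch, max_connections=8):
    """Run fetch(url) for every URL, at most max_connections at a time.

    fetch is any coroutine function supplied by the caller (hypothetical).
    max_connections=None removes the cap entirely.
    """
    if max_connections is None:
        # Uncapped: every request is started immediately.
        return await asyncio.gather(*(fetch(url) for url in urls))

    semaphore = asyncio.Semaphore(max_connections)

    async def bounded(url):
        # Wait here until one of the max_connections slots is free.
        async with semaphore:
            return await fetch(url)

    # gather returns results in the same order as the input URLs.
    return await asyncio.gather(*(bounded(url) for url in urls))
```

Without a cap, a scrape of a whole board would open one connection per thread or file at once, which is the situation the warning on "--nocap" is about.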

Output

  • --debug | Show every log message during runtime. This is helpful when opening a bug report.
  • -o, --output | Designates the output directory if Chandere2 is operating in File Downloading mode, or the file to output to if Chandere2 is operating in Archiving mode. Defaults to the current working directory.
  • --output-format | Specify the format that output should be put into. Can be either "plaintext" or "sqlite".
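
To illustrate the difference between the two output formats, here is a small Python sketch that writes the same posts either to a SQLite database or as plaintext. The table layout, the post fields (board, no, body) and the function names are assumptions made for this example. Chandere2's real schema and text layout are not documented here and may differ.

```python
import sqlite3


def archive_sqlite(connection, posts):
    """Store posts in a hypothetical "posts" table, skipping duplicates."""
    with connection:  # commits on success, rolls back on error
        connection.execute(
            "CREATE TABLE IF NOT EXISTS posts ("
            "board TEXT, no INTEGER, body TEXT, PRIMARY KEY (board, no))"
        )
        # OR IGNORE keeps a repeated run from inserting a post twice.
        connection.executemany(
            "INSERT OR IGNORE INTO posts VALUES (:board, :no, :body)", posts
        )


def format_plaintext(posts):
    """Render posts as plain text, one blank-line-separated entry per post."""
    return "".join(
        f"/{post['board']}/{post['no']}\n{post['body']}\n\n" for post in posts
    )


if __name__ == "__main__":
    sample = [{"board": "g", "no": 1, "body": "first post"}]
    database = sqlite3.connect(":memory:")
    archive_sqlite(database, sample)
    print(database.execute("SELECT * FROM posts").fetchall())
    print(format_plaintext(sample), end="")
```

A database makes repeated runs cheap to deduplicate through the primary key, while plaintext is easier to read and search with ordinary tools.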

TODO

TODO