tululu-offline

Read tululu.org without an Internet connection


Keywords
beautifulsoup4, bs4, parser, poetry, requests, tqdm
License
MIT
Install
pip install tululu-offline==1.0.2

Documentation

Save a category of the tululu.org site for offline reading

Books library restyle

Description

The program downloads books in text format, along with their covers, from tululu.org. The following information about each book is also saved to a JSON file:

  • title;
  • author;
  • image path;
  • book path;
  • comments;
  • genres.
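
To make the list above concrete, here is a minimal sketch of what one record in the JSON file might look like. The key names (`img_src`, `book_path`, and so on) and all the values are illustrative assumptions; the project's actual schema may differ.

```python
import json

# Hypothetical record: key names and values are assumptions made for
# illustration, not the project's confirmed schema.
book_record = {
    "title": "Example Title",
    "author": "Example Author",
    "img_src": "images/1.jpg",
    "book_path": "books/1. Example Title.txt",
    "comments": ["A comment left by a reader."],
    "genres": ["Science fiction"],
}


def dump_books(books):
    """Serialize a list of book records to a JSON string.

    ensure_ascii=False keeps non-ASCII titles readable in the output.
    """
    return json.dumps(books, ensure_ascii=False, indent=2)


serialized = dump_books([book_record])
print(serialized)
```

Storing the image and book paths relative to the results directory, as assumed here, would let the generated pages be moved together with the downloaded files.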

Once the necessary data has been downloaded, an offline version of the site is generated (you can see an example here).

Installation

Install using poetry:

git clone https://github.com/velivir/tululu-offline
cd tululu-offline
make install

How to use

poetry run python3 tululu_offline/app.py [OPTIONS]

Options

  • category_url - URL of the tululu.org category to download;
  • --start_page - page number at which to start downloading;
  • --end_page - page number at which to stop downloading;
  • --dest_folder - path to the directory for the parsing results: images, books, JSON;
  • --skip_txt - do not download book texts;
  • --skip_imgs - do not download cover images;
  • --json_path - custom path to the *.json file with the results;
  • --number_of_books_per_page - number of books per page in the generated site.
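
The options above could be declared with `argparse` roughly as follows. This is a sketch only, not the code in `app.py`: the default values, the help texts, and the `str_to_bool` helper are assumptions. The skip flags are modeled as taking a value so that an invocation such as `--skip_txt true` parses.

```python
import argparse


def str_to_bool(value):
    """Interpret common textual spellings of a boolean (hypothetical helper)."""
    normalized = value.strip().lower()
    if normalized in {"true", "1", "yes"}:
        return True
    if normalized in {"false", "0", "no"}:
        return False
    raise argparse.ArgumentTypeError(f"expected a boolean, got {value!r}")


def build_parser():
    """Build a parser mirroring the documented options; defaults are assumed."""
    parser = argparse.ArgumentParser(description="Save a tululu.org category offline")
    parser.add_argument("category_url", help="URL of the category to download")
    parser.add_argument("--start_page", type=int, default=1)
    parser.add_argument("--end_page", type=int, default=None)
    parser.add_argument("--dest_folder", default=".")
    parser.add_argument("--skip_txt", type=str_to_bool, default=False)
    parser.add_argument("--skip_imgs", type=str_to_bool, default=False)
    parser.add_argument("--json_path", default=None)
    parser.add_argument("--number_of_books_per_page", type=int, default=10)
    return parser


# Parse an explicit argument list instead of sys.argv so the sketch runs anywhere.
args = build_parser().parse_args(
    ["http://tululu.org/l55/", "--start_page", "1", "--end_page", "3", "--skip_txt", "true"]
)
print(args)
```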

Example run

Run the script with the necessary parameters. For example:

poetry run python3 tululu_offline/app.py http://tululu.org/l55/ --start_page 1 --end_page 3 --skip_txt true --skip_imgs true --number_of_books_per_page 15

The first page of the library will be available at pages/index1.html.
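
The `--number_of_books_per_page` option and the `pages/index1.html` path suggest that the book list is split into fixed-size chunks, one chunk per numbered page. A minimal sketch of that idea, assuming the pages are numbered from 1 and named `index<N>.html` (the function names are hypothetical, not taken from the project):

```python
def split_into_pages(books, books_per_page):
    """Split a list of books into consecutive chunks of at most books_per_page items."""
    if books_per_page < 1:
        raise ValueError("books_per_page must be at least 1")
    return [
        books[start:start + books_per_page]
        for start in range(0, len(books), books_per_page)
    ]


def page_filename(page_number):
    """Return the assumed output path for a 1-based page number."""
    return f"pages/index{page_number}.html"


# 35 placeholder books at 15 per page give two full pages and one partial page.
books = [f"book {number}" for number in range(1, 36)]
pages = split_into_pages(books, 15)
layout = {
    page_filename(number): len(chunk)
    for number, chunk in enumerate(pages, start=1)
}
print(layout)
```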

For developers

How to install with dev dependencies

Install using poetry:

git clone https://github.com/velivir/tululu-offline
cd tululu-offline
make install_dev

Render the website

Run render_website.py with the following options:

  • category_url - URL of the tululu.org category;
  • --dest_folder - path to the directory with the parsing results: images, books, JSON;
  • --json_path - custom path to the *.json file with the results;
  • --number_of_books_per_page - number of books per page in the generated site.

Example:

poetry run python3 tululu_offline/render_website.py http://tululu.org/l55/ --number_of_books_per_page 10 --json_path result/books.json --dest_folder result

How to run the linter

make lint

How to run tests

make test

License

Tululu-offline is licensed under the MIT License. See LICENSE for more information.

Project goal

The code was written for educational purposes as part of dvmn.org, an online course for web developers.