scrape_files is a tool for scraping web content to your local machine. It currently supports two tasks: scraping HTML pages and converting them to well-formatted Markdown for easy reading, and downloading the images in a web page in a range of formats.

Scraping HTML pages to your local machine

The HTML parsing logic is similar to that of a browser's easyread extension: it trims the unnecessary decoration from a web page and keeps only the title and the article content. The main difference is that the result is downloaded as neatly formatted Markdown.
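
To make the idea concrete, here is a minimal sketch of this kind of trimming using only the Python standard library. It is not the parser scrape_files itself uses; the names ArticleToMarkdown and html_to_markdown, the list of skipped container tags, and the choice to keep only the title, h1 to h3 headings and paragraphs are all assumptions made for illustration.

```python
from html.parser import HTMLParser


class ArticleToMarkdown(HTMLParser):
    """Hypothetical helper: keep the title, headings and paragraphs only."""

    HEADINGS = {"h1": "#", "h2": "##", "h3": "###"}
    # Assumed set of "decoration" containers whose content is dropped.
    SKIPPED = {"script", "style", "nav", "header", "footer", "aside"}

    def __init__(self):
        super().__init__()
        self.title = ""
        self.blocks = []      # finished Markdown blocks, in page order
        self._tag = None      # tag whose text is currently being collected
        self._text = []       # text pieces of the current block
        self._skip_depth = 0  # > 0 while inside a skipped container

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1
        elif (self._skip_depth == 0 and self._tag is None
              and (tag in self.HEADINGS or tag in ("title", "p"))):
            self._tag, self._text = tag, []

    def handle_endtag(self, tag):
        if tag in self.SKIPPED:
            self._skip_depth = max(0, self._skip_depth - 1)
        elif tag == self._tag:
            # Collapse runs of whitespace left over from the HTML layout.
            text = " ".join("".join(self._text).split())
            if tag == "title":
                self.title = text
            elif text:
                prefix = self.HEADINGS.get(tag)
                self.blocks.append(f"{prefix} {text}" if prefix else text)
            self._tag = None

    def handle_data(self, data):
        if self._tag is not None and self._skip_depth == 0:
            self._text.append(data)


def html_to_markdown(html):
    """Return the page title and article text as a Markdown string."""
    parser = ArticleToMarkdown()
    parser.feed(html)
    parser.close()
    parts = ([f"# {parser.title}"] if parser.title else []) + parser.blocks
    return "\n\n".join(parts) + "\n"
```

Inline tags such as <b> or <a> are flattened to plain text in this sketch; a fuller converter would map them to Markdown emphasis and links.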

It also supports concurrently scraping the links found under <p> tags in the current page.
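
One way to picture the link scraping is the sketch below: collect the links that sit inside <p> tags, then fetch them with a thread pool. This is an illustration under stated assumptions, not the tool's actual code. The names ParagraphLinks, paragraph_links, fetch and scrape_linked_pages are hypothetical, and the use of a thread pool is an assumption about how the concurrency could be done.

```python
from concurrent.futures import ThreadPoolExecutor
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.request import urlopen


class ParagraphLinks(HTMLParser):
    """Hypothetical helper: collect hrefs of <a> tags found inside <p> tags."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []
        self._in_p = False

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            self._in_p = True
        elif tag == "a" and self._in_p:
            href = dict(attrs).get("href")
            if href:
                url = urljoin(self.base_url, href)  # resolve relative links
                if url not in self.links:
                    self.links.append(url)

    def handle_endtag(self, tag):
        if tag == "p":
            self._in_p = False


def paragraph_links(page_url, html):
    """Return absolute, de-duplicated link URLs found under <p> tags."""
    parser = ParagraphLinks(page_url)
    parser.feed(html)
    parser.close()
    return parser.links


def fetch(url):
    """Download one page and return its text."""
    with urlopen(url, timeout=10) as response:
        return response.read().decode("utf-8", errors="replace")


def scrape_linked_pages(page_url, html, fetcher=fetch, max_workers=8):
    """Fetch every paragraph link concurrently; returns {url: page text}."""
    links = paragraph_links(page_url, html)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        pages = list(pool.map(fetcher, links))
    return dict(zip(links, pages))
```

The fetcher parameter exists only so the function can be tried without network access, for example by passing a stub in place of fetch.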

Terminal usage:

scrape html <url>     # specify a url for scraping
scrape html <url> -d  # specify a directory name for saving files in the current folder
scrape html <url> -l  # specify a level: 1 by default for the current page; 2 for links in the current page

Scraping images to your local machine

Images are scraped and downloaded concurrently. Supported formats: jpg, png, gif, svg, jpeg, webp; defaults to all supported formats.
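
As a rough illustration of format filtering combined with concurrent downloads, the sketch below uses only the Python standard library. It is not taken from the scrape_files source: the names ImageSources, image_urls, download and download_images are hypothetical, and matching a format by the file extension in the URL path is an assumption.

```python
import os
from concurrent.futures import ThreadPoolExecutor
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse
from urllib.request import urlopen

# Formats listed in this README; all of them are used by default.
SUPPORTED_FORMATS = ("jpg", "png", "gif", "svg", "jpeg", "webp")


class ImageSources(HTMLParser):
    """Hypothetical helper: collect the src attribute of every <img> tag."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            src = dict(attrs).get("src")
            if src:
                self.sources.append(src)


def image_urls(page_url, html, formats=SUPPORTED_FORMATS):
    """Return absolute image URLs whose file extension is in formats."""
    parser = ImageSources()
    parser.feed(html)
    parser.close()
    urls = []
    for src in parser.sources:
        url = urljoin(page_url, src)
        # Assumption: the format is taken from the extension in the URL path.
        extension = os.path.splitext(urlparse(url).path)[1].lstrip(".").lower()
        if extension in formats and url not in urls:
            urls.append(url)
    return urls


def download(url, directory):
    """Save one image into directory and return the path written."""
    target = os.path.join(directory, os.path.basename(urlparse(url).path))
    with urlopen(url, timeout=10) as response, open(target, "wb") as out:
        out.write(response.read())
    return target


def download_images(page_url, html, directory, formats=SUPPORTED_FORMATS, max_workers=8):
    """Download all matching images concurrently with a thread pool."""
    os.makedirs(directory, exist_ok=True)
    urls = image_urls(page_url, html, formats)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(lambda url: download(url, directory), urls))
```

Passing a narrower tuple such as ("png", "svg") as formats restricts the download to those formats, which mirrors what the format option described in this section is for.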

Terminal usage:

scrape image <url>     # specify a url for scraping
scrape image <url> -d  # specify a directory name for saving files in the current folder
scrape image <url> -f  # specify image formats separated with space

Installation

pip install scrape_files

scrape_files
Release 0.1.5
