bing-hashing-image-downloader

Python library to download bulk images from Bing.com without repeated images


Keywords
bing, images, scraping, image, download, bulk, downloader, sha1, machine-learning
License
MIT
Install
pip install bing-hashing-image-downloader==22.8.24.1326



Bing Hashing Image Downloader


Python library for bulk-downloading images from Bing.com. As each image downloads, the downloader computes its SHA-1 hash and compares it against the hashes of previous downloads, so that no duplicate images are saved.
This package fetches URLs asynchronously, which makes downloading very fast.
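
The duplicate check described above can be sketched roughly as follows. This is a minimal illustration using only the standard library, not the package's actual code; the function name save_if_new and the in-memory set of hashes are assumptions made for the example.

```python
import hashlib
from pathlib import Path


def save_if_new(data: bytes, dest: Path, seen_hashes: set) -> bool:
    """Write data to dest unless identical bytes were already saved.

    Hypothetical helper: the name and signature are illustrative only.
    """
    digest = hashlib.sha1(data).hexdigest()
    if digest in seen_hashes:
        return False  # same bytes seen before, skip the write
    seen_hashes.add(digest)
    dest.write_bytes(data)
    return True
```

Note that a hash comparison like this only catches byte-for-byte identical files; a resized or re-encoded copy of the same picture produces a different SHA-1 digest.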

Disclaimer

This program lets you download tons of images from Bing. Please do not download or use any image that violates its copyright terms.

Installation

git clone https://github.com/korreckj328/bing_hashing_image_downloader
cd bing_hashing_image_downloader
pip install .

or

pip install bing-hashing-image-downloader

Usage

from bing_hashing_image_downloader import downloader

downloader.download(query_string, limit=100, output_dir='dataset', adult_filter_off=True,
                    timeout=60, size='none', verbose=True)

query_string : String to be searched.
limit : (optional, default is 100) Number of images to download.
output_dir : (optional, default is 'dataset') Name of output dir.
adult_filter_off : (optional, default is True) Enable or disable adult filtering.
timeout : (optional, default is 60) Connection timeout in seconds.
filter : (optional, default is "") Image type filter; choose from [line, photo, clipart, gif, transparent].
size : (optional, default is 'none') Pass a tuple to resize the images.
verbose : (optional, default is True) Print a message for each download.
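
As a rough sketch of how the parameters above might be combined for several search terms, the helper below calls a download function once per query. The helper download_queries and the example queries are hypothetical; only the keyword arguments are taken from the parameter list above. The download function is passed in as an argument so the sketch can be tried without network access.

```python
def download_queries(queries, download_fn, output_dir="dataset", limit=100):
    """Call download_fn once per query using the documented keyword arguments.

    Hypothetical helper, not part of the package.
    """
    for query in queries:
        download_fn(
            query,
            limit=limit,
            output_dir=output_dir,
            adult_filter_off=True,
            timeout=60,
            verbose=True,
        )


# Intended use (requires the package to be installed):
# from bing_hashing_image_downloader import downloader
# download_queries(["red panda", "snow leopard"], downloader.download, limit=50)
```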

You can also test the program by running test.py.


Thanks

Thanks to the original author of this program. It formed the base for these improvements that I made for my own use. You can see the original here: https://github.com/gurugaurav/bing_image_downloader