Image Scraper
A simple Python module for scraping images from the web, created for AI development.
Features
- Scrape images from google.com and duckduckgo.com.
- Find duplicate images and remove them.
- Create complex image databases from the search engines' top results for each supplied keyword.
- Optionally use the Tor network with Firefox for scraping.
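The duplicate-removal feature can be illustrated with a minimal sketch. This README does not describe how the module detects duplicates, so the approach below (comparing SHA-256 hashes of file contents, which only catches byte-identical files) and the function names `file_digest` and `remove_exact_duplicates` are assumptions for illustration, not the module's actual code.

```python
import hashlib
from pathlib import Path


def file_digest(path, chunk_size=65536):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def remove_exact_duplicates(folder):
    """Delete byte-identical files in `folder`, keeping the first one by name.

    Hypothetical helper: keys.json is skipped so that only downloaded
    files are compared. Returns the list of removed paths.
    """
    seen = {}
    removed = []
    for path in sorted(Path(folder).iterdir()):
        if not path.is_file() or path.name == "keys.json":
            continue
        digest = file_digest(path)
        if digest in seen:
            # Same content as an earlier file: remove this copy.
            path.unlink()
            removed.append(path)
        else:
            seen[digest] = path
    return removed
```

Resized or re-encoded copies of the same picture have different bytes and would not be caught this way; a perceptual hash would be needed for those.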
Basic Usage
>> imageCrawler.py -k cats dogs
>> Select by number the queries to ignore:
>> ( 0 ) cats
>> ( 1 ) cats with hats
>> 1
>> Start with cats download 4000 at engines\cats
>> 100%
>> Select by number the queries to ignore:
>> ( 0 ) dogs
>> ( 1 ) dogs with hats
>> 1
>> Start with dogs download 4000 at engines\dogs
>> 100%
>> Searching duplicated...
>> END
Results:
\engines
    \cats
        \ keys.json
        \ +4000 image files
    \dogs
        \ keys.json
        \ +4000 image files
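To check a finished run, the output layout above can be walked with a few lines of Python. This is a sketch that assumes only what the listing shows: an `engines` folder with one subfolder per keyword, each holding a `keys.json` file plus the downloaded images. The function name `count_images` is hypothetical, and the contents of `keys.json` are not read because its format is not documented here.

```python
from pathlib import Path


def count_images(root="engines"):
    """Map each keyword folder under `root` to its number of files,
    not counting keys.json."""
    counts = {}
    for folder in sorted(Path(root).iterdir()):
        if folder.is_dir():
            counts[folder.name] = sum(
                1
                for item in folder.iterdir()
                if item.is_file() and item.name != "keys.json"
            )
    return counts
```

Calling `count_images("engines")` after the run shown in Basic Usage would return a dictionary with one entry for `cats` and one for `dogs`.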
Installing
(1) Install.
- Firefox
- TorBrowser (OPTIONAL).
(2) Download geckodriver and add it to your PATH. Check geckodriver compatibility.
(3) Run this command.
pip install engineCrawler
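Before the first run, it can help to confirm that the tools from the steps above are reachable. The sketch below uses the standard library's `shutil.which`; the helper name `missing_tools` and the executable names `firefox` and `geckodriver` are assumptions. On Windows in particular, Firefox is often installed without being on PATH, so a "not found" result for it does not necessarily mean it is missing.

```python
import shutil


def missing_tools(tools=("firefox", "geckodriver")):
    """Return the names in `tools` that cannot be found on PATH."""
    return [name for name in tools if shutil.which(name) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("Firefox and geckodriver were found on PATH.")
```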