CrawlerFriend

A lightweight crawler that, given URLs and keywords, returns search results as an HTML page or as a dictionary.


Keywords
crawler, python-crawler, python-scraper, python27, scrapper
License
MIT
Install
pip install CrawlerFriend==1.0.11

Documentation

A lightweight web crawler for Python 2.7 that, given URLs and keywords, returns search results as an HTML page or as a dictionary. If you regularly visit a few websites and look for a few keywords, this Python package automates the task and opens the results as an HTML file in your web browser.

Installation

pip install CrawlerFriend

How to use?

All Results in HTML

import CrawlerFriend

urls = ["http://www.goal.com/","http://www.skysports.com/football","https://www.bbc.com/sport/football"]
keywords = ["Ronaldo","Liverpool","Salah","Real Madrid","Arsenal","Chelsea","Man United","Man City"]

crawler = CrawlerFriend.Crawler(urls, keywords)
crawler.crawl()
crawler.get_result_in_html()

The above code opens an HTML document containing the results in your web browser.

All Results as a Dictionary

result_dict = crawler.get_result()
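
The layout of the returned dictionary is not described here, so a quick way to explore it is to pretty-print it. The following is a minimal sketch that assumes only that the result is a plain dict. The helper `dump_result` and the `sample_result` data are hypothetical placeholders standing in for `result_dict`, not the package's actual output:

```python
import json


def dump_result(result):
    """Return an indented, key-sorted text rendering of a result dictionary."""
    # default=str keeps the dump from failing on values JSON cannot encode.
    return json.dumps(result, indent=2, sort_keys=True, default=str)


# Hypothetical stand-in for crawler.get_result(); the real structure may differ.
sample_result = {"Salah": ["http://example.com/a"], "Arsenal": []}
print(dump_result(sample_result))
```

In real use, pass `result_dict` to the helper instead of the sample data.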

Changing Default Arguments

By default, CrawlerFriend searches four HTML tags ('title', 'h1', 'h2', 'h3') and uses max_link_limit = 50. Both defaults can be changed by passing arguments to the constructor:

crawler = CrawlerFriend.Crawler(urls, keywords, max_link_limit=200, tags=['p','h4'])
crawler.crawl()
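
To make the effect of the tags argument concrete, the sketch below shows one way keyword matching restricted to a set of tags could work. It is an illustration built on the standard library's HTML parser, not CrawlerFriend's actual implementation. The names `TagTextCollector`, `find_keywords`, and the sample `page` are hypothetical:

```python
try:
    from html.parser import HTMLParser  # Python 3
except ImportError:
    from HTMLParser import HTMLParser  # Python 2.7


class TagTextCollector(HTMLParser):
    """Collect the text found inside a chosen set of HTML tags."""

    def __init__(self, tags):
        HTMLParser.__init__(self)
        self.tags = set(tags)
        self.depth = 0      # > 0 while inside one of the chosen tags
        self.buffer = []    # text pieces of the element being read
        self.texts = []     # finished text of each matching element

    def handle_starttag(self, tag, attrs):
        if tag in self.tags:
            self.depth += 1

    def handle_endtag(self, tag):
        if tag in self.tags and self.depth > 0:
            self.depth -= 1
            if self.depth == 0:
                # Join the pieces and collapse runs of whitespace.
                self.texts.append(" ".join("".join(self.buffer).split()))
                self.buffer = []

    def handle_data(self, data):
        if self.depth > 0:
            self.buffer.append(data)


def find_keywords(html, keywords, tags=("title", "h1", "h2", "h3")):
    """Map each keyword to the tag texts that contain it (case-insensitive)."""
    collector = TagTextCollector(tags)
    collector.feed(html)
    collector.close()
    return {
        kw: [t for t in collector.texts if kw.lower() in t.lower()]
        for kw in keywords
    }


# Hypothetical page: the <p> text is ignored unless 'p' is among the tags.
page = ("<html><title>Salah scores twice</title>"
        "<h1>Liverpool win</h1><p>Salah again</p></html>")
print(find_keywords(page, ["Salah", "Liverpool", "Chelsea"]))
```

With the default tags, the paragraph text is skipped. Calling `find_keywords(page, ["Salah"], tags=("p",))` matches only the paragraph, which mirrors how passing tags=['p','h4'] to the constructor changes where keywords are looked for.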