souperscraper

A simple web scraper base combining Beautiful Soup and Selenium


Keywords
web-scraping, scraping, easy, beautifulsoup4, beautifulsoup, bs4, selenium, selenium-webdriver, web-sc
License
Other
Install
pip install souperscraper==1.0.2

Documentation

SouperScraper

A simple web scraper base that combines BeautifulSoup and Selenium to scrape dynamic websites.

Setup

  1. Install with pip
pip install souperscraper
  2. Download the appropriate ChromeDriver for your Chrome version using getchromedriver.py (command below) or manually from the ChromeDriver website.

To find your Chrome version, go to chrome://settings/help in your browser.

getchromedriver
  3. Create a new SouperScraper object using the path to your ChromeDriver
from souperscraper import SouperScraper

scraper = SouperScraper('/path/to/your/chromedriver')
  4. Start scraping using BeautifulSoup and/or Selenium methods
scraper.goto('https://github.com/LucasFaudman')

# Use BeautifulSoup to search for and extract content
# by accessing the scraper's 'soup' attribute
# or with the 'soup_find' / 'soup_find_all' methods
repos = scraper.soup.find_all('span', class_='repo')
for repo in repos:
    repo_name = repo.text
    print(repo_name)

# Use Selenium to interact with the page, such as clicking buttons
# or filling out forms, via the scraper's
# find_element_by_* / find_elements_by_* / wait_for_* methods
repos_tab = scraper.find_element_by_css_selector("a[data-tab-item='repositories']")
repos_tab.click()

search_input = scraper.wait_for_visibility_of_element_located_by_id('your-repos-filter')
search_input.send_keys('souper-scraper')
search_input.submit()
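
As a rough sketch of the soup_find_all and wait_for_* helpers mentioned in the comments above, the snippet below waits for the filtered results to become visible and then re-queries the page without going through scraper.soup directly. It assumes soup_find_all mirrors BeautifulSoup's find_all signature; the 'user-repositories-list' id is an illustrative guess at the results container, so adjust the selectors for your target page.

# Minimal sketch, assuming soup_find_all accepts the same arguments as
# BeautifulSoup's find_all. The 'user-repositories-list' id is a hypothetical
# results container; replace it with whatever your target page uses.
scraper.wait_for_visibility_of_element_located_by_id('user-repositories-list')

for repo in scraper.soup_find_all('span', class_='repo'):
    print(repo.text.strip())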

BeautifulSoup Reference

Selenium Reference