campbells

A condensed web scraping library.


Keywords
HTML, XML, parse, soup, beautifulsoup4, web-scraping
License
MIT
Install
pip install campbells==0.3.0

Documentation

Campbells 🥫

A condensed web scraping library.


Adapted from beautifulsoup4's inner package, then linted, refactored, reduced, and seasoned to taste.

Development

To run pre-commit checks and tests:

pre-commit run --all-files && pdm run python -m pytest

Examples

To parse a string as HTML, your recipe should call for CampbellsSoup:

from campbells import CampbellsSoup

html_str = "<html><body><p>Hello world!</p></body></html>"
soup = CampbellsSoup(html_str)

Installation

Campbells is available on PyPI:

pip install campbells

The dependencies needed to use the html5lib and lxml parsers are not installed by default. They can be installed with:

  • pip install campbells[html5lib] to be able to use html5lib.
    • Pros: closest to how browsers parse web pages, very lenient, creates valid HTML5.
    • Cons: slowest parser.
  • pip install campbells[lxml] to be able to use lxml.
    • Pros: fastest parser.
    • Cons: heavier dependency (C extension).
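
Because both parsers are optional extras, it can help to check which ones are importable before relying on them. The following is a minimal sketch using only the standard library; `available_parsers` is a hypothetical helper name, not part of campbells, and it only tests whether the `html5lib` and `lxml` modules can be found in the current environment.

```python
from importlib.util import find_spec


def available_parsers() -> dict[str, bool]:
    """Report which optional parser backends can be imported."""
    # Hypothetical helper, not part of campbells: it only checks whether the
    # third-party modules named by the extras are importable.
    return {name: find_spec(name) is not None for name in ("html5lib", "lxml")}


if __name__ == "__main__":
    for name, installed in available_parsers().items():
        # Suggest the matching extra when a backend is missing.
        status = "installed" if installed else f"missing (pip install campbells[{name}])"
        print(f"{name}: {status}")
```

Running the script prints one line per backend, so a missing extra shows up before any parsing is attempted.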