Chopper

Chopper is a tool that extracts elements from an HTML document while preserving their ancestors and the CSS rules that apply to them.

Compatible with Python >= 3.8

Installation

pip install chopper

Full documentation

http://chopper.readthedocs.org/en/latest/

Quick start

from chopper.extractor import Extractor

HTML = """
<html>
  <head>
    <title>Test</title>
  </head>
  <body>
    <div id="header"></div>
    <div id="main">
      <div class="iwantthis">
        HELLO WORLD
        <a href="/nope">Do not want</a>
      </div>
    </div>
    <div id="footer"></div>
  </body>
</html>
"""

CSS = """
div { border: 1px solid black; }
div#main { color: blue; }
div.iwantthis { background-color: red; }
a { color: green; }
div#footer { border-top: 2px solid red; }
"""

extractor = Extractor.keep('//div[@class="iwantthis"]').discard('//a')
html, css = extractor.extract(HTML, CSS)

The result is:

>>> html
"""
<html>
  <body>
    <div id="main">
      <div class="iwantthis">
        HELLO WORLD
      </div>
    </div>
  </body>
</html>"""

>>> css
"""
div{border:1px solid black;}
div#main{color:blue;}
div.iwantthis{background-color:red;}
"""

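To make the keep/discard behaviour above more concrete, here is a minimal sketch of the same idea using only the standard library. It is not Chopper's implementation: the `prune` helper is hypothetical, it relies on the limited XPath subset of `xml.etree.ElementTree` (so it needs well-formed markup and paths starting with `.//`), and it ignores CSS entirely.

```python
import xml.etree.ElementTree as ET


def prune(root, keep_path, discard_path=None):
    """Hypothetical helper: keep matches of keep_path plus their
    ancestors and descendants, and remove everything else."""
    # ElementTree has no parent pointers, so build a child -> parent map.
    parent_of = {child: parent for parent in root.iter() for child in parent}

    # Drop discarded elements first. Note that this also drops any
    # text that directly follows them (their "tail").
    if discard_path is not None:
        for element in root.findall(discard_path):
            parent_of[element].remove(element)

    wanted = set()
    for element in root.findall(keep_path):
        wanted.update(element.iter())  # the match and all its descendants
        node = parent_of.get(element)
        while node is not None:        # walk up to the root
            wanted.add(node)
            node = parent_of.get(node)

    # Remove every child that is neither a match, an ancestor of one,
    # nor a descendant of one.
    for parent in list(root.iter()):
        for child in list(parent):
            if child not in wanted:
                parent.remove(child)
    return root


doc = ET.fromstring(
    '<html><head><title>Test</title></head><body>'
    '<div id="header"/>'
    '<div id="main"><div class="iwantthis">HELLO WORLD'
    '<a href="/nope">Do not want</a></div></div>'
    '<div id="footer"/>'
    '</body></html>'
)
prune(doc, ".//div[@class='iwantthis']", ".//a")
result = ET.tostring(doc, encoding="unicode")
print(result)
# <html><body><div id="main"><div class="iwantthis">HELLO WORLD</div></div></body></html>
```

As in the quick start, the kept element survives together with its `body` and `div#main` ancestors, while `head`, `div#header`, `div#footer` and the discarded link are gone.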
chopper
Release 0.6.0
