tidy-page

HTML text parser: get the content from an HTML page


Keywords
html, parser, python2, spider
License
MIT
Install
pip install tidy-page==0.1.1

Documentation

# tidy_page

It is an HTML parser. Given an HTML document, it extracts the main body text and the title from the page. It is intended for web page parsing and content extraction.
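To illustrate the kind of extraction described here, the following is a minimal sketch that pulls the title and visible text out of an HTML string using only Python's standard-library `html.parser`. It is not tidy-page's API or its algorithm: the names `TitleAndTextExtractor` and `extract_title_and_text` are hypothetical, and the sketch assumes Python 3 and simply keeps all text outside `<script>` and `<style>`, with no attempt to separate the main body from navigation or other boilerplate.

```python
from html.parser import HTMLParser


class TitleAndTextExtractor(HTMLParser):
    """Hypothetical helper: collects <title> text and visible text."""

    # Text inside these tags is never shown to the reader, so skip it.
    SKIPPED_TAGS = {"script", "style"}

    def __init__(self):
        super().__init__()
        self.title_parts = []
        self.text_parts = []
        self._in_title = False
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag in self.SKIPPED_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False
        elif tag in self.SKIPPED_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if not text:
            return
        if self._in_title:
            self.title_parts.append(text)
        elif self._skip_depth == 0:
            self.text_parts.append(text)


def extract_title_and_text(html):
    """Return (title, text) for an HTML string; text blocks are newline-joined."""
    parser = TitleAndTextExtractor()
    parser.feed(html)
    parser.close()  # flush any buffered text
    return " ".join(parser.title_parts), "\n".join(parser.text_parts)


if __name__ == "__main__":
    page = (
        "<html><head><title>Hello</title></head>"
        "<body><p>First paragraph.</p><script>var x = 1;</script>"
        "<p>Second.</p></body></html>"
    )
    title, text = extract_title_and_text(page)
    print(title)
    print(text)
```

A real content extractor would usually go further, for example by scoring blocks to discard menus and footers; this sketch only shows the basic title and text separation.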