Webpage_Textual_Extraction
A uniform webpage extraction algorithm
Requirements
Python 3.5, requests, bs4
How to use
- add the links you want to extract into pool.txt
- set the encoding you want
- run main.py
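The steps above can be pictured with the minimal sketch below. It is not the code in main.py or the project's extraction algorithm: the helper names `read_pool`, `fetch_html`, and `extract_text` are hypothetical, the one-link-per-line layout of pool.txt is an assumption, and the extraction step is a simple stand-in that drops scripts and styles and keeps the remaining text.

```python
from pathlib import Path


def read_pool(path="pool.txt"):
    """Return the links listed in the pool file (assumed: one URL per line)."""
    lines = Path(path).read_text(encoding="utf-8").splitlines()
    # Skip blank lines and trim surrounding whitespace.
    return [line.strip() for line in lines if line.strip()]


def fetch_html(url, encoding="utf-8", timeout=10):
    """Download a page and decode it with the chosen encoding."""
    import requests  # third-party; imported here so read_pool works without it

    response = requests.get(url, timeout=timeout)
    response.raise_for_status()
    # Override the encoding guessed by requests with the one the user set.
    response.encoding = encoding
    return response.text


def extract_text(html):
    """Stand-in extraction: remove script/style tags, return the visible text."""
    from bs4 import BeautifulSoup  # third-party (bs4)

    soup = BeautifulSoup(html, "html.parser")
    for tag in soup(["script", "style"]):
        tag.decompose()
    return soup.get_text(separator="\n", strip=True)


# Example use (performs network requests, so it is left commented out):
# for url in read_pool("pool.txt"):
#     print(extract_text(fetch_html(url, encoding="utf-8")))
```

Setting `response.encoding` before reading `response.text` is one way to apply a user-chosen encoding with requests; the project may handle this differently.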
Extract main textual information from HTML.
pip install pextract==0.3