wepana

An analyzer for web page content powered by Python.


License
MIT
Install
pip install wepana==0.2.0

wepana: A Web Page Analyzer

 __      _____ _ __   __ _ _ __   __ _
 \ \ /\ / / _ \ '_ \ / _` | '_ \ / _` |
  \ V  V /  __/ |_) | (_| | | | | (_| |
   \_/\_/ \___| .__/ \__,_|_| |_|\__,_|
              |_|

Wepana is an analyzer for web page content powered by Python.

Requirement

  • Python 3

Features

  • Automatically load content from a URL.
  • Load content from a file.
  • Load content from a string value.
  • Get image URLs.
  • Get HTML link target URLs.
  • Get meta information.
  • Get keyword information.
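
To make the features above concrete, here is a minimal sketch of this kind of extraction using only the Python standard library. It is not wepana's implementation, and the names `PageSummary` and `summarize` are hypothetical, introduced only for illustration.

```python
from html.parser import HTMLParser


class PageSummary(HTMLParser):
    """Hypothetical helper: collects title, image URLs, link URLs, and meta keywords."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.images = []
        self.links = []
        self.keywords = []
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "img" and attrs.get("src"):
            self.images.append(attrs["src"])
        elif tag == "a" and attrs.get("href"):
            self.links.append(attrs["href"])
        elif tag == "meta" and (attrs.get("name") or "").lower() == "keywords":
            # Keywords are assumed to be a comma-separated list in the content attribute.
            content = attrs.get("content") or ""
            self.keywords = [k.strip() for k in content.split(",") if k.strip()]

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        # Only text inside <title> is accumulated.
        if self._in_title:
            self.title += data


def summarize(html):
    """Parse an HTML string and return the populated PageSummary."""
    parser = PageSummary()
    parser.feed(html)
    parser.close()
    return parser
```

The URLs are returned exactly as written in the page; resolving relative paths against a base URL is left out of this sketch.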

Usage

Load

Import the analyzer.

from wepana import WebPageAnalyzer

Load from a URL.

# load at init
site = WebPageAnalyzer(url='http://github.com')
# or load after init
site = WebPageAnalyzer()
site.connect('http://github.com')

Load from a file.

analyzer = WebPageAnalyzer()
analyzer.read_file('/path/to/the/file.html')

Load from text.

analyzer = WebPageAnalyzer()
analyzer.read_text('text content')

Check whether the analyzer is ready.

analyzer.read()

Reset the analyzer.

analyzer.reset()
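
The three loading paths above (URL, file, text) all end with page content held as a string. A rough standard-library sketch of that idea follows. The function `load_content` and its parameters are hypothetical and say nothing about how wepana handles loading internally.

```python
from pathlib import Path
from urllib.request import urlopen


def load_content(url=None, path=None, text=None):
    """Hypothetical helper: return page content as a string from exactly one source."""
    given = [s for s in (url, path, text) if s is not None]
    if len(given) != 1:
        raise ValueError("pass exactly one of url, path, or text")
    if text is not None:
        return text
    if path is not None:
        # Assumes the file is UTF-8 encoded.
        return Path(path).read_text(encoding="utf-8")
    with urlopen(url, timeout=10) as response:
        # Fall back to UTF-8 when the server does not declare a charset.
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")
```

Rejecting calls with zero or several sources keeps it unambiguous which input was actually analyzed.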

Analyze

Get title.

analyzer.get_title()

Get keywords.

analyzer.get_keywords()

Get images.

analyzer.get_images()

Get links.

analyzer.get_links()

Contributing

  1. Fork it.
  2. Create your feature branch. ($ git checkout -b feature/my-feature-branch)
  3. Commit your changes. ($ git commit -am 'What feature I just added.')
  4. Push to the branch. ($ git push origin feature/my-feature-branch)
  5. Create a new Pull Request.

Authors

@Mervin

License

The MIT License (MIT). For details, see LICENSE.