This is an agile HTML parser that builds a read/write DOM and supports plain XPath or XSLT (you don't have to understand XPath or XSLT to use it, don't worry...). It is a .NET code library that lets you parse "out of the web" HTML files. The parser is very tolerant of "real world" malformed HTML. The object model is very similar to the one System.Xml offers, but for HTML documents (or streams).


Keywords: HTML, crawler, linq, parse, spider, hap, html-parser, htmlagilitypack, xpath
License: MIT
Install: Install-Package HtmlAgilityPack -Version 1.11.19

Documentation

Library Powered By

This library is powered by Entity Framework Extensions

What's Html Agility Pack (HAP)?

It is an agile HTML parser that builds a read/write DOM and supports plain XPath or XSLT (no need to understand XPath or XSLT to use it, don't worry...). It is a .NET code library that lets you parse "out of the web" HTML files. The parser is very tolerant of "real world" malformed HTML. The object model is very similar to the one System.Xml offers, but for HTML documents (or streams).
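
To make the description above concrete, here is a minimal sketch of loading malformed HTML, querying it with XPath, and writing back to the DOM. The class `LinkDemo`, its two helper methods, the sample markup, and the `https://example.com/` base address are all assumptions made for illustration; only `HtmlDocument`, `LoadHtml`, `DocumentNode`, `SelectNodes`, `GetAttributeValue`, `SetAttributeValue`, `InnerText`, and `OuterHtml` come from the library.

```csharp
using System;
using System.Collections.Generic;
using HtmlAgilityPack;

public static class LinkDemo
{
    // Hypothetical helper (not part of HAP): collects (text, href) pairs
    // for every <a> element that carries an href attribute.
    public static List<(string Text, string Href)> ExtractLinks(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html); // accepts markup with unclosed tags

        var links = new List<(string Text, string Href)>();

        // SelectNodes returns null, not an empty collection, when nothing matches.
        var anchors = doc.DocumentNode.SelectNodes("//a[@href]");
        if (anchors == null)
        {
            return links;
        }

        foreach (var a in anchors)
        {
            links.Add((a.InnerText.Trim(), a.GetAttributeValue("href", "")));
        }
        return links;
    }

    // Hypothetical helper: rewrites every href against baseUri and returns
    // the serialized document, showing that the DOM is writable.
    public static string MakeLinksAbsolute(string html, Uri baseUri)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        var anchors = doc.DocumentNode.SelectNodes("//a[@href]");
        if (anchors != null)
        {
            foreach (var a in anchors)
            {
                var href = a.GetAttributeValue("href", "");
                a.SetAttributeValue("href", new Uri(baseUri, href).ToString());
            }
        }
        return doc.DocumentNode.OuterHtml;
    }

    public static void Main()
    {
        // Sample input with unclosed <li> tags (illustrative only).
        const string html =
            "<ul><li><a href=\"/one\">One</a><li><a href=\"/two\">Two</a></ul>";

        foreach (var (text, href) in ExtractLinks(html))
        {
            Console.WriteLine($"{text} -> {href}");
        }

        Console.WriteLine(MakeLinksAbsolute(html, new Uri("https://example.com/")));
    }
}
```

The null check after `SelectNodes` is the detail most worth copying: an XPath query with no matches yields `null`, so iterating the result directly can throw.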

NuGet: https://www.nuget.org/packages/HtmlAgilityPack/

Useful links

Contribute

The best way to contribute is by spreading the word about the library:

  • Blog it
  • Comment it
  • Star it
  • Share it

A HUGE THANKS for your help.

More Projects

To view all our free and paid projects, visit our website ZZZ Projects.