August - an HTML-to-text converter
August is an HTML-to-text converter intended specifically for producing
text versions of HTML emails.
Getting Started
Install using pip:
$ pip3 install august
Then import it and call convert:
>>> import august
>>>
>>> html = '<p>I\'m <em>so</em> excited to try this</p>'
>>> print(august.convert(html, width=20))
I'm /so/ excited to
try this
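
Since August targets HTML emails, a common use would be filling in the plain-text part of a multipart message. The following is a minimal sketch using only Python's standard email package. The build_message helper, its to_text parameter, and the subject line are hypothetical names introduced for illustration; they are not part of August.

```python
from email.message import EmailMessage
from typing import Callable


def build_message(html: str, to_text: Callable[[str], str]) -> EmailMessage:
    """Build a multipart/alternative email with text and HTML bodies.

    `to_text` is any callable that turns HTML into plain text
    (hypothetical parameter; pass a wrapper around august.convert).
    """
    msg = EmailMessage()
    msg["Subject"] = "Hello"  # placeholder subject for the example
    # The plain-text body is set first; it becomes the fallback part.
    msg.set_content(to_text(html))
    # Adding the HTML alternative turns the message into multipart/alternative.
    msg.add_alternative(html, subtype="html")
    return msg
```

With August installed, the call would look like build_message(html, lambda h: august.convert(h, width=72)), where 72 is just an example width.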
Known issues
- A few tags are not yet supported, though they would benefit from
support: <pre>, <var>, <tt>, and probably a bunch that I forgot. These
are rarely seen in emails, so they are not a high priority.
- There's currently no CSS support. Some will probably be added at some
point, but it's still unclear which parts are worth implementing.
Alternatives
- html2text: Converts HTML into Markdown, and supports a bazillion options.
It's a great project if you want to produce Markdown. But Markdown, because
it's designed to be turned into HTML, has a little more noise than is
strictly necessary, and the header formatting is pretty unclear.
- html-to-text: Converts HTML to text. A JavaScript/Node.js project.