htmlpurifier

A silly but useful hack for sanitizing untrusted HTML input in Django forms via the PHP HTML Purifier library


License
Other
Install
pip install htmlpurifier==1.0.1

Documentation

django-htmlpurifier

A silly but useful hack for sanitizing untrusted HTML input in Django forms via PHP's HTML Purifier

Prerequisites

Django (tested with 1.3)
Python 2.6+
PHP 5.0.5+ w/ CLI
HTML Purifier library - standard, lite, or PEAR distribution
NOTE: HTMLPurifier.auto.php should be in your PHP path.
A willingness to mix Python/Django with PHP

Installation

Add 'htmlpurifier' to INSTALLED_APPS in your project's settings.py.

Configuration

django-htmlpurifier is ready to use out-of-the-box, assuming that you are happy with the PHP HTML Purifier's default settings.

Otherwise, django-htmlpurifier currently supports the following settings, definable in settings.py:
HTMLPURIFIER_SCRIPT_PATH: An absolute path to a custom PHP processing script (see below)

Usage

Using django-htmlpurifier is very simple.

Import the htmlpurifier module:
import htmlpurifier

Define a field that should be sanitized:
MyHtmlField = htmlpurifier.HTMLField()
or perhaps
MyHtmlField = htmlpurifier.HTMLField(widget=forms.TextArea)

HTMLField() is a subclass of Django's CharField.

If for some reason you don't want to deal with forms and just need a low-level function that HTML-purifies a string input, do:
htmlpurifier.purify(my_html_str)

Writing your own PHP processing script

django-htmlpurifier ships with a very basic PHP script that provides an interface to the PHP HTML Purifier library using its default settings. If the default settings do not float your boat, please refer to the HTML Purifier configuration reference.