html-escape

Escape a string for use in HTML


License
ISC
Install
bower install html-escape

Documentation

html-escape.js

Escape a string for use in HTML.

This implementation differs from other HTML escapers in that it takes a second parameter, targetContext. Knowing the target context lets it:

  1. Avoid bugs and vulnerabilities by making clear where the output is safe to use. (It cannot be made safe to use everywhere.)
  2. Produce smaller output by leaving unescaped any characters that are unsafe only in other contexts.

Example

import htmlEscape from 'html-escape/main';

htmlEscape('<p>\'" </p>', 'text'   ); // &lt;p>'" &lt;/p>
htmlEscape('<p>\'" </p>', 'valueDQ'); // <p>'&quot; </p>
htmlEscape('<p>\'" </p>', 'valueSQ'); // <p>&#39;" </p>
htmlEscape('<p>\'" </p>', 'valueUQ'); // &lt;p&gt;&#39;&quot;&#32;&lt;/p&gt;

API

htmlEscape(str, targetContext)

Returns a representation of str that will evaluate to str in the HTML context targetContext. The output will also not produce parse errors, unless str contains U+0000 NULL, which is passed through unmodified. (The parser handles nulls by replacing them with U+FFFD REPLACEMENT CHARACTER, generating a parse error, and proceeding.)

str
The string to be escaped.
targetContext

The HTML context in which the output will be used. One of:

  • 'text': within element text (not within tags), except inside script and style elements
  • 'valueDQ': within double-quoted attribute values
  • 'valueSQ': within single-quoted attribute values
  • 'valueUQ': within unquoted attribute values

HTML does not allow the escaping of arbitrary strings in any other context.
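
To make the four contexts concrete, here is a minimal sketch of how a context-aware escaper could choose which characters to escape. It is not the html-escape source: the name sketchEscape, the lookup tables, and the choice to escape every & are assumptions made for illustration, and the real implementation may escape fewer characters. The character sets were picked so that the sketch reproduces the outputs shown in the Example section.

```javascript
// Illustrative sketch only; NOT the html-escape implementation.
// Assumption: every "&" is escaped in every context, which is always safe
// but may be more than the real library does.
const UNSAFE = {
  text: /[&<]/g,                        // "<" could start a tag
  valueDQ: /[&"]/g,                     // '"' would end the attribute value
  valueSQ: /[&']/g,                     // "'" would end the attribute value
  valueUQ: /[&<>"'`= \t\n\f\r]/g        // whitespace or ">" would end the value
};

// Named references where a short one exists; everything else is numeric.
const NAMED = { '&': '&amp;', '<': '&lt;', '>': '&gt;', '"': '&quot;' };

function sketchEscape(str, targetContext) {
  if (!Object.prototype.hasOwnProperty.call(UNSAFE, targetContext)) {
    throw new TypeError('Unknown targetContext: ' + targetContext);
  }
  return String(str).replace(
    UNSAFE[targetContext],
    (ch) => NAMED[ch] || '&#' + ch.charCodeAt(0) + ';'
  );
}
```

Note how the same input grows or shrinks with the context: 'text' leaves quotes and > alone, while 'valueUQ' has to escape even the space character.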

Escaping script element content

HTML escaping does not work in script elements, so this is out of scope. It is not really in scope anywhere else either, though, so here is some advice on how to go about it:

  • The only HTML control character in script elements is <. It must be followed by /script or script to create a problem; neither the / nor a closing > is necessary. (The effect of <script is odd but very dangerous: it is interpreted by JavaScript as is, but it prevents the next </script from closing the script element.)
  • You can validate script element content by checking whether it contains </script or <script, but you cannot fix it automatically, with one exception:
      • If an offending < is within a JavaScript string literal, you can JavaScript-escape it by replacing it with \x3c. Be aware, though, that this can theoretically create problems with JavaScript that reads its own source code via innerHTML, textContent, Function#toString, etc.
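
The advice above can be sketched as two small helpers. Both names (isSafeScriptContent and toScriptSafeStringLiteral) are hypothetical and are not part of html-escape. The check is case-insensitive because the HTML parser matches the script tag name without regard to case. The second helper assumes you are generating the string literal yourself from a known value; it does not try to locate string literals inside existing JavaScript source.

```javascript
// Hypothetical helpers for illustration; not part of html-escape.

// Matches "<script" or "</script" in any letter case.
const SCRIPT_BREAKER = /<\/?script/i;

// Validation only: reports whether the source can be placed inside a
// script element without one of the problematic sequences.
function isSafeScriptContent(source) {
  return !SCRIPT_BREAKER.test(source);
}

// Builds a JavaScript string literal for `value` in which every "<" is
// written as \x3c, so the literal cannot contain "<script" or "</script".
function toScriptSafeStringLiteral(value) {
  return JSON.stringify(String(value)).replace(/</g, '\\x3c');
}
```

For example, toScriptSafeStringLiteral('</script>') yields the literal "\x3c/script>", which JavaScript evaluates back to the original string while the HTML parser sees no closing tag.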