html-escape.js
Escape a string for use in HTML.
This implementation is different in that it has a second parameter, `targetContext`. Knowing the target context lets it:
- Avoid bugs and vulnerabilities by making clear where the output is safe to use. (It cannot be made safe to use everywhere.)
- Produce smaller output by leaving characters only unsafe in other contexts unescaped.
Example
import htmlEscape from 'html-escape/main';
htmlEscape('<p>\'" </p>', 'text' ); // <p>'" </p>
htmlEscape('<p>\'" </p>', 'valueDQ'); // <p>'" </p>
htmlEscape('<p>\'" </p>', 'valueSQ'); // <p>'" </p>
htmlEscape('<p>\'" </p>', 'valueUQ'); // <p>'" </p>
API
htmlEscape(str, targetContext)
- Returns a representation of `str` that will evaluate to `str` in HTML context `targetContext`. The output also will not produce parse errors, except if `str` contains U+0000 NULL, which is passed through unmodified. (The parser handles nulls by replacing them with U+FFFD REPLACEMENT CHARACTER, generating a parse error, and proceeding.)
str
- The string to be escaped.
targetContext
- The HTML context in which the output will be used. One of:
  - `'text'`: within element text (not within tags), except script and style elements
  - `'valueDQ'`: within double-quoted attribute values
  - `'valueSQ'`: within single-quoted attribute values
  - `'valueUQ'`: within unquoted attribute values

  HTML does not allow the escaping of arbitrary strings in any other context.
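To illustrate why the target context matters, here is a minimal sketch of a context-aware escaper. It is not this library's implementation: `sketchEscape` and the `UNSAFE_BY_CONTEXT` table are hypothetical names, and the character sets are a conservative assumption (every `&` is escaped and numeric character references are used throughout), so the real `htmlEscape` may well produce different, shorter output.

```javascript
// Hypothetical sketch, NOT the library's source.
// Assumption: for each context, escape "&" plus every character that
// could end or corrupt that context.
const UNSAFE_BY_CONTEXT = new Map([
  ['text', /[&<]/g],                      // "<" could start a tag
  ['valueDQ', /[&"]/g],                   // '"' would close the value
  ['valueSQ', /[&']/g],                   // "'" would close the value
  ['valueUQ', /[&\t\n\f\r "'<=>`]/g],     // whitespace or ">" would end the value
]);

function sketchEscape(str, targetContext) {
  const pattern = UNSAFE_BY_CONTEXT.get(targetContext);
  if (pattern === undefined) {
    throw new TypeError(`Unknown target context: ${targetContext}`);
  }
  // Numeric character references are valid in all four contexts.
  // Limitation: an empty string cannot be written as an unquoted
  // attribute value; this sketch does not handle that case.
  return String(str).replace(pattern, (ch) => `&#${ch.charCodeAt(0)};`);
}

console.log(sketchEscape('<p>\'" </p>', 'text'));    // &#60;p>'" &#60;/p>
console.log(sketchEscape('<p>\'" </p>', 'valueDQ')); // <p>'&#34; </p>
console.log(sketchEscape('<p>\'" </p>', 'valueSQ')); // <p>&#39;" </p>
console.log(sketchEscape('<p>\'" </p>', 'valueUQ')); // &#60;p&#62;&#39;&#34;&#32;&#60;/p&#62;
```

The same input yields four different outputs, which is why output escaped for one context should not be reused in another.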
Escaping script element content
HTML-escaping does not work in script elements, so this is out of scope. It's not really in-scope anywhere, though, so here's some advice on how to go about it:
- The only HTML control in script elements is `<`. It must be followed by `/script` or `script` to create a problem; the `/` and `>` are not necessary. (The effect of `<script` is odd but very dangerous: it's interpreted by JavaScript as-is but prevents the next `</script` from closing the script element.)
- You can validate script tag content by checking whether it contains `</script` or `<script`, but you cannot fix it automatically, with one exception:
  - If an offending `<` is within a JavaScript string literal, you can JavaScript-escape it by replacing it with `\x3c`. Be aware, though, that this can theoretically create problems with JavaScript that reads its own source code, via `innerHTML`, `textContent`, `Function#toString`, etc.
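The advice above can be sketched roughly as follows. Both function names are hypothetical and not part of this library. The check is deliberately conservative (it matches case-insensitively and flags any `<script` or `</script`), and the literal builder assumes the value is a string you are emitting as a JavaScript string literal via `JSON.stringify`.

```javascript
// Hypothetical helpers, not part of html-escape.

// Returns true if the script source contains a sequence that could
// interfere with how the HTML parser finds the end of the script element.
// Tag names are matched case-insensitively by the HTML parser, hence /i.
function hasScriptBreakingMarkup(scriptSource) {
  return /<\/?script/i.test(scriptSource);
}

// Builds a JavaScript string literal for `value` that contains no "<".
// JSON.stringify produces a double-quoted literal; each "<" inside it is
// then rewritten as the JavaScript escape \x3c, which evaluates to "<".
function toScriptSafeStringLiteral(value) {
  return JSON.stringify(String(value)).replace(/</g, '\\x3c');
}

const literal = toScriptSafeStringLiteral('</script><script>');
console.log(literal);                                  // "\x3c/script>\x3cscript>"
console.log(hasScriptBreakingMarkup(`var s = ${literal};`)); // false
```

Note that the result is a JavaScript literal rather than valid JSON, since JSON has no `\x` escape.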