Markdown Abstract Syntax Tree.
⚠️ MDAST, the pluggable markdown parser, was recently separated from this project and given a new name: remark. See its documentation to read more about what changed and how to migrate.
MDAST represents markdown as an abstract syntax tree. Abstract means not all information is stored in this tree, so an exact replica of the original document cannot be re-created. Syntax Tree means syntax is present in the tree, so a syntactically equivalent document can be re-created.
MDAST is a subset of unist, and implemented by remark.
This document describes version 2.0.0 of MDAST (see the changelog).
AST
Root
Root (Parent) houses all nodes.
interface Root <: Parent {
type: "root";
}
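Because Root is a Parent that houses all other nodes, most tooling starts with a walk from the root. The following is a minimal sketch of such a walk in Python, assuming the tree has been loaded as plain dictionaries (for example with `json.loads`). The `visit` helper is a hypothetical name for illustration, not part of MDAST or remark.

```python
from typing import Any, Callable, Dict

Node = Dict[str, Any]


def visit(node: Node, callback: Callable[[Node], None]) -> None:
    """Call `callback` on `node`, then on every descendant, depth first."""
    callback(node)
    # Only Parent nodes carry `children`; other nodes are treated as leaves.
    for child in node.get("children", []):
        visit(child, callback)


tree = {
    "type": "root",
    "children": [
        {
            "type": "paragraph",
            "children": [{"type": "text", "value": "Alpha bravo charlie."}],
        }
    ],
}

seen = []
visit(tree, lambda node: seen.append(node["type"]))
print(seen)
```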
Paragraph
Paragraph (Parent) represents a unit of discourse dealing with a particular point or idea.
interface Paragraph <: Parent {
type: "paragraph";
}
For example, the following markdown:
Alpha bravo charlie.
Yields:
{
"type": "paragraph",
"children": [{
"type": "text",
"value": "Alpha bravo charlie."
}]
}
Blockquote
Blockquote (Parent) represents a quote.
interface Blockquote <: Parent {
type: "blockquote";
}
For example, the following markdown:
> Alpha bravo charlie.
Yields:
{
"type": "blockquote",
"children": [{
"type": "paragraph",
"children": [{
"type": "text",
"value": "Alpha bravo charlie."
}]
}]
}
Heading
Heading (Parent), just like in HTML, has a level greater than or equal to 1 and lower than or equal to 6.
interface Heading <: Parent {
type: "heading";
depth: 1 <= uint32 <= 6;
}
For example, the following markdown:
# Alpha
Yields:
{
"type": "heading",
"depth": 1,
"children": [{
"type": "text",
"value": "Alpha"
}]
}
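The depth constraint makes headings easy to collect into an outline. The sketch below, in Python over plain dictionaries, gathers the top-level headings of a tree and rejects depths outside 1 to 6. The `outline` function is a hypothetical helper, and it only reads direct children that have a `value`, which is a simplification.

```python
from typing import Any, Dict, List, Tuple

Node = Dict[str, Any]


def outline(tree: Node) -> List[Tuple[int, str]]:
    """Return (depth, title) pairs for the headings directly under `tree`."""
    entries = []
    for node in tree.get("children", []):
        if node.get("type") != "heading":
            continue
        depth = node["depth"]
        if not 1 <= depth <= 6:
            raise ValueError(f"heading depth out of range: {depth}")
        # Simplification: nested nodes such as emphasis are ignored here.
        title = "".join(child.get("value", "") for child in node.get("children", []))
        entries.append((depth, title))
    return entries


document = {
    "type": "root",
    "children": [
        {"type": "heading", "depth": 1, "children": [{"type": "text", "value": "Alpha"}]},
        {"type": "paragraph", "children": [{"type": "text", "value": "Bravo."}]},
        {"type": "heading", "depth": 2, "children": [{"type": "text", "value": "Charlie"}]},
    ],
}

entries = outline(document)
print(entries)
```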
Code
Code (Text) occurs at block level (see InlineCode for code spans). Code has a lang property, which holds the language flag when the block is written as a GitHub Flavoured Markdown fence with a flag, and null otherwise.
interface Code <: Text {
type: "code";
lang: string | null;
}
For example, the following markdown:
    foo()
Yields:
{
"type": "code",
"lang": null,
"value": "foo()"
}
InlineCode
InlineCode (Text) occurs inline (see Code for blocks). Inline code does not sport a lang attribute.
interface InlineCode <: Text {
type: "inlineCode";
}
For example, the following markdown:
`foo()`
Yields:
{
"type": "inlineCode",
"value": "foo()"
}
YAML
YAML (Text) can occur at the start of a document, and contains embedded YAML data.
interface YAML <: Text {
type: "yaml";
}
For example, the following markdown:
---
foo: bar
---
Yields:
{
"type": "yaml",
"value": "foo: bar"
}
HTML
HTML (Text) contains embedded HTML.
interface HTML <: Text {
type: "html";
}
For example, the following markdown:
<div>
Yields:
{
"type": "html",
"value": "<div>"
}
List
List (Parent) contains ListItems.
The start property contains the starting number of the list when ordered: true; null otherwise.
When all list items have loose: false, the list’s loose property is also false. Otherwise, loose: true.
interface List <: Parent {
type: "list";
ordered: true | false;
start: uint32 | null;
loose: true | false;
}
For example, the following markdown:
1. [x] foo
Yields:
{
"type": "list",
"ordered": true,
"start": 1,
"loose": false,
"children": [{
"type": "listItem",
"loose": false,
"checked": true,
"children": [{
"type": "paragraph",
"children": [{
"type": "text",
"value": "foo"
}]
}]
}]
}
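The rule for a list's loose property can be written as a one-line check over its items. This is a minimal Python sketch over plain dictionaries; `list_is_loose` is a hypothetical helper name.

```python
from typing import Any, Dict, List

Node = Dict[str, Any]


def list_is_loose(items: List[Node]) -> bool:
    """A list is loose unless every one of its items has loose: false."""
    return any(item["loose"] for item in items)


tight_items = [{"type": "listItem", "loose": False}, {"type": "listItem", "loose": False}]
mixed_items = [{"type": "listItem", "loose": False}, {"type": "listItem", "loose": True}]

print(list_is_loose(tight_items), list_is_loose(mixed_items))
```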
ListItem
ListItem (Parent) is a child of a List.
Loose ListItems often contain more than one block-level element.
A checked property exists on ListItems, set to true (when checked), false (when unchecked), or null (when not containing a checkbox).
See Task Lists on GitHub for information.
interface ListItem <: Parent {
type: "listItem";
loose: true | false;
checked: true | false | null;
}
For an example, see the definition of List.
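The checked property has three states, so a plain truthiness test is not enough to tell an unchecked item from an item without a checkbox. The sketch below shows one way to handle all three in Python; `checkbox_prefix` is a hypothetical helper that returns the task-list marker a serializer might emit.

```python
from typing import Any, Dict

Node = Dict[str, Any]


def checkbox_prefix(item: Node) -> str:
    """Return the task-list marker for a listItem, or "" when it has no checkbox."""
    checked = item.get("checked")
    if checked is True:
        return "[x] "
    if checked is False:
        return "[ ] "
    # None (JSON null) means the item does not contain a checkbox.
    return ""


done = {"type": "listItem", "loose": False, "checked": True, "children": []}
todo = {"type": "listItem", "loose": False, "checked": False, "children": []}
plain = {"type": "listItem", "loose": False, "checked": None, "children": []}

print(repr(checkbox_prefix(done)), repr(checkbox_prefix(todo)), repr(checkbox_prefix(plain)))
```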
Table
Table (Parent) represents tabular data, with alignment. Its children are TableRows, the first of which acts as a table header row.
table.align represents the alignment of columns.
interface Table <: Parent {
type: "table";
align: [alignType];
}
enum alignType {
"left" | "right" | "center" | null;
}
For example, the following markdown:
| foo | bar |
| :-- | :-: |
| baz | qux |
Yields:
{
"type": "table",
"align": ["left", "center"],
"children": [
{
"type": "tableRow",
"children": [
{
"type": "tableCell",
"children": [{
"type": "text",
"value": "foo"
}]
},
{
"type": "tableCell",
"children": [{
"type": "text",
"value": "bar"
}]
}
]
},
{
"type": "tableRow",
"children": [
{
"type": "tableCell",
"children": [{
"type": "text",
"value": "baz"
}]
},
{
"type": "tableCell",
"children": [{
"type": "text",
"value": "qux"
}]
}
]
}
]
}
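To illustrate how align and the header row fit together, here is a minimal Python sketch that turns a Table node back into pipe-table markdown. The helper names are hypothetical, cells are assumed to contain only nodes with a `value`, and the delimiter strings are one common choice rather than anything mandated by MDAST.

```python
from typing import Any, Dict, Optional

Node = Dict[str, Any]

# Delimiter cell for each alignType value; None stands for JSON null.
DELIMITERS: Dict[Optional[str], str] = {
    "left": ":--",
    "right": "--:",
    "center": ":-:",
    None: "---",
}


def cell_text(cell: Node) -> str:
    """Concatenate the values of a tableCell's direct children."""
    return "".join(child.get("value", "") for child in cell["children"])


def format_row(cells) -> str:
    return "| " + " | ".join(cells) + " |"


def table_to_markdown(table: Node) -> str:
    rows = [[cell_text(cell) for cell in row["children"]] for row in table["children"]]
    # The first tableRow acts as the header; the delimiter row comes from `align`.
    lines = [format_row(rows[0]), format_row(DELIMITERS[a] for a in table["align"])]
    lines += [format_row(row) for row in rows[1:]]
    return "\n".join(lines)


def cell(value: str) -> Node:
    return {"type": "tableCell", "children": [{"type": "text", "value": value}]}


table = {
    "type": "table",
    "align": ["left", "center"],
    "children": [
        {"type": "tableRow", "children": [cell("foo"), cell("bar")]},
        {"type": "tableRow", "children": [cell("baz"), cell("qux")]},
    ],
}

markdown = table_to_markdown(table)
print(markdown)
```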
TableRow
TableRow (Parent). Its children are always TableCells.
interface TableRow <: Parent {
type: "tableRow";
}
For an example, see the definition of Table.
TableCell
TableCell (Parent). Contains a single tabular field.
interface TableCell <: Parent {
type: "tableCell";
}
For an example, see the definition of Table.
ThematicBreak
A ThematicBreak (Node) represents a break in content, often shown as a horizontal rule, or by two HTML section elements.
interface ThematicBreak <: Node {
type: "thematicBreak";
}
For example, the following markdown:
***
Yields:
{
"type": "thematicBreak"
}
Break
Break (Node) represents an explicit line break.
interface Break <: Node {
type: "break";
}
For example, the following markdown (interpuncts represent spaces):
foo··
bar
Yields:
{
"type": "paragraph",
"children": [
{
"type": "text",
"value": "foo"
},
{
"type": "break"
},
{
"type": "text",
"value": "bar"
}
]
}
Emphasis
Emphasis (Parent) represents slight emphasis.
interface Emphasis <: Parent {
type: "emphasis";
}
For example, the following markdown:
*alpha* _bravo_
Yields:
{
"type": "paragraph",
"children": [
{
"type": "emphasis",
"children": [{
"type": "text",
"value": "alpha"
}]
},
{
"type": "text",
"value": " "
},
{
"type": "emphasis",
"children": [{
"type": "text",
"value": "bravo"
}]
}
]
}
Strong
Strong (Parent) represents strong emphasis.
interface Strong <: Parent {
type: "strong";
}
For example, the following markdown:
**alpha** __bravo__
Yields:
{
"type": "paragraph",
"children": [
{
"type": "strong",
"children": [{
"type": "text",
"value": "alpha"
}]
},
{
"type": "text",
"value": " "
},
{
"type": "strong",
"children": [{
"type": "text",
"value": "bravo"
}]
}
]
}
Delete
Delete (Parent) represents text ready for removal.
interface Delete <: Parent {
type: "delete";
}
For example, the following markdown:
~~alpha~~
Yields:
{
"type": "delete",
"children": [{
"type": "text",
"value": "alpha"
}]
}
Link
Link (Parent) represents the humble hyperlink.
interface Link <: Parent {
type: "link";
title: string | null;
url: string;
}
For example, the following markdown:
[alpha](http://example.com "bravo")
Yields:
{
"type": "link",
"title": "bravo",
"url": "http://example.com",
"children": [{
"type": "text",
"value": "alpha"
}]
}
Image
Image (Node) represents the figurative figure.
interface Image <: Node {
type: "image";
title: string | null;
alt: string | null;
url: string;
}
For example, the following markdown:
![alpha](http://example.com/favicon.ico "bravo")
Yields:
{
"type": "image",
"title": "bravo",
"url": "http://example.com/favicon.ico",
"alt": "alpha"
}
Footnote
Footnote (Parent) represents an inline marker, whose content relates to the document but is outside its flow.
interface Footnote <: Parent {
type: "footnote";
}
For example, the following markdown:
[^alpha bravo]
Yields:
{
"type": "footnote",
"children": [{
"type": "text",
"value": "alpha bravo"
}]
}
LinkReference
LinkReference (Parent) represents a humble hyperlink, its url and title defined somewhere else in the document by a Definition.
referenceType is needed to detect if a reference was meant as a reference ([foo][]) or just unescaped brackets ([foo]).
interface LinkReference <: Parent {
type: "linkReference";
identifier: string;
referenceType: referenceType;
}
enum referenceType {
"shortcut" | "collapsed" | "full";
}
For example, the following markdown:
[alpha][bravo]
Yields:
{
"type": "linkReference",
"identifier": "bravo",
"referenceType": "full",
"children": [{
"type": "text",
"value": "alpha"
}]
}
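As a rough illustration of why referenceType matters, the Python sketch below writes a LinkReference back to markdown in the form it was presumably authored in. The mapping assumed here is that "full" corresponds to [text][identifier], "collapsed" to [text][], and "shortcut" to [text]. The function name is hypothetical, and only direct children with a `value` are read.

```python
from typing import Any, Dict

Node = Dict[str, Any]


def reference_to_markdown(node: Node) -> str:
    """Serialize a linkReference according to its referenceType."""
    text = "".join(child.get("value", "") for child in node["children"])
    kind = node["referenceType"]
    if kind == "full":
        return f"[{text}][{node['identifier']}]"
    if kind == "collapsed":
        return f"[{text}][]"
    if kind == "shortcut":
        return f"[{text}]"
    raise ValueError(f"unknown referenceType: {kind!r}")


full = {
    "type": "linkReference",
    "identifier": "bravo",
    "referenceType": "full",
    "children": [{"type": "text", "value": "alpha"}],
}
collapsed = dict(full, identifier="alpha", referenceType="collapsed")
shortcut = dict(full, identifier="alpha", referenceType="shortcut")

print(reference_to_markdown(full))
print(reference_to_markdown(collapsed))
print(reference_to_markdown(shortcut))
```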
ImageReference
ImageReference (Node) represents a figurative figure, its url and title defined somewhere else in the document by a Definition.
referenceType is needed to detect if a reference was meant as a reference (![foo][]) or just unescaped brackets (![foo]).
See LinkReference for the definition of referenceType.
interface ImageReference <: Node {
type: "imageReference";
identifier: string;
referenceType: referenceType;
alt: string | null;
}
For example, the following markdown:
![alpha][bravo]
Yields:
{
"type": "imageReference",
"identifier": "bravo",
"referenceType": "full",
"alt": "alpha"
}
FootnoteReference
FootnoteReference (Node) is like Footnote, but its content is already outside the document’s flow: placed in a FootnoteDefinition.
interface FootnoteReference <: Node {
type: "footnoteReference";
identifier: string;
}
For example, the following markdown:
[^alpha]
Yields:
{
"type": "footnoteReference",
"identifier": "alpha"
}
Definition
Definition (Node) represents the definition (i.e., location and title) of a LinkReference or an ImageReference.
interface Definition <: Node {
type: "definition";
identifier: string;
title: string | null;
url: string;
}
For example, the following markdown:
[alpha]: http://example.com
Yields:
{
"type": "definition",
"identifier": "alpha",
"title": null,
"url": "http://example.com"
}
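Since references and definitions are linked only by their identifier, resolving a reference means looking its identifier up among the Definition nodes. The following Python sketch shows one way to do that. The helper names are hypothetical, identifiers are assumed to be normalized already so they are compared exactly, and keeping the first definition when an identifier repeats is an assumption of this sketch.

```python
from typing import Any, Dict, Optional

Node = Dict[str, Any]


def collect_definitions(node: Node, found: Optional[Dict[str, Node]] = None) -> Dict[str, Node]:
    """Index every definition node in the tree by its identifier."""
    if found is None:
        found = {}
    if node.get("type") == "definition":
        # Assumption: the first definition for an identifier takes precedence.
        found.setdefault(node["identifier"], node)
    for child in node.get("children", []):
        collect_definitions(child, found)
    return found


def resolve(reference: Node, definitions: Dict[str, Node]) -> Optional[Dict[str, Any]]:
    """Return the url and title for a linkReference or imageReference, if defined."""
    definition = definitions.get(reference["identifier"])
    if definition is None:
        return None
    return {"url": definition["url"], "title": definition["title"]}


tree = {
    "type": "root",
    "children": [
        {
            "type": "paragraph",
            "children": [
                {
                    "type": "linkReference",
                    "identifier": "alpha",
                    "referenceType": "collapsed",
                    "children": [{"type": "text", "value": "alpha"}],
                }
            ],
        },
        {"type": "definition", "identifier": "alpha", "title": None, "url": "http://example.com"},
    ],
}

definitions = collect_definitions(tree)
reference = tree["children"][0]["children"][0]
resolved = resolve(reference, definitions)
print(resolved)
```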
FootnoteDefinition
FootnoteDefinition (Parent) represents the definition (i.e., content) of a FootnoteReference.
interface FootnoteDefinition <: Parent {
type: "footnoteDefinition";
identifier: string;
}
For example, the following markdown:
[^alpha]: bravo and charlie.
Yields:
{
"type": "footnoteDefinition",
"identifier": "alpha",
"children": [{
"type": "paragraph",
"children": [{
"type": "text",
"value": "bravo and charlie."
}]
}]
}
TextNode
TextNode (Text) represents everything that is just text. Note that its type property is text, but it is different from Text.
interface TextNode <: Text {
type: "text";
}
For example, the following markdown:
Alpha bravo charlie.
Yields:
{
"type": "text",
"value": "Alpha bravo charlie."
}
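A common use of text nodes is recovering the plain text of a larger node. The Python sketch below concatenates every `value` found under a node, falling back to `alt` for image-like nodes. `to_string` is a hypothetical helper, and treating `alt` as text content is an assumption of this sketch rather than something MDAST prescribes.

```python
from typing import Any, Dict

Node = Dict[str, Any]


def to_string(node: Node) -> str:
    """Return the plain-text content of a node and its descendants."""
    if "value" in node:
        return node["value"]
    if "alt" in node:
        # `alt` may be null (None) on Image and ImageReference nodes.
        return node["alt"] or ""
    return "".join(to_string(child) for child in node.get("children", []))


paragraph = {
    "type": "paragraph",
    "children": [
        {"type": "emphasis", "children": [{"type": "text", "value": "alpha"}]},
        {"type": "text", "value": " "},
        {"type": "emphasis", "children": [{"type": "text", "value": "bravo"}]},
    ],
}

plain_text = to_string(paragraph)
print(plain_text)
```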
Related
License
MIT © Titus Wormer