Markdown Abstract Syntax Tree format


Install
npm install mdast@0.26.1

Documentation

MDAST

Markdown Abstract Syntax Tree.


⚠️ MDAST, the pluggable markdown parser, was recently separated from this project and given a new name: remark. See its documentation to read more about what changed and how to migrate.

MDAST exposes markdown as an abstract syntax tree. Abstract means that not all information is stored in the tree, so an exact replica of the original document cannot be re-created. Syntax tree means that syntax is present in the tree, so a syntactically equivalent document can be re-created.

MDAST is a subset of unist, and implemented by remark.

This document describes version 2.0.0 of MDAST. Changelog ».

AST

Root

Root (Parent) houses all nodes.

interface Root <: Parent {
  type: "root";
}
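
The Node, Parent, and Text types that the interfaces in this document extend come from unist. As a rough TypeScript sketch of those base shapes (the Mdast-prefixed names and the type guard are illustrative and not part of MDAST or unist; only the fields shown in this document's examples are modelled):

```typescript
// Hypothetical TypeScript rendering of the base types used in this document.
// The names are made up for illustration; only `type`, `children`, and
// `value` are modelled.
interface MdastNode {
  type: string;
}

// A node that contains other nodes.
interface MdastParent extends MdastNode {
  children: MdastNode[];
}

// A node that carries a string value.
interface MdastText extends MdastNode {
  value: string;
}

interface MdastRoot extends MdastParent {
  type: "root";
}

// Type guard: true when the node has a children array.
function isMdastParent(node: MdastNode): node is MdastParent {
  return Array.isArray((node as Partial<MdastParent>).children);
}

const emptyRoot: MdastRoot = { type: "root", children: [] };
```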

Paragraph

Paragraph (Parent) represents a unit of discourse dealing with a particular point or idea.

interface Paragraph <: Parent {
  type: "paragraph";
}

For example, the following markdown:

Alpha bravo charlie.

Yields:

{
  "type": "paragraph",
  "children": [{
    "type": "text",
    "value": "Alpha bravo charlie."
  }]
}

Blockquote

Blockquote (Parent) represents a quote.

interface Blockquote <: Parent {
  type: "blockquote";
}

For example, the following markdown:

> Alpha bravo charlie.

Yields:

{
  "type": "blockquote",
  "children": [{
    "type": "paragraph",
    "children": [{
      "type": "text",
      "value": "Alpha bravo charlie."
    }]
  }]
}

Heading

Heading (Parent) represents a heading, just like in HTML, with a depth greater than or equal to 1 and less than or equal to 6.

interface Heading <: Parent {
  type: "heading";
  depth: 1 <= uint32 <= 6;
}

For example, the following markdown:

# Alpha

Yields:

{
  "type": "heading",
  "depth": 1,
  "children": [{
    "type": "text",
    "value": "Alpha"
  }]
}
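
The depth constraint can be checked before a node is handed to other tools. A minimal sketch, assuming a hypothetical helper named isValidHeadingDepth that is not part of any MDAST tooling:

```typescript
// Hypothetical helper: returns true when depth is an integer in 1..6,
// matching the `1 <= uint32 <= 6` constraint on Heading.
function isValidHeadingDepth(depth: number): boolean {
  return Math.floor(depth) === depth && depth >= 1 && depth <= 6;
}

const depthOfExample = 1; // the depth produced by "# Alpha"
const exampleIsValid = isValidHeadingDepth(depthOfExample); // true
```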

Code

Code (Text) occurs at block level (see InlineCode for code spans). Code has a lang property: the language flag when the block is written as a GitHub Flavoured Markdown fence with a flag, null otherwise.

interface Code <: Text {
  type: "code";
  lang: string | null;
}

For example, the following markdown:

    foo()

Yields:

{
  "type": "code",
  "lang": null,
  "value": "foo()"
}
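
Because lang may be null, consumers need to handle both cases. A small sketch that tallies code blocks per language (CodeLike and countCodeLanguages are hypothetical names, and grouping unflagged blocks under "(none)" is a choice made here, not something MDAST prescribes):

```typescript
// Hypothetical shape mirroring the Code interface above.
interface CodeLike {
  type: "code";
  lang: string | null;
  value: string;
}

// Counts code blocks per language; blocks without a language flag
// (lang: null) are grouped under the key "(none)".
function countCodeLanguages(blocks: CodeLike[]): Record<string, number> {
  const counts: Record<string, number> = {};
  for (const block of blocks) {
    const key = block.lang === null ? "(none)" : block.lang;
    counts[key] = (counts[key] || 0) + 1;
  }
  return counts;
}
```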

InlineCode

InlineCode (Text) occurs inline (see Code for blocks). Inline code does not sport a lang attribute.

interface InlineCode <: Text {
  type: "inlineCode";
}

For example, the following markdown:

`foo()`

Yields:

{
  "type": "inlineCode",
  "value": "foo()"
}

YAML

YAML (Text) can occur at the start of a document, and contains embedded YAML data.

interface YAML <: Text {
  type: "yaml";
}

For example, the following markdown:

---
foo: bar
---

Yields:

{
  "type": "yaml",
  "value": "foo: bar"
}

HTML

HTML (Text) contains embedded HTML.

interface HTML <: Text {
  type: "html";
}

For example, the following markdown:

<div>

Yields:

{
  "type": "html",
  "value": "<div>"
}

List

List (Parent) contains ListItems.

The start property contains the starting number of the list when ordered: true; null otherwise.

When all list items have loose: false, the list’s loose property is also false. Otherwise, loose: true.

interface List <: Parent {
  type: "list";
  ordered: true | false;
  start: uint32 | null;
  loose: true | false;
}

For example, the following markdown:

1. [x] foo

Yields:

{
  "type": "list",
  "ordered": true,
  "start": 1,
  "loose": false,
  "children": [{
    "type": "listItem",
    "loose": false,
    "checked": true,
    "children": [{
      "type": "paragraph",
      "children": [{
        "type": "text",
        "value": "foo"
      }]
    }]
  }]
}
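
The rule for a list's loose property can be written as a one-line check. A minimal sketch, assuming hypothetical names ListItemLike and isListLoose:

```typescript
// Hypothetical shape covering the ListItem fields defined in this document.
interface ListItemLike {
  type: "listItem";
  loose: boolean;
  checked: boolean | null;
}

// A list is loose unless every item has loose: false.
// An empty list has no loose item, so it yields false.
function isListLoose(items: ListItemLike[]): boolean {
  return items.some((item) => item.loose);
}
```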

ListItem

ListItem (Parent) is a child of a List.

Loose ListItems often contain more than one block-level element.

A checked property exists on ListItems, set to true (when checked), false (when unchecked), or null (when not containing a checkbox). See Task Lists on GitHub for information.

interface ListItem <: Parent {
  type: "listItem";
  loose: true | false;
  checked: true | false | null;
}

For an example, see the definition of List.
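
The three states of checked map naturally onto the task-list marker. A sketch of that mapping (checkboxPrefix is a hypothetical helper, not part of any MDAST tooling):

```typescript
// Hypothetical helper: turns the tri-state `checked` property into the
// task-list marker that would precede the item's content.
function checkboxPrefix(checked: boolean | null): string {
  if (checked === null) {
    return ""; // the item does not contain a checkbox
  }
  return checked ? "[x] " : "[ ] ";
}

const checkedItemText = checkboxPrefix(true) + "foo"; // "[x] foo"
```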

Table

Table (Parent) represents tabular data, with alignment. Its children are TableRows, the first of which acts as a table header row.

table.align represents the alignment of columns.

interface Table <: Parent {
  type: "table";
  align: [alignType];
}
enum alignType {
  "left" | "right" | "center" | null;
}

For example, the following markdown:

| foo | bar |
| :-- | :-: |
| baz | qux |

Yields:

{
  "type": "table",
  "align": ["left", "center"],
  "children": [
    {
      "type": "tableRow",
      "children": [
        {
          "type": "tableCell",
          "children": [{
            "type": "text",
            "value": "foo"
          }]
        },
        {
          "type": "tableCell",
          "children": [{
            "type": "text",
            "value": "bar"
          }]
        }
      ]
    },
    {
      "type": "tableRow",
      "children": [
        {
          "type": "tableCell",
          "children": [{
            "type": "text",
            "value": "baz"
          }]
        },
        {
          "type": "tableCell",
          "children": [{
            "type": "text",
            "value": "qux"
          }]
        }
      ]
    }
  ]
}
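
The align array corresponds to the delimiter row of the table. A minimal sketch that rebuilds that row from align (the AlignType alias mirrors the enum above; alignmentCell and alignmentRow are hypothetical helpers, and the three-character cell width is an arbitrary choice):

```typescript
// Mirrors the alignType enum above.
type AlignType = "left" | "right" | "center" | null;

// Maps one column alignment to its delimiter cell.
function alignmentCell(align: AlignType): string {
  switch (align) {
    case "left":
      return ":--";
    case "right":
      return "--:";
    case "center":
      return ":-:";
    default:
      return "---"; // null: no alignment specified
  }
}

// Builds the full delimiter row for a table.
function alignmentRow(align: AlignType[]): string {
  return "| " + align.map(alignmentCell).join(" | ") + " |";
}

const exampleDelimiterRow = alignmentRow(["left", "center"]); // "| :-- | :-: |"
```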

TableRow

TableRow (Parent). Its children are always TableCells.

interface TableRow <: Parent {
  type: "tableRow";
}

For an example, see the definition of Table.

TableCell

TableCell (Parent). Contains a single tabular field.

interface TableCell <: Parent {
  type: "tableCell";
}

For an example, see the definition of Table.

ThematicBreak

A ThematicBreak (Node) represents a break in content, often shown as a horizontal rule, or by two HTML section elements.

interface ThematicBreak <: Node {
  type: "thematicBreak";
}

For example, the following markdown:

***

Yields:

{
  "type": "thematicBreak"
}

Break

Break (Node) represents an explicit line break.

interface Break <: Node {
  type: "break";
}

For example, the following markdown (interpuncts represent spaces):

foo··
bar

Yields:

{
  "type": "paragraph",
  "children": [
    {
      "type": "text",
      "value": "foo"
    },
    {
      "type": "break"
    },
    {
      "type": "text",
      "value": "bar"
    }
  ]
}

Emphasis

Emphasis (Parent) represents slight emphasis.

interface Emphasis <: Parent {
  type: "emphasis";
}

For example, the following markdown:

*alpha* _bravo_

Yields:

{
  "type": "paragraph",
  "children": [
    {
      "type": "emphasis",
      "children": [{
        "type": "text",
        "value": "alpha"
      }]
    },
    {
      "type": "text",
      "value": " "
    },
    {
      "type": "emphasis",
      "children": [{
        "type": "text",
        "value": "bravo"
      }]
    }
  ]
}

Strong

Strong (Parent) represents strong emphasis.

interface Strong <: Parent {
  type: "strong";
}

For example, the following markdown:

**alpha** __bravo__

Yields:

{
  "type": "paragraph",
  "children": [
    {
      "type": "strong",
      "children": [{
        "type": "text",
        "value": "alpha"
      }]
    },
    {
      "type": "text",
      "value": " "
    },
    {
      "type": "strong",
      "children": [{
        "type": "text",
        "value": "bravo"
      }]
    }
  ]
}

Delete

Delete (Parent) represents text ready for removal.

interface Delete <: Parent {
  type: "delete";
}

For example, the following markdown:

~~alpha~~

Yields:

{
  "type": "delete",
  "children": [{
    "type": "text",
    "value": "alpha"
  }]
}

Link

Link (Parent) represents the humble hyperlink.

interface Link <: Parent {
  type: "link";
  title: string | null;
  url: string;
}

For example, the following markdown:

[alpha](http://example.com "bravo")

Yields:

{
  "type": "link",
  "title": "bravo",
  "url": "http://example.com",
  "children": [{
    "type": "text",
    "value": "alpha"
  }]
}
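
Since every Parent exposes children, finding all links is a short recursive walk. A sketch under the assumption of a simplified node shape (LinkTreeNode and collectLinkUrls are hypothetical names; real tools may offer their own tree visitors):

```typescript
// Hypothetical minimal node shape covering only the fields used here.
interface LinkTreeNode {
  type: string;
  url?: string;
  title?: string | null;
  value?: string;
  children?: LinkTreeNode[];
}

// Walks the tree depth-first and collects the url of every link node.
function collectLinkUrls(node: LinkTreeNode, urls: string[] = []): string[] {
  if (node.type === "link" && typeof node.url === "string") {
    urls.push(node.url);
  }
  for (const child of node.children || []) {
    collectLinkUrls(child, urls);
  }
  return urls;
}

const linkParagraph: LinkTreeNode = {
  type: "paragraph",
  children: [
    { type: "text", value: "See " },
    {
      type: "link",
      title: "bravo",
      url: "http://example.com",
      children: [{ type: "text", value: "alpha" }]
    }
  ]
};

const linkUrls = collectLinkUrls(linkParagraph); // ["http://example.com"]
```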

Image

Image (Node) represents the figurative figure.

interface Image <: Node {
  type: "image";
  title: string | null;
  alt: string | null;
  url: string;
}

For example, the following markdown:

![alpha](http://example.com/favicon.ico "bravo")

Yields:

{
  "type": "image",
  "title": "bravo",
  "url": "http://example.com/favicon.ico",
  "alt": "alpha"
}

Footnote

Footnote (Parent) represents an inline marker, whose content relates to the document but is outside its flow.

interface Footnote <: Parent {
  type: "footnote";
}

For example, the following markdown:

[^alpha bravo]

Yields:

{
  "type": "footnote",
  "children": [{
    "type": "text",
    "value": "alpha bravo"
  }]
}

LinkReference

LinkReference (Parent) represents a humble hyperlink, its url and title defined somewhere else in the document by a Definition.

referenceType is needed to detect if a reference was meant as a reference ([foo][]) or just unescaped brackets ([foo]).

interface LinkReference <: Parent {
  type: "linkReference";
  identifier: string;
  referenceType: referenceType;
}
enum referenceType {
  "shortcut" | "collapsed" | "full";
}

For example, the following markdown:

[alpha][bravo]

Yields:

{
  "type": "linkReference",
  "identifier": "bravo",
  "referenceType": "full",
  "children": [{
    "type": "text",
    "value": "alpha"
  }]
}
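
The three referenceType values correspond to the three ways a reference can be written. A sketch that writes a reference back out (renderReference is a hypothetical helper; it ignores escaping and assumes the link text is plain):

```typescript
// Mirrors the referenceType enum above.
type ReferenceType = "shortcut" | "collapsed" | "full";

// Hypothetical helper: renders a link reference with plain text content.
function renderReference(
  text: string,
  identifier: string,
  referenceType: ReferenceType
): string {
  if (referenceType === "full") {
    return "[" + text + "][" + identifier + "]";
  }
  if (referenceType === "collapsed") {
    return "[" + text + "][]";
  }
  return "[" + text + "]"; // shortcut
}

const fullReference = renderReference("alpha", "bravo", "full"); // "[alpha][bravo]"
```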

ImageReference

ImageReference (Node) represents a figurative figure, its url and title defined somewhere else in the document by a Definition.

referenceType is needed to detect if a reference was meant as a reference (![foo][]) or just unescaped brackets (![foo]). See LinkReference for the definition of referenceType.

interface ImageReference <: Node {
  type: "imageReference";
  identifier: string;
  referenceType: referenceType;
  alt: string | null;
}

For example, the following markdown:

![alpha][bravo]

Yields:

{
  "type": "imageReference",
  "identifier": "bravo",
  "referenceType": "full",
  "alt": "alpha"
}

FootnoteReference

FootnoteReference (Node) is like Footnote, but its content is already outside the document's flow: placed in a FootnoteDefinition.

interface FootnoteReference <: Node {
  type: "footnoteReference";
  identifier: string;
}

For example, the following markdown:

[^alpha]

Yields:

{
  "type": "footnoteReference",
  "identifier": "alpha"
}

Definition

Definition (Node) represents the definition (i.e., location and title) of a LinkReference or an ImageReference.

interface Definition <: Node {
  type: "definition";
  identifier: string;
  title: string | null;
  url: string;
}

For example, the following markdown:

[alpha]: http://example.com

Yields:

{
  "type": "definition",
  "identifier": "alpha",
  "title": null,
  "url": "http://example.com"
}
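
References and definitions are tied together by identifier. A minimal sketch of resolving a reference to its url (DefinitionLike, indexDefinitions, and resolveUrl are hypothetical names; exact string matching and keeping the first definition per identifier are assumptions made for this sketch):

```typescript
// Hypothetical shape mirroring the Definition interface above.
interface DefinitionLike {
  type: "definition";
  identifier: string;
  title: string | null;
  url: string;
}

// Builds a lookup table keyed by identifier. Assumption: when an identifier
// is defined more than once, the first definition is kept.
function indexDefinitions(
  definitions: DefinitionLike[]
): Record<string, DefinitionLike> {
  // A prototype-less object avoids accidental hits on keys like "constructor".
  const index: Record<string, DefinitionLike> = Object.create(null);
  for (const definition of definitions) {
    if (!(definition.identifier in index)) {
      index[definition.identifier] = definition;
    }
  }
  return index;
}

// Returns the url for an identifier, or null when no definition exists.
function resolveUrl(
  identifier: string,
  index: Record<string, DefinitionLike>
): string | null {
  const definition = index[identifier];
  return definition ? definition.url : null;
}
```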

FootnoteDefinition

FootnoteDefinition (Parent) represents the definition (i.e., content) of a FootnoteReference.

interface FootnoteDefinition <: Parent {
  type: "footnoteDefinition";
  identifier: string;
}

For example, the following markdown:

[^alpha]: bravo and charlie.

Yields:

{
  "type": "footnoteDefinition",
  "identifier": "alpha",
  "children": [{
    "type": "paragraph",
    "children": [{
      "type": "text",
      "value": "bravo and charlie."
    }]
  }]
}

TextNode

TextNode (Text) represents everything that is just text. Note that its type property is text, but it is different from Text.

interface TextNode <: Text {
  type: "text";
}

For example, the following markdown:

Alpha bravo charlie.

Yields:

{
  "type": "text",
  "value": "Alpha bravo charlie."
}
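
Text content can be recovered from any subtree by concatenating the value of every node that has one. A sketch under the assumption of a simplified node shape (TextTreeNode and plainText are hypothetical names; this version also includes the value of code, html, and yaml nodes, which may or may not be desirable):

```typescript
// Hypothetical minimal node shape covering only the fields used here.
interface TextTreeNode {
  type: string;
  value?: string;
  children?: TextTreeNode[];
}

// Returns the concatenated `value` of a node and all of its descendants.
function plainText(node: TextTreeNode): string {
  if (typeof node.value === "string") {
    return node.value;
  }
  return (node.children || []).map(plainText).join("");
}

const emphasisParagraph: TextTreeNode = {
  type: "paragraph",
  children: [
    { type: "emphasis", children: [{ type: "text", value: "alpha" }] },
    { type: "text", value: " " },
    { type: "emphasis", children: [{ type: "text", value: "bravo" }] }
  ]
};

const emphasisText = plainText(emphasisParagraph); // "alpha bravo"
```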

License

MIT © Titus Wormer