| spec.txt | spec.txt | |||
|---|---|---|---|---|
| --- | --- | |||
| title: CommonMark Spec | title: CommonMark Spec | |||
| author: John MacFarlane | author: John MacFarlane | |||
| version: 0.22 | version: 0.23 | |||
| date: 2015-08-23 | date: 2015-12-29 | |||
| license: '[CC-BY-SA 4.0](http://creativecommons.org/licenses/by-sa/4.0/)' | license: '[CC-BY-SA 4.0](http://creativecommons.org/licenses/by-sa/4.0/)' | |||
| ... | ... | |||
| # Introduction | # Introduction | |||
| ## What is Markdown? | ## What is Markdown? | |||
| Markdown is a plain text format for writing structured documents, | Markdown is a plain text format for writing structured documents, | |||
| based on conventions used for indicating formatting in email and | based on conventions used for indicating formatting in email and | |||
| usenet posts. It was developed in 2004 by John Gruber, who wrote | usenet posts. It was developed in 2004 by John Gruber, who wrote | |||
| skipping to change at line 39 | skipping to change at line 39 | |||
| 1. How much indentation is needed for a sublist? The spec says that | 1. How much indentation is needed for a sublist? The spec says that | |||
| continuation paragraphs need to be indented four spaces, but is | continuation paragraphs need to be indented four spaces, but is | |||
| not fully explicit about sublists. It is natural to think that | not fully explicit about sublists. It is natural to think that | |||
| they, too, must be indented four spaces, but `Markdown.pl` does | they, too, must be indented four spaces, but `Markdown.pl` does | |||
| not require that. This is hardly a "corner case," and divergences | not require that. This is hardly a "corner case," and divergences | |||
| between implementations on this issue often lead to surprises for | between implementations on this issue often lead to surprises for | |||
| users in real documents. (See [this comment by John | users in real documents. (See [this comment by John | |||
| Gruber](http://article.gmane.org/gmane.text.markdown.general/1997).) | Gruber](http://article.gmane.org/gmane.text.markdown.general/1997).) | |||
| 2. Is a blank line needed before a block quote or header? | 2. Is a blank line needed before a block quote or heading? | |||
| Most implementations do not require the blank line. However, | Most implementations do not require the blank line. However, | |||
| this can lead to unexpected results in hard-wrapped text, and | this can lead to unexpected results in hard-wrapped text, and | |||
| also to ambiguities in parsing (note that some implementations | also to ambiguities in parsing (note that some implementations | |||
| put the header inside the blockquote, while others do not). | put the heading inside the blockquote, while others do not). | |||
| (John Gruber has also spoken [in favor of requiring the blank | (John Gruber has also spoken [in favor of requiring the blank | |||
| lines](http://article.gmane.org/gmane.text.markdown.general/2146).) | lines](http://article.gmane.org/gmane.text.markdown.general/2146).) | |||
| 3. Is a blank line needed before an indented code block? | 3. Is a blank line needed before an indented code block? | |||
| (`Markdown.pl` requires it, but this is not mentioned in the | (`Markdown.pl` requires it, but this is not mentioned in the | |||
| documentation, and some implementations do not require it.) | documentation, and some implementations do not require it.) | |||
| ``` markdown | ``` markdown | |||
| paragraph | paragraph | |||
| code? | code? | |||
| skipping to change at line 88 | skipping to change at line 88 | |||
| [here](http://article.gmane.org/gmane.text.markdown.general/2554).) | [here](http://article.gmane.org/gmane.text.markdown.general/2554).) | |||
| 5. Can list markers be indented? Can ordered list markers be right-aligned? | 5. Can list markers be indented? Can ordered list markers be right-aligned? | |||
| ``` markdown | ``` markdown | |||
| 8. item 1 | 8. item 1 | |||
| 9. item 2 | 9. item 2 | |||
| 10. item 2a | 10. item 2a | |||
| ``` | ``` | |||
| 6. Is this one list with a horizontal rule in its second item, | 6. Is this one list with a thematic break in its second item, | |||
| or two lists separated by a horizontal rule? | or two lists separated by a thematic break? | |||
| ``` markdown | ``` markdown | |||
| * a | * a | |||
| * * * * * | * * * * * | |||
| * b | * b | |||
| ``` | ``` | |||
| 7. When list markers change from numbers to bullets, do we have | 7. When list markers change from numbers to bullets, do we have | |||
| two lists or one? (The Markdown syntax description suggests two, | two lists or one? (The Markdown syntax description suggests two, | |||
| but the perl scripts and many other implementations produce one.) | but the perl scripts and many other implementations produce one.) | |||
| skipping to change at line 131 | skipping to change at line 131 | |||
| ``` | ``` | |||
| 10. What are the precedence rules between block-level and inline-level | 10. What are the precedence rules between block-level and inline-level | |||
| structure? For example, how should the following be parsed? | structure? For example, how should the following be parsed? | |||
| ``` markdown | ``` markdown | |||
| - `a long code span can contain a hyphen like this | - `a long code span can contain a hyphen like this | |||
| - and it can screw things up` | - and it can screw things up` | |||
| ``` | ``` | |||
| 11. Can list items include section headers? (`Markdown.pl` does not | 11. Can list items include section headings? (`Markdown.pl` does not | |||
| allow this, but does allow blockquotes to include headers.) | allow this, but does allow blockquotes to include headings.) | |||
| ``` markdown | ``` markdown | |||
| - # Heading | - # Heading | |||
| ``` | ``` | |||
| 12. Can list items be empty? | 12. Can list items be empty? | |||
| ``` markdown | ``` markdown | |||
| * a | * a | |||
| * | * | |||
| skipping to change at line 327 | skipping to change at line 327 | |||
| ## Insecure characters | ## Insecure characters | |||
| For security reasons, the Unicode character `U+0000` must be replaced | For security reasons, the Unicode character `U+0000` must be replaced | |||
| with the replacement character (`U+FFFD`). | with the replacement character (`U+FFFD`). | |||
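
A minimal sketch of the substitution described above, assuming the input has already been decoded into a Python string. The function name `replace_insecure` is hypothetical and does not come from the spec.

```python
def replace_insecure(text: str) -> str:
    """Replace every U+0000 with the replacement character U+FFFD."""
    return text.replace("\u0000", "\ufffd")
```

Every other character, including other control characters, passes through this sketch unchanged.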
| # Blocks and inlines | # Blocks and inlines | |||
| We can think of a document as a sequence of | We can think of a document as a sequence of | |||
| [blocks](@block)---structural elements like paragraphs, block | [blocks](@block)---structural elements like paragraphs, block | |||
| quotations, lists, headers, rules, and code blocks. Some blocks (like | quotations, lists, headings, rules, and code blocks. Some blocks (like | |||
| block quotes and list items) contain other blocks; others (like | block quotes and list items) contain other blocks; others (like | |||
| headers and paragraphs) contain [inline](@inline) content---text, | headings and paragraphs) contain [inline](@inline) content---text, | |||
| links, emphasized text, images, code, and so on. | links, emphasized text, images, code, and so on. | |||
| ## Precedence | ## Precedence | |||
| Indicators of block structure always take precedence over indicators | Indicators of block structure always take precedence over indicators | |||
| of inline structure. So, for example, the following is a list with | of inline structure. So, for example, the following is a list with | |||
| two items, not a list with one item containing a code span: | two items, not a list with one item containing a code span: | |||
| . | . | |||
| - `one | - `one | |||
| - two` | - two` | |||
| . | . | |||
| <ul> | <ul> | |||
| <li>`one</li> | <li>`one</li> | |||
| <li>two`</li> | <li>two`</li> | |||
| </ul> | </ul> | |||
| . | . | |||
| This means that parsing can proceed in two steps: first, the block | This means that parsing can proceed in two steps: first, the block | |||
| structure of the document can be discerned; second, text lines inside | structure of the document can be discerned; second, text lines inside | |||
| paragraphs, headers, and other block constructs can be parsed for inline | paragraphs, headings, and other block constructs can be parsed for inline | |||
| structure. The second step requires information about link reference | structure. The second step requires information about link reference | |||
| definitions that will be available only at the end of the first | definitions that will be available only at the end of the first | |||
| step. Note that the first step requires processing lines in sequence, | step. Note that the first step requires processing lines in sequence, | |||
| but the second can be parallelized, since the inline parsing of | but the second can be parallelized, since the inline parsing of | |||
| one block element does not affect the inline parsing of any other. | one block element does not affect the inline parsing of any other. | |||
| ## Container blocks and leaf blocks | ## Container blocks and leaf blocks | |||
| We can divide blocks into two types: | We can divide blocks into two types: | |||
| [container block](@container-block)s, | [container block](@container-block)s, | |||
| which can contain other blocks, and [leaf block](@leaf-block)s, | which can contain other blocks, and [leaf block](@leaf-block)s, | |||
| which cannot. | which cannot. | |||
| # Leaf blocks | # Leaf blocks | |||
| This section describes the different kinds of leaf block that make up a | This section describes the different kinds of leaf block that make up a | |||
| Markdown document. | Markdown document. | |||
| ## Horizontal rules | ## Thematic breaks | |||
| A line consisting of 0-3 spaces of indentation, followed by a sequence | A line consisting of 0-3 spaces of indentation, followed by a sequence | |||
| of three or more matching `-`, `_`, or `*` characters, each followed | of three or more matching `-`, `_`, or `*` characters, each followed | |||
| optionally by any number of spaces, forms a | optionally by any number of spaces, forms a | |||
| [horizontal rule](@horizontal-rule). | [thematic break](@thematic-break). | |||
| . | . | |||
| *** | *** | |||
| --- | --- | |||
| ___ | ___ | |||
| . | . | |||
| <hr /> | <hr /> | |||
| <hr /> | <hr /> | |||
| <hr /> | <hr /> | |||
| . | . | |||
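
The line-level rule above can be sketched as a regular expression. This is an illustration only: the names `THEMATIC_BREAK` and `is_thematic_break` are hypothetical, and the check looks at a single line in isolation, so it ignores context such as a preceding paragraph line that would turn a line of dashes into a setext underline.

```python
import re

# 0-3 spaces of indentation, then three or more matching -, _ or *
# characters, each optionally followed by any number of spaces.
THEMATIC_BREAK = re.compile(r" {0,3}([-_*])(?: *\1){2,} *")

def is_thematic_break(line: str) -> bool:
    """Return True if the line, taken on its own, has the form of a thematic break."""
    return THEMATIC_BREAK.fullmatch(line.rstrip("\n")) is not None
```

The backreference `\1` is what enforces that all of the marker characters match, so a mixed line such as `*-*` is rejected.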
| skipping to change at line 492 | skipping to change at line 492 | |||
| a------ | a------ | |||
| ---a--- | ---a--- | |||
| . | . | |||
| <p>_ _ _ _ a</p> | <p>_ _ _ _ a</p> | |||
| <p>a------</p> | <p>a------</p> | |||
| <p>---a---</p> | <p>---a---</p> | |||
| . | . | |||
| It is required that all of the [non-whitespace character]s be the same. | It is required that all of the [non-whitespace character]s be the same. | |||
| So, this is not a horizontal rule: | So, this is not a thematic break: | |||
| . | . | |||
| *-* | *-* | |||
| . | . | |||
| <p><em>-</em></p> | <p><em>-</em></p> | |||
| . | . | |||
| Horizontal rules do not need blank lines before or after: | Thematic breaks do not need blank lines before or after: | |||
| . | . | |||
| - foo | - foo | |||
| *** | *** | |||
| - bar | - bar | |||
| . | . | |||
| <ul> | <ul> | |||
| <li>foo</li> | <li>foo</li> | |||
| </ul> | </ul> | |||
| <hr /> | <hr /> | |||
| <ul> | <ul> | |||
| <li>bar</li> | <li>bar</li> | |||
| </ul> | </ul> | |||
| . | . | |||
| Horizontal rules can interrupt a paragraph: | Thematic breaks can interrupt a paragraph: | |||
| . | . | |||
| Foo | Foo | |||
| *** | *** | |||
| bar | bar | |||
| . | . | |||
| <p>Foo</p> | <p>Foo</p> | |||
| <hr /> | <hr /> | |||
| <p>bar</p> | <p>bar</p> | |||
| . | . | |||
| If a line of dashes that meets the above conditions for being a | If a line of dashes that meets the above conditions for being a | |||
| horizontal rule could also be interpreted as the underline of a [setext | thematic break could also be interpreted as the underline of a [setext | |||
| header], the interpretation as a | heading], the interpretation as a | |||
| [setext header] takes precedence. Thus, for example, | [setext heading] takes precedence. Thus, for example, | |||
| this is a setext header, not a paragraph followed by a horizontal rule: | this is a setext heading, not a paragraph followed by a thematic break: | |||
| . | . | |||
| Foo | Foo | |||
| --- | --- | |||
| bar | bar | |||
| . | . | |||
| <h2>Foo</h2> | <h2>Foo</h2> | |||
| <p>bar</p> | <p>bar</p> | |||
| . | . | |||
| When both a horizontal rule and a list item are possible | When both a thematic break and a list item are possible | |||
| interpretations of a line, the horizontal rule takes precedence: | interpretations of a line, the thematic break takes precedence: | |||
| . | . | |||
| * Foo | * Foo | |||
| * * * | * * * | |||
| * Bar | * Bar | |||
| . | . | |||
| <ul> | <ul> | |||
| <li>Foo</li> | <li>Foo</li> | |||
| </ul> | </ul> | |||
| <hr /> | <hr /> | |||
| <ul> | <ul> | |||
| <li>Bar</li> | <li>Bar</li> | |||
| </ul> | </ul> | |||
| . | . | |||
| If you want a horizontal rule in a list item, use a different bullet: | If you want a thematic break in a list item, use a different bullet: | |||
| . | . | |||
| - Foo | - Foo | |||
| - * * * | - * * * | |||
| . | . | |||
| <ul> | <ul> | |||
| <li>Foo</li> | <li>Foo</li> | |||
| <li> | <li> | |||
| <hr /> | <hr /> | |||
| </li> | </li> | |||
| </ul> | </ul> | |||
| . | . | |||
| ## ATX headers | ## ATX headings | |||
| An [ATX header](@atx-header) | An [ATX heading](@atx-heading) | |||
| consists of a string of characters, parsed as inline content, between an | consists of a string of characters, parsed as inline content, between an | |||
| opening sequence of 1--6 unescaped `#` characters and an optional | opening sequence of 1--6 unescaped `#` characters and an optional | |||
| closing sequence of any number of unescaped `#` characters. | closing sequence of any number of unescaped `#` characters. | |||
| The opening sequence of `#` characters cannot be followed directly by a | The opening sequence of `#` characters must be followed by a | |||
| [non-whitespace character]. The optional closing sequence of `#`s must be | [space] or by the end of line. The optional closing sequence of `#`s must be | |||
| preceded by a [space] and may be followed by spaces only. The opening | preceded by a [space] and may be followed by spaces only. The opening | |||
| `#` character may be indented 0-3 spaces. The raw contents of the | `#` character may be indented 0-3 spaces. The raw contents of the | |||
| header are stripped of leading and trailing spaces before being parsed | heading are stripped of leading and trailing spaces before being parsed | |||
| as inline content. The header level is equal to the number of `#` | as inline content. The heading level is equal to the number of `#` | |||
| characters in the opening sequence. | characters in the opening sequence. | |||
| Simple headers: | Simple headings: | |||
| . | . | |||
| # foo | # foo | |||
| ## foo | ## foo | |||
| ### foo | ### foo | |||
| #### foo | #### foo | |||
| ##### foo | ##### foo | |||
| ###### foo | ###### foo | |||
| . | . | |||
| <h1>foo</h1> | <h1>foo</h1> | |||
| <h2>foo</h2> | <h2>foo</h2> | |||
| <h3>foo</h3> | <h3>foo</h3> | |||
| <h4>foo</h4> | <h4>foo</h4> | |||
| <h5>foo</h5> | <h5>foo</h5> | |||
| <h6>foo</h6> | <h6>foo</h6> | |||
| . | . | |||
| More than six `#` characters is not a header: | More than six `#` characters is not a heading: | |||
| . | . | |||
| ####### foo | ####### foo | |||
| . | . | |||
| <p>####### foo</p> | <p>####### foo</p> | |||
| . | . | |||
| At least one space is required between the `#` characters and the | At least one space is required between the `#` characters and the | |||
| header's contents, unless the header is empty. Note that many | heading's contents, unless the heading is empty. Note that many | |||
| implementations currently do not require the space. However, the | implementations currently do not require the space. However, the | |||
| space was required by the | space was required by the | |||
| [original ATX implementation](http://www.aaronsw.com/2002/atx/atx.py), | [original ATX implementation](http://www.aaronsw.com/2002/atx/atx.py), | |||
| and it helps prevent things like the following from being parsed as | and it helps prevent things like the following from being parsed as | |||
| headers: | headings: | |||
| . | . | |||
| #5 bolt | #5 bolt | |||
| #foobar | #hashtag | |||
| . | . | |||
| <p>#5 bolt</p> | <p>#5 bolt</p> | |||
| <p>#foobar</p> | <p>#hashtag</p> | |||
| . | . | |||
| This is not a header, because the first `#` is escaped: | A tab will not work: | |||
| | . | |||
| | #→foo | |||
| | . | |||
| | <p>#→foo</p> | |||
| | . | |||
| | This is not a heading, because the first `#` is escaped: | |||
| . | . | |||
| \## foo | \## foo | |||
| . | . | |||
| <p>## foo</p> | <p>## foo</p> | |||
| . | . | |||
| Contents are parsed as inlines: | Contents are parsed as inlines: | |||
| . | . | |||
| skipping to change at line 714 | skipping to change at line 722 | |||
| Spaces are allowed after the closing sequence: | Spaces are allowed after the closing sequence: | |||
| . | . | |||
| ### foo ### | ### foo ### | |||
| . | . | |||
| <h3>foo</h3> | <h3>foo</h3> | |||
| . | . | |||
| A sequence of `#` characters with anything but [space]s following it | A sequence of `#` characters with anything but [space]s following it | |||
| is not a closing sequence, but counts as part of the contents of the | is not a closing sequence, but counts as part of the contents of the | |||
| header: | heading: | |||
| . | . | |||
| ### foo ### b | ### foo ### b | |||
| . | . | |||
| <h3>foo ### b</h3> | <h3>foo ### b</h3> | |||
| . | . | |||
| The closing sequence must be preceded by a space: | The closing sequence must be preceded by a space: | |||
| . | . | |||
| skipping to change at line 743 | skipping to change at line 751 | |||
| . | . | |||
| ### foo \### | ### foo \### | |||
| ## foo #\## | ## foo #\## | |||
| # foo \# | # foo \# | |||
| . | . | |||
| <h3>foo ###</h3> | <h3>foo ###</h3> | |||
| <h2>foo ###</h2> | <h2>foo ###</h2> | |||
| <h1>foo #</h1> | <h1>foo #</h1> | |||
| . | . | |||
| ATX headers need not be separated from surrounding content by blank | ATX headings need not be separated from surrounding content by blank | |||
| lines, and they can interrupt paragraphs: | lines, and they can interrupt paragraphs: | |||
| . | . | |||
| **** | **** | |||
| ## foo | ## foo | |||
| **** | **** | |||
| . | . | |||
| <hr /> | <hr /> | |||
| <h2>foo</h2> | <h2>foo</h2> | |||
| <hr /> | <hr /> | |||
| skipping to change at line 766 | skipping to change at line 774 | |||
| . | . | |||
| Foo bar | Foo bar | |||
| # baz | # baz | |||
| Bar foo | Bar foo | |||
| . | . | |||
| <p>Foo bar</p> | <p>Foo bar</p> | |||
| <h1>baz</h1> | <h1>baz</h1> | |||
| <p>Bar foo</p> | <p>Bar foo</p> | |||
| . | . | |||
| ATX headers can be empty: | ATX headings can be empty: | |||
| . | . | |||
| ## | ## | |||
| # | # | |||
| ### ### | ### ### | |||
| . | . | |||
| <h2></h2> | <h2></h2> | |||
| <h1></h1> | <h1></h1> | |||
| <h3></h3> | <h3></h3> | |||
| . | . | |||
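
As a rough sketch of the ATX heading rules in the 0.23 wording (the opening sequence must be followed by a space or the end of the line), the following recognizes a heading line and returns its level and raw contents. The function name `parse_atx_heading` and the regular expressions are assumptions made for illustration. The contents are returned unparsed, so backslash escapes and other inline syntax are left as-is.

```python
import re
from typing import Optional, Tuple

# 0-3 spaces, 1-6 '#' characters, then at least one space or the end of line.
ATX_OPEN = re.compile(r" {0,3}(#{1,6})(?: +|$)")
# Optional closing sequence: preceded by a space (or the start of the
# contents), followed by spaces only.
ATX_CLOSE = re.compile(r"(?:^| +)#+ *$")

def parse_atx_heading(line: str) -> Optional[Tuple[int, str]]:
    """Return (level, raw contents) for an ATX heading line, else None."""
    line = line.rstrip("\n")
    m = ATX_OPEN.match(line)
    if m is None:
        return None
    level = len(m.group(1))
    contents = ATX_CLOSE.sub("", line[m.end():])
    return level, contents.strip(" ")
```

Under these assumptions, `parse_atx_heading("### foo ###")` gives `(3, "foo")`, while `#5 bolt`, `#hashtag`, and a `#` followed by a tab are all rejected.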
| ## Setext headers | ## Setext headings | |||
| A [setext header](@setext-header) | A [setext heading](@setext-heading) | |||
| consists of a line of text, containing at least one [non-whitespace character], | consists of a line of text, containing at least one [non-whitespace character], | |||
| with no more than 3 spaces indentation, followed by a [setext header | with no more than 3 spaces indentation, followed by a [setext heading | |||
| underline]. The line of text must be | underline]. The line of text must be | |||
| one that, were it not followed by the setext header underline, | one that, were it not followed by the setext heading underline, | |||
| would be interpreted as part of a paragraph: it cannot be | would be interpreted as part of a paragraph: it cannot be | |||
| interpretable as a [code fence], [ATX header][ATX headers], | interpretable as a [code fence], [ATX heading][ATX headings], | |||
| [block quote][block quotes], [horizontal rule][horizontal rules], | [block quote][block quotes], [thematic break][thematic breaks], | |||
| [list item][list items], or [HTML block][HTML blocks]. | [list item][list items], or [HTML block][HTML blocks]. | |||
| A [setext header underline](@setext-header-underline) is a sequence of | A [setext heading underline](@setext-heading-underline) is a sequence of | |||
| `=` characters or a sequence of `-` characters, with no more than 3 | `=` characters or a sequence of `-` characters, with no more than 3 | |||
| spaces indentation and any number of trailing spaces. If a line | spaces indentation and any number of trailing spaces. If a line | |||
| containing a single `-` can be interpreted as an | containing a single `-` can be interpreted as an | |||
| empty [list items], it should be interpreted this way | empty [list items], it should be interpreted this way | |||
| and not as a [setext header underline]. | and not as a [setext heading underline]. | |||
| The header is a level 1 header if `=` characters are used in the | The heading is a level 1 heading if `=` characters are used in the | |||
| [setext header underline], and a level 2 | [setext heading underline], and a level 2 | |||
| header if `-` characters are used. The contents of the header are the | heading if `-` characters are used. The contents of the heading are the | |||
| result of parsing the first line as Markdown inline content. | result of parsing the first line as Markdown inline content. | |||
| In general, a setext header need not be preceded or followed by a | In general, a setext heading need not be preceded or followed by a | |||
| blank line. However, it cannot interrupt a paragraph, so when a | blank line. However, it cannot interrupt a paragraph, so when a | |||
| setext header comes after a paragraph, a blank line is needed between | setext heading comes after a paragraph, a blank line is needed between | |||
| them. | them. | |||
| Simple examples: | Simple examples: | |||
| . | . | |||
| Foo *bar* | Foo *bar* | |||
| ========= | ========= | |||
| Foo *bar* | Foo *bar* | |||
| --------- | --------- | |||
| skipping to change at line 833 | skipping to change at line 841 | |||
| Foo | Foo | |||
| ------------------------- | ------------------------- | |||
| Foo | Foo | |||
| = | = | |||
| . | . | |||
| <h2>Foo</h2> | <h2>Foo</h2> | |||
| <h1>Foo</h1> | <h1>Foo</h1> | |||
| . | . | |||
| The header content can be indented up to three spaces, and need | The heading content can be indented up to three spaces, and need | |||
| not line up with the underlining: | not line up with the underlining: | |||
| . | . | |||
| Foo | Foo | |||
| --- | --- | |||
| Foo | Foo | |||
| ----- | ----- | |||
| Foo | Foo | |||
| skipping to change at line 868 | skipping to change at line 876 | |||
| --- | --- | |||
| . | . | |||
| <pre><code>Foo | <pre><code>Foo | |||
| --- | --- | |||
| Foo | Foo | |||
| </code></pre> | </code></pre> | |||
| <hr /> | <hr /> | |||
| . | . | |||
| The setext header underline can be indented up to three spaces, and | The setext heading underline can be indented up to three spaces, and | |||
| may have trailing spaces: | may have trailing spaces: | |||
| . | . | |||
| Foo | Foo | |||
| ---- | ---- | |||
| . | . | |||
| <h2>Foo</h2> | <h2>Foo</h2> | |||
| . | . | |||
| Four spaces is too much: | Four spaces is too much: | |||
| . | . | |||
| Foo | Foo | |||
| --- | --- | |||
| . | . | |||
| <p>Foo | <p>Foo | |||
| ---</p> | ---</p> | |||
| . | . | |||
| The setext header underline cannot contain internal spaces: | The setext heading underline cannot contain internal spaces: | |||
| . | . | |||
| Foo | Foo | |||
| = = | = = | |||
| Foo | Foo | |||
| --- - | --- - | |||
| . | . | |||
| <p>Foo | <p>Foo | |||
| = =</p> | = =</p> | |||
| skipping to change at line 922 | skipping to change at line 930 | |||
| Nor does a backslash at the end: | Nor does a backslash at the end: | |||
| . | . | |||
| Foo\ | Foo\ | |||
| ---- | ---- | |||
| . | . | |||
| <h2>Foo\</h2> | <h2>Foo\</h2> | |||
| . | . | |||
| Since indicators of block structure take precedence over | Since indicators of block structure take precedence over | |||
| indicators of inline structure, the following are setext headers: | indicators of inline structure, the following are setext headings: | |||
| . | . | |||
| `Foo | `Foo | |||
| ---- | ---- | |||
| ` | ` | |||
| <a title="a lot | <a title="a lot | |||
| --- | --- | |||
| of dashes"/> | of dashes"/> | |||
| . | . | |||
| <h2>`Foo</h2> | <h2>`Foo</h2> | |||
| <p>`</p> | <p>`</p> | |||
| <h2>&lt;a title=&quot;a lot</h2> | <h2>&lt;a title=&quot;a lot</h2> | |||
| <p>of dashes&quot;/&gt;</p> | <p>of dashes&quot;/&gt;</p> | |||
| . | . | |||
| The setext header underline cannot be a [lazy continuation | The setext heading underline cannot be a [lazy continuation | |||
| line] in a list item or block quote: | line] in a list item or block quote: | |||
| . | . | |||
| > Foo | > Foo | |||
| --- | --- | |||
| . | . | |||
| <blockquote> | <blockquote> | |||
| <p>Foo</p> | <p>Foo</p> | |||
| </blockquote> | </blockquote> | |||
| <hr /> | <hr /> | |||
| skipping to change at line 962 | skipping to change at line 970 | |||
| . | . | |||
| - Foo | - Foo | |||
| --- | --- | |||
| . | . | |||
| <ul> | <ul> | |||
| <li>Foo</li> | <li>Foo</li> | |||
| </ul> | </ul> | |||
| <hr /> | <hr /> | |||
| . | . | |||
| A setext header cannot interrupt a paragraph: | A setext heading cannot interrupt a paragraph: | |||
| . | . | |||
| Foo | Foo | |||
| Bar | Bar | |||
| --- | --- | |||
| Foo | Foo | |||
| Bar | Bar | |||
| === | === | |||
| . | . | |||
| skipping to change at line 997 | skipping to change at line 1005 | |||
| Bar | Bar | |||
| --- | --- | |||
| Baz | Baz | |||
| . | . | |||
| <hr /> | <hr /> | |||
| <h2>Foo</h2> | <h2>Foo</h2> | |||
| <h2>Bar</h2> | <h2>Bar</h2> | |||
| <p>Baz</p> | <p>Baz</p> | |||
| . | . | |||
| Setext headers cannot be empty: | Setext headings cannot be empty: | |||
| . | . | |||
| ==== | ==== | |||
| . | . | |||
| <p>====</p> | <p>====</p> | |||
| . | . | |||
| Setext header text lines must not be interpretable as block | Setext heading text lines must not be interpretable as block | |||
| constructs other than paragraphs. So, the line of dashes | constructs other than paragraphs. So, the line of dashes | |||
| in these examples gets interpreted as a horizontal rule: | in these examples gets interpreted as a thematic break: | |||
| . | . | |||
| --- | --- | |||
| --- | --- | |||
| . | . | |||
| <hr /> | <hr /> | |||
| <hr /> | <hr /> | |||
| . | . | |||
| . | . | |||
| skipping to change at line 1047 | skipping to change at line 1055 | |||
| . | . | |||
| > foo | > foo | |||
| ----- | ----- | |||
| . | . | |||
| <blockquote> | <blockquote> | |||
| <p>foo</p> | <p>foo</p> | |||
| </blockquote> | </blockquote> | |||
| <hr /> | <hr /> | |||
| . | . | |||
| If you want a header with `> foo` as its literal text, you can | If you want a heading with `> foo` as its literal text, you can | |||
| use backslash escapes: | use backslash escapes: | |||
| . | . | |||
| \> foo | \> foo | |||
| ------ | ------ | |||
| . | . | |||
| <h2>&gt; foo</h2> | <h2>&gt; foo</h2> | |||
| . | . | |||
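
The definition of a setext heading underline given earlier in this section can be sketched as a single-line check. The name `setext_level` is hypothetical. The sketch deliberately leaves out the contextual rules: whether the preceding line counts as paragraph text, lazy continuation lines, and the case where a single `-` is an empty list item.

```python
import re
from typing import Optional

# Up to 3 spaces of indentation, a run of '=' or a run of '-',
# then any number of trailing spaces. No internal spaces are allowed.
SETEXT_UNDERLINE = re.compile(r" {0,3}(=+|-+) *")

def setext_level(line: str) -> Optional[int]:
    """Return 1 for an '=' underline, 2 for a '-' underline, else None."""
    m = SETEXT_UNDERLINE.fullmatch(line.rstrip("\n"))
    if m is None:
        return None
    return 1 if m.group(1).startswith("=") else 2
```

A caller would still need to check that the line above the underline is one that would otherwise be part of a paragraph before treating the pair as a heading.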
| ## Indented code blocks | ## Indented code blocks | |||
| skipping to change at line 1189 | skipping to change at line 1197 | |||
| . | . | |||
| <pre><code>foo | <pre><code>foo | |||
| </code></pre> | </code></pre> | |||
| <p>bar</p> | <p>bar</p> | |||
| . | . | |||
| And indented code can occur immediately before and after other kinds of | And indented code can occur immediately before and after other kinds of | |||
| blocks: | blocks: | |||
| . | . | |||
| # Header | # Heading | |||
| foo | foo | |||
| Header | Heading | |||
| ------ | ------ | |||
| foo | foo | |||
| ---- | ---- | |||
| . | . | |||
| <h1>Header</h1> | <h1>Heading</h1> | |||
| <pre><code>foo | <pre><code>foo | |||
| </code></pre> | </code></pre> | |||
| <h2>Header</h2> | <h2>Heading</h2> | |||
| <pre><code>foo | <pre><code>foo | |||
| </code></pre> | </code></pre> | |||
| <hr /> | <hr /> | |||
| . | . | |||
| The first line can be indented more than four spaces: | The first line can be indented more than four spaces: | |||
| . | . | |||
| foo | foo | |||
| bar | bar | |||
| skipping to change at line 1357 | skipping to change at line 1365 | |||
| aaa | aaa | |||
| ~~~ | ~~~ | |||
| ~~~~ | ~~~~ | |||
| . | . | |||
| <pre><code>aaa | <pre><code>aaa | |||
| ~~~ | ~~~ | |||
| </code></pre> | </code></pre> | |||
| . | . | |||
| Unclosed code blocks are closed by the end of the document | Unclosed code blocks are closed by the end of the document | |||
| (or the enclosing [block quote] or [list item]): | (or the enclosing [block quote][block quotes] or [list item][list items]): | |||
| . | . | |||
| ``` | ``` | |||
| . | . | |||
| <pre><code></code></pre> | <pre><code></code></pre> | |||
| . | . | |||
| . | . | |||
| ````` | ````` | |||
| skipping to change at line 1977 | skipping to change at line 1985 | |||
| . | . | |||
| <style | <style | |||
| type="text/css"> | type="text/css"> | |||
| h1 {color:red;} | h1 {color:red;} | |||
| p {color:blue;} | p {color:blue;} | |||
| </style> | </style> | |||
| . | . | |||
| If there is no matching end tag, the block will end at the | If there is no matching end tag, the block will end at the | |||
| end of the document (or the enclosing [block quote] or | end of the document (or the enclosing [block quote][block quotes] | |||
| [list item]): | or [list item][list items]): | |||
| . | . | |||
| <style | <style | |||
| type="text/css"> | type="text/css"> | |||
| foo | foo | |||
| . | . | |||
| <style | <style | |||
| type="text/css"> | type="text/css"> | |||
| skipping to change at line 2536 | skipping to change at line 2544 | |||
| Foo | Foo | |||
| [bar]: /baz | [bar]: /baz | |||
| [bar] | [bar] | |||
| . | . | |||
| <p>Foo | <p>Foo | |||
| [bar]: /baz</p> | [bar]: /baz</p> | |||
| <p>[bar]</p> | <p>[bar]</p> | |||
| . | . | |||
| However, it can directly follow other block elements, such as headers | However, it can directly follow other block elements, such as headings | |||
| and horizontal rules, and it need not be followed by a blank line. | and thematic breaks, and it need not be followed by a blank line. | |||
| . | . | |||
| # [Foo] | # [Foo] | |||
| [foo]: /url | [foo]: /url | |||
| > bar | > bar | |||
| . | . | |||
| <h1><a href="/url">Foo</a></h1> | <h1><a href="/url">Foo</a></h1> | |||
| <blockquote> | <blockquote> | |||
| <p>bar</p> | <p>bar</p> | |||
| </blockquote> | </blockquote> | |||
| skipping to change at line 3400 | skipping to change at line 3408 | |||
| <pre><code>bar | <pre><code>bar | |||
| </code></pre> | </code></pre> | |||
| <p>baz</p> | <p>baz</p> | |||
| <blockquote> | <blockquote> | |||
| <p>bam</p> | <p>bam</p> | |||
| </blockquote> | </blockquote> | |||
| </li> | </li> | |||
| </ol> | </ol> | |||
| . | . | |||
| | A list item that contains an indented code block will preserve | |||
| | empty lines within the code block verbatim, unless there are two | |||
| | or more empty lines in a row (since as described above, two | |||
| | blank lines end the list): | |||
| | . | |||
| | - Foo | |||
| | bar | |||
| | baz | |||
| | . | |||
| | <ul> | |||
| | <li> | |||
| | <p>Foo</p> | |||
| | <pre><code>bar | |||
| | baz | |||
| | </code></pre> | |||
| | </li> | |||
| | </ul> | |||
| | . | |||
| | . | |||
| | - Foo | |||
| | bar | |||
| | baz | |||
| | . | |||
| | <ul> | |||
| | <li> | |||
| | <p>Foo</p> | |||
| | <pre><code>bar | |||
| | </code></pre> | |||
| | </li> | |||
| | </ul> | |||
| | <pre><code> baz | |||
| | </code></pre> | |||
| | . | |||
| Note that ordered list start numbers must be nine digits or less: | Note that ordered list start numbers must be nine digits or less: | |||
| . | . | |||
| 123456789. ok | 123456789. ok | |||
| . | . | |||
| <ol start="123456789"> | <ol start="123456789"> | |||
| <li>ok</li> | <li>ok</li> | |||
| </ol> | </ol> | |||
| . | . | |||
| skipping to change at line 3967 | skipping to change at line 4016 | |||
| <li> | <li> | |||
| <ol start="2"> | <ol start="2"> | |||
| <li>foo</li> | <li>foo</li> | |||
| </ol> | </ol> | |||
| </li> | </li> | |||
| </ul> | </ul> | |||
| </li> | </li> | |||
| </ol> | </ol> | |||
| . | . | |||
| A list item can contain a header: | A list item can contain a heading: | |||
| . | . | |||
| - # Foo | - # Foo | |||
| - Bar | - Bar | |||
| --- | --- | |||
| baz | baz | |||
| . | . | |||
| <ul> | <ul> | |||
| <li> | <li> | |||
| <h1>Foo</h1> | <h1>Foo</h1> | |||
| skipping to change at line 4778 | skipping to change at line 4827 | |||
| Escaped characters are treated as regular characters and do | Escaped characters are treated as regular characters and do | |||
| not have their usual Markdown meanings: | not have their usual Markdown meanings: | |||
| . | . | |||
| \*not emphasized* | \*not emphasized* | |||
| \<br/> not a tag | \<br/> not a tag | |||
| \[not a link](/foo) | \[not a link](/foo) | |||
| \`not code` | \`not code` | |||
| 1\. not a list | 1\. not a list | |||
| \* not a list | \* not a list | |||
| \# not a header | \# not a heading | |||
| \[foo]: /url "not a reference" | \[foo]: /url "not a reference" | |||
| . | . | |||
| <p>*not emphasized* | <p>*not emphasized* | |||
| <br/> not a tag | <br/> not a tag | |||
| [not a link](/foo) | [not a link](/foo) | |||
| `not code` | `not code` | |||
| 1. not a list | 1. not a list | |||
| * not a list | * not a list | |||
| # not a header | # not a heading | |||
| [foo]: /url "not a reference"</p> | [foo]: /url "not a reference"</p> | |||
| . | . | |||
| If a backslash is itself escaped, the following character is not: | If a backslash is itself escaped, the following character is not: | |||
| . | . | |||
| \\*emphasis* | \\*emphasis* | |||
| . | . | |||
| <p>\<em>emphasis</em></p> | <p>\<em>emphasis</em></p> | |||
| . | . | |||
| skipping to change at line 4872 | skipping to change at line 4921 | |||
| . | . | |||
| ``` foo\+bar | ``` foo\+bar | |||
| foo | foo | |||
| ``` | ``` | |||
| . | . | |||
| <pre><code class="language-foo+bar">foo | <pre><code class="language-foo+bar">foo | |||
| </code></pre> | </code></pre> | |||
| . | . | |||
| ## Entities | ## Entity and numeric character references | |||
| With the goal of making this standard as HTML-agnostic as possible, all | All valid HTML entity references and numeric character | |||
| valid HTML entities (except in code blocks and code spans) | references, except those occurring in code blocks, code spans, | |||
| are recognized as such and converted into Unicode characters before | and raw HTML, are recognized as such and treated as equivalent to the | |||
| they are stored in the AST. This means that renderers to formats other | corresponding Unicode characters. Conforming CommonMark parsers | |||
| than HTML need not be HTML-entity aware. HTML renderers may either escape | need not store information about whether a particular character | |||
| Unicode characters as entities or leave them as they are. (However, | was represented in the source using a Unicode character or | |||
| `"`, `&`, `<`, and `>` must always be rendered as entities.) | an entity reference. | |||
| [Named entities](@name-entities) consist of `&` + any of the valid | [Entity references](@entity-references) consist of `&` + any of the valid | |||
| HTML5 entity names + `;`. The | HTML5 entity names + `;`. The | |||
| [following document](https://html.spec.whatwg.org/multipage/entities.json) | document <https://html.spec.whatwg.org/multipage/entities.json> | |||
| is used as an authoritative source of the valid entity names and their | is used as an authoritative source for the valid entity | |||
| corresponding code points. | references and their corresponding code points. | |||
| . | . | |||
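| &nbsp; &amp; &copy; &AElig; &Dcaron; | &nbsp; &amp; &copy; &AElig; &Dcaron; | |||
| &frac34; &HilbertSpace; &DifferentialD; | &frac34; &HilbertSpace; &DifferentialD; | |||
| &ClockwiseContourIntegral; &ngE; | &ClockwiseContourIntegral; &ngE; | |||
| . | . | |||
| <p>  &amp; © Æ Ď | <p>  &amp; © Æ Ď | |||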
| ¾ ℋ ⅆ | ¾ ℋ ⅆ | |||
| ∲ ≧̸</p> | ∲ ≧̸</p> | |||
| . | . | |||
| [Decimal entities](@decimal-entities) | [Decimal numeric character | |||
| consist of `&#` + a string of 1--8 arabic digits + `;`. Again, these | references](@decimal-numeric-character-references) | |||
| entities need to be recognised and transformed into their corresponding | consist of `&#` + a string of 1--8 arabic digits + `;`. A | |||
| Unicode code points. Invalid Unicode code points will be replaced by | numeric character reference is parsed as the corresponding | |||
| Unicode character. Invalid Unicode code points will be replaced by | ||||
| the "unknown code point" character (`U+FFFD`). For security reasons, | the "unknown code point" character (`U+FFFD`). For security reasons, | |||
| the code point `U+0000` will also be replaced by `U+FFFD`. | the code point `U+0000` will also be replaced by `U+FFFD`. | |||
| . | . | |||
| &#35; &#1234; &#992; &#98765432; &#0; | &#35; &#1234; &#992; &#98765432; &#0; | |||
| . | . | |||
| <p># Ӓ Ϡ � �</p> | <p># Ӓ Ϡ � �</p> | |||
| . | . | |||
| [Hexadecimal entities](@hexadecimal-entities) consist of `&#` + either | [Hexadecimal numeric character | |||
| `X` or `x` + a string of 1-8 hexadecimal digits + `;`. They will also | references](@hexadecimal-numeric-character-references) consist of `&#` + | |||
| be parsed and turned into the corresponding Unicode code points in the | either `X` or `x` + a string of 1-8 hexadecimal digits + `;`. | |||
| AST. | They too are parsed as the corresponding Unicode character (this | |||
| time specified with a hexadecimal numeral instead of decimal). | ||||
| . | . | |||
| &quot; &#xD06; &#xcab; | &quot; &#xD06; &#xcab; | |||
| . | . | |||
| <p>&quot; ആ ಫ</p> | <p>&quot; ആ ಫ</p> | |||
| . | . | |||
| Here are some nonentities: | Here are some nonentities: | |||
| . | . | |||
| &nbsp &x; &#; &#x; &ThisIsWayTooLongToBeAnEntityIsntIt; &hi?; | &nbsp &x; &#; &#x; | |||
| &ThisIsWayTooLongToBeAnEntityIsntIt; &hi?; | ||||
| . | . | |||
| <p>&amp;nbsp &amp;x; &amp;#; &amp;#x; &amp;ThisIsWayTooLongToBeAnEntityIsntIt; | <p>&amp;nbsp &amp;x; &amp;#; &amp;#x; | |||
| &amp;hi?;</p> | &amp;ThisIsWayTooLongToBeAnEntityIsntIt; &amp;hi?;</p> | |||
| . | . | |||
| Although HTML5 does accept some entities without a trailing semicolon | Although HTML5 does accept some entity references | |||
| (such as `&copy`), these are not recognized as entities here, because it | without a trailing semicolon (such as `&copy`), these are not | |||
| makes the grammar too ambiguous: | recognized here, because it makes the grammar too ambiguous: | |||
| . | . | |||
| &copy | &copy | |||
| . | . | |||
| <p>&amp;copy</p> | <p>&amp;copy</p> | |||
| . | . | |||
| Strings that are not on the list of HTML5 named entities are not | Strings that are not on the list of HTML5 named entities are not | |||
| recognized as entities either: | recognized as entity references either: | |||
| . | . | |||
| &MadeUpEntity; | &MadeUpEntity; | |||
| . | . | |||
| <p>&amp;MadeUpEntity;</p> | <p>&amp;MadeUpEntity;</p> | |||
| . | . | |||
| Entities are recognized in any context besides code spans or | Entity and numeric character references are recognized in any | |||
| code blocks, including raw HTML, URLs, [link title]s, and | context besides code spans or code blocks or raw HTML, including | |||
| [fenced code block] [info string]s: | URLs, [link title]s, and [fenced code block][] [info string]s: | |||
| . | . | |||
| <a href="&ouml;&ouml;.html"> | <a href="&ouml;&ouml;.html"> | |||
| . | . | |||
| <a href="&ouml;&ouml;.html"> | <a href="&ouml;&ouml;.html"> | |||
| . | . | |||
| . | . | |||
| [foo](/f&ouml;&ouml; "f&ouml;&ouml;") | [foo](/f&ouml;&ouml; "f&ouml;&ouml;") | |||
| . | . | |||
| skipping to change at line 4982 | skipping to change at line 5035 | |||
| . | . | |||
| ``` f&ouml;&ouml; | ``` f&ouml;&ouml; | |||
| foo | foo | |||
| ``` | ``` | |||
| . | . | |||
| <pre><code class="language-föö">foo | <pre><code class="language-föö">foo | |||
| </code></pre> | </code></pre> | |||
| . | . | |||
| Entities are treated as literal text in code spans and code blocks: | Entity and numeric character references are treated as literal | |||
| text in code spans and code blocks, and in raw HTML: | ||||
| . | . | |||
| `f&ouml;&ouml;` | `f&ouml;&ouml;` | |||
| . | . | |||
| <p><code>f&amp;ouml;&amp;ouml;</code></p> | <p><code>f&amp;ouml;&amp;ouml;</code></p> | |||
| . | . | |||
| . | . | |||
| f&ouml;f&ouml; | f&ouml;f&ouml; | |||
| . | . | |||
| <pre><code>f&amp;ouml;f&amp;ouml; | <pre><code>f&amp;ouml;f&amp;ouml; | |||
| </code></pre> | </code></pre> | |||
| . | . | |||
| . | ||||
| <a href="f&ouml;f&ouml;"/> | ||||
| . | ||||
| <a href="f&ouml;f&ouml;"/> | ||||
| . | ||||
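
The recognition rules for entity and numeric character references can be sketched in Python as follows. This is only an illustration, not part of the spec: the function names are hypothetical, the standard library's `html.entities.html5` table stands in for the authoritative `entities.json` list, and treating surrogate code points as invalid is an assumption of the sketch.

```python
import re
from html.entities import html5  # maps names such as "copy;" to characters

# One pattern for the three forms: hexadecimal, decimal, and named references.
# Assumption of this sketch: entity names consist of ASCII letters and digits.
_REFERENCE = re.compile(
    r"&(?:#[xX]([0-9a-fA-F]{1,8})|#([0-9]{1,8})|([A-Za-z][A-Za-z0-9]*));"
)

REPLACEMENT = "\ufffd"  # the "unknown code point" character


def _from_code_point(cp: int) -> str:
    # U+0000 and code points outside the Unicode range become U+FFFD.
    # Surrogates are also replaced here (an assumption, see the lead-in).
    if cp == 0 or cp > 0x10FFFF or 0xD800 <= cp <= 0xDFFF:
        return REPLACEMENT
    return chr(cp)


def decode_references(text: str) -> str:
    """Replace recognized references in text with their Unicode characters."""

    def replace(match: re.Match) -> str:
        hex_digits, dec_digits, name = match.groups()
        if hex_digits is not None:
            return _from_code_point(int(hex_digits, 16))
        if dec_digits is not None:
            return _from_code_point(int(dec_digits, 10))
        # Unknown names are left as literal text.
        return html5.get(name + ";", match.group(0))

    return _REFERENCE.sub(replace, text)
```

Because the pattern requires the trailing semicolon, a string such as `&copy` passes through unchanged, as does any name that is missing from the table. A real parser would also have to skip code spans, code blocks, and raw HTML, which this sketch does not attempt.
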
| ## Code spans | ## Code spans | |||
| A [backtick string](@backtick-string) | A [backtick string](@backtick-string) | |||
| is a string of one or more backtick characters (`` ` ``) that is neither | is a string of one or more backtick characters (`` ` ``) that is neither | |||
| preceded nor followed by a backtick. | preceded nor followed by a backtick. | |||
| A [code span](@code-span) begins with a backtick string and ends with | A [code span](@code-span) begins with a backtick string and ends with | |||
| a backtick string of equal length. The contents of the code span are | a backtick string of equal length. The contents of the code span are | |||
| the characters between the two backtick strings, with leading and | the characters between the two backtick strings, with leading and | |||
| trailing spaces and [line ending]s removed, and | trailing spaces and [line ending]s removed, and | |||
| skipping to change at line 5269 | skipping to change at line 5329 | |||
| are a bit more complex than the ones given here.) | are a bit more complex than the ones given here.) | |||
| The following rules define emphasis and strong emphasis: | The following rules define emphasis and strong emphasis: | |||
| 1. A single `*` character [can open emphasis](@can-open-emphasis) | 1. A single `*` character [can open emphasis](@can-open-emphasis) | |||
| iff (if and only if) it is part of a [left-flanking delimiter run]. | iff (if and only if) it is part of a [left-flanking delimiter run]. | |||
| 2. A single `_` character [can open emphasis] iff | 2. A single `_` character [can open emphasis] iff | |||
| it is part of a [left-flanking delimiter run] | it is part of a [left-flanking delimiter run] | |||
| and either (a) not part of a [right-flanking delimiter run] | and either (a) not part of a [right-flanking delimiter run] | |||
| or (b) part of a [right-flanking delimeter run] | or (b) part of a [right-flanking delimiter run] | |||
| preceded by punctuation. | preceded by punctuation. | |||
| 3. A single `*` character [can close emphasis](@can-close-emphasis) | 3. A single `*` character [can close emphasis](@can-close-emphasis) | |||
| iff it is part of a [right-flanking delimiter run]. | iff it is part of a [right-flanking delimiter run]. | |||
| 4. A single `_` character [can close emphasis] iff | 4. A single `_` character [can close emphasis] iff | |||
| it is part of a [right-flanking delimiter run] | it is part of a [right-flanking delimiter run] | |||
| and either (a) not part of a [left-flanking delimiter run] | and either (a) not part of a [left-flanking delimiter run] | |||
| or (b) part of a [left-flanking delimeter run] | or (b) part of a [left-flanking delimiter run] | |||
| followed by punctuation. | followed by punctuation. | |||
| 5. A double `**` [can open strong emphasis](@can-open-strong-emphasis) | 5. A double `**` [can open strong emphasis](@can-open-strong-emphasis) | |||
| iff it is part of a [left-flanking delimiter run]. | iff it is part of a [left-flanking delimiter run]. | |||
| 6. A double `__` [can open strong emphasis] iff | 6. A double `__` [can open strong emphasis] iff | |||
| it is part of a [left-flanking delimiter run] | it is part of a [left-flanking delimiter run] | |||
| and either (a) not part of a [right-flanking delimiter run] | and either (a) not part of a [right-flanking delimiter run] | |||
| or (b) part of a [right-flanking delimeter run] | or (b) part of a [right-flanking delimiter run] | |||
| preceded by punctuation. | preceded by punctuation. | |||
| 7. A double `**` [can close strong emphasis](@can-close-strong-emphasis) | 7. A double `**` [can close strong emphasis](@can-close-strong-emphasis) | |||
| iff it is part of a [right-flanking delimiter run]. | iff it is part of a [right-flanking delimiter run]. | |||
| 8. A double `__` [can close strong emphasis] iff | 8. A double `__` [can close strong emphasis] iff | |||
| it is part of a [right-flanking delimiter run] | it is part of a [right-flanking delimiter run] | |||
| and either (a) not part of a [left-flanking delimiter run] | and either (a) not part of a [left-flanking delimiter run] | |||
| or (b) part of a [left-flanking delimeter run] | or (b) part of a [left-flanking delimiter run] | |||
| followed by punctuation. | followed by punctuation. | |||
| 9. Emphasis begins with a delimiter that [can open emphasis] and ends | 9. Emphasis begins with a delimiter that [can open emphasis] and ends | |||
| with a delimiter that [can close emphasis], and that uses the same | with a delimiter that [can close emphasis], and that uses the same | |||
| character (`_` or `*`) as the opening delimiter. There must | character (`_` or `*`) as the opening delimiter. There must | |||
| be a nonempty sequence of inlines between the open delimiter | be a nonempty sequence of inlines between the open delimiter | |||
| and the closing delimiter; these form the contents of the emphasis | and the closing delimiter; these form the contents of the emphasis | |||
| inline. | inline. | |||
| 10. Strong emphasis begins with a delimiter that | 10. Strong emphasis begins with a delimiter that | |||
| skipping to change at line 6512 | skipping to change at line 6572 | |||
| <p><a href="foo):">link</a></p> | <p><a href="foo):">link</a></p> | |||
| . | . | |||
| A link can contain fragment identifiers and queries: | A link can contain fragment identifiers and queries: | |||
| . | . | |||
| [link](#fragment) | [link](#fragment) | |||
| [link](http://example.com#fragment) | [link](http://example.com#fragment) | |||
| [link](http://example.com?foo=bar&baz#fragment) | [link](http://example.com?foo=3#frag) | |||
| . | . | |||
| <p><a href="#fragment">link</a></p> | <p><a href="#fragment">link</a></p> | |||
| <p><a href="http://example.com#fragment">link</a></p> | <p><a href="http://example.com#fragment">link</a></p> | |||
| <p><a href="http://example.com?foo=bar&amp;baz#fragment">link</a></p> | <p><a href="http://example.com?foo=3#frag">link</a></p> | |||
| . | . | |||
| Note that a backslash before a non-escapable character is | Note that a backslash before a non-escapable character is | |||
| just a backslash: | just a backslash: | |||
| . | . | |||
| [link](foo\bar) | [link](foo\bar) | |||
| . | . | |||
| <p><a href="foo%5Cbar">link</a></p> | <p><a href="foo%5Cbar">link</a></p> | |||
| . | . | |||
| URL-escaping should be left alone inside the destination, as all | URL-escaping should be left alone inside the destination, as all | |||
| URL-escaped characters are also valid URL characters. HTML entities in | URL-escaped characters are also valid URL characters. Entity and | |||
| the destination will be parsed into the corresponding Unicode | numeric character references in the destination will be parsed | |||
| code points, as usual, and optionally URL-escaped when written as HTML. | into the corresponding Unicode code points, as usual. These may | |||
| be optionally URL-escaped when written as HTML, but this spec | ||||
| does not enforce any particular policy for rendering URLs in | ||||
| HTML or other formats. Renderers may make different decisions | ||||
| about how to escape or normalize URLs in the output. | ||||
| . | . | |||
| [link](foo%20b&auml;) | [link](foo%20b&auml;) | |||
| . | . | |||
| <p><a href="foo%20b%C3%A4">link</a></p> | <p><a href="foo%20b%C3%A4">link</a></p> | |||
| . | . | |||
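
One possible rendering policy that reproduces the hrefs in the examples above is sketched below in Python. It is an assumption for illustration only, since the spec does not prescribe how renderers escape URLs; `escape_destination` is a hypothetical helper, and it expects a destination whose entity and numeric character references have already been decoded.

```python
from urllib.parse import quote

# Characters passed through untouched. "%" is included so that existing
# percent-escapes such as "%20" are not escaped a second time.
_SAFE = "/:?#@!$&'()*+,;=%"


def escape_destination(destination: str) -> str:
    """Percent-encode characters outside the safe set as UTF-8 bytes."""
    return quote(destination, safe=_SAFE)
```

With this policy an already-escaped `%20` is left alone, while a literal backslash or a non-ASCII character is percent-encoded. Other renderers may reasonably choose a different safe set or normalize URLs in another way.
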
| Note that, because titles can often be parsed as destinations, | Note that, because titles can often be parsed as destinations, | |||
| if you try to omit the destination and keep the title, you'll | if you try to omit the destination and keep the title, you'll | |||
| get unexpected results: | get unexpected results: | |||
| skipping to change at line 6561 | skipping to change at line 6625 | |||
| . | . | |||
| [link](/url "title") | [link](/url "title") | |||
| [link](/url 'title') | [link](/url 'title') | |||
| [link](/url (title)) | [link](/url (title)) | |||
| . | . | |||
| <p><a href="/url" title="title">link</a> | <p><a href="/url" title="title">link</a> | |||
| <a href="/url" title="title">link</a> | <a href="/url" title="title">link</a> | |||
| <a href="/url" title="title">link</a></p> | <a href="/url" title="title">link</a></p> | |||
| . | . | |||
| Backslash escapes and entities may be used in titles: | Backslash escapes and entity and numeric character references | |||
| may be used in titles: | ||||
| . | . | |||
| [link](/url "title \"&quot;") | [link](/url "title \"&quot;") | |||
| . | . | |||
| <p><a href="/url" title="title &quot;&quot;">link</a></p> | <p><a href="/url" title="title &quot;&quot;">link</a></p> | |||
| . | . | |||
| Nested balanced quotes are not allowed without escaping: | Nested balanced quotes are not allowed without escaping: | |||
| . | . | |||
| skipping to change at line 6589 | skipping to change at line 6654 | |||
| . | . | |||
| [link](/url 'title "and" title') | [link](/url 'title "and" title') | |||
| . | . | |||
| <p><a href="/url" title="title &quot;and&quot; title">link</a></p> | <p><a href="/url" title="title &quot;and&quot; title">link</a></p> | |||
| . | . | |||
| (Note: `Markdown.pl` did allow double quotes inside a double-quoted | (Note: `Markdown.pl` did allow double quotes inside a double-quoted | |||
| title, and its test suite included a test demonstrating this. | title, and its test suite included a test demonstrating this. | |||
| But it is hard to see a good rationale for the extra complexity this | But it is hard to see a good rationale for the extra complexity this | |||
| brings, since there are already many ways---backslash escaping, | brings, since there are already many ways---backslash escaping, | |||
| entities, or using a different quote type for the enclosing title---to | entity and numeric character references, or using a different | |||
| write titles containing double quotes. `Markdown.pl`'s handling of | quote type for the enclosing title---to write titles containing | |||
| titles has a number of other strange features. For example, it allows | double quotes. `Markdown.pl`'s handling of titles has a number | |||
| single-quoted titles in inline links, but not reference links. And, in | of other strange features. For example, it allows single-quoted | |||
| reference links but not inline links, it allows a title to begin with | titles in inline links, but not reference links. And, in | |||
| `"` and end with `)`. `Markdown.pl` 1.0.1 even allows titles with no closing | reference links but not inline links, it allows a title to begin | |||
| quotation mark, though 1.0.2b8 does not. It seems preferable to adopt | with `"` and end with `)`. `Markdown.pl` 1.0.1 even allows | |||
| a simple, rational rule that works the same way in inline links and | titles with no closing quotation mark, though 1.0.2b8 does not. | |||
| link reference definitions.) | It seems preferable to adopt a simple, rational rule that works | |||
| the same way in inline links and link reference definitions.) | ||||
| [Whitespace] is allowed around the destination and title: | [Whitespace] is allowed around the destination and title: | |||
| . | . | |||
| [link]( /uri | [link]( /uri | |||
| "title" ) | "title" ) | |||
| . | . | |||
| <p><a href="/uri" title="title">link</a></p> | <p><a href="/uri" title="title">link</a></p> | |||
| . | . | |||
| skipping to change at line 6728 | skipping to change at line 6794 | |||
| [foo<http://example.com/?search=](uri)> | [foo<http://example.com/?search=](uri)> | |||
| . | . | |||
| <p>[foo<a href="http://example.com/?search=%5D(uri)">http://example.com/?search= ](uri)</a></p> | <p>[foo<a href="http://example.com/?search=%5D(uri)">http://example.com/?search= ](uri)</a></p> | |||
| . | . | |||
| There are three kinds of [reference link](@reference-link)s: | There are three kinds of [reference link](@reference-link)s: | |||
| [full](#full-reference-link), [collapsed](#collapsed-reference-link), | [full](#full-reference-link), [collapsed](#collapsed-reference-link), | |||
| and [shortcut](#shortcut-reference-link). | and [shortcut](#shortcut-reference-link). | |||
| A [full reference link](@full-reference-link) | A [full reference link](@full-reference-link) | |||
| consists of a [link text], optional [whitespace], and a [link label] | consists of a [link text] immediately followed by a [link label] | |||
| that [matches] a [link reference definition] elsewhere in the document. | that [matches] a [link reference definition] elsewhere in the document. | |||
| A [link label](@link-label) begins with a left bracket (`[`) and ends | A [link label](@link-label) begins with a left bracket (`[`) and ends | |||
| with the first right bracket (`]`) that is not backslash-escaped. | with the first right bracket (`]`) that is not backslash-escaped. | |||
| Between these brackets there must be at least one [non-whitespace character]. | Between these brackets there must be at least one [non-whitespace character]. | |||
| Unescaped square bracket characters are not allowed in | Unescaped square bracket characters are not allowed in | |||
| [link label]s. A link label can have at most 999 | [link label]s. A link label can have at most 999 | |||
| characters inside the square brackets. | characters inside the square brackets. | |||
| One label [matches](@matches) | One label [matches](@matches) | |||
| skipping to change at line 6898 | skipping to change at line 6964 | |||
| . | . | |||
| [Foo | [Foo | |||
| bar]: /url | bar]: /url | |||
| [Baz][Foo bar] | [Baz][Foo bar] | |||
| . | . | |||
| <p><a href="/url">Baz</a></p> | <p><a href="/url">Baz</a></p> | |||
| . | . | |||
| There can be [whitespace] between the [link text] and the [link label]: | No [whitespace] is allowed between the [link text] and the | |||
| [link label]: | ||||
| . | . | |||
| [foo] [bar] | [foo] [bar] | |||
| [bar]: /url "title" | [bar]: /url "title" | |||
| . | . | |||
| <p><a href="/url" title="title">foo</a></p> | <p>[foo] <a href="/url" title="title">bar</a></p> | |||
| . | . | |||
| . | . | |||
| [foo] | [foo] | |||
| [bar] | [bar] | |||
| [bar]: /url "title" | [bar]: /url "title" | |||
| . | . | |||
| <p><a href="/url" title="title">foo</a></p> | <p>[foo] | |||
| <a href="/url" title="title">bar</a></p> | ||||
| . | . | |||
| This is a departure from John Gruber's original Markdown syntax | ||||
| description, which explicitly allows whitespace between the link | ||||
| text and the link label. It brings reference links in line with | ||||
| [inline link]s, which (according to both original Markdown and | ||||
| this spec) cannot have whitespace after the link text. More | ||||
| importantly, it prevents inadvertent capture of consecutive | ||||
| [shortcut reference link]s. If whitespace is allowed between the | ||||
| link text and the link label, then in the following we will have | ||||
| a single reference link, not two shortcut reference links, as | ||||
| intended: | ||||
| ``` markdown | ||||
| [foo] | ||||
| [bar] | ||||
| [foo]: /url1 | ||||
| [bar]: /url2 | ||||
| ``` | ||||
| (Note that [shortcut reference link]s were introduced by Gruber | ||||
| himself in a beta version of `Markdown.pl`, but never included | ||||
| in the official syntax description. Without shortcut reference | ||||
| links, it is harmless to allow space between the link text and | ||||
| link label; but once shortcut references are introduced, it is | ||||
| too dangerous to allow this, as it frequently leads to | ||||
| unintended results.) | ||||
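
The capture problem described above can be illustrated with a toy scanner in Python. This is a deliberately simplified sketch, not the spec's parsing algorithm: `resolve` is a hypothetical function, labels are matched by lowercasing only, and nested brackets, escapes, and inline links are ignored.

```python
import re

_LABEL = re.compile(r"\[([^\[\]]+)\]")


def resolve(text, definitions, allow_whitespace):
    """Return (link text, destination) pairs found by a toy scanner.

    definitions maps lowercased labels to destinations. allow_whitespace
    selects the older rule, under which whitespace may separate the link
    text from the link label of a full reference link.
    """
    gap = r"\s*" if allow_whitespace else ""
    full = re.compile(r"\[([^\[\]]+)\]" + gap + r"\[([^\[\]]+)\]")
    links = []
    pos = 0
    while pos < len(text):
        m = full.match(text, pos)
        if m and m.group(2).lower() in definitions:
            # Full reference link: text from group 1, label from group 2.
            links.append((m.group(1), definitions[m.group(2).lower()]))
            pos = m.end()
            continue
        m = _LABEL.match(text, pos)
        if m and m.group(1).lower() in definitions:
            # Shortcut reference link: the label doubles as the link text.
            links.append((m.group(1), definitions[m.group(1).lower()]))
            pos = m.end()
            continue
        pos += 1
    return links
```

Given the definitions `{"foo": "/url1", "bar": "/url2"}` and the text `"[foo]\n[bar]"`, the scanner finds two shortcut links when `allow_whitespace` is false, but a single link with text `foo` pointing at `/url2` when it is true.
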
| When there are multiple matching [link reference definition]s, | When there are multiple matching [link reference definition]s, | |||
| the first is used: | the first is used: | |||
| . | . | |||
| [foo]: /url1 | [foo]: /url1 | |||
| [foo]: /url2 | [foo]: /url2 | |||
| [bar][foo] | [bar][foo] | |||
| . | . | |||
| skipping to change at line 6980 | skipping to change at line 7075 | |||
| . | . | |||
| . | . | |||
| [foo][ref\[] | [foo][ref\[] | |||
| [ref\[]: /uri | [ref\[]: /uri | |||
| . | . | |||
| <p><a href="/uri">foo</a></p> | <p><a href="/uri">foo</a></p> | |||
| . | . | |||
| Note that in this example `]` is not backslash-escaped: | ||||
| . | ||||
| [bar\\]: /uri | ||||
| [bar\\] | ||||
| . | ||||
| <p><a href="/uri">bar\</a></p> | ||||
| . | ||||
| A [link label] must contain at least one [non-whitespace character]: | A [link label] must contain at least one [non-whitespace character]: | |||
| . | . | |||
| [] | [] | |||
| []: /uri | []: /uri | |||
| . | . | |||
| <p>[]</p> | <p>[]</p> | |||
| <p>[]: /uri</p> | <p>[]: /uri</p> | |||
| . | . | |||
| skipping to change at line 7007 | skipping to change at line 7112 | |||
| . | . | |||
| <p>[ | <p>[ | |||
| ]</p> | ]</p> | |||
| <p>[ | <p>[ | |||
| ]: /uri</p> | ]: /uri</p> | |||
| . | . | |||
| A [collapsed reference link](@collapsed-reference-link) | A [collapsed reference link](@collapsed-reference-link) | |||
| consists of a [link label] that [matches] a | consists of a [link label] that [matches] a | |||
| [link reference definition] elsewhere in the | [link reference definition] elsewhere in the | |||
| document, optional [whitespace], and the string `[]`. | document, followed by the string `[]`. | |||
| The contents of the first link label are parsed as inlines, | The contents of the first link label are parsed as inlines, | |||
| which are used as the link's text. The link's URI and title are | which are used as the link's text. The link's URI and title are | |||
| provided by the matching reference link definition. Thus, | provided by the matching reference link definition. Thus, | |||
| `[foo][]` is equivalent to `[foo][foo]`. | `[foo][]` is equivalent to `[foo][foo]`. | |||
| . | . | |||
| [foo][] | [foo][] | |||
| [foo]: /url "title" | [foo]: /url "title" | |||
| . | . | |||
| skipping to change at line 7039 | skipping to change at line 7144 | |||
| The link labels are case-insensitive: | The link labels are case-insensitive: | |||
| . | . | |||
| [Foo][] | [Foo][] | |||
| [foo]: /url "title" | [foo]: /url "title" | |||
| . | . | |||
| <p><a href="/url" title="title">Foo</a></p> | <p><a href="/url" title="title">Foo</a></p> | |||
| . | . | |||
| As with full reference links, [whitespace] is allowed | As with full reference links, [whitespace] is not | |||
| between the two sets of brackets: | allowed between the two sets of brackets: | |||
| . | . | |||
| [foo] | [foo] | |||
| [] | [] | |||
| [foo]: /url "title" | [foo]: /url "title" | |||
| . | . | |||
| <p><a href="/url" title="title">foo</a></p> | <p><a href="/url" title="title">foo</a> | |||
| []</p> | ||||
| . | . | |||
| A [shortcut reference link](@shortcut-reference-link) | A [shortcut reference link](@shortcut-reference-link) | |||
| consists of a [link label] that [matches] a | consists of a [link label] that [matches] a | |||
| [link reference definition] elsewhere in the | [link reference definition] elsewhere in the | |||
| document and is not followed by `[]` or a link label. | document and is not followed by `[]` or a link label. | |||
| The contents of the first link label are parsed as inlines, | The contents of the first link label are parsed as inlines, | |||
| which are used as the link's text. The link's URI and title | which are used as the link's text. The link's URI and title | |||
| are provided by the matching link reference definition. | are provided by the matching link reference definition. | |||
| Thus, `[foo]` is equivalent to `[foo][]`. | Thus, `[foo]` is equivalent to `[foo][]`. | |||
| skipping to change at line 7268 | skipping to change at line 7374 | |||
| . | . | |||
|  |  | |||
| . | . | |||
| <p><img src="/url" alt="" /></p> | <p><img src="/url" alt="" /></p> | |||
| . | . | |||
| Reference-style: | Reference-style: | |||
| . | . | |||
| ![foo] [bar] | ![foo][bar] | |||
| [bar]: /url | [bar]: /url | |||
| . | . | |||
| <p><img src="/url" alt="foo" /></p> | <p><img src="/url" alt="foo" /></p> | |||
| . | . | |||
| . | . | |||
| ![foo] [bar] | ![foo][bar] | |||
| [BAR]: /url | [BAR]: /url | |||
| . | . | |||
| <p><img src="/url" alt="foo" /></p> | <p><img src="/url" alt="foo" /></p> | |||
| . | . | |||
| Collapsed: | Collapsed: | |||
| . | . | |||
| ![foo][] | ![foo][] | |||
| skipping to change at line 7311 | skipping to change at line 7417 | |||
| The labels are case-insensitive: | The labels are case-insensitive: | |||
| . | . | |||
| ![Foo][] | ![Foo][] | |||
| [foo]: /url "title" | [foo]: /url "title" | |||
| . | . | |||
| <p><img src="/url" alt="Foo" title="title" /></p> | <p><img src="/url" alt="Foo" title="title" /></p> | |||
| . | . | |||
| As with full reference links, [whitespace] is allowed | As with reference links, [whitespace] is not allowed | |||
| between the two sets of brackets: | between the two sets of brackets: | |||
| . | . | |||
| ![foo] | ![foo] | |||
| [] | [] | |||
| [foo]: /url "title" | [foo]: /url "title" | |||
| . | . | |||
| <p><img src="/url" alt="foo" title="title" /></p> | <p><img src="/url" alt="foo" title="title" /> | |||
| []</p> | ||||
| . | . | |||
| Shortcut: | Shortcut: | |||
| . | . | |||
| ![foo] | ![foo] | |||
| [foo]: /url "title" | [foo]: /url "title" | |||
| . | . | |||
| <p><img src="/url" alt="foo" title="title" /></p> | <p><img src="/url" alt="foo" title="title" /></p> | |||
| skipping to change at line 7594 | skipping to change at line 7701 | |||
| A [single-quoted attribute value](@single-quoted-attribute-value) | A [single-quoted attribute value](@single-quoted-attribute-value) | |||
| consists of `'`, zero or more | consists of `'`, zero or more | |||
| characters not including `'`, and a final `'`. | characters not including `'`, and a final `'`. | |||
| A [double-quoted attribute value](@double-quoted-attribute-value) | A [double-quoted attribute value](@double-quoted-attribute-value) | |||
| consists of `"`, zero or more | consists of `"`, zero or more | |||
| characters not including `"`, and a final `"`. | characters not including `"`, and a final `"`. | |||
| An [open tag](@open-tag) consists of a `<` character, a [tag name], | An [open tag](@open-tag) consists of a `<` character, a [tag name], | |||
| zero or more [attributes](@attribute], optional [whitespace], an optional `/` | zero or more [attribute]s, optional [whitespace], an optional `/` | |||
| character, and a `>` character. | character, and a `>` character. | |||
| A [closing tag](@closing-tag) consists of the string `</`, a | A [closing tag](@closing-tag) consists of the string `</`, a | |||
| [tag name], optional [whitespace], and the character `>`. | [tag name], optional [whitespace], and the character `>`. | |||
| An [HTML comment](@html-comment) consists of `<!--` + *text* + `-->`, | An [HTML comment](@html-comment) consists of `<!--` + *text* + `-->`, | |||
| where *text* does not start with `>` or `->`, does not end with `-`, | where *text* does not start with `>` or `->`, does not end with `-`, | |||
| and does not contain `--`. (See the | and does not contain `--`. (See the | |||
| [HTML5 spec](http://www.w3.org/TR/html5/syntax.html#comments).) | [HTML5 spec](http://www.w3.org/TR/html5/syntax.html#comments).) | |||
| skipping to change at line 7662 | skipping to change at line 7769 | |||
| <a foo="bar" bam = 'baz <em>"</em>' | <a foo="bar" bam = 'baz <em>"</em>' | |||
| _boolean zoop:33=zoop:33 /> | _boolean zoop:33=zoop:33 /> | |||
| . | . | |||
| <p><a foo="bar" bam = 'baz <em>"</em>' | <p><a foo="bar" bam = 'baz <em>"</em>' | |||
| _boolean zoop:33=zoop:33 /></p> | _boolean zoop:33=zoop:33 /></p> | |||
| . | . | |||
| Custom tag names can be used: | Custom tag names can be used: | |||
| . | . | |||
| <responsive-image src="foo.jpg" /> | Foo <responsive-image src="foo.jpg" /> | |||
| <My-Tag> | ||||
| foo | ||||
| </My-Tag> | ||||
| . | . | |||
| <responsive-image src="foo.jpg" /> | <p>Foo <responsive-image src="foo.jpg" /></p> | |||
| <My-Tag> | ||||
| foo | ||||
| </My-Tag> | ||||
| . | . | |||
| Illegal tag names, not parsed as HTML: | Illegal tag names, not parsed as HTML: | |||
| . | . | |||
| <33> <__> | <33> <__> | |||
| . | . | |||
| <p><33> <__></p> | <p><33> <__></p> | |||
| . | . | |||
| skipping to change at line 7719 | skipping to change at line 7819 | |||
| . | . | |||
| <a href='bar'title=title> | <a href='bar'title=title> | |||
| . | . | |||
| <p><a href='bar'title=title></p> | <p><a href='bar'title=title></p> | |||
| . | . | |||
| Closing tags: | Closing tags: | |||
| . | . | |||
| </a> | </a></foo > | |||
| </foo > | ||||
| . | . | |||
| </a> | <p></a></foo ></p> | |||
| </foo > | ||||
| . | . | |||
| Illegal attributes in closing tag: | Illegal attributes in closing tag: | |||
| . | . | |||
| </a href="foo"> | </a href="foo"> | |||
| . | . | |||
| <p></a href="foo"></p> | <p></a href="foo"></p> | |||
| . | . | |||
| skipping to change at line 7785 | skipping to change at line 7883 | |||
| . | . | |||
| CDATA sections: | CDATA sections: | |||
| . | . | |||
| foo <![CDATA[>&<]]> | foo <![CDATA[>&<]]> | |||
| . | . | |||
| <p>foo <![CDATA[>&<]]></p> | <p>foo <![CDATA[>&<]]></p> | |||
| . | . | |||
| Entities are preserved in HTML attributes: | Entity and numeric character references are preserved in HTML | |||
| attributes: | ||||
| . | . | |||
| <a href="ö"> | foo <a href="ö"> | |||
| . | . | |||
| <a href="ö"> | <p>foo <a href="ö"></p> | |||
| . | . | |||
| Backslash escapes do not work in HTML attributes: | Backslash escapes do not work in HTML attributes: | |||
| . | . | |||
| <a href="\*"> | foo <a href="\*"> | |||
| . | . | |||
| <a href="\*"> | <p>foo <a href="\*"></p> | |||
| . | . | |||
| . | . | |||
| <a href="\""> | <a href="\""> | |||
| . | . | |||
| <p><a href="""></p> | <p><a href="""></p> | |||
| . | . | |||
| ## Hard line breaks | ## Hard line breaks | |||
| skipping to change at line 8017 | skipping to change at line 8116 | |||
| ## Overview {-} | ## Overview {-} | |||
| Parsing has two phases: | Parsing has two phases: | |||
| 1. In the first phase, lines of input are consumed and the block | 1. In the first phase, lines of input are consumed and the block | |||
| structure of the document---its division into paragraphs, block quotes, | structure of the document---its division into paragraphs, block quotes, | |||
| list items, and so on---is constructed. Text is assigned to these | list items, and so on---is constructed. Text is assigned to these | |||
| blocks but not parsed. Link reference definitions are parsed and a | blocks but not parsed. Link reference definitions are parsed and a | |||
| map of links is constructed. | map of links is constructed. | |||
| 2. In the second phase, the raw text contents of paragraphs and headers | 2. In the second phase, the raw text contents of paragraphs and headings | |||
| are parsed into sequences of Markdown inline elements (strings, | are parsed into sequences of Markdown inline elements (strings, | |||
| code spans, links, emphasis, and so on), using the map of link | code spans, links, emphasis, and so on), using the map of link | |||
| references constructed in phase 1. | references constructed in phase 1. | |||
| At each point in processing, the document is represented as a tree of | At each point in processing, the document is represented as a tree of | |||
| **blocks**. The root of the tree is a `document` block. The `document` | **blocks**. The root of the tree is a `document` block. The `document` | |||
| may have any number of other blocks as **children**. These children | may have any number of other blocks as **children**. These children | |||
| may, in turn, have other blocks as children. The last child of a block | may, in turn, have other blocks as children. The last child of a block | |||
| is normally considered **open**, meaning that subsequent lines of input | is normally considered **open**, meaning that subsequent lines of input | |||
| can alter its contents. (Blocks that are not open are **closed**.) | can alter its contents. (Blocks that are not open are **closed**.) | |||
| skipping to change at line 8080 | skipping to change at line 8179 | |||
| 2. Next, after consuming the continuation markers for existing | 2. Next, after consuming the continuation markers for existing | |||
| blocks, we look for new block starts (e.g. `>` for a block quote). | blocks, we look for new block starts (e.g. `>` for a block quote). | |||
| If we encounter a new block start, we close any blocks unmatched | If we encounter a new block start, we close any blocks unmatched | |||
| in step 1 before creating the new block as a child of the last | in step 1 before creating the new block as a child of the last | |||
| matched block. | matched block. | |||
| 3. Finally, we look at the remainder of the line (after block | 3. Finally, we look at the remainder of the line (after block | |||
| markers like `>`, list markers, and indentation have been consumed). | markers like `>`, list markers, and indentation have been consumed). | |||
| This is text that can be incorporated into the last open | This is text that can be incorporated into the last open | |||
| block (a paragraph, code block, header, or raw HTML). | block (a paragraph, code block, heading, or raw HTML). | |||
| Setext headers are formed when we detect that the second line of | Setext headings are formed when we detect that the second line of | |||
| a paragraph is a setext header line. | a paragraph is a setext heading line. | |||
| Reference link definitions are detected when a paragraph is closed; | Reference link definitions are detected when a paragraph is closed; | |||
| the accumulated text lines are parsed to see if they begin with | the accumulated text lines are parsed to see if they begin with | |||
| one or more reference link definitions. Any remainder becomes a | one or more reference link definitions. Any remainder becomes a | |||
| normal paragraph. | normal paragraph. | |||
| We can see how this works by considering how the tree above is | We can see how this works by considering how the tree above is | |||
| generated by four lines of Markdown: | generated by four lines of Markdown: | |||
| ``` markdown | ``` markdown | |||
| skipping to change at line 8192 | skipping to change at line 8291 | |||
| -> list_item | -> list_item | |||
| -> paragraph | -> paragraph | |||
| "aliquando id" | "aliquando id" | |||
| ``` | ``` | |||
| ## Phase 2: inline structure {-} | ## Phase 2: inline structure {-} | |||
| Once all of the input has been parsed, all open blocks are closed. | Once all of the input has been parsed, all open blocks are closed. | |||
| We then "walk the tree," visiting every node, and parse raw | We then "walk the tree," visiting every node, and parse raw | |||
| string contents of paragraphs and headers as inlines. At this | string contents of paragraphs and headings as inlines. At this | |||
| point we have seen all the link reference definitions, so we can | point we have seen all the link reference definitions, so we can | |||
| resolve reference links as we go. | resolve reference links as we go. | |||
| ``` tree | ``` tree | |||
| document | document | |||
| block_quote | block_quote | |||
| paragraph | paragraph | |||
| str "Lorem ipsum dolor" | str "Lorem ipsum dolor" | |||
| softbreak | softbreak | |||
| str "sit amet." | str "sit amet." | |||
| End of changes. 110 change blocks. | ||||
| 160 lines changed or deleted | 258 lines changed or added | |||
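The raw HTML hunk above defines an open tag, a closing tag, and an HTML comment in prose. The following is a minimal Python sketch of those three definitions, not the reference implementation. The patterns for tag names, attribute names, and unquoted attribute values are assumptions based on the spec's definitions outside this excerpt; the constant names and the helper `is_html_comment` are hypothetical.

```python
import re

# Assumed from the spec's definitions outside this excerpt.
TAG_NAME = r"[A-Za-z][A-Za-z0-9-]*"
ATTR_NAME = r"[A-Za-z_:][A-Za-z0-9_.:-]*"
UNQUOTED = r"[^\s\"'=<>`]+"

# Quoted in the excerpt: a quote, zero or more characters that are
# not that quote, and a final quote.
SINGLE_QUOTED = r"'[^']*'"
DOUBLE_QUOTED = r'"[^"]*"'

ATTR_VALUE = rf"(?:{UNQUOTED}|{SINGLE_QUOTED}|{DOUBLE_QUOTED})"
# An attribute needs leading whitespace; its value part is optional.
ATTRIBUTE = rf"(?:\s+{ATTR_NAME}(?:\s*=\s*{ATTR_VALUE})?)"

# Open tag: "<", tag name, zero or more attributes, optional
# whitespace, optional "/", and ">".
OPEN_TAG = re.compile(rf"<{TAG_NAME}{ATTRIBUTE}*\s*/?>")

# Closing tag: "</", tag name, optional whitespace, and ">".
CLOSING_TAG = re.compile(rf"</{TAG_NAME}\s*>")


def is_html_comment(s: str) -> bool:
    """Check the comment rule: <!-- + text + -->, with limits on text."""
    # The length check stops "<!--" and "-->" from overlapping.
    if len(s) < 7 or not s.startswith("<!--") or not s.endswith("-->"):
        return False
    text = s[4:-3]
    return not (
        text.startswith(">")
        or text.startswith("->")
        or text.endswith("-")
        or "--" in text
    )
```

Under these assumptions, `OPEN_TAG.fullmatch` accepts `<responsive-image src="foo.jpg" />` and rejects `<33>`, `<__>`, and `<a href='bar'title=title>` (no whitespace before the second attribute). `CLOSING_TAG.fullmatch` accepts `</foo >` and rejects `</a href="foo">`. This agrees with the examples in the hunks above.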