| spec.txt | spec.txt | |||
|---|---|---|---|---|
| --- | --- | |||
| title: CommonMark Spec | title: CommonMark Spec | |||
| author: John MacFarlane | author: John MacFarlane | |||
| version: 0.29 | version: '0.30' | |||
| date: '2019-04-06' | date: '2021-06-19' | |||
| license: '[CC-BY-SA 4.0](http://creativecommons.org/licenses/by-sa/4.0/)' | license: '[CC-BY-SA 4.0](http://creativecommons.org/licenses/by-sa/4.0/)' | |||
| ... | ... | |||
| # Introduction | # Introduction | |||
| ## What is Markdown? | ## What is Markdown? | |||
| Markdown is a plain text format for writing structured documents, | Markdown is a plain text format for writing structured documents, | |||
| based on conventions for indicating formatting in email | based on conventions for indicating formatting in email | |||
| and usenet posts. It was developed by John Gruber (with | and usenet posts. It was developed by John Gruber (with | |||
| skipping to change at line 273 ¶ | skipping to change at line 273 ¶ | |||
| python test/spec_tests.py --spec spec.txt --program PROGRAM | python test/spec_tests.py --spec spec.txt --program PROGRAM | |||
| Since this document describes how Markdown is to be parsed into | Since this document describes how Markdown is to be parsed into | |||
| an abstract syntax tree, it would have made sense to use an abstract | an abstract syntax tree, it would have made sense to use an abstract | |||
| representation of the syntax tree instead of HTML. But HTML is capable | representation of the syntax tree instead of HTML. But HTML is capable | |||
| of representing the structural distinctions we need to make, and the | of representing the structural distinctions we need to make, and the | |||
| choice of HTML for the tests makes it possible to run the tests against | choice of HTML for the tests makes it possible to run the tests against | |||
| an implementation without writing an abstract syntax tree renderer. | an implementation without writing an abstract syntax tree renderer. | |||
| Note that not every feature of the HTML samples is mandated by | ||||
| the spec. For example, the spec says what counts as a link | ||||
| destination, but it doesn't mandate that non-ASCII characters in | ||||
| the URL be percent-encoded. To use the automatic tests, | ||||
| implementers will need to provide a renderer that conforms to | ||||
| the expectations of the spec examples (percent-encoding | ||||
| non-ASCII characters in URLs). But a conforming implementation | ||||
| can use a different renderer and may choose not to | ||||
| percent-encode non-ASCII characters in URLs. | ||||
| This document is generated from a text file, `spec.txt`, written | This document is generated from a text file, `spec.txt`, written | |||
| in Markdown with a small extension for the side-by-side tests. | in Markdown with a small extension for the side-by-side tests. | |||
| The script `tools/makespec.py` can be used to convert `spec.txt` into | The script `tools/makespec.py` can be used to convert `spec.txt` into | |||
| HTML or CommonMark (which can then be converted into other formats). | HTML or CommonMark (which can then be converted into other formats). | |||
| In the examples, the `→` character is used to represent tabs. | In the examples, the `→` character is used to represent tabs. | |||
| # Preliminaries | # Preliminaries | |||
| ## Characters and lines | ## Characters and lines | |||
| skipping to change at line 297 ¶ | skipping to change at line 307 ¶ | |||
| A [character](@) is a Unicode code point. Although some | A [character](@) is a Unicode code point. Although some | |||
| code points (for example, combining accents) do not correspond to | code points (for example, combining accents) do not correspond to | |||
| characters in an intuitive sense, all code points count as characters | characters in an intuitive sense, all code points count as characters | |||
| for purposes of this spec. | for purposes of this spec. | |||
| This spec does not specify an encoding; it thinks of lines as composed | This spec does not specify an encoding; it thinks of lines as composed | |||
| of [characters] rather than bytes. A conforming parser may be limited | of [characters] rather than bytes. A conforming parser may be limited | |||
| to a certain encoding. | to a certain encoding. | |||
| A [line](@) is a sequence of zero or more [characters] | A [line](@) is a sequence of zero or more [characters] | |||
| other than newline (`U+000A`) or carriage return (`U+000D`), | other than line feed (`U+000A`) or carriage return (`U+000D`), | |||
| followed by a [line ending] or by the end of file. | followed by a [line ending] or by the end of file. | |||
| A [line ending](@) is a newline (`U+000A`), a carriage return | A [line ending](@) is a line feed (`U+000A`), a carriage return | |||
| (`U+000D`) not followed by a newline, or a carriage return and a | (`U+000D`) not followed by a line feed, or a carriage return and a | |||
| following newline. | following line feed. | |||
| A line containing no characters, or a line containing only spaces | A line containing no characters, or a line containing only spaces | |||
| (`U+0020`) or tabs (`U+0009`), is called a [blank line](@). | (`U+0020`) or tabs (`U+0009`), is called a [blank line](@). | |||
| The following definitions of character classes will be used in this spec: | The following definitions of character classes will be used in this spec: | |||
| A [whitespace character](@) is a space | ||||
| (`U+0020`), tab (`U+0009`), newline (`U+000A`), line tabulation (`U+000B`), | ||||
| form feed (`U+000C`), or carriage return (`U+000D`). | ||||
| [Whitespace](@) is a sequence of one or more [whitespace | ||||
| characters]. | ||||
| A [Unicode whitespace character](@) is | A [Unicode whitespace character](@) is | |||
| any code point in the Unicode `Zs` general category, or a tab (`U+0009`), | any code point in the Unicode `Zs` general category, or a tab (`U+0009`), | |||
| carriage return (`U+000D`), newline (`U+000A`), or form feed | line feed (`U+000A`), form feed (`U+000C`), or carriage return (`U+000D`). | |||
| (`U+000C`). | ||||
| [Unicode whitespace](@) is a sequence of one | [Unicode whitespace](@) is a sequence of one or more | |||
| or more [Unicode whitespace characters]. | [Unicode whitespace characters]. | |||
| A [tab](@) is `U+0009`. | ||||
| A [space](@) is `U+0020`. | A [space](@) is `U+0020`. | |||
| A [non-whitespace character](@) is any character | An [ASCII control character](@) is a character between `U+0000–1F` (both | |||
| that is not a [whitespace character]. | including) or `U+007F`. | |||
| An [ASCII punctuation character](@) | An [ASCII punctuation character](@) | |||
| is `!`, `"`, `#`, `$`, `%`, `&`, `'`, `(`, `)`, | is `!`, `"`, `#`, `$`, `%`, `&`, `'`, `(`, `)`, | |||
| `*`, `+`, `,`, `-`, `.`, `/` (U+0021–2F), | `*`, `+`, `,`, `-`, `.`, `/` (U+0021–2F), | |||
| `:`, `;`, `<`, `=`, `>`, `?`, `@` (U+003A–0040), | `:`, `;`, `<`, `=`, `>`, `?`, `@` (U+003A–0040), | |||
| `[`, `\`, `]`, `^`, `_`, `` ` `` (U+005B–0060), | `[`, `\`, `]`, `^`, `_`, `` ` `` (U+005B–0060), | |||
| `{`, `|`, `}`, or `~` (U+007B–007E). | `{`, `|`, `}`, or `~` (U+007B–007E). | |||
| A [punctuation character](@) is an [ASCII | A [Unicode punctuation character](@) is an [ASCII | |||
| punctuation character] or anything in | punctuation character] or anything in | |||
| the general Unicode categories `Pc`, `Pd`, `Pe`, `Pf`, `Pi`, `Po`, or `Ps`. | the general Unicode categories `Pc`, `Pd`, `Pe`, `Pf`, `Pi`, `Po`, or `Ps`. | |||
| ## Tabs | ## Tabs | |||
| Tabs in lines are not expanded to [spaces]. However, | Tabs in lines are not expanded to [spaces]. However, | |||
| in contexts where whitespace helps to define block structure, | in contexts where spaces help to define block structure, | |||
| tabs behave as if they were replaced by spaces with a tab stop | tabs behave as if they were replaced by spaces with a tab stop | |||
| of 4 characters. | of 4 characters. | |||
| Thus, for example, a tab can be used instead of four spaces | Thus, for example, a tab can be used instead of four spaces | |||
| in an indented code block. (Note, however, that internal | in an indented code block. (Note, however, that internal | |||
| tabs are passed through as literal tabs, not expanded to | tabs are passed through as literal tabs, not expanded to | |||
| spaces.) | spaces.) | |||
| ```````````````````````````````` example | ```````````````````````````````` example | |||
| →foo→baz→→bim | →foo→baz→→bim | |||
| skipping to change at line 479 ¶ | skipping to change at line 483 ¶ | |||
| *→*→*→ | *→*→*→ | |||
| . | . | |||
| <hr /> | <hr /> | |||
| ```````````````````````````````` | ```````````````````````````````` | |||
| ## Insecure characters | ## Insecure characters | |||
| For security reasons, the Unicode character `U+0000` must be replaced | For security reasons, the Unicode character `U+0000` must be replaced | |||
| with the REPLACEMENT CHARACTER (`U+FFFD`). | with the REPLACEMENT CHARACTER (`U+FFFD`). | |||
| ## Backslash escapes | ||||
| Any ASCII punctuation character may be backslash-escaped: | ||||
| ```````````````````````````````` example | ||||
| \!\"\#\$\%\&\'\(\)\*\+\,\-\.\/\:\;\<\=\>\?\@\[\\\]\^\_\`\{\|\}\~ | ||||
| . | ||||
| <p>!&quot;#$%&amp;'()*+,-./:;&lt;=&gt;?@[\]^_`{|}~</p> | ||||
| ```````````````````````````````` | ||||
| Backslashes before other characters are treated as literal | ||||
| backslashes: | ||||
| ```````````````````````````````` example | ||||
| \→\A\a\ \3\φ\« | ||||
| . | ||||
| <p>\→\A\a\ \3\φ\«</p> | ||||
| ```````````````````````````````` | ||||
| Escaped characters are treated as regular characters and do | ||||
| not have their usual Markdown meanings: | ||||
| ```````````````````````````````` example | ||||
| \*not emphasized* | ||||
| \<br/> not a tag | ||||
| \[not a link](/foo) | ||||
| \`not code` | ||||
| 1\. not a list | ||||
| \* not a list | ||||
| \# not a heading | ||||
| \[foo]: /url "not a reference" | ||||
| \&ouml; not a character entity | ||||
| . | ||||
| <p>*not emphasized* | ||||
| &lt;br/&gt; not a tag | ||||
| [not a link](/foo) | ||||
| `not code` | ||||
| 1. not a list | ||||
| * not a list | ||||
| # not a heading | ||||
| [foo]: /url &quot;not a reference&quot; | ||||
| &amp;ouml; not a character entity</p> | ||||
| ```````````````````````````````` | ||||
| If a backslash is itself escaped, the following character is not: | ||||
| ```````````````````````````````` example | ||||
| \\*emphasis* | ||||
| . | ||||
| <p>\<em>emphasis</em></p> | ||||
| ```````````````````````````````` | ||||
| A backslash at the end of the line is a [hard line break]: | ||||
| ```````````````````````````````` example | ||||
| foo\ | ||||
| bar | ||||
| . | ||||
| <p>foo<br /> | ||||
| bar</p> | ||||
| ```````````````````````````````` | ||||
| Backslash escapes do not work in code blocks, code spans, autolinks, or | ||||
| raw HTML: | ||||
| ```````````````````````````````` example | ||||
| `` \[\` `` | ||||
| . | ||||
| <p><code>\[\`</code></p> | ||||
| ```````````````````````````````` | ||||
| ```````````````````````````````` example | ||||
| \[\] | ||||
| . | ||||
| <pre><code>\[\] | ||||
| </code></pre> | ||||
| ```````````````````````````````` | ||||
| ```````````````````````````````` example | ||||
| ~~~ | ||||
| \[\] | ||||
| ~~~ | ||||
| . | ||||
| <pre><code>\[\] | ||||
| </code></pre> | ||||
| ```````````````````````````````` | ||||
| ```````````````````````````````` example | ||||
| <http://example.com?find=\*> | ||||
| . | ||||
| <p><a href="http://example.com?find=%5C*">http://example.com?find=\*</a></p> | ||||
| ```````````````````````````````` | ||||
| ```````````````````````````````` example | ||||
| <a href="/bar\/)"> | ||||
| . | ||||
| <a href="/bar\/)"> | ||||
| ```````````````````````````````` | ||||
| But they work in all other contexts, including URLs and link titles, | ||||
| link references, and [info strings] in [fenced code blocks]: | ||||
| ```````````````````````````````` example | ||||
| [foo](/bar\* "ti\*tle") | ||||
| . | ||||
| <p><a href="/bar*" title="ti*tle">foo</a></p> | ||||
| ```````````````````````````````` | ||||
| ```````````````````````````````` example | ||||
| [foo] | ||||
| [foo]: /bar\* "ti\*tle" | ||||
| . | ||||
| <p><a href="/bar*" title="ti*tle">foo</a></p> | ||||
| ```````````````````````````````` | ||||
| ```````````````````````````````` example | ||||
| ``` foo\+bar | ||||
| foo | ||||
| ``` | ||||
| . | ||||
| <pre><code class="language-foo+bar">foo | ||||
| </code></pre> | ||||
| ```````````````````````````````` | ||||
| ## Entity and numeric character references | ||||
| Valid HTML entity references and numeric character references | ||||
| can be used in place of the corresponding Unicode character, | ||||
| with the following exceptions: | ||||
| - Entity and character references are not recognized in code | ||||
| blocks and code spans. | ||||
| - Entity and character references cannot stand in place of | ||||
| special characters that define structural elements in | ||||
| CommonMark. For example, although `&#42;` can be used | ||||
| in place of a literal `*` character, `&#42;` cannot replace | ||||
| `*` in emphasis delimiters, bullet list markers, or thematic | ||||
| breaks. | ||||
| Conforming CommonMark parsers need not store information about | ||||
| whether a particular character was represented in the source | ||||
| using a Unicode character or an entity reference. | ||||
| [Entity references](@) consist of `&` + any of the valid | ||||
| HTML5 entity names + `;`. The | ||||
| document <https://html.spec.whatwg.org/entities.json> | ||||
| is used as an authoritative source for the valid entity | ||||
| references and their corresponding code points. | ||||
| ```````````````````````````````` example | ||||
| &nbsp; &amp; &copy; &AElig; &Dcaron; | ||||
| &frac34; &HilbertSpace; &DifferentialD; | ||||
| &ClockwiseContourIntegral; &ngE; | ||||
| . | ||||
| <p>  &amp; © Æ Ď | ||||
| ¾ ℋ ⅆ | ||||
| ∲ ≧̸</p> | ||||
| ```````````````````````````````` | ||||
| [Decimal numeric character | ||||
| references](@) | ||||
| consist of `&#` + a string of 1--7 arabic digits + `;`. A | ||||
| numeric character reference is parsed as the corresponding | ||||
| Unicode character. Invalid Unicode code points will be replaced by | ||||
| the REPLACEMENT CHARACTER (`U+FFFD`). For security reasons, | ||||
| the code point `U+0000` will also be replaced by `U+FFFD`. | ||||
| ```````````````````````````````` example | ||||
| &#35; &#1234; &#992; &#0; | ||||
| . | ||||
| <p># Ӓ Ϡ �</p> | ||||
| ```````````````````````````````` | ||||
| [Hexadecimal numeric character | ||||
| references](@) consist of `&#` + | ||||
| either `X` or `x` + a string of 1-6 hexadecimal digits + `;`. | ||||
| They too are parsed as the corresponding Unicode character (this | ||||
| time specified with a hexadecimal numeral instead of decimal). | ||||
| ```````````````````````````````` example | ||||
| &#X22; &#XD06; &#xcab; | ||||
| . | ||||
| <p>&quot; ആ ಫ</p> | ||||
| ```````````````````````````````` | ||||
| Here are some nonentities: | ||||
| ```````````````````````````````` example | ||||
| &nbsp &x; &#; &#x; | ||||
| &#87654321; | ||||
| &#abcdef0; | ||||
| &ThisIsNotDefined; &hi?; | ||||
| . | ||||
| <p>&amp;nbsp &amp;x; &amp;#; &amp;#x; | ||||
| &amp;#87654321; | ||||
| &amp;#abcdef0; | ||||
| &amp;ThisIsNotDefined; &amp;hi?;</p> | ||||
| ```````````````````````````````` | ||||
| Although HTML5 does accept some entity references | ||||
| without a trailing semicolon (such as `&copy`), these are not | ||||
| recognized here, because it makes the grammar too ambiguous: | ||||
| ```````````````````````````````` example | ||||
| &copy | ||||
| . | ||||
| <p>&amp;copy</p> | ||||
| ```````````````````````````````` | ||||
| Strings that are not on the list of HTML5 named entities are not | ||||
| recognized as entity references either: | ||||
| ```````````````````````````````` example | ||||
| &MadeUpEntity; | ||||
| . | ||||
| <p>&amp;MadeUpEntity;</p> | ||||
| ```````````````````````````````` | ||||
| Entity and numeric character references are recognized in any | ||||
| context besides code spans or code blocks, including | ||||
| URLs, [link titles], and [fenced code block][] [info strings]: | ||||
| ```````````````````````````````` example | ||||
| <a href="&ouml;&ouml;.html"> | ||||
| . | ||||
| <a href="&ouml;&ouml;.html"> | ||||
| ```````````````````````````````` | ||||
| ```````````````````````````````` example | ||||
| [foo](/f&ouml;&ouml; "f&ouml;&ouml;") | ||||
| . | ||||
| <p><a href="/f%C3%B6%C3%B6" title="föö">foo</a></p> | ||||
| ```````````````````````````````` | ||||
| ```````````````````````````````` example | ||||
| [foo] | ||||
| [foo]: /f&ouml;&ouml; "f&ouml;&ouml;" | ||||
| . | ||||
| <p><a href="/f%C3%B6%C3%B6" title="föö">foo</a></p> | ||||
| ```````````````````````````````` | ||||
| ```````````````````````````````` example | ||||
| ``` f&ouml;&ouml; | ||||
| foo | ||||
| ``` | ||||
| . | ||||
| <pre><code class="language-föö">foo | ||||
| </code></pre> | ||||
| ```````````````````````````````` | ||||
| Entity and numeric character references are treated as literal | ||||
| text in code spans and code blocks: | ||||
| ```````````````````````````````` example | ||||
| `f&ouml;&ouml;` | ||||
| . | ||||
| <p><code>f&amp;ouml;&amp;ouml;</code></p> | ||||
| ```````````````````````````````` | ||||
| ```````````````````````````````` example | ||||
| f&ouml;f&ouml; | ||||
| . | ||||
| <pre><code>f&amp;ouml;f&amp;ouml; | ||||
| </code></pre> | ||||
| ```````````````````````````````` | ||||
| Entity and numeric character references cannot be used | ||||
| in place of symbols indicating structure in CommonMark | ||||
| documents. | ||||
| ```````````````````````````````` example | ||||
| &#42;foo&#42; | ||||
| *foo* | ||||
| . | ||||
| <p>*foo* | ||||
| <em>foo</em></p> | ||||
| ```````````````````````````````` | ||||
| ```````````````````````````````` example | ||||
| &#42; foo | ||||
| * foo | ||||
| . | ||||
| <p>* foo</p> | ||||
| <ul> | ||||
| <li>foo</li> | ||||
| </ul> | ||||
| ```````````````````````````````` | ||||
| ```````````````````````````````` example | ||||
| foo&#10;&#10;bar | ||||
| . | ||||
| <p>foo | ||||
| bar</p> | ||||
| ```````````````````````````````` | ||||
| ```````````````````````````````` example | ||||
| &#9;foo | ||||
| . | ||||
| <p>→foo</p> | ||||
| ```````````````````````````````` | ||||
| ```````````````````````````````` example | ||||
| [a](url &quot;tit&quot;) | ||||
| . | ||||
| <p>[a](url &quot;tit&quot;)</p> | ||||
| ```````````````````````````````` | ||||
| # Blocks and inlines | # Blocks and inlines | |||
| We can think of a document as a sequence of | We can think of a document as a sequence of | |||
| [blocks](@)---structural elements like paragraphs, block | [blocks](@)---structural elements like paragraphs, block | |||
| quotations, lists, headings, rules, and code blocks. Some blocks (like | quotations, lists, headings, rules, and code blocks. Some blocks (like | |||
| block quotes and list items) contain other blocks; others (like | block quotes and list items) contain other blocks; others (like | |||
| headings and paragraphs) contain [inline](@) content---text, | headings and paragraphs) contain [inline](@) content---text, | |||
| links, emphasized text, images, code spans, and so on. | links, emphasized text, images, code spans, and so on. | |||
| ## Precedence | ## Precedence | |||
| skipping to change at line 516 ¶ | skipping to change at line 832 ¶ | |||
| paragraphs, headings, and other block constructs can be parsed for inline | paragraphs, headings, and other block constructs can be parsed for inline | |||
| structure. The second step requires information about link reference | structure. The second step requires information about link reference | |||
| definitions that will be available only at the end of the first | definitions that will be available only at the end of the first | |||
| step. Note that the first step requires processing lines in sequence, | step. Note that the first step requires processing lines in sequence, | |||
| but the second can be parallelized, since the inline parsing of | but the second can be parallelized, since the inline parsing of | |||
| one block element does not affect the inline parsing of any other. | one block element does not affect the inline parsing of any other. | |||
| ## Container blocks and leaf blocks | ## Container blocks and leaf blocks | |||
| We can divide blocks into two types: | We can divide blocks into two types: | |||
| [container blocks](@), | [container blocks](#container-blocks), | |||
| which can contain other blocks, and [leaf blocks](@), | which can contain other blocks, and [leaf blocks](#leaf-blocks), | |||
| which cannot. | which cannot. | |||
| # Leaf blocks | # Leaf blocks | |||
| This section describes the different kinds of leaf block that make up a | This section describes the different kinds of leaf block that make up a | |||
| Markdown document. | Markdown document. | |||
| ## Thematic breaks | ## Thematic breaks | |||
| A line consisting of 0-3 spaces of indentation, followed by a sequence | A line consisting of optionally up to three spaces of indentation, followed by a | |||
| of three or more matching `-`, `_`, or `*` characters, each followed | sequence of three or more matching `-`, `_`, or `*` characters, each followed | |||
| optionally by any number of spaces or tabs, forms a | optionally by any number of spaces or tabs, forms a | |||
| [thematic break](@). | [thematic break](@). | |||
| ```````````````````````````````` example | ```````````````````````````````` example | |||
| *** | *** | |||
| --- | --- | |||
| ___ | ___ | |||
| . | . | |||
| <hr /> | <hr /> | |||
| <hr /> | <hr /> | |||
| skipping to change at line 568 ¶ | skipping to change at line 884 ¶ | |||
| ```````````````````````````````` example | ```````````````````````````````` example | |||
| -- | -- | |||
| ** | ** | |||
| __ | __ | |||
| . | . | |||
| <p>-- | <p>-- | |||
| ** | ** | |||
| __</p> | __</p> | |||
| ```````````````````````````````` | ```````````````````````````````` | |||
| One to three spaces indent are allowed: | Up to three spaces of indentation are allowed: | |||
| ```````````````````````````````` example | ```````````````````````````````` example | |||
| *** | *** | |||
| *** | *** | |||
| *** | *** | |||
| . | . | |||
| <hr /> | <hr /> | |||
| <hr /> | <hr /> | |||
| <hr /> | <hr /> | |||
| ```````````````````````````````` | ```````````````````````````````` | |||
| Four spaces is too many: | Four spaces of indentation is too many: | |||
| ```````````````````````````````` example | ```````````````````````````````` example | |||
| *** | *** | |||
| . | . | |||
| <pre><code>*** | <pre><code>*** | |||
| </code></pre> | </code></pre> | |||
| ```````````````````````````````` | ```````````````````````````````` | |||
| ```````````````````````````````` example | ```````````````````````````````` example | |||
| Foo | Foo | |||
| skipping to change at line 605 ¶ | skipping to change at line 921 ¶ | |||
| ```````````````````````````````` | ```````````````````````````````` | |||
| More than three characters may be used: | More than three characters may be used: | |||
| ```````````````````````````````` example | ```````````````````````````````` example | |||
| _____________________________________ | _____________________________________ | |||
| . | . | |||
| <hr /> | <hr /> | |||
| ```````````````````````````````` | ```````````````````````````````` | |||
| Spaces are allowed between the characters: | Spaces and tabs are allowed between the characters: | |||
| ```````````````````````````````` example | ```````````````````````````````` example | |||
| - - - | - - - | |||
| . | . | |||
| <hr /> | <hr /> | |||
| ```````````````````````````````` | ```````````````````````````````` | |||
| ```````````````````````````````` example | ```````````````````````````````` example | |||
| ** * ** * ** * ** | ** * ** * ** * ** | |||
| . | . | |||
| <hr /> | <hr /> | |||
| ```````````````````````````````` | ```````````````````````````````` | |||
| ```````````````````````````````` example | ```````````````````````````````` example | |||
| - - - - | - - - - | |||
| . | . | |||
| <hr /> | <hr /> | |||
| ```````````````````````````````` | ```````````````````````````````` | |||
| Spaces are allowed at the end: | Spaces and tabs are allowed at the end: | |||
| ```````````````````````````````` example | ```````````````````````````````` example | |||
| - - - - | - - - - | |||
| . | . | |||
| <hr /> | <hr /> | |||
| ```````````````````````````````` | ```````````````````````````````` | |||
| However, no other characters may occur in the line: | However, no other characters may occur in the line: | |||
| ```````````````````````````````` example | ```````````````````````````````` example | |||
| skipping to change at line 647 ¶ | skipping to change at line 963 ¶ | |||
| a------ | a------ | |||
| ---a--- | ---a--- | |||
| . | . | |||
| <p>_ _ _ _ a</p> | <p>_ _ _ _ a</p> | |||
| <p>a------</p> | <p>a------</p> | |||
| <p>---a---</p> | <p>---a---</p> | |||
| ```````````````````````````````` | ```````````````````````````````` | |||
| It is required that all of the [non-whitespace characters] be the same. | It is required that all of the characters other than spaces or tabs be the same. | |||
| So, this is not a thematic break: | So, this is not a thematic break: | |||
| ```````````````````````````````` example | ```````````````````````````````` example | |||
| *-* | *-* | |||
| . | . | |||
| <p><em>-</em></p> | <p><em>-</em></p> | |||
| ```````````````````````````````` | ```````````````````````````````` | |||
| Thematic breaks do not need blank lines before or after: | Thematic breaks do not need blank lines before or after: | |||
| skipping to change at line 736 ¶ | skipping to change at line 1052 ¶ | |||
| </li> | </li> | |||
| </ul> | </ul> | |||
| ```````````````````````````````` | ```````````````````````````````` | |||
| ## ATX headings | ## ATX headings | |||
| An [ATX heading](@) | An [ATX heading](@) | |||
| consists of a string of characters, parsed as inline content, between an | consists of a string of characters, parsed as inline content, between an | |||
| opening sequence of 1--6 unescaped `#` characters and an optional | opening sequence of 1--6 unescaped `#` characters and an optional | |||
| closing sequence of any number of unescaped `#` characters. | closing sequence of any number of unescaped `#` characters. | |||
| The opening sequence of `#` characters must be followed by a | The opening sequence of `#` characters must be followed by spaces or tabs, or | |||
| [space] or by the end of line. The optional closing sequence of `#`s must be | by the end of line. The optional closing sequence of `#`s must be preceded by | |||
| preceded by a [space] and may be followed by spaces only. The opening | spaces or tabs and may be followed by spaces or tabs only. The opening | |||
| `#` character may be indented 0-3 spaces. The raw contents of the | `#` character may be preceded by up to three spaces of indentation. The raw | |||
| heading are stripped of leading and trailing spaces before being parsed | contents of the heading are stripped of leading and trailing space or tabs | |||
| as inline content. The heading level is equal to the number of `#` | before being parsed as inline content. The heading level is equal to the number | |||
| characters in the opening sequence. | of `#` characters in the opening sequence. | |||
| Simple headings: | Simple headings: | |||
| ```````````````````````````````` example | ```````````````````````````````` example | |||
| # foo | # foo | |||
| ## foo | ## foo | |||
| ### foo | ### foo | |||
| #### foo | #### foo | |||
| ##### foo | ##### foo | |||
| ###### foo | ###### foo | |||
| skipping to change at line 770 ¶ | skipping to change at line 1086 ¶ | |||
| ```````````````````````````````` | ```````````````````````````````` | |||
| More than six `#` characters is not a heading: | More than six `#` characters is not a heading: | |||
| ```````````````````````````````` example | ```````````````````````````````` example | |||
| ####### foo | ####### foo | |||
| . | . | |||
| <p>####### foo</p> | <p>####### foo</p> | |||
| ```````````````````````````````` | ```````````````````````````````` | |||
| At least one space is required between the `#` characters and the | At least one space or tab is required between the `#` characters and the | |||
| heading's contents, unless the heading is empty. Note that many | heading's contents, unless the heading is empty. Note that many | |||
| implementations currently do not require the space. However, the | implementations currently do not require the space. However, the | |||
| space was required by the | space was required by the | |||
| [original ATX implementation](http://www.aaronsw.com/2002/atx/atx.py), | [original ATX implementation](http://www.aaronsw.com/2002/atx/atx.py), | |||
| and it helps prevent things like the following from being parsed as | and it helps prevent things like the following from being parsed as | |||
| headings: | headings: | |||
| ```````````````````````````````` example | ```````````````````````````````` example | |||
| #5 bolt | #5 bolt | |||
| skipping to change at line 803 ¶ | skipping to change at line 1119 ¶ | |||
| ```````````````````````````````` | ```````````````````````````````` | |||
| Contents are parsed as inlines: | Contents are parsed as inlines: | |||
| ```````````````````````````````` example | ```````````````````````````````` example | |||
| # foo *bar* \*baz\* | # foo *bar* \*baz\* | |||
| . | . | |||
| <h1>foo <em>bar</em> *baz*</h1> | <h1>foo <em>bar</em> *baz*</h1> | |||
| ```````````````````````````````` | ```````````````````````````````` | |||
| Leading and trailing [whitespace] is ignored in parsing inline content: | Leading and trailing spaces or tabs are ignored in parsing inline content: | |||
| ```````````````````````````````` example | ```````````````````````````````` example | |||
| # foo | # foo | |||
| . | . | |||
| <h1>foo</h1> | <h1>foo</h1> | |||
| ```````````````````````````````` | ```````````````````````````````` | |||
| One to three spaces indentation are allowed: | Up to three spaces of indentation are allowed: | |||
| ```````````````````````````````` example | ```````````````````````````````` example | |||
| ### foo | ### foo | |||
| ## foo | ## foo | |||
| # foo | # foo | |||
| . | . | |||
| <h3>foo</h3> | <h3>foo</h3> | |||
| <h2>foo</h2> | <h2>foo</h2> | |||
| <h1>foo</h1> | <h1>foo</h1> | |||
| ```````````````````````````````` | ```````````````````````````````` | |||
| Four spaces are too much: | Four spaces of indentation is too many: | |||
| ```````````````````````````````` example | ```````````````````````````````` example | |||
| # foo | # foo | |||
| . | . | |||
| <pre><code># foo | <pre><code># foo | |||
| </code></pre> | </code></pre> | |||
| ```````````````````````````````` | ```````````````````````````````` | |||
| ```````````````````````````````` example | ```````````````````````````````` example | |||
| foo | foo | |||
| skipping to change at line 860 ¶ | skipping to change at line 1176 ¶ | |||
| It need not be the same length as the opening sequence: | It need not be the same length as the opening sequence: | |||
| ```````````````````````````````` example | ```````````````````````````````` example | |||
| # foo ################################## | # foo ################################## | |||
| ##### foo ## | ##### foo ## | |||
| . | . | |||
| <h1>foo</h1> | <h1>foo</h1> | |||
| <h5>foo</h5> | <h5>foo</h5> | |||
| ```````````````````````````````` | ```````````````````````````````` | |||
| Spaces are allowed after the closing sequence: | Spaces or tabs are allowed after the closing sequence: | |||
| ```````````````````````````````` example | ```````````````````````````````` example | |||
| ### foo ### | ### foo ### | |||
| . | . | |||
| <h3>foo</h3> | <h3>foo</h3> | |||
| ```````````````````````````````` | ```````````````````````````````` | |||
| A sequence of `#` characters with anything but [spaces] following it | A sequence of `#` characters with anything but spaces or tabs following it | |||
| is not a closing sequence, but counts as part of the contents of the | is not a closing sequence, but counts as part of the contents of the | |||
| heading: | heading: | |||
| ```````````````````````````````` example | ```````````````````````````````` example | |||
| ### foo ### b | ### foo ### b | |||
| . | . | |||
| <h3>foo ### b</h3> | <h3>foo ### b</h3> | |||
| ```````````````````````````````` | ```````````````````````````````` | |||
| The closing sequence must be preceded by a space: | The closing sequence must be preceded by a space or tab: | |||
| ```````````````````````````````` example | ```````````````````````````````` example | |||
| # foo# | # foo# | |||
| . | . | |||
| <h1>foo#</h1> | <h1>foo#</h1> | |||
| ```````````````````````````````` | ```````````````````````````````` | |||
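The closing-sequence rules above can be sketched in Python. This is only an illustration following the right-hand (0.30) wording, not the reference implementation: the name `parse_atx_heading` is hypothetical, the line is assumed to arrive without its line ending, and leading indentation is assumed to consist of spaces only.

```python
import re

# Up to three spaces, 1-6 '#' characters, then either the end of the line
# or spaces/tabs followed by the raw heading content.
_ATX = re.compile(r"^ {0,3}(#{1,6})(?:[ \t]+(.*))?$")


def parse_atx_heading(line):
    """Return (level, content) for an ATX heading line, or None."""
    m = _ATX.match(line)
    if m is None:
        return None
    content = m.group(2) or ""
    # Content made up only of '#' characters is itself a closing sequence.
    content = re.sub(r"^#+[ \t]*$", "", content)
    # A closing sequence must be preceded by a space or tab and may be
    # followed only by spaces or tabs.
    content = re.sub(r"[ \t]+#+[ \t]*$", "", content)
    return len(m.group(1)), content.strip(" \t")
```

For instance, `parse_atx_heading("### foo ### b")` keeps `### b` as part of the content, because the `#` run is not followed only by spaces or tabs.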
| Backslash-escaped `#` characters do not count as part | Backslash-escaped `#` characters do not count as part | |||
| of the closing sequence: | of the closing sequence: | |||
| skipping to change at line 937 ¶ | skipping to change at line 1253 ¶ | |||
| ### ### | ### ### | |||
| . | . | |||
| <h2></h2> | <h2></h2> | |||
| <h1></h1> | <h1></h1> | |||
| <h3></h3> | <h3></h3> | |||
| ```````````````````````````````` | ```````````````````````````````` | |||
| ## Setext headings | ## Setext headings | |||
| A [setext heading](@) consists of one or more | A [setext heading](@) consists of one or more | |||
| lines of text, each containing at least one [non-whitespace | lines of text, not interrupted by a blank line, of which the first line does not | |||
| character], with no more than 3 spaces indentation, followed by | have more than 3 spaces of indentation, followed by | |||
| a [setext heading underline]. The lines of text must be such | a [setext heading underline]. The lines of text must be such | |||
| that, were they not followed by the setext heading underline, | that, were they not followed by the setext heading underline, | |||
| they would be interpreted as a paragraph: they cannot be | they would be interpreted as a paragraph: they cannot be | |||
| interpretable as a [code fence], [ATX heading][ATX headings], | interpretable as a [code fence], [ATX heading][ATX headings], | |||
| [block quote][block quotes], [thematic break][thematic breaks], | [block quote][block quotes], [thematic break][thematic breaks], | |||
| [list item][list items], or [HTML block][HTML blocks]. | [list item][list items], or [HTML block][HTML blocks]. | |||
| A [setext heading underline](@) is a sequence of | A [setext heading underline](@) is a sequence of | |||
| `=` characters or a sequence of `-` characters, with no more than 3 | `=` characters or a sequence of `-` characters, with no more than 3 | |||
| spaces indentation and any number of trailing spaces. If a line | spaces of indentation and any number of trailing spaces or tabs. If a line | |||
| containing a single `-` can be interpreted as an | containing a single `-` can be interpreted as an | |||
| empty [list item][list items], it should be interpreted this way | empty [list item][list items], it should be interpreted this way | |||
| and not as a [setext heading underline]. | and not as a [setext heading underline]. | |||
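As a rough sketch of the underline definition above (0.30 wording), a single regular expression is enough to recognise a candidate underline line. The name `setext_level` is hypothetical. Whether the line actually produces a heading still depends on context: it must follow paragraph text, and a lone `-` may instead be an empty list item.

```python
import re

# Up to three spaces of indentation, a run of '=' or a run of '-',
# then any number of trailing spaces or tabs.
_SETEXT_UNDERLINE = re.compile(r"^ {0,3}(=+|-+)[ \t]*$")


def setext_level(line):
    """Return 1 for an '=' underline, 2 for a '-' underline, else None."""
    m = _SETEXT_UNDERLINE.match(line)
    if m is None:
        return None
    return 1 if m.group(1)[0] == "=" else 2
```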
| The heading is a level 1 heading if `=` characters are used in | The heading is a level 1 heading if `=` characters are used in | |||
| the [setext heading underline], and a level 2 heading if `-` | the [setext heading underline], and a level 2 heading if `-` | |||
| characters are used. The contents of the heading are the result | characters are used. The contents of the heading are the result | |||
| of parsing the preceding lines of text as CommonMark inline | of parsing the preceding lines of text as CommonMark inline | |||
| content. | content. | |||
| skipping to change at line 991 ¶ | skipping to change at line 1307 ¶ | |||
| baz* | baz* | |||
| ==== | ==== | |||
| . | . | |||
| <h1>Foo <em>bar | <h1>Foo <em>bar | |||
| baz</em></h1> | baz</em></h1> | |||
| ```````````````````````````````` | ```````````````````````````````` | |||
| The contents are the result of parsing the heading's raw | The contents are the result of parsing the heading's raw | |||
| content as inlines. The heading's raw content is formed by | content as inlines. The heading's raw content is formed by | |||
| concatenating the lines and removing initial and final | concatenating the lines and removing initial and final | |||
| [whitespace]. | spaces or tabs. | |||
| ```````````````````````````````` example | ```````````````````````````````` example | |||
| Foo *bar | Foo *bar | |||
| baz*→ | baz*→ | |||
| ==== | ==== | |||
| . | . | |||
| <h1>Foo <em>bar | <h1>Foo <em>bar | |||
| baz</em></h1> | baz</em></h1> | |||
| ```````````````````````````````` | ```````````````````````````````` | |||
| skipping to change at line 1015 ¶ | skipping to change at line 1331 ¶ | |||
| Foo | Foo | |||
| ------------------------- | ------------------------- | |||
| Foo | Foo | |||
| = | = | |||
| . | . | |||
| <h2>Foo</h2> | <h2>Foo</h2> | |||
| <h1>Foo</h1> | <h1>Foo</h1> | |||
| ```````````````````````````````` | ```````````````````````````````` | |||
| The heading content can be indented up to three spaces, and need | The heading content can be preceded by up to three spaces of indentation, and | |||
| not line up with the underlining: | need not line up with the underlining: | |||
| ```````````````````````````````` example | ```````````````````````````````` example | |||
| Foo | Foo | |||
| --- | --- | |||
| Foo | Foo | |||
| ----- | ----- | |||
| Foo | Foo | |||
| === | === | |||
| . | . | |||
| <h2>Foo</h2> | <h2>Foo</h2> | |||
| <h2>Foo</h2> | <h2>Foo</h2> | |||
| <h1>Foo</h1> | <h1>Foo</h1> | |||
| ```````````````````````````````` | ```````````````````````````````` | |||
| Four spaces indent is too much: | Four spaces of indentation is too many: | |||
| ```````````````````````````````` example | ```````````````````````````````` example | |||
| Foo | Foo | |||
| --- | --- | |||
| Foo | Foo | |||
| --- | --- | |||
| . | . | |||
| <pre><code>Foo | <pre><code>Foo | |||
| --- | --- | |||
| Foo | Foo | |||
| </code></pre> | </code></pre> | |||
| <hr /> | <hr /> | |||
| ```````````````````````````````` | ```````````````````````````````` | |||
| The setext heading underline can be indented up to three spaces, and | The setext heading underline can be preceded by up to three spaces of | |||
| may have trailing spaces: | indentation, and may have trailing spaces or tabs: | |||
| ```````````````````````````````` example | ```````````````````````````````` example | |||
| Foo | Foo | |||
| ---- | ---- | |||
| . | . | |||
| <h2>Foo</h2> | <h2>Foo</h2> | |||
| ```````````````````````````````` | ```````````````````````````````` | |||
| Four spaces is too much: | Four spaces of indentation is too many: | |||
| ```````````````````````````````` example | ```````````````````````````````` example | |||
| Foo | Foo | |||
| --- | --- | |||
| . | . | |||
| <p>Foo | <p>Foo | |||
| ---</p> | ---</p> | |||
| ```````````````````````````````` | ```````````````````````````````` | |||
| The setext heading underline cannot contain internal spaces: | The setext heading underline cannot contain internal spaces or tabs: | |||
| ```````````````````````````````` example | ```````````````````````````````` example | |||
| Foo | Foo | |||
| = = | = = | |||
| Foo | Foo | |||
| --- - | --- - | |||
| . | . | |||
| <p>Foo | <p>Foo | |||
| = =</p> | = =</p> | |||
| <p>Foo</p> | <p>Foo</p> | |||
| <hr /> | <hr /> | |||
| ```````````````````````````````` | ```````````````````````````````` | |||
| Trailing spaces in the content line do not cause a line break: | Trailing spaces or tabs in the content line do not cause a hard line break: | |||
| ```````````````````````````````` example | ```````````````````````````````` example | |||
| Foo | Foo | |||
| ----- | ----- | |||
| . | . | |||
| <h2>Foo</h2> | <h2>Foo</h2> | |||
| ```````````````````````````````` | ```````````````````````````````` | |||
| Nor does a backslash at the end: | Nor does a backslash at the end: | |||
| skipping to change at line 1332 ¶ | skipping to change at line 1648 ¶ | |||
| bar | bar | |||
| --- | --- | |||
| baz</p> | baz</p> | |||
| ```````````````````````````````` | ```````````````````````````````` | |||
| ## Indented code blocks | ## Indented code blocks | |||
| An [indented code block](@) is composed of one or more | An [indented code block](@) is composed of one or more | |||
| [indented chunks] separated by blank lines. | [indented chunks] separated by blank lines. | |||
| An [indented chunk](@) is a sequence of non-blank lines, | An [indented chunk](@) is a sequence of non-blank lines, | |||
| each indented four or more spaces. The contents of the code block are | each preceded by four or more spaces of indentation. The contents of the code | |||
| the literal contents of the lines, including trailing | block are the literal contents of the lines, including trailing | |||
| [line endings], minus four spaces of indentation. | [line endings], minus four spaces of indentation. | |||
| An indented code block has no [info string]. | An indented code block has no [info string]. | |||
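The definition above might be sketched as follows. Both function names are hypothetical, and the sketch assumes that indentation consists of spaces only (tabs would first need to be expanded to tab stops) and that the caller has already collected the lines belonging to one code block, without line endings.

```python
def is_chunk_line(line):
    """True for a non-blank line preceded by at least four spaces."""
    return line.startswith("    ") and line.strip(" \t") != ""


def indented_code_content(lines):
    """Join the block's lines, dropping four spaces of indentation from each.

    Anything beyond the first four columns is kept, including on
    interior blank lines.
    """
    return "".join(line[4:] + "\n" for line in lines)
```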
| An indented code block cannot interrupt a paragraph, so there must be | An indented code block cannot interrupt a paragraph, so there must be | |||
| a blank line between a paragraph and a following indented code block. | a blank line between a paragraph and a following indented code block. | |||
| (A blank line is not needed, however, between a code block and a following | (A blank line is not needed, however, between a code block and a following | |||
| paragraph.) | paragraph.) | |||
| ```````````````````````````````` example | ```````````````````````````````` example | |||
| a simple | a simple | |||
| skipping to change at line 1416 ¶ | skipping to change at line 1732 ¶ | |||
| chunk3 | chunk3 | |||
| . | . | |||
| <pre><code>chunk1 | <pre><code>chunk1 | |||
| chunk2 | chunk2 | |||
| chunk3 | chunk3 | |||
| </code></pre> | </code></pre> | |||
| ```````````````````````````````` | ```````````````````````````````` | |||
| Any initial spaces beyond four will be included in the content, even | Any initial spaces or tabs beyond four spaces of indentation will be included in | |||
| in interior blank lines: | the content, even in interior blank lines: | |||
| ```````````````````````````````` example | ```````````````````````````````` example | |||
| chunk1 | chunk1 | |||
| chunk2 | chunk2 | |||
| . | . | |||
| <pre><code>chunk1 | <pre><code>chunk1 | |||
| chunk2 | chunk2 | |||
| </code></pre> | </code></pre> | |||
| skipping to change at line 1442 ¶ | skipping to change at line 1758 ¶ | |||
| ```````````````````````````````` example | ```````````````````````````````` example | |||
| Foo | Foo | |||
| bar | bar | |||
| . | . | |||
| <p>Foo | <p>Foo | |||
| bar</p> | bar</p> | |||
| ```````````````````````````````` | ```````````````````````````````` | |||
| However, any non-blank line with fewer than four leading spaces ends | However, any non-blank line with fewer than four spaces of indentation ends | |||
| the code block immediately. So a paragraph may occur immediately | the code block immediately. So a paragraph may occur immediately | |||
| after indented code: | after indented code: | |||
| ```````````````````````````````` example | ```````````````````````````````` example | |||
| foo | foo | |||
| bar | bar | |||
| . | . | |||
| <pre><code>foo | <pre><code>foo | |||
| </code></pre> | </code></pre> | |||
| <p>bar</p> | <p>bar</p> | |||
| skipping to change at line 1475 ¶ | skipping to change at line 1791 ¶ | |||
| . | . | |||
| <h1>Heading</h1> | <h1>Heading</h1> | |||
| <pre><code>foo | <pre><code>foo | |||
| </code></pre> | </code></pre> | |||
| <h2>Heading</h2> | <h2>Heading</h2> | |||
| <pre><code>foo | <pre><code>foo | |||
| </code></pre> | </code></pre> | |||
| <hr /> | <hr /> | |||
| ```````````````````````````````` | ```````````````````````````````` | |||
| The first line can be indented more than four spaces: | The first line can be preceded by more than four spaces of indentation: | |||
| ```````````````````````````````` example | ```````````````````````````````` example | |||
| foo | foo | |||
| bar | bar | |||
| . | . | |||
| <pre><code> foo | <pre><code> foo | |||
| bar | bar | |||
| </code></pre> | </code></pre> | |||
| ```````````````````````````````` | ```````````````````````````````` | |||
| skipping to change at line 1498 ¶ | skipping to change at line 1814 ¶ | |||
| ```````````````````````````````` example | ```````````````````````````````` example | |||
| foo | foo | |||
| . | . | |||
| <pre><code>foo | <pre><code>foo | |||
| </code></pre> | </code></pre> | |||
| ```````````````````````````````` | ```````````````````````````````` | |||
| Trailing spaces are included in the code block's content: | Trailing spaces or tabs are included in the code block's content: | |||
| ```````````````````````````````` example | ```````````````````````````````` example | |||
| foo | foo | |||
| . | . | |||
| <pre><code>foo | <pre><code>foo | |||
| </code></pre> | </code></pre> | |||
| ```````````````````````````````` | ```````````````````````````````` | |||
| ## Fenced code blocks | ## Fenced code blocks | |||
| A [code fence](@) is a sequence | A [code fence](@) is a sequence | |||
| of at least three consecutive backtick characters (`` ` ``) or | of at least three consecutive backtick characters (`` ` ``) or | |||
| tildes (`~`). (Tildes and backticks cannot be mixed.) | tildes (`~`). (Tildes and backticks cannot be mixed.) | |||
| A [fenced code block](@) | A [fenced code block](@) | |||
| begins with a code fence, indented no more than three spaces. | begins with a code fence, preceded by up to three spaces of indentation. | |||
| The line with the opening code fence may optionally contain some text | The line with the opening code fence may optionally contain some text | |||
| following the code fence; this is trimmed of leading and trailing | following the code fence; this is trimmed of leading and trailing | |||
| whitespace and called the [info string](@). If the [info string] comes | spaces or tabs and called the [info string](@). If the [info string] comes | |||
| after a backtick fence, it may not contain any backtick | after a backtick fence, it may not contain any backtick | |||
| characters. (The reason for this restriction is that otherwise | characters. (The reason for this restriction is that otherwise | |||
| some inline code would be incorrectly interpreted as the | some inline code would be incorrectly interpreted as the | |||
| beginning of a fenced code block.) | beginning of a fenced code block.) | |||
| The content of the code block consists of all subsequent lines, until | The content of the code block consists of all subsequent lines, until | |||
| a closing [code fence] of the same type as the code block | a closing [code fence] of the same type as the code block | |||
| began with (backticks or tildes), and with at least as many backticks | began with (backticks or tildes), and with at least as many backticks | |||
| or tildes as the opening code fence. If the leading code fence is | or tildes as the opening code fence. If the leading code fence is | |||
| indented N spaces, then up to N spaces of indentation are removed from | preceded by N spaces of indentation, then up to N spaces of indentation are | |||
| each line of the content (if present). (If a content line is not | removed from each line of the content (if present). (If a content line is not | |||
| indented, it is preserved unchanged. If it is indented less than N | indented, it is preserved unchanged. If it is indented N spaces or less, all | |||
| spaces, all of the indentation is removed.) | of the indentation is removed.) | |||
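The indentation rule above amounts to removing at most N leading spaces from each content line. A minimal sketch, with `strip_fence_indent` as a hypothetical helper name and `n` standing for the indentation of the opening fence:

```python
def strip_fence_indent(line, n):
    """Remove up to n leading spaces from a fenced-code content line."""
    i = 0
    while i < n and i < len(line) and line[i] == " ":
        i += 1
    return line[i:]
```

With an opening fence indented three spaces, the content lines `   aaa`, `    aaa`, and `  aaa` become `aaa`, ` aaa`, and `aaa`.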
| The closing code fence may be indented up to three spaces, and may be | The closing code fence may be preceded by up to three spaces of indentation, and | |||
| followed only by spaces, which are ignored. If the end of the | may be followed only by spaces or tabs, which are ignored. If the end of the | |||
| containing block (or document) is reached and no closing code fence | containing block (or document) is reached and no closing code fence | |||
| has been found, the code block contains all of the lines after the | has been found, the code block contains all of the lines after the | |||
| opening code fence until the end of the containing block (or | opening code fence until the end of the containing block (or | |||
| document). (An alternative spec would require backtracking in the | document). (An alternative spec would require backtracking in the | |||
| event that a closing code fence is not found. But this makes parsing | event that a closing code fence is not found. But this makes parsing | |||
| much less efficient, and there seems to be no real downside to the | much less efficient, and there seems to be no real downside to the | |||
| behavior described here.) | behavior described here.) | |||
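Opening and closing fences, as described above (0.30 wording), could be recognised along these lines. The function names and the shape of the returned tuple are assumptions made for illustration. Lines are assumed to have no line ending and space-only indentation, and the info string is returned without backslash-escape or entity processing.

```python
import re

_OPENING = re.compile(r"^( {0,3})(`{3,}|~{3,})(.*)$")
_CLOSING = re.compile(r"^ {0,3}(`{3,}|~{3,})[ \t]*$")


def parse_opening_fence(line):
    """Return (indent, fence_char, fence_length, info_string) or None."""
    m = _OPENING.match(line)
    if m is None:
        return None
    indent, fence, info = m.groups()
    info = info.strip(" \t")
    # After a backtick fence the info string may not contain backticks.
    if fence[0] == "`" and "`" in info:
        return None
    return len(indent), fence[0], len(fence), info


def is_closing_fence(line, fence_char, min_length):
    """True if line closes a block opened with min_length fence_char's."""
    m = _CLOSING.match(line)
    if m is None:
        return False
    fence = m.group(1)
    return fence[0] == fence_char and len(fence) >= min_length
```

If no line satisfies `is_closing_fence`, the block simply runs to the end of the containing block or document, so no backtracking is needed.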
| A fenced code block may interrupt a paragraph, and does not require | A fenced code block may interrupt a paragraph, and does not require | |||
| a blank line either before or after. | a blank line either before or after. | |||
| skipping to change at line 1732 ¶ | skipping to change at line 2048 ¶ | |||
| aaa | aaa | |||
| aaa | aaa | |||
| ``` | ``` | |||
| . | . | |||
| <pre><code>aaa | <pre><code>aaa | |||
| aaa | aaa | |||
| aaa | aaa | |||
| </code></pre> | </code></pre> | |||
| ```````````````````````````````` | ```````````````````````````````` | |||
| Four spaces indentation produces an indented code block: | Four spaces of indentation is too many: | |||
| ```````````````````````````````` example | ```````````````````````````````` example | |||
| ``` | ``` | |||
| aaa | aaa | |||
| ``` | ``` | |||
| . | . | |||
| <pre><code>``` | <pre><code>``` | |||
| aaa | aaa | |||
| ``` | ``` | |||
| </code></pre> | </code></pre> | |||
| ```````````````````````````````` | ```````````````````````````````` | |||
| Closing fences may be indented by 0-3 spaces, and their indentation | Closing fences may be preceded by up to three spaces of indentation, and their | |||
| need not match that of the opening fence: | indentation need not match that of the opening fence: | |||
| ```````````````````````````````` example | ```````````````````````````````` example | |||
| ``` | ``` | |||
| aaa | aaa | |||
| ``` | ``` | |||
| . | . | |||
| <pre><code>aaa | <pre><code>aaa | |||
| </code></pre> | </code></pre> | |||
| ```````````````````````````````` | ```````````````````````````````` | |||
| skipping to change at line 1778 ¶ | skipping to change at line 2094 ¶ | |||
| ```````````````````````````````` example | ```````````````````````````````` example | |||
| ``` | ``` | |||
| aaa | aaa | |||
| ``` | ``` | |||
| . | . | |||
| <pre><code>aaa | <pre><code>aaa | |||
| ``` | ``` | |||
| </code></pre> | </code></pre> | |||
| ```````````````````````````````` | ```````````````````````````````` | |||
| Code fences (opening and closing) cannot contain internal spaces: | Code fences (opening and closing) cannot contain internal spaces or tabs: | |||
| ```````````````````````````````` example | ```````````````````````````````` example | |||
| ``` ``` | ``` ``` | |||
| aaa | aaa | |||
| . | . | |||
| <p><code> </code> | <p><code> </code> | |||
| aaa</p> | aaa</p> | |||
| ```````````````````````````````` | ```````````````````````````````` | |||
| ```````````````````````````````` example | ```````````````````````````````` example | |||
| skipping to change at line 1910 ¶ | skipping to change at line 2226 ¶ | |||
| </code></pre> | </code></pre> | |||
| ```````````````````````````````` | ```````````````````````````````` | |||
| ## HTML blocks | ## HTML blocks | |||
| An [HTML block](@) is a group of lines that is treated | An [HTML block](@) is a group of lines that is treated | |||
| as raw HTML (and will not be escaped in HTML output). | as raw HTML (and will not be escaped in HTML output). | |||
| There are seven kinds of [HTML block], which can be defined by their | There are seven kinds of [HTML block], which can be defined by their | |||
| start and end conditions. The block begins with a line that meets a | start and end conditions. The block begins with a line that meets a | |||
| [start condition](@) (after up to three spaces optional indentation). | [start condition](@) (after up to three optional spaces of indentation). | |||
| It ends with the first subsequent line that meets a matching [end | It ends with the first subsequent line that meets a matching | |||
| condition](@), or the last line of the document, or the last line of | [end condition](@), or the last line of the document, or the last line of | |||
| the [container block](#container-blocks) containing the current HTML | the [container block](#container-blocks) containing the current HTML | |||
| block, if no line is encountered that meets the [end condition]. If | block, if no line is encountered that meets the [end condition]. If | |||
| the first line meets both the [start condition] and the [end | the first line meets both the [start condition] and the [end | |||
| condition], the block will contain just that line. | condition], the block will contain just that line. | |||
| 1. **Start condition:** line begins with the string `<script`, | 1. **Start condition:** line begins with the string `<pre`, | |||
| `<pre`, or `<style` (case-insensitive), followed by whitespace, | `<script`, `<style`, or `<textarea` (case-insensitive), followed by a space, | |||
| the string `>`, or the end of the line.\ | a tab, the string `>`, or the end of the line.\ | |||
| **End condition:** line contains an end tag | **End condition:** line contains an end tag | |||
| `</script>`, `</pre>`, or `</style>` (case-insensitive; it | `</pre>`, `</script>`, `</style>`, or `</textarea>` (case-insensitive; it | |||
| need not match the start tag). | need not match the start tag). | |||
| 2. **Start condition:** line begins with the string `<!--`.\ | 2. **Start condition:** line begins with the string `<!--`.\ | |||
| **End condition:** line contains the string `-->`. | **End condition:** line contains the string `-->`. | |||
| 3. **Start condition:** line begins with the string `<?`.\ | 3. **Start condition:** line begins with the string `<?`.\ | |||
| **End condition:** line contains the string `?>`. | **End condition:** line contains the string `?>`. | |||
| 4. **Start condition:** line begins with the string `<!` | 4. **Start condition:** line begins with the string `<!` | |||
| followed by an uppercase ASCII letter.\ | followed by an ASCII letter.\ | |||
| **End condition:** line contains the character `>`. | **End condition:** line contains the character `>`. | |||
| 5. **Start condition:** line begins with the string | 5. **Start condition:** line begins with the string | |||
| `<![CDATA[`.\ | `<![CDATA[`.\ | |||
| **End condition:** line contains the string `]]>`. | **End condition:** line contains the string `]]>`. | |||
| 6. **Start condition:** line begins with the string `<` or `</` | 6. **Start condition:** line begins with the string `<` or `</` | |||
| followed by one of the strings (case-insensitive) `address`, | followed by one of the strings (case-insensitive) `address`, | |||
| `article`, `aside`, `base`, `basefont`, `blockquote`, `body`, | `article`, `aside`, `base`, `basefont`, `blockquote`, `body`, | |||
| `caption`, `center`, `col`, `colgroup`, `dd`, `details`, `dialog`, | `caption`, `center`, `col`, `colgroup`, `dd`, `details`, `dialog`, | |||
| `dir`, `div`, `dl`, `dt`, `fieldset`, `figcaption`, `figure`, | `dir`, `div`, `dl`, `dt`, `fieldset`, `figcaption`, `figure`, | |||
| `footer`, `form`, `frame`, `frameset`, | `footer`, `form`, `frame`, `frameset`, | |||
| `h1`, `h2`, `h3`, `h4`, `h5`, `h6`, `head`, `header`, `hr`, | `h1`, `h2`, `h3`, `h4`, `h5`, `h6`, `head`, `header`, `hr`, | |||
| `html`, `iframe`, `legend`, `li`, `link`, `main`, `menu`, `menuitem`, | `html`, `iframe`, `legend`, `li`, `link`, `main`, `menu`, `menuitem`, | |||
| `nav`, `noframes`, `ol`, `optgroup`, `option`, `p`, `param`, | `nav`, `noframes`, `ol`, `optgroup`, `option`, `p`, `param`, | |||
| `section`, `source`, `summary`, `table`, `tbody`, `td`, | `section`, `source`, `summary`, `table`, `tbody`, `td`, | |||
| `tfoot`, `th`, `thead`, `title`, `tr`, `track`, `ul`, followed | `tfoot`, `th`, `thead`, `title`, `tr`, `track`, `ul`, followed | |||
| by [whitespace], the end of the line, the string `>`, or | by a space, a tab, the end of the line, the string `>`, or | |||
| the string `/>`.\ | the string `/>`.\ | |||
| **End condition:** line is followed by a [blank line]. | **End condition:** line is followed by a [blank line]. | |||
| 7. **Start condition:** line begins with a complete [open tag] | 7. **Start condition:** line begins with a complete [open tag] | |||
| (with any [tag name] other than `script`, | (with any [tag name] other than `pre`, `script`, | |||
| `style`, or `pre`) or a complete [closing tag], | `style`, or `textarea`) or a complete [closing tag], | |||
| followed only by [whitespace] or the end of the line.\ | followed by zero or more spaces and tabs, followed by the end of the line.\ | |||
| **End condition:** line is followed by a [blank line]. | **End condition:** line is followed by a [blank line]. | |||
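Start and end conditions 1 to 5 from the list above translate fairly directly into pairs of regular expressions. The sketch below follows the right-hand (0.30) wording. The names `html_block_type` and `meets_end_condition` are hypothetical. Types 6 and 7 are left out because they need the block-level tag-name list and the full [open tag] and [closing tag] grammar, and they end at a blank line rather than at a pattern match.

```python
import re

# kind -> (start condition, end condition), per the 0.30 column above.
_CONDITIONS = {
    1: (re.compile(r"^<(?:pre|script|style|textarea)(?:[ \t>]|$)", re.I),
        re.compile(r"</(?:pre|script|style|textarea)>", re.I)),
    2: (re.compile(r"^<!--"), re.compile(r"-->")),
    3: (re.compile(r"^<\?"), re.compile(r"\?>")),
    4: (re.compile(r"^<![A-Za-z]"), re.compile(r">")),
    5: (re.compile(r"^<!\[CDATA\["), re.compile(r"\]\]>")),
}


def html_block_type(line):
    """Return the HTML block type (1-5) that this line starts, or None."""
    # Up to three spaces of indentation are allowed before the tag.
    body = re.sub(r"^ {0,3}", "", line)
    for kind, (start, _end) in _CONDITIONS.items():
        if start.search(body):
            return kind
    return None


def meets_end_condition(kind, line):
    """True if the line satisfies the end condition for the given type."""
    return _CONDITIONS[kind][1].search(line) is not None
```

A line such as `<!-- foo -->` meets both the start and end condition for type 2, so the block contains just that line.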
| HTML blocks continue until they are closed by their appropriate | HTML blocks continue until they are closed by their appropriate | |||
| [end condition], or the last line of the document or other [container | [end condition], or the last line of the document or other [container | |||
| block](#container-blocks). This means any HTML **within an HTML | block](#container-blocks). This means any HTML **within an HTML | |||
| block** that might otherwise be recognised as a start condition will | block** that might otherwise be recognised as a start condition will | |||
| be ignored by the parser and passed through as-is, without changing | be ignored by the parser and passed through as-is, without changing | |||
| the parser's state. | the parser's state. | |||
| For instance, `<pre>` within a HTML block started by `<table>` will not affect | For instance, `<pre>` within an HTML block started by `<table>` will not affect | |||
| the parser state; as the HTML block was started by start condition 6, it | the parser state; as the HTML block was started by start condition 6, it | |||
| will end at any blank line. This can be surprising: | will end at any blank line. This can be surprising: | |||
| ```````````````````````````````` example | ```````````````````````````````` example | |||
| <table><tr><td> | <table><tr><td> | |||
| <pre> | <pre> | |||
| **Hello**, | **Hello**, | |||
| _world_. | _world_. | |||
| </pre> | </pre> | |||
| </td></tr></table> | </td></tr></table> | |||
| . | . | |||
| <table><tr><td> | <table><tr><td> | |||
| <pre> | <pre> | |||
| **Hello**, | **Hello**, | |||
| <p><em>world</em>. | <p><em>world</em>. | |||
| </pre></p> | </pre></p> | |||
| </td></tr></table> | </td></tr></table> | |||
| ```````````````````````````````` | ```````````````````````````````` | |||
| In this case, the HTML block is terminated by the newline — the `**Hello**` | In this case, the HTML block is terminated by the blank line — the `**Hello**` | |||
| text remains verbatim — and regular parsing resumes, with a paragraph, | text remains verbatim — and regular parsing resumes, with a paragraph, | |||
| emphasised `world` and inline and block HTML following. | emphasised `world` and inline and block HTML following. | |||
| All types of [HTML blocks] except type 7 may interrupt | All types of [HTML blocks] except type 7 may interrupt | |||
| a paragraph. Blocks of type 7 may not interrupt a paragraph. | a paragraph. Blocks of type 7 may not interrupt a paragraph. | |||
| (This restriction is intended to prevent unwanted interpretation | (This restriction is intended to prevent unwanted interpretation | |||
| of long tags inside a wrapped paragraph as starting HTML blocks.) | of long tags inside a wrapped paragraph as starting HTML blocks.) | |||
| Some simple examples follow. Here are some basic HTML blocks | Some simple examples follow. Here are some basic HTML blocks | |||
| of type 6: | of type 6: | |||
| skipping to change at line 2245 ¶ | skipping to change at line 2561 ¶ | |||
| the tag is not on a line by itself, we get inline HTML | the tag is not on a line by itself, we get inline HTML | |||
| rather than an [HTML block].) | rather than an [HTML block].) | |||
| ```````````````````````````````` example | ```````````````````````````````` example | |||
| <del>*foo*</del> | <del>*foo*</del> | |||
| . | . | |||
| <p><del><em>foo</em></del></p> | <p><del><em>foo</em></del></p> | |||
| ```````````````````````````````` | ```````````````````````````````` | |||
| HTML tags designed to contain literal content | HTML tags designed to contain literal content | |||
| (`script`, `style`, `pre`), comments, processing instructions, | (`pre`, `script`, `style`, `textarea`), comments, processing instructions, | |||
| and declarations are treated somewhat differently. | and declarations are treated somewhat differently. | |||
| Instead of ending at the first blank line, these blocks | Instead of ending at the first blank line, these blocks | |||
| end at the first line containing a corresponding end tag. | end at the first line containing a corresponding end tag. | |||
| As a result, these blocks can contain blank lines: | As a result, these blocks can contain blank lines: | |||
| A pre tag (type 1): | A pre tag (type 1): | |||
| ```````````````````````````````` example | ```````````````````````````````` example | |||
| <pre language="haskell"><code> | <pre language="haskell"><code> | |||
| import Text.HTML.TagSoup | import Text.HTML.TagSoup | |||
| skipping to change at line 2289 ¶ | skipping to change at line 2605 ¶ | |||
| okay | okay | |||
| . | . | |||
| <script type="text/javascript"> | <script type="text/javascript"> | |||
| // JavaScript example | // JavaScript example | |||
| document.getElementById("demo").innerHTML = "Hello JavaScript!"; | document.getElementById("demo").innerHTML = "Hello JavaScript!"; | |||
| </script> | </script> | |||
| <p>okay</p> | <p>okay</p> | |||
| ```````````````````````````````` | ```````````````````````````````` | |||
| A textarea tag (type 1): | ||||
| ```````````````````````````````` example | ||||
| <textarea> | ||||
| *foo* | ||||
| _bar_ | ||||
| </textarea> | ||||
| . | ||||
| <textarea> | ||||
| *foo* | ||||
| _bar_ | ||||
| </textarea> | ||||
| ```````````````````````````````` | ||||
| A style tag (type 1): | A style tag (type 1): | |||
| ```````````````````````````````` example | ```````````````````````````````` example | |||
| <style | <style | |||
| type="text/css"> | type="text/css"> | |||
| h1 {color:red;} | h1 {color:red;} | |||
| p {color:blue;} | p {color:blue;} | |||
| </style> | </style> | |||
| okay | okay | |||
| skipping to change at line 2455 ¶ | skipping to change at line 2791 ¶ | |||
| } else { | } else { | |||
| return 0; | return 0; | |||
| } | } | |||
| } | } | |||
| ]]> | ]]> | |||
| <p>okay</p> | <p>okay</p> | |||
| ```````````````````````````````` | ```````````````````````````````` | |||
| The opening tag can be indented 1-3 spaces, but not 4: | The opening tag can be preceded by up to three spaces of indentation, but not | |||
| four: | ||||
| ```````````````````````````````` example | ```````````````````````````````` example | |||
| <!-- foo --> | <!-- foo --> | |||
| <!-- foo --> | <!-- foo --> | |||
| . | . | |||
| <!-- foo --> | <!-- foo --> | |||
| <pre><code><!-- foo --> | <pre><code><!-- foo --> | |||
| </code></pre> | </code></pre> | |||
| ```````````````````````````````` | ```````````````````````````````` | |||
| skipping to change at line 2526 ¶ | skipping to change at line 2863 ¶ | |||
| <a href="bar"> | <a href="bar"> | |||
| baz</p> | baz</p> | |||
| ```````````````````````````````` | ```````````````````````````````` | |||
| This rule differs from John Gruber's original Markdown syntax | This rule differs from John Gruber's original Markdown syntax | |||
| specification, which says: | specification, which says: | |||
| > The only restrictions are that block-level HTML elements — | > The only restrictions are that block-level HTML elements — | |||
| > e.g. `<div>`, `<table>`, `<pre>`, `<p>`, etc. — must be separated from | > e.g. `<div>`, `<table>`, `<pre>`, `<p>`, etc. — must be separated from | |||
| > surrounding content by blank lines, and the start and end tags of the | > surrounding content by blank lines, and the start and end tags of the | |||
| > block should not be indented with tabs or spaces. | > block should not be indented with spaces or tabs. | |||
| In some ways Gruber's rule is more restrictive than the one given | In some ways Gruber's rule is more restrictive than the one given | |||
| here: | here: | |||
| - It requires that an HTML block be preceded by a blank line. | - It requires that an HTML block be preceded by a blank line. | |||
| - It does not allow the start tag to be indented. | - It does not allow the start tag to be indented. | |||
| - It requires a matching end tag, which it also does not allow to | - It requires a matching end tag, which it also does not allow to | |||
| be indented. | be indented. | |||
| Most Markdown implementations (including some of Gruber's own) do not | Most Markdown implementations (including some of Gruber's own) do not | |||
| skipping to change at line 2640 ¶ | skipping to change at line 2977 ¶ | |||
| ```````````````````````````````` | ```````````````````````````````` | |||
| Fortunately, blank lines are usually not necessary and can be | Fortunately, blank lines are usually not necessary and can be | |||
| deleted. The exception is inside `<pre>` tags, but as described | deleted. The exception is inside `<pre>` tags, but as described | |||
| [above][HTML blocks], raw HTML blocks starting with `<pre>` | [above][HTML blocks], raw HTML blocks starting with `<pre>` | |||
| *can* contain blank lines. | *can* contain blank lines. | |||
| ## Link reference definitions | ## Link reference definitions | |||
| A [link reference definition](@) | A [link reference definition](@) | |||
| consists of a [link label], indented up to three spaces, followed | consists of a [link label], optionally preceded by up to three spaces of | |||
| by a colon (`:`), optional [whitespace] (including up to one | indentation, followed | |||
| by a colon (`:`), optional spaces or tabs (including up to one | ||||
| [line ending]), a [link destination], | [line ending]), a [link destination], | |||
| optional [whitespace] (including up to one | optional spaces or tabs (including up to one | |||
| [line ending]), and an optional [link | [line ending]), and an optional [link | |||
| title], which if it is present must be separated | title], which if it is present must be separated | |||
| from the [link destination] by [whitespace]. | from the [link destination] by spaces or tabs. | |||
| No further [non-whitespace characters] may occur on the line. | No further character may occur. | |||
| A [link reference definition] | A [link reference definition] | |||
| does not correspond to a structural element of a document. Instead, it | does not correspond to a structural element of a document. Instead, it | |||
| defines a label which can be used in [reference links] | defines a label which can be used in [reference links] | |||
| and reference-style [images] elsewhere in the document. [Link | and reference-style [images] elsewhere in the document. [Link | |||
| reference definitions] can come either before or after the links that use | reference definitions] can come either before or after the links that use | |||
| them. | them. | |||
| ```````````````````````````````` example | ```````````````````````````````` example | |||
| [foo]: /url "title" | [foo]: /url "title" | |||
| skipping to change at line 2758 ¶ | skipping to change at line 3096 ¶ | |||
| ```````````````````````````````` example | ```````````````````````````````` example | |||
| [foo]: <> | [foo]: <> | |||
| [foo] | [foo] | |||
| . | . | |||
| <p><a href="">foo</a></p> | <p><a href="">foo</a></p> | |||
| ```````````````````````````````` | ```````````````````````````````` | |||
| The title must be separated from the link destination by | The title must be separated from the link destination by | |||
| whitespace: | spaces or tabs: | |||
| ```````````````````````````````` example | ```````````````````````````````` example | |||
| [foo]: <bar>(baz) | [foo]: <bar>(baz) | |||
| [foo] | [foo] | |||
| . | . | |||
| <p>[foo]: <bar>(baz)</p> | <p>[foo]: <bar>(baz)</p> | |||
| <p>[foo]</p> | <p>[foo]</p> | |||
| ```````````````````````````````` | ```````````````````````````````` | |||
| skipping to change at line 2821 ¶ | skipping to change at line 3159 ¶ | |||
| ```````````````````````````````` | ```````````````````````````````` | |||
| ```````````````````````````````` example | ```````````````````````````````` example | |||
| [ΑΓΩ]: /φου | [ΑΓΩ]: /φου | |||
| [αγω] | [αγω] | |||
| . | . | |||
| <p><a href="/%CF%86%CE%BF%CF%85">αγω</a></p> | <p><a href="/%CF%86%CE%BF%CF%85">αγω</a></p> | |||
| ```````````````````````````````` | ```````````````````````````````` | |||
| Here is a link reference definition with no corresponding link. | Whether something is a [link reference definition] is | |||
| It contributes nothing to the document. | independent of whether the link reference it defines is | |||
| used in the document. Thus, for example, the following | ||||
| document contains just a link reference definition, and | ||||
| no visible content: | ||||
| ```````````````````````````````` example | ```````````````````````````````` example | |||
| [foo]: /url | [foo]: /url | |||
| . | . | |||
| ```````````````````````````````` | ```````````````````````````````` | |||
| Here is another one: | Here is another one: | |||
| ```````````````````````````````` example | ```````````````````````````````` example | |||
| [ | [ | |||
| foo | foo | |||
| ]: /url | ]: /url | |||
| bar | bar | |||
| . | . | |||
| <p>bar</p> | <p>bar</p> | |||
| ```````````````````````````````` | ```````````````````````````````` | |||
| This is not a link reference definition, because there are | This is not a link reference definition, because there are | |||
| [non-whitespace characters] after the title: | characters other than spaces or tabs after the title: | |||
| ```````````````````````````````` example | ```````````````````````````````` example | |||
| [foo]: /url "title" ok | [foo]: /url "title" ok | |||
| . | . | |||
| <p>[foo]: /url "title" ok</p> | <p>[foo]: /url "title" ok</p> | |||
| ```````````````````````````````` | ```````````````````````````````` | |||
| This is a link reference definition, but it has no title: | This is a link reference definition, but it has no title: | |||
| ```````````````````````````````` example | ```````````````````````````````` example | |||
| skipping to change at line 2965 ¶ | skipping to change at line 3306 ¶ | |||
| ```````````````````````````````` example | ```````````````````````````````` example | |||
| [foo] | [foo] | |||
| > [foo]: /url | > [foo]: /url | |||
| . | . | |||
| <p><a href="/url">foo</a></p> | <p><a href="/url">foo</a></p> | |||
| <blockquote> | <blockquote> | |||
| </blockquote> | </blockquote> | |||
| ```````````````````````````````` | ```````````````````````````````` | |||
| Whether something is a [link reference definition] is | ||||
| independent of whether the link reference it defines is | ||||
| used in the document. Thus, for example, the following | ||||
| document contains just a link reference definition, and | ||||
| no visible content: | ||||
| ```````````````````````````````` example | ||||
| [foo]: /url | ||||
| . | ||||
| ```````````````````````````````` | ||||
| ## Paragraphs | ## Paragraphs | |||
| A sequence of non-blank lines that cannot be interpreted as other | A sequence of non-blank lines that cannot be interpreted as other | |||
| kinds of blocks forms a [paragraph](@). | kinds of blocks forms a [paragraph](@). | |||
| The contents of the paragraph are the result of parsing the | The contents of the paragraph are the result of parsing the | |||
| paragraph's raw content as inlines. The paragraph's raw content | paragraph's raw content as inlines. The paragraph's raw content | |||
| is formed by concatenating the lines and removing initial and final | is formed by concatenating the lines and removing initial and final | |||
| [whitespace]. | spaces or tabs. | |||
| A simple example with two paragraphs: | A simple example with two paragraphs: | |||
| ```````````````````````````````` example | ```````````````````````````````` example | |||
| aaa | aaa | |||
| bbb | bbb | |||
| . | . | |||
| <p>aaa</p> | <p>aaa</p> | |||
| <p>bbb</p> | <p>bbb</p> | |||
| skipping to change at line 3011 ¶ | skipping to change at line 3341 ¶ | |||
| ccc | ccc | |||
| ddd | ddd | |||
| . | . | |||
| <p>aaa | <p>aaa | |||
| bbb</p> | bbb</p> | |||
| <p>ccc | <p>ccc | |||
| ddd</p> | ddd</p> | |||
| ```````````````````````````````` | ```````````````````````````````` | |||
| Multiple blank lines between paragraph have no effect: | Multiple blank lines between paragraphs have no effect: | |||
| ```````````````````````````````` example | ```````````````````````````````` example | |||
| aaa | aaa | |||
| bbb | bbb | |||
| . | . | |||
| <p>aaa</p> | <p>aaa</p> | |||
| <p>bbb</p> | <p>bbb</p> | |||
| ```````````````````````````````` | ```````````````````````````````` | |||
| Leading spaces are skipped: | Leading spaces or tabs are skipped: | |||
| ```````````````````````````````` example | ```````````````````````````````` example | |||
| aaa | aaa | |||
| bbb | bbb | |||
| . | . | |||
| <p>aaa | <p>aaa | |||
| bbb</p> | bbb</p> | |||
| ```````````````````````````````` | ```````````````````````````````` | |||
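The rule for a paragraph's raw content can be sketched in a few lines of Python. This is only an illustration, not part of the spec: the function name `paragraph_raw_content` is hypothetical, and it assumes the caller has already decided that the given lines form a single paragraph (for example, that the first line is not indented four or more spaces).

```python
def paragraph_raw_content(lines):
    """Return the raw content of a paragraph given its lines.

    Assumption: `lines` is a list of strings without line endings that
    are already known to belong to one paragraph.
    """
    # Leading spaces or tabs on each line are skipped.
    stripped = [line.lstrip(" \t") for line in lines]
    # Final spaces or tabs of the paragraph are removed; trailing spaces
    # on earlier lines are kept, so they can still form hard line breaks.
    return "\n".join(stripped).rstrip(" \t")
```

The result is what would then be parsed as inlines; trailing spaces on the last line are gone, which is why a paragraph cannot end with a hard line break.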
| Lines after the first may be indented any amount, since indented | Lines after the first may be indented any amount, since indented | |||
| skipping to change at line 3045 ¶ | skipping to change at line 3375 ¶ | |||
| ```````````````````````````````` example | ```````````````````````````````` example | |||
| aaa | aaa | |||
| bbb | bbb | |||
| ccc | ccc | |||
| . | . | |||
| <p>aaa | <p>aaa | |||
| bbb | bbb | |||
| ccc</p> | ccc</p> | |||
| ```````````````````````````````` | ```````````````````````````````` | |||
| However, the first line may be indented at most three spaces, | However, the first line may be preceded by up to three spaces of indentation. | |||
| or an indented code block will be triggered: | Four spaces of indentation is too many: | |||
| ```````````````````````````````` example | ```````````````````````````````` example | |||
| aaa | aaa | |||
| bbb | bbb | |||
| . | . | |||
| <p>aaa | <p>aaa | |||
| bbb</p> | bbb</p> | |||
| ```````````````````````````````` | ```````````````````````````````` | |||
| ```````````````````````````````` example | ```````````````````````````````` example | |||
| aaa | aaa | |||
| bbb | bbb | |||
| . | . | |||
| <pre><code>aaa | <pre><code>aaa | |||
| </code></pre> | </code></pre> | |||
| <p>bbb</p> | <p>bbb</p> | |||
| ```````````````````````````````` | ```````````````````````````````` | |||
| Final spaces are stripped before inline parsing, so a paragraph | Final spaces or tabs are stripped before inline parsing, so a paragraph | |||
| that ends with two or more spaces will not end with a [hard line | that ends with two or more spaces will not end with a [hard line | |||
| break]: | break]: | |||
| ```````````````````````````````` example | ```````````````````````````````` example | |||
| aaa | aaa | |||
| bbb | bbb | |||
| . | . | |||
| <p>aaa<br /> | <p>aaa<br /> | |||
| bbb</p> | bbb</p> | |||
| ```````````````````````````````` | ```````````````````````````````` | |||
| skipping to change at line 3118 ¶ | skipping to change at line 3448 ¶ | |||
| > with these blocks as its content. | > with these blocks as its content. | |||
| So, we explain what counts as a block quote or list item by explaining | So, we explain what counts as a block quote or list item by explaining | |||
| how these can be *generated* from their contents. This should suffice | how these can be *generated* from their contents. This should suffice | |||
| to define the syntax, although it does not give a recipe for *parsing* | to define the syntax, although it does not give a recipe for *parsing* | |||
| these constructions. (A recipe is provided below in the section entitled | these constructions. (A recipe is provided below in the section entitled | |||
| [A parsing strategy](#appendix-a-parsing-strategy).) | [A parsing strategy](#appendix-a-parsing-strategy).) | |||
| ## Block quotes | ## Block quotes | |||
| A [block quote marker](@) | A [block quote marker](@), | |||
| consists of 0-3 spaces of initial indent, plus (a) the character `>` together | optionally preceded by up to three spaces of indentation, | |||
| with a following space, or (b) a single character `>` not followed by a space. | consists of (a) the character `>` together with a following space of | |||
| indentation, or (b) a single character `>` not followed by a space of | ||||
| indentation. | ||||
| The following rules define [block quotes]: | The following rules define [block quotes]: | |||
| 1. **Basic case.** If a string of lines *Ls* constitute a sequence | 1. **Basic case.** If a string of lines *Ls* constitute a sequence | |||
| of blocks *Bs*, then the result of prepending a [block quote | of blocks *Bs*, then the result of prepending a [block quote | |||
| marker] to the beginning of each line in *Ls* | marker] to the beginning of each line in *Ls* | |||
| is a [block quote](#block-quotes) containing *Bs*. | is a [block quote](#block-quotes) containing *Bs*. | |||
| 2. **Laziness.** If a string of lines *Ls* constitute a [block | 2. **Laziness.** If a string of lines *Ls* constitute a [block | |||
| quote](#block-quotes) with contents *Bs*, then the result of deleting | quote](#block-quotes) with contents *Bs*, then the result of deleting | |||
| the initial [block quote marker] from one or | the initial [block quote marker] from one or | |||
| more lines in which the next [non-whitespace character] after the [block | more lines in which the next character other than a space or tab after the | |||
| quote marker] is [paragraph continuation | [block quote marker] is [paragraph continuation | |||
| text] is a block quote with *Bs* as its content. | text] is a block quote with *Bs* as its content. | |||
| [Paragraph continuation text](@) is text | [Paragraph continuation text](@) is text | |||
| that will be parsed as part of the content of a paragraph, but does | that will be parsed as part of the content of a paragraph, but does | |||
| not occur at the beginning of the paragraph. | not occur at the beginning of the paragraph. | |||
| 3. **Consecutiveness.** A document cannot contain two [block | 3. **Consecutiveness.** A document cannot contain two [block | |||
| quotes] in a row unless there is a [blank line] between them. | quotes] in a row unless there is a [blank line] between them. | |||
| Nothing else counts as a [block quote](#block-quotes). | Nothing else counts as a [block quote](#block-quotes). | |||
| skipping to change at line 3158 ¶ | skipping to change at line 3490 ¶ | |||
| > bar | > bar | |||
| > baz | > baz | |||
| . | . | |||
| <blockquote> | <blockquote> | |||
| <h1>Foo</h1> | <h1>Foo</h1> | |||
| <p>bar | <p>bar | |||
| baz</p> | baz</p> | |||
| </blockquote> | </blockquote> | |||
| ```````````````````````````````` | ```````````````````````````````` | |||
| The spaces after the `>` characters can be omitted: | The space or tab after the `>` characters can be omitted: | |||
| ```````````````````````````````` example | ```````````````````````````````` example | |||
| ># Foo | ># Foo | |||
| >bar | >bar | |||
| > baz | > baz | |||
| . | . | |||
| <blockquote> | <blockquote> | |||
| <h1>Foo</h1> | <h1>Foo</h1> | |||
| <p>bar | <p>bar | |||
| baz</p> | baz</p> | |||
| </blockquote> | </blockquote> | |||
| ```````````````````````````````` | ```````````````````````````````` | |||
| The `>` characters can be indented 1-3 spaces: | The `>` characters can be preceded by up to three spaces of indentation: | |||
| ```````````````````````````````` example | ```````````````````````````````` example | |||
| > # Foo | > # Foo | |||
| > bar | > bar | |||
| > baz | > baz | |||
| . | . | |||
| <blockquote> | <blockquote> | |||
| <h1>Foo</h1> | <h1>Foo</h1> | |||
| <p>bar | <p>bar | |||
| baz</p> | baz</p> | |||
| </blockquote> | </blockquote> | |||
| ```````````````````````````````` | ```````````````````````````````` | |||
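As a rough sketch of the block quote marker definition above (up to three spaces of indentation, then `>`, then an optional following space), the Python below strips one marker from a line. The helper name `strip_block_quote_marker` is hypothetical, and the sketch assumes indentation consists of spaces only; tabs, which the spec treats specially, are not handled.

```python
import re

# Up to three spaces of indentation, '>', then at most one following space.
_BLOCK_QUOTE_MARKER = re.compile(r" {0,3}> ?")


def strip_block_quote_marker(line):
    """Return the text after one block quote marker, or None if the
    line does not start with a marker (assumes spaces only, no tabs)."""
    match = _BLOCK_QUOTE_MARKER.match(line)
    if match is None:
        return None
    return line[match.end():]
```

Because the marker consumes only one space after `>`, a line such as `>` followed by five spaces and `code` leaves four spaces of indentation in the remainder, which is what an indented code block inside the quote requires.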
| Four spaces gives us a code block: | Four spaces of indentation is too many: | |||
| ```````````````````````````````` example | ```````````````````````````````` example | |||
| > # Foo | > # Foo | |||
| > bar | > bar | |||
| > baz | > baz | |||
| . | . | |||
| <pre><code>&gt; # Foo | <pre><code>&gt; # Foo | |||
| &gt; bar | &gt; bar | |||
| &gt; baz | &gt; baz | |||
| </code></pre> | </code></pre> | |||
| skipping to change at line 3500 ¶ | skipping to change at line 3832 ¶ | |||
| <p>foo | <p>foo | |||
| bar | bar | |||
| baz</p> | baz</p> | |||
| </blockquote> | </blockquote> | |||
| </blockquote> | </blockquote> | |||
| </blockquote> | </blockquote> | |||
| ```````````````````````````````` | ```````````````````````````````` | |||
| When including an indented code block in a block quote, | When including an indented code block in a block quote, | |||
| remember that the [block quote marker] includes | remember that the [block quote marker] includes | |||
| both the `>` and a following space. So *five spaces* are needed after | both the `>` and a following space of indentation. So *five spaces* are needed | |||
| the `>`: | after the `>`: | |||
| ```````````````````````````````` example | ```````````````````````````````` example | |||
| > code | > code | |||
| > not code | > not code | |||
| . | . | |||
| <blockquote> | <blockquote> | |||
| <pre><code>code | <pre><code>code | |||
| </code></pre> | </code></pre> | |||
| </blockquote> | </blockquote> | |||
| skipping to change at line 3534 ¶ | skipping to change at line 3866 ¶ | |||
| An [ordered list marker](@) | An [ordered list marker](@) | |||
| is a sequence of 1--9 arabic digits (`0-9`), followed by either a | is a sequence of 1--9 arabic digits (`0-9`), followed by either a | |||
| `.` character or a `)` character. (The reason for the length | `.` character or a `)` character. (The reason for the length | |||
| limit is that with 10 digits we start seeing integer overflows | limit is that with 10 digits we start seeing integer overflows | |||
| in some browsers.) | in some browsers.) | |||
| The following rules define [list items]: | The following rules define [list items]: | |||
| 1. **Basic case.** If a sequence of lines *Ls* constitute a sequence of | 1. **Basic case.** If a sequence of lines *Ls* constitute a sequence of | |||
| blocks *Bs* starting with a [non-whitespace character], and *M* is a | blocks *Bs* starting with a character other than a space or tab, and *M* is | |||
| list marker of width *W* followed by 1 ≤ *N* ≤ 4 spaces, then the result | a list marker of width *W* followed by 1 ≤ *N* ≤ 4 spaces of indentation, | |||
| of prepending *M* and the following spaces to the first line of | then the result of prepending *M* and the following spaces to the first line | |||
| *Ls*, and indenting subsequent lines of *Ls* by *W + N* spaces, is a | of *Ls*, and indenting subsequent lines of *Ls* by *W + N* spaces, is a | |||
| list item with *Bs* as its contents. The type of the list item | list item with *Bs* as its contents. The type of the list item | |||
| (bullet or ordered) is determined by the type of its list marker. | (bullet or ordered) is determined by the type of its list marker. | |||
| If the list item is ordered, then it is also assigned a start | If the list item is ordered, then it is also assigned a start | |||
| number, based on the ordered list marker. | number, based on the ordered list marker. | |||
| Exceptions: | Exceptions: | |||
| 1. When the first list item in a [list] interrupts | 1. When the first list item in a [list] interrupts | |||
| a paragraph---that is, when it starts on a line that would | a paragraph---that is, when it starts on a line that would | |||
| otherwise count as [paragraph continuation text]---then (a) | otherwise count as [paragraph continuation text]---then (a) | |||
| skipping to change at line 3600 ¶ | skipping to change at line 3932 ¶ | |||
| <blockquote> | <blockquote> | |||
| <p>A block quote.</p> | <p>A block quote.</p> | |||
| </blockquote> | </blockquote> | |||
| </li> | </li> | |||
| </ol> | </ol> | |||
| ```````````````````````````````` | ```````````````````````````````` | |||
| The most important thing to notice is that the position of | The most important thing to notice is that the position of | |||
| the text after the list marker determines how much indentation | the text after the list marker determines how much indentation | |||
| is needed in subsequent blocks in the list item. If the list | is needed in subsequent blocks in the list item. If the list | |||
| marker takes up two spaces, and there are three spaces between | marker takes up two spaces of indentation, and there are three spaces between | |||
| the list marker and the next [non-whitespace character], then blocks | the list marker and the next character other than a space or tab, then blocks | |||
| must be indented five spaces in order to fall under the list | must be indented five spaces in order to fall under the list | |||
| item. | item. | |||
| Here are some examples showing how far content must be indented to be | Here are some examples showing how far content must be indented to be | |||
| put under the list item: | put under the list item: | |||
| ```````````````````````````````` example | ```````````````````````````````` example | |||
| - one | - one | |||
| two | two | |||
| skipping to change at line 3658 ¶ | skipping to change at line 3990 ¶ | |||
| . | . | |||
| <ul> | <ul> | |||
| <li> | <li> | |||
| <p>one</p> | <p>one</p> | |||
| <p>two</p> | <p>two</p> | |||
| </li> | </li> | |||
| </ul> | </ul> | |||
| ```````````````````````````````` | ```````````````````````````````` | |||
| It is tempting to think of this in terms of columns: the continuation | It is tempting to think of this in terms of columns: the continuation | |||
| blocks must be indented at least to the column of the first | blocks must be indented at least to the column of the first character other than | |||
| [non-whitespace character] after the list marker. However, that is not quite right. | a space or tab after the list marker. However, that is not quite right. | |||
| The spaces after the list marker determine how much relative indentation | The spaces of indentation after the list marker determine how much relative | |||
| is needed. Which column this indentation reaches will depend on | indentation is needed. Which column this indentation reaches will depend on | |||
| how the list item is embedded in other constructions, as shown by | how the list item is embedded in other constructions, as shown by | |||
| this example: | this example: | |||
| ```````````````````````````````` example | ```````````````````````````````` example | |||
| > > 1. one | > > 1. one | |||
| >> | >> | |||
| >> two | >> two | |||
| . | . | |||
| <blockquote> | <blockquote> | |||
| <blockquote> | <blockquote> | |||
| skipping to change at line 3706 ¶ | skipping to change at line 4038 ¶ | |||
| <blockquote> | <blockquote> | |||
| <blockquote> | <blockquote> | |||
| <ul> | <ul> | |||
| <li>one</li> | <li>one</li> | |||
| </ul> | </ul> | |||
| <p>two</p> | <p>two</p> | |||
| </blockquote> | </blockquote> | |||
| </blockquote> | </blockquote> | |||
| ```````````````````````````````` | ```````````````````````````````` | |||
| Note that at least one space is needed between the list marker and | Note that at least one space or tab is needed between the list marker and | |||
| any following content, so these are not list items: | any following content, so these are not list items: | |||
| ```````````````````````````````` example | ```````````````````````````````` example | |||
| -one | -one | |||
| 2.two | 2.two | |||
| . | . | |||
| <p>-one</p> | <p>-one</p> | |||
| <p>2.two</p> | <p>2.two</p> | |||
| ```````````````````````````````` | ```````````````````````````````` | |||
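The *W + N* arithmetic of rule #1 can be illustrated with a small Python sketch. It is not taken from the spec: the name `rule1_content_offset` is hypothetical, it looks only at the shape of the first line, it assumes the marker starts in the first column, and it counts spaces only (no tabs). Cases covered by other rules, such as five or more spaces after the marker or a marker with nothing after it, return `None` here.

```python
import re

# A bullet marker (-, + or *) or an ordered marker (1-9 digits, then . or )),
# followed by 1 to 4 spaces and then a character other than whitespace.
_RULE1 = re.compile(r"([-+*]|[0-9]{1,9}[.)])( {1,4})(?=\S)")


def rule1_content_offset(first_line):
    """Return W + N for a line matching rule #1, else None.

    W is the width of the list marker and N the number of spaces after
    it; later lines must be indented by W + N spaces to stay in the item.
    """
    match = _RULE1.match(first_line)
    if match is None:
        return None
    marker, spaces = match.groups()
    return len(marker) + len(spaces)
```

For example, a two-character marker followed by three spaces gives an offset of five, matching the prose above, while `-one` and `2.two` yield `None` because no space follows the marker.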
| skipping to change at line 3826 ¶ | skipping to change at line 4158 ¶ | |||
| ```````````````````````````````` example | ```````````````````````````````` example | |||
| -1. not ok | -1. not ok | |||
| . | . | |||
| <p>-1. not ok</p> | <p>-1. not ok</p> | |||
| ```````````````````````````````` | ```````````````````````````````` | |||
| 2. **Item starting with indented code.** If a sequence of lines *Ls* | 2. **Item starting with indented code.** If a sequence of lines *Ls* | |||
| constitute a sequence of blocks *Bs* starting with an indented code | constitute a sequence of blocks *Bs* starting with an indented code | |||
| block, and *M* is a list marker of width *W* followed by | block, and *M* is a list marker of width *W* followed by | |||
| one space, then the result of prepending *M* and the following | one space of indentation, then the result of prepending *M* and the | |||
| space to the first line of *Ls*, and indenting subsequent lines of | following space to the first line of *Ls*, and indenting subsequent lines | |||
| *Ls* by *W + 1* spaces, is a list item with *Bs* as its contents. | of *Ls* by *W + 1* spaces, is a list item with *Bs* as its contents. | |||
| If a line is empty, then it need not be indented. The type of the | If a line is empty, then it need not be indented. The type of the | |||
| list item (bullet or ordered) is determined by the type of its list | list item (bullet or ordered) is determined by the type of its list | |||
| marker. If the list item is ordered, then it is also assigned a | marker. If the list item is ordered, then it is also assigned a | |||
| start number, based on the ordered list marker. | start number, based on the ordered list marker. | |||
| An indented code block will have to be indented four spaces beyond | An indented code block will have to be preceded by four spaces of indentation | |||
| the edge of the region where text will be included in the list item. | beyond the edge of the region where text will be included in the list item. | |||
| In the following case that is 6 spaces: | In the following case that is 6 spaces: | |||
| ```````````````````````````````` example | ```````````````````````````````` example | |||
| - foo | - foo | |||
| bar | bar | |||
| . | . | |||
| <ul> | <ul> | |||
| <li> | <li> | |||
| <p>foo</p> | <p>foo</p> | |||
| skipping to change at line 3869 ¶ | skipping to change at line 4201 ¶ | |||
| <ol start="10"> | <ol start="10"> | |||
| <li> | <li> | |||
| <p>foo</p> | <p>foo</p> | |||
| <pre><code>bar | <pre><code>bar | |||
| </code></pre> | </code></pre> | |||
| </li> | </li> | |||
| </ol> | </ol> | |||
| ```````````````````````````````` | ```````````````````````````````` | |||
| If the *first* block in the list item is an indented code block, | If the *first* block in the list item is an indented code block, | |||
| then by rule #2, the contents must be indented *one* space after the | then by rule #2, the contents must be preceded by *one* space of indentation | |||
| list marker: | after the list marker: | |||
| ```````````````````````````````` example | ```````````````````````````````` example | |||
| indented code | indented code | |||
| paragraph | paragraph | |||
| more code | more code | |||
| . | . | |||
| <pre><code>indented code | <pre><code>indented code | |||
| </code></pre> | </code></pre> | |||
| skipping to change at line 3904 ¶ | skipping to change at line 4236 ¶ | |||
| <li> | <li> | |||
| <pre><code>indented code | <pre><code>indented code | |||
| </code></pre> | </code></pre> | |||
| <p>paragraph</p> | <p>paragraph</p> | |||
| <pre><code>more code | <pre><code>more code | |||
| </code></pre> | </code></pre> | |||
| </li> | </li> | |||
| </ol> | </ol> | |||
| ```````````````````````````````` | ```````````````````````````````` | |||
| Note that an additional space indent is interpreted as space | Note that an additional space of indentation is interpreted as space | |||
| inside the code block: | inside the code block: | |||
| ```````````````````````````````` example | ```````````````````````````````` example | |||
| 1. indented code | 1. indented code | |||
| paragraph | paragraph | |||
| more code | more code | |||
| . | . | |||
| <ol> | <ol> | |||
| skipping to change at line 3927 ¶ | skipping to change at line 4259 ¶ | |||
| </code></pre> | </code></pre> | |||
| <p>paragraph</p> | <p>paragraph</p> | |||
| <pre><code>more code | <pre><code>more code | |||
| </code></pre> | </code></pre> | |||
| </li> | </li> | |||
| </ol> | </ol> | |||
| ```````````````````````````````` | ```````````````````````````````` | |||
| Note that rules #1 and #2 only apply to two cases: (a) cases | Note that rules #1 and #2 only apply to two cases: (a) cases | |||
| in which the lines to be included in a list item begin with a | in which the lines to be included in a list item begin with a | |||
| [non-whitespace character], and (b) cases in which | character other than a space or tab, and (b) cases in which | |||
| they begin with an indented code | they begin with an indented code | |||
| block. In a case like the following, where the first block begins with | block. In a case like the following, where the first block begins with | |||
| a three-space indent, the rules do not allow us to form a list item by | three spaces of indentation, the rules do not allow us to form a list item by | |||
| indenting the whole thing and prepending a list marker: | indenting the whole thing and prepending a list marker: | |||
| ```````````````````````````````` example | ```````````````````````````````` example | |||
| foo | foo | |||
| bar | bar | |||
| . | . | |||
| <p>foo</p> | <p>foo</p> | |||
| <p>bar</p> | <p>bar</p> | |||
| ```````````````````````````````` | ```````````````````````````````` | |||
| skipping to change at line 3953 ¶ | skipping to change at line 4285 ¶ | |||
| - foo | - foo | |||
| bar | bar | |||
| . | . | |||
| <ul> | <ul> | |||
| <li>foo</li> | <li>foo</li> | |||
| </ul> | </ul> | |||
| <p>bar</p> | <p>bar</p> | |||
| ```````````````````````````````` | ```````````````````````````````` | |||
| This is not a significant restriction, because when a block begins | This is not a significant restriction, because when a block is preceded by up to | |||
| with 1-3 spaces indent, the indentation can always be removed without | three spaces of indentation, the indentation can always be removed without | |||
| a change in interpretation, allowing rule #1 to be applied. So, in | a change in interpretation, allowing rule #1 to be applied. So, in | |||
| the above case: | the above case: | |||
| ```````````````````````````````` example | ```````````````````````````````` example | |||
| - foo | - foo | |||
| bar | bar | |||
| . | . | |||
| <ul> | <ul> | |||
| <li> | <li> | |||
| <p>foo</p> | <p>foo</p> | |||
| <p>bar</p> | <p>bar</p> | |||
| </li> | </li> | |||
| </ul> | </ul> | |||
| ```````````````````````````````` | ```````````````````````````````` | |||
| 3. **Item starting with a blank line.** If a sequence of lines *Ls* | 3. **Item starting with a blank line.** If a sequence of lines *Ls* | |||
| starting with a single [blank line] constitute a (possibly empty) | starting with a single [blank line] constitute a (possibly empty) | |||
| sequence of blocks *Bs*, not separated from each other by more than | sequence of blocks *Bs*, and *M* is a list marker of width *W*, | |||
| one blank line, and *M* is a list marker of width *W*, | ||||
| then the result of prepending *M* to the first line of *Ls*, and | then the result of prepending *M* to the first line of *Ls*, and | |||
| indenting subsequent lines of *Ls* by *W + 1* spaces, is a list | preceding subsequent lines of *Ls* by *W + 1* spaces of indentation, is a | |||
| item with *Bs* as its contents. | list item with *Bs* as its contents. | |||
| If a line is empty, then it need not be indented. The type of the | If a line is empty, then it need not be indented. The type of the | |||
| list item (bullet or ordered) is determined by the type of its list | list item (bullet or ordered) is determined by the type of its list | |||
| marker. If the list item is ordered, then it is also assigned a | marker. If the list item is ordered, then it is also assigned a | |||
| start number, based on the ordered list marker. | start number, based on the ordered list marker. | |||
| Here are some list items that start with a blank line but are not empty: | Here are some list items that start with a blank line but are not empty: | |||
| ```````````````````````````````` example | ```````````````````````````````` example | |||
| - | - | |||
| foo | foo | |||
| skipping to change at line 4049 ¶ | skipping to change at line 4380 ¶ | |||
| - | - | |||
| - bar | - bar | |||
| . | . | |||
| <ul> | <ul> | |||
| <li>foo</li> | <li>foo</li> | |||
| <li></li> | <li></li> | |||
| <li>bar</li> | <li>bar</li> | |||
| </ul> | </ul> | |||
| ```````````````````````````````` | ```````````````````````````````` | |||
| It does not matter whether there are spaces following the [list marker]: | It does not matter whether there are spaces or tabs following the [list marker]: | |||
| ```````````````````````````````` example | ```````````````````````````````` example | |||
| - foo | - foo | |||
| - | - | |||
| - bar | - bar | |||
| . | . | |||
| <ul> | <ul> | |||
| <li>foo</li> | <li>foo</li> | |||
| <li></li> | <li></li> | |||
| <li>bar</li> | <li>bar</li> | |||
| skipping to change at line 4103 ¶ | skipping to change at line 4434 ¶ | |||
| foo | foo | |||
| 1. | 1. | |||
| . | . | |||
| <p>foo | <p>foo | |||
| *</p> | *</p> | |||
| <p>foo | <p>foo | |||
| 1.</p> | 1.</p> | |||
| ```````````````````````````````` | ```````````````````````````````` | |||
| 4. **Indentation.** If a sequence of lines *Ls* constitutes a list item | 4. **Indentation.** If a sequence of lines *Ls* constitutes a list item | |||
| according to rule #1, #2, or #3, then the result of indenting each line | according to rule #1, #2, or #3, then the result of preceding each line | |||
| of *Ls* by 1-3 spaces (the same for each line) also constitutes a | of *Ls* by up to three spaces of indentation (the same for each line) also | |||
| list item with the same contents and attributes. If a line is | constitutes a list item with the same contents and attributes. If a line is | |||
| empty, then it need not be indented. | empty, then it need not be indented. | |||
| Indented one space: | Indented one space: | |||
| ```````````````````````````````` example | ```````````````````````````````` example | |||
| 1. A paragraph | 1. A paragraph | |||
| with two lines. | with two lines. | |||
| indented code | indented code | |||
| skipping to change at line 4199 ¶ | skipping to change at line 4530 ¶ | |||
| indented code | indented code | |||
| > A block quote. | > A block quote. | |||
| </code></pre> | </code></pre> | |||
| ```````````````````````````````` | ```````````````````````````````` | |||
| 5. **Laziness.** If a string of lines *Ls* constitute a [list | 5. **Laziness.** If a string of lines *Ls* constitute a [list | |||
| item](#list-items) with contents *Bs*, then the result of deleting | item](#list-items) with contents *Bs*, then the result of deleting | |||
| some or all of the indentation from one or more lines in which the | some or all of the indentation from one or more lines in which the | |||
| next [non-whitespace character] after the indentation is | next character other than a space or tab after the indentation is | |||
| [paragraph continuation text] is a | [paragraph continuation text] is a | |||
| list item with the same contents and attributes. The unindented | list item with the same contents and attributes. The unindented | |||
| lines are called | lines are called | |||
| [lazy continuation line](@)s. | [lazy continuation line](@)s. | |||
| Here is an example with [lazy continuation lines]: | Here is an example with [lazy continuation lines]: | |||
| ```````````````````````````````` example | ```````````````````````````````` example | |||
| 1. A paragraph | 1. A paragraph | |||
| with two lines. | with two lines. | |||
| skipping to change at line 4279 ¶ | skipping to change at line 4610 ¶ | |||
| </li> | </li> | |||
| </ol> | </ol> | |||
| </blockquote> | </blockquote> | |||
| ```````````````````````````````` | ```````````````````````````````` | |||
| 6. **That's all.** Nothing that is not counted as a list item by rules | 6. **That's all.** Nothing that is not counted as a list item by rules | |||
| #1--5 counts as a [list item](#list-items). | #1--5 counts as a [list item](#list-items). | |||
| The rules for sublists follow from the general rules | The rules for sublists follow from the general rules | |||
| [above][List items]. A sublist must be indented the same number | [above][List items]. A sublist must be indented the same number | |||
| of spaces a paragraph would need to be in order to be included | of spaces of indentation a paragraph would need to be in order to be included | |||
| in the list item. | in the list item. | |||
| So, in this case we need two spaces indent: | So, in this case we need two spaces indent: | |||
| ```````````````````````````````` example | ```````````````````````````````` example | |||
| - foo | - foo | |||
| - bar | - bar | |||
| - baz | - baz | |||
| - boo | - boo | |||
| . | . | |||
| skipping to change at line 4505 ¶ | skipping to change at line 4836 ¶ | |||
| <li>baz</li> | <li>baz</li> | |||
| </ul> | </ul> | |||
| </li> | </li> | |||
| </ul> | </ul> | |||
| ``` | ``` | |||
| The choice of four spaces is arbitrary. It can be learned, but it is | The choice of four spaces is arbitrary. It can be learned, but it is | |||
| not likely to be guessed, and it trips up beginners regularly. | not likely to be guessed, and it trips up beginners regularly. | |||
| Would it help to adopt a two-space rule? The problem is that such | Would it help to adopt a two-space rule? The problem is that such | |||
| a rule, together with the rule allowing 1--3 spaces indentation of the | a rule, together with the rule allowing up to three spaces of indentation for | |||
| initial list marker, allows text that is indented *less than* the | the initial list marker, allows text that is indented *less than* the | |||
| original list marker to be included in the list item. For example, | original list marker to be included in the list item. For example, | |||
| `Markdown.pl` parses | `Markdown.pl` parses | |||
| ``` markdown | ``` markdown | |||
| - one | - one | |||
| two | two | |||
| ``` | ``` | |||
| as a single list item, with `two` a continuation paragraph: | as a single list item, with `two` a continuation paragraph: | |||
| skipping to change at line 4890 ¶ | skipping to change at line 5221 ¶ | |||
| </li> | </li> | |||
| <li> | <li> | |||
| <p>b</p> | <p>b</p> | |||
| </li> | </li> | |||
| <li> | <li> | |||
| <p>c</p> | <p>c</p> | |||
| </li> | </li> | |||
| </ol> | </ol> | |||
| ```````````````````````````````` | ```````````````````````````````` | |||
| Note, however, that list items may not be indented more than | Note, however, that list items may not be preceded by more than | |||
| three spaces. Here `- e` is treated as a paragraph continuation | three spaces of indentation. Here `- e` is treated as a paragraph continuation | |||
| line, because it is indented more than three spaces: | line, because it is indented more than three spaces: | |||
| ```````````````````````````````` example | ```````````````````````````````` example | |||
| - a | - a | |||
| - b | - b | |||
| - c | - c | |||
| - d | - d | |||
| - e | - e | |||
| . | . | |||
| <ul> | <ul> | |||
| skipping to change at line 4974 ¶ | skipping to change at line 5305 ¶ | |||
| <li> | <li> | |||
| <p>a</p> | <p>a</p> | |||
| </li> | </li> | |||
| <li></li> | <li></li> | |||
| <li> | <li> | |||
| <p>c</p> | <p>c</p> | |||
| </li> | </li> | |||
| </ul> | </ul> | |||
| ```````````````````````````````` | ```````````````````````````````` | |||
| These are loose lists, even though there is no space between the items, | These are loose lists, even though there are no blank lines between the items, | |||
| because one of the items directly contains two block-level elements | because one of the items directly contains two block-level elements | |||
| with a blank line between them: | with a blank line between them: | |||
| ```````````````````````````````` example | ```````````````````````````````` example | |||
| - a | - a | |||
| - b | - b | |||
| c | c | |||
| - d | - d | |||
| . | . | |||
| skipping to change at line 5209 ¶ | skipping to change at line 5540 ¶ | |||
| ```````````````````````````````` example | ```````````````````````````````` example | |||
| `hi`lo` | `hi`lo` | |||
| . | . | |||
| <p><code>hi</code>lo`</p> | <p><code>hi</code>lo`</p> | |||
| ```````````````````````````````` | ```````````````````````````````` | |||
| `hi` is parsed as code, leaving the backtick at the end as a literal | `hi` is parsed as code, leaving the backtick at the end as a literal | |||
| backtick. | backtick. | |||
| ## Backslash escapes | ||||
| Any ASCII punctuation character may be backslash-escaped: | ||||
| ```````````````````````````````` example | ||||
| \!\"\#\$\%\&\'\(\)\*\+\,\-\.\/\:\;\<\=\>\?\@\[\\\]\^\_\`\{\|\}\~ | ||||
| . | ||||
| <p>!&quot;#$%&amp;'()*+,-./:;&lt;=&gt;?@[\]^_`{|}~</p> | ||||
| ```````````````````````````````` | ||||
| Backslashes before other characters are treated as literal | ||||
| backslashes: | ||||
| ```````````````````````````````` example | ||||
| \→\A\a\ \3\φ\« | ||||
| . | ||||
| <p>\→\A\a\ \3\φ\«</p> | ||||
| ```````````````````````````````` | ||||
| Escaped characters are treated as regular characters and do | ||||
| not have their usual Markdown meanings: | ||||
| ```````````````````````````````` example | ||||
| \*not emphasized* | ||||
| \<br/> not a tag | ||||
| \[not a link](/foo) | ||||
| \`not code` | ||||
| 1\. not a list | ||||
| \* not a list | ||||
| \# not a heading | ||||
| \[foo]: /url "not a reference" | ||||
| \&ouml; not a character entity | ||||
| . | ||||
| <p>*not emphasized* | ||||
| &lt;br/&gt; not a tag | ||||
| [not a link](/foo) | ||||
| `not code` | ||||
| 1. not a list | ||||
| * not a list | ||||
| # not a heading | ||||
| [foo]: /url &quot;not a reference&quot; | ||||
| &amp;ouml; not a character entity</p> | ||||
| ```````````````````````````````` | ||||
| If a backslash is itself escaped, the following character is not: | ||||
| ```````````````````````````````` example | ||||
| \\*emphasis* | ||||
| . | ||||
| <p>\<em>emphasis</em></p> | ||||
| ```````````````````````````````` | ||||
| A backslash at the end of the line is a [hard line break]: | ||||
| ```````````````````````````````` example | ||||
| foo\ | ||||
| bar | ||||
| . | ||||
| <p>foo<br /> | ||||
| bar</p> | ||||
| ```````````````````````````````` | ||||
| Backslash escapes do not work in code blocks, code spans, autolinks, or | ||||
| raw HTML: | ||||
| ```````````````````````````````` example | ||||
| `` \[\` `` | ||||
| . | ||||
| <p><code>\[\`</code></p> | ||||
| ```````````````````````````````` | ||||
| ```````````````````````````````` example | ||||
| \[\] | ||||
| . | ||||
| <pre><code>\[\] | ||||
| </code></pre> | ||||
| ```````````````````````````````` | ||||
| ```````````````````````````````` example | ||||
| ~~~ | ||||
| \[\] | ||||
| ~~~ | ||||
| . | ||||
| <pre><code>\[\] | ||||
| </code></pre> | ||||
| ```````````````````````````````` | ||||
| ```````````````````````````````` example | ||||
| <http://example.com?find=\*> | ||||
| . | ||||
| <p><a href="http://example.com?find=%5C*">http://example.com?find=\*</a></p> | ||||
| ```````````````````````````````` | ||||
| ```````````````````````````````` example | ||||
| <a href="/bar\/)"> | ||||
| . | ||||
| <a href="/bar\/)"> | ||||
| ```````````````````````````````` | ||||
| But they work in all other contexts, including URLs and link titles, | ||||
| link references, and [info strings] in [fenced code blocks]: | ||||
| ```````````````````````````````` example | ||||
| [foo](/bar\* "ti\*tle") | ||||
| . | ||||
| <p><a href="/bar*" title="ti*tle">foo</a></p> | ||||
| ```````````````````````````````` | ||||
| ```````````````````````````````` example | ||||
| [foo] | ||||
| [foo]: /bar\* "ti\*tle" | ||||
| . | ||||
| <p><a href="/bar*" title="ti*tle">foo</a></p> | ||||
| ```````````````````````````````` | ||||
| ```````````````````````````````` example | ||||
| ``` foo\+bar | ||||
| foo | ||||
| ``` | ||||
| . | ||||
| <pre><code class="language-foo+bar">foo | ||||
| </code></pre> | ||||
| ```````````````````````````````` | ||||
| ## Entity and numeric character references | ||||
| Valid HTML entity references and numeric character references | ||||
| can be used in place of the corresponding Unicode character, | ||||
| with the following exceptions: | ||||
| - Entity and character references are not recognized in code | ||||
| blocks and code spans. | ||||
| - Entity and character references cannot stand in place of | ||||
| special characters that define structural elements in | ||||
| CommonMark. For example, although `&#42;` can be used | ||||
| in place of a literal `*` character, `&#42;` cannot replace | ||||
| `*` in emphasis delimiters, bullet list markers, or thematic | ||||
| breaks. | ||||
| Conforming CommonMark parsers need not store information about | ||||
| whether a particular character was represented in the source | ||||
| using a Unicode character or an entity reference. | ||||
| [Entity references](@) consist of `&` + any of the valid | ||||
| HTML5 entity names + `;`. The | ||||
| document <https://html.spec.whatwg.org/multipage/entities.json> | ||||
| is used as an authoritative source for the valid entity | ||||
| references and their corresponding code points. | ||||
| ```````````````````````````````` example | ||||
| &nbsp; &amp; &copy; &AElig; &Dcaron; | ||||
| &frac34; &HilbertSpace; &DifferentialD; | ||||
| &ClockwiseContourIntegral; &ngE; | ||||
| . | ||||
| <p>  &amp; © Æ Ď | ||||
| ¾ ℋ ⅆ | ||||
| ∲ ≧̸</p> | ||||
| ```````````````````````````````` | ||||
| [Decimal numeric character | ||||
| references](@) | ||||
| consist of `&#` + a string of 1--7 arabic digits + `;`. A | ||||
| numeric character reference is parsed as the corresponding | ||||
| Unicode character. Invalid Unicode code points will be replaced by | ||||
| the REPLACEMENT CHARACTER (`U+FFFD`). For security reasons, | ||||
| the code point `U+0000` will also be replaced by `U+FFFD`. | ||||
| ```````````````````````````````` example | ||||
| &#35; &#1234; &#992; &#0; | ||||
| . | ||||
| <p># Ӓ Ϡ �</p> | ||||
| ```````````````````````````````` | ||||
| [Hexadecimal numeric character | ||||
| references](@) consist of `&#` + | ||||
| either `X` or `x` + a string of 1-6 hexadecimal digits + `;`. | ||||
| They too are parsed as the corresponding Unicode character (this | ||||
| time specified with a hexadecimal numeral instead of decimal). | ||||
| ```````````````````````````````` example | ||||
| &#X22; &#XD06; &#xcab; | ||||
| . | ||||
| <p>&quot; ആ ಫ</p> | ||||
| ```````````````````````````````` | ||||
| Here are some nonentities: | ||||
| ```````````````````````````````` example | ||||
| &nbsp &x; &#; &#x; | ||||
| &#987654321; | ||||
| &#abcdef0; | ||||
| &ThisIsNotDefined; &hi?; | ||||
| . | ||||
| <p>&amp;nbsp &amp;x; &amp;#; &amp;#x; | ||||
| &amp;#987654321; | ||||
| &amp;#abcdef0; | ||||
| &amp;ThisIsNotDefined; &amp;hi?;</p> | ||||
| ```````````````````````````````` | ||||
| Although HTML5 does accept some entity references | ||||
| without a trailing semicolon (such as `&copy`), these are not | ||||
| recognized here, because it makes the grammar too ambiguous: | ||||
| ```````````````````````````````` example | ||||
| &copy | ||||
| . | ||||
| <p>&amp;copy</p> | ||||
| ```````````````````````````````` | ||||
| Strings that are not on the list of HTML5 named entities are not | ||||
| recognized as entity references either: | ||||
| ```````````````````````````````` example | ||||
| &MadeUpEntity; | ||||
| . | ||||
| <p>&amp;MadeUpEntity;</p> | ||||
| ```````````````````````````````` | ||||
| Entity and numeric character references are recognized in any | ||||
| context besides code spans or code blocks, including | ||||
| URLs, [link titles], and [fenced code block][] [info strings]: | ||||
| ```````````````````````````````` example | ||||
| <a href="&ouml;&ouml;.html"> | ||||
| . | ||||
| <a href="&ouml;&ouml;.html"> | ||||
| ```````````````````````````````` | ||||
| ```````````````````````````````` example | ||||
| [foo](/f&ouml;&ouml; "f&ouml;&ouml;") | ||||
| . | ||||
| <p><a href="/f%C3%B6%C3%B6" title="föö">foo</a></p> | ||||
| ```````````````````````````````` | ||||
| ```````````````````````````````` example | ||||
| [foo] | ||||
| [foo]: /f&ouml;&ouml; "f&ouml;&ouml;" | ||||
| . | ||||
| <p><a href="/f%C3%B6%C3%B6" title="föö">foo</a></p> | ||||
| ```````````````````````````````` | ||||
| ```````````````````````````````` example | ||||
| ``` f&ouml;&ouml; | ||||
| foo | ||||
| ``` | ||||
| . | ||||
| <pre><code class="language-föö">foo | ||||
| </code></pre> | ||||
| ```````````````````````````````` | ||||
| Entity and numeric character references are treated as literal | ||||
| text in code spans and code blocks: | ||||
| ```````````````````````````````` example | ||||
| `f&ouml;&ouml;` | ||||
| . | ||||
| <p><code>f&amp;ouml;&amp;ouml;</code></p> | ||||
| ```````````````````````````````` | ||||
| ```````````````````````````````` example | ||||
| f&ouml;f&ouml; | ||||
| . | ||||
| <pre><code>f&amp;ouml;f&amp;ouml; | ||||
| </code></pre> | ||||
| ```````````````````````````````` | ||||
| Entity and numeric character references cannot be used | ||||
| in place of symbols indicating structure in CommonMark | ||||
| documents. | ||||
| ```````````````````````````````` example | ||||
| &#42;foo&#42; | ||||
| *foo* | ||||
| . | ||||
| <p>*foo* | ||||
| <em>foo</em></p> | ||||
| ```````````````````````````````` | ||||
| ```````````````````````````````` example | ||||
| &#42; foo | ||||
| * foo | ||||
| . | ||||
| <p>* foo</p> | ||||
| <ul> | ||||
| <li>foo</li> | ||||
| </ul> | ||||
| ```````````````````````````````` | ||||
| ```````````````````````````````` example | ||||
| foo&#10;&#10;bar | ||||
| . | ||||
| <p>foo | ||||
| bar</p> | ||||
| ```````````````````````````````` | ||||
| ```````````````````````````````` example | ||||
| 	foo | ||||
| . | ||||
| <p>→foo</p> | ||||
| ```````````````````````````````` | ||||
| ```````````````````````````````` example | ||||
| [a](url &quot;tit&quot;) | ||||
| . | ||||
| <p>[a](url &quot;tit&quot;)</p> | ||||
| ```````````````````````````````` | ||||
| ## Code spans | ## Code spans | |||
| A [backtick string](@) | A [backtick string](@) | |||
| is a string of one or more backtick characters (`` ` ``) that is neither | is a string of one or more backtick characters (`` ` ``) that is neither | |||
| preceded nor followed by a backtick. | preceded nor followed by a backtick. | |||
| A [code span](@) begins with a backtick string and ends with | A [code span](@) begins with a backtick string and ends with | |||
| a backtick string of equal length. The contents of the code span are | a backtick string of equal length. The contents of the code span are | |||
| the characters between the two backtick strings, normalized in the | the characters between these two backtick strings, normalized in the | |||
| following ways: | following ways: | |||
| - First, [line endings] are converted to [spaces]. | - First, [line endings] are converted to [spaces]. | |||
| - If the resulting string both begins *and* ends with a [space] | - If the resulting string both begins *and* ends with a [space] | |||
| character, but does not consist entirely of [space] | character, but does not consist entirely of [space] | |||
| characters, a single [space] character is removed from the | characters, a single [space] character is removed from the | |||
| front and back. This allows you to include code that begins | front and back. This allows you to include code that begins | |||
| or ends with backtick characters, which must be separated by | or ends with backtick characters, which must be separated by | |||
| whitespace from the opening or closing backtick strings. | whitespace from the opening or closing backtick strings. | |||
| skipping to change at line 5793 ¶ | skipping to change at line 5812 ¶ | |||
| for efficient parsing strategies that do not backtrack. | for efficient parsing strategies that do not backtrack. | |||
| First, some definitions. A [delimiter run](@) is either | First, some definitions. A [delimiter run](@) is either | |||
| a sequence of one or more `*` characters that is not preceded or | a sequence of one or more `*` characters that is not preceded or | |||
| followed by a non-backslash-escaped `*` character, or a sequence | followed by a non-backslash-escaped `*` character, or a sequence | |||
| of one or more `_` characters that is not preceded or followed by | of one or more `_` characters that is not preceded or followed by | |||
| a non-backslash-escaped `_` character. | a non-backslash-escaped `_` character. | |||
| A [left-flanking delimiter run](@) is | A [left-flanking delimiter run](@) is | |||
| a [delimiter run] that is (1) not followed by [Unicode whitespace], | a [delimiter run] that is (1) not followed by [Unicode whitespace], | |||
| and either (2a) not followed by a [punctuation character], or | and either (2a) not followed by a [Unicode punctuation character], or | |||
| (2b) followed by a [punctuation character] and | (2b) followed by a [Unicode punctuation character] and | |||
| preceded by [Unicode whitespace] or a [punctuation character]. | preceded by [Unicode whitespace] or a [Unicode punctuation character]. | |||
| For purposes of this definition, the beginning and the end of | For purposes of this definition, the beginning and the end of | |||
| the line count as Unicode whitespace. | the line count as Unicode whitespace. | |||
| A [right-flanking delimiter run](@) is | A [right-flanking delimiter run](@) is | |||
| a [delimiter run] that is (1) not preceded by [Unicode whitespace], | a [delimiter run] that is (1) not preceded by [Unicode whitespace], | |||
| and either (2a) not preceded by a [punctuation character], or | and either (2a) not preceded by a [Unicode punctuation character], or | |||
| (2b) preceded by a [punctuation character] and | (2b) preceded by a [Unicode punctuation character] and | |||
| followed by [Unicode whitespace] or a [punctuation character]. | followed by [Unicode whitespace] or a [Unicode punctuation character]. | |||
| For purposes of this definition, the beginning and the end of | For purposes of this definition, the beginning and the end of | |||
| the line count as Unicode whitespace. | the line count as Unicode whitespace. | |||
| Here are some examples of delimiter runs. | Here are some examples of delimiter runs. | |||
| - left-flanking but not right-flanking: | - left-flanking but not right-flanking: | |||
| ``` | ``` | |||
| ***abc | ***abc | |||
| _abc | _abc | |||
| skipping to change at line 5858 ¶ | skipping to change at line 5877 ¶ | |||
| The following rules define emphasis and strong emphasis: | The following rules define emphasis and strong emphasis: | |||
| 1. A single `*` character [can open emphasis](@) | 1. A single `*` character [can open emphasis](@) | |||
| iff (if and only if) it is part of a [left-flanking delimiter run]. | iff (if and only if) it is part of a [left-flanking delimiter run]. | |||
| 2. A single `_` character [can open emphasis] iff | 2. A single `_` character [can open emphasis] iff | |||
| it is part of a [left-flanking delimiter run] | it is part of a [left-flanking delimiter run] | |||
| and either (a) not part of a [right-flanking delimiter run] | and either (a) not part of a [right-flanking delimiter run] | |||
| or (b) part of a [right-flanking delimiter run] | or (b) part of a [right-flanking delimiter run] | |||
| preceded by punctuation. | preceded by a [Unicode punctuation character]. | |||
| 3. A single `*` character [can close emphasis](@) | 3. A single `*` character [can close emphasis](@) | |||
| iff it is part of a [right-flanking delimiter run]. | iff it is part of a [right-flanking delimiter run]. | |||
| 4. A single `_` character [can close emphasis] iff | 4. A single `_` character [can close emphasis] iff | |||
| it is part of a [right-flanking delimiter run] | it is part of a [right-flanking delimiter run] | |||
| and either (a) not part of a [left-flanking delimiter run] | and either (a) not part of a [left-flanking delimiter run] | |||
| or (b) part of a [left-flanking delimiter run] | or (b) part of a [left-flanking delimiter run] | |||
| followed by punctuation. | followed by a [Unicode punctuation character]. | |||
| 5. A double `**` [can open strong emphasis](@) | 5. A double `**` [can open strong emphasis](@) | |||
| iff it is part of a [left-flanking delimiter run]. | iff it is part of a [left-flanking delimiter run]. | |||
| 6. A double `__` [can open strong emphasis] iff | 6. A double `__` [can open strong emphasis] iff | |||
| it is part of a [left-flanking delimiter run] | it is part of a [left-flanking delimiter run] | |||
| and either (a) not part of a [right-flanking delimiter run] | and either (a) not part of a [right-flanking delimiter run] | |||
| or (b) part of a [right-flanking delimiter run] | or (b) part of a [right-flanking delimiter run] | |||
| preceded by punctuation. | preceded by a [Unicode punctuation character]. | |||
| 7. A double `**` [can close strong emphasis](@) | 7. A double `**` [can close strong emphasis](@) | |||
| iff it is part of a [right-flanking delimiter run]. | iff it is part of a [right-flanking delimiter run]. | |||
| 8. A double `__` [can close strong emphasis] iff | 8. A double `__` [can close strong emphasis] iff | |||
| it is part of a [right-flanking delimiter run] | it is part of a [right-flanking delimiter run] | |||
| and either (a) not part of a [left-flanking delimiter run] | and either (a) not part of a [left-flanking delimiter run] | |||
| or (b) part of a [left-flanking delimiter run] | or (b) part of a [left-flanking delimiter run] | |||
| followed by punctuation. | followed by a [Unicode punctuation character]. | |||
| 9. Emphasis begins with a delimiter that [can open emphasis] and ends | 9. Emphasis begins with a delimiter that [can open emphasis] and ends | |||
| with a delimiter that [can close emphasis], and that uses the same | with a delimiter that [can close emphasis], and that uses the same | |||
| character (`_` or `*`) as the opening delimiter. The | character (`_` or `*`) as the opening delimiter. The | |||
| opening and closing delimiters must belong to separate | opening and closing delimiters must belong to separate | |||
| [delimiter runs]. If one of the delimiters can both | [delimiter runs]. If one of the delimiters can both | |||
| open and close emphasis, then the sum of the lengths of the | open and close emphasis, then the sum of the lengths of the | |||
| delimiter runs containing the opening and closing delimiters | delimiter runs containing the opening and closing delimiters | |||
| must not be a multiple of 3 unless both lengths are | must not be a multiple of 3 unless both lengths are | |||
| multiples of 3. | multiples of 3. | |||
| skipping to change at line 6081 ¶ | skipping to change at line 6100 ¶ | |||
| This is not emphasis, because the closing `*` is preceded by | This is not emphasis, because the closing `*` is preceded by | |||
| whitespace: | whitespace: | |||
| ```````````````````````````````` example | ```````````````````````````````` example | |||
| *foo bar * | *foo bar * | |||
| . | . | |||
| <p>*foo bar *</p> | <p>*foo bar *</p> | |||
| ```````````````````````````````` | ```````````````````````````````` | |||
| A newline also counts as whitespace: | A line ending also counts as whitespace: | |||
| ```````````````````````````````` example | ```````````````````````````````` example | |||
| *foo bar | *foo bar | |||
| * | * | |||
| . | . | |||
| <p>*foo bar | <p>*foo bar | |||
| *</p> | *</p> | |||
| ```````````````````````````````` | ```````````````````````````````` | |||
| This is not emphasis, because the second `*` is | This is not emphasis, because the second `*` is | |||
| skipping to change at line 6228 ¶ | skipping to change at line 6247 ¶ | |||
| This is not strong emphasis, because the opening delimiter is | This is not strong emphasis, because the opening delimiter is | |||
| followed by whitespace: | followed by whitespace: | |||
| ```````````````````````````````` example | ```````````````````````````````` example | |||
| __ foo bar__ | __ foo bar__ | |||
| . | . | |||
| <p>__ foo bar__</p> | <p>__ foo bar__</p> | |||
| ```````````````````````````````` | ```````````````````````````````` | |||
| A newline counts as whitespace: | A line ending counts as whitespace: | |||
| ```````````````````````````````` example | ```````````````````````````````` example | |||
| __ | __ | |||
| foo bar__ | foo bar__ | |||
| . | . | |||
| <p>__ | <p>__ | |||
| foo bar__</p> | foo bar__</p> | |||
| ```````````````````````````````` | ```````````````````````````````` | |||
| This is not strong emphasis, because the opening `__` is preceded | This is not strong emphasis, because the opening `__` is preceded | |||
| by an alphanumeric and followed by punctuation: | by an alphanumeric and followed by punctuation: | |||
| skipping to change at line 6477 ¶ | skipping to change at line 6496 ¶ | |||
| emphasis sections in this example: | emphasis sections in this example: | |||
| ```````````````````````````````` example | ```````````````````````````````` example | |||
| *foo**bar* | *foo**bar* | |||
| . | . | |||
| <p><em>foo**bar</em></p> | <p><em>foo**bar</em></p> | |||
| ```````````````````````````````` | ```````````````````````````````` | |||
| The same condition ensures that the following | The same condition ensures that the following | |||
| cases are all strong emphasis nested inside | cases are all strong emphasis nested inside | |||
| emphasis, even when the interior spaces are | emphasis, even when the interior whitespace is | |||
| omitted: | omitted: | |||
| ```````````````````````````````` example | ```````````````````````````````` example | |||
| ***foo** bar* | ***foo** bar* | |||
| . | . | |||
| <p><em><strong>foo</strong> bar</em></p> | <p><em><strong>foo</strong> bar</em></p> | |||
| ```````````````````````````````` | ```````````````````````````````` | |||
| ```````````````````````````````` example | ```````````````````````````````` example | |||
| *foo **bar*** | *foo **bar*** | |||
| skipping to change at line 6980 ¶ | skipping to change at line 6999 ¶ | |||
| than the brackets in link text. Thus, for example, | than the brackets in link text. Thus, for example, | |||
| `` [foo`]` `` could not be a link text, since the second `]` | `` [foo`]` `` could not be a link text, since the second `]` | |||
| is part of a code span. | is part of a code span. | |||
| - The brackets in link text bind more tightly than markers for | - The brackets in link text bind more tightly than markers for | |||
| [emphasis and strong emphasis]. Thus, for example, `*[foo*](url)` is a link. | [emphasis and strong emphasis]. Thus, for example, `*[foo*](url)` is a link. | |||
| A [link destination](@) consists of either | A [link destination](@) consists of either | |||
| - a sequence of zero or more characters between an opening `<` and a | - a sequence of zero or more characters between an opening `<` and a | |||
| closing `>` that contains no line breaks or unescaped | closing `>` that contains no line endings or unescaped | |||
| `<` or `>` characters, or | `<` or `>` characters, or | |||
| - a nonempty sequence of characters that does not start with | - a nonempty sequence of characters that does not start with `<`, | |||
| `<`, does not include ASCII space or control characters, and | does not include [ASCII control characters][ASCII control character] | |||
| includes parentheses only if (a) they are backslash-escaped or | or [space] character, and includes parentheses only if (a) they are | |||
| (b) they are part of a balanced pair of unescaped parentheses. | backslash-escaped or (b) they are part of a balanced pair of | |||
| unescaped parentheses. | ||||
| (Implementations may impose limits on parentheses nesting to | (Implementations may impose limits on parentheses nesting to | |||
| avoid performance issues, but at least three levels of nesting | avoid performance issues, but at least three levels of nesting | |||
| should be supported.) | should be supported.) | |||
| A [link title](@) consists of either | A [link title](@) consists of either | |||
| - a sequence of zero or more characters between straight double-quote | - a sequence of zero or more characters between straight double-quote | |||
| characters (`"`), including a `"` character only if it is | characters (`"`), including a `"` character only if it is | |||
| backslash-escaped, or | backslash-escaped, or | |||
| skipping to change at line 7009 ¶ | skipping to change at line 7029 ¶ | |||
| backslash-escaped, or | backslash-escaped, or | |||
| - a sequence of zero or more characters between matching parentheses | - a sequence of zero or more characters between matching parentheses | |||
| (`(...)`), including a `(` or `)` character only if it is | (`(...)`), including a `(` or `)` character only if it is | |||
| backslash-escaped. | backslash-escaped. | |||
| Although [link titles] may span multiple lines, they may not contain | Although [link titles] may span multiple lines, they may not contain | |||
| a [blank line]. | a [blank line]. | |||
| An [inline link](@) consists of a [link text] followed immediately | An [inline link](@) consists of a [link text] followed immediately | |||
| by a left parenthesis `(`, optional [whitespace], an optional | by a left parenthesis `(`, an optional [link destination], an optional | |||
| [link destination], an optional [link title] separated from the link | [link title], and a right parenthesis `)`. | |||
| destination by [whitespace], optional [whitespace], and a right | These four components may be separated by spaces, tabs, and up to one line | |||
| parenthesis `)`. The link's text consists of the inlines contained | ending. | |||
| If both [link destination] and [link title] are present, they *must* be | ||||
| separated by spaces, tabs, and up to one line ending. | ||||
| The link's text consists of the inlines contained | ||||
| in the [link text] (excluding the enclosing square brackets). | in the [link text] (excluding the enclosing square brackets). | |||
| The link's URI consists of the link destination, excluding enclosing | The link's URI consists of the link destination, excluding enclosing | |||
| `<...>` if present, with backslash-escapes in effect as described | `<...>` if present, with backslash-escapes in effect as described | |||
| above. The link's title consists of the link title, excluding its | above. The link's title consists of the link title, excluding its | |||
| enclosing delimiters, with backslash-escapes in effect as described | enclosing delimiters, with backslash-escapes in effect as described | |||
| above. | above. | |||
| Here is a simple inline link: | Here is a simple inline link: | |||
| ```````````````````````````````` example | ```````````````````````````````` example | |||
| [link](/uri "title") | [link](/uri "title") | |||
| . | . | |||
| <p><a href="/uri" title="title">link</a></p> | <p><a href="/uri" title="title">link</a></p> | |||
| ```````````````````````````````` | ```````````````````````````````` | |||
| The title may be omitted: | The title, the link text and even | |||
| the destination may be omitted: | ||||
| ```````````````````````````````` example | ```````````````````````````````` example | |||
| [link](/uri) | [link](/uri) | |||
| . | . | |||
| <p><a href="/uri">link</a></p> | <p><a href="/uri">link</a></p> | |||
| ```````````````````````````````` | ```````````````````````````````` | |||
| Both the title and the destination may be omitted: | ```````````````````````````````` example | |||
| [](./target.md) | ||||
| . | ||||
| <p><a href="./target.md"></a></p> | ||||
| ```````````````````````````````` | ||||
| ```````````````````````````````` example | ```````````````````````````````` example | |||
| [link]() | [link]() | |||
| . | . | |||
| <p><a href="">link</a></p> | <p><a href="">link</a></p> | |||
| ```````````````````````````````` | ```````````````````````````````` | |||
| ```````````````````````````````` example | ```````````````````````````````` example | |||
| [link](<>) | [link](<>) | |||
| . | . | |||
| <p><a href="">link</a></p> | <p><a href="">link</a></p> | |||
| ```````````````````````````````` | ```````````````````````````````` | |||
| ```````````````````````````````` example | ||||
| []() | ||||
| . | ||||
| <p><a href=""></a></p> | ||||
| ```````````````````````````````` | ||||
| The destination can only contain spaces if it is | The destination can only contain spaces if it is | |||
| enclosed in pointy brackets: | enclosed in pointy brackets: | |||
| ```````````````````````````````` example | ```````````````````````````````` example | |||
| [link](/my uri) | [link](/my uri) | |||
| . | . | |||
| <p>[link](/my uri)</p> | <p>[link](/my uri)</p> | |||
| ```````````````````````````````` | ```````````````````````````````` | |||
| ```````````````````````````````` example | ```````````````````````````````` example | |||
| [link](</my uri>) | [link](</my uri>) | |||
| . | . | |||
| <p><a href="/my%20uri">link</a></p> | <p><a href="/my%20uri">link</a></p> | |||
| ```````````````````````````````` | ```````````````````````````````` | |||
| The destination cannot contain line breaks, | The destination cannot contain line endings, | |||
| even if enclosed in pointy brackets: | even if enclosed in pointy brackets: | |||
| ```````````````````````````````` example | ```````````````````````````````` example | |||
| [link](foo | [link](foo | |||
| bar) | bar) | |||
| . | . | |||
| <p>[link](foo | <p>[link](foo | |||
| bar)</p> | bar)</p> | |||
| ```````````````````````````````` | ```````````````````````````````` | |||
| skipping to change at line 7135 ¶ | skipping to change at line 7170 ¶ | |||
| ```````````````````````````````` example | ```````````````````````````````` example | |||
| [link](foo(and(bar))) | [link](foo(and(bar))) | |||
| . | . | |||
| <p><a href="foo(and(bar))">link</a></p> | <p><a href="foo(and(bar))">link</a></p> | |||
| ```````````````````````````````` | ```````````````````````````````` | |||
| However, if you have unbalanced parentheses, you need to escape or use the | However, if you have unbalanced parentheses, you need to escape or use the | |||
| `<...>` form: | `<...>` form: | |||
| ```````````````````````````````` example | ```````````````````````````````` example | |||
| [link](foo(and(bar)) | ||||
| . | ||||
| <p>[link](foo(and(bar))</p> | ||||
| ```````````````````````````````` | ||||
| ```````````````````````````````` example | ||||
| [link](foo\(and\(bar\)) | [link](foo\(and\(bar\)) | |||
| . | . | |||
| <p><a href="foo(and(bar)">link</a></p> | <p><a href="foo(and(bar)">link</a></p> | |||
| ```````````````````````````````` | ```````````````````````````````` | |||
| ```````````````````````````````` example | ```````````````````````````````` example | |||
| [link](<foo(and(bar)>) | [link](<foo(and(bar)>) | |||
| . | . | |||
| <p><a href="foo(and(bar)">link</a></p> | <p><a href="foo(and(bar)">link</a></p> | |||
| ```````````````````````````````` | ```````````````````````````````` | |||
| skipping to change at line 7224 ¶ | skipping to change at line 7265 ¶ | |||
| Backslash escapes and entity and numeric character references | Backslash escapes and entity and numeric character references | |||
| may be used in titles: | may be used in titles: | |||
| ```````````````````````````````` example | ```````````````````````````````` example | |||
| [link](/url "title \"&quot;") | [link](/url "title \"&quot;") | |||
| . | . | |||
| <p><a href="/url" title="title &quot;&quot;">link</a></p> | <p><a href="/url" title="title &quot;&quot;">link</a></p> | |||
| ```````````````````````````````` | ```````````````````````````````` | |||
| Titles must be separated from the link using a [whitespace]. | Titles must be separated from the link using spaces, tabs, and up to one line | |||
| ending. | ||||
| Other [Unicode whitespace] like non-breaking space doesn't work. | Other [Unicode whitespace] like non-breaking space doesn't work. | |||
| ```````````````````````````````` example | ```````````````````````````````` example | |||
| [link](/url "title") | [link](/url "title") | |||
| . | . | |||
| <p><a href="/url%C2%A0%22title%22">link</a></p> | <p><a href="/url%C2%A0%22title%22">link</a></p> | |||
| ```````````````````````````````` | ```````````````````````````````` | |||
| Nested balanced quotes are not allowed without escaping: | Nested balanced quotes are not allowed without escaping: | |||
| skipping to change at line 7264 ¶ | skipping to change at line 7306 ¶ | |||
| quote type for the enclosing title---to write titles containing | quote type for the enclosing title---to write titles containing | |||
| double quotes. `Markdown.pl`'s handling of titles has a number | double quotes. `Markdown.pl`'s handling of titles has a number | |||
| of other strange features. For example, it allows single-quoted | of other strange features. For example, it allows single-quoted | |||
| titles in inline links, but not reference links. And, in | titles in inline links, but not reference links. And, in | |||
| reference links but not inline links, it allows a title to begin | reference links but not inline links, it allows a title to begin | |||
| with `"` and end with `)`. `Markdown.pl` 1.0.1 even allows | with `"` and end with `)`. `Markdown.pl` 1.0.1 even allows | |||
| titles with no closing quotation mark, though 1.0.2b8 does not. | titles with no closing quotation mark, though 1.0.2b8 does not. | |||
| It seems preferable to adopt a simple, rational rule that works | It seems preferable to adopt a simple, rational rule that works | |||
| the same way in inline links and link reference definitions.) | the same way in inline links and link reference definitions.) | |||
| [Whitespace] is allowed around the destination and title: | Spaces, tabs, and up to one line ending is allowed around the destination and | |||
| title: | ||||
| ```````````````````````````````` example | ```````````````````````````````` example | |||
| [link]( /uri | [link]( /uri | |||
| "title" ) | "title" ) | |||
| . | . | |||
| <p><a href="/uri" title="title">link</a></p> | <p><a href="/uri" title="title">link</a></p> | |||
| ```````````````````````````````` | ```````````````````````````````` | |||
| But it is not allowed between the link text and the | But it is not allowed between the link text and the | |||
| following parenthesis: | following parenthesis: | |||
| skipping to change at line 7398 ¶ | skipping to change at line 7441 ¶ | |||
| There are three kinds of [reference link](@)s: | There are three kinds of [reference link](@)s: | |||
| [full](#full-reference-link), [collapsed](#collapsed-reference-link), | [full](#full-reference-link), [collapsed](#collapsed-reference-link), | |||
| and [shortcut](#shortcut-reference-link). | and [shortcut](#shortcut-reference-link). | |||
| A [full reference link](@) | A [full reference link](@) | |||
| consists of a [link text] immediately followed by a [link label] | consists of a [link text] immediately followed by a [link label] | |||
| that [matches] a [link reference definition] elsewhere in the document. | that [matches] a [link reference definition] elsewhere in the document. | |||
| A [link label](@) begins with a left bracket (`[`) and ends | A [link label](@) begins with a left bracket (`[`) and ends | |||
| with the first right bracket (`]`) that is not backslash-escaped. | with the first right bracket (`]`) that is not backslash-escaped. | |||
| Between these brackets there must be at least one [non-whitespace character]. | Between these brackets there must be at least one character that is not a space, | |||
| tab, or line ending. | ||||
| Unescaped square bracket characters are not allowed inside the | Unescaped square bracket characters are not allowed inside the | |||
| opening and closing square brackets of [link labels]. A link | opening and closing square brackets of [link labels]. A link | |||
| label can have at most 999 characters inside the square | label can have at most 999 characters inside the square | |||
| brackets. | brackets. | |||
| One label [matches](@) | One label [matches](@) | |||
| another just in case their normalized forms are equal. To normalize a | another just in case their normalized forms are equal. To normalize a | |||
| label, strip off the opening and closing brackets, | label, strip off the opening and closing brackets, | |||
| perform the *Unicode case fold*, strip leading and trailing | perform the *Unicode case fold*, strip leading and trailing | |||
| [whitespace] and collapse consecutive internal | spaces, tabs, and line endings, and collapse consecutive internal | |||
| [whitespace] to a single space. If there are multiple | spaces, tabs, and line endings to a single space. If there are multiple | |||
| matching reference link definitions, the one that comes first in the | matching reference link definitions, the one that comes first in the | |||
| document is used. (It is desirable in such cases to emit a warning.) | document is used. (It is desirable in such cases to emit a warning.) | |||
| The contents of the first link label are parsed as inlines, which are | The link's URI and title are provided by the matching [link | |||
| used as the link's text. The link's URI and title are provided by the | reference definition]. | |||
| matching [link reference definition]. | ||||
| Here is a simple example: | Here is a simple example: | |||
| ```````````````````````````````` example | ```````````````````````````````` example | |||
| [foo][bar] | [foo][bar] | |||
| [bar]: /url "title" | [bar]: /url "title" | |||
| . | . | |||
| <p><a href="/url" title="title">foo</a></p> | <p><a href="/url" title="title">foo</a></p> | |||
| ```````````````````````````````` | ```````````````````````````````` | |||
| skipping to change at line 7500 ¶ | skipping to change at line 7543 ¶ | |||
| ```````````````````````````````` example | ```````````````````````````````` example | |||
| *[foo*][ref] | *[foo*][ref] | |||
| [ref]: /uri | [ref]: /uri | |||
| . | . | |||
| <p>*<a href="/uri">foo*</a></p> | <p>*<a href="/uri">foo*</a></p> | |||
| ```````````````````````````````` | ```````````````````````````````` | |||
| ```````````````````````````````` example | ```````````````````````````````` example | |||
| [foo *bar][ref] | [foo *bar][ref]* | |||
| [ref]: /uri | [ref]: /uri | |||
| . | . | |||
| <p><a href="/uri">foo *bar</a></p> | <p><a href="/uri">foo *bar</a>*</p> | |||
| ```````````````````````````````` | ```````````````````````````````` | |||
| These cases illustrate the precedence of HTML tags, code spans, | These cases illustrate the precedence of HTML tags, code spans, | |||
| and autolinks over link grouping: | and autolinks over link grouping: | |||
| ```````````````````````````````` example | ```````````````````````````````` example | |||
| [foo <bar attr="][ref]"> | [foo <bar attr="][ref]"> | |||
| [ref]: /uri | [ref]: /uri | |||
| . | . | |||
| skipping to change at line 7547 ¶ | skipping to change at line 7590 ¶ | |||
| [foo][BaR] | [foo][BaR] | |||
| [bar]: /url "title" | [bar]: /url "title" | |||
| . | . | |||
| <p><a href="/url" title="title">foo</a></p> | <p><a href="/url" title="title">foo</a></p> | |||
| ```````````````````````````````` | ```````````````````````````````` | |||
| Unicode case fold is used: | Unicode case fold is used: | |||
| ```````````````````````````````` example | ```````````````````````````````` example | |||
| [Толпой][Толпой] is a Russian word. | [ẞ] | |||
| [ТОЛПОЙ]: /url | [SS]: /url | |||
| . | . | |||
| <p><a href="/url">Толпой</a> is a Russian word.</p> | <p><a href="/url">ẞ</a></p> | |||
| ```````````````````````````````` | ```````````````````````````````` | |||
| Consecutive internal [whitespace] is treated as one space for | Consecutive internal spaces, tabs, and line endings are treated as one space for | |||
| purposes of determining matching: | purposes of determining matching: | |||
| ```````````````````````````````` example | ```````````````````````````````` example | |||
| [Foo | [Foo | |||
| bar]: /url | bar]: /url | |||
| [Baz][Foo bar] | [Baz][Foo bar] | |||
| . | . | |||
| <p><a href="/url">Baz</a></p> | <p><a href="/url">Baz</a></p> | |||
| ```````````````````````````````` | ```````````````````````````````` | |||
| No [whitespace] is allowed between the [link text] and the | No spaces, tabs, or line endings are allowed between the [link text] and the | |||
| [link label]: | [link label]: | |||
| ```````````````````````````````` example | ```````````````````````````````` example | |||
| [foo] [bar] | [foo] [bar] | |||
| [bar]: /url "title" | [bar]: /url "title" | |||
| . | . | |||
| <p>[foo] <a href="/url" title="title">bar</a></p> | <p>[foo] <a href="/url" title="title">bar</a></p> | |||
| ```````````````````````````````` | ```````````````````````````````` | |||
| skipping to change at line 7687 ¶ | skipping to change at line 7730 ¶ | |||
| Note that in this example `]` is not backslash-escaped: | Note that in this example `]` is not backslash-escaped: | |||
| ```````````````````````````````` example | ```````````````````````````````` example | |||
| [bar\\]: /uri | [bar\\]: /uri | |||
| [bar\\] | [bar\\] | |||
| . | . | |||
| <p><a href="/uri">bar\</a></p> | <p><a href="/uri">bar\</a></p> | |||
| ```````````````````````````````` | ```````````````````````````````` | |||
| A [link label] must contain at least one [non-whitespace character]: | A [link label] must contain at least one character that is not a space, tab, or | |||
| line ending: | ||||
| ```````````````````````````````` example | ```````````````````````````````` example | |||
| [] | [] | |||
| []: /uri | []: /uri | |||
| . | . | |||
| <p>[]</p> | <p>[]</p> | |||
| <p>[]: /uri</p> | <p>[]: /uri</p> | |||
| ```````````````````````````````` | ```````````````````````````````` | |||
| skipping to change at line 7746 ¶ | skipping to change at line 7790 ¶ | |||
| The link labels are case-insensitive: | The link labels are case-insensitive: | |||
| ```````````````````````````````` example | ```````````````````````````````` example | |||
| [Foo][] | [Foo][] | |||
| [foo]: /url "title" | [foo]: /url "title" | |||
| . | . | |||
| <p><a href="/url" title="title">Foo</a></p> | <p><a href="/url" title="title">Foo</a></p> | |||
| ```````````````````````````````` | ```````````````````````````````` | |||
| As with full reference links, [whitespace] is not | As with full reference links, spaces, tabs, or line endings are not | |||
| allowed between the two sets of brackets: | allowed between the two sets of brackets: | |||
| ```````````````````````````````` example | ```````````````````````````````` example | |||
| [foo] | [foo] | |||
| [] | [] | |||
| [foo]: /url "title" | [foo]: /url "title" | |||
| . | . | |||
| <p><a href="/url" title="title">foo</a> | <p><a href="/url" title="title">foo</a> | |||
| []</p> | []</p> | |||
| skipping to change at line 8046 ¶ | skipping to change at line 8090 ¶ | |||
| The labels are case-insensitive: | The labels are case-insensitive: | |||
| ```````````````````````````````` example | ```````````````````````````````` example | |||
| ![Foo][] | ![Foo][] | |||
| [foo]: /url "title" | [foo]: /url "title" | |||
| . | . | |||
| <p><img src="/url" alt="Foo" title="title" /></p> | <p><img src="/url" alt="Foo" title="title" /></p> | |||
| ```````````````````````````````` | ```````````````````````````````` | |||
| As with reference links, [whitespace] is not allowed | As with reference links, spaces, tabs, and line endings, are not allowed | |||
| between the two sets of brackets: | between the two sets of brackets: | |||
| ```````````````````````````````` example | ```````````````````````````````` example | |||
| ![foo] | ![foo] | |||
| [] | [] | |||
| [foo]: /url "title" | [foo]: /url "title" | |||
| . | . | |||
| <p><img src="/url" alt="foo" title="title" /> | <p><img src="/url" alt="foo" title="title" /> | |||
| []</p> | []</p> | |||
| skipping to change at line 8132 ¶ | skipping to change at line 8176 ¶ | |||
| [Autolink](@)s are absolute URIs and email addresses inside | [Autolink](@)s are absolute URIs and email addresses inside | |||
| `<` and `>`. They are parsed as links, with the URL or email address | `<` and `>`. They are parsed as links, with the URL or email address | |||
| as the link label. | as the link label. | |||
| A [URI autolink](@) consists of `<`, followed by an | A [URI autolink](@) consists of `<`, followed by an | |||
| [absolute URI] followed by `>`. It is parsed as | [absolute URI] followed by `>`. It is parsed as | |||
| a link to the URI, with the URI as the link's label. | a link to the URI, with the URI as the link's label. | |||
| An [absolute URI](@), | An [absolute URI](@), | |||
| for these purposes, consists of a [scheme] followed by a colon (`:`) | for these purposes, consists of a [scheme] followed by a colon (`:`) | |||
| followed by zero or more characters other than ASCII | followed by zero or more characters other [ASCII control | |||
| [whitespace] and control characters, `<`, and `>`. If | characters][ASCII control character], [space], `<`, and `>`. | |||
| the URI includes these characters, they must be percent-encoded | If the URI includes these characters, they must be percent-encoded | |||
| (e.g. `%20` for a space). | (e.g. `%20` for a space). | |||
| For purposes of this spec, a [scheme](@) is any sequence | For purposes of this spec, a [scheme](@) is any sequence | |||
| of 2--32 characters beginning with an ASCII letter and followed | of 2--32 characters beginning with an ASCII letter and followed | |||
| by any combination of ASCII letters, digits, or the symbols plus | by any combination of ASCII letters, digits, or the symbols plus | |||
| ("+"), period ("."), or hyphen ("-"). | ("+"), period ("."), or hyphen ("-"). | |||
| Here are some valid autolinks: | Here are some valid autolinks: | |||
| ```````````````````````````````` example | ```````````````````````````````` example | |||
| skipping to change at line 8301 ¶ | skipping to change at line 8345 ¶ | |||
| raw HTML tag and will be rendered in HTML without escaping. | raw HTML tag and will be rendered in HTML without escaping. | |||
| Tag and attribute names are not limited to current HTML tags, | Tag and attribute names are not limited to current HTML tags, | |||
| so custom tags (and even, say, DocBook tags) may be used. | so custom tags (and even, say, DocBook tags) may be used. | |||
| Here is the grammar for tags: | Here is the grammar for tags: | |||
| A [tag name](@) consists of an ASCII letter | A [tag name](@) consists of an ASCII letter | |||
| followed by zero or more ASCII letters, digits, or | followed by zero or more ASCII letters, digits, or | |||
| hyphens (`-`). | hyphens (`-`). | |||
| An [attribute](@) consists of [whitespace], | An [attribute](@) consists of spaces, tabs, and up to one line ending, | |||
| an [attribute name], and an optional | an [attribute name], and an optional | |||
| [attribute value specification]. | [attribute value specification]. | |||
| An [attribute name](@) | An [attribute name](@) | |||
| consists of an ASCII letter, `_`, or `:`, followed by zero or more ASCII | consists of an ASCII letter, `_`, or `:`, followed by zero or more ASCII | |||
| letters, digits, `_`, `.`, `:`, or `-`. (Note: This is the XML | letters, digits, `_`, `.`, `:`, or `-`. (Note: This is the XML | |||
| specification restricted to ASCII. HTML5 is laxer.) | specification restricted to ASCII. HTML5 is laxer.) | |||
| An [attribute value specification](@) | An [attribute value specification](@) | |||
| consists of optional [whitespace], | consists of optional spaces, tabs, and up to one line ending, | |||
| a `=` character, optional [whitespace], and an [attribute | a `=` character, optional spaces, tabs, and up to one line ending, | |||
| value]. | and an [attribute value]. | |||
| An [attribute value](@) | An [attribute value](@) | |||
| consists of an [unquoted attribute value], | consists of an [unquoted attribute value], | |||
| a [single-quoted attribute value], or a [double-quoted attribute value]. | a [single-quoted attribute value], or a [double-quoted attribute value]. | |||
| An [unquoted attribute value](@) | An [unquoted attribute value](@) | |||
| is a nonempty string of characters not | is a nonempty string of characters not | |||
| including [whitespace], `"`, `'`, `=`, `<`, `>`, or `` ` ``. | including spaces, tabs, line endings, `"`, `'`, `=`, `<`, `>`, or `` ` ``. | |||
| A [single-quoted attribute value](@) | A [single-quoted attribute value](@) | |||
| consists of `'`, zero or more | consists of `'`, zero or more | |||
| characters not including `'`, and a final `'`. | characters not including `'`, and a final `'`. | |||
| A [double-quoted attribute value](@) | A [double-quoted attribute value](@) | |||
| consists of `"`, zero or more | consists of `"`, zero or more | |||
| characters not including `"`, and a final `"`. | characters not including `"`, and a final `"`. | |||
| An [open tag](@) consists of a `<` character, a [tag name], | An [open tag](@) consists of a `<` character, a [tag name], | |||
| zero or more [attributes], optional [whitespace], an optional `/` | zero or more [attributes], optional spaces, tabs, and up to one line ending, | |||
| character, and a `>` character. | an optional `/` character, and a `>` character. | |||
| A [closing tag](@) consists of the string `</`, a | A [closing tag](@) consists of the string `</`, a | |||
| [tag name], optional [whitespace], and the character `>`. | [tag name], optional spaces, tabs, and up to one line ending, and the character | |||
| `>`. | ||||
| An [HTML comment](@) consists of `<!--` + *text* + `-->`, | An [HTML comment](@) consists of `<!--` + *text* + `-->`, | |||
| where *text* does not start with `>` or `->`, does not end with `-`, | where *text* does not start with `>` or `->`, does not end with `-`, | |||
| and does not contain `--`. (See the | and does not contain `--`. (See the | |||
| [HTML5 spec](http://www.w3.org/TR/html5/syntax.html#comments).) | [HTML5 spec](http://www.w3.org/TR/html5/syntax.html#comments).) | |||
| A [processing instruction](@) | A [processing instruction](@) | |||
| consists of the string `<?`, a string | consists of the string `<?`, a string | |||
| of characters not including the string `?>`, and the string | of characters not including the string `?>`, and the string | |||
| `?>`. | `?>`. | |||
| A [declaration](@) consists of the | A [declaration](@) consists of the string `<!`, an ASCII letter, zero or more | |||
| string `<!`, a name consisting of one or more uppercase ASCII letters, | characters not including the character `>`, and the character `>`. | |||
| [whitespace], a string of characters not including the | ||||
| character `>`, and the character `>`. | ||||
| A [CDATA section](@) consists of | A [CDATA section](@) consists of | |||
| the string `<![CDATA[`, a string of characters not including the string | the string `<![CDATA[`, a string of characters not including the string | |||
| `]]>`, and the string `]]>`. | `]]>`, and the string `]]>`. | |||
| An [HTML tag](@) consists of an [open tag], a [closing tag], | An [HTML tag](@) consists of an [open tag], a [closing tag], | |||
| an [HTML comment], a [processing instruction], a [declaration], | an [HTML comment], a [processing instruction], a [declaration], | |||
| or a [CDATA section]. | or a [CDATA section]. | |||
| Here are some simple open tags: | Here are some simple open tags: | |||
| skipping to change at line 8377 ¶ | skipping to change at line 8420 ¶ | |||
| ```````````````````````````````` | ```````````````````````````````` | |||
| Empty elements: | Empty elements: | |||
| ```````````````````````````````` example | ```````````````````````````````` example | |||
| <a/><b2/> | <a/><b2/> | |||
| . | . | |||
| <p><a/><b2/></p> | <p><a/><b2/></p> | |||
| ```````````````````````````````` | ```````````````````````````````` | |||
| [Whitespace] is allowed: | Whitespace is allowed: | |||
| ```````````````````````````````` example | ```````````````````````````````` example | |||
| <a /><b2 | <a /><b2 | |||
| data="foo" > | data="foo" > | |||
| . | . | |||
| <p><a /><b2 | <p><a /><b2 | |||
| data="foo" ></p> | data="foo" ></p> | |||
| ```````````````````````````````` | ```````````````````````````````` | |||
| With attributes: | With attributes: | |||
| skipping to change at line 8429 ¶ | skipping to change at line 8472 ¶ | |||
| ```````````````````````````````` | ```````````````````````````````` | |||
| Illegal attribute values: | Illegal attribute values: | |||
| ```````````````````````````````` example | ```````````````````````````````` example | |||
| <a href="hi'> <a href=hi'> | <a href="hi'> <a href=hi'> | |||
| . | . | |||
| <p>&lt;a href=&quot;hi'&gt; &lt;a href=hi'&gt;</p> | <p>&lt;a href=&quot;hi'&gt; &lt;a href=hi'&gt;</p> | |||
| ```````````````````````````````` | ```````````````````````````````` | |||
| Illegal [whitespace]: | Illegal whitespace: | |||
| ```````````````````````````````` example | ```````````````````````````````` example | |||
| < a>< | < a>< | |||
| foo><bar/ > | foo><bar/ > | |||
| <foo bar=baz | <foo bar=baz | |||
| bim!bop /> | bim!bop /> | |||
| . | . | |||
| <p>&lt; a&gt;&lt; | <p>&lt; a&gt;&lt; | |||
| foo&gt;&lt;bar/ &gt; | foo&gt;&lt;bar/ &gt; | |||
| &lt;foo bar=baz | &lt;foo bar=baz | |||
| bim!bop /&gt;</p> | bim!bop /&gt;</p> | |||
| ```````````````````````````````` | ```````````````````````````````` | |||
| Missing [whitespace]: | Missing whitespace: | |||
| ```````````````````````````````` example | ```````````````````````````````` example | |||
| <a href='bar'title=title> | <a href='bar'title=title> | |||
| . | . | |||
| <p>&lt;a href='bar'title=title&gt;</p> | <p>&lt;a href='bar'title=title&gt;</p> | |||
| ```````````````````````````````` | ```````````````````````````````` | |||
| Closing tags: | Closing tags: | |||
| ```````````````````````````````` example | ```````````````````````````````` example | |||
| skipping to change at line 8543 ¶ | skipping to change at line 8586 ¶ | |||
| ```````````````````````````````` | ```````````````````````````````` | |||
| ```````````````````````````````` example | ```````````````````````````````` example | |||
| <a href="\""> | <a href="\""> | |||
| . | . | |||
| <p>&lt;a href=&quot;&quot;&quot;&gt;</p> | <p>&lt;a href=&quot;&quot;&quot;&gt;</p> | |||
| ```````````````````````````````` | ```````````````````````````````` | |||
| ## Hard line breaks | ## Hard line breaks | |||
| A line break (not in a code span or HTML tag) that is preceded | A line ending (not in a code span or HTML tag) that is preceded | |||
| by two or more spaces and does not occur at the end of a block | by two or more spaces and does not occur at the end of a block | |||
| is parsed as a [hard line break](@) (rendered | is parsed as a [hard line break](@) (rendered | |||
| in HTML as a `<br />` tag): | in HTML as a `<br />` tag): | |||
| ```````````````````````````````` example | ```````````````````````````````` example | |||
| foo | foo | |||
| baz | baz | |||
| . | . | |||
| <p>foo<br /> | <p>foo<br /> | |||
| baz</p> | baz</p> | |||
| ```````````````````````````````` | ```````````````````````````````` | |||
| For a more visible alternative, a backslash before the | For a more visible alternative, a backslash before the | |||
| [line ending] may be used instead of two spaces: | [line ending] may be used instead of two or more spaces: | |||
| ```````````````````````````````` example | ```````````````````````````````` example | |||
| foo\ | foo\ | |||
| baz | baz | |||
| . | . | |||
| <p>foo<br /> | <p>foo<br /> | |||
| baz</p> | baz</p> | |||
| ```````````````````````````````` | ```````````````````````````````` | |||
| More than two spaces can be used: | More than two spaces can be used: | |||
| skipping to change at line 8595 ¶ | skipping to change at line 8638 ¶ | |||
| ```````````````````````````````` | ```````````````````````````````` | |||
| ```````````````````````````````` example | ```````````````````````````````` example | |||
| foo\ | foo\ | |||
| bar | bar | |||
| . | . | |||
| <p>foo<br /> | <p>foo<br /> | |||
| bar</p> | bar</p> | |||
| ```````````````````````````````` | ```````````````````````````````` | |||
| Line breaks can occur inside emphasis, links, and other constructs | Hard line breaks can occur inside emphasis, links, and other constructs | |||
| that allow inline content: | that allow inline content: | |||
| ```````````````````````````````` example | ```````````````````````````````` example | |||
| *foo | *foo | |||
| bar* | bar* | |||
| . | . | |||
| <p><em>foo<br /> | <p><em>foo<br /> | |||
| bar</em></p> | bar</em></p> | |||
| ```````````````````````````````` | ```````````````````````````````` | |||
| ```````````````````````````````` example | ```````````````````````````````` example | |||
| *foo\ | *foo\ | |||
| bar* | bar* | |||
| . | . | |||
| <p><em>foo<br /> | <p><em>foo<br /> | |||
| bar</em></p> | bar</em></p> | |||
| ```````````````````````````````` | ```````````````````````````````` | |||
| Line breaks do not occur inside code spans | Hard line breaks do not occur inside code spans | |||
| ```````````````````````````````` example | ```````````````````````````````` example | |||
| `code | `code | |||
| span` | span` | |||
| . | . | |||
| <p><code>code span</code></p> | <p><code>code span</code></p> | |||
| ```````````````````````````````` | ```````````````````````````````` | |||
| ```````````````````````````````` example | ```````````````````````````````` example | |||
| `code\ | `code\ | |||
| span` | span` | |||
| . | . | |||
| <p><code>code\ span</code></p> | <p><code>code\ span</code></p> | |||
| ```````````````````````````````` | ```````````````````````````````` | |||
| or HTML tags: | or HTML tags: | |||
| skipping to change at line 8678 ¶ | skipping to change at line 8721 ¶ | |||
| ```````````````````````````````` | ```````````````````````````````` | |||
| ```````````````````````````````` example | ```````````````````````````````` example | |||
| ### foo | ### foo | |||
| . | . | |||
| <h3>foo</h3> | <h3>foo</h3> | |||
| ```````````````````````````````` | ```````````````````````````````` | |||
| ## Soft line breaks | ## Soft line breaks | |||
| A regular line break (not in a code span or HTML tag) that is not | A regular line ending (not in a code span or HTML tag) that is not | |||
| preceded by two or more spaces or a backslash is parsed as a | preceded by two or more spaces or a backslash is parsed as a | |||
| [softbreak](@). (A softbreak may be rendered in HTML either as a | [softbreak](@). (A soft line break may be rendered in HTML either as a | |||
| [line ending] or as a space. The result will be the same in | [line ending] or as a space. The result will be the same in | |||
| browsers. In the examples here, a [line ending] will be used.) | browsers. In the examples here, a [line ending] will be used.) | |||
| ```````````````````````````````` example | ```````````````````````````````` example | |||
| foo | foo | |||
| baz | baz | |||
| . | . | |||
| <p>foo | <p>foo | |||
| baz</p> | baz</p> | |||
| ```````````````````````````````` | ```````````````````````````````` | |||
| skipping to change at line 8704 ¶ | skipping to change at line 8747 ¶ | |||
| ```````````````````````````````` example | ```````````````````````````````` example | |||
| foo | foo | |||
| baz | baz | |||
| . | . | |||
| <p>foo | <p>foo | |||
| baz</p> | baz</p> | |||
| ```````````````````````````````` | ```````````````````````````````` | |||
| A conforming parser may render a soft line break in HTML either as a | A conforming parser may render a soft line break in HTML either as a | |||
| line break or as a space. | line ending or as a space. | |||
| A renderer may also provide an option to render soft line breaks | A renderer may also provide an option to render soft line breaks | |||
| as hard line breaks. | as hard line breaks. | |||
| ## Textual content | ## Textual content | |||
| Any characters not given an interpretation by the above rules will | Any characters not given an interpretation by the above rules will | |||
| be parsed as plain textual content. | be parsed as plain textual content. | |||
| ```````````````````````````````` example | ```````````````````````````````` example | |||
| skipping to change at line 8809 ¶ | skipping to change at line 8852 ¶ | |||
| if the block is to remain open. For example, a block quote requires a | if the block is to remain open. For example, a block quote requires a | |||
| `>` character. A paragraph requires a non-blank line. | `>` character. A paragraph requires a non-blank line. | |||
| In this phase we may match all or just some of the open | In this phase we may match all or just some of the open | |||
| blocks. But we cannot close unmatched blocks yet, because we may have a | blocks. But we cannot close unmatched blocks yet, because we may have a | |||
| [lazy continuation line]. | [lazy continuation line]. | |||
| 2. Next, after consuming the continuation markers for existing | 2. Next, after consuming the continuation markers for existing | |||
| blocks, we look for new block starts (e.g. `>` for a block quote). | blocks, we look for new block starts (e.g. `>` for a block quote). | |||
| If we encounter a new block start, we close any blocks unmatched | If we encounter a new block start, we close any blocks unmatched | |||
| in step 1 before creating the new block as a child of the last | in step 1 before creating the new block as a child of the last | |||
| matched block. | matched container block. | |||
| 3. Finally, we look at the remainder of the line (after block | 3. Finally, we look at the remainder of the line (after block | |||
| markers like `>`, list markers, and indentation have been consumed). | markers like `>`, list markers, and indentation have been consumed). | |||
| This is text that can be incorporated into the last open | This is text that can be incorporated into the last open | |||
| block (a paragraph, code block, heading, or raw HTML). | block (a paragraph, code block, heading, or raw HTML). | |||
| Setext headings are formed when we see a line of a paragraph | Setext headings are formed when we see a line of a paragraph | |||
| that is a [setext heading underline]. | that is a [setext heading underline]. | |||
| Reference link definitions are detected when a paragraph is closed; | Reference link definitions are detected when a paragraph is closed; | |||
| skipping to change at line 9025 ¶ | skipping to change at line 9068 ¶ | |||
| Parameter `stack_bottom` sets a lower bound to how far we | Parameter `stack_bottom` sets a lower bound to how far we | |||
| descend in the [delimiter stack]. If it is NULL, we can | descend in the [delimiter stack]. If it is NULL, we can | |||
| go all the way to the bottom. Otherwise, we stop before | go all the way to the bottom. Otherwise, we stop before | |||
| visiting `stack_bottom`. | visiting `stack_bottom`. | |||
| Let `current_position` point to the element on the [delimiter stack] | Let `current_position` point to the element on the [delimiter stack] | |||
| just above `stack_bottom` (or the first element if `stack_bottom` | just above `stack_bottom` (or the first element if `stack_bottom` | |||
| is NULL). | is NULL). | |||
| We keep track of the `openers_bottom` for each delimiter | We keep track of the `openers_bottom` for each delimiter | |||
| type (`*`, `_`) and each length of the closing delimiter run | type (`*`, `_`), indexed to the length of the closing delimiter run | |||
| (modulo 3). Initialize this to `stack_bottom`. | (modulo 3) and to whether the closing delimiter can also be an | |||
| opener. Initialize this to `stack_bottom`. | ||||
| Then we repeat the following until we run out of potential | Then we repeat the following until we run out of potential | |||
| closers: | closers: | |||
| - Move `current_position` forward in the delimiter stack (if needed) | - Move `current_position` forward in the delimiter stack (if needed) | |||
| until we find the first potential closer with delimiter `*` or `_`. | until we find the first potential closer with delimiter `*` or `_`. | |||
| (This will be the potential closer closest | (This will be the potential closer closest | |||
| to the beginning of the input -- the first one in parse order.) | to the beginning of the input -- the first one in parse order.) | |||
| - Now, look back in the stack (staying above `stack_bottom` and | - Now, look back in the stack (staying above `stack_bottom` and | |||
| End of changes. 151 change blocks. | ||||
| 551 lines changed or deleted | 595 lines changed or added | |||
This html diff was produced by rfcdiff 1.45. The latest version is available from http://tools.ietf.org/tools/rfcdiff/ | ||||
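
The hard and soft line break rules shown in the diff above can be sketched in a few lines of Python. This is an illustrative sketch only, not code from the spec or from any reference implementation: the function name `classify_line_endings` is hypothetical, and the sketch assumes the input is the raw text of a single paragraph with no code spans or HTML tags, where those rules do not apply.

```python
from typing import List


def classify_line_endings(paragraph: str) -> List[str]:
    """Label each interior line ending of a paragraph as 'hard' or 'soft'.

    Assumption: the text contains no code spans or raw HTML tags, since
    line endings inside those are never hard line breaks.
    """
    lines = paragraph.split("\n")
    kinds = []
    # The last line ends the block, so it can never produce a line break.
    for line in lines[:-1]:
        # An odd number of trailing backslashes means the final one is
        # unescaped and therefore marks a hard line break.
        trailing_backslashes = len(line) - len(line.rstrip("\\"))
        if line.endswith("  ") or trailing_backslashes % 2 == 1:
            kinds.append("hard")
        else:
            kinds.append("soft")
    return kinds
```

A renderer built on this would emit `<br />` followed by a line ending for each `hard` entry, and either a line ending or a space for each `soft` entry. Because the loop skips the final line, trailing spaces or a trailing backslash at the end of the block do not yield a break.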