| spec.txt | spec.txt | |||
|---|---|---|---|---|
| --- | --- | |||
| title: CommonMark Spec | title: CommonMark Spec | |||
| author: | author: | |||
| - John MacFarlane | - John MacFarlane | |||
| version: 0.10 | version: 0.11 | |||
| date: 2014-11-06 | date: 2014-11-10 | |||
| ... | ... | |||
| # Introduction | # Introduction | |||
| ## What is Markdown? | ## What is Markdown? | |||
| Markdown is a plain text format for writing structured documents, | Markdown is a plain text format for writing structured documents, | |||
| based on conventions used for indicating formatting in email and | based on conventions used for indicating formatting in email and | |||
| Usenet posts. It was developed in 2004 by John Gruber, who wrote | Usenet posts. It was developed in 2004 by John Gruber, who wrote | |||
| the first Markdown-to-HTML converter in Perl, and it soon became | the first Markdown-to-HTML converter in Perl, and it soon became | |||
| skipping to change at line 194 | skipping to change at line 194 | |||
| This document is generated from a text file, `spec.txt`, written | This document is generated from a text file, `spec.txt`, written | |||
| in Markdown with a small extension for the side-by-side tests. | in Markdown with a small extension for the side-by-side tests. | |||
| The script `spec2md.pl` can be used to turn `spec.txt` into pandoc | The script `spec2md.pl` can be used to turn `spec.txt` into pandoc | |||
| Markdown, which can then be converted into other formats. | Markdown, which can then be converted into other formats. | |||
| In the examples, the `→` character is used to represent tabs. | In the examples, the `→` character is used to represent tabs. | |||
| # Preprocessing | # Preprocessing | |||
| A [line](#line) <a id="line"></a> | A [line](@line) | |||
| is a sequence of zero or more [characters](#character) followed by a | is a sequence of zero or more [characters](#character) followed by a | |||
| line ending (CR, LF, or CRLF) or by the end of file. | line ending (CR, LF, or CRLF) or by the end of file. | |||
| A [character](#character)<a id="character"></a> is a Unicode code point. | A [character](@character) is a Unicode code point. | |||
| This spec does not specify an encoding; it thinks of lines as composed | This spec does not specify an encoding; it thinks of lines as composed | |||
| of characters rather than bytes. A conforming parser may be limited | of characters rather than bytes. A conforming parser may be limited | |||
| to a certain encoding. | to a certain encoding. | |||
| Tabs in lines are expanded to spaces, with a tab stop of 4 characters: | Tabs in lines are expanded to spaces, with a tab stop of 4 characters: | |||
| . | . | |||
| →foo→baz→→bim | →foo→baz→→bim | |||
| . | . | |||
| <pre><code>foo baz     bim | <pre><code>foo baz     bim | |||
| skipping to change at line 224 | skipping to change at line 224 | |||
| ὐ→a | ὐ→a | |||
| . | . | |||
| <pre><code>a   a | <pre><code>a   a | |||
| ὐ   a | ὐ   a | |||
| </code></pre> | </code></pre> | |||
| . | . | |||
| Line endings are replaced by newline characters (LF). | Line endings are replaced by newline characters (LF). | |||
| A line containing no characters, or a line containing only spaces (after | A line containing no characters, or a line containing only spaces (after | |||
| tab expansion), is called a [blank line](#blank-line). | tab expansion), is called a [blank line](@blank-line). | |||
| <a id="blank-line"></a> | ||||
| # Blocks and inlines | # Blocks and inlines | |||
| We can think of a document as a sequence of [blocks](#block)<a | We can think of a document as a sequence of | |||
| id="block"></a>---structural elements like paragraphs, block quotations, | [blocks](@block)---structural | |||
| elements like paragraphs, block quotations, | ||||
| lists, headers, rules, and code blocks. Blocks can contain other | lists, headers, rules, and code blocks. Blocks can contain other | |||
| blocks, or they can contain [inline](#inline)<a id="inline"></a> content: | blocks, or they can contain [inline](@inline) content: | |||
| words, spaces, links, emphasized text, images, and inline code. | words, spaces, links, emphasized text, images, and inline code. | |||
| ## Precedence | ## Precedence | |||
| Indicators of block structure always take precedence over indicators | Indicators of block structure always take precedence over indicators | |||
| of inline structure. So, for example, the following is a list with | of inline structure. So, for example, the following is a list with | |||
| two items, not a list with one item containing a code span: | two items, not a list with one item containing a code span: | |||
| . | . | |||
| - `one | - `one | |||
| skipping to change at line 263 | skipping to change at line 263 | |||
| paragraphs, headers, and other block constructs can be parsed for inline | paragraphs, headers, and other block constructs can be parsed for inline | |||
| structure. The second step requires information about link reference | structure. The second step requires information about link reference | |||
| definitions that will be available only at the end of the first | definitions that will be available only at the end of the first | |||
| step. Note that the first step requires processing lines in sequence, | step. Note that the first step requires processing lines in sequence, | |||
| but the second can be parallelized, since the inline parsing of | but the second can be parallelized, since the inline parsing of | |||
| one block element does not affect the inline parsing of any other. | one block element does not affect the inline parsing of any other. | |||
| ## Container blocks and leaf blocks | ## Container blocks and leaf blocks | |||
| We can divide blocks into two types: | We can divide blocks into two types: | |||
| [container blocks](#container-block), <a id="container-block"></a> | [container blocks](@container-block), | |||
| which can contain other blocks, and [leaf blocks](#leaf-block), | which can contain other blocks, and [leaf blocks](@leaf-block), | |||
| <a id="leaf-block"></a> which cannot. | which cannot. | |||
| # Leaf blocks | # Leaf blocks | |||
| This section describes the different kinds of leaf block that make up a | This section describes the different kinds of leaf block that make up a | |||
| Markdown document. | Markdown document. | |||
| ## Horizontal rules | ## Horizontal rules | |||
| A line consisting of 0-3 spaces of indentation, followed by a sequence | A line consisting of 0-3 spaces of indentation, followed by a sequence | |||
| of three or more matching `-`, `_`, or `*` characters, each followed | of three or more matching `-`, `_`, or `*` characters, each followed | |||
| optionally by any number of spaces, forms a [horizontal | optionally by any number of spaces, forms a [horizontal | |||
| rule](#horizontal-rule). <a id="horizontal-rule"></a> | rule](@horizontal-rule). | |||
| . | . | |||
| *** | *** | |||
| --- | --- | |||
| ___ | ___ | |||
| . | . | |||
| <hr /> | <hr /> | |||
| <hr /> | <hr /> | |||
| <hr /> | <hr /> | |||
| . | . | |||
| skipping to change at line 477 | skipping to change at line 477 | |||
| - * * * | - * * * | |||
| . | . | |||
| <ul> | <ul> | |||
| <li>Foo</li> | <li>Foo</li> | |||
| <li><hr /></li> | <li><hr /></li> | |||
| </ul> | </ul> | |||
| . | . | |||
| ## ATX headers | ## ATX headers | |||
| An [ATX header](#atx-header) <a id="atx-header"></a> | An [ATX header](@atx-header) | |||
| consists of a string of characters, parsed as inline content, between an | consists of a string of characters, parsed as inline content, between an | |||
| opening sequence of 1--6 unescaped `#` characters and an optional | opening sequence of 1--6 unescaped `#` characters and an optional | |||
| closing sequence of any number of `#` characters. The opening sequence | closing sequence of any number of `#` characters. The opening sequence | |||
| of `#` characters cannot be followed directly by a nonspace character. | of `#` characters cannot be followed directly by a nonspace character. | |||
| The optional closing sequence of `#`s must be preceded by a space and may be | The optional closing sequence of `#`s must be preceded by a space and may be | |||
| followed by spaces only. The opening `#` character may be indented 0-3 | followed by spaces only. The opening `#` character may be indented 0-3 | |||
| spaces. The raw contents of the header are stripped of leading and | spaces. The raw contents of the header are stripped of leading and | |||
| trailing spaces before being parsed as inline content. The header level | trailing spaces before being parsed as inline content. The header level | |||
| is equal to the number of `#` characters in the opening sequence. | is equal to the number of `#` characters in the opening sequence. | |||
| skipping to change at line 675 | skipping to change at line 675 | |||
| # | # | |||
| ### ### | ### ### | |||
| . | . | |||
| <h2></h2> | <h2></h2> | |||
| <h1></h1> | <h1></h1> | |||
| <h3></h3> | <h3></h3> | |||
| . | . | |||
| ## Setext headers | ## Setext headers | |||
| A [setext header](#setext-header) <a id="setext-header"></a> | A [setext header](@setext-header) | |||
| consists of a line of text, containing at least one nonspace character, | consists of a line of text, containing at least one nonspace character, | |||
| with no more than 3 spaces indentation, followed by a [setext header | with no more than 3 spaces indentation, followed by a [setext header | |||
| underline](#setext-header-underline). The line of text must be | underline](#setext-header-underline). The line of text must be | |||
| one that, were it not followed by the setext header underline, | one that, were it not followed by the setext header underline, | |||
| would be interpreted as part of a paragraph: it cannot be a code | would be interpreted as part of a paragraph: it cannot be a code | |||
| block, header, blockquote, horizontal rule, or list. A [setext header | block, header, blockquote, horizontal rule, or list. A [setext header | |||
| underline](#setext-header-underline) <a id="setext-header-underline"></a> | underline](@setext-header-underline) | |||
| is a sequence of `=` characters or a sequence of `-` characters, with no | is a sequence of `=` characters or a sequence of `-` characters, with no | |||
| more than 3 spaces indentation and any number of trailing | more than 3 spaces indentation and any number of trailing | |||
| spaces. The header is a level 1 header if `=` characters are used, and | spaces. The header is a level 1 header if `=` characters are used, and | |||
| a level 2 header if `-` characters are used. The contents of the header | a level 2 header if `-` characters are used. The contents of the header | |||
| are the result of parsing the first line as Markdown inline content. | are the result of parsing the first line as Markdown inline content. | |||
| In general, a setext header need not be preceded or followed by a | In general, a setext header need not be preceded or followed by a | |||
| blank line. However, it cannot interrupt a paragraph, so when a | blank line. However, it cannot interrupt a paragraph, so when a | |||
| setext header comes after a paragraph, a blank line is needed between | setext header comes after a paragraph, a blank line is needed between | |||
| them. | them. | |||
| skipping to change at line 946 | skipping to change at line 946 | |||
| . | . | |||
| \> foo | \> foo | |||
| ------ | ------ | |||
| . | . | |||
| <h2>&gt; foo</h2> | <h2>&gt; foo</h2> | |||
| . | . | |||
| ## Indented code blocks | ## Indented code blocks | |||
| An [indented code block](#indented-code-block) | An [indented code block](@indented-code-block) | |||
| <a id="indented-code-block"></a> is composed of one or more | is composed of one or more | |||
| [indented chunks](#indented-chunk) separated by blank lines. | [indented chunks](#indented-chunk) separated by blank lines. | |||
| An [indented chunk](#indented-chunk) <a id="indented-chunk"></a> | An [indented chunk](@indented-chunk) | |||
| is a sequence of non-blank lines, each indented four or more | is a sequence of non-blank lines, each indented four or more | |||
| spaces. An indented code block cannot interrupt a paragraph, so | spaces. An indented code block cannot interrupt a paragraph, so | |||
| if it occurs before or after a paragraph, there must be an | if it occurs before or after a paragraph, there must be an | |||
| intervening blank line. The contents of the code block are | intervening blank line. The contents of the code block are | |||
| the literal contents of the lines, including trailing newlines, | the literal contents of the lines, including trailing newlines, | |||
| minus four spaces of indentation. An indented code block has no | minus four spaces of indentation. An indented code block has no | |||
| attributes. | attributes. | |||
| . | . | |||
| a simple | a simple | |||
| skipping to change at line 1092 | skipping to change at line 1092 | |||
| . | . | |||
| foo | foo | |||
| . | . | |||
| <pre><code>foo | <pre><code>foo | |||
| </code></pre> | </code></pre> | |||
| . | . | |||
| ## Fenced code blocks | ## Fenced code blocks | |||
| A [code fence](#code-fence) <a id="code-fence"></a> is a sequence | A [code fence](@code-fence) is a sequence | |||
| of at least three consecutive backtick characters (`` ` ``) or | of at least three consecutive backtick characters (`` ` ``) or | |||
| tildes (`~`). (Tildes and backticks cannot be mixed.) | tildes (`~`). (Tildes and backticks cannot be mixed.) | |||
| A [fenced code block](#fenced-code-block) <a id="fenced-code-block"></a> | A [fenced code block](@fenced-code-block) | |||
| begins with a code fence, indented no more than three spaces. | begins with a code fence, indented no more than three spaces. | |||
| The line with the opening code fence may optionally contain some text | The line with the opening code fence may optionally contain some text | |||
| following the code fence; this is trimmed of leading and trailing | following the code fence; this is trimmed of leading and trailing | |||
| spaces and called the [info string](#info-string). | spaces and called the [info string](@info-string). | |||
| <a id="info-string"></a> The info string may not contain any backtick | The info string may not contain any backtick | |||
| characters. (The reason for this restriction is that otherwise | characters. (The reason for this restriction is that otherwise | |||
| some inline code would be incorrectly interpreted as the | some inline code would be incorrectly interpreted as the | |||
| beginning of a fenced code block.) | beginning of a fenced code block.) | |||
| The content of the code block consists of all subsequent lines, until | The content of the code block consists of all subsequent lines, until | |||
| a closing [code fence](#code-fence) of the same type as the code block | a closing [code fence](#code-fence) of the same type as the code block | |||
| began with (backticks or tildes), and with at least as many backticks | began with (backticks or tildes), and with at least as many backticks | |||
| or tildes as the opening code fence. If the leading code fence is | or tildes as the opening code fence. If the leading code fence is | |||
| indented N spaces, then up to N spaces of indentation are removed from | indented N spaces, then up to N spaces of indentation are removed from | |||
| each line of the content (if present). (If a content line is not | each line of the content (if present). (If a content line is not | |||
| skipping to change at line 1451 | skipping to change at line 1451 | |||
| ``` | ``` | |||
| ``` aaa | ``` aaa | |||
| ``` | ``` | |||
| . | . | |||
| <pre><code>``` aaa | <pre><code>``` aaa | |||
| </code></pre> | </code></pre> | |||
| . | . | |||
| ## HTML blocks | ## HTML blocks | |||
| An [HTML block tag](#html-block-tag) <a id="html-block-tag"></a> is | An [HTML block tag](@html-block-tag) is | |||
| an [open tag](#open-tag) or [closing tag](#closing-tag) whose tag | an [open tag](#open-tag) or [closing tag](#closing-tag) whose tag | |||
| name is one of the following (case-insensitive): | name is one of the following (case-insensitive): | |||
| `article`, `header`, `aside`, `hgroup`, `blockquote`, `hr`, `iframe`, | `article`, `header`, `aside`, `hgroup`, `blockquote`, `hr`, `iframe`, | |||
| `body`, `li`, `map`, `button`, `object`, `canvas`, `ol`, `caption`, | `body`, `li`, `map`, `button`, `object`, `canvas`, `ol`, `caption`, | |||
| `output`, `col`, `p`, `colgroup`, `pre`, `dd`, `progress`, `div`, | `output`, `col`, `p`, `colgroup`, `pre`, `dd`, `progress`, `div`, | |||
| `section`, `dl`, `table`, `td`, `dt`, `tbody`, `embed`, `textarea`, | `section`, `dl`, `table`, `td`, `dt`, `tbody`, `embed`, `textarea`, | |||
| `fieldset`, `tfoot`, `figcaption`, `th`, `figure`, `thead`, `footer`, | `fieldset`, `tfoot`, `figcaption`, `th`, `figure`, `thead`, `footer`, | |||
| `tr`, `form`, `ul`, `h1`, `h2`, `h3`, `h4`, `h5`, `h6`, `video`, | `tr`, `form`, `ul`, `h1`, `h2`, `h3`, `h4`, `h5`, `h6`, `video`, | |||
| `script`, `style`. | `script`, `style`. | |||
| An [HTML block](#html-block) <a id="html-block"></a> begins with an | An [HTML block](@html-block) begins with an | |||
| [HTML block tag](#html-block-tag), [HTML comment](#html-comment), | [HTML block tag](#html-block-tag), [HTML comment](#html-comment), | |||
| [processing instruction](#processing-instruction), | [processing instruction](#processing-instruction), | |||
| [declaration](#declaration), or [CDATA section](#cdata-section). | [declaration](#declaration), or [CDATA section](#cdata-section). | |||
| It ends when a [blank line](#blank-line) or the end of the | It ends when a [blank line](#blank-line) or the end of the | |||
| input is encountered. The initial line may be indented up to three | input is encountered. The initial line may be indented up to three | |||
| spaces, and subsequent lines may have any indentation. The contents | spaces, and subsequent lines may have any indentation. The contents | |||
| of the HTML block are interpreted as raw HTML, and will not be escaped | of the HTML block are interpreted as raw HTML, and will not be escaped | |||
| in HTML output. | in HTML output. | |||
| Some simple examples: | Some simple examples: | |||
| skipping to change at line 1736 | skipping to change at line 1736 | |||
| . | . | |||
| Moreover, blank lines are usually not necessary and can be | Moreover, blank lines are usually not necessary and can be | |||
| deleted. The exception is inside `<pre>` tags; here, one can | deleted. The exception is inside `<pre>` tags; here, one can | |||
| replace the blank lines with `&#10;` entities. | replace the blank lines with `&#10;` entities. | |||
| So there is no important loss of expressive power with the new rule. | So there is no important loss of expressive power with the new rule. | |||
| ## Link reference definitions | ## Link reference definitions | |||
| A [link reference definition](#link-reference-definition) | A [link reference definition](@link-reference-definition) | |||
| <a id="link-reference-definition"></a> consists of a [link | consists of a [link | |||
| label](#link-label), indented up to three spaces, followed | label](#link-label), indented up to three spaces, followed | |||
| by a colon (`:`), optional blank space (including up to one | by a colon (`:`), optional blank space (including up to one | |||
| newline), a [link destination](#link-destination), optional | newline), a [link destination](#link-destination), optional | |||
| blank space (including up to one newline), and an optional [link | blank space (including up to one newline), and an optional [link | |||
| title](#link-title), which if it is present must be separated | title](#link-title), which if it is present must be separated | |||
| from the [link destination](#link-destination) by whitespace. | from the [link destination](#link-destination) by whitespace. | |||
| No further non-space characters may occur on the line. | No further non-space characters may occur on the line. | |||
| A [link reference definition](#link-reference-definition) | A [link reference definition](#link-reference-definition) | |||
| does not correspond to a structural element of a document. Instead, it | does not correspond to a structural element of a document. Instead, it | |||
| skipping to change at line 1961 | skipping to change at line 1961 | |||
| > [foo]: /url | > [foo]: /url | |||
| . | . | |||
| <p><a href="/url">foo</a></p> | <p><a href="/url">foo</a></p> | |||
| <blockquote> | <blockquote> | |||
| </blockquote> | </blockquote> | |||
| . | . | |||
| ## Paragraphs | ## Paragraphs | |||
| A sequence of non-blank lines that cannot be interpreted as other | A sequence of non-blank lines that cannot be interpreted as other | |||
| kinds of blocks forms a [paragraph](#paragraph).<a id="paragraph"></a> | kinds of blocks forms a [paragraph](@paragraph). | |||
| The contents of the paragraph are the result of parsing the | The contents of the paragraph are the result of parsing the | |||
| paragraph's raw content as inlines. The paragraph's raw content | paragraph's raw content as inlines. The paragraph's raw content | |||
| is formed by concatenating the lines and removing initial and final | is formed by concatenating the lines and removing initial and final | |||
| spaces. | spaces. | |||
| A simple example with two paragraphs: | A simple example with two paragraphs: | |||
| . | . | |||
| aaa | aaa | |||
| skipping to change at line 2100 | skipping to change at line 2100 | |||
| > with these blocks as its content. | > with these blocks as its content. | |||
| So, we explain what counts as a block quote or list item by explaining | So, we explain what counts as a block quote or list item by explaining | |||
| how these can be *generated* from their contents. This should suffice | how these can be *generated* from their contents. This should suffice | |||
| to define the syntax, although it does not give a recipe for *parsing* | to define the syntax, although it does not give a recipe for *parsing* | |||
| these constructions. (A recipe is provided below in the section entitled | these constructions. (A recipe is provided below in the section entitled | |||
| [A parsing strategy](#appendix-a-a-parsing-strategy).) | [A parsing strategy](#appendix-a-a-parsing-strategy).) | |||
| ## Block quotes | ## Block quotes | |||
| A [block quote marker](#block-quote-marker) <a id="block-quote-marker"></a> | A [block quote marker](@block-quote-marker) | |||
| consists of 0-3 spaces of initial indent, plus (a) the character `>` together | consists of 0-3 spaces of initial indent, plus (a) the character `>` together | |||
| with a following space, or (b) a single character `>` not followed by a space. | with a following space, or (b) a single character `>` not followed by a space. | |||
| The following rules define [block quotes](#block-quote): | The following rules define [block quotes](@block-quote): | |||
| <a id="block-quote"></a> | ||||
| 1. **Basic case.** If a string of lines *Ls* constitute a sequence | 1. **Basic case.** If a string of lines *Ls* constitute a sequence | |||
| of blocks *Bs*, then the result of prepending a [block quote | of blocks *Bs*, then the result of prepending a [block quote | |||
| marker](#block-quote-marker) to the beginning of each line in *Ls* | marker](#block-quote-marker) to the beginning of each line in *Ls* | |||
| is a [block quote](#block-quote) containing *Bs*. | is a [block quote](#block-quote) containing *Bs*. | |||
| 2. **Laziness.** If a string of lines *Ls* constitute a [block | 2. **Laziness.** If a string of lines *Ls* constitute a [block | |||
| quote](#block-quote) with contents *Bs*, then the result of deleting | quote](#block-quote) with contents *Bs*, then the result of deleting | |||
| the initial [block quote marker](#block-quote-marker) from one or | the initial [block quote marker](#block-quote-marker) from one or | |||
| more lines in which the next non-space character after the [block | more lines in which the next non-space character after the [block | |||
| quote marker](#block-quote-marker) is [paragraph continuation | quote marker](#block-quote-marker) is [paragraph continuation | |||
| text](#paragraph-continuation-text) is a block quote with *Bs* as | text](#paragraph-continuation-text) is a block quote with *Bs* as | |||
| its content. <a id="paragraph-continuation-text"></a> | its content. | |||
| [Paragraph continuation text](#paragraph-continuation-text) is text | [Paragraph continuation text](@paragraph-continuation-text) is text | |||
| that will be parsed as part of the content of a paragraph, but does | that will be parsed as part of the content of a paragraph, but does | |||
| not occur at the beginning of the paragraph. | not occur at the beginning of the paragraph. | |||
| 3. **Consecutiveness.** A document cannot contain two [block | 3. **Consecutiveness.** A document cannot contain two [block | |||
| quotes](#block-quote) in a row unless there is a [blank | quotes](#block-quote) in a row unless there is a [blank | |||
| line](#blank-line) between them. | line](#blank-line) between them. | |||
| Nothing else counts as a [block quote](#block-quote). | Nothing else counts as a [block quote](#block-quote). | |||
| Here is a simple example: | Here is a simple example: | |||
| skipping to change at line 2461 | skipping to change at line 2460 | |||
| <pre><code>code | <pre><code>code | |||
| </code></pre> | </code></pre> | |||
| </blockquote> | </blockquote> | |||
| <blockquote> | <blockquote> | |||
| <p>not code</p> | <p>not code</p> | |||
| </blockquote> | </blockquote> | |||
| . | . | |||
| ## List items | ## List items | |||
| A [list marker](#list-marker) <a id="list-marker"></a> is a | A [list marker](@list-marker) is a | |||
| [bullet list marker](#bullet-list-marker) or an [ordered list | [bullet list marker](#bullet-list-marker) or an [ordered list | |||
| marker](#ordered-list-marker). | marker](#ordered-list-marker). | |||
| A [bullet list marker](#bullet-list-marker) <a id="bullet-list-marker"></a> | A [bullet list marker](@bullet-list-marker) | |||
| is a `-`, `+`, or `*` character. | is a `-`, `+`, or `*` character. | |||
| An [ordered list marker](#ordered-list-marker) <a id="ordered-list-marker"></a> | An [ordered list marker](@ordered-list-marker) | |||
| is a sequence of one or more digits (`0-9`), followed by either a | is a sequence of one or more digits (`0-9`), followed by either a | |||
| `.` character or a `)` character. | `.` character or a `)` character. | |||
| The following rules define [list items](#list-item):<a | The following rules define [list items](@list-item): | |||
| id="list-item"></a> | ||||
| 1. **Basic case.** If a sequence of lines *Ls* constitute a sequence of | 1. **Basic case.** If a sequence of lines *Ls* constitute a sequence of | |||
| blocks *Bs* starting with a non-space character and not separated | blocks *Bs* starting with a non-space character and not separated | |||
| from each other by more than one blank line, and *M* is a list | from each other by more than one blank line, and *M* is a list | |||
| marker of width *W* followed by 0 < *N* < 5 spaces, then the result | marker of width *W* followed by 0 < *N* < 5 spaces, then the result | |||
| of prepending *M* and the following spaces to the first line of | of prepending *M* and the following spaces to the first line of | |||
| *Ls*, and indenting subsequent lines of *Ls* by *W + N* spaces, is a | *Ls*, and indenting subsequent lines of *Ls* by *W + N* spaces, is a | |||
| list item with *Bs* as its contents. The type of the list item | list item with *Bs* as its contents. The type of the list item | |||
| (bullet or ordered) is determined by the type of its list marker. | (bullet or ordered) is determined by the type of its list marker. | |||
| If the list item is ordered, then it is also assigned a start | If the list item is ordered, then it is also assigned a start | |||
| skipping to change at line 2919 | skipping to change at line 2917 | |||
| > A block quote. | > A block quote. | |||
| </code></pre> | </code></pre> | |||
| . | . | |||
| 4. **Laziness.** If a string of lines *Ls* constitute a [list | 4. **Laziness.** If a string of lines *Ls* constitute a [list | |||
| item](#list-item) with contents *Bs*, then the result of deleting | item](#list-item) with contents *Bs*, then the result of deleting | |||
| some or all of the indentation from one or more lines in which the | some or all of the indentation from one or more lines in which the | |||
| next non-space character after the indentation is | next non-space character after the indentation is | |||
| [paragraph continuation text](#paragraph-continuation-text) is a | [paragraph continuation text](#paragraph-continuation-text) is a | |||
| list item with the same contents and attributes.<a | list item with the same contents and attributes. The unindented | |||
| id="lazy-continuation-line"></a> | lines are called | |||
| [lazy continuation lines](@lazy-continuation-line). | ||||
| Here is an example with [lazy continuation | Here is an example with [lazy continuation | |||
| lines](#lazy-continuation-line): | lines](#lazy-continuation-line): | |||
| . | . | |||
| 1. A paragraph | 1. A paragraph | |||
| with two lines. | with two lines. | |||
| indented code | indented code | |||
| skipping to change at line 3296 | skipping to change at line 3295 | |||
| The one case that needs special treatment is a list item that *starts* | The one case that needs special treatment is a list item that *starts* | |||
| with indented code. How much indentation is required in that case, since | with indented code. How much indentation is required in that case, since | |||
| we don't have a "first paragraph" to measure from? Rule #2 simply stipulates | we don't have a "first paragraph" to measure from? Rule #2 simply stipulates | |||
| that in such cases, we require one space indentation from the list marker | that in such cases, we require one space indentation from the list marker | |||
| (and then the normal four spaces for the indented code). This will match the | (and then the normal four spaces for the indented code). This will match the | |||
| four-space rule in cases where the list marker plus its initial indentation | four-space rule in cases where the list marker plus its initial indentation | |||
| takes four spaces (a common case), but diverge in other cases. | takes four spaces (a common case), but diverge in other cases. | |||
| ## Lists | ## Lists | |||
| A [list](#list) <a id="list"></a> is a sequence of one or more | A [list](@list) is a sequence of one or more | |||
| list items [of the same type](#of-the-same-type). The list items | list items [of the same type](#of-the-same-type). The list items | |||
| may be separated by single [blank lines](#blank-line), but two | may be separated by single [blank lines](#blank-line), but two | |||
| blank lines end all containing lists. | blank lines end all containing lists. | |||
| Two list items are [of the same type](#of-the-same-type) | Two list items are [of the same type](@of-the-same-type) | |||
| <a id="of-the-same-type"></a> if they begin with a [list | if they begin with a [list | |||
| marker](#list-marker) of the same type. Two list markers are of the | marker](#list-marker) of the same type. Two list markers are of the | |||
| same type if (a) they are bullet list markers using the same character | same type if (a) they are bullet list markers using the same character | |||
| (`-`, `+`, or `*`) or (b) they are ordered list numbers with the same | (`-`, `+`, or `*`) or (b) they are ordered list numbers with the same | |||
| delimiter (either `.` or `)`). | delimiter (either `.` or `)`). | |||
| A list is an [ordered list](#ordered-list) <a id="ordered-list"></a> | A list is an [ordered list](@ordered-list) | |||
| if its constituent list items begin with | if its constituent list items begin with | |||
| [ordered list markers](#ordered-list-marker), and a [bullet | [ordered list markers](#ordered-list-marker), and a [bullet | |||
| list](#bullet-list) <a id="bullet-list"></a> if its constituent list | list](@bullet-list) if its constituent list | |||
| items begin with [bullet list markers](#bullet-list-marker). | items begin with [bullet list markers](#bullet-list-marker). | |||
| The [start number](#start-number) <a id="start-number"></a> | The [start number](@start-number) | |||
| of an [ordered list](#ordered-list) is determined by the list number of | of an [ordered list](#ordered-list) is determined by the list number of | |||
| its initial list item. The numbers of subsequent list items are | its initial list item. The numbers of subsequent list items are | |||
| disregarded. | disregarded. | |||
| A list is [loose](#loose)<a id="loose"></a> if any of its constituent | A list is [loose](@loose) if any of its constituent | |||
| list items are separated by blank lines, or if any of its constituent | list items are separated by blank lines, or if any of its constituent | |||
| list items directly contain two block-level elements with a blank line | list items directly contain two block-level elements with a blank line | |||
| between them. Otherwise a list is [tight](#tight).<a id="tight"></a> | between them. Otherwise a list is [tight](@tight). | |||
| (The difference in HTML output is that paragraphs in a loose list are | (The difference in HTML output is that paragraphs in a loose list are | |||
| wrapped in `<p>` tags, while paragraphs in a tight list are not.) | wrapped in `<p>` tags, while paragraphs in a tight list are not.) | |||
| Changing the bullet or ordered list delimiter starts a new list: | Changing the bullet or ordered list delimiter starts a new list: | |||
| . | . | |||
| - foo | - foo | |||
| - bar | - bar | |||
| + baz | + baz | |||
| . | . | |||
| skipping to change at line 3400 | skipping to change at line 3399 | |||
| First, it is natural and not uncommon for people to start lists without | First, it is natural and not uncommon for people to start lists without | |||
| blank lines: | blank lines: | |||
| I need to buy | I need to buy | |||
| - new shoes | - new shoes | |||
| - a coat | - a coat | |||
| - a plane ticket | - a plane ticket | |||
| Second, we are attracted to a | Second, we are attracted to a | |||
| > [principle of uniformity](#principle-of-uniformity):<a | > [principle of uniformity](@principle-of-uniformity): | |||
| > id="principle-of-uniformity"></a> if a span of text has a certain | > if a span of text has a certain | |||
| > meaning, it will continue to have the same meaning when put into a list | > meaning, it will continue to have the same meaning when put into a list | |||
| > item. | > item. | |||
| (Indeed, the spec for [list items](#list-item) presupposes this.) | (Indeed, the spec for [list items](#list-item) presupposes this.) | |||
| This principle implies that if | This principle implies that if | |||
| * I need to buy | * I need to buy | |||
| - new shoes | - new shoes | |||
| - a coat | - a coat | |||
| - a plane ticket | - a plane ticket | |||
| skipping to change at line 3919 | skipping to change at line 3918 | |||
| With the goal of making this standard as HTML-agnostic as possible, all | With the goal of making this standard as HTML-agnostic as possible, all | |||
| valid HTML entities in any context are recognized as such and | valid HTML entities in any context are recognized as such and | |||
| converted into unicode characters before they are stored in the AST. | converted into unicode characters before they are stored in the AST. | |||
| This allows implementations that target HTML output to trivially escape | This allows implementations that target HTML output to trivially escape | |||
| the entities when generating HTML, and simplifies the job of | the entities when generating HTML, and simplifies the job of | |||
| implementations targeting other languages, as these will only need to | implementations targeting other languages, as these will only need to | |||
| handle the unicode chars and need not be HTML-entity aware. | handle the unicode chars and need not be HTML-entity aware. | |||
| [Named entities](#name-entities) <a id="named-entities"></a> consist of `&` | [Named entities](@name-entities) consist of `&` | |||
| + any of the valid HTML5 entity names + `;`. The | + any of the valid HTML5 entity names + `;`. The | |||
| [following document](http://www.whatwg.org/specs/web-apps/current-work/multipage/entities.json) | [following document](http://www.whatwg.org/specs/web-apps/current-work/multipage/entities.json) | |||
| is used as an authoritative source of the valid entity names and their | is used as an authoritative source of the valid entity names and their | |||
| corresponding codepoints. | corresponding codepoints. | |||
| Conforming implementations that target HTML don't need to generate | Conforming implementations that target HTML don't need to generate | |||
| entities for all the valid named entities that exist, with the exception | entities for all the valid named entities that exist, with the exception | |||
| of `"` (`&quot;`), `&` (`&amp;`), `<` (`&lt;`) and `>` (`&gt;`), which | of `"` (`&quot;`), `&` (`&amp;`), `<` (`&lt;`) and `>` (`&gt;`), which | |||
| always need to be written as entities for security reasons. | always need to be written as entities for security reasons. | |||
| . | . | |||
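| &nbsp; &amp; &copy; &AElig; &Dcaron; &frac34; &HilbertSpace; &DifferentialD; &ClockwiseContourIntegral; | &nbsp; &amp; &copy; &AElig; &Dcaron; &frac34; &HilbertSpace; &DifferentialD; &ClockwiseContourIntegral; | |||
| . | . | |||
| <p> &amp; © Æ Ď ¾ ℋ ⅆ ∲</p> | <p> &amp; © Æ Ď ¾ ℋ ⅆ ∲</p> | |||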
| . | . | |||
| [Decimal entities](#decimal-entities) <a id="decimal-entities"></a> | [Decimal entities](@decimal-entities) | |||
| consist of `&#` + a string of 1--8 arabic digits + `;`. Again, these | consist of `&#` + a string of 1--8 arabic digits + `;`. Again, these | |||
| entities need to be recognised and transformed into their corresponding | entities need to be recognised and transformed into their corresponding | |||
| UTF8 codepoints. Invalid Unicode codepoints will be written as the | UTF8 codepoints. Invalid Unicode codepoints will be written as the | |||
| "unknown codepoint" character (`0xFFFD`). | "unknown codepoint" character (`0xFFFD`). | |||
| . | . | |||
| # Ӓ Ϡ � | # Ӓ Ϡ � | |||
| . | . | |||
| <p># Ӓ Ϡ �</p> | <p># Ӓ Ϡ �</p> | |||
| . | . | |||
| [Hexadecimal entities](#hexadecimal-entities) <a id="hexadecimal-entities"></a> | [Hexadecimal entities](@hexadecimal-entities) | |||
| consist of `&#` + either `X` or `x` + a string of 1-8 hexadecimal digits | consist of `&#` + either `X` or `x` + a string of 1-8 hexadecimal digits | |||
| + `;`. They will also be parsed and turned into their corresponding UTF8 values in the AST. | + `;`. They will also be parsed and turned into their corresponding UTF8 values in the AST. | |||
| . | . | |||
| " ആ ಫ | " ആ ಫ | |||
| . | . | |||
| <p>" ആ ಫ</p> | <p>" ആ ಫ</p> | |||
| . | . | |||
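
The three entity forms described above can be decoded with one regular expression. The sketch below is not the reference implementation: the function name `decode_entities` is hypothetical, it uses the `html5` table from Python's standard `html.entities` module as a stand-in for the entity list cited above, and it assumes that zero, surrogates, and values above `0x10FFFF` count as invalid codepoints.

```python
import re
from html.entities import html5  # maps names such as "amp;" to their characters

REPLACEMENT = "\ufffd"  # the "unknown codepoint" character

_ENTITY = re.compile(
    r"&(?:#[Xx]([0-9A-Fa-f]{1,8})"   # hexadecimal entity
    r"|#([0-9]{1,8})"                # decimal entity
    r"|([A-Za-z][A-Za-z0-9]*));"     # named entity
)


def _from_codepoint(cp):
    # Assumption: 0, surrogates, and out-of-range values are "invalid".
    if cp == 0 or cp > 0x10FFFF or 0xD800 <= cp <= 0xDFFF:
        return REPLACEMENT
    return chr(cp)


def decode_entities(text):
    """Replace recognized entities in text with the characters they denote."""
    def repl(m):
        hex_digits, dec_digits, name = m.groups()
        if hex_digits is not None:
            return _from_codepoint(int(hex_digits, 16))
        if dec_digits is not None:
            return _from_codepoint(int(dec_digits, 10))
        # Unknown names are left untouched, so they remain literal text.
        return html5.get(name + ";", m.group(0))
    return _ENTITY.sub(repl, text)
```

An HTML renderer would then escape `"`, `&`, `<` and `>` again on output; that step is not shown here.
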
| Here are some nonentities: | Here are some nonentities: | |||
| skipping to change at line 4035 | skipping to change at line 4034 | |||
| . | . | |||
| f&ouml;f&ouml; | f&ouml;f&ouml; | |||
| . | . | |||
| <pre><code>f&amp;ouml;f&amp;ouml; | <pre><code>f&amp;ouml;f&amp;ouml; | |||
| </code></pre> | </code></pre> | |||
| . | . | |||
| ## Code span | ## Code span | |||
| A [backtick string](#backtick-string) <a id="backtick-string"></a> | A [backtick string](@backtick-string) | |||
| is a string of one or more backtick characters (`` ` ``) that is neither | is a string of one or more backtick characters (`` ` ``) that is neither | |||
| preceded nor followed by a backtick. | preceded nor followed by a backtick. | |||
| A code span begins with a backtick string and ends with a backtick | A [code span](@code-span) begins with a backtick string and ends with a backtick | |||
| string of equal length. The contents of the code span are the | string of equal length. The contents of the code span are the | |||
| characters between the two backtick strings, with leading and trailing | characters between the two backtick strings, with leading and trailing | |||
| spaces and newlines removed, and consecutive spaces and newlines | spaces and newlines removed, and consecutive spaces and newlines | |||
| collapsed to single spaces. | collapsed to single spaces. | |||
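
As a rough illustration of the definition above, the following Python sketch finds the first code span in a string and normalizes its contents. The function names are hypothetical, and the sketch deliberately ignores interactions with other constructs (for example, a backslash-escaped backtick before the opening string).

```python
import re


def code_span_content(raw):
    """Strip leading/trailing spaces and newlines, then collapse inner runs."""
    return re.sub(r"[ \n]+", " ", raw.strip(" \n"))


def first_code_span(text):
    """Return the contents of the first code span in text, or None."""
    # Each match of `+ is a maximal run of backticks, i.e. a backtick string
    # that is neither preceded nor followed by another backtick.
    runs = [(m.start(), m.end()) for m in re.finditer(r"`+", text)]
    for i, (start, end) in enumerate(runs):
        for start2, end2 in runs[i + 1:]:
            if end2 - start2 == end - start:
                # Closing backtick string of equal length found.
                return code_span_content(text[end:start2])
    return None
```

For instance, a span opened with two backticks may contain a single backtick, because only a backtick string of equal length closes it.
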
| This is a simple code span: | This is a simple code span: | |||
| . | . | |||
| `foo` | `foo` | |||
| . | . | |||
| skipping to change at line 4219 | skipping to change at line 4218 | |||
| spans, but users often do not.) | spans, but users often do not.) | |||
| ``` markdown | ``` markdown | |||
| internal emphasis: foo*bar*baz | internal emphasis: foo*bar*baz | |||
| no emphasis: foo_bar_baz | no emphasis: foo_bar_baz | |||
| ``` | ``` | |||
| The following rules capture all of these patterns, while allowing | The following rules capture all of these patterns, while allowing | |||
| for efficient parsing strategies that do not backtrack: | for efficient parsing strategies that do not backtrack: | |||
| 1. A single `*` character [can open emphasis](#can-open-emphasis) | 1. A single `*` character [can open emphasis](@can-open-emphasis) | |||
| <a id="can-open-emphasis"></a> iff it is not followed by | iff it is not followed by | |||
| whitespace. | whitespace. | |||
| 2. A single `_` character [can open emphasis](#can-open-emphasis) iff | 2. A single `_` character [can open emphasis](#can-open-emphasis) iff | |||
| it is not followed by whitespace and it is not preceded by an | it is not followed by whitespace and it is not preceded by an | |||
| ASCII alphanumeric character. | ASCII alphanumeric character. | |||
| 3. A single `*` character [can close emphasis](#can-close-emphasis) | 3. A single `*` character [can close emphasis](@can-close-emphasis) | |||
| <a id="can-close-emphasis"></a> iff it is not preceded by whitespace. | iff it is not preceded by whitespace. | |||
| 4. A single `_` character [can close emphasis](#can-close-emphasis) iff | 4. A single `_` character [can close emphasis](#can-close-emphasis) iff | |||
| it is not preceded by whitespace and it is not followed by an | it is not preceded by whitespace and it is not followed by an | |||
| ASCII alphanumeric character. | ASCII alphanumeric character. | |||
| 5. A double `**` [can open strong emphasis](#can-open-strong-emphasis) | 5. A double `**` [can open strong emphasis](@can-open-strong-emphasis) | |||
| <a id="can-open-strong-emphasis" ></a> iff it is not followed by | iff it is not followed by | |||
| whitespace. | whitespace. | |||
| 6. A double `__` [can open strong emphasis](#can-open-strong-emphasis) | 6. A double `__` [can open strong emphasis](#can-open-strong-emphasis) | |||
| iff it is not followed by whitespace and it is not preceded by an | iff it is not followed by whitespace and it is not preceded by an | |||
| ASCII alphanumeric character. | ASCII alphanumeric character. | |||
| 7. A double `**` [can close strong emphasis](#can-close-strong-emphasis) | 7. A double `**` [can close strong emphasis](@can-close-strong-emphasis) | |||
| <a id="can-close-strong-emphasis" ></a> iff it is not preceded by | iff it is not preceded by | |||
| whitespace. | whitespace. | |||
| 8. A double `__` [can close strong emphasis](#can-close-strong-emphasis) | 8. A double `__` [can close strong emphasis](#can-close-strong-emphasis) | |||
| iff it is not preceded by whitespace and it is not followed by an | iff it is not preceded by whitespace and it is not followed by an | |||
| ASCII alphanumeric character. | ASCII alphanumeric character. | |||
| 9. Emphasis begins with a delimiter that [can open | 9. Emphasis begins with a delimiter that [can open | |||
| emphasis](#can-open-emphasis) and ends with a delimiter that [can close | emphasis](#can-open-emphasis) and ends with a delimiter that [can close | |||
| emphasis](#can-close-emphasis), and that uses the same | emphasis](#can-close-emphasis), and that uses the same | |||
| character (`_` or `*`) as the opening delimiter. There must | character (`_` or `*`) as the opening delimiter. There must | |||
| skipping to change at line 5075 | skipping to change at line 5074 | |||
| . | . | |||
| . | . | |||
| __a<http://foo.bar?q=__> | __a<http://foo.bar?q=__> | |||
| . | . | |||
| <p>__a<a href="http://foo.bar?q=__">http://foo.bar?q=__</a></p> | <p>__a<a href="http://foo.bar?q=__">http://foo.bar?q=__</a></p> | |||
| . | . | |||
| ## Links | ## Links | |||
| A link contains a [link label](#link-label) (the visible text), | A link contains [link text](#link-label) (the visible text), | |||
| a [destination](#destination) (the URI that is the link destination), | a [destination](#destination) (the URI that is the link destination), | |||
| and optionally a [link title](#link-title). There are two basic kinds | and optionally a [link title](#link-title). There are two basic kinds | |||
| of links in Markdown. In [inline links](#inline-links) the destination | of links in Markdown. In [inline links](#inline-links) the destination | |||
| and title are given immediately after the label. In [reference | and title are given immediately after the link text. In [reference | |||
| links](#reference-links) the destination and title are defined elsewhere | links](#reference-links) the destination and title are defined elsewhere | |||
| in the document. | in the document. | |||
| A [link label](#link-label) <a id="link-label"></a> consists of | A [link text](@link-text) consists of a sequence of zero or more | |||
| inline elements enclosed by square brackets (`[` and `]`). The | ||||
| following rules apply: | ||||
| - an opening `[`, followed by | - Links may not contain other links, at any level of nesting. | |||
| - zero or more backtick code spans, autolinks, HTML tags, link labels, | ||||
| backslash-escaped ASCII punctuation characters, or non-`]` characters, | ||||
| followed by | ||||
| - a closing `]`. | ||||
| Links may not contain other links, at any level of nesting. | - Brackets are allowed in the link text only if (a) they are | |||
| | backslash-escaped or (b) they appear as a matched pair of brackets, | |||
| These rules are motivated by the following intuitive ideas: | with an open bracket `[`, a sequence of zero or more inlines, and | |||
| a close bracket `]`. | ||||
| - A link label is a container for inline elements. | - Backtick [code spans](#code-span), [autolinks](#autolink), and | |||
| - The square brackets bind more tightly than emphasis markers, | raw [HTML tags](#html-tag) bind more tightly | |||
| but less tightly than `<>` or `` ` ``. | than the brackets in link text. Thus, for example, | |||
| - Link labels may contain material in matching square brackets. | `` [foo`]` `` could not be a link text, since the second `]` | |||
| is part of a code span. | ||||
| A [link destination](#link-destination) <a id="link-destination"></a> | - The brackets in link text bind more tightly than markers for | |||
| consists of either | [emphasis and strong emphasis](#emphasis-and-strong-emphasis). | |||
| Thus, for example, `*[foo*](url)` is a link. | ||||
| A [link destination](@link-destination) consists of either | ||||
| - a sequence of zero or more characters between an opening `<` and a | - a sequence of zero or more characters between an opening `<` and a | |||
| closing `>` that contains no line breaks or unescaped `<` or `>` | closing `>` that contains no line breaks or unescaped `<` or `>` | |||
| characters, or | characters, or | |||
| - a nonempty sequence of characters that does not include | - a nonempty sequence of characters that does not include | |||
| ASCII space or control characters, and includes parentheses | ASCII space or control characters, and includes parentheses | |||
| only if (a) they are backslash-escaped or (b) they are part of | only if (a) they are backslash-escaped or (b) they are part of | |||
| a balanced pair of unescaped parentheses that is not itself | a balanced pair of unescaped parentheses that is not itself | |||
| inside a balanced pair of unescaped parentheses. | inside a balanced pair of unescaped parentheses. | |||
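
The two forms of link destination can be checked with a small scanner. The Python below is a sketch under stated assumptions rather than a definitive implementation: the name `is_link_destination` is hypothetical, a candidate starting with `<` is always tried as the bracketed form, and "control characters" is taken to mean code points below `0x20` plus `0x7F`.

```python
import string

_PUNCT = set(string.punctuation)  # ASCII punctuation can be backslash-escaped


def _is_bracketed(s):
    # Form 1: <...> with no line breaks and no unescaped < or > inside.
    if "\n" in s or "\r" in s:
        return False
    i = 1
    while i < len(s):
        c = s[i]
        if c == "\\" and i + 1 < len(s) and s[i + 1] in _PUNCT:
            i += 2  # skip the escaped character
            continue
        if c == "<":
            return False
        if c == ">":
            return i == len(s) - 1  # the first unescaped > must close it
        i += 1
    return False


def _is_plain(s):
    # Form 2: nonempty, no spaces or control characters, and parentheses
    # either escaped or balanced to a nesting depth of at most one.
    if not s:
        return False
    depth = 0
    i = 0
    while i < len(s):
        c = s[i]
        if c == " " or ord(c) < 0x20 or ord(c) == 0x7F:
            return False
        if c == "\\" and i + 1 < len(s) and s[i + 1] in _PUNCT:
            i += 2
            continue
        if c == "(":
            depth += 1
            if depth > 1:
                return False  # a balanced pair inside a balanced pair
        elif c == ")":
            depth -= 1
            if depth < 0:
                return False
        i += 1
    return depth == 0


def is_link_destination(s):
    """True if s is a link destination under the two rules above."""
    return _is_bracketed(s) if s.startswith("<") else _is_plain(s)
```
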
| A [link title](#link-title) <a id="link-title"></a> consists of either | A [link title](@link-title) consists of either | |||
| - a sequence of zero or more characters between straight double-quote | - a sequence of zero or more characters between straight double-quote | |||
| characters (`"`), including a `"` character only if it is | characters (`"`), including a `"` character only if it is | |||
| backslash-escaped, or | backslash-escaped, or | |||
| - a sequence of zero or more characters between straight single-quote | - a sequence of zero or more characters between straight single-quote | |||
| characters (`'`), including a `'` character only if it is | characters (`'`), including a `'` character only if it is | |||
| backslash-escaped, or | backslash-escaped, or | |||
| - a sequence of zero or more characters between matching parentheses | - a sequence of zero or more characters between matching parentheses | |||
| (`(...)`), including a `)` character only if it is backslash-escaped. | (`(...)`), including a `)` character only if it is backslash-escaped. | |||
| An [inline link](#inline-link) <a id="inline-link"></a> | An [inline link](@inline-link) | |||
| consists of a [link label](#link-label) followed immediately | consists of a [link text](#link-text) followed immediately | |||
| by a left parenthesis `(`, optional whitespace, | by a left parenthesis `(`, optional whitespace, | |||
| an optional [link destination](#link-destination), | an optional [link destination](#link-destination), | |||
| an optional [link title](#link-title) separated from the link | an optional [link title](#link-title) separated from the link | |||
| destination by whitespace, optional whitespace, and a right | destination by whitespace, optional whitespace, and a right | |||
| parenthesis `)`. The link's text consists of the label (excluding | parenthesis `)`. The link's text consists of the inlines contained | |||
| the enclosing square brackets) parsed as inlines. The link's | in the [link text](#link-text) (excluding the enclosing square brackets). | |||
| URI consists of the link destination, excluding enclosing `<...>` if | The link's URI consists of the link destination, excluding enclosing | |||
| present, with backslash-escapes in effect as described above. The | `<...>` if present, with backslash-escapes in effect as described | |||
| link's title consists of the link title, excluding its enclosing | above. The link's title consists of the link title, excluding its | |||
| delimiters, with backslash-escapes in effect as described above. | enclosing delimiters, with backslash-escapes in effect as described | |||
| above. | ||||
| Here is a simple inline link: | Here is a simple inline link: | |||
| . | . | |||
| [link](/uri "title") | [link](/uri "title") | |||
| . | . | |||
| <p><a href="/uri" title="title">link</a></p> | <p><a href="/uri" title="title">link</a></p> | |||
| . | . | |||
| The title may be omitted: | The title may be omitted: | |||
| skipping to change at line 5310 | skipping to change at line 5315 | |||
| Whitespace is allowed around the destination and title: | Whitespace is allowed around the destination and title: | |||
| . | . | |||
| [link]( /uri | [link]( /uri | |||
| "title" ) | "title" ) | |||
| . | . | |||
| <p><a href="/uri" title="title">link</a></p> | <p><a href="/uri" title="title">link</a></p> | |||
| . | . | |||
| But it is not allowed between the link label and the | But it is not allowed between the link text and the | |||
| following parenthesis: | following parenthesis: | |||
| . | . | |||
| [link] (/uri) | [link] (/uri) | |||
| . | . | |||
| <p>[link] (/uri)</p> | <p>[link] (/uri)</p> | |||
| . | . | |||
| Note that this is not a link, because the closing `]` occurs in | The link text may contain balanced brackets, but not unbalanced ones, | |||
| an HTML tag: | unless they are escaped: | |||
| . | ||||
| [link [foo [bar]]](/uri) | ||||
| . | ||||
| <p><a href="/uri">link [foo [bar]]</a></p> | ||||
| . | ||||
| . | ||||
| [link] bar](/uri) | ||||
| . | ||||
| <p>[link] bar](/uri)</p> | ||||
| . | ||||
| . | ||||
| [link [bar](/uri) | ||||
| . | ||||
| <p>[link <a href="/uri">bar</a></p> | ||||
| . | ||||
| . | ||||
| [link \[bar](/uri) | ||||
| . | ||||
| <p><a href="/uri">link [bar</a></p> | ||||
| . | ||||
| The link text may contain inline content: | ||||
| . | ||||
| [link *foo **bar** `#`*](/uri) | ||||
| . | ||||
| <p><a href="/uri">link <em>foo <strong>bar</strong> <code>#</code></em></a></p> | ||||
| . | ||||
| . | ||||
| [](/uri) | ||||
| . | ||||
| <p><a href="/uri"><img src="moon.jpg" alt="moon" /></a></p> | ||||
| . | ||||
| However, links may not contain other links, at any level of nesting. | ||||
| . | ||||
| [foo [bar](/uri)](/uri) | ||||
| . | ||||
| <p>[foo <a href="/uri">bar</a>](/uri)</p> | ||||
| . | ||||
| . | ||||
| [foo *[bar [baz](/uri)](/uri)*](/uri) | ||||
| . | ||||
| <p>[foo <em>[bar <a href="/uri">baz</a>](/uri)</em>](/uri)</p> | ||||
| . | ||||
| These cases illustrate the precedence of link text grouping over | ||||
| emphasis grouping: | ||||
| . | ||||
| *[foo*](/uri) | ||||
| . | ||||
| <p>*<a href="/uri">foo*</a></p> | ||||
| . | ||||
| . | ||||
| [foo *bar](baz*) | ||||
| . | ||||
| <p><a href="baz*">foo *bar</a></p> | ||||
| . | ||||
| These cases illustrate the precedence of HTML tags, code spans, | ||||
| and autolinks over link grouping: | ||||
| . | . | |||
| [foo <bar attr="](baz)"> | [foo <bar attr="](baz)"> | |||
| . | . | |||
| <p>[foo <bar attr="](baz)"></p> | <p>[foo <bar attr="](baz)"></p> | |||
| . | . | |||
| There are three kinds of [reference links](#reference-link): | . | |||
| <a id="reference-link"></a> | [foo`](/uri)` | |||
| . | ||||
| <p>[foo<code>](/uri)</code></p> | ||||
| . | ||||
| A [full reference link](#full-reference-link) <a id="full-reference-link"></a> | . | |||
| consists of a [link label](#link-label), optional whitespace, and | [foo<http://example.com?search=](uri)> | |||
| another [link label](#link-label) that [matches](#matches) a | . | |||
| <p>[foo<a href="http://example.com?search=%5D(uri)">http://example.com?search=](uri)</a></p> | ||||
| . | ||||
| There are three kinds of [reference links](@reference-link): | ||||
| [full](#full-reference-link), [collapsed](#collapsed-reference-link), | ||||
| and [shortcut](#shortcut-reference-link). | ||||
| A [full reference link](@full-reference-link) | ||||
| consists of a [link text](#link-text), optional whitespace, and | ||||
| a [link label](#link-label) that [matches](#matches) a | ||||
| [link reference definition](#link-reference-definition) elsewhere in the | [link reference definition](#link-reference-definition) elsewhere in the | |||
| document. | document. | |||
| One label [matches](#matches) <a id="matches"></a> | A [link label](@link-label) begins with a left bracket (`[`) and ends | |||
| with the first right bracket (`]`) that is not backslash-escaped. | ||||
| Unescaped square bracket characters are not allowed in | ||||
| [link labels](#link-label). A link label can have at most 999 | ||||
| characters inside the square brackets. | ||||
| One label [matches](@matches) | ||||
| another just in case their normalized forms are equal. To normalize a | another just in case their normalized forms are equal. To normalize a | |||
| label, perform the *unicode case fold* and collapse consecutive internal | label, perform the *unicode case fold* and collapse consecutive internal | |||
| whitespace to a single space. If there are multiple matching reference | whitespace to a single space. If there are multiple matching reference | |||
| link definitions, the one that comes first in the document is used. (It | link definitions, the one that comes first in the document is used. (It | |||
| is desirable in such cases to emit a warning.) | is desirable in such cases to emit a warning.) | |||
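
Label matching as described above amounts to comparing normalized strings. Here is a minimal Python sketch, assuming that `str.casefold` is an acceptable stand-in for the Unicode case fold and that labels are passed without their enclosing brackets; the names `normalize_label` and `build_reference_map` are hypothetical.

```python
import re


def normalize_label(label):
    """Case-fold a label and collapse runs of whitespace to one space."""
    return re.sub(r"\s+", " ", label).casefold()


def build_reference_map(definitions):
    """Map normalized labels to destinations; the first definition wins."""
    refs = {}
    for label, destination in definitions:
        # setdefault keeps the earlier entry when a later label matches it.
        refs.setdefault(normalize_label(label), destination)
    return refs
```

A lookup then normalizes the label used in the link and consults the map, so `[Foo]` and `[FOO]` resolve to the same definition. An implementation that wants to emit the suggested warning could check for an existing key before inserting.
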
| The contents of the first link label are parsed as inlines, which are | The contents of the first link label are parsed as inlines, which are | |||
| used as the link's text. The link's URI and title are provided by the | used as the link's text. The link's URI and title are provided by the | |||
| matching [link reference definition](#link-reference-definition). | matching [link reference definition](#link-reference-definition). | |||
| Here is a simple example: | Here is a simple example: | |||
| . | . | |||
| [foo][bar] | [foo][bar] | |||
| [bar]: /url "title" | [bar]: /url "title" | |||
| . | . | |||
| <p><a href="/url" title="title">foo</a></p> | <p><a href="/url" title="title">foo</a></p> | |||
| . | . | |||
| The first label can contain inline content: | The rules for the [link text](#link-text) are the same as with | |||
| [inline links](#inline-link). Thus: | ||||
| The link text may contain balanced brackets, but not unbalanced ones, | ||||
| unless they are escaped: | ||||
| . | . | |||
| [*foo\!*][bar] | [link [foo [bar]]][ref] | |||
| [bar]: /url "title" | [ref]: /uri | |||
| . | . | |||
| <p><a href="/url" title="title"><em>foo!</em></a></p> | <p><a href="/uri">link [foo [bar]]</a></p> | |||
| . | ||||
| . | ||||
| [link \[bar][ref] | ||||
| [ref]: /uri | ||||
| . | ||||
| <p><a href="/uri">link [bar</a></p> | ||||
| . | ||||
| The link text may contain inline content: | ||||
| . | ||||
| [link *foo **bar** `#`*][ref] | ||||
| [ref]: /uri | ||||
| . | ||||
| <p><a href="/uri">link <em>foo <strong>bar</strong> <code>#</code></em></a></p> | ||||
| . | ||||
| . | ||||
| [][ref] | ||||
| [ref]: /uri | ||||
| . | ||||
| <p><a href="/uri"><img src="moon.jpg" alt="moon" /></a></p> | ||||
| . | ||||
| However, links may not contain other links, at any level of nesting. | ||||
| . | ||||
| [foo [bar](/uri)][ref] | ||||
| [ref]: /uri | ||||
| . | ||||
| <p>[foo <a href="/uri">bar</a>]<a href="/uri">ref</a></p> | ||||
| . | ||||
| . | ||||
| [foo *bar [baz][ref]*][ref] | ||||
| [ref]: /uri | ||||
| . | ||||
| <p>[foo <em>bar <a href="/uri">baz</a></em>]<a href="/uri">ref</a></p> | ||||
| . | ||||
| (In the examples above, we have two [shortcut reference | ||||
| links](#shortcut-reference-link) instead of one [full reference | ||||
| link](#full-reference-link).) | ||||
| The following cases illustrate the precedence of link text grouping over | ||||
| emphasis grouping: | ||||
| . | ||||
| *[foo*][ref] | ||||
| [ref]: /uri | ||||
| . | ||||
| <p>*<a href="/uri">foo*</a></p> | ||||
| . | ||||
| . | ||||
| [foo *bar][ref] | ||||
| [ref]: /uri | ||||
| . | ||||
| <p><a href="/uri">foo *bar</a></p> | ||||
| . | ||||
| These cases illustrate the precedence of HTML tags, code spans, | ||||
| and autolinks over link grouping: | ||||
| . | ||||
| [foo <bar attr="][ref]"> | ||||
| [ref]: /uri | ||||
| . | ||||
| <p>[foo <bar attr="][ref]"></p> | ||||
| . | ||||
| . | ||||
| [foo`][ref]` | ||||
| [ref]: /uri | ||||
| . | ||||
| <p>[foo<code>][ref]</code></p> | ||||
| . | ||||
| . | ||||
| [foo<http://example.com?search=][ref]> | ||||
| [ref]: /uri | ||||
| . | ||||
| <p>[foo<a href="http://example.com?search=%5D%5Bref%5D">http://example.com?search=][ref]</a></p> | ||||
| . | . | |||
| Matching is case-insensitive: | Matching is case-insensitive: | |||
| . | . | |||
| [foo][BaR] | [foo][BaR] | |||
| [bar]: /url "title" | [bar]: /url "title" | |||
| . | . | |||
| <p><a href="/url" title="title">foo</a></p> | <p><a href="/url" title="title">foo</a></p> | |||
| skipping to change at line 5400 | skipping to change at line 5592 | |||
| . | . | |||
| [Foo | [Foo | |||
| bar]: /url | bar]: /url | |||
| [Baz][Foo bar] | [Baz][Foo bar] | |||
| . | . | |||
| <p><a href="/url">Baz</a></p> | <p><a href="/url">Baz</a></p> | |||
| . | . | |||
| There can be whitespace between the two labels: | There can be whitespace between the [link text](#link-text) and the | |||
| [link label](#link-label): | ||||
| . | . | |||
| [foo] [bar] | [foo] [bar] | |||
| [bar]: /url "title" | [bar]: /url "title" | |||
| . | . | |||
| <p><a href="/url" title="title">foo</a></p> | <p><a href="/url" title="title">foo</a></p> | |||
| . | . | |||
| . | . | |||
| skipping to change at line 5444 | skipping to change at line 5637 | |||
| labels define equivalent inline content: | labels define equivalent inline content: | |||
| . | . | |||
| [bar][foo\!] | [bar][foo\!] | |||
| [foo!]: /url | [foo!]: /url | |||
| . | . | |||
| <p>[bar][foo!]</p> | <p>[bar][foo!]</p> | |||
| . | . | |||
| A [collapsed reference link](#collapsed-reference-link) | [Link labels](#link-label) cannot contain brackets, unless they are | |||
| <a id="collapsed-reference-link"></a> consists of a [link | backslash-escaped: | |||
| . | ||||
| [foo][ref[] | ||||
| [ref[]: /uri | ||||
| . | ||||
| <p>[foo][ref[]</p> | ||||
| <p>[ref[]: /uri</p> | ||||
| . | ||||
| . | ||||
| [foo][ref[bar]] | ||||
| [ref[bar]]: /uri | ||||
| . | ||||
| <p>[foo][ref[bar]]</p> | ||||
| <p>[ref[bar]]: /uri</p> | ||||
| . | ||||
| . | ||||
| [[[foo]]] | ||||
| [[[foo]]]: /url | ||||
| . | ||||
| <p>[[[foo]]]</p> | ||||
| <p>[[[foo]]]: /url</p> | ||||
| . | ||||
| . | ||||
| [foo][ref\[] | ||||
| [ref\[]: /uri | ||||
| . | ||||
| <p><a href="/uri">foo</a></p> | ||||
| . | ||||
| A [collapsed reference link](@collapsed-reference-link) | ||||
| consists of a [link | ||||
| label](#link-label) that [matches](#matches) a [link reference | label](#link-label) that [matches](#matches) a [link reference | |||
| definition](#link-reference-definition) elsewhere in the | definition](#link-reference-definition) elsewhere in the | |||
| document, optional whitespace, and the string `[]`. The contents of the | document, optional whitespace, and the string `[]`. The contents of the | |||
| first link label are parsed as inlines, which are used as the link's | first link label are parsed as inlines, which are used as the link's | |||
| text. The link's URI and title are provided by the matching reference | text. The link's URI and title are provided by the matching reference | |||
| link definition. Thus, `[foo][]` is equivalent to `[foo][foo]`. | link definition. Thus, `[foo][]` is equivalent to `[foo][foo]`. | |||
| . | . | |||
| [foo][] | [foo][] | |||
| skipping to change at line 5491 | skipping to change at line 5722 | |||
| . | . | |||
| [foo] | [foo] | |||
| [] | [] | |||
| [foo]: /url "title" | [foo]: /url "title" | |||
| . | . | |||
| <p><a href="/url" title="title">foo</a></p> | <p><a href="/url" title="title">foo</a></p> | |||
| . | . | |||
| A [shortcut reference link](#shortcut-reference-link) | A [shortcut reference link](@shortcut-reference-link) | |||
| <a id="shortcut-reference-link"></a> consists of a [link | consists of a [link | |||
| label](#link-label) that [matches](#matches) a [link reference | label](#link-label) that [matches](#matches) a [link reference | |||
| definition](#link-reference-definition) elsewhere in the | definition](#link-reference-definition) elsewhere in the | |||
| document and is not followed by `[]` or a link label. | document and is not followed by `[]` or a link label. | |||
| The contents of the first link label are parsed as inlines, | The contents of the first link label are parsed as inlines, | |||
| which are used as the link's text. The link's URI and title | which are used as the link's text. The link's URI and title | |||
| are provided by the matching link reference definition. | are provided by the matching link reference definition. | |||
| Thus, `[foo]` is equivalent to `[foo][]`. | Thus, `[foo]` is equivalent to `[foo][]`. | |||
| . | . | |||
| [foo] | [foo] | |||
| skipping to change at line 5546 | skipping to change at line 5777 | |||
| opening bracket to avoid links: | opening bracket to avoid links: | |||
| . | . | |||
| \[foo] | \[foo] | |||
| [foo]: /url "title" | [foo]: /url "title" | |||
| . | . | |||
| <p>[foo]</p> | <p>[foo]</p> | |||
| . | . | |||
| Note that this is a link, because link labels bind more tightly | Note that this is a link, because a link label ends with the first | |||
| than emphasis: | following closing bracket: | |||
| . | . | |||
| [foo*]: /url | [foo*]: /url | |||
| *[foo*] | *[foo*] | |||
| . | . | |||
| <p>*<a href="/url">foo*</a></p> | <p>*<a href="/url">foo*</a></p> | |||
| . | . | |||
| However, this is not, because link labels bind less | This is a link too, for the same reason: | |||
| tightly than code backticks: | ||||
| . | . | |||
| [foo`]: /url | [foo`]: /url | |||
| [foo`]` | [foo`]` | |||
| . | . | |||
| <p>[foo<code>]</code></p> | <p>[foo<code>]</code></p> | |||
| . | . | |||
| Link labels can contain matched square brackets: | ||||
| . | ||||
| [[[foo]]] | ||||
| [[[foo]]]: /url | ||||
| . | ||||
| <p><a href="/url">[[foo]]</a></p> | ||||
| . | ||||
| . | ||||
| [[[foo]]] | ||||
| [[[foo]]]: /url1 | ||||
| [foo]: /url2 | ||||
| . | ||||
| <p><a href="/url1">[[foo]]</a></p> | ||||
| . | ||||
| For non-matching brackets, use backslash escapes: | ||||
| . | ||||
| [\[foo] | ||||
| [\[foo]: /url | ||||
| . | ||||
| <p><a href="/url">[foo</a></p> | ||||
| . | ||||
| Full references take precedence over shortcut references: | Full references take precedence over shortcut references: | |||
| . | . | |||
| [foo][bar] | [foo][bar] | |||
| [foo]: /url1 | [foo]: /url1 | |||
| [bar]: /url2 | [bar]: /url2 | |||
| . | . | |||
| <p><a href="/url2">foo</a></p> | <p><a href="/url2">foo</a></p> | |||
| . | . | |||
| skipping to change at line 5645 | skipping to change at line 5846 | |||
| [foo][bar][baz] | [foo][bar][baz] | |||
| [baz]: /url1 | [baz]: /url1 | |||
| [foo]: /url2 | [foo]: /url2 | |||
| . | . | |||
| <p>[foo]<a href="/url1">bar</a></p> | <p>[foo]<a href="/url1">bar</a></p> | |||
| . | . | |||
| ## Images | ## Images | |||
| An (unescaped) exclamation mark (`!`) followed by a reference or | Syntax for images is like the syntax for links, with one | |||
| inline link will be parsed as an image. The plain string content | difference. Instead of [link text](#link-text), we have an [image | |||
| of the link label will be used as the image's alt text, and the link | description](@image-description). The rules for this are the | |||
| title, if any, will be used as the image's title. | same as for [link text](#link-text), except that (a) an | |||
| image description starts with ` |  | |||
| . | . | |||
| <p><img src="/url" alt="foo" title="title" /></p> | <p><img src="/url" alt="foo" title="title" /></p> | |||
| . | . | |||
| . | . | |||
| ![foo *bar*] | ![foo *bar*] | |||
| [foo *bar*]: train.jpg "train & tracks" | [foo *bar*]: train.jpg "train & tracks" | |||
| . | . | |||
| <p><img src="train.jpg" alt="foo bar" title="train &amp; tracks" /></p> | <p><img src="train.jpg" alt="foo bar" title="train &amp; tracks" /></p> | |||
| . | . | |||
| Note that in the above example, the alt text is `foo bar`, not `foo | . | |||
| *bar*` or `foo <em>bar</em>` or `foo &lt;em&gt;bar&lt;/em&gt;`. Only | ![foo ![bar](/url)](/url2) | |||
| the plain string content is rendered, without formatting. | . | |||
<p><img src="/url2" alt="foo bar" /></p> | ||||
. | ||||
. | ||||
![foo [bar](/url)](/url2) | ||||
| . | ||||
| <p><img src="/url2" alt="foo bar" /></p> | ||||
| . | ||||
| Though this spec is concerned with parsing, not rendering, it is | ||||
| recommended that in rendering to HTML, only the plain string content | ||||
| of the [image description](#image-description) be used. Note that in | ||||
| the above example, the alt attribute's value is `foo bar`, not `foo | ||||
| [bar](/url)` or `foo <a href="/url">bar</a>`. Only the plain string | ||||
| content is rendered, without formatting. | ||||
| . | . | |||
| ![foo *bar*][] | ![foo *bar*][] | |||
| [foo *bar*]: train.jpg "train & tracks" | [foo *bar*]: train.jpg "train & tracks" | |||
| . | . | |||
| <p><img src="train.jpg" alt="foo bar" title="train &amp; tracks" /></p> | <p><img src="train.jpg" alt="foo bar" title="train &amp; tracks" /></p> | |||
| . | . | |||
| . | . | |||
| skipping to change at line 5784 | skipping to change at line 6005 | |||
| . | . | |||
| . | . | |||
| ![*foo* bar] | ![*foo* bar] | |||
| [*foo* bar]: /url "title" | [*foo* bar]: /url "title" | |||
| . | . | |||
| <p><img src="/url" alt="foo bar" title="title" /></p> | <p><img src="/url" alt="foo bar" title="title" /></p> | |||
| . | . | |||
| Note that link labels cannot contain unescaped brackets: | ||||
| . | . | |||
| ![[foo]] | ![[foo]] | |||
| [[foo]]: /url "title" | [[foo]]: /url "title" | |||
| . | . | |||
| <p><img src="/url" alt="[foo]" title="title" /></p> | <p>![[foo]]</p> | |||
| <p>[[foo]]: /url "title"</p> | ||||
| . | . | |||
| The link labels are case-insensitive: | The link labels are case-insensitive: | |||
| . | . | |||
| ![Foo] | ![Foo] | |||
| [foo]: /url "title" | [foo]: /url "title" | |||
| . | . | |||
| <p><img src="/url" alt="Foo" title="title" /></p> | <p><img src="/url" alt="Foo" title="title" /></p> | |||
| skipping to change at line 5826 | skipping to change at line 6050 | |||
| . | . | |||
| \![foo] | \![foo] | |||
| [foo]: /url "title" | [foo]: /url "title" | |||
| . | . | |||
| <p>!<a href="/url" title="title">foo</a></p> | <p>!<a href="/url" title="title">foo</a></p> | |||
| . | . | |||
| ## Autolinks | ## Autolinks | |||
| Autolinks are absolute URIs and email addresses inside `<` and `>`. | [Autolinks](@autolink) are absolute URIs and email addresses inside `<` and `>`. | |||
| They are parsed as links, with the URL or email address as the link | They are parsed as links, with the URL or email address as the link | |||
| label. | label. | |||
| A [URI autolink](#uri-autolink) <a id="uri-autolink"></a> | A [URI autolink](@uri-autolink) | |||
| consists of `<`, followed by an [absolute | consists of `<`, followed by an [absolute | |||
| URI](#absolute-uri) not containing `<`, followed by `>`. It is parsed | URI](#absolute-uri) not containing `<`, followed by `>`. It is parsed | |||
| as a link to the URI, with the URI as the link's label. | as a link to the URI, with the URI as the link's label. | |||
| An [absolute URI](#absolute-uri), <a id="absolute-uri"></a> | An [absolute URI](@absolute-uri), | |||
| for these purposes, consists of a [scheme](#scheme) followed by a colon (`:`) | for these purposes, consists of a [scheme](#scheme) followed by a colon (`:`) | |||
| followed by zero or more characters other than ASCII whitespace and | followed by zero or more characters other than ASCII whitespace and | |||
| control characters, `<`, and `>`. If the URI includes these characters, | control characters, `<`, and `>`. If the URI includes these characters, | |||
| you must use percent-encoding (e.g. `%20` for a space). | you must use percent-encoding (e.g. `%20` for a space). | |||
| The following [schemes](#scheme) <a id="scheme"></a> | The following [schemes](@scheme) | |||
| are recognized (case-insensitive): | are recognized (case-insensitive): | |||
| `coap`, `doi`, `javascript`, `aaa`, `aaas`, `about`, `acap`, `cap`, | `coap`, `doi`, `javascript`, `aaa`, `aaas`, `about`, `acap`, `cap`, | |||
| `cid`, `crid`, `data`, `dav`, `dict`, `dns`, `file`, `ftp`, `geo`, `go`, | `cid`, `crid`, `data`, `dav`, `dict`, `dns`, `file`, `ftp`, `geo`, `go`, | |||
| `gopher`, `h323`, `http`, `https`, `iax`, `icap`, `im`, `imap`, `info`, | `gopher`, `h323`, `http`, `https`, `iax`, `icap`, `im`, `imap`, `info`, | |||
| `ipp`, `iris`, `iris.beep`, `iris.xpc`, `iris.xpcs`, `iris.lwz`, `ldap`, | `ipp`, `iris`, `iris.beep`, `iris.xpc`, `iris.xpcs`, `iris.lwz`, `ldap`, | |||
| `mailto`, `mid`, `msrp`, `msrps`, `mtqp`, `mupdate`, `news`, `nfs`, | `mailto`, `mid`, `msrp`, `msrps`, `mtqp`, `mupdate`, `news`, `nfs`, | |||
| `ni`, `nih`, `nntp`, `opaquelocktoken`, `pop`, `pres`, `rtsp`, | `ni`, `nih`, `nntp`, `opaquelocktoken`, `pop`, `pres`, `rtsp`, | |||
| `service`, `session`, `shttp`, `sieve`, `sip`, `sips`, `sms`, `snmp`, | `service`, `session`, `shttp`, `sieve`, `sip`, `sips`, `sms`, `snmp`, | |||
| `soap.beep`, `soap.beeps`, `tag`, `tel`, `telnet`, `tftp`, `thismessage`, | `soap.beep`, `soap.beeps`, `tag`, `tel`, `telnet`, `tftp`, `thismessage`, | |||
| `tn3270`, `tip`, `tv`, `urn`, `vemmi`, `ws`, `wss`, `xcon`, | `tn3270`, `tip`, `tv`, `urn`, `vemmi`, `ws`, `wss`, `xcon`, | |||
| skipping to change at line 5903 | skipping to change at line 6127 | |||
| . | . | |||
| Spaces are not allowed in autolinks: | Spaces are not allowed in autolinks: | |||
| . | . | |||
| <http://foo.bar/baz bim> | <http://foo.bar/baz bim> | |||
| . | . | |||
| <p>&lt;http://foo.bar/baz bim&gt;</p> | <p>&lt;http://foo.bar/baz bim&gt;</p> | |||
| . | . | |||
| An [email autolink](#email-autolink) <a id="email-autolink"></a> | An [email autolink](@email-autolink) | |||
| consists of `<`, followed by an [email address](#email-address), | consists of `<`, followed by an [email address](#email-address), | |||
| followed by `>`. The link's label is the email address, | followed by `>`. The link's label is the email address, | |||
| and the URL is `mailto:` followed by the email address. | and the URL is `mailto:` followed by the email address. | |||
| An [email address](#email-address), <a id="email-address"></a> | An [email address](@email-address), | |||
| for these purposes, is anything that matches | for these purposes, is anything that matches | |||
| the [non-normative regex from the HTML5 | the [non-normative regex from the HTML5 | |||
| spec](http://www.whatwg.org/specs/web-apps/current-work/multipage/forms.html#e-mail-state-%28type=email%29): | spec](http://www.whatwg.org/specs/web-apps/current-work/multipage/forms.html#e-mail-state-%28type=email%29): | |||
| /^[a-zA-Z0-9.!#$%&'*+/=?^_`{|}~-]+@[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])? | /^[a-zA-Z0-9.!#$%&'*+/=?^_`{|}~-]+@[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])? | |||
| (?:\.[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?)*$/ | (?:\.[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?)*$/ | |||
| Examples of email autolinks: | Examples of email autolinks: | |||
| . | . | |||
| skipping to change at line 5983 | skipping to change at line 6207 | |||
| ## Raw HTML | ## Raw HTML | |||
| Text between `<` and `>` that looks like an HTML tag is parsed as a | Text between `<` and `>` that looks like an HTML tag is parsed as a | |||
| raw HTML tag and will be rendered in HTML without escaping. | raw HTML tag and will be rendered in HTML without escaping. | |||
| Tag and attribute names are not limited to current HTML tags, | Tag and attribute names are not limited to current HTML tags, | |||
| so custom tags (and even, say, DocBook tags) may be used. | so custom tags (and even, say, DocBook tags) may be used. | |||
| Here is the grammar for tags: | Here is the grammar for tags: | |||
| A [tag name](#tag-name) <a id="tag-name"></a> consists of an ASCII letter | A [tag name](@tag-name) consists of an ASCII letter | |||
| followed by zero or more ASCII letters or digits. | followed by zero or more ASCII letters or digits. | |||
| An [attribute](#attribute) <a id="attribute"></a> consists of whitespace, | An [attribute](@attribute) consists of whitespace, | |||
| an **attribute name**, and an optional **attribute value | an [attribute name](#attribute-name), and an optional | |||
| specification**. | [attribute value specification](#attribute-value-specification). | |||
| An [attribute name](#attribute-name) <a id="attribute-name"></a> | An [attribute name](@attribute-name) | |||
| consists of an ASCII letter, `_`, or `:`, followed by zero or more ASCII | consists of an ASCII letter, `_`, or `:`, followed by zero or more ASCII | |||
| letters, digits, `_`, `.`, `:`, or `-`. (Note: This is the XML | letters, digits, `_`, `.`, `:`, or `-`. (Note: This is the XML | |||
| specification restricted to ASCII. HTML5 is laxer.) | specification restricted to ASCII. HTML5 is laxer.) | |||
| An [attribute value specification](#attribute-value-specification) | An [attribute value specification](@attribute-value-specification) | |||
| <a id="attribute-value-specification"></a> consists of optional whitespace, | consists of optional whitespace, | |||
| a `=` character, optional whitespace, and an [attribute | a `=` character, optional whitespace, and an [attribute | |||
| value](#attribute-value). | value](#attribute-value). | |||
| An [attribute value](#attribute-value) <a id="attribute-value"></a> | An [attribute value](@attribute-value) | |||
| consists of an [unquoted attribute value](#unquoted-attribute-value), | consists of an [unquoted attribute value](#unquoted-attribute-value), | |||
| a [single-quoted attribute value](#single-quoted-attribute-value), | a [single-quoted attribute value](#single-quoted-attribute-value), | |||
| or a [double-quoted attribute value](#double-quoted-attribute-value). | or a [double-quoted attribute value](#double-quoted-attribute-value). | |||
| An [unquoted attribute value](#unquoted-attribute-value) | An [unquoted attribute value](@unquoted-attribute-value) | |||
| <a id="unquoted-attribute-value"></a> is a nonempty string of characters not | is a nonempty string of characters not | |||
| including spaces, `"`, `'`, `=`, `<`, `>`, or `` ` ``. | including spaces, `"`, `'`, `=`, `<`, `>`, or `` ` ``. | |||
| A [single-quoted attribute value](#single-quoted-attribute-value) | A [single-quoted attribute value](@single-quoted-attribute-value) | |||
| <a id="single-quoted-attribute-value"></a> consists of `'`, zero or more | consists of `'`, zero or more | |||
| characters not including `'`, and a final `'`. | characters not including `'`, and a final `'`. | |||
| A [double-quoted attribute value](#double-quoted-attribute-value) | A [double-quoted attribute value](@double-quoted-attribute-value) | |||
| <a id="double-quoted-attribute-value"></a> consists of `"`, zero or more | consists of `"`, zero or more | |||
| characters not including `"`, and a final `"`. | characters not including `"`, and a final `"`. | |||
| An [open tag](#open-tag) <a id="open-tag"></a> consists of a `<` character, | An [open tag](@open-tag) consists of a `<` character, | |||
| a [tag name](#tag-name), zero or more [attributes](#attribute), | a [tag name](#tag-name), zero or more [attributes](#attribute), | |||
| optional whitespace, an optional `/` character, and a `>` character. | optional whitespace, an optional `/` character, and a `>` character. | |||
| A [closing tag](#closing-tag) <a id="closing-tag"></a> consists of the | A [closing tag](@closing-tag) consists of the | |||
| string `</`, a [tag name](#tag-name), optional whitespace, and the | string `</`, a [tag name](#tag-name), optional whitespace, and the | |||
| character `>`. | character `>`. | |||
| An [HTML comment](#html-comment) <a id="html-comment"></a> consists of the | An [HTML comment](@html-comment) consists of the | |||
| string `<!--`, a string of characters not including the string `--`, and | string `<!--`, a string of characters not including the string `--`, and | |||
| the string `-->`. | the string `-->`. | |||
| A [processing instruction](#processing-instruction) | A [processing instruction](@processing-instruction) | |||
| <a id="processing-instruction"></a> consists of the string `<?`, a string | consists of the string `<?`, a string | |||
| of characters not including the string `?>`, and the string | of characters not including the string `?>`, and the string | |||
| `?>`. | `?>`. | |||
| A [declaration](#declaration) <a id="declaration"></a> consists of the | A [declaration](@declaration) consists of the | |||
| string `<!`, a name consisting of one or more uppercase ASCII letters, | string `<!`, a name consisting of one or more uppercase ASCII letters, | |||
| whitespace, a string of characters not including the character `>`, and | whitespace, a string of characters not including the character `>`, and | |||
| the character `>`. | the character `>`. | |||
| A [CDATA section](#cdata-section) <a id="cdata-section"></a> consists of | A [CDATA section](@cdata-section) consists of | |||
| the string `<![CDATA[`, a string of characters not including the string | the string `<![CDATA[`, a string of characters not including the string | |||
| `]]>`, and the string `]]>`. | `]]>`, and the string `]]>`. | |||
| An [HTML tag](#html-tag) <a id="html-tag"></a> consists of an [open | An [HTML tag](@html-tag) consists of an [open | |||
| tag](#open-tag), a [closing tag](#closing-tag), an [HTML | tag](#open-tag), a [closing tag](#closing-tag), an [HTML | |||
| comment](#html-comment), a [processing | comment](#html-comment), a [processing | |||
| instruction](#processing-instruction), a | instruction](#processing-instruction), a | |||
| [declaration](#declaration), or a [CDATA | [declaration](#declaration), or a [CDATA | |||
| section](#cdata-section). | section](#cdata-section). | |||
| Here are some simple open tags: | Here are some simple open tags: | |||
| . | . | |||
| <a><bab><c2c> | <a><bab><c2c> | |||
| skipping to change at line 6211 | skipping to change at line 6435 | |||
| . | . | |||
| <a href="\""> | <a href="\""> | |||
| . | . | |||
| <p>&lt;a href=&quot;&quot;&quot;&gt;</p> | <p>&lt;a href=&quot;&quot;&quot;&gt;</p> | |||
| . | . | |||
| ## Hard line breaks | ## Hard line breaks | |||
| A line break (not in a code span or HTML tag) that is preceded | A line break (not in a code span or HTML tag) that is preceded | |||
| by two or more spaces is parsed as a [hard line | by two or more spaces and does not occur at the end of a block | |||
| break](#hard-line-break)<a id="hard-line-break"></a> (rendered | is parsed as a [hard line break](@hard-line-break) (rendered | |||
| in HTML as a `<br />` tag): | in HTML as a `<br />` tag): | |||
| . | . | |||
| foo | foo | |||
| baz | baz | |||
| . | . | |||
| <p>foo<br /> | <p>foo<br /> | |||
| baz</p> | baz</p> | |||
| . | . | |||
| skipping to change at line 6315 | skipping to change at line 6539 | |||
| . | . | |||
| . | . | |||
| <a href="foo\ | <a href="foo\ | |||
| bar"> | bar"> | |||
| . | . | |||
| <p><a href="foo\ | <p><a href="foo\ | |||
| bar"></p> | bar"></p> | |||
| . | . | |||
| Hard line breaks are for separating inline content within a block. | ||||
| Neither syntax for hard line breaks works at the end of a paragraph or | ||||
| other block element: | ||||
| . | ||||
| foo\ | ||||
| . | ||||
| <p>foo\</p> | ||||
| . | ||||
| . | ||||
| foo | ||||
| . | ||||
| <p>foo</p> | ||||
| . | ||||
| . | ||||
| ### foo\ | ||||
| . | ||||
| <h3>foo\</h3> | ||||
| . | ||||
| . | ||||
| ### foo | ||||
| . | ||||
| <h3>foo</h3> | ||||
| . | ||||
| ## Soft line breaks | ## Soft line breaks | |||
| A regular line break (not in a code span or HTML tag) that is not | A regular line break (not in a code span or HTML tag) that is not | |||
| preceded by two or more spaces is parsed as a softbreak. (A | preceded by two or more spaces is parsed as a softbreak. (A | |||
| softbreak may be rendered in HTML either as a newline or as a space. | softbreak may be rendered in HTML either as a newline or as a space. | |||
| The result will be the same in browsers. In the examples here, a | The result will be the same in browsers. In the examples here, a | |||
| newline will be used.) | newline will be used.) | |||
| . | . | |||
| foo | foo | |||
| End of changes. 97 change blocks. | ||||
| 178 lines changed or deleted | 430 lines changed or added | |||
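The email autolink rule quoted in the diff (a `<`, an email address matching the HTML5 non-normative regex, then `>`, producing a `mailto:` URL with the address as the label) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the reference implementation: the function name `email_autolink` and its `(href, label)` return shape are hypothetical, and the regex is transcribed from the spec text with its line break rejoined.

```python
import re

# Pattern transcribed from the spec text (the HTML5 non-normative email regex).
# The leading ^ and trailing $ are dropped because fullmatch() anchors both ends.
EMAIL_RE = re.compile(
    r"[a-zA-Z0-9.!#$%&'*+/=?^_`{|}~-]+"
    r"@[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?"
    r"(?:\.[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?)*"
)


def email_autolink(text):
    """Hypothetical helper: return (href, label) if `text` is an email
    autolink such as <foo@bar.example.com>, otherwise None."""
    # An autolink must be wrapped in angle brackets.
    if len(text) < 3 or not (text.startswith("<") and text.endswith(">")):
        return None
    address = text[1:-1]
    # The whole address must match; spaces and trailing newlines are rejected.
    if EMAIL_RE.fullmatch(address) is None:
        return None
    # Per the spec text: the URL is `mailto:` plus the address,
    # and the link's label is the address itself.
    return ("mailto:" + address, address)
```

Using `fullmatch` rather than `match` with `$` avoids Python's behaviour of letting `$` match just before a trailing newline, which would otherwise accept input the quoted regex is meant to reject.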