spec.txt | spec.txt | |||
---|---|---|---|---|
--- | --- | |||
title: CommonMark Spec | title: CommonMark Spec | |||
author: | author: | |||
- John MacFarlane | - John MacFarlane | |||
version: 0.10 | version: 0.11 | |||
date: 2014-11-06 | date: 2014-11-10 | |||
... | ... | |||
# Introduction | # Introduction | |||
## What is Markdown? | ## What is Markdown? | |||
Markdown is a plain text format for writing structured documents, | Markdown is a plain text format for writing structured documents, | |||
based on conventions used for indicating formatting in email and | based on conventions used for indicating formatting in email and | |||
usenet posts. It was developed in 2004 by John Gruber, who wrote | usenet posts. It was developed in 2004 by John Gruber, who wrote | |||
the first Markdown-to-HTML converter in perl, and it soon became | the first Markdown-to-HTML converter in perl, and it soon became | |||
skipping to change at line 194 | skipping to change at line 194 | |||
This document is generated from a text file, `spec.txt`, written | This document is generated from a text file, `spec.txt`, written | |||
in Markdown with a small extension for the side-by-side tests. | in Markdown with a small extension for the side-by-side tests. | |||
The script `spec2md.pl` can be used to turn `spec.txt` into pandoc | The script `spec2md.pl` can be used to turn `spec.txt` into pandoc | |||
Markdown, which can then be converted into other formats. | Markdown, which can then be converted into other formats. | |||
In the examples, the `→` character is used to represent tabs. | In the examples, the `→` character is used to represent tabs. | |||
# Preprocessing | # Preprocessing | |||
A [line](#line) <a id="line"></a> | A [line](@line) | |||
is a sequence of zero or more [characters](#character) followed by a | is a sequence of zero or more [characters](#character) followed by a | |||
line ending (CR, LF, or CRLF) or by the end of file. | line ending (CR, LF, or CRLF) or by the end of file. | |||
A [character](#character)<a id="character"></a> is a Unicode code point. | A [character](@character) is a Unicode code point. | |||
This spec does not specify an encoding; it thinks of lines as composed | This spec does not specify an encoding; it thinks of lines as composed | |||
of characters rather than bytes. A conforming parser may be limited | of characters rather than bytes. A conforming parser may be limited | |||
to a certain encoding. | to a certain encoding. | |||
Tabs in lines are expanded to spaces, with a tab stop of 4 characters: | Tabs in lines are expanded to spaces, with a tab stop of 4 characters: | |||
. | . | |||
→foo→baz→→bim | →foo→baz→→bim | |||
. | . | |||
<pre><code>foo baz     bim | <pre><code>foo baz     bim | |||
skipping to change at line 224 | skipping to change at line 224 | |||
ὐ→a | ὐ→a | |||
. | . | |||
<pre><code>a   a | <pre><code>a   a | |||
ὐ   a | ὐ   a | |||
</code></pre> | </code></pre> | |||
. | . | |||
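The tab-expansion rule above can be sketched in a few lines of Python. This is a minimal illustration, not the reference implementation: the function name `expand_tabs` and its `tab_stop` parameter are hypothetical, and columns are counted in code points, as the spec's notion of a character suggests.

```python
def expand_tabs(line: str, tab_stop: int = 4) -> str:
    """Replace each tab with spaces up to the next multiple of tab_stop."""
    out = []
    col = 0  # current column, counted in code points
    for ch in line:
        if ch == "\t":
            pad = tab_stop - (col % tab_stop)  # spaces needed to reach the next stop
            out.append(" " * pad)
            col += pad
        else:
            out.append(ch)
            col += 1
    return "".join(out)


if __name__ == "__main__":
    # The first example above: four leading spaces make it an indented code block.
    print(repr(expand_tabs("\tfoo\tbaz\t\tbim")))
```

For input that contains no line endings, Python's built-in `str.expandtabs(4)` should behave the same way; the explicit loop is shown only to make the column arithmetic visible.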
Line endings are replaced by newline characters (LF). | Line endings are replaced by newline characters (LF). | |||
A line containing no characters, or a line containing only spaces (after | A line containing no characters, or a line containing only spaces (after | |||
tab expansion), is called a [blank line](#blank-line). | tab expansion), is called a [blank line](@blank-line). | |||
<a id="blank-line"></a> | ||||
# Blocks and inlines | # Blocks and inlines | |||
We can think of a document as a sequence of [blocks](#block)<a | We can think of a document as a sequence of | |||
id="block"></a>---structural elements like paragraphs, block quotations, | [blocks](@block)---structural | |||
elements like paragraphs, block quotations, | ||||
lists, headers, rules, and code blocks. Blocks can contain other | lists, headers, rules, and code blocks. Blocks can contain other | |||
blocks, or they can contain [inline](#inline)<a id="inline"></a> content: | blocks, or they can contain [inline](@inline) content: | |||
words, spaces, links, emphasized text, images, and inline code. | words, spaces, links, emphasized text, images, and inline code. | |||
## Precedence | ## Precedence | |||
Indicators of block structure always take precedence over indicators | Indicators of block structure always take precedence over indicators | |||
of inline structure. So, for example, the following is a list with | of inline structure. So, for example, the following is a list with | |||
two items, not a list with one item containing a code span: | two items, not a list with one item containing a code span: | |||
. | . | |||
- `one | - `one | |||
skipping to change at line 263 | skipping to change at line 263 | |||
paragraphs, headers, and other block constructs can be parsed for inline | paragraphs, headers, and other block constructs can be parsed for inline | |||
structure. The second step requires information about link reference | structure. The second step requires information about link reference | |||
definitions that will be available only at the end of the first | definitions that will be available only at the end of the first | |||
step. Note that the first step requires processing lines in sequence, | step. Note that the first step requires processing lines in sequence, | |||
but the second can be parallelized, since the inline parsing of | but the second can be parallelized, since the inline parsing of | |||
one block element does not affect the inline parsing of any other. | one block element does not affect the inline parsing of any other. | |||
## Container blocks and leaf blocks | ## Container blocks and leaf blocks | |||
We can divide blocks into two types: | We can divide blocks into two types: | |||
[container blocks](#container-block), <a id="container-block"></a> | [container blocks](@container-block), | |||
which can contain other blocks, and [leaf blocks](#leaf-block), | which can contain other blocks, and [leaf blocks](@leaf-block), | |||
<a id="leaf-block"></a> which cannot. | which cannot. | |||
# Leaf blocks | # Leaf blocks | |||
This section describes the different kinds of leaf block that make up a | This section describes the different kinds of leaf block that make up a | |||
Markdown document. | Markdown document. | |||
## Horizontal rules | ## Horizontal rules | |||
A line consisting of 0-3 spaces of indentation, followed by a sequence | A line consisting of 0-3 spaces of indentation, followed by a sequence | |||
of three or more matching `-`, `_`, or `*` characters, each followed | of three or more matching `-`, `_`, or `*` characters, each followed | |||
optionally by any number of spaces, forms a [horizontal | optionally by any number of spaces, forms a [horizontal | |||
rule](#horizontal-rule). <a id="horizontal-rule"></a> | rule](@horizontal-rule). | |||
. | . | |||
*** | *** | |||
--- | --- | |||
___ | ___ | |||
. | . | |||
<hr /> | <hr /> | |||
<hr /> | <hr /> | |||
<hr /> | <hr /> | |||
. | . | |||
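As a rough sketch, the horizontal rule definition above maps onto a single regular expression. The name `is_horizontal_rule` is hypothetical, the line is assumed to have had its tabs expanded and its line ending removed, and the check ignores precedence against other constructs (for example setext header underlines), which a full parser has to resolve separately.

```python
import re

# 0-3 spaces of indentation, then one of - _ *, then at least two more of the
# same character, each optionally preceded by spaces, then optional trailing spaces.
_HRULE = re.compile(r"^ {0,3}([-_*])(?: *\1){2,} *$")


def is_horizontal_rule(line: str) -> bool:
    """Return True if the line, taken in isolation, has the shape of a horizontal rule."""
    return _HRULE.match(line) is not None


if __name__ == "__main__":
    for candidate in ["***", " - - -", "--", "    ***", "*-*"]:
        print(repr(candidate), is_horizontal_rule(candidate))
```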
skipping to change at line 477 | skipping to change at line 477 | |||
- * * * | - * * * | |||
. | . | |||
<ul> | <ul> | |||
<li>Foo</li> | <li>Foo</li> | |||
<li><hr /></li> | <li><hr /></li> | |||
</ul> | </ul> | |||
. | . | |||
## ATX headers | ## ATX headers | |||
An [ATX header](#atx-header) <a id="atx-header"></a> | An [ATX header](@atx-header) | |||
consists of a string of characters, parsed as inline content, between an | consists of a string of characters, parsed as inline content, between an | |||
opening sequence of 1--6 unescaped `#` characters and an optional | opening sequence of 1--6 unescaped `#` characters and an optional | |||
closing sequence of any number of `#` characters. The opening sequence | closing sequence of any number of `#` characters. The opening sequence | |||
of `#` characters cannot be followed directly by a nonspace character. | of `#` characters cannot be followed directly by a nonspace character. | |||
The optional closing sequence of `#`s must be preceded by a space and may be | The optional closing sequence of `#`s must be preceded by a space and may be | |||
followed by spaces only. The opening `#` character may be indented 0-3 | followed by spaces only. The opening `#` character may be indented 0-3 | |||
spaces. The raw contents of the header are stripped of leading and | spaces. The raw contents of the header are stripped of leading and | |||
trailing spaces before being parsed as inline content. The header level | trailing spaces before being parsed as inline content. The header level | |||
is equal to the number of `#` characters in the opening sequence. | is equal to the number of `#` characters in the opening sequence. | |||
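The ATX header rules above can be illustrated with a small Python function. This is a sketch under stated assumptions: `parse_atx_header` is a hypothetical helper, the input is a single line with tabs already expanded and no line ending, and the returned string is the raw content that would still need inline parsing (including backslash-escape handling).

```python
import re
from typing import Optional, Tuple

# 0-3 spaces, then 1-6 '#' characters that are followed by a space or the end of the line.
_ATX_OPEN = re.compile(r"^ {0,3}(#{1,6})(?= |$)")
# An optional closing run of '#' preceded by a space (or making up the whole remainder).
_ATX_CLOSE = re.compile(r"(?:^| )#+$")


def parse_atx_header(line: str) -> Optional[Tuple[int, str]]:
    """Return (level, raw_content) if the line is an ATX header, else None."""
    m = _ATX_OPEN.match(line)
    if m is None:
        return None
    level = len(m.group(1))
    rest = line[m.end():].rstrip(" ")  # the closing sequence may be followed by spaces only
    rest = _ATX_CLOSE.sub("", rest)    # drop the optional closing sequence
    return level, rest.strip(" ")      # strip leading and trailing spaces


if __name__ == "__main__":
    for candidate in ["## foo ##", "# foo#", "### ###", "#5 bolt", "####### foo"]:
        print(repr(candidate), parse_atx_header(candidate))
```

A closing run that directly follows a backslash (as in `# foo \#`) is not preceded by a space, so the sketch leaves it in the raw content.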
skipping to change at line 675 | skipping to change at line 675 | |||
# | # | |||
### ### | ### ### | |||
. | . | |||
<h2></h2> | <h2></h2> | |||
<h1></h1> | <h1></h1> | |||
<h3></h3> | <h3></h3> | |||
. | . | |||
## Setext headers | ## Setext headers | |||
A [setext header](#setext-header) <a id="setext-header"></a> | A [setext header](@setext-header) | |||
consists of a line of text, containing at least one nonspace character, | consists of a line of text, containing at least one nonspace character, | |||
with no more than 3 spaces indentation, followed by a [setext header | with no more than 3 spaces indentation, followed by a [setext header | |||
underline](#setext-header-underline). The line of text must be | underline](#setext-header-underline). The line of text must be | |||
one that, were it not followed by the setext header underline, | one that, were it not followed by the setext header underline, | |||
would be interpreted as part of a paragraph: it cannot be a code | would be interpreted as part of a paragraph: it cannot be a code | |||
block, header, blockquote, horizontal rule, or list. A [setext header | block, header, blockquote, horizontal rule, or list. A [setext header | |||
underline](#setext-header-underline) <a id="setext-header-underline"></a> | underline](@setext-header-underline) | |||
is a sequence of `=` characters or a sequence of `-` characters, with no | is a sequence of `=` characters or a sequence of `-` characters, with no | |||
more than 3 spaces indentation and any number of trailing | more than 3 spaces indentation and any number of trailing | |||
spaces. The header is a level 1 header if `=` characters are used, and | spaces. The header is a level 1 header if `=` characters are used, and | |||
a level 2 header if `-` characters are used. The contents of the header | a level 2 header if `-` characters are used. The contents of the header | |||
are the result of parsing the first line as Markdown inline content. | are the result of parsing the first line as Markdown inline content. | |||
In general, a setext header need not be preceded or followed by a | In general, a setext header need not be preceded or followed by a | |||
blank line. However, it cannot interrupt a paragraph, so when a | blank line. However, it cannot interrupt a paragraph, so when a | |||
setext header comes after a paragraph, a blank line is needed between | setext header comes after a paragraph, a blank line is needed between | |||
them. | them. | |||
skipping to change at line 946 | skipping to change at line 946 | |||
. | . | |||
\> foo | \> foo | |||
------ | ------ | |||
. | . | |||
<h2>&gt; foo</h2> | <h2>&gt; foo</h2> | |||
. | . | |||
## Indented code blocks | ## Indented code blocks | |||
An [indented code block](#indented-code-block) | An [indented code block](@indented-code-block) | |||
<a id="indented-code-block"></a> is composed of one or more | is composed of one or more | |||
[indented chunks](#indented-chunk) separated by blank lines. | [indented chunks](#indented-chunk) separated by blank lines. | |||
An [indented chunk](#indented-chunk) <a id="indented-chunk"></a> | An [indented chunk](@indented-chunk) | |||
is a sequence of non-blank lines, each indented four or more | is a sequence of non-blank lines, each indented four or more | |||
spaces. An indented code block cannot interrupt a paragraph, so | spaces. An indented code block cannot interrupt a paragraph, so | |||
if it occurs before or after a paragraph, there must be an | if it occurs before or after a paragraph, there must be an | |||
intervening blank line. The contents of the code block are | intervening blank line. The contents of the code block are | |||
the literal contents of the lines, including trailing newlines, | the literal contents of the lines, including trailing newlines, | |||
minus four spaces of indentation. An indented code block has no | minus four spaces of indentation. An indented code block has no | |||
attributes. | attributes. | |||
. | . | |||
a simple | a simple | |||
skipping to change at line 1092 | skipping to change at line 1092 | |||
. | . | |||
foo | foo | |||
. | . | |||
<pre><code>foo | <pre><code>foo | |||
</code></pre> | </code></pre> | |||
. | . | |||
## Fenced code blocks | ## Fenced code blocks | |||
A [code fence](#code-fence) <a id="code-fence"></a> is a sequence | A [code fence](@code-fence) is a sequence | |||
of at least three consecutive backtick characters (`` ` ``) or | of at least three consecutive backtick characters (`` ` ``) or | |||
tildes (`~`). (Tildes and backticks cannot be mixed.) | tildes (`~`). (Tildes and backticks cannot be mixed.) | |||
A [fenced code block](#fenced-code-block) <a id="fenced-code-block"></a> | A [fenced code block](@fenced-code-block) | |||
begins with a code fence, indented no more than three spaces. | begins with a code fence, indented no more than three spaces. | |||
The line with the opening code fence may optionally contain some text | The line with the opening code fence may optionally contain some text | |||
following the code fence; this is trimmed of leading and trailing | following the code fence; this is trimmed of leading and trailing | |||
spaces and called the [info string](#info-string). | spaces and called the [info string](@info-string). | |||
<a id="info-string"></a> The info string may not contain any backtick | The info string may not contain any backtick | |||
characters. (The reason for this restriction is that otherwise | characters. (The reason for this restriction is that otherwise | |||
some inline code would be incorrectly interpreted as the | some inline code would be incorrectly interpreted as the | |||
beginning of a fenced code block.) | beginning of a fenced code block.) | |||
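To make the opening-fence rules concrete, here is a minimal Python sketch. The helper name `parse_opening_fence` and the shape of its return value are assumptions made for illustration; the line is assumed to have tabs expanded and its line ending removed.

```python
import re
from typing import Optional, Tuple

# 0-3 spaces of indentation, a run of three or more backticks or tildes
# (never mixed), and whatever follows on the line.
_FENCE_OPEN = re.compile(r"^( {0,3})(`{3,}|~{3,})(.*)$")


def parse_opening_fence(line: str) -> Optional[Tuple[int, str, int, str]]:
    """Return (indent, fence_char, fence_length, info_string), or None if not a fence."""
    m = _FENCE_OPEN.match(line)
    if m is None:
        return None
    indent, fence, info = m.groups()
    if "`" in info:
        # The info string may not contain backticks; such a line is not a fence.
        return None
    return len(indent), fence[0], len(fence), info.strip(" ")


if __name__ == "__main__":
    for candidate in ["```", "~~~~ ruby", "  ``` aaa  ", "``", "``` a`b"]:
        print(repr(candidate), parse_opening_fence(candidate))
```

The indent and fence length are returned because, per the rules that follow, they determine how much indentation is stripped from content lines and which line can close the block.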
The content of the code block consists of all subsequent lines, until | The content of the code block consists of all subsequent lines, until | |||
a closing [code fence](#code-fence) of the same type as the code block | a closing [code fence](#code-fence) of the same type as the code block | |||
began with (backticks or tildes), and with at least as many backticks | began with (backticks or tildes), and with at least as many backticks | |||
or tildes as the opening code fence. If the leading code fence is | or tildes as the opening code fence. If the leading code fence is | |||
indented N spaces, then up to N spaces of indentation are removed from | indented N spaces, then up to N spaces of indentation are removed from | |||
each line of the content (if present). (If a content line is not | each line of the content (if present). (If a content line is not | |||
skipping to change at line 1451 | skipping to change at line 1451 | |||
``` | ``` | |||
``` aaa | ``` aaa | |||
``` | ``` | |||
. | . | |||
<pre><code>``` aaa | <pre><code>``` aaa | |||
</code></pre> | </code></pre> | |||
. | . | |||
## HTML blocks | ## HTML blocks | |||
An [HTML block tag](#html-block-tag) <a id="html-block-tag"></a> is | An [HTML block tag](@html-block-tag) is | |||
an [open tag](#open-tag) or [closing tag](#closing-tag) whose tag | an [open tag](#open-tag) or [closing tag](#closing-tag) whose tag | |||
name is one of the following (case-insensitive): | name is one of the following (case-insensitive): | |||
`article`, `header`, `aside`, `hgroup`, `blockquote`, `hr`, `iframe`, | `article`, `header`, `aside`, `hgroup`, `blockquote`, `hr`, `iframe`, | |||
`body`, `li`, `map`, `button`, `object`, `canvas`, `ol`, `caption`, | `body`, `li`, `map`, `button`, `object`, `canvas`, `ol`, `caption`, | |||
`output`, `col`, `p`, `colgroup`, `pre`, `dd`, `progress`, `div`, | `output`, `col`, `p`, `colgroup`, `pre`, `dd`, `progress`, `div`, | |||
`section`, `dl`, `table`, `td`, `dt`, `tbody`, `embed`, `textarea`, | `section`, `dl`, `table`, `td`, `dt`, `tbody`, `embed`, `textarea`, | |||
`fieldset`, `tfoot`, `figcaption`, `th`, `figure`, `thead`, `footer`, | `fieldset`, `tfoot`, `figcaption`, `th`, `figure`, `thead`, `footer`, | |||
`tr`, `form`, `ul`, `h1`, `h2`, `h3`, `h4`, `h5`, `h6`, `video`, | `tr`, `form`, `ul`, `h1`, `h2`, `h3`, `h4`, `h5`, `h6`, `video`, | |||
`script`, `style`. | `script`, `style`. | |||
An [HTML block](#html-block) <a id="html-block"></a> begins with an | An [HTML block](@html-block) begins with an | |||
[HTML block tag](#html-block-tag), [HTML comment](#html-comment), | [HTML block tag](#html-block-tag), [HTML comment](#html-comment), | |||
[processing instruction](#processing-instruction), | [processing instruction](#processing-instruction), | |||
[declaration](#declaration), or [CDATA section](#cdata-section). | [declaration](#declaration), or [CDATA section](#cdata-section). | |||
It ends when a [blank line](#blank-line) or the end of the | It ends when a [blank line](#blank-line) or the end of the | |||
input is encountered. The initial line may be indented up to three | input is encountered. The initial line may be indented up to three | |||
spaces, and subsequent lines may have any indentation. The contents | spaces, and subsequent lines may have any indentation. The contents | |||
of the HTML block are interpreted as raw HTML, and will not be escaped | of the HTML block are interpreted as raw HTML, and will not be escaped | |||
in HTML output. | in HTML output. | |||
Some simple examples: | Some simple examples: | |||
skipping to change at line 1736 | skipping to change at line 1736 | |||
. | . | |||
Moreover, blank lines are usually not necessary and can be | Moreover, blank lines are usually not necessary and can be | |||
deleted. The exception is inside `<pre>` tags; here, one can | deleted. The exception is inside `<pre>` tags; here, one can | |||
replace the blank lines with `&#10;` entities. | replace the blank lines with `&#10;` entities. | |||
So there is no important loss of expressive power with the new rule. | So there is no important loss of expressive power with the new rule. | |||
## Link reference definitions | ## Link reference definitions | |||
A [link reference definition](#link-reference-definition) | A [link reference definition](@link-reference-definition) | |||
<a id="link-reference-definition"></a> consists of a [link | consists of a [link | |||
label](#link-label), indented up to three spaces, followed | label](#link-label), indented up to three spaces, followed | |||
by a colon (`:`), optional blank space (including up to one | by a colon (`:`), optional blank space (including up to one | |||
newline), a [link destination](#link-destination), optional | newline), a [link destination](#link-destination), optional | |||
blank space (including up to one newline), and an optional [link | blank space (including up to one newline), and an optional [link | |||
title](#link-title), which if it is present must be separated | title](#link-title), which if it is present must be separated | |||
from the [link destination](#link-destination) by whitespace. | from the [link destination](#link-destination) by whitespace. | |||
No further non-space characters may occur on the line. | No further non-space characters may occur on the line. | |||
A [link reference definition](#link-reference-definition) | A [link reference definition](#link-reference-definition) | |||
does not correspond to a structural element of a document. Instead, it | does not correspond to a structural element of a document. Instead, it | |||
skipping to change at line 1961 | skipping to change at line 1961 | |||
> [foo]: /url | > [foo]: /url | |||
. | . | |||
<p><a href="/url">foo</a></p> | <p><a href="/url">foo</a></p> | |||
<blockquote> | <blockquote> | |||
</blockquote> | </blockquote> | |||
. | . | |||
## Paragraphs | ## Paragraphs | |||
A sequence of non-blank lines that cannot be interpreted as other | A sequence of non-blank lines that cannot be interpreted as other | |||
kinds of blocks forms a [paragraph](#paragraph).<a id="paragraph"></a> | kinds of blocks forms a [paragraph](@paragraph). | |||
The contents of the paragraph are the result of parsing the | The contents of the paragraph are the result of parsing the | |||
paragraph's raw content as inlines. The paragraph's raw content | paragraph's raw content as inlines. The paragraph's raw content | |||
is formed by concatenating the lines and removing initial and final | is formed by concatenating the lines and removing initial and final | |||
spaces. | spaces. | |||
A simple example with two paragraphs: | A simple example with two paragraphs: | |||
. | . | |||
aaa | aaa | |||
skipping to change at line 2100 | skipping to change at line 2100 | |||
> with these blocks as its content. | > with these blocks as its content. | |||
So, we explain what counts as a block quote or list item by explaining | So, we explain what counts as a block quote or list item by explaining | |||
how these can be *generated* from their contents. This should suffice | how these can be *generated* from their contents. This should suffice | |||
to define the syntax, although it does not give a recipe for *parsing* | to define the syntax, although it does not give a recipe for *parsing* | |||
these constructions. (A recipe is provided below in the section entitled | these constructions. (A recipe is provided below in the section entitled | |||
[A parsing strategy](#appendix-a-a-parsing-strategy).) | [A parsing strategy](#appendix-a-a-parsing-strategy).) | |||
## Block quotes | ## Block quotes | |||
A [block quote marker](#block-quote-marker) <a id="block-quote-marker"></a> | A [block quote marker](@block-quote-marker) | |||
consists of 0-3 spaces of initial indent, plus (a) the character `>` together | consists of 0-3 spaces of initial indent, plus (a) the character `>` together | |||
with a following space, or (b) a single character `>` not followed by a space. | with a following space, or (b) a single character `>` not followed by a space. | |||
The following rules define [block quotes](#block-quote): | The following rules define [block quotes](@block-quote): | |||
<a id="block-quote"></a> | ||||
1. **Basic case.** If a string of lines *Ls* constitute a sequence | 1. **Basic case.** If a string of lines *Ls* constitute a sequence | |||
of blocks *Bs*, then the result of prepending a [block quote | of blocks *Bs*, then the result of prepending a [block quote | |||
marker](#block-quote-marker) to the beginning of each line in *Ls* | marker](#block-quote-marker) to the beginning of each line in *Ls* | |||
is a [block quote](#block-quote) containing *Bs*. | is a [block quote](#block-quote) containing *Bs*. | |||
2. **Laziness.** If a string of lines *Ls* constitute a [block | 2. **Laziness.** If a string of lines *Ls* constitute a [block | |||
quote](#block-quote) with contents *Bs*, then the result of deleting | quote](#block-quote) with contents *Bs*, then the result of deleting | |||
the initial [block quote marker](#block-quote-marker) from one or | the initial [block quote marker](#block-quote-marker) from one or | |||
more lines in which the next non-space character after the [block | more lines in which the next non-space character after the [block | |||
quote marker](#block-quote-marker) is [paragraph continuation | quote marker](#block-quote-marker) is [paragraph continuation | |||
text](#paragraph-continuation-text) is a block quote with *Bs* as | text](#paragraph-continuation-text) is a block quote with *Bs* as | |||
its content. <a id="paragraph-continuation-text"></a> | its content. | |||
[Paragraph continuation text](#paragraph-continuation-text) is text | [Paragraph continuation text](@paragraph-continuation-text) is text | |||
that will be parsed as part of the content of a paragraph, but does | that will be parsed as part of the content of a paragraph, but does | |||
not occur at the beginning of the paragraph. | not occur at the beginning of the paragraph. | |||
3. **Consecutiveness.** A document cannot contain two [block | 3. **Consecutiveness.** A document cannot contain two [block | |||
quotes](#block-quote) in a row unless there is a [blank | quotes](#block-quote) in a row unless there is a [blank | |||
line](#blank-line) between them. | line](#blank-line) between them. | |||
Nothing else counts as a [block quote](#block-quote). | Nothing else counts as a [block quote](#block-quote). | |||
Here is a simple example: | Here is a simple example: | |||
skipping to change at line 2461 | skipping to change at line 2460 | |||
<pre><code>code | <pre><code>code | |||
</code></pre> | </code></pre> | |||
</blockquote> | </blockquote> | |||
<blockquote> | <blockquote> | |||
<p>not code</p> | <p>not code</p> | |||
</blockquote> | </blockquote> | |||
. | . | |||
## List items | ## List items | |||
A [list marker](#list-marker) <a id="list-marker"></a> is a | A [list marker](@list-marker) is a | |||
[bullet list marker](#bullet-list-marker) or an [ordered list | [bullet list marker](#bullet-list-marker) or an [ordered list | |||
marker](#ordered-list-marker). | marker](#ordered-list-marker). | |||
A [bullet list marker](#bullet-list-marker) <a id="bullet-list-marker"></a> | A [bullet list marker](@bullet-list-marker) | |||
is a `-`, `+`, or `*` character. | is a `-`, `+`, or `*` character. | |||
An [ordered list marker](#ordered-list-marker) <a id="ordered-list-marker"></a> | An [ordered list marker](@ordered-list-marker) | |||
is a sequence of one or more digits (`0-9`), followed by either a | is a sequence of one or more digits (`0-9`), followed by either a | |||
`.` character or a `)` character. | `.` character or a `)` character. | |||
The following rules define [list items](#list-item):<a | The following rules define [list items](@list-item): | |||
id="list-item"></a> | ||||
1. **Basic case.** If a sequence of lines *Ls* constitute a sequence of | 1. **Basic case.** If a sequence of lines *Ls* constitute a sequence of | |||
blocks *Bs* starting with a non-space character and not separated | blocks *Bs* starting with a non-space character and not separated | |||
from each other by more than one blank line, and *M* is a list | from each other by more than one blank line, and *M* is a list | |||
marker of width *W* followed by 0 < *N* < 5 spaces, then the result | marker of width *W* followed by 0 < *N* < 5 spaces, then the result | |||
of prepending *M* and the following spaces to the first line of | of prepending *M* and the following spaces to the first line of | |||
*Ls*, and indenting subsequent lines of *Ls* by *W + N* spaces, is a | *Ls*, and indenting subsequent lines of *Ls* by *W + N* spaces, is a | |||
list item with *Bs* as its contents. The type of the list item | list item with *Bs* as its contents. The type of the list item | |||
(bullet or ordered) is determined by the type of its list marker. | (bullet or ordered) is determined by the type of its list marker. | |||
If the list item is ordered, then it is also assigned a start | If the list item is ordered, then it is also assigned a start | |||
skipping to change at line 2919 | skipping to change at line 2917 | |||
&gt; A block quote. | &gt; A block quote. | |||
</code></pre> | </code></pre> | |||
. | . | |||
4. **Laziness.** If a string of lines *Ls* constitute a [list | 4. **Laziness.** If a string of lines *Ls* constitute a [list | |||
item](#list-item) with contents *Bs*, then the result of deleting | item](#list-item) with contents *Bs*, then the result of deleting | |||
some or all of the indentation from one or more lines in which the | some or all of the indentation from one or more lines in which the | |||
next non-space character after the indentation is | next non-space character after the indentation is | |||
[paragraph continuation text](#paragraph-continuation-text) is a | [paragraph continuation text](#paragraph-continuation-text) is a | |||
list item with the same contents and attributes.<a | list item with the same contents and attributes. The unindented | |||
id="lazy-continuation-line"></a> | lines are called | |||
[lazy continuation lines](@lazy-continuation-line). | ||||
Here is an example with [lazy continuation | Here is an example with [lazy continuation | |||
lines](#lazy-continuation-line): | lines](#lazy-continuation-line): | |||
. | . | |||
1. A paragraph | 1. A paragraph | |||
with two lines. | with two lines. | |||
indented code | indented code | |||
skipping to change at line 3296 | skipping to change at line 3295 | |||
The one case that needs special treatment is a list item that *starts* | The one case that needs special treatment is a list item that *starts* | |||
with indented code. How much indentation is required in that case, since | with indented code. How much indentation is required in that case, since | |||
we don't have a "first paragraph" to measure from? Rule #2 simply stipulates | we don't have a "first paragraph" to measure from? Rule #2 simply stipulates | |||
that in such cases, we require one space indentation from the list marker | that in such cases, we require one space indentation from the list marker | |||
(and then the normal four spaces for the indented code). This will match the | (and then the normal four spaces for the indented code). This will match the | |||
four-space rule in cases where the list marker plus its initial indentation | four-space rule in cases where the list marker plus its initial indentation | |||
takes four spaces (a common case), but diverge in other cases. | takes four spaces (a common case), but diverge in other cases. | |||
## Lists | ## Lists | |||
A [list](#list) <a id="list"></a> is a sequence of one or more | A [list](@list) is a sequence of one or more | |||
list items [of the same type](#of-the-same-type). The list items | list items [of the same type](#of-the-same-type). The list items | |||
may be separated by single [blank lines](#blank-line), but two | may be separated by single [blank lines](#blank-line), but two | |||
blank lines end all containing lists. | blank lines end all containing lists. | |||
Two list items are [of the same type](#of-the-same-type) | Two list items are [of the same type](@of-the-same-type) | |||
<a id="of-the-same-type"></a> if they begin with a [list | if they begin with a [list | |||
marker](#list-marker) of the same type. Two list markers are of the | marker](#list-marker) of the same type. Two list markers are of the | |||
same type if (a) they are bullet list markers using the same character | same type if (a) they are bullet list markers using the same character | |||
(`-`, `+`, or `*`) or (b) they are ordered list numbers with the same | (`-`, `+`, or `*`) or (b) they are ordered list numbers with the same | |||
delimiter (either `.` or `)`). | delimiter (either `.` or `)`). | |||
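The "of the same type" relation can be sketched as a comparison of classified markers. The names `marker_type` and `same_type` are hypothetical, and the functions take the bare marker text (for example `-` or `12.`) rather than a full line.

```python
import re
from typing import Optional, Tuple

_ORDERED = re.compile(r"[0-9]+([.)])")


def marker_type(marker: str) -> Optional[Tuple[str, str]]:
    """Classify a list marker as ("bullet", char) or ("ordered", delimiter)."""
    if marker in ("-", "+", "*"):
        return ("bullet", marker)
    m = _ORDERED.fullmatch(marker)
    if m is not None:
        # Only the delimiter matters for the type; the number itself is ignored.
        return ("ordered", m.group(1))
    return None


def same_type(a: str, b: str) -> bool:
    """Two markers are of the same type if both are valid and classify identically."""
    ta, tb = marker_type(a), marker_type(b)
    return ta is not None and ta == tb


if __name__ == "__main__":
    print(same_type("-", "-"), same_type("-", "+"), same_type("1.", "7."), same_type("1.", "1)"))
```

Because the number is dropped during classification, `1.` and `7.` compare equal, which matches the rule that only the delimiter distinguishes ordered list markers.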
A list is an [ordered list](#ordered-list) <a id="ordered-list"></a> | A list is an [ordered list](@ordered-list) | |||
if its constituent list items begin with | if its constituent list items begin with | |||
[ordered list markers](#ordered-list-marker), and a [bullet | [ordered list markers](#ordered-list-marker), and a [bullet | |||
list](#bullet-list) <a id="bullet-list"></a> if its constituent list | list](@bullet-list) if its constituent list | |||
items begin with [bullet list markers](#bullet-list-marker). | items begin with [bullet list markers](#bullet-list-marker). | |||
The [start number](#start-number) <a id="start-number"></a> | The [start number](@start-number) | |||
of an [ordered list](#ordered-list) is determined by the list number of | of an [ordered list](#ordered-list) is determined by the list number of | |||
its initial list item. The numbers of subsequent list items are | its initial list item. The numbers of subsequent list items are | |||
disregarded. | disregarded. | |||
A list is [loose](#loose)<a id="loose"></a> if any of its constituent | A list is [loose](@loose) if any of its constituent | |||
list items are separated by blank lines, or if any of its constituent | list items are separated by blank lines, or if any of its constituent | |||
list items directly contain two block-level elements with a blank line | list items directly contain two block-level elements with a blank line | |||
between them. Otherwise a list is [tight](#tight).<a id="tight"></a> | between them. Otherwise a list is [tight](@tight). | |||
(The difference in HTML output is that paragraphs in a loose list are | (The difference in HTML output is that paragraphs in a loose list are | |||
wrapped in `<p>` tags, while paragraphs in a tight list are not.) | wrapped in `<p>` tags, while paragraphs in a tight list are not.) | |||
Changing the bullet or ordered list delimiter starts a new list: | Changing the bullet or ordered list delimiter starts a new list: | |||
. | . | |||
- foo | - foo | |||
- bar | - bar | |||
+ baz | + baz | |||
. | . | |||
skipping to change at line 3400 | skipping to change at line 3399 | |||
First, it is natural and not uncommon for people to start lists without | First, it is natural and not uncommon for people to start lists without | |||
blank lines: | blank lines: | |||
I need to buy | I need to buy | |||
- new shoes | - new shoes | |||
- a coat | - a coat | |||
- a plane ticket | - a plane ticket | |||
Second, we are attracted to a | Second, we are attracted to a | |||
> [principle of uniformity](#principle-of-uniformity):<a | > [principle of uniformity](@principle-of-uniformity): | |||
> id="principle-of-uniformity"></a> if a span of text has a certain | > if a span of text has a certain | |||
> meaning, it will continue to have the same meaning when put into a list | > meaning, it will continue to have the same meaning when put into a list | |||
> item. | > item. | |||
(Indeed, the spec for [list items](#list-item) presupposes this.) | (Indeed, the spec for [list items](#list-item) presupposes this.) | |||
This principle implies that if | This principle implies that if | |||
* I need to buy | * I need to buy | |||
- new shoes | - new shoes | |||
- a coat | - a coat | |||
- a plane ticket | - a plane ticket | |||
skipping to change at line 3919 | skipping to change at line 3918 | |||
With the goal of making this standard as HTML-agnostic as possible, all | With the goal of making this standard as HTML-agnostic as possible, all | |||
valid HTML entities in any context are recognized as such and | valid HTML entities in any context are recognized as such and | |||
converted into unicode characters before they are stored in the AST. | converted into unicode characters before they are stored in the AST. | |||
This allows implementations that target HTML output to trivially escape | This allows implementations that target HTML output to trivially escape | |||
the entities when generating HTML, and simplifies the job of | the entities when generating HTML, and simplifies the job of | |||
implementations targeting other languages, as these will only need to | implementations targeting other languages, as these will only need to | |||
handle the unicode chars and need not be HTML-entity aware. | handle the unicode chars and need not be HTML-entity aware. | |||
[Named entities](#name-entities) <a id="named-entities"></a> consist of `&` | [Named entities](@name-entities) consist of `&` | |||
+ any of the valid HTML5 entity names + `;`. The | + any of the valid HTML5 entity names + `;`. The | |||
[following document](http://www.whatwg.org/specs/web-apps/current-work/multipage/entities.json) | [following document](http://www.whatwg.org/specs/web-apps/current-work/multipage/entities.json) | |||
is used as an authoritative source of the valid entity names and their | is used as an authoritative source of the valid entity names and their | |||
corresponding codepoints. | corresponding codepoints. | |||
Conforming implementations that target HTML don't need to generate | Conforming implementations that target HTML don't need to generate | |||
entities for all the valid named entities that exist, with the exception | entities for all the valid named entities that exist, with the exception | |||
of `"` (`&quot;`), `&` (`&amp;`), `<` (`&lt;`) and `>` (`&gt;`), which | of `"` (`&quot;`), `&` (`&amp;`), `<` (`&lt;`) and `>` (`&gt;`), which | |||
always need to be written as entities for security reasons. | always need to be written as entities for security reasons. | |||
. | . | |||
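&nbsp; &amp; &copy; &AElig; &Dcaron; &frac34; &HilbertSpace; &DifferentialD; &ClockwiseContourIntegral; | &nbsp; &amp; &copy; &AElig; &Dcaron; &frac34; &HilbertSpace; &DifferentialD; &ClockwiseContourIntegral; | |||
. | . | |||
<p>  &amp; © Æ Ď ¾ ℋ ⅆ ∲</p> | <p>  &amp; © Æ Ď ¾ ℋ ⅆ ∲</p> | |||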
. | . | |||
[Decimal entities](#decimal-entities) <a id="decimal-entities"></a> | [Decimal entities](@decimal-entities) | |||
consist of `&#` + a string of 1--8 arabic digits + `;`. Again, these | consist of `&#` + a string of 1--8 arabic digits + `;`. Again, these | |||
entities need to be recognised and transformed into their corresponding | entities need to be recognised and transformed into their corresponding | |||
UTF8 codepoints. Invalid Unicode codepoints will be written as the | UTF8 codepoints. Invalid Unicode codepoints will be written as the | |||
"unknown codepoint" character (`0xFFFD`). | "unknown codepoint" character (`0xFFFD`). | |||
. | . | |||
&#35; &#1234; &#992; &#98765432; | &#35; &#1234; &#992; &#98765432; | |||
. | . | |||
<p># Ӓ Ϡ �</p> | <p># Ӓ Ϡ �</p> | |||
. | . | |||
[Hexadecimal entities](#hexadecimal-entities) <a id="hexadecimal-entities"></a> | [Hexadecimal entities](@hexadecimal-entities) | |||
consist of `&#` + either `X` or `x` + a string of 1-8 hexadecimal digits | consist of `&#` + either `X` or `x` + a string of 1-8 hexadecimal digits | |||
+ `;`. They will also be parsed and turned into their corresponding UTF8 values in the AST. | + `;`. They will also be parsed and turned into their corresponding UTF8 values in the AST. | |||
. | . | |||
&#X22; &#XD06; &#xcab; | &#X22; &#XD06; &#xcab; | |||
. | . | |||
<p>&quot; ആ ಫ</p> | <p>&quot; ആ ಫ</p> | |||
. | . | |||
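The decimal and hexadecimal rules above can be sketched as one small decoder. This is a minimal illustration, not a reference implementation: the function name is hypothetical, and the set of "invalid" codepoints (zero, surrogates, anything above U+10FFFF) is an assumption, since the text above does not enumerate them.

```python
import re

# "&#" + optional X/x + 1-8 digits + ";", following the definitions above.
_NUMERIC_ENTITY = re.compile(r"&#(?:[Xx]([0-9A-Fa-f]{1,8})|([0-9]{1,8}));")


def _to_char(code: int) -> str:
    # Assumption: zero, surrogates, and values beyond U+10FFFF count as invalid.
    if code == 0 or code > 0x10FFFF or 0xD800 <= code <= 0xDFFF:
        return "\uFFFD"
    return chr(code)


def decode_numeric_entities(text: str) -> str:
    """Replace decimal and hexadecimal entities with their characters."""
    def repl(match):
        hex_digits, dec_digits = match.group(1), match.group(2)
        if hex_digits is not None:
            return _to_char(int(hex_digits, 16))
        return _to_char(int(dec_digits))

    return _NUMERIC_ENTITY.sub(repl, text)
```

Strings that do not fit the pattern, such as a reference with nine digits or no digits at all, are left untouched by this sketch.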
Here are some nonentities: | Here are some nonentities: | |||
skipping to change at line 4035 | skipping to change at line 4034 | |||
. | . | |||
f&ouml;f&ouml; | f&ouml;f&ouml; | |||
. | . | |||
<pre><code>f&amp;ouml;f&amp;ouml; | <pre><code>f&amp;ouml;f&amp;ouml; | |||
</code></pre> | </code></pre> | |||
. | . | |||
## Code span | ## Code span | |||
A [backtick string](#backtick-string) <a id="backtick-string"></a> | A [backtick string](@backtick-string) | |||
is a string of one or more backtick characters (`` ` ``) that is neither | is a string of one or more backtick characters (`` ` ``) that is neither | |||
preceded nor followed by a backtick. | preceded nor followed by a backtick. | |||
A code span begins with a backtick string and ends with a backtick | A [code span](@code-span) begins with a backtick string and ends with a backtick | |||
string of equal length. The contents of the code span are the | string of equal length. The contents of the code span are the | |||
characters between the two backtick strings, with leading and trailing | characters between the two backtick strings, with leading and trailing | |||
spaces and newlines removed, and consecutive spaces and newlines | spaces and newlines removed, and consecutive spaces and newlines | |||
collapsed to single spaces. | collapsed to single spaces. | |||
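The content rule above amounts to a short normalization step. The sketch below assumes that the raw text between the two backtick strings has already been extracted and that line endings have been normalized to `\n`; the function name is hypothetical.

```python
import re


def normalize_code_span(raw: str) -> str:
    """Collapse runs of spaces and newlines, then trim both ends."""
    # Consecutive spaces and newlines become a single space.
    collapsed = re.sub(r"[ \n]+", " ", raw)
    # Leading and trailing spaces (including former newlines) are removed.
    return collapsed.strip(" ")
```

Only spaces and newlines are touched, so other characters such as tabs pass through unchanged.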
This is a simple code span: | This is a simple code span: | |||
. | . | |||
`foo` | `foo` | |||
. | . | |||
skipping to change at line 4219 | skipping to change at line 4218 | |||
spans, but users often do not.) | spans, but users often do not.) | |||
``` markdown | ``` markdown | |||
internal emphasis: foo*bar*baz | internal emphasis: foo*bar*baz | |||
no emphasis: foo_bar_baz | no emphasis: foo_bar_baz | |||
``` | ``` | |||
The following rules capture all of these patterns, while allowing | The following rules capture all of these patterns, while allowing | |||
for efficient parsing strategies that do not backtrack: | for efficient parsing strategies that do not backtrack: | |||
1. A single `*` character [can open emphasis](#can-open-emphasis) | 1. A single `*` character [can open emphasis](@can-open-emphasis) | |||
<a id="can-open-emphasis"></a> iff it is not followed by | iff it is not followed by | |||
whitespace. | whitespace. | |||
2. A single `_` character [can open emphasis](#can-open-emphasis) iff | 2. A single `_` character [can open emphasis](#can-open-emphasis) iff | |||
it is not followed by whitespace and it is not preceded by an | it is not followed by whitespace and it is not preceded by an | |||
ASCII alphanumeric character. | ASCII alphanumeric character. | |||
3. A single `*` character [can close emphasis](#can-close-emphasis) | 3. A single `*` character [can close emphasis](@can-close-emphasis) | |||
<a id="can-close-emphasis"></a> iff it is not preceded by whitespace. | iff it is not preceded by whitespace. | |||
4. A single `_` character [can close emphasis](#can-close-emphasis) iff | 4. A single `_` character [can close emphasis](#can-close-emphasis) iff | |||
it is not preceded by whitespace and it is not followed by an | it is not preceded by whitespace and it is not followed by an | |||
ASCII alphanumeric character. | ASCII alphanumeric character. | |||
5. A double `**` [can open strong emphasis](#can-open-strong-emphasis) | 5. A double `**` [can open strong emphasis](@can-open-strong-emphasis) | |||
<a id="can-open-strong-emphasis" ></a> iff it is not followed by | iff it is not followed by | |||
whitespace. | whitespace. | |||
6. A double `__` [can open strong emphasis](#can-open-strong-emphasis) | 6. A double `__` [can open strong emphasis](#can-open-strong-emphasis) | |||
iff it is not followed by whitespace and it is not preceded by an | iff it is not followed by whitespace and it is not preceded by an | |||
ASCII alphanumeric character. | ASCII alphanumeric character. | |||
7. A double `**` [can close strong emphasis](#can-close-strong-emphasis) | 7. A double `**` [can close strong emphasis](@can-close-strong-emphasis) | |||
<a id="can-close-strong-emphasis" ></a> iff it is not preceded by | iff it is not preceded by | |||
whitespace. | whitespace. | |||
8. A double `__` [can close strong emphasis](#can-close-strong-emphasis) | 8. A double `__` [can close strong emphasis](#can-close-strong-emphasis) | |||
iff it is not preceded by whitespace and it is not followed by an | iff it is not preceded by whitespace and it is not followed by an | |||
ASCII alphanumeric character. | ASCII alphanumeric character. | |||
9. Emphasis begins with a delimiter that [can open | 9. Emphasis begins with a delimiter that [can open | |||
emphasis](#can-open-emphasis) and ends with a delimiter that [can close | emphasis](#can-open-emphasis) and ends with a delimiter that [can close | |||
emphasis](#can-close-emphasis), and that uses the same | emphasis](#can-close-emphasis), and that uses the same | |||
character (`_` or `*`) as the opening delimiter. There must | character (`_` or `*`) as the opening delimiter. There must | |||
skipping to change at line 5075 | skipping to change at line 5074 | |||
. | . | |||
. | . | |||
__a<http://foo.bar?q=__> | __a<http://foo.bar?q=__> | |||
. | . | |||
<p>__a<a href="http://foo.bar?q=__">http://foo.bar?q=__</a></p> | <p>__a<a href="http://foo.bar?q=__">http://foo.bar?q=__</a></p> | |||
. | . | |||
## Links | ## Links | |||
A link contains a [link label](#link-label) (the visible text), | A link contains [link text](#link-label) (the visible text), | |||
a [destination](#destination) (the URI that is the link destination), | a [destination](#destination) (the URI that is the link destination), | |||
and optionally a [link title](#link-title). There are two basic kinds | and optionally a [link title](#link-title). There are two basic kinds | |||
of links in Markdown. In [inline links](#inline-links) the destination | of links in Markdown. In [inline links](#inline-links) the destination | |||
and title are given immediately after the label. In [reference | and title are given immediately after the link text. In [reference | |||
links](#reference-links) the destination and title are defined elsewhere | links](#reference-links) the destination and title are defined elsewhere | |||
in the document. | in the document. | |||
A [link label](#link-label) <a id="link-label"></a> consists of | A [link text](@link-text) consists of a sequence of zero or more | |||
inline elements enclosed by square brackets (`[` and `]`). The | ||||
following rules apply: | ||||
- an opening `[`, followed by | - Links may not contain other links, at any level of nesting. | |||
- zero or more backtick code spans, autolinks, HTML tags, link labels, | ||||
backslash-escaped ASCII punctuation characters, or non-`]` characters, | ||||
followed by | ||||
- a closing `]`. | ||||
<span class="insert">Links may not contain other links, at any level of nesting | - Brackets are allowed in the link text only if (a) they are | |||
.</span> | backslash-escaped or (b) they appear as a matched pair of brackets, | |||
These rules are motivated by the following intuitive ideas: | with an open bracket `[`, a sequence of zero or more inlines, and | |||
a close bracket `]`. | ||||
- A link label is a container for inline elements. | - Backtick [code spans](#code-span), [autolinks](#autolink), and | |||
- The square brackets bind more tightly than emphasis markers, | raw [HTML tags](#html-tag) bind more tightly | |||
but less tightly than `<>` or `` ` ``. | than the brackets in link text. Thus, for example, | |||
- Link labels may contain material in matching square brackets. | `` [foo`]` `` could not be a link text, since the second `]` | |||
is part of a code span. | ||||
A [link destination](#link-destination) <a id="link-destination"></a> | - The brackets in link text bind more tightly than markers for | |||
consists of either | [emphasis and strong emphasis](#emphasis-and-strong-emphasis). | |||
Thus, for example, `*[foo*](url)` is a link. | ||||
A [link destination](@link-destination) consists of either | ||||
- a sequence of zero or more characters between an opening `<` and a | - a sequence of zero or more characters between an opening `<` and a | |||
closing `>` that contains no line breaks or unescaped `<` or `>` | closing `>` that contains no line breaks or unescaped `<` or `>` | |||
characters, or | characters, or | |||
- a nonempty sequence of characters that does not include | - a nonempty sequence of characters that does not include | |||
ASCII space or control characters, and includes parentheses | ASCII space or control characters, and includes parentheses | |||
only if (a) they are backslash-escaped or (b) they are part of | only if (a) they are backslash-escaped or (b) they are part of | |||
a balanced pair of unescaped parentheses that is not itself | a balanced pair of unescaped parentheses that is not itself | |||
inside a balanced pair of unescaped parentheses. | inside a balanced pair of unescaped parentheses. | |||
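The two forms of link destination can be checked with a small scanner. This is a sketch under assumptions: the function names are hypothetical, a backslash is treated as an escape only before ASCII punctuation, "control characters" is taken to mean U+0000 through U+001F plus U+007F, and the two alternatives are tried independently, as a literal reading of "either ... or".

```python
import string


def _is_pointy(s: str) -> bool:
    """Check the `<...>` form: no line breaks, no unescaped `<` or `>` inside."""
    if not s.startswith("<"):
        return False
    i = 1
    while i < len(s):
        c = s[i]
        if c == "\\" and i + 1 < len(s) and s[i + 1] in string.punctuation:
            i += 2  # skip the escaped punctuation character
            continue
        if c in "\n\r<":
            return False
        if c == ">":
            return i == len(s) - 1  # the closing `>` must be the last character
        i += 1
    return False


def _is_bare(s: str) -> bool:
    """Check the bare form: nonempty, no spaces or controls, shallow parentheses."""
    if not s:
        return False
    depth = 0
    i = 0
    while i < len(s):
        c = s[i]
        if c == "\\" and i + 1 < len(s) and s[i + 1] in string.punctuation:
            i += 2
            continue
        if c == " " or ord(c) < 0x20 or ord(c) == 0x7F:
            return False
        if c == "(":
            depth += 1
            if depth > 1:  # a balanced pair may not sit inside another pair
                return False
        elif c == ")":
            if depth == 0:
                return False
            depth -= 1
        i += 1
    return depth == 0


def is_link_destination(s: str) -> bool:
    return _is_pointy(s) or _is_bare(s)
```

Tracking a nesting depth of at most one is what enforces the "not itself inside a balanced pair" clause: `foo(bar)` passes, while `foo(a(b))` does not.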
A [link title](#link-title) <a id="link-title"></a> consists of either | A [link title](@link-title) consists of either | |||
- a sequence of zero or more characters between straight double-quote | - a sequence of zero or more characters between straight double-quote | |||
characters (`"`), including a `"` character only if it is | characters (`"`), including a `"` character only if it is | |||
backslash-escaped, or | backslash-escaped, or | |||
- a sequence of zero or more characters between straight single-quote | - a sequence of zero or more characters between straight single-quote | |||
characters (`'`), including a `'` character only if it is | characters (`'`), including a `'` character only if it is | |||
backslash-escaped, or | backslash-escaped, or | |||
- a sequence of zero or more characters between matching parentheses | - a sequence of zero or more characters between matching parentheses | |||
(`(...)`), including a `)` character only if it is backslash-escaped. | (`(...)`), including a `)` character only if it is backslash-escaped. | |||
An [inline link](#inline-link) <a id="inline-link"></a> | An [inline link](@inline-link) | |||
consists of a [link label](#link-label) followed immediately | consists of a [link text](#link-text) followed immediately | |||
by a left parenthesis `(`, optional whitespace, | by a left parenthesis `(`, optional whitespace, | |||
an optional [link destination](#link-destination), | an optional [link destination](#link-destination), | |||
an optional [link title](#link-title) separated from the link | an optional [link title](#link-title) separated from the link | |||
destination by whitespace, optional whitespace, and a right | destination by whitespace, optional whitespace, and a right | |||
parenthesis `)`. The link's text consists of the label (excluding | parenthesis `)`. The link's text consists of the inlines contained | |||
the enclosing square brackets) parsed as inlines. The link's | in the [link text](#link-text) (excluding the enclosing square brackets). | |||
URI consists of the link destination, excluding enclosing `<...>` if | The link's URI consists of the link destination, excluding enclosing | |||
present, with backslash-escapes in effect as described above. The | `<...>` if present, with backslash-escapes in effect as described | |||
link's title consists of the link title, excluding its enclosing | above. The link's title consists of the link title, excluding its | |||
delimiters, with backslash-escapes in effect as described above. | enclosing delimiters, with backslash-escapes in effect as described | |||
above. | ||||
Here is a simple inline link: | Here is a simple inline link: | |||
. | . | |||
[link](/uri "title") | [link](/uri "title") | |||
. | . | |||
<p><a href="/uri" title="title">link</a></p> | <p><a href="/uri" title="title">link</a></p> | |||
. | . | |||
The title may be omitted: | The title may be omitted: | |||
skipping to change at line 5310 | skipping to change at line 5315 | |||
Whitespace is allowed around the destination and title: | Whitespace is allowed around the destination and title: | |||
. | . | |||
[link]( /uri | [link]( /uri | |||
"title" ) | "title" ) | |||
. | . | |||
<p><a href="/uri" title="title">link</a></p> | <p><a href="/uri" title="title">link</a></p> | |||
. | . | |||
But it is not allowed between the link label and the | But it is not allowed between the link text and the | |||
following parenthesis: | following parenthesis: | |||
. | . | |||
[link] (/uri) | [link] (/uri) | |||
. | . | |||
<p>[link] (/uri)</p> | <p>[link] (/uri)</p> | |||
. | . | |||
Note that this is not a link, because the closing `]` occurs in | The link text may contain balanced brackets, but not unbalanced ones, | |||
an HTML tag: | unless they are escaped: | |||
. | ||||
[link [foo [bar]]](/uri) | ||||
. | ||||
<p><a href="/uri">link [foo [bar]]</a></p> | ||||
. | ||||
. | ||||
[link] bar](/uri) | ||||
. | ||||
<p>[link] bar](/uri)</p> | ||||
. | ||||
. | ||||
[link [bar](/uri) | ||||
. | ||||
<p>[link <a href="/uri">bar</a></p> | ||||
. | ||||
. | ||||
[link \[bar](/uri) | ||||
. | ||||
<p><a href="/uri">link [bar</a></p> | ||||
. | ||||
The link text may contain inline content: | ||||
. | ||||
[link *foo **bar** `#`*](/uri) | ||||
. | ||||
<p><a href="/uri">link <em>foo <strong>bar</strong> <code>#</code></em></a></p> | ||||
. | ||||
. | ||||
[![moon](moon.jpg)](/uri) | ||||
. | ||||
<p><a href="/uri"><img src="moon.jpg" alt="moon" /></a></p> | ||||
. | ||||
However, links may not contain other links, at any level of nesting. | ||||
. | ||||
[foo [bar](/uri)](/uri) | ||||
. | ||||
<p>[foo <a href="/uri">bar</a>](/uri)</p> | ||||
. | ||||
. | ||||
[foo *[bar [baz](/uri)](/uri)*](/uri) | ||||
. | ||||
<p>[foo <em>[bar <a href="/uri">baz</a>](/uri)</em>](/uri)</p> | ||||
. | ||||
These cases illustrate the precedence of link text grouping over | ||||
emphasis grouping: | ||||
. | ||||
*[foo*](/uri) | ||||
. | ||||
<p>*<a href="/uri">foo*</a></p> | ||||
. | ||||
. | ||||
[foo *bar](baz*) | ||||
. | ||||
<p><a href="baz*">foo *bar</a></p> | ||||
. | ||||
These cases illustrate the precedence of HTML tags, code spans, | ||||
and autolinks over link grouping: | ||||
. | . | |||
[foo <bar attr="](baz)"> | [foo <bar attr="](baz)"> | |||
. | . | |||
<p>[foo <bar attr="](baz)"></p> | <p>[foo <bar attr="](baz)"></p> | |||
. | . | |||
There are three kinds of [reference links](#reference-link): | . | |||
<a id="reference-link"></a> | [foo`](/uri)` | |||
. | ||||
<p>[foo<code>](/uri)</code></p> | ||||
. | ||||
A [full reference link](#full-reference-link) <a id="full-reference-link"></a> | . | |||
consists of a [link label](#link-label), optional whitespace, and | [foo<http://example.com?search=](uri)> | |||
another [link label](#link-label) that [matches](#matches) a | . | |||
<p>[foo<a href="http://example.com?search=%5D(uri)">http://example.com?search=](uri)</a></p> | ||||
. | ||||
There are three kinds of [reference links](@reference-link): | ||||
[full](#full-reference-link), [collapsed](#collapsed-reference-link), | ||||
and [shortcut](#shortcut-reference-link). | ||||
A [full reference link](@full-reference-link) | ||||
consists of a [link text](#link-text), optional whitespace, and | ||||
a [link label](#link-label) that [matches](#matches) a | ||||
[link reference definition](#link-reference-definition) elsewhere in the | [link reference definition](#link-reference-definition) elsewhere in the | |||
document. | document. | |||
One label [matches](#matches) <a id="matches"></a> | A [link label](@link-label) begins with a left bracket (`[`) and ends | |||
with the first right bracket (`]`) that is not backslash-escaped. | ||||
Unescaped square bracket characters are not allowed in | ||||
[link labels](#link-label). A link label can have at most 999 | ||||
characters inside the square brackets. | ||||
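The link label definition in the new text can be sketched as a scanner that either finds the end of a label or gives up. The function name is hypothetical, and counting backslashes toward the 999-character limit is an assumption; the text does not say how escapes are counted.

```python
def parse_link_label(text: str, start: int = 0):
    """Return the index just past the closing `]`, or None if no label starts here."""
    if start >= len(text) or text[start] != "[":
        return None
    i = start + 1
    while i < len(text):
        c = text[i]
        # A backslash escapes a following bracket (or another backslash).
        if c == "\\" and i + 1 < len(text) and text[i + 1] in "[]\\":
            i += 2
            continue
        if c == "[":
            return None  # unescaped brackets are not allowed inside a label
        if c == "]":
            inner = text[start + 1:i]
            # Assumption: the limit counts raw characters, escapes included.
            return i + 1 if len(inner) <= 999 else None
        i += 1
    return None
```

Because the label ends at the first unescaped `]`, a string such as ``[foo`]` `` yields the label ``[foo`]`` regardless of the backticks.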
One label [matches](@matches) | ||||
another just in case their normalized forms are equal. To normalize a | another just in case their normalized forms are equal. To normalize a | |||
label, perform the *unicode case fold* and collapse consecutive internal | label, perform the *unicode case fold* and collapse consecutive internal | |||
whitespace to a single space. If there are multiple matching reference | whitespace to a single space. If there are multiple matching reference | |||
link definitions, the one that comes first in the document is used. (It | link definitions, the one that comes first in the document is used. (It | |||
is desirable in such cases to emit a warning.) | is desirable in such cases to emit a warning.) | |||
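The matching rule can be illustrated in a few lines of Python, whose `str.casefold` performs Unicode case folding. The function names and the `(label, destination)` pair format are assumptions made for this sketch, and the labels are assumed to be passed without their enclosing brackets.

```python
import re


def normalize_label(label: str) -> str:
    """Case-fold the label and collapse whitespace runs to a single space."""
    return re.sub(r"\s+", " ", label.casefold())


def labels_match(a: str, b: str) -> bool:
    return normalize_label(a) == normalize_label(b)


def build_reference_map(definitions):
    """Map normalized labels to destinations; the first definition wins."""
    refs = {}
    for label, destination in definitions:
        # setdefault keeps the earliest definition when labels collide.
        refs.setdefault(normalize_label(label), destination)
    return refs
```

An implementation that wants to emit the suggested warning could check whether the normalized label is already present before calling `setdefault`.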
The contents of the first link label are parsed as inlines, which are | The contents of the first link label are parsed as inlines, which are | |||
used as the link's text. The link's URI and title are provided by the | used as the link's text. The link's URI and title are provided by the | |||
matching [link reference definition](#link-reference-definition). | matching [link reference definition](#link-reference-definition). | |||
Here is a simple example: | Here is a simple example: | |||
. | . | |||
[foo][bar] | [foo][bar] | |||
[bar]: /url "title" | [bar]: /url "title" | |||
. | . | |||
<p><a href="/url" title="title">foo</a></p> | <p><a href="/url" title="title">foo</a></p> | |||
. | . | |||
The first label can contain inline content: | The rules for the [link text](#link-text) are the same as with | |||
[inline links](#inline-link). Thus: | ||||
The link text may contain balanced brackets, but not unbalanced ones, | ||||
unless they are escaped: | ||||
. | . | |||
[*foo\!*][bar] | [link [foo [bar]]][ref] | |||
[bar]: /url "title" | [ref]: /uri | |||
. | . | |||
<p><a href="/url" title="title"><em>foo!</em></a></p> | <p><a href="/uri">link [foo [bar]]</a></p> | |||
. | ||||
. | ||||
[link \[bar][ref] | ||||
[ref]: /uri | ||||
. | ||||
<p><a href="/uri">link [bar</a></p> | ||||
. | ||||
The link text may contain inline content: | ||||
. | ||||
[link *foo **bar** `#`*][ref] | ||||
[ref]: /uri | ||||
. | ||||
<p><a href="/uri">link <em>foo <strong>bar</strong> <code>#</code></em></a></p> | ||||
. | ||||
. | ||||
[![moon](moon.jpg)][ref] | ||||
[ref]: /uri | ||||
. | ||||
<p><a href="/uri"><img src="moon.jpg" alt="moon" /></a></p> | ||||
. | ||||
However, links may not contain other links, at any level of nesting. | ||||
. | ||||
[foo [bar](/uri)][ref] | ||||
[ref]: /uri | ||||
. | ||||
<p>[foo <a href="/uri">bar</a>]<a href="/uri">ref</a></p> | ||||
. | ||||
. | ||||
[foo *bar [baz][ref]*][ref] | ||||
[ref]: /uri | ||||
. | ||||
<p>[foo <em>bar <a href="/uri">baz</a></em>]<a href="/uri">ref</a></p> | ||||
. | ||||
(In the examples above, we have two [shortcut reference | ||||
links](#shortcut-reference-link) instead of one [full reference | ||||
link](#full-reference-link).) | ||||
The following cases illustrate the precedence of link text grouping over | ||||
emphasis grouping: | ||||
. | ||||
*[foo*][ref] | ||||
[ref]: /uri | ||||
. | ||||
<p>*<a href="/uri">foo*</a></p> | ||||
. | ||||
. | ||||
[foo *bar][ref] | ||||
[ref]: /uri | ||||
. | ||||
<p><a href="/uri">foo *bar</a></p> | ||||
. | ||||
These cases illustrate the precedence of HTML tags, code spans, | ||||
and autolinks over link grouping: | ||||
. | ||||
[foo <bar attr="][ref]"> | ||||
[ref]: /uri | ||||
. | ||||
<p>[foo <bar attr="][ref]"></p> | ||||
. | ||||
. | ||||
[foo`][ref]` | ||||
[ref]: /uri | ||||
. | ||||
<p>[foo<code>][ref]</code></p> | ||||
. | ||||
. | ||||
[foo<http://example.com?search=][ref]> | ||||
[ref]: /uri | ||||
. | ||||
<p>[foo<a href="http://example.com?search=%5D%5Bref%5D">http://example.com?search=][ref]</a></p> | ||||
. | . | |||
Matching is case-insensitive: | Matching is case-insensitive: | |||
. | . | |||
[foo][BaR] | [foo][BaR] | |||
[bar]: /url "title" | [bar]: /url "title" | |||
. | . | |||
<p><a href="/url" title="title">foo</a></p> | <p><a href="/url" title="title">foo</a></p> | |||
skipping to change at line 5400 | skipping to change at line 5592 | |||
. | . | |||
[Foo | [Foo | |||
bar]: /url | bar]: /url | |||
[Baz][Foo bar] | [Baz][Foo bar] | |||
. | . | |||
<p><a href="/url">Baz</a></p> | <p><a href="/url">Baz</a></p> | |||
. | . | |||
There can be whitespace between the two labels: | There can be whitespace between the [link text](#link-text) and the | |||
[link label](#link-label): | ||||
. | . | |||
[foo] [bar] | [foo] [bar] | |||
[bar]: /url "title" | [bar]: /url "title" | |||
. | . | |||
<p><a href="/url" title="title">foo</a></p> | <p><a href="/url" title="title">foo</a></p> | |||
. | . | |||
. | . | |||
skipping to change at line 5444 | skipping to change at line 5637 | |||
labels define equivalent inline content: | labels define equivalent inline content: | |||
. | . | |||
[bar][foo\!] | [bar][foo\!] | |||
[foo!]: /url | [foo!]: /url | |||
. | . | |||
<p>[bar][foo!]</p> | <p>[bar][foo!]</p> | |||
. | . | |||
A [collapsed reference link](#collapsed-reference-link) | [Link labels](#link-label) cannot contain brackets, unless they are | |||
<a id="collapsed-reference-link"></a> consists of a [link | backslash-escaped: | |||
. | ||||
[foo][ref[] | ||||
[ref[]: /uri | ||||
. | ||||
<p>[foo][ref[]</p> | ||||
<p>[ref[]: /uri</p> | ||||
. | ||||
. | ||||
[foo][ref[bar]] | ||||
[ref[bar]]: /uri | ||||
. | ||||
<p>[foo][ref[bar]]</p> | ||||
<p>[ref[bar]]: /uri</p> | ||||
. | ||||
. | ||||
[[[foo]]] | ||||
[[[foo]]]: /url | ||||
. | ||||
<p>[[[foo]]]</p> | ||||
<p>[[[foo]]]: /url</p> | ||||
. | ||||
. | ||||
[foo][ref\[] | ||||
[ref\[]: /uri | ||||
. | ||||
<p><a href="/uri">foo</a></p> | ||||
. | ||||
A [collapsed reference link](@collapsed-reference-link) | ||||
consists of a [link | ||||
label](#link-label) that [matches](#matches) a [link reference | label](#link-label) that [matches](#matches) a [link reference | |||
definition](#link-reference-definition) elsewhere in the | definition](#link-reference-definition) elsewhere in the | |||
document, optional whitespace, and the string `[]`. The contents of the | document, optional whitespace, and the string `[]`. The contents of the | |||
first link label are parsed as inlines, which are used as the link's | first link label are parsed as inlines, which are used as the link's | |||
text. The link's URI and title are provided by the matching reference | text. The link's URI and title are provided by the matching reference | |||
link definition. Thus, `[foo][]` is equivalent to `[foo][foo]`. | link definition. Thus, `[foo][]` is equivalent to `[foo][foo]`. | |||
. | . | |||
[foo][] | [foo][] | |||
skipping to change at line 5491 | skipping to change at line 5722 | |||
. | . | |||
[foo] | [foo] | |||
[] | [] | |||
[foo]: /url "title" | [foo]: /url "title" | |||
. | . | |||
<p><a href="/url" title="title">foo</a></p> | <p><a href="/url" title="title">foo</a></p> | |||
. | . | |||
A [shortcut reference link](#shortcut-reference-link) | A [shortcut reference link](@shortcut-reference-link) | |||
<a id="shortcut-reference-link"></a> consists of a [link | consists of a [link | |||
label](#link-label) that [matches](#matches) a [link reference | label](#link-label) that [matches](#matches) a [link reference | |||
definition](#link-reference-definition) elsewhere in the | definition](#link-reference-definition) elsewhere in the | |||
document and is not followed by `[]` or a link label. | document and is not followed by `[]` or a link label. | |||
The contents of the first link label are parsed as inlines, | The contents of the first link label are parsed as inlines, | |||
which are used as the link's text. The link's URI and title | which are used as the link's text. The link's URI and title | |||
are provided by the matching link reference definition. | are provided by the matching link reference definition. | |||
Thus, `[foo]` is equivalent to `[foo][]`. | Thus, `[foo]` is equivalent to `[foo][]`. | |||
. | . | |||
[foo] | [foo] | |||
skipping to change at line 5546 | skipping to change at line 5777 | |||
opening bracket to avoid links: | opening bracket to avoid links: | |||
. | . | |||
\[foo] | \[foo] | |||
[foo]: /url "title" | [foo]: /url "title" | |||
. | . | |||
<p>[foo]</p> | <p>[foo]</p> | |||
. | . | |||
Note that this is a link, because link labels bind more tightly | Note that this is a link, because a link label ends with the first | |||
than emphasis: | following closing bracket: | |||
. | . | |||
[foo*]: /url | [foo*]: /url | |||
*[foo*] | *[foo*] | |||
. | . | |||
<p>*<a href="/url">foo*</a></p> | <p>*<a href="/url">foo*</a></p> | |||
. | . | |||
However, this is not, because link labels bind less | This is a link too, for the same reason: | |||
tightly than code backticks: | ||||
. | . | |||
[foo`]: /url | [foo`]: /url | |||
[foo`]` | [foo`]` | |||
. | . | |||
<p>[foo<code>]</code></p> | <p>[foo<code>]</code></p> | |||
. | . | |||
Link labels can contain matched square brackets: | ||||
. | ||||
[[[foo]]] | ||||
[[[foo]]]: /url | ||||
. | ||||
<p><a href="/url">[[foo]]</a></p> | ||||
. | ||||
. | ||||
[[[foo]]] | ||||
[[[foo]]]: /url1 | ||||
[foo]: /url2 | ||||
. | ||||
<p><a href="/url1">[[foo]]</a></p> | ||||
. | ||||
For non-matching brackets, use backslash escapes: | ||||
. | ||||
[\[foo] | ||||
[\[foo]: /url | ||||
. | ||||
<p><a href="/url">[foo</a></p> | ||||
. | ||||
Full references take precedence over shortcut references: | Full references take precedence over shortcut references: | |||
. | . | |||
[foo][bar] | [foo][bar] | |||
[foo]: /url1 | [foo]: /url1 | |||
[bar]: /url2 | [bar]: /url2 | |||
. | . | |||
<p><a href="/url2">foo</a></p> | <p><a href="/url2">foo</a></p> | |||
. | . | |||
skipping to change at line 5645 | skipping to change at line 5846 | |||
[foo][bar][baz] | [foo][bar][baz] | |||
[baz]: /url1 | [baz]: /url1 | |||
[foo]: /url2 | [foo]: /url2 | |||
. | . | |||
<p>[foo]<a href="/url1">bar</a></p> | <p>[foo]<a href="/url1">bar</a></p> | |||
. | . | |||
## Images | ## Images | |||
An (unescaped) exclamation mark (`!`) followed by a reference or | Syntax for images is like the syntax for links, with one | |||
inline link will be parsed as an image. The plain string content | difference. Instead of [link text](#link-text), we have an [image | |||
of the link label will be used as the image's alt text, and the link | description](@image-description). The rules for this are the | |||
title, if any, will be used as the image's title. | same as for [link text](#link-text), except that (a) an | |||
image description starts with `![` rather than `[`, and | ||||
(b) an image description may contain links, but not images | ||||
(even deeply nested). An image description has inline elements | ||||
as its contents. When an image is rendered to HTML, | ||||
this is standardly used as the image's `alt` attribute. | ||||
. | . | |||
![foo](/url "title") | ![foo](/url "title") | |||
. | . | |||
<p><img src="/url" alt="foo" title="title" /></p> | <p><img src="/url" alt="foo" title="title" /></p> | |||
. | . | |||
. | . | |||
![foo *bar*] | ![foo *bar*] | |||
[foo *bar*]: train.jpg "train & tracks" | [foo *bar*]: train.jpg "train & tracks" | |||
. | . | |||
<p><img src="train.jpg" alt="foo bar" title="train &amp; tracks" /></p> | <p><img src="train.jpg" alt="foo bar" title="train &amp; tracks" /></p> | |||
. | . | |||
Note that in the above example, the alt text is `foo bar`, not `foo | . | |||
*bar*` or `foo <em>bar</em>` or `foo &lt;em&gt;bar&lt;/em&gt;`. Only | ![foo ![bar](/url)](/url2) | |||
the plain string content is rendered, without formatting. | . | |||
<p>![foo <img src="/url" alt="bar" />](/url2)</p> | ||||
. | ||||
. | ||||
![foo [bar](/url)](/url2) | ||||
. | ||||
<p><img src="/url2" alt="foo bar" /></p> | ||||
. | ||||
Though this spec is concerned with parsing, not rendering, it is | ||||
recommended that in rendering to HTML, only the plain string content | ||||
of the [image description](#image-description) be used. Note that in | ||||
the above example, the alt attribute's value is `foo bar`, not `foo | ||||
[bar](/url)` or `foo <a href="/url">bar</a>`. Only the plain string | ||||
content is rendered, without formatting. | ||||
. | . | |||
![foo *bar*][] | ![foo *bar*][] | |||
[foo *bar*]: train.jpg "train & tracks" | [foo *bar*]: train.jpg "train & tracks" | |||
. | . | |||
<p><img src="train.jpg" alt="foo bar" title="train &amp; tracks" /></p> | <p><img src="train.jpg" alt="foo bar" title="train &amp; tracks" /></p> | |||
. | . | |||
. | . | |||
skipping to change at line 5784 | skipping to change at line 6005 | |||
. | . | |||
. | . | |||
![*foo* bar] | ![*foo* bar] | |||
[*foo* bar]: /url "title" | [*foo* bar]: /url "title" | |||
. | . | |||
<p><img src="/url" alt="foo bar" title="title" /></p> | <p><img src="/url" alt="foo bar" title="title" /></p> | |||
. | . | |||
Note that link labels cannot contain unescaped brackets: | ||||
. | . | |||
![[foo]] | ![[foo]] | |||
[[foo]]: /url "title" | [[foo]]: /url "title" | |||
. | . | |||
<p><img src="/url" alt="[foo]" title="title" /></p> | <p>![[foo]]</p> | |||
<p>[[foo]]: /url "title"</p> | ||||
. | . | |||
The link labels are case-insensitive: | The link labels are case-insensitive: | |||
. | . | |||
![Foo] | ![Foo] | |||
[foo]: /url "title" | [foo]: /url "title" | |||
. | . | |||
<p><img src="/url" alt="Foo" title="title" /></p> | <p><img src="/url" alt="Foo" title="title" /></p> | |||
skipping to change at line 5826 | skipping to change at line 6050 | |||
. | . | |||
\![foo] | \![foo] | |||
[foo]: /url "title" | [foo]: /url "title" | |||
. | . | |||
<p>!<a href="/url" title="title">foo</a></p> | <p>!<a href="/url" title="title">foo</a></p> | |||
. | . | |||
## Autolinks | ## Autolinks | |||
Autolinks are absolute URIs and email addresses inside `<` and `>`. | [Autolinks](@autolink) are absolute URIs and email addresses inside `<` and `>`. | |||
They are parsed as links, with the URL or email address as the link | They are parsed as links, with the URL or email address as the link | |||
label. | label. | |||
A [URI autolink](#uri-autolink) <a id="uri-autolink"></a> | A [URI autolink](@uri-autolink) | |||
consists of `<`, followed by an [absolute | consists of `<`, followed by an [absolute | |||
URI](#absolute-uri) not containing `<`, followed by `>`. It is parsed | URI](#absolute-uri) not containing `<`, followed by `>`. It is parsed | |||
as a link to the URI, with the URI as the link's label. | as a link to the URI, with the URI as the link's label. | |||
An [absolute URI](#absolute-uri), <a id="absolute-uri"></a> | An [absolute URI](@absolute-uri), | |||
for these purposes, consists of a [scheme](#scheme) followed by a colon (`:`) | for these purposes, consists of a [scheme](#scheme) followed by a colon (`:`) | |||
followed by zero or more characters other than ASCII whitespace and | followed by zero or more characters other than ASCII whitespace and | |||
control characters, `<`, and `>`. If the URI includes these characters, | control characters, `<`, and `>`. If the URI includes these characters, | |||
you must use percent-encoding (e.g. `%20` for a space). | you must use percent-encoding (e.g. `%20` for a space). | |||
The following [schemes](#scheme) <a id="scheme"></a> | The following [schemes](@scheme) | |||
are recognized (case-insensitive): | are recognized (case-insensitive): | |||
`coap`, `doi`, `javascript`, `aaa`, `aaas`, `about`, `acap`, `cap`, | `coap`, `doi`, `javascript`, `aaa`, `aaas`, `about`, `acap`, `cap`, | |||
`cid`, `crid`, `data`, `dav`, `dict`, `dns`, `file`, `ftp`, `geo`, `go`, | `cid`, `crid`, `data`, `dav`, `dict`, `dns`, `file`, `ftp`, `geo`, `go`, | |||
`gopher`, `h323`, `http`, `https`, `iax`, `icap`, `im`, `imap`, `info`, | `gopher`, `h323`, `http`, `https`, `iax`, `icap`, `im`, `imap`, `info`, | |||
`ipp`, `iris`, `iris.beep`, `iris.xpc`, `iris.xpcs`, `iris.lwz`, `ldap`, | `ipp`, `iris`, `iris.beep`, `iris.xpc`, `iris.xpcs`, `iris.lwz`, `ldap`, | |||
`mailto`, `mid`, `msrp`, `msrps`, `mtqp`, `mupdate`, `news`, `nfs`, | `mailto`, `mid`, `msrp`, `msrps`, `mtqp`, `mupdate`, `news`, `nfs`, | |||
`ni`, `nih`, `nntp`, `opaquelocktoken`, `pop`, `pres`, `rtsp`, | `ni`, `nih`, `nntp`, `opaquelocktoken`, `pop`, `pres`, `rtsp`, | |||
`service`, `session`, `shttp`, `sieve`, `sip`, `sips`, `sms`, `snmp`, | `service`, `session`, `shttp`, `sieve`, `sip`, `sips`, `sms`, `snmp`, | |||
`soap.beep`, `soap.beeps`, `tag`, `tel`, `telnet`, `tftp`, `thismessage`, | `soap.beep`, `soap.beeps`, `tag`, `tel`, `telnet`, `tftp`, `thismessage`, | |||
`tn3270`, `tip`, `tv`, `urn`, `vemmi`, `ws`, `wss`, `xcon`, | `tn3270`, `tip`, `tv`, `urn`, `vemmi`, `ws`, `wss`, `xcon`, | |||
skipping to change at line 5903 | skipping to change at line 6127 | |||
. | . | |||
Spaces are not allowed in autolinks: | Spaces are not allowed in autolinks: | |||
. | . | |||
<http://foo.bar/baz bim> | <http://foo.bar/baz bim> | |||
. | . | |||
<p>&lt;http://foo.bar/baz bim&gt;</p> | <p>&lt;http://foo.bar/baz bim&gt;</p> | |||
. | . | |||
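The URI autolink rule above can be sketched as a single regular expression. This is only an illustration, not the reference implementation: the names `SCHEMES`, `URI_AUTOLINK`, and `match_uri_autolink` are hypothetical, and `SCHEMES` holds just a handful of the schemes listed above rather than the full list.

```python
import re

# Illustrative subset only (an assumption of this sketch); the spec's
# list of recognized schemes is much longer.
SCHEMES = ("http", "https", "ftp", "mailto", "file")

# "<", a scheme, ":", then zero or more characters other than ASCII
# whitespace, control characters, "<", and ">", and finally ">".
URI_AUTOLINK = re.compile(
    r"<((?:%s):[^\x00-\x20\x7f<>]*)>" % "|".join(re.escape(s) for s in SCHEMES),
    re.IGNORECASE,  # schemes are recognized case-insensitively
)


def match_uri_autolink(text):
    """Return the URI if `text` starts with a URI autolink, else None."""
    m = URI_AUTOLINK.match(text)
    return m.group(1) if m else None
```

Under these assumptions, `<http://foo.bar/baz bim>` is rejected because the space stops the URI before a closing `>` is found.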
An [email autolink](#email-autolink) <a id="email-autolink"></a> | An [email autolink](@email-autolink) | |||
consists of `<`, followed by an [email address](#email-address), | consists of `<`, followed by an [email address](#email-address), | |||
followed by `>`. The link's label is the email address, | followed by `>`. The link's label is the email address, | |||
and the URL is `mailto:` followed by the email address. | and the URL is `mailto:` followed by the email address. | |||
An [email address](#email-address), <a id="email-address"></a> | An [email address](@email-address), | |||
for these purposes, is anything that matches | for these purposes, is anything that matches | |||
the [non-normative regex from the HTML5 | the [non-normative regex from the HTML5 | |||
spec](http://www.whatwg.org/specs/web-apps/current-work/multipage/forms.html#e-mail-state-%28type=email%29): | spec](http://www.whatwg.org/specs/web-apps/current-work/multipage/forms.html#e-mail-state-%28type=email%29): | |||
/^[a-zA-Z0-9.!#$%&'*+/=?^_`{|}~-]+@[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])? | /^[a-zA-Z0-9.!#$%&'*+/=?^_`{|}~-]+@[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])? | |||
(?:\.[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?)*$/ | (?:\.[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?)*$/ | |||
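As a rough sketch of how the email autolink rule might be applied, the regex above can be transcribed into Python and wrapped in a small helper. The helper name `parse_email_autolink` and its `(url, label)` return shape are assumptions made for illustration. The `^` and `$` anchors are dropped because `re.fullmatch` already requires the whole address to match.

```python
import re

# The regex quoted above, without the surrounding /^ ... $/ delimiters.
EMAIL_ADDRESS = re.compile(
    r"[a-zA-Z0-9.!#$%&'*+/=?^_`{|}~-]+@[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?"
    r"(?:\.[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?)*"
)


def parse_email_autolink(text):
    """Return (url, label) if `text` is exactly an email autolink, else None."""
    if len(text) < 2 or text[0] != "<" or text[-1] != ">":
        return None
    address = text[1:-1]
    if EMAIL_ADDRESS.fullmatch(address) is None:
        return None
    # The label is the address; the URL is "mailto:" plus the address.
    return ("mailto:" + address, address)
```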
Examples of email autolinks: | Examples of email autolinks: | |||
. | . | |||
skipping to change at line 5983 | skipping to change at line 6207 | |||
## Raw HTML | ## Raw HTML | |||
Text between `<` and `>` that looks like an HTML tag is parsed as a | Text between `<` and `>` that looks like an HTML tag is parsed as a | |||
raw HTML tag and will be rendered in HTML without escaping. | raw HTML tag and will be rendered in HTML without escaping. | |||
Tag and attribute names are not limited to current HTML tags, | Tag and attribute names are not limited to current HTML tags, | |||
so custom tags (and even, say, DocBook tags) may be used. | so custom tags (and even, say, DocBook tags) may be used. | |||
Here is the grammar for tags: | Here is the grammar for tags: | |||
A [tag name](#tag-name) <a id="tag-name"></a> consists of an ASCII letter | A [tag name](@tag-name) consists of an ASCII letter | |||
followed by zero or more ASCII letters or digits. | followed by zero or more ASCII letters or digits. | |||
An [attribute](#attribute) <a id="attribute"></a> consists of whitespace, | An [attribute](@attribute) consists of whitespace, | |||
an **attribute name**, and an optional **attribute value | an [attribute name](#attribute-name), and an optional | |||
specification**. | [attribute value specification](#attribute-value-specification). | |||
An [attribute name](#attribute-name) <a id="attribute-name"></a> | An [attribute name](@attribute-name) | |||
consists of an ASCII letter, `_`, or `:`, followed by zero or more ASCII | consists of an ASCII letter, `_`, or `:`, followed by zero or more ASCII | |||
letters, digits, `_`, `.`, `:`, or `-`. (Note: This is the XML | letters, digits, `_`, `.`, `:`, or `-`. (Note: This is the XML | |||
specification restricted to ASCII. HTML5 is laxer.) | specification restricted to ASCII. HTML5 is laxer.) | |||
An [attribute value specification](#attribute-value-specification) | An [attribute value specification](@attribute-value-specification) | |||
<a id="attribute-value-specification"></a> consists of optional whitespace, | consists of optional whitespace, | |||
a `=` character, optional whitespace, and an [attribute | a `=` character, optional whitespace, and an [attribute | |||
value](#attribute-value). | value](#attribute-value). | |||
An [attribute value](#attribute-value) <a id="attribute-value"></a> | An [attribute value](@attribute-value) | |||
consists of an [unquoted attribute value](#unquoted-attribute-value), | consists of an [unquoted attribute value](#unquoted-attribute-value), | |||
a [single-quoted attribute value](#single-quoted-attribute-value), | a [single-quoted attribute value](#single-quoted-attribute-value), | |||
or a [double-quoted attribute value](#double-quoted-attribute-value). | or a [double-quoted attribute value](#double-quoted-attribute-value). | |||
An [unquoted attribute value](#unquoted-attribute-value) | An [unquoted attribute value](@unquoted-attribute-value) | |||
<a id="unquoted-attribute-value"></a> is a nonempty string of characters not | is a nonempty string of characters not | |||
including spaces, `"`, `'`, `=`, `<`, `>`, or `` ` ``. | including spaces, `"`, `'`, `=`, `<`, `>`, or `` ` ``. | |||
A [single-quoted attribute value](#single-quoted-attribute-value) | A [single-quoted attribute value](@single-quoted-attribute-value) | |||
<a id="single-quoted-attribute-value"></a> consists of `'`, zero or more | consists of `'`, zero or more | |||
characters not including `'`, and a final `'`. | characters not including `'`, and a final `'`. | |||
A [double-quoted attribute value](#double-quoted-attribute-value) | A [double-quoted attribute value](@double-quoted-attribute-value) | |||
<a id="double-quoted-attribute-value"></a> consists of `"`, zero or more | consists of `"`, zero or more | |||
characters not including `"`, and a final `"`. | characters not including `"`, and a final `"`. | |||
An [open tag](#open-tag) <a id="open-tag"></a> consists of a `<` character, | An [open tag](@open-tag) consists of a `<` character, | |||
a [tag name](#tag-name), zero or more [attributes](#attribute), | a [tag name](#tag-name), zero or more [attributes](#attribute), | |||
optional whitespace, an optional `/` character, and a `>` character. | optional whitespace, an optional `/` character, and a `>` character. | |||
A [closing tag](#closing-tag) <a id="closing-tag"></a> consists of the | A [closing tag](@closing-tag) consists of the | |||
string `</`, a [tag name](#tag-name), optional whitespace, and the | string `</`, a [tag name](#tag-name), optional whitespace, and the | |||
character `>`. | character `>`. | |||
An [HTML comment](#html-comment) <a id="html-comment"></a> consists of the | An [HTML comment](@html-comment) consists of the | |||
string `<!--`, a string of characters not including the string `--`, and | string `<!--`, a string of characters not including the string `--`, and | |||
the string `-->`. | the string `-->`. | |||
A [processing instruction](#processing-instruction) | A [processing instruction](@processing-instruction) | |||
<a id="processing-instruction"></a> consists of the string `<?`, a string | consists of the string `<?`, a string | |||
of characters not including the string `?>`, and the string | of characters not including the string `?>`, and the string | |||
`?>`. | `?>`. | |||
A [declaration](#declaration) <a id="declaration"></a> consists of the | A [declaration](@declaration) consists of the | |||
string `<!`, a name consisting of one or more uppercase ASCII letters, | string `<!`, a name consisting of one or more uppercase ASCII letters, | |||
whitespace, a string of characters not including the character `>`, and | whitespace, a string of characters not including the character `>`, and | |||
the character `>`. | the character `>`. | |||
A [CDATA section](#cdata-section) <a id="cdata-section"></a> consists of | A [CDATA section](@cdata-section) consists of | |||
the string `<![CDATA[`, a string of characters not including the string | the string `<![CDATA[`, a string of characters not including the string | |||
`]]>`, and the string `]]>`. | `]]>`, and the string `]]>`. | |||
An [HTML tag](#html-tag) <a id="html-tag"></a> consists of an [open | An [HTML tag](@html-tag) consists of an [open | |||
tag](#open-tag), a [closing tag](#closing-tag), an [HTML | tag](#open-tag), a [closing tag](#closing-tag), an [HTML | |||
comment](#html-comment), a [processing | comment](#html-comment), a [processing | |||
instruction](#processing-instruction), a | instruction](#processing-instruction), a | |||
[declaration](#declaration), or a [CDATA | [declaration](#declaration), or a [CDATA | |||
section](#cdata-section). | section](#cdata-section). | |||
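The open-tag and closing-tag parts of this grammar compose naturally into regular expressions. The following is a minimal sketch under stated assumptions: all identifiers are hypothetical, "whitespace" is taken to mean space, tab, newline, vertical tab, form feed, and carriage return, and comments, processing instructions, declarations, and CDATA sections are left out.

```python
import re

WS = r"[ \t\n\x0b\x0c\r]"  # assumed meaning of "whitespace" for this sketch
TAG_NAME = r"[A-Za-z][A-Za-z0-9]*"
ATTR_NAME = r"[A-Za-z_:][A-Za-z0-9_.:-]*"
UNQUOTED = r"[^ \"'=<>`]+"
SINGLE_QUOTED = r"'[^']*'"
DOUBLE_QUOTED = r'"[^"]*"'
ATTR_VALUE = "(?:" + "|".join([UNQUOTED, SINGLE_QUOTED, DOUBLE_QUOTED]) + ")"
ATTR_VALUE_SPEC = WS + "*=" + WS + "*" + ATTR_VALUE
ATTRIBUTE = WS + "+" + ATTR_NAME + "(?:" + ATTR_VALUE_SPEC + ")?"
OPEN_TAG = re.compile("<" + TAG_NAME + "(?:" + ATTRIBUTE + ")*" + WS + "*/?>")
CLOSING_TAG = re.compile("</" + TAG_NAME + WS + "*>")


def is_open_tag(s):
    """True if the whole string is an open tag under the grammar above."""
    return OPEN_TAG.fullmatch(s) is not None


def is_closing_tag(s):
    """True if the whole string is a closing tag under the grammar above."""
    return CLOSING_TAG.fullmatch(s) is not None
```

Building each pattern from the one before it mirrors the way the definitions above refer to each other, which makes it easier to check the sketch against the prose one rule at a time.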
Here are some simple open tags: | Here are some simple open tags: | |||
. | . | |||
<a><bab><c2c> | <a><bab><c2c> | |||
skipping to change at line 6211 | skipping to change at line 6435 | |||
. | . | |||
<a href="\""> | <a href="\""> | |||
. | . | |||
<p>&lt;a href=&quot;&quot;&quot;&gt;</p> | <p>&lt;a href=&quot;&quot;&quot;&gt;</p> | |||
. | . | |||
## Hard line breaks | ## Hard line breaks | |||
A line break (not in a code span or HTML tag) that is preceded | A line break (not in a code span or HTML tag) that is preceded | |||
by two or more spaces is parsed as a [hard line | by two or more spaces and does not occur at the end of a block | |||
break](#hard-line-break)<a id="hard-line-break"></a> (rendered | is parsed as a [hard line break](@hard-line-break) (rendered | |||
in HTML as a `<br />` tag): | in HTML as a `<br />` tag): | |||
. | . | |||
foo | foo | |||
baz | baz | |||
. | . | |||
<p>foo<br /> | <p>foo<br /> | |||
baz</p> | baz</p> | |||
. | . | |||
skipping to change at line 6315 | skipping to change at line 6539 | |||
. | . | |||
. | . | |||
<a href="foo\ | <a href="foo\ | |||
bar"> | bar"> | |||
. | . | |||
<p><a href="foo\ | <p><a href="foo\ | |||
bar"></p> | bar"></p> | |||
. | . | |||
Hard line breaks are for separating inline content within a block. | ||||
Neither syntax for hard line breaks works at the end of a paragraph or | ||||
other block element: | ||||
. | ||||
foo\ | ||||
. | ||||
<p>foo\</p> | ||||
. | ||||
. | ||||
foo | ||||
. | ||||
<p>foo</p> | ||||
. | ||||
. | ||||
### foo\ | ||||
. | ||||
<h3>foo\</h3> | ||||
. | ||||
. | ||||
### foo | ||||
. | ||||
<h3>foo</h3> | ||||
. | ||||
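The end-of-block rule can be illustrated with a small sketch that renders the lines of a single paragraph. It is a simplification, not the spec's algorithm: the function name `render_paragraph_lines` is hypothetical, the input is assumed to be plain text with no code spans, HTML tags, or escaped backslashes, and no HTML escaping is performed.

```python
def render_paragraph_lines(lines):
    """Render the lines of one paragraph, emitting <br /> for hard breaks.

    A line ending in two or more spaces, or in a backslash, produces a
    hard line break, unless it is the last line of the block.
    """
    out = []
    last = len(lines) - 1
    for i, line in enumerate(lines):
        if i == last:
            # No hard break at the end of a block: trailing spaces are
            # dropped and a trailing backslash stays literal.
            out.append(line.rstrip(" "))
        elif line.endswith("  "):
            out.append(line.rstrip(" ") + "<br />")
        elif line.endswith("\\"):
            out.append(line[:-1] + "<br />")
        else:
            # Soft line break: rendered here as a plain newline.
            out.append(line.rstrip(" "))
    return "<p>" + "\n".join(out) + "</p>"
```

For instance, `render_paragraph_lines(["foo\\"])` leaves the backslash in place, matching the `<p>foo\</p>` example above.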
## Soft line breaks | ## Soft line breaks | |||
A regular line break (not in a code span or HTML tag) that is not | A regular line break (not in a code span or HTML tag) that is not | |||
preceded by two or more spaces is parsed as a softbreak. (A | preceded by two or more spaces is parsed as a softbreak. (A | |||
softbreak may be rendered in HTML either as a newline or as a space. | softbreak may be rendered in HTML either as a newline or as a space. | |||
The result will be the same in browsers. In the examples here, a | The result will be the same in browsers. In the examples here, a | |||
newline will be used.) | newline will be used.) | |||
. | . | |||
foo | foo | |||
End of changes. 97 change blocks. | ||||
178 lines changed or deleted | 430 lines changed or added | |||