spec.txt   spec.txt 
--- ---
title: CommonMark Spec title: CommonMark Spec
author: author:
- John MacFarlane - John MacFarlane
version: 0.10 version: 0.11
date: 2014-11-06 date: 2014-11-10
... ...
# Introduction # Introduction
## What is Markdown? ## What is Markdown?
Markdown is a plain text format for writing structured documents, Markdown is a plain text format for writing structured documents,
based on conventions used for indicating formatting in email and based on conventions used for indicating formatting in email and
usenet posts. It was developed in 2004 by John Gruber, who wrote usenet posts. It was developed in 2004 by John Gruber, who wrote
the first Markdown-to-HTML converter in perl, and it soon became the first Markdown-to-HTML converter in perl, and it soon became
skipping to change at line 194 skipping to change at line 194
This document is generated from a text file, `spec.txt`, written This document is generated from a text file, `spec.txt`, written
in Markdown with a small extension for the side-by-side tests. in Markdown with a small extension for the side-by-side tests.
The script `spec2md.pl` can be used to turn `spec.txt` into pandoc The script `spec2md.pl` can be used to turn `spec.txt` into pandoc
Markdown, which can then be converted into other formats. Markdown, which can then be converted into other formats.
In the examples, the `→` character is used to represent tabs. In the examples, the `→` character is used to represent tabs.
# Preprocessing # Preprocessing
A [line](#line) <a id="line"></a> A [line](@line)
is a sequence of zero or more [characters](#character) followed by a is a sequence of zero or more [characters](#character) followed by a
line ending (CR, LF, or CRLF) or by the end of file. line ending (CR, LF, or CRLF) or by the end of file.
A [character](#character)<a id="character"></a> is a unicode code point. A [character](@character) is a unicode code point.
This spec does not specify an encoding; it thinks of lines as composed This spec does not specify an encoding; it thinks of lines as composed
of characters rather than bytes. A conforming parser may be limited of characters rather than bytes. A conforming parser may be limited
to a certain encoding. to a certain encoding.
Tabs in lines are expanded to spaces, with a tab stop of 4 characters: Tabs in lines are expanded to spaces, with a tab stop of 4 characters:
. .
→foo→baz→→bim →foo→baz→→bim
. .
<pre><code>foo baz bim <pre><code>foo baz bim
skipping to change at line 224 skipping to change at line 224
ὐ→a ὐ→a
. .
<pre><code>a a <pre><code>a a
ὐ a ὐ a
</code></pre> </code></pre>
. .
Line endings are replaced by newline characters (LF). Line endings are replaced by newline characters (LF).
A line containing no characters, or a line containing only spaces (after A line containing no characters, or a line containing only spaces (after
tab expansion), is called a [blank line](#blank-line). tab expansion), is called a [blank line](@blank-line).
<a id="blank-line"></a>
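The preprocessing rules above (tab expansion with a tab stop of 4, line-ending normalization, and blank-line detection) can be sketched in Python roughly as follows. This is an illustration only: the function names `expand_tabs`, `normalize_line_endings`, and `is_blank_line` are assumptions and do not come from the spec or from any reference implementation.

```python
import re

TAB_STOP = 4  # tab stop stated in the text above


def expand_tabs(line: str, tab_stop: int = TAB_STOP) -> str:
    """Replace each tab with spaces up to the next multiple of tab_stop.

    `line` is assumed to hold a single line without its line ending.
    """
    out = []
    column = 0
    for ch in line:
        if ch == "\t":
            pad = tab_stop - (column % tab_stop)
            out.append(" " * pad)
            column += pad
        else:
            out.append(ch)
            column += 1
    return "".join(out)


def normalize_line_endings(text: str) -> str:
    """Replace CRLF and bare CR line endings with LF."""
    return re.sub(r"\r\n?", "\n", text)


def is_blank_line(line: str) -> bool:
    """True if the line is empty or holds only spaces after tab expansion."""
    return expand_tabs(line).strip(" ") == ""
```

Columns are counted in characters rather than bytes, in keeping with the spec's view of a line as a sequence of characters.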
# Blocks and inlines # Blocks and inlines
We can think of a document as a sequence of [blocks](#block)<a We can think of a document as a sequence of
id="block"></a>---structural elements like paragraphs, block quotations, [blocks](@block)---structural
elements like paragraphs, block quotations,
lists, headers, rules, and code blocks. Blocks can contain other lists, headers, rules, and code blocks. Blocks can contain other
blocks, or they can contain [inline](#inline)<a id="inline"></a> content: blocks, or they can contain [inline](@inline) content:
words, spaces, links, emphasized text, images, and inline code. words, spaces, links, emphasized text, images, and inline code.
## Precedence ## Precedence
Indicators of block structure always take precedence over indicators Indicators of block structure always take precedence over indicators
of inline structure. So, for example, the following is a list with of inline structure. So, for example, the following is a list with
two items, not a list with one item containing a code span: two items, not a list with one item containing a code span:
. .
- `one - `one
skipping to change at line 263 skipping to change at line 263
paragraphs, headers, and other block constructs can be parsed for inline paragraphs, headers, and other block constructs can be parsed for inline
structure. The second step requires information about link reference structure. The second step requires information about link reference
definitions that will be available only at the end of the first definitions that will be available only at the end of the first
step. Note that the first step requires processing lines in sequence, step. Note that the first step requires processing lines in sequence,
but the second can be parallelized, since the inline parsing of but the second can be parallelized, since the inline parsing of
one block element does not affect the inline parsing of any other. one block element does not affect the inline parsing of any other.
## Container blocks and leaf blocks ## Container blocks and leaf blocks
We can divide blocks into two types: We can divide blocks into two types:
[container blocks](#container-block), <a id="container-block"></a> [container blocks](@container-block),
which can contain other blocks, and [leaf blocks](#leaf-block), which can contain other blocks, and [leaf blocks](@leaf-block),
<a id="leaf-block"></a> which cannot. which cannot.
# Leaf blocks # Leaf blocks
This section describes the different kinds of leaf block that make up a This section describes the different kinds of leaf block that make up a
Markdown document. Markdown document.
## Horizontal rules ## Horizontal rules
A line consisting of 0-3 spaces of indentation, followed by a sequence A line consisting of 0-3 spaces of indentation, followed by a sequence
of three or more matching `-`, `_`, or `*` characters, each followed of three or more matching `-`, `_`, or `*` characters, each followed
optionally by any number of spaces, forms a [horizontal optionally by any number of spaces, forms a [horizontal
rule](#horizontal-rule). <a id="horizontal-rule"></a> rule](@horizontal-rule).
. .
*** ***
--- ---
___ ___
. .
<hr /> <hr />
<hr /> <hr />
<hr /> <hr />
. .
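The horizontal rule definition above can be approximated with a single regular expression. The following is a minimal sketch, not the reference implementation; `HRULE_RE` and `is_horizontal_rule` are hypothetical names, and the function only tests one line in isolation (it does not consider surrounding context such as list items or setext headers).

```python
import re

# 0-3 spaces of indentation, then one of - _ *, then at least two more
# occurrences of the same character; each may be followed by spaces.
HRULE_RE = re.compile(r"^ {0,3}([-_*]) *(?:\1 *){2,}$")


def is_horizontal_rule(line: str) -> bool:
    """Check a single line (without its line ending) against the rule."""
    return HRULE_RE.match(line) is not None
```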
skipping to change at line 477 skipping to change at line 477
- * * * - * * *
. .
<ul> <ul>
<li>Foo</li> <li>Foo</li>
<li><hr /></li> <li><hr /></li>
</ul> </ul>
. .
## ATX headers ## ATX headers
An [ATX header](#atx-header) <a id="atx-header"></a> An [ATX header](@atx-header)
consists of a string of characters, parsed as inline content, between an consists of a string of characters, parsed as inline content, between an
opening sequence of 1--6 unescaped `#` characters and an optional opening sequence of 1--6 unescaped `#` characters and an optional
closing sequence of any number of `#` characters. The opening sequence closing sequence of any number of `#` characters. The opening sequence
of `#` characters cannot be followed directly by a nonspace character. of `#` characters cannot be followed directly by a nonspace character.
The optional closing sequence of `#`s must be preceded by a space and may be The optional closing sequence of `#`s must be preceded by a space and may be
followed by spaces only. The opening `#` character may be indented 0-3 followed by spaces only. The opening `#` character may be indented 0-3
spaces. The raw contents of the header are stripped of leading and spaces. The raw contents of the header are stripped of leading and
trailing spaces before being parsed as inline content. The header level trailing spaces before being parsed as inline content. The header level
is equal to the number of `#` characters in the opening sequence. is equal to the number of `#` characters in the opening sequence.
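As a rough illustration of the ATX header rules just described, the sketch below recognizes the opening sequence, drops an optional closing sequence, and strips the raw contents. The names `ATX_OPEN_RE`, `ATX_CLOSE_RE`, and `parse_atx_header` are assumptions made for this example; inline parsing of the contents is left out.

```python
import re
from typing import Optional, Tuple

# 0-3 spaces, 1-6 '#' characters, then spaces or the end of the line.
ATX_OPEN_RE = re.compile(r"^ {0,3}(#{1,6})(?: +|$)")
# Closing run of '#' preceded by a space (or starting the remaining text),
# followed by spaces only.
ATX_CLOSE_RE = re.compile(r"(?:^| +)#+ *$")


def parse_atx_header(line: str) -> Optional[Tuple[int, str]]:
    """Return (level, raw contents) or None if the line is not an ATX header."""
    m = ATX_OPEN_RE.match(line)
    if m is None:
        return None
    level = len(m.group(1))
    rest = ATX_CLOSE_RE.sub("", line[m.end():])
    return level, rest.strip(" ")
```

A closing `#` that is escaped or not preceded by a space is left in the contents, since the closing pattern only matches a run of `#` that follows a space.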
skipping to change at line 675 skipping to change at line 675
# #
### ### ### ###
. .
<h2></h2> <h2></h2>
<h1></h1> <h1></h1>
<h3></h3> <h3></h3>
. .
## Setext headers ## Setext headers
A [setext header](#setext-header) <a id="setext-header"></a> A [setext header](@setext-header)
consists of a line of text, containing at least one nonspace character, consists of a line of text, containing at least one nonspace character,
with no more than 3 spaces indentation, followed by a [setext header with no more than 3 spaces indentation, followed by a [setext header
underline](#setext-header-underline). The line of text must be underline](#setext-header-underline). The line of text must be
one that, were it not followed by the setext header underline, one that, were it not followed by the setext header underline,
would be interpreted as part of a paragraph: it cannot be a code would be interpreted as part of a paragraph: it cannot be a code
block, header, blockquote, horizontal rule, or list. A [setext header block, header, blockquote, horizontal rule, or list. A [setext header
underline](#setext-header-underline) <a id="setext-header-underline"></a> underline](@setext-header-underline)
is a sequence of `=` characters or a sequence of `-` characters, with no is a sequence of `=` characters or a sequence of `-` characters, with no
more than 3 spaces indentation and any number of trailing more than 3 spaces indentation and any number of trailing
spaces. The header is a level 1 header if `=` characters are used, and spaces. The header is a level 1 header if `=` characters are used, and
a level 2 header if `-` characters are used. The contents of the header a level 2 header if `-` characters are used. The contents of the header
are the result of parsing the first line as Markdown inline content. are the result of parsing the first line as Markdown inline content.
In general, a setext header need not be preceded or followed by a In general, a setext header need not be preceded or followed by a
blank line. However, it cannot interrupt a paragraph, so when a blank line. However, it cannot interrupt a paragraph, so when a
setext header comes after a paragraph, a blank line is needed between setext header comes after a paragraph, a blank line is needed between
them. them.
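The setext header underline can be classified with a short check like the one below. This is a sketch under the stated rules only; `SETEXT_UNDERLINE_RE` and `setext_header_level` are hypothetical names, and the code does not verify that the preceding line would otherwise be part of a paragraph.

```python
import re
from typing import Optional

# 0-3 spaces, a run of '=' or a run of '-', then optional trailing spaces.
SETEXT_UNDERLINE_RE = re.compile(r"^ {0,3}(=+|-+) *$")


def setext_header_level(underline: str) -> Optional[int]:
    """Return 1 for an '=' underline, 2 for a '-' underline, else None."""
    m = SETEXT_UNDERLINE_RE.match(underline)
    if m is None:
        return None
    return 1 if m.group(1)[0] == "=" else 2
```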
skipping to change at line 946 skipping to change at line 946
. .
\> foo \> foo
------ ------
. .
<h2>&gt; foo</h2> <h2>&gt; foo</h2>
. .
## Indented code blocks ## Indented code blocks
An [indented code block](#indented-code-block) An [indented code block](@indented-code-block)
<a id="indented-code-block"></a> is composed of one or more is composed of one or more
[indented chunks](#indented-chunk) separated by blank lines. [indented chunks](#indented-chunk) separated by blank lines.
An [indented chunk](#indented-chunk) <a id="indented-chunk"></a> An [indented chunk](@indented-chunk)
is a sequence of non-blank lines, each indented four or more is a sequence of non-blank lines, each indented four or more
spaces. An indented code block cannot interrupt a paragraph, so spaces. An indented code block cannot interrupt a paragraph, so
if it occurs before or after a paragraph, there must be an if it occurs before or after a paragraph, there must be an
intervening blank line. The contents of the code block are intervening blank line. The contents of the code block are
the literal contents of the lines, including trailing newlines, the literal contents of the lines, including trailing newlines,
minus four spaces of indentation. An indented code block has no minus four spaces of indentation. An indented code block has no
attributes. attributes.
. .
a simple a simple
skipping to change at line 1092 skipping to change at line 1092
. .
foo foo
. .
<pre><code>foo <pre><code>foo
</code></pre> </code></pre>
. .
## Fenced code blocks ## Fenced code blocks
A [code fence](#code-fence) <a id="code-fence"></a> is a sequence A [code fence](@code-fence) is a sequence
of at least three consecutive backtick characters (`` ` ``) or of at least three consecutive backtick characters (`` ` ``) or
tildes (`~`). (Tildes and backticks cannot be mixed.) tildes (`~`). (Tildes and backticks cannot be mixed.)
A [fenced code block](#fenced-code-block) <a id="fenced-code-block"></a> A [fenced code block](@fenced-code-block)
begins with a code fence, indented no more than three spaces. begins with a code fence, indented no more than three spaces.
The line with the opening code fence may optionally contain some text The line with the opening code fence may optionally contain some text
following the code fence; this is trimmed of leading and trailing following the code fence; this is trimmed of leading and trailing
spaces and called the [info string](#info-string). spaces and called the [info string](@info-string).
<a id="info-string"></a> The info string may not contain any backtick The info string may not contain any backtick
characters. (The reason for this restriction is that otherwise characters. (The reason for this restriction is that otherwise
some inline code would be incorrectly interpreted as the some inline code would be incorrectly interpreted as the
beginning of a fenced code block.) beginning of a fenced code block.)
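The opening-fence rules above might be checked as in the following sketch. `OpeningFence`, `FENCE_RE`, and `parse_opening_fence` are names invented for this example, and the code covers only the opening line (indentation, fence length, and info string), not the search for a closing fence.

```python
import re
from typing import NamedTuple, Optional


class OpeningFence(NamedTuple):
    indent: int   # spaces before the fence (0-3)
    char: str     # "`" or "~"
    length: int   # number of fence characters
    info: str     # info string, trimmed of leading and trailing spaces


# 0-3 spaces, then at least three backticks or at least three tildes.
FENCE_RE = re.compile(r"^( {0,3})(`{3,}|~{3,})(.*)$")


def parse_opening_fence(line: str) -> Optional[OpeningFence]:
    """Return fence details, or None if the line cannot open a fenced block."""
    m = FENCE_RE.match(line)
    if m is None:
        return None
    indent, fence, rest = m.groups()
    info = rest.strip(" ")
    if "`" in info:
        # The info string may not contain backticks.
        return None
    return OpeningFence(len(indent), fence[0], len(fence), info)
```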
The content of the code block consists of all subsequent lines, until The content of the code block consists of all subsequent lines, until
a closing [code fence](#code-fence) of the same type as the code block a closing [code fence](#code-fence) of the same type as the code block
began with (backticks or tildes), and with at least as many backticks began with (backticks or tildes), and with at least as many backticks
or tildes as the opening code fence. If the leading code fence is or tildes as the opening code fence. If the leading code fence is
indented N spaces, then up to N spaces of indentation are removed from indented N spaces, then up to N spaces of indentation are removed from
each line of the content (if present). (If a content line is not each line of the content (if present). (If a content line is not
skipping to change at line 1451 skipping to change at line 1451
``` ```
``` aaa ``` aaa
``` ```
. .
<pre><code>``` aaa <pre><code>``` aaa
</code></pre> </code></pre>
. .
## HTML blocks ## HTML blocks
An [HTML block tag](#html-block-tag) <a id="html-block-tag"></a> is An [HTML block tag](@html-block-tag) is
an [open tag](#open-tag) or [closing tag](#closing-tag) whose tag an [open tag](#open-tag) or [closing tag](#closing-tag) whose tag
name is one of the following (case-insensitive): name is one of the following (case-insensitive):
`article`, `header`, `aside`, `hgroup`, `blockquote`, `hr`, `iframe`, `article`, `header`, `aside`, `hgroup`, `blockquote`, `hr`, `iframe`,
`body`, `li`, `map`, `button`, `object`, `canvas`, `ol`, `caption`, `body`, `li`, `map`, `button`, `object`, `canvas`, `ol`, `caption`,
`output`, `col`, `p`, `colgroup`, `pre`, `dd`, `progress`, `div`, `output`, `col`, `p`, `colgroup`, `pre`, `dd`, `progress`, `div`,
`section`, `dl`, `table`, `td`, `dt`, `tbody`, `embed`, `textarea`, `section`, `dl`, `table`, `td`, `dt`, `tbody`, `embed`, `textarea`,
`fieldset`, `tfoot`, `figcaption`, `th`, `figure`, `thead`, `footer`, `fieldset`, `tfoot`, `figcaption`, `th`, `figure`, `thead`, `footer`,
`tr`, `form`, `ul`, `h1`, `h2`, `h3`, `h4`, `h5`, `h6`, `video`, `tr`, `form`, `ul`, `h1`, `h2`, `h3`, `h4`, `h5`, `h6`, `video`,
`script`, `style`. `script`, `style`.
An [HTML block](#html-block) <a id="html-block"></a> begins with an An [HTML block](@html-block) begins with an
[HTML block tag](#html-block-tag), [HTML comment](#html-comment), [HTML block tag](#html-block-tag), [HTML comment](#html-comment),
[processing instruction](#processing-instruction), [processing instruction](#processing-instruction),
[declaration](#declaration), or [CDATA section](#cdata-section). [declaration](#declaration), or [CDATA section](#cdata-section).
It ends when a [blank line](#blank-line) or the end of the It ends when a [blank line](#blank-line) or the end of the
input is encountered. The initial line may be indented up to three input is encountered. The initial line may be indented up to three
spaces, and subsequent lines may have any indentation. The contents spaces, and subsequent lines may have any indentation. The contents
of the HTML block are interpreted as raw HTML, and will not be escaped of the HTML block are interpreted as raw HTML, and will not be escaped
in HTML output. in HTML output.
Some simple examples: Some simple examples:
skipping to change at line 1736 skipping to change at line 1736
. .
Moreover, blank lines are usually not necessary and can be Moreover, blank lines are usually not necessary and can be
deleted. The exception is inside `<pre>` tags; here, one can deleted. The exception is inside `<pre>` tags; here, one can
replace the blank lines with `&#10;` entities. replace the blank lines with `&#10;` entities.
So there is no important loss of expressive power with the new rule. So there is no important loss of expressive power with the new rule.
## Link reference definitions ## Link reference definitions
A [link reference definition](#link-reference-definition) A [link reference definition](@link-reference-definition)
<a id="link-reference-definition"></a> consists of a [link consists of a [link
label](#link-label), indented up to three spaces, followed label](#link-label), indented up to three spaces, followed
by a colon (`:`), optional blank space (including up to one by a colon (`:`), optional blank space (including up to one
newline), a [link destination](#link-destination), optional newline), a [link destination](#link-destination), optional
blank space (including up to one newline), and an optional [link blank space (including up to one newline), and an optional [link
title](#link-title), which if it is present must be separated title](#link-title), which if it is present must be separated
from the [link destination](#link-destination) by whitespace. from the [link destination](#link-destination) by whitespace.
No further non-space characters may occur on the line. No further non-space characters may occur on the line.
A [link reference definition](#link-reference-definition) A [link reference definition](#link-reference-definition)
does not correspond to a structural element of a document. Instead, it does not correspond to a structural element of a document. Instead, it
skipping to change at line 1961 skipping to change at line 1961
> [foo]: /url > [foo]: /url
. .
<p><a href="/url">foo</a></p> <p><a href="/url">foo</a></p>
<blockquote> <blockquote>
</blockquote> </blockquote>
. .
## Paragraphs ## Paragraphs
A sequence of non-blank lines that cannot be interpreted as other A sequence of non-blank lines that cannot be interpreted as other
kinds of blocks forms a [paragraph](#paragraph).<a id="paragraph"></a> kinds of blocks forms a [paragraph](@paragraph).
The contents of the paragraph are the result of parsing the The contents of the paragraph are the result of parsing the
paragraph's raw content as inlines. The paragraph's raw content paragraph's raw content as inlines. The paragraph's raw content
is formed by concatenating the lines and removing initial and final is formed by concatenating the lines and removing initial and final
spaces. spaces.
A simple example with two paragraphs: A simple example with two paragraphs:
. .
aaa aaa
skipping to change at line 2100 skipping to change at line 2100
> with these blocks as its content. > with these blocks as its content.
So, we explain what counts as a block quote or list item by explaining So, we explain what counts as a block quote or list item by explaining
how these can be *generated* from their contents. This should suffice how these can be *generated* from their contents. This should suffice
to define the syntax, although it does not give a recipe for *parsing* to define the syntax, although it does not give a recipe for *parsing*
these constructions. (A recipe is provided below in the section entitled these constructions. (A recipe is provided below in the section entitled
[A parsing strategy](#appendix-a-a-parsing-strategy).) [A parsing strategy](#appendix-a-a-parsing-strategy).)
## Block quotes ## Block quotes
A [block quote marker](#block-quote-marker) <a id="block-quote-marker"></a> A [block quote marker](@block-quote-marker)
consists of 0-3 spaces of initial indent, plus (a) the character `>` together consists of 0-3 spaces of initial indent, plus (a) the character `>` together
with a following space, or (b) a single character `>` not followed by a space. with a following space, or (b) a single character `>` not followed by a space.
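A block quote marker as defined above can be recognized and removed with a small helper. The sketch below is illustrative; `BLOCK_QUOTE_MARKER_RE` and `strip_block_quote_marker` are assumed names, and laziness and nesting are not handled here.

```python
import re
from typing import Optional

# 0-3 spaces of indent, the character '>', and one optional following space.
BLOCK_QUOTE_MARKER_RE = re.compile(r"^ {0,3}> ?")


def strip_block_quote_marker(line: str) -> Optional[str]:
    """Return the line with its marker removed, or None if there is no marker."""
    m = BLOCK_QUOTE_MARKER_RE.match(line)
    if m is None:
        return None
    return line[m.end():]
```

Only one space after `>` belongs to the marker, so any further spaces remain part of the content.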
The following rules define [block quotes](#block-quote): The following rules define [block quotes](@block-quote):
<a id="block-quote"></a>
1. **Basic case.** If a string of lines *Ls* constitute a sequence 1. **Basic case.** If a string of lines *Ls* constitute a sequence
of blocks *Bs*, then the result of prepending a [block quote of blocks *Bs*, then the result of prepending a [block quote
marker](#block-quote-marker) to the beginning of each line in *Ls* marker](#block-quote-marker) to the beginning of each line in *Ls*
is a [block quote](#block-quote) containing *Bs*. is a [block quote](#block-quote) containing *Bs*.
2. **Laziness.** If a string of lines *Ls* constitute a [block 2. **Laziness.** If a string of lines *Ls* constitute a [block
quote](#block-quote) with contents *Bs*, then the result of deleting quote](#block-quote) with contents *Bs*, then the result of deleting
the initial [block quote marker](#block-quote-marker) from one or the initial [block quote marker](#block-quote-marker) from one or
more lines in which the next non-space character after the [block more lines in which the next non-space character after the [block
quote marker](#block-quote-marker) is [paragraph continuation quote marker](#block-quote-marker) is [paragraph continuation
text](#paragraph-continuation-text) is a block quote with *Bs* as text](#paragraph-continuation-text) is a block quote with *Bs* as
its content. <a id="paragraph-continuation-text"></a> its content.
[Paragraph continuation text](#paragraph-continuation-text) is text [Paragraph continuation text](@paragraph-continuation-text) is text
that will be parsed as part of the content of a paragraph, but does that will be parsed as part of the content of a paragraph, but does
not occur at the beginning of the paragraph. not occur at the beginning of the paragraph.
3. **Consecutiveness.** A document cannot contain two [block 3. **Consecutiveness.** A document cannot contain two [block
quotes](#block-quote) in a row unless there is a [blank quotes](#block-quote) in a row unless there is a [blank
line](#blank-line) between them. line](#blank-line) between them.
Nothing else counts as a [block quote](#block-quote). Nothing else counts as a [block quote](#block-quote).
Here is a simple example: Here is a simple example:
skipping to change at line 2461 skipping to change at line 2460
<pre><code>code <pre><code>code
</code></pre> </code></pre>
</blockquote> </blockquote>
<blockquote> <blockquote>
<p>not code</p> <p>not code</p>
</blockquote> </blockquote>
. .
## List items ## List items
A [list marker](#list-marker) <a id="list-marker"></a> is a A [list marker](@list-marker) is a
[bullet list marker](#bullet-list-marker) or an [ordered list [bullet list marker](#bullet-list-marker) or an [ordered list
marker](#ordered-list-marker). marker](#ordered-list-marker).
A [bullet list marker](#bullet-list-marker) <a id="bullet-list-marker"></a> A [bullet list marker](@bullet-list-marker)
is a `-`, `+`, or `*` character. is a `-`, `+`, or `*` character.
An [ordered list marker](#ordered-list-marker) <a id="ordered-list-marker"></a> An [ordered list marker](@ordered-list-marker)
is a sequence of one or more digits (`0-9`), followed by either a is a sequence of one or more digits (`0-9`), followed by either a
`.` character or a `)` character. `.` character or a `)` character.
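The two kinds of list marker can be told apart with two patterns, as in this sketch. The names `BULLET_MARKER_RE`, `ORDERED_MARKER_RE`, and `classify_list_marker` are assumptions for illustration; the function expects the marker alone, without indentation or following spaces.

```python
import re
from typing import Optional

BULLET_MARKER_RE = re.compile(r"[-+*]")
# One or more ASCII digits followed by '.' or ')'.
ORDERED_MARKER_RE = re.compile(r"[0-9]+[.)]")


def classify_list_marker(marker: str) -> Optional[str]:
    """Return "bullet", "ordered", or None if the string is not a list marker."""
    if BULLET_MARKER_RE.fullmatch(marker):
        return "bullet"
    if ORDERED_MARKER_RE.fullmatch(marker):
        return "ordered"
    return None
```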
The following rules define [list items](#list-item):<a The following rules define [list items](@list-item):
id="list-item"></a>
1. **Basic case.** If a sequence of lines *Ls* constitute a sequence of 1. **Basic case.** If a sequence of lines *Ls* constitute a sequence of
blocks *Bs* starting with a non-space character and not separated blocks *Bs* starting with a non-space character and not separated
from each other by more than one blank line, and *M* is a list from each other by more than one blank line, and *M* is a list
marker of width *W* followed by 0 < *N* < 5 spaces, then the result marker of width *W* followed by 0 < *N* < 5 spaces, then the result
of prepending *M* and the following spaces to the first line of of prepending *M* and the following spaces to the first line of
*Ls*, and indenting subsequent lines of *Ls* by *W + N* spaces, is a *Ls*, and indenting subsequent lines of *Ls* by *W + N* spaces, is a
list item with *Bs* as its contents. The type of the list item list item with *Bs* as its contents. The type of the list item
(bullet or ordered) is determined by the type of its list marker. (bullet or ordered) is determined by the type of its list marker.
If the list item is ordered, then it is also assigned a start If the list item is ordered, then it is also assigned a start
skipping to change at line 2919 skipping to change at line 2917
&gt; A block quote. &gt; A block quote.
</code></pre> </code></pre>
. .
4. **Laziness.** If a string of lines *Ls* constitute a [list 4. **Laziness.** If a string of lines *Ls* constitute a [list
item](#list-item) with contents *Bs*, then the result of deleting item](#list-item) with contents *Bs*, then the result of deleting
some or all of the indentation from one or more lines in which the some or all of the indentation from one or more lines in which the
next non-space character after the indentation is next non-space character after the indentation is
[paragraph continuation text](#paragraph-continuation-text) is a [paragraph continuation text](#paragraph-continuation-text) is a
list item with the same contents and attributes.<a list item with the same contents and attributes. The unindented
id="lazy-continuation-line"></a> lines are called
[lazy continuation lines](@lazy-continuation-line).
Here is an example with [lazy continuation Here is an example with [lazy continuation
lines](#lazy-continuation-line): lines](#lazy-continuation-line):
. .
1. A paragraph 1. A paragraph
with two lines. with two lines.
indented code indented code
skipping to change at line 3296 skipping to change at line 3295
The one case that needs special treatment is a list item that *starts* The one case that needs special treatment is a list item that *starts*
with indented code. How much indentation is required in that case, since with indented code. How much indentation is required in that case, since
we don't have a "first paragraph" to measure from? Rule #2 simply stipulates we don't have a "first paragraph" to measure from? Rule #2 simply stipulates
that in such cases, we require one space indentation from the list marker that in such cases, we require one space indentation from the list marker
(and then the normal four spaces for the indented code). This will match the (and then the normal four spaces for the indented code). This will match the
four-space rule in cases where the list marker plus its initial indentation four-space rule in cases where the list marker plus its initial indentation
takes four spaces (a common case), but diverge in other cases. takes four spaces (a common case), but diverge in other cases.
## Lists ## Lists
A [list](#list) <a id="list"></a> is a sequence of one or more A [list](@list) is a sequence of one or more
list items [of the same type](#of-the-same-type). The list items list items [of the same type](#of-the-same-type). The list items
may be separated by single [blank lines](#blank-line), but two may be separated by single [blank lines](#blank-line), but two
blank lines end all containing lists. blank lines end all containing lists.
Two list items are [of the same type](#of-the-same-type) Two list items are [of the same type](@of-the-same-type)
<a id="of-the-same-type"></a> if they begin with a [list if they begin with a [list
marker](#list-marker) of the same type. Two list markers are of the marker](#list-marker) of the same type. Two list markers are of the
same type if (a) they are bullet list markers using the same character same type if (a) they are bullet list markers using the same character
(`-`, `+`, or `*`) or (b) they are ordered list numbers with the same (`-`, `+`, or `*`) or (b) they are ordered list numbers with the same
delimiter (either `.` or `)`). delimiter (either `.` or `)`).
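The "of the same type" relation can be expressed by reducing each marker to a comparable key. This is a minimal sketch; `list_marker_type` and `of_the_same_type` are hypothetical helpers that operate on bare marker strings such as `-` or `12.`.

```python
import re
from typing import Optional, Tuple


def list_marker_type(marker: str) -> Optional[Tuple[str, str]]:
    """Map a marker to ("bullet", character) or ("ordered", delimiter)."""
    if re.fullmatch(r"[-+*]", marker):
        return ("bullet", marker)
    m = re.fullmatch(r"[0-9]+([.)])", marker)
    if m:
        # The number itself is ignored; only the delimiter matters.
        return ("ordered", m.group(1))
    return None


def of_the_same_type(a: str, b: str) -> bool:
    """True if both strings are list markers of the same type."""
    type_a = list_marker_type(a)
    return type_a is not None and type_a == list_marker_type(b)
```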
A list is an [ordered list](#ordered-list) <a id="ordered-list"></a> A list is an [ordered list](@ordered-list)
if its constituent list items begin with if its constituent list items begin with
[ordered list markers](#ordered-list-marker), and a [bullet [ordered list markers](#ordered-list-marker), and a [bullet
list](#bullet-list) <a id="bullet-list"></a> if its constituent list list](@bullet-list) if its constituent list
items begin with [bullet list markers](#bullet-list-marker). items begin with [bullet list markers](#bullet-list-marker).
The [start number](#start-number) <a id="start-number"></a> The [start number](@start-number)
of an [ordered list](#ordered-list) is determined by the list number of of an [ordered list](#ordered-list) is determined by the list number of
its initial list item. The numbers of subsequent list items are its initial list item. The numbers of subsequent list items are
disregarded. disregarded.
A list is [loose](#loose)<a id="loose"></a> if any of its constituent A list is [loose](@loose) if any of its constituent
list items are separated by blank lines, or if any of its constituent list items are separated by blank lines, or if any of its constituent
list items directly contain two block-level elements with a blank line list items directly contain two block-level elements with a blank line
between them. Otherwise a list is [tight](#tight).<a id="tight"></a> between them. Otherwise a list is [tight](@tight).
(The difference in HTML output is that paragraphs in a loose list are (The difference in HTML output is that paragraphs in a loose list are
wrapped in `<p>` tags, while paragraphs in a tight list are not.) wrapped in `<p>` tags, while paragraphs in a tight list are not.)
Changing the bullet or ordered list delimiter starts a new list: Changing the bullet or ordered list delimiter starts a new list:
. .
- foo - foo
- bar - bar
+ baz + baz
. .
skipping to change at line 3400 skipping to change at line 3399
First, it is natural and not uncommon for people to start lists without First, it is natural and not uncommon for people to start lists without
blank lines: blank lines:
I need to buy I need to buy
- new shoes - new shoes
- a coat - a coat
- a plane ticket - a plane ticket
Second, we are attracted to a Second, we are attracted to a
> [principle of uniformity](#principle-of-uniformity):<a > [principle of uniformity](@principle-of-uniformity):
> id="principle-of-uniformity"></a> if a span of text has a certain > if a span of text has a certain
> meaning, it will continue to have the same meaning when put into a list > meaning, it will continue to have the same meaning when put into a list
> item. > item.
(Indeed, the spec for [list items](#list-item) presupposes this.) (Indeed, the spec for [list items](#list-item) presupposes this.)
This principle implies that if This principle implies that if
* I need to buy * I need to buy
- new shoes - new shoes
- a coat - a coat
- a plane ticket - a plane ticket
skipping to change at line 3919 skipping to change at line 3918
With the goal of making this standard as HTML-agnostic as possible, all With the goal of making this standard as HTML-agnostic as possible, all
valid HTML entities in any context are recognized as such and valid HTML entities in any context are recognized as such and
converted into unicode characters before they are stored in the AST. converted into unicode characters before they are stored in the AST.
This allows implementations that target HTML output to trivially escape This allows implementations that target HTML output to trivially escape
the entities when generating HTML, and simplifies the job of the entities when generating HTML, and simplifies the job of
implementations targeting other languages, as these will only need to implementations targeting other languages, as these will only need to
handle the unicode chars and need not be HTML-entity aware. handle the unicode chars and need not be HTML-entity aware.
[Named entities](#name-entities) <a id="named-entities"></a> consist of `&` [Named entities](@name-entities) consist of `&`
+ any of the valid HTML5 entity names + `;`. The + any of the valid HTML5 entity names + `;`. The
[following document](http://www.whatwg.org/specs/web-apps/current-work/multipage/entities.json) [following document](http://www.whatwg.org/specs/web-apps/current-work/multipage/entities.json)
is used as an authoritative source of the valid entity names and their is used as an authoritative source of the valid entity names and their
corresponding codepoints. corresponding codepoints.
Conforming implementations that target HTML don't need to generate Conforming implementations that target HTML don't need to generate
entities for all the valid named entities that exist, with the exception entities for all the valid named entities that exist, with the exception
of `"` (`&quot;`), `&` (`&amp;`), `<` (`&lt;`) and `>` (`&gt;`), which of `"` (`&quot;`), `&` (`&amp;`), `<` (`&lt;`) and `>` (`&gt;`), which
always need to be written as entities for security reasons. always need to be written as entities for security reasons.
. .
&nbsp; &amp; &copy; &AElig; &Dcaron; &frac34; &HilbertSpace; &DifferentialD; &ClockwiseContourIntegral; &nbsp; &amp; &copy; &AElig; &Dcaron; &frac34; &HilbertSpace; &DifferentialD; &ClockwiseContourIntegral;
. .
<p>  &amp; © Æ Ď ¾ ℋ ⅆ ∲</p> <p>  &amp; © Æ Ď ¾ ℋ ⅆ ∲</p>
. .
[Decimal entities](#decimal-entities) <a id="decimal-entities"></a> [Decimal entities](@decimal-entities)
consist of `&#` + a string of 1--8 arabic digits + `;`. Again, these consist of `&#` + a string of 1--8 arabic digits + `;`. Again, these
entities need to be recognised and transformed into their corresponding entities need to be recognised and transformed into their corresponding
UTF8 codepoints. Invalid Unicode codepoints will be written as the UTF8 codepoints. Invalid Unicode codepoints will be written as the
"unknown codepoint" character (`0xFFFD`). "unknown codepoint" character (`0xFFFD`).
. .
&#35; &#1234; &#992; &#98765432; &#35; &#1234; &#992; &#98765432;
. .
<p># Ӓ Ϡ �</p> <p># Ӓ Ϡ �</p>
. .
[Hexadecimal entities](#hexadecimal-entities) <a id="hexadecimal-entities"></a> [Hexadecimal entities](@hexadecimal-entities)
consist of `&#` + either `X` or `x` + a string of 1-8 hexadecimal digits consist of `&#` + either `X` or `x` + a string of 1-8 hexadecimal digits
+ `;`. They will also be parsed and turned into their corresponding UTF8 values in the AST. + `;`. They will also be parsed and turned into their corresponding UTF8 values in the AST.
. .
&#X22; &#XD06; &#xcab; &#X22; &#XD06; &#xcab;
. .
<p>&quot; ആ ಫ</p> <p>&quot; ആ ಫ</p>
. .
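As a rough illustration of the decimal and hexadecimal rules above, the following Python sketch decodes numeric entities into characters. The names `NUMERIC_ENTITY` and `decode_numeric_entities` are hypothetical, and treating zero, surrogates, and values above U+10FFFF as invalid is an assumption; the text above only says that invalid codepoints become `0xFFFD`.

```python
import re

# "&#" + 1-8 decimal digits + ";", or "&#" + x/X + 1-8 hex digits + ";".
NUMERIC_ENTITY = re.compile(r"&#(?:([0-9]{1,8})|[xX]([0-9a-fA-F]{1,8}));")

REPLACEMENT = "\ufffd"  # the "unknown codepoint" character


def _decode(match):
    dec, hexa = match.group(1), match.group(2)
    codepoint = int(dec, 10) if dec is not None else int(hexa, 16)
    # Assumption: these ranges are what "invalid" means here.
    if codepoint == 0 or codepoint > 0x10FFFF or 0xD800 <= codepoint <= 0xDFFF:
        return REPLACEMENT
    return chr(codepoint)


def decode_numeric_entities(text):
    """Replace every numeric entity in text with its character."""
    return NUMERIC_ENTITY.sub(_decode, text)
```

Strings with more than eight digits do not match the pattern at all, so they are left untouched rather than replaced.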
Here are some nonentities: Here are some nonentities:
skipping to change at line 4035 skipping to change at line 4034
. .
f&ouml;f&ouml; f&ouml;f&ouml;
. .
<pre><code>f&amp;ouml;f&amp;ouml; <pre><code>f&amp;ouml;f&amp;ouml;
</code></pre> </code></pre>
. .
## Code span ## Code span
A [backtick string](#backtick-string) <a id="backtick-string"></a> A [backtick string](@backtick-string)
is a string of one or more backtick characters (`` ` ``) that is neither is a string of one or more backtick characters (`` ` ``) that is neither
preceded nor followed by a backtick. preceded nor followed by a backtick.
A code span begins with a backtick string and ends with a backtick A [code span](@code-span) begins with a backtick string and ends with a backtick
string of equal length. The contents of the code span are the string of equal length. The contents of the code span are the
characters between the two backtick strings, with leading and trailing characters between the two backtick strings, with leading and trailing
spaces and newlines removed, and consecutive spaces and newlines spaces and newlines removed, and consecutive spaces and newlines
collapsed to single spaces. collapsed to single spaces.
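The content rule above can be sketched in a few lines of Python. This is only an illustration: `normalize_code_span` is a hypothetical name, and the sketch assumes that exactly the space and newline characters are affected, as the wording above suggests.

```python
import re


def normalize_code_span(raw):
    """Normalize the characters found between the two backtick strings."""
    # Remove leading and trailing spaces and newlines.
    stripped = raw.strip(" \n")
    # Collapse each run of spaces and newlines to a single space.
    return re.sub(r"[ \n]+", " ", stripped)
```

For instance, the raw content ` foo` followed by a newline and `  bar ` would come out as `foo bar`.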
This is a simple code span: This is a simple code span:
. .
`foo` `foo`
. .
skipping to change at line 4219 skipping to change at line 4218
spans, but users often do not.) spans, but users often do not.)
``` markdown ``` markdown
internal emphasis: foo*bar*baz internal emphasis: foo*bar*baz
no emphasis: foo_bar_baz no emphasis: foo_bar_baz
``` ```
The following rules capture all of these patterns, while allowing The following rules capture all of these patterns, while allowing
for efficient parsing strategies that do not backtrack: for efficient parsing strategies that do not backtrack:
1. A single `*` character [can open emphasis](#can-open-emphasis) 1. A single `*` character [can open emphasis](@can-open-emphasis)
<a id="can-open-emphasis"></a> iff it is not followed by iff it is not followed by
whitespace. whitespace.
2. A single `_` character [can open emphasis](#can-open-emphasis) iff 2. A single `_` character [can open emphasis](#can-open-emphasis) iff
it is not followed by whitespace and it is not preceded by an it is not followed by whitespace and it is not preceded by an
ASCII alphanumeric character. ASCII alphanumeric character.
3. A single `*` character [can close emphasis](#can-close-emphasis) 3. A single `*` character [can close emphasis](@can-close-emphasis)
<a id="can-close-emphasis"></a> iff it is not preceded by whitespace. iff it is not preceded by whitespace.
4. A single `_` character [can close emphasis](#can-close-emphasis) iff 4. A single `_` character [can close emphasis](#can-close-emphasis) iff
it is not preceded by whitespace and it is not followed by an it is not preceded by whitespace and it is not followed by an
ASCII alphanumeric character. ASCII alphanumeric character.
5. A double `**` [can open strong emphasis](#can-open-strong-emphasis) 5. A double `**` [can open strong emphasis](@can-open-strong-emphasis)
<a id="can-open-strong-emphasis" ></a> iff it is not followed by iff it is not followed by
whitespace. whitespace.
6. A double `__` [can open strong emphasis](#can-open-strong-emphasis) 6. A double `__` [can open strong emphasis](#can-open-strong-emphasis)
iff it is not followed by whitespace and it is not preceded by an iff it is not followed by whitespace and it is not preceded by an
ASCII alphanumeric character. ASCII alphanumeric character.
7. A double `**` [can close strong emphasis](#can-close-strong-emphasis) 7. A double `**` [can close strong emphasis](@can-close-strong-emphasis)
<a id="can-close-strong-emphasis" ></a> iff it is not preceded by iff it is not preceded by
whitespace. whitespace.
8. A double `__` [can close strong emphasis](#can-close-strong-emphasis) 8. A double `__` [can close strong emphasis](#can-close-strong-emphasis)
iff it is not preceded by whitespace and it is not followed by an iff it is not preceded by whitespace and it is not followed by an
ASCII alphanumeric character. ASCII alphanumeric character.
9. Emphasis begins with a delimiter that [can open 9. Emphasis begins with a delimiter that [can open
emphasis](#can-open-emphasis) and ends with a delimiter that [can close emphasis](#can-open-emphasis) and ends with a delimiter that [can close
emphasis](#can-close-emphasis), and that uses the same emphasis](#can-close-emphasis), and that uses the same
character (`_` or `*`) as the opening delimiter. There must character (`_` or `*`) as the opening delimiter. There must
skipping to change at line 5075 skipping to change at line 5074
. .
. .
__a<http://foo.bar?q=__> __a<http://foo.bar?q=__>
. .
<p>__a<a href="http://foo.bar?q=__">http://foo.bar?q=__</a></p> <p>__a<a href="http://foo.bar?q=__">http://foo.bar?q=__</a></p>
. .
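Rules 1 to 8 earlier in this section depend only on the characters immediately before and after a delimiter, so they can be sketched as two small predicates. The following Python is an illustration under stated assumptions, not the reference algorithm: the names `can_open` and `can_close` are hypothetical, and the start and end of the text are treated as whitespace, which the rules above do not state explicitly.

```python
import string

ASCII_ALNUM = set(string.ascii_letters + string.digits)


def _neighbors(text, start, length):
    # Assumption: the text boundaries behave like whitespace.
    before = text[start - 1] if start > 0 else " "
    end = start + length
    after = text[end] if end < len(text) else " "
    return before, after


def can_open(text, start, length=1):
    """Rules 1, 2, 5, 6: may the delimiter at text[start] open (strong) emphasis?"""
    before, after = _neighbors(text, start, length)
    if after.isspace():
        return False
    # `_` additionally must not be preceded by an ASCII alphanumeric.
    return not (text[start] == "_" and before in ASCII_ALNUM)


def can_close(text, start, length=1):
    """Rules 3, 4, 7, 8: may the delimiter at text[start] close (strong) emphasis?"""
    before, after = _neighbors(text, start, length)
    if before.isspace():
        return False
    # `_` additionally must not be followed by an ASCII alphanumeric.
    return not (text[start] == "_" and after in ASCII_ALNUM)
```

Pass `length=2` for `**` and `__`. With these predicates the `*` in `foo*bar*baz` can both open and close emphasis, while the `_` in `foo_bar_baz` can do neither, which matches the internal-emphasis examples earlier in this section.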
## Links ## Links
A link contains a [link label](#link-label) (the visible text), A link contains [link text](#link-label) (the visible text),
a [destination](#destination) (the URI that is the link destination), a [destination](#destination) (the URI that is the link destination),
and optionally a [link title](#link-title). There are two basic kinds and optionally a [link title](#link-title). There are two basic kinds
of links in Markdown. In [inline links](#inline-links) the destination of links in Markdown. In [inline links](#inline-links) the destination
and title are given immediately after the label. In [reference and title are given immediately after the link text. In [reference
links](#reference-links) the destination and title are defined elsewhere links](#reference-links) the destination and title are defined elsewhere
in the document. in the document.
A [link label](#link-label) <a id="link-label"></a> consists of A [link text](@link-text) consists of a sequence of zero or more
inline elements enclosed by square brackets (`[` and `]`). The
following rules apply:
- an opening `[`, followed by - Links may not contain other links, at any level of nesting.
- zero or more backtick code spans, autolinks, HTML tags, link labels,
backslash-escaped ASCII punctuation characters, or non-`]` characters,
followed by
- a closing `]`.
<span class="insert">Links may not contain other links, at any level of nesting - Brackets are allowed in the link text only if (a) they are
.</span> backslash-escaped or (b) they appear as a matched pair of brackets,
These rules are motivated by the following intuitive ideas: with an open bracket `[`, a sequence of zero or more inlines, and
a close bracket `]`.
- A link label is a container for inline elements. - Backtick [code spans](#code-span), [autolinks](#autolink), and
- The square brackets bind more tightly than emphasis markers, raw [HTML tags](#html-tag) bind more tightly
but less tightly than `<>` or `` ` ``. than the brackets in link text. Thus, for example,
- Link labels may contain material in matching square brackets. `` [foo`]` `` could not be a link text, since the second `]`
is part of a code span.
A [link destination](#link-destination) <a id="link-destination"></a> - The brackets in link text bind more tightly than markers for
consists of either [emphasis and strong emphasis](#emphasis-and-strong-emphasis).
Thus, for example, `*[foo*](url)` is a link.
A [link destination](@link-destination) consists of either
- a sequence of zero or more characters between an opening `<` and a - a sequence of zero or more characters between an opening `<` and a
closing `>` that contains no line breaks or unescaped `<` or `>` closing `>` that contains no line breaks or unescaped `<` or `>`
characters, or characters, or
- a nonempty sequence of characters that does not include - a nonempty sequence of characters that does not include
ASCII space or control characters, and includes parentheses ASCII space or control characters, and includes parentheses
only if (a) they are backslash-escaped or (b) they are part of only if (a) they are backslash-escaped or (b) they are part of
a balanced pair of unescaped parentheses that is not itself a balanced pair of unescaped parentheses that is not itself
inside a balanced pair of unescaped parentheses. inside a balanced pair of unescaped parentheses.
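The second form of link destination can be checked with a single pass and a depth counter. The Python below is a sketch only: `is_bare_link_destination` is a hypothetical name, "control characters" is assumed to mean code points below U+0020 plus U+007F, and a backslash is assumed to escape any following ASCII punctuation character.

```python
import string


def is_bare_link_destination(dest):
    """Check a destination that is not wrapped in `<...>`."""
    if not dest:
        return False  # must be nonempty
    depth = 0
    i = 0
    while i < len(dest):
        ch = dest[i]
        if ch == " " or ord(ch) < 0x20 or ord(ch) == 0x7F:
            return False  # no ASCII space or control characters
        if ch == "\\" and i + 1 < len(dest) and dest[i + 1] in string.punctuation:
            i += 2  # skip the escaped character, e.g. `\(` or `\)`
            continue
        if ch == "(":
            depth += 1
            if depth > 1:
                return False  # a pair inside another balanced pair
        elif ch == ")":
            depth -= 1
            if depth < 0:
                return False  # closing parenthesis without an opener
        i += 1
    return depth == 0
```

Under these assumptions `foo(bar)` is accepted, while `foo((bar))` and `foo(bar` are rejected, and `foo\(bar` is accepted because the parenthesis is escaped.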
A [link title](#link-title) <a id="link-title"></a> consists of either A [link title](@link-title) consists of either
- a sequence of zero or more characters between straight double-quote - a sequence of zero or more characters between straight double-quote
characters (`"`), including a `"` character only if it is characters (`"`), including a `"` character only if it is
backslash-escaped, or backslash-escaped, or
- a sequence of zero or more characters between straight single-quote - a sequence of zero or more characters between straight single-quote
characters (`'`), including a `'` character only if it is characters (`'`), including a `'` character only if it is
backslash-escaped, or backslash-escaped, or
- a sequence of zero or more characters between matching parentheses - a sequence of zero or more characters between matching parentheses
(`(...)`), including a `)` character only if it is backslash-escaped. (`(...)`), including a `)` character only if it is backslash-escaped.
An [inline link](#inline-link) <a id="inline-link"></a> An [inline link](@inline-link)
consists of a [link label](#link-label) followed immediately consists of a [link text](#link-text) followed immediately
by a left parenthesis `(`, optional whitespace, by a left parenthesis `(`, optional whitespace,
an optional [link destination](#link-destination), an optional [link destination](#link-destination),
an optional [link title](#link-title) separated from the link an optional [link title](#link-title) separated from the link
destination by whitespace, optional whitespace, and a right destination by whitespace, optional whitespace, and a right
parenthesis `)`. The link's text consists of the label (excluding parenthesis `)`. The link's text consists of the inlines contained
the enclosing square brackets) parsed as inlines. The link's in the [link text](#link-text) (excluding the enclosing square brackets).
URI consists of the link destination, excluding enclosing `<...>` if The link's URI consists of the link destination, excluding enclosing
present, with backslash-escapes in effect as described above. The `<...>` if present, with backslash-escapes in effect as described
link's title consists of the link title, excluding its enclosing above. The link's title consists of the link title, excluding its
delimiters, with backslash-escapes in effect as described above. enclosing delimiters, with backslash-escapes in effect as described
above.
Here is a simple inline link: Here is a simple inline link:
. .
[link](/uri "title") [link](/uri "title")
. .
<p><a href="/uri" title="title">link</a></p> <p><a href="/uri" title="title">link</a></p>
. .
The title may be omitted: The title may be omitted:
skipping to change at line 5310 skipping to change at line 5315
Whitespace is allowed around the destination and title: Whitespace is allowed around the destination and title:
. .
[link]( /uri [link]( /uri
"title" ) "title" )
. .
<p><a href="/uri" title="title">link</a></p> <p><a href="/uri" title="title">link</a></p>
. .
But it is not allowed between the link label and the But it is not allowed between the link text and the
following parenthesis: following parenthesis:
. .
[link] (/uri) [link] (/uri)
. .
<p>[link] (/uri)</p> <p>[link] (/uri)</p>
. .
Note that this is not a link, because the closing `]` occurs in The link text may contain balanced brackets, but not unbalanced ones,
an HTML tag: unless they are escaped:
.
[link [foo [bar]]](/uri)
.
<p><a href="/uri">link [foo [bar]]</a></p>
.
.
[link] bar](/uri)
.
<p>[link] bar](/uri)</p>
.
.
[link [bar](/uri)
.
<p>[link <a href="/uri">bar</a></p>
.
.
[link \[bar](/uri)
.
<p><a href="/uri">link [bar</a></p>
.
The link text may contain inline content:
.
[link *foo **bar** `#`*](/uri)
.
<p><a href="/uri">link <em>foo <strong>bar</strong> <code>#</code></em></a></p>
.
.
[![moon](moon.jpg)](/uri)
.
<p><a href="/uri"><img src="moon.jpg" alt="moon" /></a></p>
.
However, links may not contain other links, at any level of nesting.
.
[foo [bar](/uri)](/uri)
.
<p>[foo <a href="/uri">bar</a>](/uri)</p>
.
.
[foo *[bar [baz](/uri)](/uri)*](/uri)
.
<p>[foo <em>[bar <a href="/uri">baz</a>](/uri)</em>](/uri)</p>
.
These cases illustrate the precedence of link text grouping over
emphasis grouping:
.
*[foo*](/uri)
.
<p>*<a href="/uri">foo*</a></p>
.
.
[foo *bar](baz*)
.
<p><a href="baz*">foo *bar</a></p>
.
These cases illustrate the precedence of HTML tags, code spans,
and autolinks over link grouping:
. .
[foo <bar attr="](baz)"> [foo <bar attr="](baz)">
. .
<p>[foo <bar attr="](baz)"></p> <p>[foo <bar attr="](baz)"></p>
. .
There are three kinds of [reference links](#reference-link): .
<a id="reference-link"></a> [foo`](/uri)`
.
<p>[foo<code>](/uri)</code></p>
.
A [full reference link](#full-reference-link) <a id="full-reference-link"></a> .
consists of a [link label](#link-label), optional whitespace, and [foo<http://example.com?search=](uri)>
another [link label](#link-label) that [matches](#matches) a .
<p>[foo<a href="http://example.com?search=%5D(uri)">http://example.com?search=](uri)</a></p>
.
There are three kinds of [reference links](@reference-link):
[full](#full-reference-link), [collapsed](#collapsed-reference-link),
and [shortcut](#shortcut-reference-link).
A [full reference link](@full-reference-link)
consists of a [link text](#link-text), optional whitespace, and
a [link label](#link-label) that [matches](#matches) a
[link reference definition](#link-reference-definition) elsewhere in the [link reference definition](#link-reference-definition) elsewhere in the
document. document.
One label [matches](#matches) <a id="matches"></a> A [link label](@link-label) begins with a left bracket (`[`) and ends
with the first right bracket (`]`) that is not backslash-escaped.
Unescaped square bracket characters are not allowed in
[link labels](#link-label). A link label can have at most 999
characters inside the square brackets.
One label [matches](@matches)
another just in case their normalized forms are equal. To normalize a another just in case their normalized forms are equal. To normalize a
label, perform the *unicode case fold* and collapse consecutive internal label, perform the *unicode case fold* and collapse consecutive internal
whitespace to a single space. If there are multiple matching reference whitespace to a single space. If there are multiple matching reference
link definitions, the one that comes first in the document is used. (It link definitions, the one that comes first in the document is used. (It
is desirable in such cases to emit a warning.) is desirable in such cases to emit a warning.)
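The matching rule above can be sketched in Python, whose `str.casefold` performs Unicode case folding. The helper names `normalize_label`, `labels_match`, and `resolve` are hypothetical, and the representation of definitions as a list of `(label, destination)` pairs in document order is an assumption made for the example.

```python
import re


def normalize_label(label):
    """Case-fold the label and collapse whitespace runs to one space."""
    return re.sub(r"\s+", " ", label.casefold())


def labels_match(a, b):
    return normalize_label(a) == normalize_label(b)


def resolve(label, definitions):
    """Return the destination of the first matching definition, or None.

    definitions is a list of (label, destination) pairs in document order,
    so the earliest definition wins when several match.
    """
    for defined_label, destination in definitions:
        if labels_match(label, defined_label):
            return destination
    return None
```

Because the comparison works on the raw label text, `foo\!` and `foo!` do not match even though they would render the same way as inline content.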
The contents of the first link label are parsed as inlines, which are The contents of the first link label are parsed as inlines, which are
used as the link's text. The link's URI and title are provided by the used as the link's text. The link's URI and title are provided by the
matching [link reference definition](#link-reference-definition). matching [link reference definition](#link-reference-definition).
Here is a simple example: Here is a simple example:
. .
[foo][bar] [foo][bar]
[bar]: /url "title" [bar]: /url "title"
. .
<p><a href="/url" title="title">foo</a></p> <p><a href="/url" title="title">foo</a></p>
. .
The first label can contain inline content: The rules for the [link text](#link-text) are the same as with
[inline links](#inline-link). Thus:
The link text may contain balanced brackets, but not unbalanced ones,
unless they are escaped:
. .
[*foo\!*][bar] [link [foo [bar]]][ref]
[bar]: /url "title" [ref]: /uri
. .
<p><a href="/url" title="title"><em>foo!</em></a></p> <p><a href="/uri">link [foo [bar]]</a></p>
.
.
[link \[bar][ref]
[ref]: /uri
.
<p><a href="/uri">link [bar</a></p>
.
The link text may contain inline content:
.
[link *foo **bar** `#`*][ref]
[ref]: /uri
.
<p><a href="/uri">link <em>foo <strong>bar</strong> <code>#</code></em></a></p>
.
.
[![moon](moon.jpg)][ref]
[ref]: /uri
.
<p><a href="/uri"><img src="moon.jpg" alt="moon" /></a></p>
.
However, links may not contain other links, at any level of nesting.
.
[foo [bar](/uri)][ref]
[ref]: /uri
.
<p>[foo <a href="/uri">bar</a>]<a href="/uri">ref</a></p>
.
.
[foo *bar [baz][ref]*][ref]
[ref]: /uri
.
<p>[foo <em>bar <a href="/uri">baz</a></em>]<a href="/uri">ref</a></p>
.
(In the examples above, we have two [shortcut reference
links](#shortcut-reference-link) instead of one [full reference
link](#full-reference-link).)
The following cases illustrate the precedence of link text grouping over
emphasis grouping:
.
*[foo*][ref]
[ref]: /uri
.
<p>*<a href="/uri">foo*</a></p>
.
.
[foo *bar][ref]
[ref]: /uri
.
<p><a href="/uri">foo *bar</a></p>
.
These cases illustrate the precedence of HTML tags, code spans,
and autolinks over link grouping:
.
[foo <bar attr="][ref]">
[ref]: /uri
.
<p>[foo <bar attr="][ref]"></p>
.
.
[foo`][ref]`
[ref]: /uri
.
<p>[foo<code>][ref]</code></p>
.
.
[foo<http://example.com?search=][ref]>
[ref]: /uri
.
<p>[foo<a href="http://example.com?search=%5D%5Bref%5D">http://example.com?search=][ref]</a></p>
. .
Matching is case-insensitive: Matching is case-insensitive:
. .
[foo][BaR] [foo][BaR]
[bar]: /url "title" [bar]: /url "title"
. .
<p><a href="/url" title="title">foo</a></p> <p><a href="/url" title="title">foo</a></p>
skipping to change at line 5400 skipping to change at line 5592
. .
[Foo [Foo
bar]: /url bar]: /url
[Baz][Foo bar] [Baz][Foo bar]
. .
<p><a href="/url">Baz</a></p> <p><a href="/url">Baz</a></p>
. .
There can be whitespace between the two labels: There can be whitespace between the [link text](#link-text) and the
[link label](#link-label):
. .
[foo] [bar] [foo] [bar]
[bar]: /url "title" [bar]: /url "title"
. .
<p><a href="/url" title="title">foo</a></p> <p><a href="/url" title="title">foo</a></p>
. .
. .
skipping to change at line 5444 skipping to change at line 5637
labels define equivalent inline content: labels define equivalent inline content:
. .
[bar][foo\!] [bar][foo\!]
[foo!]: /url [foo!]: /url
. .
<p>[bar][foo!]</p> <p>[bar][foo!]</p>
. .
A [collapsed reference link](#collapsed-reference-link) [Link labels](#link-label) cannot contain brackets, unless they are
<a id="collapsed-reference-link"></a> consists of a [link backslash-escaped:
.
[foo][ref[]
[ref[]: /uri
.
<p>[foo][ref[]</p>
<p>[ref[]: /uri</p>
.
.
[foo][ref[bar]]
[ref[bar]]: /uri
.
<p>[foo][ref[bar]]</p>
<p>[ref[bar]]: /uri</p>
.
.
[[[foo]]]
[[[foo]]]: /url
.
<p>[[[foo]]]</p>
<p>[[[foo]]]: /url</p>
.
.
[foo][ref\[]
[ref\[]: /uri
.
<p><a href="/uri">foo</a></p>
.
A [collapsed reference link](@collapsed-reference-link)
consists of a [link
label](#link-label) that [matches](#matches) a [link reference label](#link-label) that [matches](#matches) a [link reference
definition](#link-reference-definition) elsewhere in the definition](#link-reference-definition) elsewhere in the
document, optional whitespace, and the string `[]`. The contents of the document, optional whitespace, and the string `[]`. The contents of the
first link label are parsed as inlines, which are used as the link's first link label are parsed as inlines, which are used as the link's
text. The link's URI and title are provided by the matching reference text. The link's URI and title are provided by the matching reference
link definition. Thus, `[foo][]` is equivalent to `[foo][foo]`. link definition. Thus, `[foo][]` is equivalent to `[foo][foo]`.
. .
[foo][] [foo][]
skipping to change at line 5491 skipping to change at line 5722
. .
[foo] [foo]
[] []
[foo]: /url "title" [foo]: /url "title"
. .
<p><a href="/url" title="title">foo</a></p> <p><a href="/url" title="title">foo</a></p>
. .
A [shortcut reference link](#shortcut-reference-link) A [shortcut reference link](@shortcut-reference-link)
<a id="shortcut-reference-link"></a> consists of a [link consists of a [link
label](#link-label) that [matches](#matches) a [link reference label](#link-label) that [matches](#matches) a [link reference
definition](#link-reference-definition) elsewhere in the definition](#link-reference-definition) elsewhere in the
document and is not followed by `[]` or a link label. document and is not followed by `[]` or a link label.
The contents of the first link label are parsed as inlines, The contents of the first link label are parsed as inlines,
which are used as the link's text. The link's URI and title which are used as the link's text. The link's URI and title
are provided by the matching link reference definition. are provided by the matching link reference definition.
Thus, `[foo]` is equivalent to `[foo][]`. Thus, `[foo]` is equivalent to `[foo][]`.
. .
[foo] [foo]
skipping to change at line 5546 skipping to change at line 5777
opening bracket to avoid links: opening bracket to avoid links:
. .
\[foo] \[foo]
[foo]: /url "title" [foo]: /url "title"
. .
<p>[foo]</p> <p>[foo]</p>
. .
Note that this is a link, because link labels bind more tightly Note that this is a link, because a link label ends with the first
than emphasis: following closing bracket:
. .
[foo*]: /url [foo*]: /url
*[foo*] *[foo*]
. .
<p>*<a href="/url">foo*</a></p> <p>*<a href="/url">foo*</a></p>
. .
However, this is not, because link labels bind less This is a link too, for the same reason:
tightly than code backticks:
. .
[foo`]: /url [foo`]: /url
[foo`]` [foo`]`
. .
<p>[foo<code>]</code></p> <p>[foo<code>]</code></p>
. .
Link labels can contain matched square brackets:
.
[[[foo]]]
[[[foo]]]: /url
.
<p><a href="/url">[[foo]]</a></p>
.
.
[[[foo]]]
[[[foo]]]: /url1
[foo]: /url2
.
<p><a href="/url1">[[foo]]</a></p>
.
For non-matching brackets, use backslash escapes:
.
[\[foo]
[\[foo]: /url
.
<p><a href="/url">[foo</a></p>
.
Full references take precedence over shortcut references: Full references take precedence over shortcut references:
. .
[foo][bar] [foo][bar]
[foo]: /url1 [foo]: /url1
[bar]: /url2 [bar]: /url2
. .
<p><a href="/url2">foo</a></p> <p><a href="/url2">foo</a></p>
. .
skipping to change at line 5645 skipping to change at line 5846
[foo][bar][baz] [foo][bar][baz]
[baz]: /url1 [baz]: /url1
[foo]: /url2 [foo]: /url2
. .
<p>[foo]<a href="/url1">bar</a></p> <p>[foo]<a href="/url1">bar</a></p>
. .
## Images ## Images
An (unescaped) exclamation mark (`!`) followed by a reference or Syntax for images is like the syntax for links, with one
inline link will be parsed as an image. The plain string content difference. Instead of [link text](#link-text), we have an [image
of the link label will be used as the image's alt text, and the link description](@image-description). The rules for this are the
title, if any, will be used as the image's title. same as for [link text](#link-text), except that (a) an
image description starts with `![` rather than `[`, and
(b) an image description may contain links, but not images
(even deeply nested). An image description has inline elements
as its contents. When an image is rendered to HTML,
this is standardly used as the image's `alt` attribute.
. .
![foo](/url "title") ![foo](/url "title")
. .
<p><img src="/url" alt="foo" title="title" /></p> <p><img src="/url" alt="foo" title="title" /></p>
. .
. .
![foo *bar*] ![foo *bar*]
[foo *bar*]: train.jpg "train & tracks" [foo *bar*]: train.jpg "train & tracks"
. .
<p><img src="train.jpg" alt="foo bar" title="train &amp; tracks" /></p> <p><img src="train.jpg" alt="foo bar" title="train &amp; tracks" /></p>
. .
Note that in the above example, the alt text is `foo bar`, not `foo .
*bar*` or `foo <em>bar</em>` or `foo &lt;em&gt;bar&lt;/em&gt;`. Only ![foo ![bar](/url)](/url2)
the plain string content is rendered, without formatting. .
<p>![foo <img src="/url" alt="bar" />](/url2)</p>
.
.
![foo [bar](/url)](/url2)
.
<p><img src="/url2" alt="foo bar" /></p>
.
Though this spec is concerned with parsing, not rendering, it is
recommended that in rendering to HTML, only the plain string content
of the [image description](#image-description) be used. Note that in
the above example, the alt attribute's value is `foo bar`, not `foo
[bar](/url)` or `foo <a href="/url">bar</a>`. Only the plain string
content is rendered, without formatting.
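One way to obtain the plain string content is to walk the parsed inlines and keep only the literal text of the leaves. The Python below is a minimal sketch: the `Inline` node type, its field names, and the `plain_text` helper are all hypothetical and are not part of this spec or of any particular implementation.

```python
from dataclasses import dataclass, field


@dataclass
class Inline:
    kind: str                  # e.g. "text", "code", "emph", "strong", "link"
    literal: str = ""          # text carried by leaf nodes
    children: list = field(default_factory=list)


def plain_text(nodes):
    """Concatenate the literal text of all leaves, dropping formatting."""
    parts = []
    for node in nodes:
        if node.children:
            parts.append(plain_text(node.children))
        else:
            parts.append(node.literal)
    return "".join(parts)
```

For a description parsed from `foo [bar](/url)`, that is, a text node `foo ` followed by a link node containing the text `bar`, this yields `foo bar`.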
. .
![foo *bar*][] ![foo *bar*][]
[foo *bar*]: train.jpg "train & tracks" [foo *bar*]: train.jpg "train & tracks"
. .
<p><img src="train.jpg" alt="foo bar" title="train &amp; tracks" /></p> <p><img src="train.jpg" alt="foo bar" title="train &amp; tracks" /></p>
. .
. .
skipping to change at line 5784 skipping to change at line 6005
. .
. .
![*foo* bar] ![*foo* bar]
[*foo* bar]: /url "title" [*foo* bar]: /url "title"
. .
<p><img src="/url" alt="foo bar" title="title" /></p> <p><img src="/url" alt="foo bar" title="title" /></p>
. .
Note that link labels cannot contain unescaped brackets:
. .
![[foo]] ![[foo]]
[[foo]]: /url "title" [[foo]]: /url "title"
. .
<p><img src="/url" alt="[foo]" title="title" /></p> <p>![[foo]]</p>
<p>[[foo]]: /url &quot;title&quot;</p>
. .
The link labels are case-insensitive: The link labels are case-insensitive:
. .
![Foo] ![Foo]
[foo]: /url "title" [foo]: /url "title"
. .
<p><img src="/url" alt="Foo" title="title" /></p> <p><img src="/url" alt="Foo" title="title" /></p>
skipping to change at line 5826 skipping to change at line 6050
. .
\![foo] \![foo]
[foo]: /url "title" [foo]: /url "title"
. .
<p>!<a href="/url" title="title">foo</a></p> <p>!<a href="/url" title="title">foo</a></p>
. .
## Autolinks ## Autolinks
Autolinks are absolute URIs and email addresses inside `<` and `>`. [Autolinks](@autolink) are absolute URIs and email addresses inside `<` and `>`.
They are parsed as links, with the URL or email address as the link They are parsed as links, with the URL or email address as the link
label. label.
A [URI autolink](#uri-autolink) <a id="uri-autolink"></a> A [URI autolink](@uri-autolink)
consists of `<`, followed by an [absolute consists of `<`, followed by an [absolute
URI](#absolute-uri) not containing `<`, followed by `>`. It is parsed URI](#absolute-uri) not containing `<`, followed by `>`. It is parsed
as a link to the URI, with the URI as the link's label. as a link to the URI, with the URI as the link's label.
An [absolute URI](#absolute-uri), <a id="absolute-uri"></a> An [absolute URI](@absolute-uri),
for these purposes, consists of a [scheme](#scheme) followed by a colon (`:`) for these purposes, consists of a [scheme](#scheme) followed by a colon (`:`)
followed by zero or more characters other than ASCII whitespace and followed by zero or more characters other than ASCII whitespace and
control characters, `<`, and `>`. If the URI includes these characters, control characters, `<`, and `>`. If the URI includes these characters,
you must use percent-encoding (e.g. `%20` for a space). you must use percent-encoding (e.g. `%20` for a space).
The following [schemes](#scheme) <a id="scheme"></a> The following [schemes](@scheme)
are recognized (case-insensitive): are recognized (case-insensitive):
`coap`, `doi`, `javascript`, `aaa`, `aaas`, `about`, `acap`, `cap`, `coap`, `doi`, `javascript`, `aaa`, `aaas`, `about`, `acap`, `cap`,
`cid`, `crid`, `data`, `dav`, `dict`, `dns`, `file`, `ftp`, `geo`, `go`, `cid`, `crid`, `data`, `dav`, `dict`, `dns`, `file`, `ftp`, `geo`, `go`,
`gopher`, `h323`, `http`, `https`, `iax`, `icap`, `im`, `imap`, `info`, `gopher`, `h323`, `http`, `https`, `iax`, `icap`, `im`, `imap`, `info`,
`ipp`, `iris`, `iris.beep`, `iris.xpc`, `iris.xpcs`, `iris.lwz`, `ldap`, `ipp`, `iris`, `iris.beep`, `iris.xpc`, `iris.xpcs`, `iris.lwz`, `ldap`,
`mailto`, `mid`, `msrp`, `msrps`, `mtqp`, `mupdate`, `news`, `nfs`, `mailto`, `mid`, `msrp`, `msrps`, `mtqp`, `mupdate`, `news`, `nfs`,
`ni`, `nih`, `nntp`, `opaquelocktoken`, `pop`, `pres`, `rtsp`, `ni`, `nih`, `nntp`, `opaquelocktoken`, `pop`, `pres`, `rtsp`,
`service`, `session`, `shttp`, `sieve`, `sip`, `sips`, `sms`, `snmp`, `service`, `session`, `shttp`, `sieve`, `sip`, `sips`, `sms`, `snmp`,
`soap.beep`, `soap.beeps`, `tag`, `tel`, `telnet`, `tftp`, `thismessage`, `soap.beep`, `soap.beeps`, `tag`, `tel`, `telnet`, `tftp`, `thismessage`,
`tn3270`, `tip`, `tv`, `urn`, `vemmi`, `ws`, `wss`, `xcon`, `tn3270`, `tip`, `tv`, `urn`, `vemmi`, `ws`, `wss`, `xcon`,
skipping to change at line 5903 skipping to change at line 6127
. .
Spaces are not allowed in autolinks: Spaces are not allowed in autolinks:
. .
<http://foo.bar/baz bim> <http://foo.bar/baz bim>
. .
<p>&lt;http://foo.bar/baz bim&gt;</p> <p>&lt;http://foo.bar/baz bim&gt;</p>
. .
An [email autolink](#email-autolink) <a id="email-autolink"></a> An [email autolink](@email-autolink)
consists of `<`, followed by an [email address](#email-address), consists of `<`, followed by an [email address](#email-address),
followed by `>`. The link's label is the email address, followed by `>`. The link's label is the email address,
and the URL is `mailto:` followed by the email address. and the URL is `mailto:` followed by the email address.
An [email address](#email-address), <a id="email-address"></a> An [email address](@email-address),
for these purposes, is anything that matches for these purposes, is anything that matches
the [non-normative regex from the HTML5 the [non-normative regex from the HTML5
spec](http://www.whatwg.org/specs/web-apps/current-work/multipage/forms.html#e-mail-state-%28type=email%29): spec](http://www.whatwg.org/specs/web-apps/current-work/multipage/forms.html#e-mail-state-%28type=email%29):
/^[a-zA-Z0-9.!#$%&'*+/=?^_`{|}~-]+@[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])? /^[a-zA-Z0-9.!#$%&'*+/=?^_`{|}~-]+@[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?
(?:\.[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?)*$/ (?:\.[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?)*$/
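The email autolink rule above can be sketched in Python as follows. This is a non-normative illustration, not part of the spec: the names `EMAIL`, `EMAIL_AUTOLINK`, and `email_autolink` are hypothetical, and HTML-escaping the address with `html.escape` is an assumption about the renderer.

```python
import html
import re

# The HTML5 non-normative pattern quoted above, with the ^...$ anchors
# dropped so that it can be embedded between `<` and `>`.
EMAIL = (
    r"[a-zA-Z0-9.!#$%&'*+/=?^_`{|}~-]+"
    r"@[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?"
    r"(?:\.[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?)*"
)
EMAIL_AUTOLINK = re.compile("<(" + EMAIL + ")>")


def email_autolink(text):
    """Return an HTML link if `text` is exactly one email autolink, else None."""
    m = EMAIL_AUTOLINK.fullmatch(text)
    if m is None:
        return None
    # The label is the address; the URL is `mailto:` plus the address.
    addr = html.escape(m.group(1))  # escaping is an assumption of this sketch
    return '<a href="mailto:{0}">{0}</a>'.format(addr)
```

Because the pattern admits no spaces, an input such as `<foo bar@example.com>` is not recognized and the function returns `None`.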
Examples of email autolinks: Examples of email autolinks:
. .
skipping to change at line 5983 skipping to change at line 6207
## Raw HTML ## Raw HTML
Text between `<` and `>` that looks like an HTML tag is parsed as a Text between `<` and `>` that looks like an HTML tag is parsed as a
raw HTML tag and will be rendered in HTML without escaping. raw HTML tag and will be rendered in HTML without escaping.
Tag and attribute names are not limited to current HTML tags, Tag and attribute names are not limited to current HTML tags,
so custom tags (and even, say, DocBook tags) may be used. so custom tags (and even, say, DocBook tags) may be used.
Here is the grammar for tags: Here is the grammar for tags:
A [tag name](#tag-name) <a id="tag-name"></a> consists of an ASCII letter A [tag name](@tag-name) consists of an ASCII letter
followed by zero or more ASCII letters or digits. followed by zero or more ASCII letters or digits.
An [attribute](#attribute) <a id="attribute"></a> consists of whitespace, An [attribute](@attribute) consists of whitespace,
an **attribute name**, and an optional **attribute value an [attribute name](#attribute-name), and an optional
specification**. [attribute value specification](#attribute-value-specification).
An [attribute name](#attribute-name) <a id="attribute-name"></a> An [attribute name](@attribute-name)
consists of an ASCII letter, `_`, or `:`, followed by zero or more ASCII consists of an ASCII letter, `_`, or `:`, followed by zero or more ASCII
letters, digits, `_`, `.`, `:`, or `-`. (Note: This is the XML letters, digits, `_`, `.`, `:`, or `-`. (Note: This is the XML
specification restricted to ASCII. HTML5 is laxer.) specification restricted to ASCII. HTML5 is laxer.)
An [attribute value specification](#attribute-value-specification) An [attribute value specification](@attribute-value-specification)
<a id="attribute-value-specification"></a> consists of optional whitespace, consists of optional whitespace,
a `=` character, optional whitespace, and an [attribute a `=` character, optional whitespace, and an [attribute
value](#attribute-value). value](#attribute-value).
An [attribute value](#attribute-value) <a id="attribute-value"></a> An [attribute value](@attribute-value)
consists of an [unquoted attribute value](#unquoted-attribute-value), consists of an [unquoted attribute value](#unquoted-attribute-value),
a [single-quoted attribute value](#single-quoted-attribute-value), a [single-quoted attribute value](#single-quoted-attribute-value),
or a [double-quoted attribute value](#double-quoted-attribute-value). or a [double-quoted attribute value](#double-quoted-attribute-value).
An [unquoted attribute value](#unquoted-attribute-value) An [unquoted attribute value](@unquoted-attribute-value)
<a id="unquoted-attribute-value"></a> is a nonempty string of characters not is a nonempty string of characters not
including spaces, `"`, `'`, `=`, `<`, `>`, or `` ` ``. including spaces, `"`, `'`, `=`, `<`, `>`, or `` ` ``.
A [single-quoted attribute value](#single-quoted-attribute-value) A [single-quoted attribute value](@single-quoted-attribute-value)
<a id="single-quoted-attribute-value"></a> consists of `'`, zero or more consists of `'`, zero or more
characters not including `'`, and a final `'`. characters not including `'`, and a final `'`.
A [double-quoted attribute value](#double-quoted-attribute-value) A [double-quoted attribute value](@double-quoted-attribute-value)
<a id="double-quoted-attribute-value"></a> consists of `"`, zero or more consists of `"`, zero or more
characters not including `"`, and a final `"`. characters not including `"`, and a final `"`.
An [open tag](#open-tag) <a id="open-tag"></a> consists of a `<` character, An [open tag](@open-tag) consists of a `<` character,
a [tag name](#tag-name), zero or more [attributes](#attribute), a [tag name](#tag-name), zero or more [attributes](#attribute),
optional whitespace, an optional `/` character, and a `>` character. optional whitespace, an optional `/` character, and a `>` character.
A [closing tag](#closing-tag) <a id="closing-tag"></a> consists of the A [closing tag](@closing-tag) consists of the
string `</`, a [tag name](#tag-name), optional whitespace, and the string `</`, a [tag name](#tag-name), optional whitespace, and the
character `>`. character `>`.
An [HTML comment](#html-comment) <a id="html-comment"></a> consists of the An [HTML comment](@html-comment) consists of the
string `<!--`, a string of characters not including the string `--`, and string `<!--`, a string of characters not including the string `--`, and
the string `-->`. the string `-->`.
A [processing instruction](#processing-instruction) A [processing instruction](@processing-instruction)
<a id="processing-instruction"></a> consists of the string `<?`, a string consists of the string `<?`, a string
of characters not including the string `?>`, and the string of characters not including the string `?>`, and the string
`?>`. `?>`.
A [declaration](#declaration) <a id="declaration"></a> consists of the A [declaration](@declaration) consists of the
string `<!`, a name consisting of one or more uppercase ASCII letters, string `<!`, a name consisting of one or more uppercase ASCII letters,
whitespace, a string of characters not including the character `>`, and whitespace, a string of characters not including the character `>`, and
the character `>`. the character `>`.
A [CDATA section](#cdata-section) <a id="cdata-section"></a> consists of A [CDATA section](@cdata-section) consists of
the string `<![CDATA[`, a string of characters not including the string the string `<![CDATA[`, a string of characters not including the string
`]]>`, and the string `]]>`. `]]>`, and the string `]]>`.
An [HTML tag](#html-tag) <a id="html-tag"></a> consists of an [open An [HTML tag](@html-tag) consists of an [open
tag](#open-tag), a [closing tag](#closing-tag), an [HTML tag](#open-tag), a [closing tag](#closing-tag), an [HTML
comment](#html-comment), a [processing comment](#html-comment), a [processing
instruction](#processing-instruction), an [element type instruction](#processing-instruction), an [element type
declaration](#element-type-declaration), or a [CDATA declaration](#element-type-declaration), or a [CDATA
section](#cdata-section). section](#cdata-section).
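As a non-normative illustration, the grammar for open and closing tags can be assembled into regular expressions piece by piece. The sketch below is written in Python; all names in it are hypothetical. It covers only open tags and closing tags (not comments, processing instructions, declarations, or CDATA sections), and it approximates "whitespace" and "spaces" with the ASCII `\s` class, which is an assumption.

```python
import re

# Each constant mirrors one definition in the grammar above.
TAG_NAME = r"[A-Za-z][A-Za-z0-9]*"
ATTR_NAME = r"[A-Za-z_:][A-Za-z0-9_.:-]*"
UNQUOTED = r"[^\s\"'=<>`]+"   # assumption: excludes all ASCII whitespace
SINGLE_QUOTED = r"'[^']*'"
DOUBLE_QUOTED = r'"[^"]*"'
ATTR_VALUE = "(?:" + UNQUOTED + "|" + SINGLE_QUOTED + "|" + DOUBLE_QUOTED + ")"
ATTR_VALUE_SPEC = r"\s*=\s*" + ATTR_VALUE
ATTRIBUTE = r"\s+" + ATTR_NAME + "(?:" + ATTR_VALUE_SPEC + ")?"

# re.ASCII keeps \s to ASCII whitespace only.
OPEN_TAG = re.compile("<" + TAG_NAME + "(?:" + ATTRIBUTE + r")*\s*/?>", re.ASCII)
CLOSING_TAG = re.compile("</" + TAG_NAME + r"\s*>", re.ASCII)


def is_open_tag(s):
    """True if the whole string is one open tag under this sketch."""
    return OPEN_TAG.fullmatch(s) is not None


def is_closing_tag(s):
    """True if the whole string is one closing tag under this sketch."""
    return CLOSING_TAG.fullmatch(s) is not None
```

Under this sketch `<a foo="bar" baz=qux _boolean />` is an open tag, while `<33>` is not, because a tag name must begin with an ASCII letter.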
Here are some simple open tags: Here are some simple open tags:
. .
<a><bab><c2c> <a><bab><c2c>
skipping to change at line 6211 skipping to change at line 6435
. .
<a href="\""> <a href="\"">
. .
<p>&lt;a href=&quot;&quot;&quot;&gt;</p> <p>&lt;a href=&quot;&quot;&quot;&gt;</p>
. .
## Hard line breaks ## Hard line breaks
A line break (not in a code span or HTML tag) that is preceded A line break (not in a code span or HTML tag) that is preceded
by two or more spaces is parsed as a [hard line by two or more spaces and does not occur at the end of a block
break](#hard-line-break)<a id="hard-line-break"></a> (rendered is parsed as a [hard line break](@hard-line-break) (rendered
in HTML as a `<br />` tag): in HTML as a `<br />` tag):
. .
foo foo
baz baz
. .
<p>foo<br /> <p>foo<br />
baz</p> baz</p>
. .
skipping to change at line 6315 skipping to change at line 6539
. .
. .
<a href="foo\ <a href="foo\
bar"> bar">
. .
<p><a href="foo\ <p><a href="foo\
bar"></p> bar"></p>
. .
Hard line breaks are for separating inline content within a block.
Neither syntax for hard line breaks works at the end of a paragraph or
other block element:
.
foo\
.
<p>foo\</p>
.
.
foo
.
<p>foo</p>
.
.
### foo\
.
<h3>foo\</h3>
.
.
### foo
.
<h3>foo</h3>
.
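The end-of-block rule can be illustrated with a small Python sketch. It is non-normative and deliberately simplified: the function name `render_breaks` is hypothetical, the input is assumed to be the raw inline content of a single block, and code spans, HTML tags, and escaped backslashes are ignored.

```python
def render_breaks(block_text):
    """Render the line breaks inside one block's inline content.

    Simplifications (assumptions of this sketch): no code spans, no HTML
    tags, and a trailing backslash is never itself escaped.
    """
    lines = block_text.split("\n")
    out = []
    for i, line in enumerate(lines):
        is_last = i == len(lines) - 1
        if is_last:
            # End of block: neither syntax produces a hard break.
            # Trailing spaces are dropped; a trailing backslash stays literal.
            out.append(line.rstrip(" "))
        elif line.endswith("  "):
            # Two or more spaces before the line break: hard line break.
            out.append(line.rstrip(" ") + "<br />\n")
        elif line.endswith("\\"):
            # Backslash before the line break: hard line break.
            out.append(line[:-1] + "<br />\n")
        else:
            # Anything else is a softbreak, rendered here as a newline.
            out.append(line.rstrip(" ") + "\n")
    return "".join(out)
```

For the input `foo\` on its own, the function returns `foo\` unchanged, matching the paragraph example above.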
## Soft line breaks ## Soft line breaks
A regular line break (not in a code span or HTML tag) that is not A regular line break (not in a code span or HTML tag) that is not
preceded by two or more spaces is parsed as a softbreak. (A preceded by two or more spaces is parsed as a softbreak. (A
softbreak may be rendered in HTML either as a newline or as a space. softbreak may be rendered in HTML either as a newline or as a space.
The result will be the same in browsers. In the examples here, a The result will be the same in browsers. In the examples here, a
newline will be used.) newline will be used.)
. .
foo foo
 End of changes. 97 change blocks. 
178 lines changed or deleted 430 lines changed or added
