spec.txt   spec.txt 
--- ---
title: CommonMark Spec title: CommonMark Spec
author: author: John MacFarlane
- John MacFarlane version: 0.16
version: 0.15 date: 2015-01-14
date: 2014-12-31
license: '[CC-BY-SA 4.0](http://creativecommons.org/licenses/by-sa/4.0/)' license: '[CC-BY-SA 4.0](http://creativecommons.org/licenses/by-sa/4.0/)'
... ...
# Introduction # Introduction
## What is Markdown? ## What is Markdown?
Markdown is a plain text format for writing structured documents, Markdown is a plain text format for writing structured documents,
based on conventions used for indicating formatting in email and based on conventions used for indicating formatting in email and
usenet posts. It was developed in 2004 by John Gruber, who wrote usenet posts. It was developed in 2004 by John Gruber, who wrote
skipping to change at line 205 skipping to change at line 204
in Markdown with a small extension for the side-by-side tests. in Markdown with a small extension for the side-by-side tests.
The script `spec2md.pl` can be used to turn `spec.txt` into pandoc The script `spec2md.pl` can be used to turn `spec.txt` into pandoc
Markdown, which can then be converted into other formats. Markdown, which can then be converted into other formats.
In the examples, the `→` character is used to represent tabs. In the examples, the `→` character is used to represent tabs.
# Preliminaries # Preliminaries
## Characters and lines ## Characters and lines
The input is a sequence of zero or more [lines](#line). Any sequence of [character]s is a valid CommonMark
document.
A [line](@line)
is a sequence of zero or more [characters](#character) followed by a
[line ending](#line-ending) or by the end of file.
A [character](@character) is a unicode code point. A [character](@character) is a unicode code point.
This spec does not specify an encoding; it thinks of lines as composed This spec does not specify an encoding; it thinks of lines as composed
of characters rather than bytes. A conforming parser may be limited of characters rather than bytes. A conforming parser may be limited
to a certain encoding. to a certain encoding.
A [line](@line) is a sequence of zero or more [character]s
followed by a [line ending] or by the end of file.
A [line ending](@line-ending) is, depending on the platform, a A [line ending](@line-ending) is, depending on the platform, a
newline (`U+000A`), carriage return (`U+000D`), or newline (`U+000A`), carriage return (`U+000D`), or
carriage return + newline. carriage return + newline.
For security reasons, a conforming parser must strip or replace the For security reasons, a conforming parser must strip or replace the
Unicode character `U+0000`. Unicode character `U+0000`.
A line containing no characters, or a line containing only spaces A line containing no characters, or a line containing only spaces
(`U+0020`) or tabs (`U+0009`), is called a [blank line](@blank-line). (`U+0020`) or tabs (`U+0009`), is called a [blank line](@blank-line).
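The definitions of line, line ending, and blank line above can be sketched in Python as follows. This is only an illustration: the helper names `split_lines` and `is_blank_line` are hypothetical, and replacing `U+0000` with `U+FFFD` is just one of the two options the text allows (stripping it is the other).

```python
import re

# Any of the three line endings named above. "\r\n" is tried first so that
# carriage return + newline counts as a single line ending.
LINE_ENDING = re.compile(r"\r\n|\r|\n")


def split_lines(text):
    """Split text into lines, without their line endings (hypothetical helper)."""
    # The spec requires U+0000 to be stripped or replaced; this sketch replaces it.
    text = text.replace("\x00", "\ufffd")
    lines = LINE_ENDING.split(text)
    # A trailing line ending terminates the last line; it does not start a new one.
    if lines and lines[-1] == "":
        lines.pop()
    return lines


def is_blank_line(line):
    """True if the line has no characters, or only spaces and tabs."""
    return all(ch in " \t" for ch in line)
```

The line is assumed to be passed to `is_blank_line` without its line ending.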
The following definitions of character classes will be used in this spec: The following definitions of character classes will be used in this spec:
A [whitespace character](@whitespace-character) is a space A [whitespace character](@whitespace-character) is a space
(`U+0020`), tab (`U+0009`), carriage return (`U+000D`), or (`U+0020`), tab (`U+0009`), carriage return (`U+000D`), or
newline (`U+000A`). newline (`U+000A`).
[Whitespace](@whitespace) is a sequence of one or more [whitespace [Whitespace](@whitespace) is a sequence of one or more [whitespace
characters](#whitespace-character). character]s.
A [unicode whitespace character](@unicode-whitespace-character) is A [unicode whitespace character](@unicode-whitespace-character) is
any code point in the unicode `Zs` class, or a tab (`U+0009`), any code point in the unicode `Zs` class, or a tab (`U+0009`),
carriage return (`U+000D`), newline (`U+000A`), or form feed carriage return (`U+000D`), newline (`U+000A`), or form feed
(`U+000C`). (`U+000C`).
[Unicode whitespace](@unicode-whitespace) is a sequence of one [Unicode whitespace](@unicode-whitespace) is a sequence of one
or more [unicode whitespace characters](#unicode-whitespace-character). or more [unicode whitespace character]s.
A [non-space character](@non-space-character) is anything but `U+0020`. A [non-space character](@non-space-character) is anything but `U+0020`.
An [ASCII punctuation character](@ascii-punctuation-character) An [ASCII punctuation character](@ascii-punctuation-character)
is `!`, `"`, `#`, `$`, `%`, `&`, `'`, `(`, `)`, is `!`, `"`, `#`, `$`, `%`, `&`, `'`, `(`, `)`,
`*`, `+`, `,`, `-`, `.`, `/`, `:`, `;`, `<`, `=`, `>`, `?`, `@`, `*`, `+`, `,`, `-`, `.`, `/`, `:`, `;`, `<`, `=`, `>`, `?`, `@`,
`[`, `\`, `]`, `^`, `_`, `` ` ``, `{`, `|`, `}`, or `~`. `[`, `\`, `]`, `^`, `_`, `` ` ``, `{`, `|`, `}`, or `~`.
A [punctuation character](@punctuation-character) is an [ASCII A [punctuation character](@punctuation-character) is an [ASCII
punctuation character](#ascii-punctuation-character) or anything in punctuation character] or anything in
the unicode classes `Pc`, `Pd`, `Pe`, `Pf`, `Pi`, `Po`, or `Ps`. the unicode classes `Pc`, `Pd`, `Pe`, `Pf`, `Pi`, `Po`, or `Ps`.
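The character classes above map fairly directly onto Python's standard library. The sketch below is an illustration, not part of the spec; the function names are hypothetical, and each function assumes it is given a single character. Note that several ASCII punctuation characters (for example `$` and `+`) are not in any Unicode `P*` class, which is why the ASCII list has to be checked separately.

```python
import string
import unicodedata

# string.punctuation contains exactly the 32 ASCII punctuation characters
# listed above.
ASCII_PUNCTUATION = set(string.punctuation)
UNICODE_PUNCTUATION_CLASSES = {"Pc", "Pd", "Pe", "Pf", "Pi", "Po", "Ps"}


def is_whitespace_character(ch):
    """Space, tab, carriage return, or newline."""
    return ch in (" ", "\t", "\r", "\n")


def is_unicode_whitespace_character(ch):
    """Any code point in class Zs, or tab, carriage return, newline, form feed."""
    return unicodedata.category(ch) == "Zs" or ch in ("\t", "\r", "\n", "\f")


def is_punctuation_character(ch):
    """ASCII punctuation, or anything in the listed Unicode punctuation classes."""
    return (ch in ASCII_PUNCTUATION
            or unicodedata.category(ch) in UNICODE_PUNCTUATION_CLASSES)
```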
## Tab expansion ## Tab expansion
Tabs in lines are expanded to spaces, with a tab stop of 4 characters: Tabs in lines are expanded to spaces, with a tab stop of 4 characters:
. .
→foo→baz→→bim →foo→baz→→bim
. .
<pre><code>foo baz bim <pre><code>foo baz bim
skipping to change at line 311 skipping to change at line 310
paragraphs, headers, and other block constructs can be parsed for inline paragraphs, headers, and other block constructs can be parsed for inline
structure. The second step requires information about link reference structure. The second step requires information about link reference
definitions that will be available only at the end of the first definitions that will be available only at the end of the first
step. Note that the first step requires processing lines in sequence, step. Note that the first step requires processing lines in sequence,
but the second can be parallelized, since the inline parsing of but the second can be parallelized, since the inline parsing of
one block element does not affect the inline parsing of any other. one block element does not affect the inline parsing of any other.
## Container blocks and leaf blocks ## Container blocks and leaf blocks
We can divide blocks into two types: We can divide blocks into two types:
[container blocks](@container-block), [container block](@container-block)s,
which can contain other blocks, and [leaf blocks](@leaf-block), which can contain other blocks, and [leaf block](@leaf-block)s,
which cannot. which cannot.
# Leaf blocks # Leaf blocks
This section describes the different kinds of leaf block that make up a This section describes the different kinds of leaf block that make up a
Markdown document. Markdown document.
## Horizontal rules ## Horizontal rules
A line consisting of 0-3 spaces of indentation, followed by a sequence A line consisting of 0-3 spaces of indentation, followed by a sequence
of three or more matching `-`, `_`, or `*` characters, each followed of three or more matching `-`, `_`, or `*` characters, each followed
optionally by any number of spaces, forms a [horizontal optionally by any number of spaces, forms a
rule](@horizontal-rule). [horizontal rule](@horizontal-rule).
. .
*** ***
--- ---
___ ___
. .
<hr /> <hr />
<hr /> <hr />
<hr /> <hr />
. .
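One possible way to test a single line against the horizontal rule definition is a regular expression with a backreference, so that all rule characters must match the first one. This is a sketch only: `is_horizontal_rule` is a hypothetical name, the line is assumed to have tabs expanded and its line ending removed, and precedence against setext headers and list items is not handled here.

```python
import re

# 0-3 spaces of indentation, then three or more matching "-", "_", or "*"
# characters, each optionally followed by any number of spaces.
HORIZONTAL_RULE = re.compile(r" {0,3}([-_*]) *(?:\1 *){2,}")


def is_horizontal_rule(line):
    """True if the whole line matches the horizontal rule pattern."""
    return HORIZONTAL_RULE.fullmatch(line) is not None
```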
skipping to change at line 442 skipping to change at line 441
a------ a------
---a--- ---a---
. .
<p>_ _ _ _ a</p> <p>_ _ _ _ a</p>
<p>a------</p> <p>a------</p>
<p>---a---</p> <p>---a---</p>
. .
It is required that all of the It is required that all of the [non-space character]s be the same.
[non-space characters](#non-space-character) be the same.
So, this is not a horizontal rule: So, this is not a horizontal rule:
. .
*-* *-*
. .
<p><em>-</em></p> <p><em>-</em></p>
. .
Horizontal rules do not need blank lines before or after: Horizontal rules do not need blank lines before or after:
skipping to change at line 482 skipping to change at line 480
*** ***
bar bar
. .
<p>Foo</p> <p>Foo</p>
<hr /> <hr />
<p>bar</p> <p>bar</p>
. .
If a line of dashes that meets the above conditions for being a If a line of dashes that meets the above conditions for being a
horizontal rule could also be interpreted as the underline of a [setext horizontal rule could also be interpreted as the underline of a [setext
header](#setext-header), the interpretation as a header], the interpretation as a
[setext-header](#setext-header) takes precedence. Thus, for example, [setext header] takes precedence. Thus, for example,
this is a setext header, not a paragraph followed by a horizontal rule: this is a setext header, not a paragraph followed by a horizontal rule:
. .
Foo Foo
--- ---
bar bar
. .
<h2>Foo</h2> <h2>Foo</h2>
<p>bar</p> <p>bar</p>
. .
When both a horizontal rule and a list item are possible When both a horizontal rule and a list item are possible
interpretations of a line, the horizontal rule is preferred: interpretations of a line, the horizontal rule takes precedence:
. .
* Foo * Foo
* * * * * *
* Bar * Bar
. .
<ul> <ul>
<li>Foo</li> <li>Foo</li>
</ul> </ul>
<hr /> <hr />
skipping to change at line 532 skipping to change at line 530
</li> </li>
</ul> </ul>
. .
## ATX headers ## ATX headers
An [ATX header](@atx-header) An [ATX header](@atx-header)
consists of a string of characters, parsed as inline content, between an consists of a string of characters, parsed as inline content, between an
opening sequence of 1--6 unescaped `#` characters and an optional opening sequence of 1--6 unescaped `#` characters and an optional
closing sequence of any number of `#` characters. The opening sequence closing sequence of any number of `#` characters. The opening sequence
of `#` characters cannot be followed directly by a nonspace character. of `#` characters cannot be followed directly by a
[non-space character].
The optional closing sequence of `#`s must be preceded by a space and may be The optional closing sequence of `#`s must be preceded by a space and may be
followed by spaces only. The opening `#` character may be indented 0-3 followed by spaces only. The opening `#` character may be indented 0-3
spaces. The raw contents of the header are stripped of leading and spaces. The raw contents of the header are stripped of leading and
trailing spaces before being parsed as inline content. The header level trailing spaces before being parsed as inline content. The header level
is equal to the number of `#` characters in the opening sequence. is equal to the number of `#` characters in the opening sequence.
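The rules in this paragraph can be approximated by two regular expressions, one for the opening sequence and one for the optional closing sequence. The following is a minimal sketch under stated assumptions: `parse_atx_header` is a hypothetical helper, the line is passed without its line ending, and the returned content is the raw string before inline parsing (so backslash escapes are left untouched).

```python
import re

# 0-3 spaces, then 1-6 "#" characters that are followed by a space
# or by the end of the line.
ATX_OPENING = re.compile(r" {0,3}(#{1,6})(?= |$)")
# Optional closing sequence: preceded by a space, followed by spaces only.
ATX_CLOSING = re.compile(r" #+ *$")


def parse_atx_header(line):
    """Return (level, raw_content) for an ATX header line, else None."""
    m = ATX_OPENING.match(line)
    if m is None:
        return None
    level = len(m.group(1))
    # Remove the closing sequence, if any, then strip leading/trailing spaces.
    raw = ATX_CLOSING.sub("", line[m.end():])
    return level, raw.strip(" ")
```

For instance, `parse_atx_header("### foo ### b")` keeps the inner `###` as content because it is followed by a non-space character.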
Simple headers: Simple headers:
. .
# foo # foo
skipping to change at line 658 skipping to change at line 657
. .
Spaces are allowed after the closing sequence: Spaces are allowed after the closing sequence:
. .
### foo ### ### foo ###
. .
<h3>foo</h3> <h3>foo</h3>
. .
A sequence of `#` characters with a nonspace character following it A sequence of `#` characters with a
[non-space character] following it
is not a closing sequence, but counts as part of the contents of the is not a closing sequence, but counts as part of the contents of the
header: header:
. .
### foo ### b ### foo ### b
. .
<h3>foo ### b</h3> <h3>foo ### b</h3>
. .
The closing sequence must be preceded by a space: The closing sequence must be preceded by a space:
skipping to change at line 727 skipping to change at line 727
### ### ### ###
. .
<h2></h2> <h2></h2>
<h1></h1> <h1></h1>
<h3></h3> <h3></h3>
. .
## Setext headers ## Setext headers
A [setext header](@setext-header) A [setext header](@setext-header)
consists of a line of text, containing at least one nonspace character, consists of a line of text, containing at least one
[non-space character],
with no more than 3 spaces indentation, followed by a [setext header with no more than 3 spaces indentation, followed by a [setext header
underline](#setext-header-underline). The line of text must be underline]. The line of text must be
one that, were it not followed by the setext header underline, one that, were it not followed by the setext header underline,
would be interpreted as part of a paragraph: it cannot be a code would be interpreted as part of a paragraph: it cannot be a code
block, header, blockquote, horizontal rule, or list. block, header, blockquote, horizontal rule, or list.
A [setext header underline](@setext-header-underline) is a sequence of A [setext header underline](@setext-header-underline) is a sequence of
`=` characters or a sequence of `-` characters, with no more than 3 `=` characters or a sequence of `-` characters, with no more than 3
spaces indentation and any number of trailing spaces. If a line spaces indentation and any number of trailing spaces. If a line
containing a single `-` can be interpreted as an containing a single `-` can be interpreted as an
empty [list item](#list-items), it should be interpreted this way empty [list items], it should be interpreted this way
and not as a [setext header underline](#setext-header-underline). and not as a [setext header underline].
The header is a level 1 header if `=` characters are used in the The header is a level 1 header if `=` characters are used in the
[setext header underline](#setext-header-underline), and a level 2 [setext header underline], and a level 2
header if `-` characters are used. The contents of the header are the header if `-` characters are used. The contents of the header are the
result of parsing the first line as Markdown inline content. result of parsing the first line as Markdown inline content.
In general, a setext header need not be preceded or followed by a In general, a setext header need not be preceded or followed by a
blank line. However, it cannot interrupt a paragraph, so when a blank line. However, it cannot interrupt a paragraph, so when a
setext header comes after a paragraph, a blank line is needed between setext header comes after a paragraph, a blank line is needed between
them. them.
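As an illustration of the setext header underline definition, the sketch below classifies a single candidate underline line. `setext_header_level` is a hypothetical name. The check is deliberately partial: it does not verify that the preceding line could be part of a paragraph, and it does not implement the rule that a single `-` may instead be an empty list item.

```python
import re

# No more than 3 spaces of indentation, a run of "=" or a run of "-",
# then any number of trailing spaces. Internal spaces are not allowed.
SETEXT_UNDERLINE = re.compile(r" {0,3}(=+|-+) *")


def setext_header_level(line):
    """Return 1 for an "=" underline, 2 for a "-" underline, else None."""
    m = SETEXT_UNDERLINE.fullmatch(line)
    if m is None:
        return None
    return 1 if m.group(1)[0] == "=" else 2
```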
Simple examples: Simple examples:
skipping to change at line 826 skipping to change at line 827
Foo Foo
---- ----
. .
<h2>Foo</h2> <h2>Foo</h2>
. .
Four spaces is too much: Four spaces is too much:
. .
Foo Foo
--- ---
. .
<p>Foo <p>Foo
---</p> ---</p>
. .
The setext header underline cannot contain internal spaces: The setext header underline cannot contain internal spaces:
. .
Foo Foo
= = = =
skipping to change at line 884 skipping to change at line 885
--- ---
of dashes"/> of dashes"/>
. .
<h2>`Foo</h2> <h2>`Foo</h2>
<p>`</p> <p>`</p>
<h2>&lt;a title=&quot;a lot</h2> <h2>&lt;a title=&quot;a lot</h2>
<p>of dashes&quot;/&gt;</p> <p>of dashes&quot;/&gt;</p>
. .
The setext header underline cannot be a [lazy continuation The setext header underline cannot be a [lazy continuation
line](#lazy-continuation-line) in a list item or block quote: line] in a list item or block quote:
. .
> Foo > Foo
--- ---
. .
<blockquote> <blockquote>
<p>Foo</p> <p>Foo</p>
</blockquote> </blockquote>
<hr /> <hr />
. .
skipping to change at line 1004 skipping to change at line 1005
. .
\> foo \> foo
------ ------
. .
<h2>&gt; foo</h2> <h2>&gt; foo</h2>
. .
## Indented code blocks ## Indented code blocks
An [indented code block](@indented-code-block) is composed of one or more An [indented code block](@indented-code-block) is composed of one or more
[indented chunks](#indented-chunk) separated by blank lines. [indented chunk]s separated by blank lines.
An [indented chunk](@indented-chunk) is a sequence of non-blank lines, An [indented chunk](@indented-chunk) is a sequence of non-blank lines,
each indented four or more spaces. The contents of the code block are each indented four or more spaces. The contents of the code block are
the literal contents of the lines, including trailing the literal contents of the lines, including trailing
[line endings](#line-ending), minus four spaces of indentation. [line ending]s, minus four spaces of indentation.
An indented code block has no attributes. An indented code block has no [info string].
An indented code block cannot interrupt a paragraph, so there must be An indented code block cannot interrupt a paragraph, so there must be
a blank line between a paragraph and a following indented code block. a blank line between a paragraph and a following indented code block.
(A blank line is not needed, however, between a code block and a following (A blank line is not needed, however, between a code block and a following
paragraph.) paragraph.)
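A minimal sketch of how the content of one indented chunk could be computed is shown below. The helper name `indented_chunk_content` is hypothetical; it assumes tabs have already been expanded, that lines are passed without their line endings, and that `"\n"` stands in for whatever line ending the input used.

```python
def indented_chunk_content(lines):
    """Return the literal content of an indented chunk, or None if the
    lines do not form one (a blank line or a line indented < 4 spaces)."""
    content = []
    for line in lines:
        if line.strip(" \t") == "" or not line.startswith("    "):
            return None
        # Remove exactly four spaces of indentation; keep the rest verbatim.
        content.append(line[4:] + "\n")
    return "".join(content)
```

Any indentation beyond four spaces is preserved, so `["    a simple", "      indented code block"]` yields a second line that still begins with two spaces.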
. .
a simple a simple
indented code block indented code block
. .
skipping to change at line 1159 skipping to change at line 1160
A [code fence](@code-fence) is a sequence A [code fence](@code-fence) is a sequence
of at least three consecutive backtick characters (`` ` ``) or of at least three consecutive backtick characters (`` ` ``) or
tildes (`~`). (Tildes and backticks cannot be mixed.) tildes (`~`). (Tildes and backticks cannot be mixed.)
A [fenced code block](@fenced-code-block) A [fenced code block](@fenced-code-block)
begins with a code fence, indented no more than three spaces. begins with a code fence, indented no more than three spaces.
The line with the opening code fence may optionally contain some text The line with the opening code fence may optionally contain some text
following the code fence; this is trimmed of leading and trailing following the code fence; this is trimmed of leading and trailing
spaces and called the [info string](@info-string). spaces and called the [info string](@info-string).
The info string may not contain any backtick The [info string] may not contain any backtick
characters. (The reason for this restriction is that otherwise characters. (The reason for this restriction is that otherwise
some inline code would be incorrectly interpreted as the some inline code would be incorrectly interpreted as the
beginning of a fenced code block.) beginning of a fenced code block.)
The content of the code block consists of all subsequent lines, until The content of the code block consists of all subsequent lines, until
a closing [code fence](#code-fence) of the same type as the code block a closing [code fence] of the same type as the code block
began with (backticks or tildes), and with at least as many backticks began with (backticks or tildes), and with at least as many backticks
or tildes as the opening code fence. If the leading code fence is or tildes as the opening code fence. If the leading code fence is
indented N spaces, then up to N spaces of indentation are removed from indented N spaces, then up to N spaces of indentation are removed from
each line of the content (if present). (If a content line is not each line of the content (if present). (If a content line is not
indented, it is preserved unchanged. If it is indented less than N indented, it is preserved unchanged. If it is indented less than N
spaces, all of the indentation is removed.) spaces, all of the indentation is removed.)
The closing code fence may be indented up to three spaces, and may be The closing code fence may be indented up to three spaces, and may be
followed only by spaces, which are ignored. If the end of the followed only by spaces, which are ignored. If the end of the
containing block (or document) is reached and no closing code fence containing block (or document) is reached and no closing code fence
skipping to change at line 1187 skipping to change at line 1188
opening code fence until the end of the containing block (or opening code fence until the end of the containing block (or
document). (An alternative spec would require backtracking in the document). (An alternative spec would require backtracking in the
event that a closing code fence is not found. But this makes parsing event that a closing code fence is not found. But this makes parsing
much less efficient, and there seems to be no real down side to the much less efficient, and there seems to be no real down side to the
behavior described here.) behavior described here.)
A fenced code block may interrupt a paragraph, and does not require A fenced code block may interrupt a paragraph, and does not require
a blank line either before or after. a blank line either before or after.
The content of a code fence is treated as literal text, not parsed The content of a code fence is treated as literal text, not parsed
as inlines. The first word of the info string is typically used to as inlines. The first word of the [info string] is typically used to
specify the language of the code sample, and rendered in the `class` specify the language of the code sample, and rendered in the `class`
attribute of the `code` tag. However, this spec does not mandate any attribute of the `code` tag. However, this spec does not mandate any
particular treatment of the info string. particular treatment of the [info string].
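The fence rules above can be sketched in Python roughly as follows. All function names are hypothetical, lines are assumed to be passed without line endings, and the sketch reads the backtick restriction on the info string literally (it applies it to every opening fence). The `language-` prefix in `language_class` follows the rendering convention used in this spec's examples rather than a requirement.

```python
import re

OPENING_FENCE = re.compile(r"( {0,3})(`{3,}|~{3,})(.*)")
CLOSING_FENCE = re.compile(r" {0,3}(`+|~+) *")


def parse_opening_fence(line):
    """Return (indent, fence_char, fence_length, info_string), or None."""
    m = OPENING_FENCE.fullmatch(line)
    if m is None:
        return None
    indent, fence, rest = m.groups()
    info = rest.strip(" ")
    if "`" in info:  # the info string may not contain backticks
        return None
    return len(indent), fence[0], len(fence), info


def is_closing_fence(line, fence_char, fence_length):
    """Same fence type, at least as long as the opener, spaces only after."""
    m = CLOSING_FENCE.fullmatch(line)
    return (m is not None and m.group(1)[0] == fence_char
            and len(m.group(1)) >= fence_length)


def strip_indent(line, n):
    """Remove up to n leading spaces from a content line."""
    k = 0
    while k < n and k < len(line) and line[k] == " ":
        k += 1
    return line[k:]


def read_fenced_block(lines):
    """Return (info_string, content_lines) for a block opening at lines[0]."""
    opening = parse_opening_fence(lines[0])
    if opening is None:
        return None
    indent, fence_char, fence_length, info = opening
    content = []
    for line in lines[1:]:
        if is_closing_fence(line, fence_char, fence_length):
            break
        content.append(strip_indent(line, indent))
    # With no closing fence, the content simply runs to the end of the input.
    return info, content


def language_class(info):
    """First word of the info string, prefixed with "language-", or None."""
    words = info.split()
    return "language-" + words[0] if words else None
```

Because an unclosed fence simply consumes the rest of the containing block, `read_fenced_block` never needs to backtrack.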
Here is a simple example with backticks: Here is a simple example with backticks:
. .
``` ```
< <
> >
``` ```
. .
<pre><code>&lt; <pre><code>&lt;
skipping to change at line 1448 skipping to change at line 1449
bar bar
~~~ ~~~
# baz # baz
. .
<h2>foo</h2> <h2>foo</h2>
<pre><code>bar <pre><code>bar
</code></pre> </code></pre>
<h1>baz</h1> <h1>baz</h1>
. .
An [info string](#info-string) can be provided after the opening code fence. An [info string] can be provided after the opening code fence.
Opening and closing spaces will be stripped, and the first word, prefixed Opening and closing spaces will be stripped, and the first word, prefixed
with `language-`, is used as the value for the `class` attribute of the with `language-`, is used as the value for the `class` attribute of the
`code` element within the enclosing `pre` element. `code` element within the enclosing `pre` element.
. .
```ruby ```ruby
def foo(x) def foo(x)
return 3 return 3
end end
``` ```
skipping to change at line 1486 skipping to change at line 1487
</code></pre> </code></pre>
. .
. .
````; ````;
```` ````
. .
<pre><code class="language-;"></code></pre> <pre><code class="language-;"></code></pre>
. .
Info strings for backtick code blocks cannot contain backticks: [Info string]s for backtick code blocks cannot contain backticks:
. .
``` aa ``` ``` aa ```
foo foo
. .
<p><code>aa</code> <p><code>aa</code>
foo</p> foo</p>
. .
Closing code fences cannot have info strings: Closing code fences cannot have [info string]s:
. .
``` ```
``` aaa ``` aaa
``` ```
. .
<pre><code>``` aaa <pre><code>``` aaa
</code></pre> </code></pre>
. .
## HTML blocks ## HTML blocks
An [HTML block tag](@html-block-tag) is An [HTML block tag](@html-block-tag) is
an [open tag](#open-tag) or [closing tag](#closing-tag) whose tag an [open tag] or [closing tag] whose tag
name is one of the following (case-insensitive): name is one of the following (case-insensitive):
`article`, `header`, `aside`, `hgroup`, `blockquote`, `hr`, `iframe`, `article`, `header`, `aside`, `hgroup`, `blockquote`, `hr`, `iframe`,
`body`, `li`, `map`, `button`, `object`, `canvas`, `ol`, `caption`, `body`, `li`, `map`, `button`, `object`, `canvas`, `ol`, `caption`,
`output`, `col`, `p`, `colgroup`, `pre`, `dd`, `progress`, `div`, `output`, `col`, `p`, `colgroup`, `pre`, `dd`, `progress`, `div`,
`section`, `dl`, `table`, `td`, `dt`, `tbody`, `embed`, `textarea`, `section`, `dl`, `table`, `td`, `dt`, `tbody`, `embed`, `textarea`,
`fieldset`, `tfoot`, `figcaption`, `th`, `figure`, `thead`, `footer`, `fieldset`, `tfoot`, `figcaption`, `th`, `figure`, `thead`, `footer`,
`tr`, `form`, `ul`, `h1`, `h2`, `h3`, `h4`, `h5`, `h6`, `video`, `tr`, `form`, `ul`, `h1`, `h2`, `h3`, `h4`, `h5`, `h6`, `video`,
`script`, `style`. `script`, `style`.
An [HTML block](@html-block) begins with an An [HTML block](@html-block) begins with an
[HTML block tag](#html-block-tag), [HTML comment](#html-comment), [HTML block tag], [HTML comment], [processing instruction],
[processing instruction](#processing-instruction), [declaration], or [CDATA section].
[declaration](#declaration), or [CDATA section](#cdata-section). It ends when a [blank line] or the end of the
It ends when a [blank line](#blank-line) or the end of the
input is encountered. The initial line may be indented up to three input is encountered. The initial line may be indented up to three
spaces, and subsequent lines may have any indentation. The contents spaces, and subsequent lines may have any indentation. The contents
of the HTML block are interpreted as raw HTML, and will not be escaped of the HTML block are interpreted as raw HTML, and will not be escaped
in HTML output. in HTML output.
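The start condition for an HTML block can be approximated with a simple prefix check. The sketch below is a simplification, not the full rule: `starts_html_block` is a hypothetical helper, it only looks at the beginning of the line rather than validating a complete [open tag] or [closing tag] as defined elsewhere in the spec, and it assumes a declaration is recognizable as `<!` followed by an uppercase ASCII letter.

```python
import re

# Tag names from the list above (matched case-insensitively).
HTML_BLOCK_TAG_NAMES = {
    "article", "header", "aside", "hgroup", "blockquote", "hr", "iframe",
    "body", "li", "map", "button", "object", "canvas", "ol", "caption",
    "output", "col", "p", "colgroup", "pre", "dd", "progress", "div",
    "section", "dl", "table", "td", "dt", "tbody", "embed", "textarea",
    "fieldset", "tfoot", "figcaption", "th", "figure", "thead", "footer",
    "tr", "form", "ul", "h1", "h2", "h3", "h4", "h5", "h6", "video",
    "script", "style",
}

# "<" or "</", a tag name, then whitespace, "/", ">", or end of line.
TAG_START = re.compile(r"</?([A-Za-z][A-Za-z0-9]*)(?=[\s/>]|$)")


def starts_html_block(line):
    """Rough check for whether a line could begin an HTML block."""
    indent = len(line) - len(line.lstrip(" "))
    if indent > 3:  # the initial line may be indented up to three spaces
        return False
    rest = line[indent:]
    # HTML comment, processing instruction, or CDATA section.
    if rest.startswith(("<!--", "<?", "<![CDATA[")):
        return True
    # Declaration (assumed form: "<!" plus an uppercase ASCII letter).
    if re.match(r"<![A-Z]", rest):
        return True
    m = TAG_START.match(rest)
    return m is not None and m.group(1).lower() in HTML_BLOCK_TAG_NAMES
```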
Some simple examples: Some simple examples:
. .
<table> <table>
<tr> <tr>
skipping to change at line 1795 skipping to change at line 1795
Moreover, blank lines are usually not necessary and can be Moreover, blank lines are usually not necessary and can be
deleted. The exception is inside `<pre>` tags; here, one can deleted. The exception is inside `<pre>` tags; here, one can
replace the blank lines with `&#10;` entities. replace the blank lines with `&#10;` entities.
So there is no important loss of expressive power with the new rule. So there is no important loss of expressive power with the new rule.
## Link reference definitions ## Link reference definitions
A [link reference definition](@link-reference-definition) A [link reference definition](@link-reference-definition)
consists of a [link label](#link-label), indented up to three spaces, followed consists of a [link label], indented up to three spaces, followed
by a colon (`:`), optional [whitespace](#whitespace) (including up to one by a colon (`:`), optional [whitespace] (including up to one
[line ending](#line-ending)), a [link destination](#link-destination), [line ending]), a [link destination],
optional [whitespace](#whitespace) (including up to one optional [whitespace] (including up to one
[line ending](#line-ending)), and an optional [link [line ending]), and an optional [link
title](#link-title), which if it is present must be separated title], which if it is present must be separated
from the [link destination](#link-destination) by [whitespace](#whitespace). from the [link destination] by [whitespace].
No further [non-space characters](#non-space-character) may occur on the line. No further [non-space character]s may occur on the line.
A [link reference-definition](#link-reference-definition) A [link reference-definition]
does not correspond to a structural element of a document. Instead, it does not correspond to a structural element of a document. Instead, it
defines a label which can be used in [reference links](#reference-link) defines a label which can be used in [reference link]s
and reference-style [images](#images) elsewhere in the document. [Link and reference-style [images] elsewhere in the document. [Link
reference definitions] can come either before or after the links that use reference definitions] can come either before or after the links that use
them. them.
. .
[foo]: /url "title" [foo]: /url "title"
[foo] [foo]
. .
<p><a href="/url" title="title">foo</a></p> <p><a href="/url" title="title">foo</a></p>
. .
skipping to change at line 1892 skipping to change at line 1892
. .
[foo] [foo]
[foo]: first [foo]: first
[foo]: second [foo]: second
. .
<p><a href="first">foo</a></p> <p><a href="first">foo</a></p>
. .
As noted in the section on [Links], matching of labels is As noted in the section on [Links], matching of labels is
case-insensitive (see [matches](#matches)). case-insensitive (see [matches]).
. .
[FOO]: /url [FOO]: /url
[Foo] [Foo]
. .
<p><a href="/url">Foo</a></p> <p><a href="/url">Foo</a></p>
. .
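The two behaviors shown in the examples above (the first of several definitions for a label wins, and labels match case-insensitively) can be modeled with a small lookup table. This is a sketch only: the function names are hypothetical, definitions are assumed to be already parsed into `(label, destination)` pairs, and `str.casefold` stands in for the full matching rule given in the [matches] definition.

```python
def collect_definitions(definitions):
    """Build a lookup table from (label, destination) pairs in document order."""
    table = {}
    for label, destination in definitions:
        # setdefault keeps the first definition and ignores later duplicates.
        table.setdefault(label.casefold(), destination)
    return table


def resolve(table, label):
    """Look up a reference label case-insensitively; None if undefined."""
    return table.get(label.casefold())
```

Because the table is filled before any reference is resolved, a definition may appear either before or after the links that use it.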
. .
skipping to change at line 1919 skipping to change at line 1919
Here is a link reference definition with no corresponding link. Here is a link reference definition with no corresponding link.
It contributes nothing to the document. It contributes nothing to the document.
. .
[foo]: /url [foo]: /url
. .
. .
This is not a link reference definition, because there are This is not a link reference definition, because there are
[non-space characters](#non-space-character) after the title: [non-space character]s after the title:
. .
[foo]: /url "title" ok [foo]: /url "title" ok
. .
<p>[foo]: /url &quot;title&quot; ok</p> <p>[foo]: /url &quot;title&quot; ok</p>
. .
This is not a link reference definition, because it is indented This is not a link reference definition, because it is indented
four spaces: four spaces:
skipping to change at line 1955 skipping to change at line 1955
[foo]: /url [foo]: /url
``` ```
[foo] [foo]
. .
<pre><code>[foo]: /url <pre><code>[foo]: /url
</code></pre> </code></pre>
<p>[foo]</p> <p>[foo]</p>
. .
A [link reference definition](#link-reference-definition) cannot A [link reference definition] cannot interrupt a paragraph.
interrupt a paragraph.
. .
Foo Foo
[bar]: /baz [bar]: /baz
[bar] [bar]
. .
<p>Foo <p>Foo
[bar]: /baz</p> [bar]: /baz</p>
<p>[bar]</p> <p>[bar]</p>
skipping to change at line 1983 skipping to change at line 1982
# [Foo] # [Foo]
[foo]: /url [foo]: /url
> bar > bar
. .
<h1><a href="/url">Foo</a></h1> <h1><a href="/url">Foo</a></h1>
<blockquote> <blockquote>
<p>bar</p> <p>bar</p>
</blockquote> </blockquote>
. .
Several [link references definitions](#link-reference-definition) Several [link reference definition]s
can occur one after another, without intervening blank lines. can occur one after another, without intervening blank lines.
. .
[foo]: /foo-url "foo" [foo]: /foo-url "foo"
[bar]: /bar-url [bar]: /bar-url
"bar" "bar"
[baz]: /baz-url [baz]: /baz-url
[foo], [foo],
[bar], [bar],
[baz] [baz]
. .
<p><a href="/foo-url" title="foo">foo</a>, <p><a href="/foo-url" title="foo">foo</a>,
<a href="/bar-url" title="bar">bar</a>, <a href="/bar-url" title="bar">bar</a>,
<a href="/baz-url">baz</a></p> <a href="/baz-url">baz</a></p>
. .
[Link reference definitions](#link-reference-definition) can occur [Link reference definition]s can occur
inside block containers, like lists and block quotations. They inside block containers, like lists and block quotations. They
affect the entire document, not just the container in which they affect the entire document, not just the container in which they
are defined: are defined:
. .
[foo] [foo]
> [foo]: /url > [foo]: /url
. .
<p><a href="/url">foo</a></p> <p><a href="/url">foo</a></p>
skipping to change at line 2023 skipping to change at line 2022
</blockquote> </blockquote>
. .
## Paragraphs ## Paragraphs
A sequence of non-blank lines that cannot be interpreted as other A sequence of non-blank lines that cannot be interpreted as other
kinds of blocks forms a [paragraph](@paragraph). kinds of blocks forms a [paragraph](@paragraph).
The contents of the paragraph are the result of parsing the The contents of the paragraph are the result of parsing the
paragraph's raw content as inlines. The paragraph's raw content paragraph's raw content as inlines. The paragraph's raw content
is formed by concatenating the lines and removing initial and final is formed by concatenating the lines and removing initial and final
spaces. [whitespace].
A simple example with two paragraphs: A simple example with two paragraphs:
. .
aaa aaa
bbb bbb
. .
<p>aaa</p> <p>aaa</p>
<p>bbb</p> <p>bbb</p>
skipping to change at line 2107 skipping to change at line 2106
aaa aaa
bbb bbb
. .
<pre><code>aaa <pre><code>aaa
</code></pre> </code></pre>
<p>bbb</p> <p>bbb</p>
. .
Final spaces are stripped before inline parsing, so a paragraph Final spaces are stripped before inline parsing, so a paragraph
that ends with two or more spaces will not end with a [hard line that ends with two or more spaces will not end with a [hard line
break](#hard-line-break): break]:
. .
aaa aaa
bbb bbb
. .
<p>aaa<br /> <p>aaa<br />
bbb</p> bbb</p>
. .
## Blank lines ## Blank lines
[Blank lines](#blank-line) between block-level elements are ignored, [Blank line]s between block-level elements are ignored,
except for the role they play in determining whether a [list](#list) except for the role they play in determining whether a [list]
is [tight](#tight) or [loose](#loose). is [tight] or [loose].
Blank lines at the beginning and end of the document are also ignored. Blank lines at the beginning and end of the document are also ignored.
. .
aaa aaa
# aaa # aaa
. .
<p>aaa</p> <p>aaa</p>
<h1>aaa</h1> <h1>aaa</h1>
. .
# Container blocks # Container blocks
A [container block](#container-block) is a block that has other A [container block] is a block that has other
blocks as its contents. There are two basic kinds of container blocks: blocks as its contents. There are two basic kinds of container blocks:
[block quotes](#block-quote) and [list items](#list-item). [block quotes] and [list items].
[Lists](#list) are meta-containers for [list items](#list-item). [Lists] are meta-containers for [list items].
We define the syntax for container blocks recursively. The general We define the syntax for container blocks recursively. The general
form of the definition is: form of the definition is:
> If X is a sequence of blocks, then the result of > If X is a sequence of blocks, then the result of
> transforming X in such-and-such a way is a container of type Y > transforming X in such-and-such a way is a container of type Y
> with these blocks as its content. > with these blocks as its content.
So, we explain what counts as a block quote or list item by explaining So, we explain what counts as a block quote or list item by explaining
how these can be *generated* from their contents. This should suffice how these can be *generated* from their contents. This should suffice
to define the syntax, although it does not give a recipe for *parsing* to define the syntax, although it does not give a recipe for *parsing*
these constructions. (A recipe is provided below in the section entitled these constructions. (A recipe is provided below in the section entitled
[A parsing strategy](#appendix-a-a-parsing-strategy).) [A parsing strategy](#appendix-a-a-parsing-strategy).)
## Block quotes ## Block quotes
A [block quote marker](@block-quote-marker) A [block quote marker](@block-quote-marker)
consists of 0-3 spaces of initial indent, plus (a) the character `>` together consists of 0-3 spaces of initial indent, plus (a) the character `>` together
with a following space, or (b) a single character `>` not followed by a space. with a following space, or (b) a single character `>` not followed by a space.
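The marker definition above can be sketched as a regular expression. This is a minimal illustration only, not part of the spec; the names `BLOCK_QUOTE_MARKER` and `strip_block_quote_marker` are hypothetical.

```python
import re

# 0-3 spaces of initial indent, then '>', then at most one following space.
# The optional space covers both case (a) and case (b) of the definition.
BLOCK_QUOTE_MARKER = re.compile(r"^ {0,3}> ?")


def strip_block_quote_marker(line):
    """Return the line with its block quote marker removed, or None
    if the line does not begin with a block quote marker."""
    match = BLOCK_QUOTE_MARKER.match(line)
    if match is None:
        return None
    return line[match.end():]
```

Because the marker consumes one space after the `>`, a line such as `>     code` (five spaces) keeps four spaces of indentation once the marker is stripped.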
The following rules define [block quotes](@block-quote): The following rules define [block quotes]:
1. **Basic case.** If a string of lines *Ls* constitute a sequence 1. **Basic case.** If a string of lines *Ls* constitute a sequence
of blocks *Bs*, then the result of prepending a [block quote of blocks *Bs*, then the result of prepending a [block quote
marker](#block-quote-marker) to the beginning of each line in *Ls* marker] to the beginning of each line in *Ls*
is a [block quote](#block-quote) containing *Bs*. is a [block quote](#block-quotes) containing *Bs*.
2. **Laziness.** If a string of lines *Ls* constitute a [block 2. **Laziness.** If a string of lines *Ls* constitute a [block
quote](#block-quote) with contents *Bs*, then the result of deleting quote](#block-quotes) with contents *Bs*, then the result of deleting
the initial [block quote marker](#block-quote-marker) from one or the initial [block quote marker] from one or
more lines in which the next more lines in which the next [non-space character] after the [block
[non-space character](#non-space-character) after the [block quote marker] is [paragraph continuation
quote marker](#block-quote-marker) is [paragraph continuation text] is a block quote with *Bs* as its content.
text](#paragraph-continuation-text) is a block quote with *Bs* as
its content.
[Paragraph continuation text](@paragraph-continuation-text) is text [Paragraph continuation text](@paragraph-continuation-text) is text
that will be parsed as part of the content of a paragraph, but does that will be parsed as part of the content of a paragraph, but does
not occur at the beginning of the paragraph. not occur at the beginning of the paragraph.
3. **Consecutiveness.** A document cannot contain two [block 3. **Consecutiveness.** A document cannot contain two [block
quotes](#block-quote) in a row unless there is a [blank quotes] in a row unless there is a [blank line] between them.
line](#blank-line) between them.
Nothing else counts as a [block quote](#block-quote). Nothing else counts as a [block quote](#block-quotes).
Here is a simple example: Here is a simple example:
. .
> # Foo > # Foo
> bar > bar
> baz > baz
. .
<blockquote> <blockquote>
<h1>Foo</h1> <h1>Foo</h1>
skipping to change at line 2499 skipping to change at line 2495
<blockquote> <blockquote>
<p>foo <p>foo
bar bar
baz</p> baz</p>
</blockquote> </blockquote>
</blockquote> </blockquote>
</blockquote> </blockquote>
. .
When including an indented code block in a block quote, When including an indented code block in a block quote,
remember that the [block quote marker](#block-quote-marker) includes remember that the [block quote marker] includes
both the `>` and a following space. So *five spaces* are needed after both the `>` and a following space. So *five spaces* are needed after
the `>`: the `>`:
. .
> code > code
> not code > not code
. .
<blockquote> <blockquote>
<pre><code>code <pre><code>code
</code></pre> </code></pre>
</blockquote> </blockquote>
<blockquote> <blockquote>
<p>not code</p> <p>not code</p>
</blockquote> </blockquote>
. .
## List items ## List items
A [list marker](@list-marker) is a A [list marker](@list-marker) is a
[bullet list marker](#bullet-list-marker) or an [ordered list [bullet list marker] or an [ordered list marker].
marker](#ordered-list-marker).
A [bullet list marker](@bullet-list-marker) A [bullet list marker](@bullet-list-marker)
is a `-`, `+`, or `*` character. is a `-`, `+`, or `*` character.
An [ordered list marker](@ordered-list-marker) An [ordered list marker](@ordered-list-marker)
is a sequence of one or more digits (`0-9`), followed by either a is a sequence of one or more digits (`0-9`), followed by either a
`.` character or a `)` character. `.` character or a `)` character.
The following rules define [list items](@list-item): The following rules define [list items]:
1. **Basic case.** If a sequence of lines *Ls* constitute a sequence of 1. **Basic case.** If a sequence of lines *Ls* constitute a sequence of
blocks *Bs* starting with a [non-space character](#non-space-character) blocks *Bs* starting with a [non-space character] and not separated
and not separated
from each other by more than one blank line, and *M* is a list from each other by more than one blank line, and *M* is a list
marker of width *W* followed by 0 < *N* < 5 spaces, then the result marker of width *W* followed by 0 < *N* < 5 spaces, then the result
of prepending *M* and the following spaces to the first line of of prepending *M* and the following spaces to the first line of
*Ls*, and indenting subsequent lines of *Ls* by *W + N* spaces, is a *Ls*, and indenting subsequent lines of *Ls* by *W + N* spaces, is a
list item with *Bs* as its contents. The type of the list item list item with *Bs* as its contents. The type of the list item
(bullet or ordered) is determined by the type of its list marker. (bullet or ordered) is determined by the type of its list marker.
If the list item is ordered, then it is also assigned a start If the list item is ordered, then it is also assigned a start
number, based on the ordered list marker. number, based on the ordered list marker.
For example, let *Ls* be the lines For example, let *Ls* be the lines
skipping to change at line 2592 skipping to change at line 2586
<p>A block quote.</p> <p>A block quote.</p>
</blockquote> </blockquote>
</li> </li>
</ol> </ol>
. .
The most important thing to notice is that the position of The most important thing to notice is that the position of
the text after the list marker determines how much indentation the text after the list marker determines how much indentation
is needed in subsequent blocks in the list item. If the list is needed in subsequent blocks in the list item. If the list
marker takes up two spaces, and there are three spaces between marker takes up two spaces, and there are three spaces between
the list marker and the next nonspace character, then blocks the list marker and the next [non-space character], then blocks
must be indented five spaces in order to fall under the list must be indented five spaces in order to fall under the list
item. item.
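As a rough sketch of this arithmetic (an illustration, not a parsing recipe), the required indentation is the marker width *W* plus the number of following spaces *N*. The function name `required_indent` is hypothetical. The sketch covers only the basic case in which the marker is followed by one to four spaces and then a non-space character, and it ignores any indentation before the marker.

```python
import re

# A list marker (bullet or ordered), then 1-4 spaces, then a non-space character.
LIST_ITEM_START = re.compile(r"([-+*]|[0-9]+[.)])( {1,4})(?=\S)")


def required_indent(first_line):
    """Return W + N for the first line of a list item, or None if the
    line does not start a list item in the basic case."""
    match = LIST_ITEM_START.match(first_line)
    if match is None:
        return None
    marker_width = len(match.group(1))      # W
    spaces_after = len(match.group(2))      # N
    return marker_width + spaces_after
```

For a two-character marker such as `1.` followed by three spaces, the result is five, matching the description above.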
Here are some examples showing how far content must be indented to be Here are some examples showing how far content must be indented to be
put under the list item: put under the list item:
. .
- one - one
two two
skipping to change at line 2649 skipping to change at line 2643
. .
<ul> <ul>
<li> <li>
<p>one</p> <p>one</p>
<p>two</p> <p>two</p>
</li> </li>
</ul> </ul>
. .
It is tempting to think of this in terms of columns: the continuation It is tempting to think of this in terms of columns: the continuation
blocks must be indented at least to the column of the first nonspace blocks must be indented at least to the column of the first
character after the list marker. However, that is not quite right. [non-space character] after the list marker. However, that is not quite right.
The spaces after the list marker determine how much relative indentation The spaces after the list marker determine how much relative indentation
is needed. Which column this indentation reaches will depend on is needed. Which column this indentation reaches will depend on
how the list item is embedded in other constructions, as shown by how the list item is embedded in other constructions, as shown by
this example: this example:
. .
> > 1. one > > 1. one
>> >>
>> two >> two
. .
skipping to change at line 2699 skipping to change at line 2693
<ul> <ul>
<li>one</li> <li>one</li>
</ul> </ul>
<p>two</p> <p>two</p>
</blockquote> </blockquote>
</blockquote> </blockquote>
. .
A list item may not contain blocks that are separated by more than A list item may not contain blocks that are separated by more than
one blank line. Thus, two blank lines will end a list, unless the one blank line. Thus, two blank lines will end a list, unless the
two blanks are contained in a [fenced code block](#fenced-code-block). two blanks are contained in a [fenced code block].
. .
- foo - foo
bar bar
- foo - foo
bar bar
skipping to change at line 2885 skipping to change at line 2879
<pre><code> indented code <pre><code> indented code
</code></pre> </code></pre>
<p>paragraph</p> <p>paragraph</p>
<pre><code>more code <pre><code>more code
</code></pre> </code></pre>
</li> </li>
</ol> </ol>
. .
Note that rules #1 and #2 only apply to two cases: (a) cases Note that rules #1 and #2 only apply to two cases: (a) cases
in which the lines to be included in a list item begin with a nonspace in which the lines to be included in a list item begin with a
character, and (b) cases in which they begin with an indented code [non-space character], and (b) cases in which
they begin with an indented code
block. In a case like the following, where the first block begins with block. In a case like the following, where the first block begins with
a three-space indent, the rules do not allow us to form a list item by a three-space indent, the rules do not allow us to form a list item by
indenting the whole thing and prepending a list marker: indenting the whole thing and prepending a list marker:
. .
foo foo
bar bar
. .
<p>foo</p> <p>foo</p>
skipping to change at line 2929 skipping to change at line 2924
bar bar
. .
<ul> <ul>
<li> <li>
<p>foo</p> <p>foo</p>
<p>bar</p> <p>bar</p>
</li> </li>
</ul> </ul>
. .
3. **Empty list item.** A [list marker](#list-marker) followed by a 3. **Empty list item.** A [list marker] followed by a
line containing only [whitespace](#whitespace) is a list item with line containing only [whitespace] is a list item with no contents.
no contents.
Here is an empty bullet list item: Here is an empty bullet list item:
. .
- foo - foo
- -
- bar - bar
. .
<ul> <ul>
<li>foo</li> <li>foo</li>
<li></li> <li></li>
<li>bar</li> <li>bar</li>
</ul> </ul>
. .
It does not matter whether there are spaces following the It does not matter whether there are spaces following the [list marker]:
[list marker](#list-marker):
. .
- foo - foo
- -
- bar - bar
. .
<ul> <ul>
<li>foo</li> <li>foo</li>
<li></li> <li></li>
<li>bar</li> <li>bar</li>
skipping to change at line 3081 skipping to change at line 3074
<pre><code>1. A paragraph <pre><code>1. A paragraph
with two lines. with two lines.
indented code indented code
&gt; A block quote. &gt; A block quote.
</code></pre> </code></pre>
. .
5. **Laziness.** If a string of lines *Ls* constitute a [list 5. **Laziness.** If a string of lines *Ls* constitute a [list
item](#list-item) with contents *Bs*, then the result of deleting item](#list-items) with contents *Bs*, then the result of deleting
some or all of the indentation from one or more lines in which the some or all of the indentation from one or more lines in which the
next [non-space character](#non-space-character) after the indentation is next [non-space character] after the indentation is
[paragraph continuation text](#paragraph-continuation-text) is a [paragraph continuation text] is a
list item with the same contents and attributes. The unindented list item with the same contents and attributes. The unindented
lines are called lines are called
[lazy continuation lines](@lazy-continuation-line). [lazy continuation line](@lazy-continuation-line)s.
Here is an example with [lazy continuation Here is an example with [lazy continuation line]s:
lines](#lazy-continuation-line):
. .
1. A paragraph 1. A paragraph
with two lines. with two lines.
indented code indented code
> A block quote. > A block quote.
. .
<ol> <ol>
skipping to change at line 3160 skipping to change at line 3152
<blockquote> <blockquote>
<p>Blockquote <p>Blockquote
continued here.</p> continued here.</p>
</blockquote> </blockquote>
</li> </li>
</ol> </ol>
</blockquote> </blockquote>
. .
6. **That's all.** Nothing that is not counted as a list item by rules 6. **That's all.** Nothing that is not counted as a list item by rules
#1--5 counts as a [list item](#list-item). #1--5 counts as a [list item](#list-items).
The rules for sublists follow from the general rules above. A sublist The rules for sublists follow from the general rules above. A sublist
must be indented the same number of spaces a paragraph would need to be must be indented the same number of spaces a paragraph would need to be
in order to be included in the list item. in order to be included in the list item.
So, in this case we need two spaces indent: So, in this case we need two spaces indent:
. .
- foo - foo
- bar - bar
skipping to change at line 3466 skipping to change at line 3458
with indented code. How much indentation is required in that case, since with indented code. How much indentation is required in that case, since
we don't have a "first paragraph" to measure from? Rule #2 simply stipulates we don't have a "first paragraph" to measure from? Rule #2 simply stipulates
that in such cases, we require one space indentation from the list marker that in such cases, we require one space indentation from the list marker
(and then the normal four spaces for the indented code). This will match the (and then the normal four spaces for the indented code). This will match the
four-space rule in cases where the list marker plus its initial indentation four-space rule in cases where the list marker plus its initial indentation
takes four spaces (a common case), but diverge in other cases. takes four spaces (a common case), but diverge in other cases.
## Lists ## Lists
A [list](@list) is a sequence of one or more A [list](@list) is a sequence of one or more
list items [of the same type](#of-the-same-type). The list items list items [of the same type]. The list items
may be separated by single [blank lines](#blank-line), but two may be separated by single [blank lines], but two
blank lines end all containing lists. blank lines end all containing lists.
Two list items are [of the same type](@of-the-same-type) Two list items are [of the same type](@of-the-same-type)
if they begin with a [list if they begin with a [list marker] of the same type.
marker](#list-marker) of the same type. Two list markers are of the Two list markers are of the
same type if (a) they are bullet list markers using the same character same type if (a) they are bullet list markers using the same character
(`-`, `+`, or `*`) or (b) they are ordered list numbers with the same (`-`, `+`, or `*`) or (b) they are ordered list numbers with the same
delimiter (either `.` or `)`). delimiter (either `.` or `)`).
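The comparison can be sketched as follows. This is a minimal illustration; `marker_type` and `of_the_same_type` are hypothetical helper names, and the markers are assumed to be passed in as bare strings such as `-` or `12.`.

```python
import re


def marker_type(marker):
    """Return a (kind, distinguishing character) pair for a list marker,
    or None if the string is not a list marker."""
    if marker in ("-", "+", "*"):
        return ("bullet", marker)
    match = re.fullmatch(r"[0-9]+([.)])", marker)
    if match:
        # Only the delimiter matters for ordered markers, not the number.
        return ("ordered", match.group(1))
    return None


def of_the_same_type(marker_a, marker_b):
    """True if both strings are list markers of the same type."""
    type_a = marker_type(marker_a)
    return type_a is not None and type_a == marker_type(marker_b)
```

Under this sketch `1.` and `7.` are of the same type, while `1.` and `1)` are not, and neither are `-` and `*`.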
A list is an [ordered list](@ordered-list) A list is an [ordered list](@ordered-list)
if its constituent list items begin with if its constituent list items begin with
[ordered list markers](#ordered-list-marker), and a [bullet [ordered list marker]s, and a
list](@bullet-list) if its constituent list [bullet list](@bullet-list) if its constituent list
items begin with [bullet list markers](#bullet-list-marker). items begin with [bullet list marker]s.
The [start number](@start-number) The [start number](@start-number)
of an [ordered list](#ordered-list) is determined by the list number of of an [ordered list] is determined by the list number of
its initial list item. The numbers of subsequent list items are its initial list item. The numbers of subsequent list items are
disregarded. disregarded.
A list is [loose](@loose) if any of its constituent A list is [loose](@loose) if any of its constituent
list items are separated by blank lines, or if any of its constituent list items are separated by blank lines, or if any of its constituent
list items directly contain two block-level elements with a blank line list items directly contain two block-level elements with a blank line
between them. Otherwise a list is [tight](@tight). between them. Otherwise a list is [tight](@tight).
(The difference in HTML output is that paragraphs in a loose list are (The difference in HTML output is that paragraphs in a loose list are
wrapped in `<p>` tags, while paragraphs in a tight list are not.) wrapped in `<p>` tags, while paragraphs in a tight list are not.)
skipping to change at line 3574 skipping to change at line 3566
- a coat - a coat
- a plane ticket - a plane ticket
Second, we are attracted to a Second, we are attracted to a
> [principle of uniformity](@principle-of-uniformity): > [principle of uniformity](@principle-of-uniformity):
> if a chunk of text has a certain > if a chunk of text has a certain
> meaning, it will continue to have the same meaning when put into a > meaning, it will continue to have the same meaning when put into a
> container block (such as a list item or blockquote). > container block (such as a list item or blockquote).
(Indeed, the spec for [list items](#list-item) and (Indeed, the spec for [list items] and [block quotes] presupposes
[blockquotes](#block-quotes) presupposes this principle.) this principle.) This principle implies that if
This principle implies that if
* I need to buy * I need to buy
- new shoes - new shoes
- a coat - a coat
- a plane ticket - a plane ticket
is a list item containing a paragraph followed by a nested sublist, is a list item containing a paragraph followed by a nested sublist,
as all Markdown implementations agree it is (though the paragraph as all Markdown implementations agree it is (though the paragraph
may be rendered without `<p>` tags, since the list is "tight"), may be rendered without `<p>` tags, since the list is "tight"),
then then
I need to buy I need to buy
- new shoes - new shoes
- a coat - a coat
- a plane ticket - a plane ticket
by itself should be a paragraph followed by a nested sublist. by itself should be a paragraph followed by a nested sublist.
Our adherence to the [principle of uniformity](#principle-of-uniformity) Our adherence to the [principle of uniformity]
thus inclines us to think that there are two coherent packages: thus inclines us to think that there are two coherent packages:
1. Require blank lines before *all* lists and blockquotes, 1. Require blank lines before *all* lists and blockquotes,
including lists that occur as sublists inside other list items. including lists that occur as sublists inside other list items.
2. Require blank lines in none of these places. 2. Require blank lines in none of these places.
[reStructuredText](http://docutils.sourceforge.net/rst.html) takes [reStructuredText](http://docutils.sourceforge.net/rst.html) takes
the first approach, for which there is much to be said. But the second the first approach, for which there is much to be said. But the second
seems more consistent with established practice with Markdown. seems more consistent with established practice with Markdown.
skipping to change at line 3630 skipping to change at line 3621
</li> </li>
<li> <li>
<p>bar</p> <p>bar</p>
</li> </li>
</ul> </ul>
<ul> <ul>
<li>baz</li> <li>baz</li>
</ul> </ul>
. .
As illustrated above in the section on [list items](#list-item), As illustrated above in the section on [list items],
two blank lines between blocks *within* a list item will also end a two blank lines between blocks *within* a list item will also end a
list: list:
. .
- foo - foo
bar bar
- baz - baz
. .
<ul> <ul>
skipping to change at line 4049 skipping to change at line 4040
. .
If a backslash is itself escaped, the following character is not: If a backslash is itself escaped, the following character is not:
. .
\\*emphasis* \\*emphasis*
. .
<p>\<em>emphasis</em></p> <p>\<em>emphasis</em></p>
. .
A backslash at the end of the line is a [hard line A backslash at the end of the line is a [hard line break]:
break](#hard-line-break):
. .
foo\ foo\
bar bar
. .
<p>foo<br /> <p>foo<br />
bar</p> bar</p>
. .
Backslash escapes do not work in code blocks, code spans, autolinks, or Backslash escapes do not work in code blocks, code spans, autolinks, or
skipping to change at line 4098 skipping to change at line 4088
<p><a href="http://example.com?find=%5C*">http://example.com?find=\*</a></p> <p><a href="http://example.com?find=%5C*">http://example.com?find=\*</a></p>
. .
. .
<a href="/bar\/)"> <a href="/bar\/)">
. .
<p><a href="/bar\/)"></p> <p><a href="/bar\/)"></p>
. .
But they work in all other contexts, including URLs and link titles, But they work in all other contexts, including URLs and link titles,
link references, and info strings in [fenced code link references, and [info string]s in [fenced code block]s:
blocks](#fenced-code-block):
. .
[foo](/bar\* "ti\*tle") [foo](/bar\* "ti\*tle")
. .
<p><a href="/bar*" title="ti*tle">foo</a></p> <p><a href="/bar*" title="ti*tle">foo</a></p>
. .
. .
[foo] [foo]
skipping to change at line 4127 skipping to change at line 4116
foo foo
``` ```
. .
<pre><code class="language-foo+bar">foo <pre><code class="language-foo+bar">foo
</code></pre> </code></pre>
. .
## Entities ## Entities
With the goal of making this standard as HTML-agnostic as possible, all With the goal of making this standard as HTML-agnostic as possible, all
valid HTML entities in any context are recognized as such and valid HTML entities (except in code blocks and code spans)
converted into unicode characters before they are stored in the AST. are recognized as such and converted into unicode characters before
they are stored in the AST. This means that renderers to formats other
This allows implementations that target HTML output to trivially escape than HTML need not be HTML-entity aware. HTML renderers may either escape
the entities when generating HTML, and simplifies the job of unicode characters as entities or leave them as they are. (However,
implementations targeting other languages, as these will only need to `"`, `&`, `<`, and `>` must always be rendered as entities.)
handle the unicode chars and need not be HTML-entity aware.
[Named entities](@name-entities) consist of `&` [Named entities](@name-entities) consist of `&`
+ any of the valid HTML5 entity names + `;`. The + any of the valid HTML5 entity names + `;`. The
[following document](https://html.spec.whatwg.org/multipage/entities.json) [following document](https://html.spec.whatwg.org/multipage/entities.json)
is used as an authoritative source of the valid entity names and their is used as an authoritative source of the valid entity names and their
corresponding codepoints. corresponding codepoints.
Conforming implementations that target HTML don't need to generate
entities for all the valid named entities that exist, with the exception
of `"` (`&quot;`), `&` (`&amp;`), `<` (`&lt;`) and `>` (`&gt;`), which
always need to be written as entities for security reasons.
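A possible sketch of named-entity replacement uses the `html5` table from Python's standard `html.entities` module, which is assumed here to track the HTML5 entity list referenced above. The function name `replace_named_entities` is hypothetical. The sketch handles named entities only and does not exclude code spans or code blocks.

```python
import re
from html.entities import html5

# '&' + an entity name + ';'. The lookup key keeps the trailing ';'
# because that is how the html5 table stores its names.
NAMED_ENTITY = re.compile(r"&([A-Za-z][A-Za-z0-9]*;)")


def replace_named_entities(text):
    """Replace valid named entities with their unicode characters.
    Strings that are not known entity names are left untouched."""
    def substitute(match):
        return html5.get(match.group(1), match.group(0))
    return NAMED_ENTITY.sub(substitute, text)
```

A renderer that targets HTML would still have to escape `"`, `&`, `<`, and `>` on output.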
. .
&nbsp; &amp; &copy; &AElig; &Dcaron; &frac34; &HilbertSpace; &DifferentialD; &ClockwiseContourIntegral; &nbsp; &amp; &copy; &AElig; &Dcaron; &frac34; &HilbertSpace; &DifferentialD; &ClockwiseContourIntegral;
. .
<p>  &amp; © Æ Ď ¾ ℋ ⅆ ∲</p> <p>  &amp; © Æ Ď ¾ ℋ ⅆ ∲</p>
. .
[Decimal entities](@decimal-entities) [Decimal entities](@decimal-entities)
consist of `&#` + a string of 1--8 arabic digits + `;`. Again, these consist of `&#` + a string of 1--8 arabic digits + `;`. Again, these
entities need to be recognised and transformed into their corresponding entities need to be recognised and transformed into their corresponding
Unicode codepoints. Invalid Unicode codepoints will be written as the Unicode codepoints. Invalid Unicode codepoints will be written as the
skipping to change at line 4202 skipping to change at line 4185
Strings that are not on the list of HTML5 named entities are not Strings that are not on the list of HTML5 named entities are not
recognized as entities either: recognized as entities either:
. .
&MadeUpEntity; &MadeUpEntity;
. .
<p>&amp;MadeUpEntity;</p> <p>&amp;MadeUpEntity;</p>
. .
Entities are recognized in any context besides code spans or Entities are recognized in any context besides code spans or
code blocks, including raw HTML, URLs, [link titles](#link-title), and code blocks, including raw HTML, URLs, [link title]s, and
[fenced code block](#fenced-code-block) info strings: [fenced code block] [info string]s:
. .
<a href="&ouml;&ouml;.html"> <a href="&ouml;&ouml;.html">
. .
<p><a href="&ouml;&ouml;.html"></p> <p><a href="&ouml;&ouml;.html"></p>
. .
. .
[foo](/f&ouml;&ouml; "f&ouml;&ouml;") [foo](/f&ouml;&ouml; "f&ouml;&ouml;")
. .
skipping to change at line 4249 skipping to change at line 4232
<p><code>f&amp;ouml;&amp;ouml;</code></p> <p><code>f&amp;ouml;&amp;ouml;</code></p>
. .
. .
f&ouml;f&ouml; f&ouml;f&ouml;
. .
<pre><code>f&amp;ouml;f&amp;ouml; <pre><code>f&amp;ouml;f&amp;ouml;
</code></pre> </code></pre>
. .
## Code span ## Code spans
A [backtick string](@backtick-string) A [backtick string](@backtick-string)
is a string of one or more backtick characters (`` ` ``) that is neither is a string of one or more backtick characters (`` ` ``) that is neither
preceded nor followed by a backtick. preceded nor followed by a backtick.
A [code span](@code-span) begins with a backtick string and ends with A [code span](@code-span) begins with a backtick string and ends with
a backtick string of equal length. The contents of the code span are a backtick string of equal length. The contents of the code span are
the characters between the two backtick strings, with leading and the characters between the two backtick strings, with leading and
trailing spaces and [line endings](#line-ending) removed, and trailing spaces and [line ending]s removed, and
[whitespace](#whitespace) collapsed to single spaces. [whitespace] collapsed to single spaces.
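The content normalization can be sketched as below. This is an illustration only; `normalize_code_span` is a hypothetical name, and the set of characters treated as whitespace (space, tab, carriage return, newline, line tabulation, form feed) is an assumption, since whitespace is defined elsewhere in the spec.

```python
import re


def normalize_code_span(raw):
    """Normalize the raw text found between two matching backtick strings."""
    # Remove leading and trailing spaces and line endings.
    stripped = raw.strip(" \r\n")
    # Collapse each remaining run of whitespace to a single space.
    return re.sub(r"[ \t\r\n\v\f]+", " ", stripped)
```

Stripping the outer spaces is what makes it possible to write a code span whose content begins or ends with a backtick.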
This is a simple code span: This is a simple code span:
. .
`foo` `foo`
. .
<p><code>foo</code></p> <p><code>foo</code></p>
. .
Here two backticks are used, because the code contains a backtick. Here two backticks are used, because the code contains a backtick.
skipping to change at line 4287 skipping to change at line 4270
This example shows the motivation for stripping leading and trailing This example shows the motivation for stripping leading and trailing
spaces: spaces:
. .
` `` ` ` `` `
. .
<p><code>``</code></p> <p><code>``</code></p>
. .
[Line endings](#line-ending) are treated like spaces: [Line ending]s are treated like spaces:
. .
`` ``
foo foo
`` ``
. .
<p><code>foo</code></p> <p><code>foo</code></p>
. .
Interior spaces and [line endings](#line-ending) are collapsed into Interior spaces and [line ending]s are collapsed into
single spaces, just as they would be by a browser: single spaces, just as they would be by a browser:
. .
`foo bar `foo bar
baz` baz`
. .
<p><code>foo bar baz</code></p> <p><code>foo bar baz</code></p>
. .
Q: Why not just leave the spaces, since browsers will collapse them Q: Why not just leave the spaces, since browsers will collapse them
anyway? A: Because we might be targeting a non-HTML format, and we anyway? A: Because we might be targeting a non-HTML format, and we
shouldn't rely on HTML-specific rendering assumptions. shouldn't rely on HTML-specific rendering assumptions.
(Existing implementations differ in their treatment of internal (Existing implementations differ in their treatment of internal
spaces and [line endings](#line-ending). Some, including `Markdown.pl` and spaces and [line ending]s. Some, including `Markdown.pl` and
`showdown`, convert an internal [line ending](#line-ending) into a `showdown`, convert an internal [line ending] into a
`<br />` tag. But this makes things difficult for those who like to `<br />` tag. But this makes things difficult for those who like to
hard-wrap their paragraphs, since a line break in the midst of a code hard-wrap their paragraphs, since a line break in the midst of a code
span will cause an unintended line break in the output. Others just span will cause an unintended line break in the output. Others just
leave internal spaces as they are, which is fine if only HTML is being leave internal spaces as they are, which is fine if only HTML is being
targeted.) targeted.)
. .
`foo `` bar` `foo `` bar`
. .
<p><code>foo `` bar</code></p> <p><code>foo `` bar</code></p>
skipping to change at line 4358 skipping to change at line 4341
. .
And this is not parsed as a link: And this is not parsed as a link:
. .
[not a `link](/foo`) [not a `link](/foo`)
. .
<p>[not a <code>link](/foo</code>)</p> <p>[not a <code>link](/foo</code>)</p>
. .
But this is a link: Code spans, HTML tags, and autolinks have the same precedence.
Thus, this is code:
. .
<http://foo.bar.`baz>` `<a href="`">`
. .
<p><a href="http://foo.bar.%60baz">http://foo.bar.`baz</a&gt;`</p> <p><code>&lt;a href=&quot;</code>&quot;&gt;`</p>
. .
And this is an HTML tag: But this is an HTML tag:
. .
<a href="`">` <a href="`">`
. .
<p><a href="`">`</p> <p><a href="`">`</p>
. .
And this is code:
.
`<http://foo.bar.`baz>`
.
<p><code>&lt;http://foo.bar.</code>baz&gt;`</p>
.
But this is an autolink:
.
<http://foo.bar.`baz>`
.
<p><a href="http://foo.bar.%60baz">http://foo.bar.`baz</a>`</p>
.
When a backtick string is not closed by a matching backtick string, When a backtick string is not closed by a matching backtick string,
we just have literal backticks: we just have literal backticks:
. .
```foo`` ```foo``
. .
<p>```foo``</p> <p>```foo``</p>
. .
. .
skipping to change at line 4441 skipping to change at line 4441
The rules given below capture all of these patterns, while allowing The rules given below capture all of these patterns, while allowing
for efficient parsing strategies that do not backtrack. for efficient parsing strategies that do not backtrack.
First, some definitions. A [delimiter run](@delimiter-run) is either First, some definitions. A [delimiter run](@delimiter-run) is either
a sequence of one or more `*` characters that is not preceded or a sequence of one or more `*` characters that is not preceded or
followed by a `*` character, or a sequence of one or more `_` followed by a `*` character, or a sequence of one or more `_`
characters that is not preceded or followed by a `_` character. characters that is not preceded or followed by a `_` character.
A [left-flanking delimiter run](@left-flanking-delimiter-run) is A [left-flanking delimiter run](@left-flanking-delimiter-run) is
a [delimiter run](#delimiter-run) that is (a) not followed by [unicode a [delimiter run] that is (a) not followed by [unicode whitespace],
whitespace](#unicode-whitespace), and (b) either not followed by a and (b) either not followed by a [punctuation character], or
[punctuation character](#punctuation-character), or preceded by [unicode whitespace] or a [punctuation character].
preceded by [unicode whitespace](#unicode-whitespace) or
a [punctuation character](#punctuation-character).
A [right-flanking delimiter run](@right-flanking-delimiter-run) is A [right-flanking delimiter run](@right-flanking-delimiter-run) is
a [delimiter run](#delimiter-run) that is (a) not preceded by [unicode a [delimiter run] that is (a) not preceded by [unicode whitespace],
whitespace](#unicode-whitespace), and (b) either not preceded by a and (b) either not preceded by a [punctuation character], or
[punctuation character](#punctuation-character), or followed by [unicode whitespace] or a [punctuation character].
followed by [unicode whitespace](#unicode-whitespace) or
a [punctuation character](#punctuation-character).
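The two definitions above can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than a normative algorithm: the helper names are hypothetical, `str.isspace()` stands in for [unicode whitespace], ASCII punctuation plus the Unicode `P*` categories stand in for [punctuation character], and the start and end of the text are treated as whitespace.

```python
import re
import string
import unicodedata

# Greedy matching yields maximal runs, so a run is never preceded or
# followed by the same delimiter character.
DELIMITER_RUN = re.compile(r"\*+|_+")

def is_whitespace(ch):
    # Assumption: None (start or end of text) counts as whitespace.
    return ch is None or ch.isspace()

def is_punctuation(ch):
    # Approximation: ASCII punctuation or any Unicode P* category.
    if ch is None:
        return False
    return ch in string.punctuation or unicodedata.category(ch).startswith("P")

def classify_runs(text):
    """Return (run, left_flanking, right_flanking) for each delimiter run."""
    results = []
    for m in DELIMITER_RUN.finditer(text):
        before = text[m.start() - 1] if m.start() > 0 else None
        after = text[m.end()] if m.end() < len(text) else None
        left = not is_whitespace(after) and (
            not is_punctuation(after)
            or is_whitespace(before)
            or is_punctuation(before)
        )
        right = not is_whitespace(before) and (
            not is_punctuation(before)
            or is_whitespace(after)
            or is_punctuation(after)
        )
        results.append((m.group(), left, right))
    return results
```

Each run is classified only from the single character before it and the single character after it, which is what allows parsing strategies that do not backtrack.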
Here are some examples of delimiter runs. Here are some examples of delimiter runs.
- left-flanking but not right-flanking: - left-flanking but not right-flanking:
``` ```
***abc ***abc
_abc _abc
**"abc" **"abc"
_"abc" _"abc"
skipping to change at line 4499 skipping to change at line 4495
delimiter runs based on the character before and the character delimiter runs based on the character before and the character
after comes from Roopesh Chander's after comes from Roopesh Chander's
[vfmd](http://www.vfmd.org/vfmd-spec/specification/#procedure-for-identifying-emphasis-tags). [vfmd](http://www.vfmd.org/vfmd-spec/specification/#procedure-for-identifying-emphasis-tags).
vfmd uses the terminology "emphasis indicator string" instead of "delimiter vfmd uses the terminology "emphasis indicator string" instead of "delimiter
run," and its rules for distinguishing left- and right-flanking runs run," and its rules for distinguishing left- and right-flanking runs
are a bit more complex than the ones given here.) are a bit more complex than the ones given here.)
The following rules define emphasis and strong emphasis: The following rules define emphasis and strong emphasis:
1. A single `*` character [can open emphasis](@can-open-emphasis) 1. A single `*` character [can open emphasis](@can-open-emphasis)
iff it is part of a iff it is part of a [left-flanking delimiter run].
[left-flanking delimiter run](#left-flanking-delimiter-run).
2. A single `_` character [can open emphasis](#can-open-emphasis) iff 2. A single `_` character [can open emphasis] iff
it is part of a it is part of a [left-flanking delimiter run]
[left-flanking delimiter run](#left-flanking-delimiter-run)
and is not preceded by an ASCII alphanumeric character. and is not preceded by an ASCII alphanumeric character.
3. A single `*` character [can close emphasis](@can-close-emphasis) 3. A single `*` character [can close emphasis](@can-close-emphasis)
iff it is part of a iff it is part of a [right-flanking delimiter run].
[right-flanking delimiter run](#right-flanking-delimiter-run).
4. A single `_` character [can close emphasis](#can-close-emphasis) 4. A single `_` character [can close emphasis]
iff it is part of a iff it is part of a [right-flanking delimiter run].
[right-flanking delimiter run](#right-flanking-delimiter-run).
and it is not followed by an ASCII alphanumeric character. and it is not followed by an ASCII alphanumeric character.
5. A double `**` [can open strong emphasis](@can-open-strong-emphasis) 5. A double `**` [can open strong emphasis](@can-open-strong-emphasis)
iff it is part of a iff it is part of a [left-flanking delimiter run].
[left-flanking delimiter run](#left-flanking-delimiter-run).
6. A double `__` [can open strong emphasis](#can-open-strong-emphasis) 6. A double `__` [can open strong emphasis]
iff it is part of a iff it is part of a [left-flanking delimiter run]
[left-flanking delimiter run](#left-flanking-delimiter-run)
and is not preceded by an ASCII alphanumeric character. and is not preceded by an ASCII alphanumeric character.
7. A double `**` [can close strong emphasis](@can-close-strong-emphasis) 7. A double `**` [can close strong emphasis](@can-close-strong-emphasis)
iff it is part of a iff it is part of a [right-flanking delimiter run].
[right-flanking delimiter run](#right-flanking-delimiter-run).
8. A double `__` [can close strong emphasis](#can-close-strong-emphasis) 8. A double `__` [can close strong emphasis]
iff it is part of a iff it is part of a [right-flanking delimiter run]
[right-flanking delimiter run](#right-flanking-delimiter-run).
and is not followed by an ASCII alphanumeric character. and is not followed by an ASCII alphanumeric character.
9. Emphasis begins with a delimiter that [can open 9. Emphasis begins with a delimiter that [can open emphasis] and ends
emphasis](#can-open-emphasis) and ends with a delimiter that [can close with a delimiter that [can close emphasis], and that uses the same
emphasis](#can-close-emphasis), and that uses the same
character (`_` or `*`) as the opening delimiter. There must character (`_` or `*`) as the opening delimiter. There must
be a nonempty sequence of inlines between the open delimiter be a nonempty sequence of inlines between the open delimiter
and the closing delimiter; these form the contents of the emphasis and the closing delimiter; these form the contents of the emphasis
inline. inline.
10. Strong emphasis begins with a delimiter that [can open strong 10. Strong emphasis begins with a delimiter that
emphasis](#can-open-strong-emphasis) and ends with a delimiter that [can open strong emphasis] and ends with a delimiter that
[can close strong emphasis](#can-close-strong-emphasis), and that [can close strong emphasis], and that uses the same character
uses the same character (`_` or `*`) as the opening delimiter. (`_` or `*`) as the opening delimiter.
There must be a nonempty sequence of inlines between the open There must be a nonempty sequence of inlines between the open
delimiter and the closing delimiter; these form the contents of delimiter and the closing delimiter; these form the contents of
the strong emphasis inline. the strong emphasis inline.
11. A literal `*` character cannot occur at the beginning or end of 11. A literal `*` character cannot occur at the beginning or end of
`*`-delimited emphasis or `**`-delimited strong emphasis, unless it `*`-delimited emphasis or `**`-delimited strong emphasis, unless it
is backslash-escaped. is backslash-escaped.
12. A literal `_` character cannot occur at the beginning or end of 12. A literal `_` character cannot occur at the beginning or end of
`_`-delimited emphasis or `__`-delimited strong emphasis, unless it `_`-delimited emphasis or `__`-delimited strong emphasis, unless it
skipping to change at line 4570 skipping to change at line 4557
13. The number of nestings should be minimized. Thus, for example, 13. The number of nestings should be minimized. Thus, for example,
an interpretation `<strong>...</strong>` is always preferred to an interpretation `<strong>...</strong>` is always preferred to
`<em><em>...</em></em>`. `<em><em>...</em></em>`.
14. An interpretation `<strong><em>...</em></strong>` is always 14. An interpretation `<strong><em>...</em></strong>` is always
preferred to `<em><strong>..</strong></em>`. preferred to `<em><strong>..</strong></em>`.
15. When two potential emphasis or strong emphasis spans overlap, 15. When two potential emphasis or strong emphasis spans overlap,
so that the second begins before the first ends and ends after so that the second begins before the first ends and ends after
the first ends, the first is preferred. Thus, for example, the first ends, the first takes precedence. Thus, for example,
`*foo _bar* baz_` is parsed as `<em>foo _bar</em> baz_` rather `*foo _bar* baz_` is parsed as `<em>foo _bar</em> baz_` rather
than `*foo <em>bar* baz</em>`. For the same reason, than `*foo <em>bar* baz</em>`. For the same reason,
`**foo*bar**` is parsed as `<em><em>foo</em>bar</em>*` `**foo*bar**` is parsed as `<em><em>foo</em>bar</em>*`
rather than `<strong>foo*bar</strong>`. rather than `<strong>foo*bar</strong>`.
16. When there are two potential emphasis or strong emphasis spans 16. When there are two potential emphasis or strong emphasis spans
with the same closing delimiter, the shorter one (the one that with the same closing delimiter, the shorter one (the one that
opens later) is preferred. Thus, for example, opens later) takes precedence. Thus, for example,
`**foo **bar baz**` is parsed as `**foo <strong>bar baz</strong>` `**foo **bar baz**` is parsed as `**foo <strong>bar baz</strong>`
rather than `<strong>foo **bar baz</strong>`. rather than `<strong>foo **bar baz</strong>`.
17. Inline code spans, links, images, and HTML tags group more tightly 17. Inline code spans, links, images, and HTML tags group more tightly
than emphasis. So, when there is a choice between an interpretation than emphasis. So, when there is a choice between an interpretation
that contains one of these elements and one that does not, the that contains one of these elements and one that does not, the
former always wins. Thus, for example, `*[foo*](bar)` is former always wins. Thus, for example, `*[foo*](bar)` is
parsed as `*<a href="bar">foo*</a>` rather than as parsed as `*<a href="bar">foo*</a>` rather than as
`<em>[foo</em>](bar)`. `<em>[foo</em>](bar)`.
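Rules 1 through 8 reduce to two small predicates. The sketch below assumes the caller has already decided whether the delimiter belongs to a left-flanking or right-flanking run and passes that in as a boolean; the function names and parameters are hypothetical, and `None` stands for "no neighbouring character".

```python
def is_ascii_alnum(ch):
    # None means there is no neighbouring character at all.
    return ch is not None and ch.isascii() and ch.isalnum()

def can_open(delim_char, left_flanking, prev_char):
    """Rules 1, 2, 5 and 6: may this delimiter open (strong) emphasis?"""
    if delim_char == "*":
        return left_flanking
    # `_` must additionally not be preceded by an ASCII alphanumeric.
    return left_flanking and not is_ascii_alnum(prev_char)

def can_close(delim_char, right_flanking, next_char):
    """Rules 3, 4, 7 and 8: may this delimiter close (strong) emphasis?"""
    if delim_char == "*":
        return right_flanking
    # `_` must additionally not be followed by an ASCII alphanumeric.
    return right_flanking and not is_ascii_alnum(next_char)
```

The extra alphanumeric check applies only to `_`, so intraword emphasis remains possible with `*` but not with `_`.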
skipping to change at line 4600 skipping to change at line 4587
Rule 1: Rule 1:
. .
*foo bar* *foo bar*
. .
<p><em>foo bar</em></p> <p><em>foo bar</em></p>
. .
This is not emphasis, because the opening `*` is followed by This is not emphasis, because the opening `*` is followed by
whitespace, and hence not part of a [left-flanking delimiter whitespace, and hence not part of a [left-flanking delimiter run]:
run](#left-flanking-delimiter-run):
. .
a * foo bar* a * foo bar*
. .
<p>a * foo bar*</p> <p>a * foo bar*</p>
. .
This is not emphasis, because the opening `*` is preceded This is not emphasis, because the opening `*` is preceded
by an alphanumeric and followed by punctuation, and hence by an alphanumeric and followed by punctuation, and hence
not part of a [left-flanking delimiter run](#left-flanking-delimiter-run): not part of a [left-flanking delimiter run]:
. .
a*"foo"* a*"foo"*
. .
<p>a*&quot;foo&quot;*</p> <p>a*&quot;foo&quot;*</p>
. .
Unicode nonbreaking spaces count as whitespace, too: Unicode nonbreaking spaces count as whitespace, too:
. .
skipping to change at line 4649 skipping to change at line 4635
. .
Rule 2: Rule 2:
. .
_foo bar_ _foo bar_
. .
<p><em>foo bar</em></p> <p><em>foo bar</em></p>
. .
This is not emphasis, because the opening `*` is followed by This is not emphasis, because the opening `_` is followed by
whitespace: whitespace:
. .
_ foo bar_ _ foo bar_
. .
<p>_ foo bar_</p> <p>_ foo bar_</p>
. .
This is not emphasis, because the opening `_` is preceded This is not emphasis, because the opening `_` is preceded
by an alphanumeric and followed by punctuation: by an alphanumeric and followed by punctuation:
skipping to change at line 4711 skipping to change at line 4697
whitespace: whitespace:
. .
*foo bar * *foo bar *
. .
<p>*foo bar *</p> <p>*foo bar *</p>
. .
This is not emphasis, because the second `*` is This is not emphasis, because the second `*` is
preceded by punctuation and followed by an alphanumeric preceded by punctuation and followed by an alphanumeric
(hence it is not part of a [right-flanking delimiter (hence it is not part of a [right-flanking delimiter run]):
run](#right-flanking-delimiter-run)):
. .
*(*foo) *(*foo)
. .
<p>*(*foo)</p> <p>*(*foo)</p>
. .
The point of this restriction is more easily appreciated The point of this restriction is more easily appreciated
with this example: with this example:
skipping to change at line 4804 skipping to change at line 4789
followed by whitespace: followed by whitespace:
. .
** foo bar** ** foo bar**
. .
<p>** foo bar**</p> <p>** foo bar**</p>
. .
This is not strong emphasis, because the opening `**` is preceded This is not strong emphasis, because the opening `**` is preceded
by an alphanumeric and followed by punctuation, and hence by an alphanumeric and followed by punctuation, and hence
not part of a [left-flanking delimiter run](#left-flanking-delimiter-run): not part of a [left-flanking delimiter run]:
. .
a**"foo"** a**"foo"**
. .
<p>a**&quot;foo&quot;**</p> <p>a**&quot;foo&quot;**</p>
. .
Intraword strong emphasis with `**` is permitted: Intraword strong emphasis with `**` is permitted:
. .
skipping to change at line 5035 skipping to change at line 5020
. .
But note: But note:
. .
*foo**bar**baz* *foo**bar**baz*
. .
<p><em>foo</em><em>bar</em><em>baz</em></p> <p><em>foo</em><em>bar</em><em>baz</em></p>
. .
The difference is that in the preceding case, The difference is that in the preceding case, the internal delimiters
the internal delimiters [can close emphasis](#can-close-emphasis), [can close emphasis], while in the cases with spaces, they cannot.
while in the cases with spaces, they cannot.
. .
***foo** bar* ***foo** bar*
. .
<p><em><strong>foo</strong> bar</em></p> <p><em><strong>foo</strong> bar</em></p>
. .
. .
*foo **bar*** *foo **bar***
. .
skipping to change at line 5149 skipping to change at line 5133
. .
But note: But note:
. .
**foo*bar*baz** **foo*bar*baz**
. .
<p><em><em>foo</em>bar</em>baz**</p> <p><em><em>foo</em>bar</em>baz**</p>
. .
The difference is that in the preceding case, The difference is that in the preceding case, the internal delimiters
the internal delimiters [can close emphasis](#can-close-emphasis), [can close emphasis], while in the cases with spaces, they cannot.
while in the cases with spaces, they cannot.
. .
***foo* bar** ***foo* bar**
. .
<p><strong><em>foo</em> bar</strong></p> <p><strong><em>foo</em> bar</strong></p>
. .
. .
**foo *bar*** **foo *bar***
. .
skipping to change at line 5378 skipping to change at line 5361
. .
<p><strong>foo</strong></p> <p><strong>foo</strong></p>
. .
. .
_*foo*_ _*foo*_
. .
<p><em><em>foo</em></em></p> <p><em><em>foo</em></em></p>
. .
However, strong emphasis within strong emphasisis possible without However, strong emphasis within strong emphasis is possible without
switching delimiters: switching delimiters:
. .
****foo**** ****foo****
. .
<p><strong><strong>foo</strong></strong></p> <p><strong><strong>foo</strong></strong></p>
. .
. .
____foo____ ____foo____
skipping to change at line 5502 skipping to change at line 5485
. .
. .
__a<http://foo.bar?q=__> __a<http://foo.bar?q=__>
. .
<p>__a<a href="http://foo.bar?q=__">http://foo.bar?q=__</a></p> <p>__a<a href="http://foo.bar?q=__">http://foo.bar?q=__</a></p>
. .
## Links ## Links
A link contains [link text](#link-label) (the visible text), A link contains [link text] (the visible text), a [link destination]
a [link destination](#link-destination) (the URI that is the link destination), (the URI that is the link destination), and optionally a [link title].
and optionally a [link title](#link-title). There are two basic kinds There are two basic kinds of links in Markdown. In [inline link]s the
of links in Markdown. In [inline links](#inline-link) the destination destination and title are given immediately after the link text. In
and title are given immediately after the link text. In [reference [reference link]s the destination and title are defined elsewhere in
links](#reference-link) the destination and title are defined elsewhere the document.
in the document.
A [link text](@link-text) consists of a sequence of zero or more A [link text](@link-text) consists of a sequence of zero or more
inline elements enclosed by square brackets (`[` and `]`). The inline elements enclosed by square brackets (`[` and `]`). The
following rules apply: following rules apply:
- Links may not contain other links, at any level of nesting. - Links may not contain other links, at any level of nesting.
- Brackets are allowed in the [link text](#link-text) only if (a) they - Brackets are allowed in the [link text] only if (a) they
are backslash-escaped or (b) they appear as a matched pair of brackets, are backslash-escaped or (b) they appear as a matched pair of brackets,
with an open bracket `[`, a sequence of zero or more inlines, and with an open bracket `[`, a sequence of zero or more inlines, and
a close bracket `]`. a close bracket `]`.
- Backtick [code spans](#code-span), [autolinks](#autolink), and - Backtick [code span]s, [autolink]s, and raw [HTML tag]s bind more tightly
raw [HTML tags](#html-tag) bind more tightly
than the brackets in link text. Thus, for example, than the brackets in link text. Thus, for example,
`` [foo`]` `` could not be a link text, since the second `]` `` [foo`]` `` could not be a link text, since the second `]`
is part of a code span. is part of a code span.
- The brackets in link text bind more tightly than markers for - The brackets in link text bind more tightly than markers for
[emphasis and strong emphasis](#emphasis-and-strong-emphasis). [emphasis and strong emphasis]. Thus, for example, `*[foo*](url)` is a link.
Thus, for example, `*[foo*](url)` is a link.
A [link destination](@link-destination) consists of either A [link destination](@link-destination) consists of either
- a sequence of zero or more characters between an opening `<` and a - a sequence of zero or more characters between an opening `<` and a
closing `>` that contains no line breaks or unescaped `<` or `>` closing `>` that contains no line breaks or unescaped `<` or `>`
characters, or characters, or
- a nonempty sequence of characters that does not include - a nonempty sequence of characters that does not include
ASCII space or control characters, and includes parentheses ASCII space or control characters, and includes parentheses
only if (a) they are backslash-escaped or (b) they are part of only if (a) they are backslash-escaped or (b) they are part of
skipping to change at line 5556 skipping to change at line 5536
characters (`"`), including a `"` character only if it is characters (`"`), including a `"` character only if it is
backslash-escaped, or backslash-escaped, or
- a sequence of zero or more characters between straight single-quote - a sequence of zero or more characters between straight single-quote
characters (`'`), including a `'` character only if it is characters (`'`), including a `'` character only if it is
backslash-escaped, or backslash-escaped, or
- a sequence of zero or more characters between matching parentheses - a sequence of zero or more characters between matching parentheses
(`(...)`), including a `)` character only if it is backslash-escaped. (`(...)`), including a `)` character only if it is backslash-escaped.
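The three title forms can be approximated with one regular expression per form. This is a rough sketch, assuming the whole candidate string is offered for matching; the names `LINK_TITLE` and `title_text` are hypothetical, and backslash-escapes are recognized but not resolved.

```python
import re

# Each form allows its own closing delimiter only when backslash-escaped.
DOUBLE = r'"(?:\\.|[^"\\])*"'
SINGLE = r"'(?:\\.|[^'\\])*'"
PARENS = r"\((?:\\.|[^)\\])*\)"
LINK_TITLE = re.compile("|".join((DOUBLE, SINGLE, PARENS)), re.DOTALL)

def title_text(raw):
    """Return the title without its enclosing delimiters, or None."""
    if LINK_TITLE.fullmatch(raw) is None:
        return None
    return raw[1:-1]
```

For instance, `title_text('"title"')` gives `title`, while a string with an unescaped inner `"` is rejected.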
An [inline link](@inline-link) An [inline link](@inline-link) consists of a [link text] followed immediately
consists of a [link text](#link-text) followed immediately by a left parenthesis `(`, optional [whitespace], an optional
by a left parenthesis `(`, optional [whitespace](#whitespace), [link destination], an optional [link title] separated from the link
an optional [link destination](#link-destination), destination by [whitespace], optional [whitespace], and a right
an optional [link title](#link-title) separated from the link parenthesis `)`. The link's text consists of the inlines contained
destination by [whitespace](#whitespace), optional in the [link text] (excluding the enclosing square brackets).
[whitespace](#whitespace), and a right parenthesis `)`.
The link's text consists of the inlines contained
in the [link text](#link-text) (excluding the enclosing square brackets).
The link's URI consists of the link destination, excluding enclosing The link's URI consists of the link destination, excluding enclosing
`<...>` if present, with backslash-escapes in effect as described `<...>` if present, with backslash-escapes in effect as described
above. The link's title consists of the link title, excluding its above. The link's title consists of the link title, excluding its
enclosing delimiters, with backslash-escapes in effect as described enclosing delimiters, with backslash-escapes in effect as described
above. above.
Here is a simple inline link: Here is a simple inline link:
. .
[link](/uri "title") [link](/uri "title")
skipping to change at line 5735 skipping to change at line 5712
entities, or using a different quote type for the enclosing title---to entities, or using a different quote type for the enclosing title---to
write titles containing double quotes. `Markdown.pl`'s handling of write titles containing double quotes. `Markdown.pl`'s handling of
titles has a number of other strange features. For example, it allows titles has a number of other strange features. For example, it allows
single-quoted titles in inline links, but not reference links. And, in single-quoted titles in inline links, but not reference links. And, in
reference links but not inline links, it allows a title to begin with reference links but not inline links, it allows a title to begin with
`"` and end with `)`. `Markdown.pl` 1.0.1 even allows titles with no closing `"` and end with `)`. `Markdown.pl` 1.0.1 even allows titles with no closing
quotation mark, though 1.0.2b8 does not. It seems preferable to adopt quotation mark, though 1.0.2b8 does not. It seems preferable to adopt
a simple, rational rule that works the same way in inline links and a simple, rational rule that works the same way in inline links and
link reference definitions.) link reference definitions.)
[Whitespace](#whitespace) is allowed around the destination and title: [Whitespace] is allowed around the destination and title:
. .
[link]( /uri [link]( /uri
"title" ) "title" )
. .
<p><a href="/uri" title="title">link</a></p> <p><a href="/uri" title="title">link</a></p>
. .
But it is not allowed between the link text and the But it is not allowed between the link text and the
following parenthesis: following parenthesis:
skipping to change at line 5829 skipping to change at line 5806
. .
<p>*<a href="/uri">foo*</a></p> <p>*<a href="/uri">foo*</a></p>
. .
. .
[foo *bar](baz*) [foo *bar](baz*)
. .
<p><a href="baz*">foo *bar</a></p> <p><a href="baz*">foo *bar</a></p>
. .
Note that brackets that *aren't* part of links do not take
precedence:
.
*foo [bar* baz]
.
<p><em>foo [bar</em> baz]</p>
.
These cases illustrate the precedence of HTML tags, code spans, These cases illustrate the precedence of HTML tags, code spans,
and autolinks over link grouping: and autolinks over link grouping:
. .
[foo <bar attr="](baz)"> [foo <bar attr="](baz)">
. .
<p>[foo <bar attr="](baz)"></p> <p>[foo <bar attr="](baz)"></p>
. .
. .
skipping to change at line 5850 skipping to change at line 5836
. .
<p>[foo<code>](/uri)</code></p> <p>[foo<code>](/uri)</code></p>
. .
. .
[foo<http://example.com?search=](uri)> [foo<http://example.com?search=](uri)>
. .
<p>[foo<a href="http://example.com?search=%5D(uri)">http://example.com?search=](uri)</a></p> <p>[foo<a href="http://example.com?search=%5D(uri)">http://example.com?search=](uri)</a></p>
. .
There are three kinds of [reference links](@reference-link): There are three kinds of [reference link](@reference-link)s:
[full](#full-reference-link), [collapsed](#collapsed-reference-link), [full](#full-reference-link), [collapsed](#collapsed-reference-link),
and [shortcut](#shortcut-reference-link). and [shortcut](#shortcut-reference-link).
A [full reference link](@full-reference-link) A [full reference link](@full-reference-link)
consists of a [link text](#link-text), consists of a [link text], optional [whitespace], and a [link label]
optional [whitespace](#whitespace), and that [matches] a [link reference definition] elsewhere in the document.
a [link label](#link-label) that [matches](#matches) a
[link reference definition](#link-reference-definition) elsewhere in the
document.
A [link label](@link-label) begins with a left bracket (`[`) and ends A [link label](@link-label) begins with a left bracket (`[`) and ends
with the first right bracket (`]`) that is not backslash-escaped. with the first right bracket (`]`) that is not backslash-escaped.
Unescaped square bracket characters are not allowed in Unescaped square bracket characters are not allowed in
[link labels](#link-label). A link label can have at most 999 [link label]s. A link label can have at most 999
characters inside the square brackets. characters inside the square brackets.
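A scanner for link labels following this definition might look like the sketch below. The function name `scan_link_label` and its return convention are assumptions made for illustration; a backslash is simply taken to escape whatever character follows it.

```python
def scan_link_label(text, pos=0):
    """Return the index just past the closing `]`, or None if there is
    no valid link label starting at pos."""
    if pos >= len(text) or text[pos] != "[":
        return None
    i = pos + 1
    while i < len(text):
        ch = text[i]
        if ch == "\\" and i + 1 < len(text):
            i += 2  # skip the backslash and the character it escapes
            continue
        if ch == "[":
            return None  # unescaped bracket inside the label
        if ch == "]":
            inner_length = i - (pos + 1)
            return i + 1 if inner_length <= 999 else None
        i += 1
    return None  # ran out of input before a closing bracket
```

Under these assumptions, `[ref[]` is rejected while the escaped form `[ref\[]` is accepted.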
One label [matches](@matches) One label [matches](@matches)
another just in case their normalized forms are equal. To normalize a another just in case their normalized forms are equal. To normalize a
label, perform the *unicode case fold* and collapse consecutive internal label, perform the *unicode case fold* and collapse consecutive internal
[whitespace](#whitespace) to a single space. If there are multiple [whitespace] to a single space. If there are multiple
matching reference link definitions, the one that comes first in the matching reference link definitions, the one that comes first in the
document is used. (It is desirable in such cases to emit a warning.) document is used. (It is desirable in such cases to emit a warning.)
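Label matching can be sketched with Python's built-in Unicode case folding. The names below are hypothetical, `\s` is used as a stand-in for [whitespace], and stripping leading and trailing whitespace is an assumption that goes beyond the text above.

```python
import re

def normalize_label(label):
    # str.casefold() performs Unicode case folding; runs of whitespace
    # are collapsed to a single space. Stripping is an assumption.
    return re.sub(r"\s+", " ", label.casefold()).strip()

def labels_match(a, b):
    return normalize_label(a) == normalize_label(b)

def add_definition(definitions, label, destination):
    """Record a link reference definition; the first one wins."""
    key = normalize_label(label)
    is_duplicate = key in definitions
    definitions.setdefault(key, destination)
    return is_duplicate  # a caller could emit a warning when True
```

Because matching operates on the raw label strings, `foo\!` and `foo!` do not match even though they would render as the same inline content.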
The contents of the first link label are parsed as inlines, which are The contents of the first link label are parsed as inlines, which are
used as the link's text. The link's URI and title are provided by the used as the link's text. The link's URI and title are provided by the
matching [link reference definition](#link-reference-definition). matching [link reference definition].
Here is a simple example: Here is a simple example:
. .
[foo][bar] [foo][bar]
[bar]: /url "title" [bar]: /url "title"
. .
<p><a href="/url" title="title">foo</a></p> <p><a href="/url" title="title">foo</a></p>
. .
The rules for the [link text](#link-text) are the same as with The rules for the [link text] are the same as with
[inline links](#inline-link). Thus: [inline link]s. Thus:
The link text may contain balanced brackets, but not unbalanced ones, The link text may contain balanced brackets, but not unbalanced ones,
unless they are escaped: unless they are escaped:
. .
[link [foo [bar]]][ref] [link [foo [bar]]][ref]
[ref]: /uri [ref]: /uri
. .
<p><a href="/uri">link [foo [bar]]</a></p> <p><a href="/uri">link [foo [bar]]</a></p>
skipping to change at line 5946 skipping to change at line 5929
. .
. .
[foo *bar [baz][ref]*][ref] [foo *bar [baz][ref]*][ref]
[ref]: /uri [ref]: /uri
. .
<p>[foo <em>bar <a href="/uri">baz</a></em>]<a href="/uri">ref</a></p> <p>[foo <em>bar <a href="/uri">baz</a></em>]<a href="/uri">ref</a></p>
. .
(In the examples above, we have two [shortcut reference (In the examples above, we have two [shortcut reference link]s
links](#shortcut-reference-link) instead of one [full reference instead of one [full reference link].)
link](#full-reference-link).)
The following cases illustrate the precedence of link text grouping over The following cases illustrate the precedence of link text grouping over
emphasis grouping: emphasis grouping:
. .
*[foo*][ref] *[foo*][ref]
[ref]: /uri [ref]: /uri
. .
<p>*<a href="/uri">foo*</a></p> <p>*<a href="/uri">foo*</a></p>
skipping to change at line 6016 skipping to change at line 5998
Unicode case fold is used: Unicode case fold is used:
. .
[Толпой][Толпой] is a Russian word. [Толпой][Толпой] is a Russian word.
[ТОЛПОЙ]: /url [ТОЛПОЙ]: /url
. .
<p><a href="/url">Толпой</a> is a Russian word.</p> <p><a href="/url">Толпой</a> is a Russian word.</p>
. .
Consecutive internal [whitespace](#whitespace) is treated as one space for Consecutive internal [whitespace] is treated as one space for
purposes of determining matching: purposes of determining matching:
. .
[Foo [Foo
bar]: /url bar]: /url
[Baz][Foo bar] [Baz][Foo bar]
. .
<p><a href="/url">Baz</a></p> <p><a href="/url">Baz</a></p>
. .
There can be [whitespace](#whitespace) between the There can be [whitespace] between the [link text] and the [link label]:
[link text](#link-text) and the [link label](#link-label):
. .
[foo] [bar] [foo] [bar]
[bar]: /url "title" [bar]: /url "title"
. .
<p><a href="/url" title="title">foo</a></p> <p><a href="/url" title="title">foo</a></p>
. .
. .
[foo] [foo]
[bar] [bar]
[bar]: /url "title" [bar]: /url "title"
. .
<p><a href="/url" title="title">foo</a></p> <p><a href="/url" title="title">foo</a></p>
. .
When there are multiple matching [link reference When there are multiple matching [link reference definition]s,
definitions](#link-reference-definition), the first is used: the first is used:
. .
[foo]: /url1 [foo]: /url1
[foo]: /url2 [foo]: /url2
[bar][foo] [bar][foo]
. .
<p><a href="/url1">bar</a></p> <p><a href="/url1">bar</a></p>
. .
skipping to change at line 6073 skipping to change at line 6054
labels define equivalent inline content: labels define equivalent inline content:
. .
[bar][foo\!] [bar][foo\!]
[foo!]: /url [foo!]: /url
. .
<p>[bar][foo!]</p> <p>[bar][foo!]</p>
. .
[Link labels](#link-label) cannot contain brackets, unless they are [Link label]s cannot contain brackets, unless they are
backslash-escaped: backslash-escaped:
. .
[foo][ref[] [foo][ref[]
[ref[]: /uri [ref[]: /uri
. .
<p>[foo][ref[]</p> <p>[foo][ref[]</p>
<p>[ref[]: /uri</p> <p>[ref[]: /uri</p>
. .
skipping to change at line 6112 skipping to change at line 6093
. .
[foo][ref\[] [foo][ref\[]
[ref\[]: /uri [ref\[]: /uri
. .
<p><a href="/uri">foo</a></p> <p><a href="/uri">foo</a></p>
. .
A [collapsed reference link](@collapsed-reference-link) A [collapsed reference link](@collapsed-reference-link)
consists of a [link consists of a [link label] that [matches] a
label](#link-label) that [matches](#matches) a [link reference [link reference definition] elsewhere in the
definition](#link-reference-definition) elsewhere in the document, optional [whitespace], and the string `[]`.
document, optional [whitespace](#whitespace), and the string `[]`.
The contents of the first link label are parsed as inlines, The contents of the first link label are parsed as inlines,
which are used as the link's text. The link's URI and title are which are used as the link's text. The link's URI and title are
provided by the matching reference link definition. Thus, provided by the matching reference link definition. Thus,
`[foo][]` is equivalent to `[foo][foo]`. `[foo][]` is equivalent to `[foo][foo]`.
. .
[foo][] [foo][]
[foo]: /url "title" [foo]: /url "title"
. .
skipping to change at line 6147 skipping to change at line 6127
The link labels are case-insensitive: The link labels are case-insensitive:
. .
[Foo][] [Foo][]
[foo]: /url "title" [foo]: /url "title"
. .
<p><a href="/url" title="title">Foo</a></p> <p><a href="/url" title="title">Foo</a></p>
. .
As with full reference links, [whitespace](#whitespace) is allowed As with full reference links, [whitespace] is allowed
between the two sets of brackets: between the two sets of brackets:
. .
[foo] [foo]
[] []
[foo]: /url "title" [foo]: /url "title"
. .
<p><a href="/url" title="title">foo</a></p> <p><a href="/url" title="title">foo</a></p>
. .
A [shortcut reference link](@shortcut-reference-link) A [shortcut reference link](@shortcut-reference-link)
consists of a [link consists of a [link label] that [matches] a
label](#link-label) that [matches](#matches) a [link reference [link reference definition] elsewhere in the
definition](#link-reference-definition) elsewhere in the
document and is not followed by `[]` or a link label. document and is not followed by `[]` or a link label.
The contents of the first link label are parsed as inlines, The contents of the first link label are parsed as inlines,
which are used as the link's text. The link's URI and title which are used as the link's text. The link's URI and title
are provided by the matching link reference definition. are provided by the matching link reference definition.
Thus, `[foo]` is equivalent to `[foo][]`. Thus, `[foo]` is equivalent to `[foo][]`.
. .
[foo] [foo]
[foo]: /url "title" [foo]: /url "title"
skipping to change at line 6235 skipping to change at line 6214
following closing bracket: following closing bracket:
. .
[foo*]: /url [foo*]: /url
*[foo*] *[foo*]
. .
<p>*<a href="/url">foo*</a></p> <p>*<a href="/url">foo*</a></p>
. .
This is a link too, for the same reason:
.
[foo`]: /url
[foo`]`
.
<p>[foo<code>]</code></p>
.
Full references take precedence over shortcut references: Full references take precedence over shortcut references:
. .
[foo][bar] [foo][bar]
[foo]: /url1 [foo]: /url1
[bar]: /url2 [bar]: /url2
. .
<p><a href="/url2">foo</a></p> <p><a href="/url2">foo</a></p>
. .
skipping to change at line 6294 skipping to change at line 6263
[baz]: /url1 [baz]: /url1
[foo]: /url2 [foo]: /url2
. .
<p>[foo]<a href="/url1">bar</a></p> <p>[foo]<a href="/url1">bar</a></p>
. .
## Images ## Images
Syntax for images is like the syntax for links, with one Syntax for images is like the syntax for links, with one
difference. Instead of [link text](#link-text), we have an [image difference. Instead of [link text], we have an
description](@image-description). The rules for this are the [image description](@image-description). The rules for this are the
same as for [link text](#link-text), except that (a) an same as for [link text], except that (a) an
image description starts with `![` rather than `[`, and image description starts with `![` rather than `[`, and
(b) an image description may contain links. (b) an image description may contain links.
An image description has inline elements An image description has inline elements
as its contents. When an image is rendered to HTML, as its contents. When an image is rendered to HTML,
this is standardly used as the image's `alt` attribute. this is standardly used as the image's `alt` attribute.
. .
![foo](/url "title") ![foo](/url "title")
. .
<p><img src="/url" alt="foo" title="title" /></p> <p><img src="/url" alt="foo" title="title" /></p>
skipping to change at line 6331 skipping to change at line 6300
. .
. .
![foo [bar](/url)](/url2) ![foo [bar](/url)](/url2)
. .
<p><img src="/url2" alt="foo bar" /></p> <p><img src="/url2" alt="foo bar" /></p>
. .
Though this spec is concerned with parsing, not rendering, it is Though this spec is concerned with parsing, not rendering, it is
recommended that in rendering to HTML, only the plain string content recommended that in rendering to HTML, only the plain string content
of the [image description](#image-description) be used. Note that in of the [image description] be used. Note that in
the above example, the alt attribute's value is `foo bar`, not `foo the above example, the alt attribute's value is `foo bar`, not `foo
[bar](/url)` or `foo <a href="/url">bar</a>`. Only the plain string [bar](/url)` or `foo <a href="/url">bar</a>`. Only the plain string
content is rendered, without formatting. content is rendered, without formatting.
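The recommendation above can be sketched as a small recursive walk that keeps only literal text. This is an illustration under stated assumptions, not the spec's reference implementation: the tuple-based node shape and the name `plain_text` are hypothetical and appear nowhere in the spec.

```python
# Hypothetical inline node shape, chosen only for this sketch:
#   ("str", "text")               a run of literal text
#   (kind, [child, child, ...])   any container, such as "emph" or "link"

def plain_text(node):
    """Concatenate the literal text under an inline node, dropping formatting."""
    kind, payload = node
    if kind == "str":
        return payload
    return "".join(plain_text(child) for child in payload)

# Image description for `![foo [bar](/url)](/url2)`
description = ("image", [("str", "foo "), ("link", [("str", "bar")])])
alt = plain_text(description)
```

With this input, `alt` holds the text of the description with the nested link's markup dropped, which is the value a renderer would place in the `alt` attribute.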
. .
![foo *bar*][] ![foo *bar*][]
[foo *bar*]: train.jpg "train & tracks" [foo *bar*]: train.jpg "train & tracks"
. .
<p><img src="train.jpg" alt="foo bar" title="train &amp; tracks" /></p> <p><img src="train.jpg" alt="foo bar" title="train &amp; tracks" /></p>
skipping to change at line 6422 skipping to change at line 6391
The labels are case-insensitive: The labels are case-insensitive:
. .
![Foo][] ![Foo][]
[foo]: /url "title" [foo]: /url "title"
. .
<p><img src="/url" alt="Foo" title="title" /></p> <p><img src="/url" alt="Foo" title="title" /></p>
. .
As with full reference links, [whitespace](#whitespace) is allowed As with full reference links, [whitespace] is allowed
between the two sets of brackets: between the two sets of brackets:
. .
![foo] ![foo]
[] []
[foo]: /url "title" [foo]: /url "title"
. .
<p><img src="/url" alt="foo" title="title" /></p> <p><img src="/url" alt="foo" title="title" /></p>
. .
skipping to change at line 6497 skipping to change at line 6466
. .
\![foo] \![foo]
[foo]: /url "title" [foo]: /url "title"
. .
<p>!<a href="/url" title="title">foo</a></p> <p>!<a href="/url" title="title">foo</a></p>
. .
## Autolinks ## Autolinks
[Autolinks](@autolink) are absolute URIs and email addresses inside `<` and `>`. [Autolink](@autolink)s are absolute URIs and email addresses inside
They are parsed as links, with the URL or email address as the link `<` and `>`. They are parsed as links, with the URL or email address
label. as the link label.
A [URI autolink](@uri-autolink) A [URI autolink](@uri-autolink) consists of `<`, followed by an
consists of `<`, followed by an [absolute [absolute URI] not containing `<`, followed by `>`. It is parsed as
URI](#absolute-uri) not containing `<`, followed by `>`. It is parsed a link to the URI, with the URI as the link's label.
as a link to the URI, with the URI as the link's label.
An [absolute URI](@absolute-uri), An [absolute URI](@absolute-uri),
for these purposes, consists of a [scheme](#scheme) followed by a colon (`:`) for these purposes, consists of a [scheme] followed by a colon (`:`)
followed by zero or more characters other than ASCII followed by zero or more characters other than ASCII
[whitespace](#whitespace) and control characters, `<`, and `>`. If [whitespace] and control characters, `<`, and `>`. If
the URI includes these characters, you must use percent-encoding the URI includes these characters, you must use percent-encoding
(e.g. `%20` for a space). (e.g. `%20` for a space).
The following [schemes](@scheme) The following [schemes](@scheme)
are recognized (case-insensitive): are recognized (case-insensitive):
`coap`, `doi`, `javascript`, `aaa`, `aaas`, `about`, `acap`, `cap`, `coap`, `doi`, `javascript`, `aaa`, `aaas`, `about`, `acap`, `cap`,
`cid`, `crid`, `data`, `dav`, `dict`, `dns`, `file`, `ftp`, `geo`, `go`, `cid`, `crid`, `data`, `dav`, `dict`, `dns`, `file`, `ftp`, `geo`, `go`,
`gopher`, `h323`, `http`, `https`, `iax`, `icap`, `im`, `imap`, `info`, `gopher`, `h323`, `http`, `https`, `iax`, `icap`, `im`, `imap`, `info`,
`ipp`, `iris`, `iris.beep`, `iris.xpc`, `iris.xpcs`, `iris.lwz`, `ldap`, `ipp`, `iris`, `iris.beep`, `iris.xpc`, `iris.xpcs`, `iris.lwz`, `ldap`,
`mailto`, `mid`, `msrp`, `msrps`, `mtqp`, `mupdate`, `news`, `nfs`, `mailto`, `mid`, `msrp`, `msrps`, `mtqp`, `mupdate`, `news`, `nfs`,
skipping to change at line 6576 skipping to change at line 6544
Spaces are not allowed in autolinks: Spaces are not allowed in autolinks:
. .
<http://foo.bar/baz bim> <http://foo.bar/baz bim>
. .
<p>&lt;http://foo.bar/baz bim&gt;</p> <p>&lt;http://foo.bar/baz bim&gt;</p>
. .
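The URI autolink rules above could be checked roughly as follows. This is a minimal sketch: the function name `uri_autolink` is hypothetical, and `SCHEMES` is only a small illustrative subset of the recognized schemes, not the spec's full list.

```python
import re

# Illustrative subset only; the spec's scheme list is much longer.
SCHEMES = {"http", "https", "ftp", "mailto", "file", "data"}

# Zero or more characters other than ASCII whitespace, ASCII control
# characters, `<`, and `>`.
URI_REST = re.compile(r"[^\x00-\x20\x7f<>]*")

def uri_autolink(text, schemes=SCHEMES):
    """Return the URI if text is `<` + absolute URI + `>`, else None."""
    if len(text) < 2 or text[0] != "<" or text[-1] != ">":
        return None
    uri = text[1:-1]
    scheme, colon, rest = uri.partition(":")
    # Schemes are matched case-insensitively.
    if not colon or scheme.lower() not in schemes:
        return None
    if URI_REST.fullmatch(rest) is None:
        return None
    return uri
```

A string that fails the check, such as one containing a space, is left as literal text, as in the example above.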
An [email autolink](@email-autolink) An [email autolink](@email-autolink)
consists of `<`, followed by an [email address](#email-address), consists of `<`, followed by an [email address],
followed by `>`. The link's label is the email address, followed by `>`. The link's label is the email address,
and the URL is `mailto:` followed by the email address. and the URL is `mailto:` followed by the email address.
An [email address](@email-address), An [email address](@email-address),
for these purposes, is anything that matches for these purposes, is anything that matches
the [non-normative regex from the HTML5 the [non-normative regex from the HTML5
spec](https://html.spec.whatwg.org/multipage/forms.html#e-mail-state-(type=email)): spec](https://html.spec.whatwg.org/multipage/forms.html#e-mail-state-(type=email)):
/^[a-zA-Z0-9.!#$%&'*+/=?^_`{|}~-]+@[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])? /^[a-zA-Z0-9.!#$%&'*+/=?^_`{|}~-]+@[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?
(?:\.[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?)*$/ (?:\.[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?)*$/
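As an illustration, the regex above can be carried over to Python almost verbatim. The sketch below is one possible reading, not the reference implementation: the name `email_autolink` is hypothetical, and `fullmatch` stands in for the `^` and `$` anchors so that a trailing newline is not accepted.

```python
import re

# The HTML5 non-normative email regex, without the ^...$ anchors
# (fullmatch() supplies them).
EMAIL_RE = re.compile(
    r"[a-zA-Z0-9.!#$%&'*+/=?^_`{|}~-]+@[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?"
    r"(?:\.[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?)*"
)

def email_autolink(text):
    """Return (url, label) if text is `<` + email address + `>`, else None."""
    if len(text) < 2 or text[0] != "<" or text[-1] != ">":
        return None
    address = text[1:-1]
    if EMAIL_RE.fullmatch(address) is None:
        return None
    # The label is the address; the URL is `mailto:` plus the address.
    return ("mailto:" + address, address)
```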
skipping to change at line 6658 skipping to change at line 6626
Text between `<` and `>` that looks like an HTML tag is parsed as a Text between `<` and `>` that looks like an HTML tag is parsed as a
raw HTML tag and will be rendered in HTML without escaping. raw HTML tag and will be rendered in HTML without escaping.
Tag and attribute names are not limited to current HTML tags, Tag and attribute names are not limited to current HTML tags,
so custom tags (and even, say, DocBook tags) may be used. so custom tags (and even, say, DocBook tags) may be used.
Here is the grammar for tags: Here is the grammar for tags:
A [tag name](@tag-name) consists of an ASCII letter A [tag name](@tag-name) consists of an ASCII letter
followed by zero or more ASCII letters or digits. followed by zero or more ASCII letters or digits.
An [attribute](@attribute) consists of [whitespace](#whitespace), An [attribute](@attribute) consists of [whitespace],
an [attribute name](#attribute-name), and an optional an [attribute name], and an optional
[attribute value specification](#attribute-value-specification). [attribute value specification].
An [attribute name](@attribute-name) An [attribute name](@attribute-name)
consists of an ASCII letter, `_`, or `:`, followed by zero or more ASCII consists of an ASCII letter, `_`, or `:`, followed by zero or more ASCII
letters, digits, `_`, `.`, `:`, or `-`. (Note: This is the XML letters, digits, `_`, `.`, `:`, or `-`. (Note: This is the XML
specification restricted to ASCII. HTML5 is laxer.) specification restricted to ASCII. HTML5 is laxer.)
An [attribute value specification](@attribute-value-specification) An [attribute value specification](@attribute-value-specification)
consists of optional [whitespace](#whitespace), consists of optional [whitespace],
a `=` character, optional [whitespace](#whitespace), and an [attribute a `=` character, optional [whitespace], and an [attribute
value](#attribute-value). value].
An [attribute value](@attribute-value) An [attribute value](@attribute-value)
consists of an [unquoted attribute value](#unquoted-attribute-value), consists of an [unquoted attribute value],
a [single-quoted attribute value](#single-quoted-attribute-value), a [single-quoted attribute value], or a [double-quoted attribute value].
or a [double-quoted attribute value](#double-quoted-attribute-value).
An [unquoted attribute value](@unquoted-attribute-value) An [unquoted attribute value](@unquoted-attribute-value)
is a nonempty string of characters not is a nonempty string of characters not
including spaces, `"`, `'`, `=`, `<`, `>`, or `` ` ``. including spaces, `"`, `'`, `=`, `<`, `>`, or `` ` ``.
A [single-quoted attribute value](@single-quoted-attribute-value) A [single-quoted attribute value](@single-quoted-attribute-value)
consists of `'`, zero or more consists of `'`, zero or more
characters not including `'`, and a final `'`. characters not including `'`, and a final `'`.
A [double-quoted attribute value](@double-quoted-attribute-value) A [double-quoted attribute value](@double-quoted-attribute-value)
consists of `"`, zero or more consists of `"`, zero or more
characters not including `"`, and a final `"`. characters not including `"`, and a final `"`.
An [open tag](@open-tag) consists of a `<` character, An [open tag](@open-tag) consists of a `<` character, a [tag name],
a [tag name](#tag-name), zero or more [attributes](#attribute), zero or more [attributes], optional [whitespace], an optional `/`
optional [whitespace](#whitespace), an optional `/` character, and a character, and a `>` character.
`>` character.
A [closing tag](@closing-tag) consists of the A [closing tag](@closing-tag) consists of the string `</`, a
string `</`, a [tag name](#tag-name), optional [tag name], optional [whitespace], and the character `>`.
[whitespace](#whitespace), and the character `>`.
An [HTML comment](@html-comment) consists of the An [HTML comment](@html-comment) consists of `<!--` + *text* + `-->`,
string `<!--`, a string of characters not including the string `--`, and where *text* does not start with `>` or `->`, does not end with `-`,
the string `-->`. and does not contain `--`. (See the
[HTML5 spec](http://www.w3.org/TR/html5/syntax.html#comments).)
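The revised (right-hand) comment rule can be sketched as a direct check on the text between the delimiters. The function name `is_html_comment` is hypothetical, and the sketch tests a whole candidate string rather than scanning for a comment inside running text.

```python
def is_html_comment(s):
    """Check s against: `<!--` + text + `-->`, with the text restrictions."""
    # The two delimiters must not overlap, so at least 7 characters are needed.
    if len(s) < 7 or not s.startswith("<!--") or not s.endswith("-->"):
        return False
    text = s[4:-3]
    if text.startswith(">") or text.startswith("->"):
        return False
    if text.endswith("-"):
        return False
    return "--" not in text
```

Under this reading, a comment containing a single hyphen is accepted, while `<!--> foo -->` and `<!-- foo--->` are rejected, consistent with the examples in this section.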
A [processing instruction](@processing-instruction) A [processing instruction](@processing-instruction)
consists of the string `<?`, a string consists of the string `<?`, a string
of characters not including the string `?>`, and the string of characters not including the string `?>`, and the string
`?>`. `?>`.
A [declaration](@declaration) consists of the A [declaration](@declaration) consists of the
string `<!`, a name consisting of one or more uppercase ASCII letters, string `<!`, a name consisting of one or more uppercase ASCII letters,
[whitespace](#whitespace), a string of characters not including the [whitespace], a string of characters not including the
character `>`, and the character `>`. character `>`, and the character `>`.
A [CDATA section](@cdata-section) consists of A [CDATA section](@cdata-section) consists of
the string `<![CDATA[`, a string of characters not including the string the string `<![CDATA[`, a string of characters not including the string
`]]>`, and the string `]]>`. `]]>`, and the string `]]>`.
An [HTML tag](@html-tag) consists of an [open An [HTML tag](@html-tag) consists of an [open tag], a [closing tag],
tag](#open-tag), a [closing tag](#closing-tag), an [HTML an [HTML comment], a [processing instruction], a [declaration],
comment](#html-comment), a [processing instruction](#processing-instruction), or a [CDATA section].
a [declaration](#declaration), or a [CDATA section](#cdata-section).
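The grammar above composes naturally into regular expressions. The sketch below covers only open tags and closing tags; the constant names are hypothetical, and it assumes that "spaces" in the unquoted attribute value rule means any ASCII whitespace.

```python
import re

TAG_NAME = r"[A-Za-z][A-Za-z0-9]*"
ATTR_NAME = r"[A-Za-z_:][A-Za-z0-9_.:-]*"
# Assumption: "spaces" is read as any ASCII whitespace character.
UNQUOTED = r"[^\s\"'=<>`]+"
SINGLE_QUOTED = r"'[^']*'"
DOUBLE_QUOTED = r'"[^"]*"'
ATTR_VALUE = rf"(?:{UNQUOTED}|{SINGLE_QUOTED}|{DOUBLE_QUOTED})"
ATTR_VALUE_SPEC = rf"\s*=\s*{ATTR_VALUE}"
# An attribute starts with mandatory whitespace.
ATTRIBUTE = rf"\s+{ATTR_NAME}(?:{ATTR_VALUE_SPEC})?"

# re.ASCII restricts \s to ASCII whitespace.
OPEN_TAG = re.compile(rf"<{TAG_NAME}(?:{ATTRIBUTE})*\s*/?>", re.ASCII)
CLOSING_TAG = re.compile(rf"</{TAG_NAME}\s*>", re.ASCII)

def is_open_tag(s):
    return OPEN_TAG.fullmatch(s) is not None

def is_closing_tag(s):
    return CLOSING_TAG.fullmatch(s) is not None
```

Because each attribute must begin with whitespace, a string like `<a href='bar'title=title>` fails to match, which agrees with the "Missing whitespace" example in this section.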
Here are some simple open tags: Here are some simple open tags:
. .
<a><bab><c2c> <a><bab><c2c>
. .
<p><a><bab><c2c></p> <p><a><bab><c2c></p>
. .
Empty elements: Empty elements:
. .
<a/><b2/> <a/><b2/>
. .
<p><a/><b2/></p> <p><a/><b2/></p>
. .
[Whitespace](#whitespace) is allowed: [Whitespace] is allowed:
. .
<a /><b2 <a /><b2
data="foo" > data="foo" >
. .
<p><a /><b2 <p><a /><b2
data="foo" ></p> data="foo" ></p>
. .
With attributes: With attributes:
skipping to change at line 6781 skipping to change at line 6746
. .
Illegal attribute values: Illegal attribute values:
. .
<a href="hi'> <a href=hi'> <a href="hi'> <a href=hi'>
. .
<p>&lt;a href=&quot;hi'&gt; &lt;a href=hi'&gt;</p> <p>&lt;a href=&quot;hi'&gt; &lt;a href=hi'&gt;</p>
. .
Illegal [whitespace](#whitespace): Illegal [whitespace]:
. .
< a>< < a><
foo><bar/ > foo><bar/ >
. .
<p>&lt; a&gt;&lt; <p>&lt; a&gt;&lt;
foo&gt;&lt;bar/ &gt;</p> foo&gt;&lt;bar/ &gt;</p>
. .
Missing [whitespace](#whitespace): Missing [whitespace]:
. .
<a href='bar'title=title> <a href='bar'title=title>
. .
<p>&lt;a href='bar'title=title&gt;</p> <p>&lt;a href='bar'title=title&gt;</p>
. .
Closing tags: Closing tags:
. .
skipping to change at line 6833 skipping to change at line 6798
<p>foo <!-- this is a <p>foo <!-- this is a
comment - with hyphen --></p> comment - with hyphen --></p>
. .
. .
foo <!-- not a comment -- two hyphens --> foo <!-- not a comment -- two hyphens -->
. .
<p>foo &lt;!-- not a comment -- two hyphens --&gt;</p> <p>foo &lt;!-- not a comment -- two hyphens --&gt;</p>
. .
Not comments:
.
foo <!--> foo -->
foo <!-- foo--->
.
<p>foo &lt;!--&gt; foo --&gt;</p>
<p>foo &lt;!-- foo---&gt;</p>
.
Processing instructions: Processing instructions:
. .
foo <?php echo $a; ?> foo <?php echo $a; ?>
. .
<p>foo <?php echo $a; ?></p> <p>foo <?php echo $a; ?></p>
. .
Declarations: Declarations:
skipping to change at line 6895 skipping to change at line 6871
. .
foo foo
baz baz
. .
<p>foo<br /> <p>foo<br />
baz</p> baz</p>
. .
For a more visible alternative, a backslash before the For a more visible alternative, a backslash before the
[line ending](#line-ending) may be used instead of two spaces: [line ending] may be used instead of two spaces:
. .
foo\ foo\
baz baz
. .
<p>foo<br /> <p>foo<br />
baz</p> baz</p>
. .
More than two spaces can be used: More than two spaces can be used:
skipping to change at line 7019 skipping to change at line 6995
### foo ### foo
. .
<h3>foo</h3> <h3>foo</h3>
. .
## Soft line breaks ## Soft line breaks
A regular line break (not in a code span or HTML tag) that is not A regular line break (not in a code span or HTML tag) that is not
preceded by two or more spaces is parsed as a softbreak. (A preceded by two or more spaces is parsed as a softbreak. (A
softbreak may be rendered in HTML either as a softbreak may be rendered in HTML either as a
[line ending](#line-ending) or as a space. The result will be the same [line ending] or as a space. The result will be the same
in browsers. In the examples here, a [line ending](#line-ending) will in browsers. In the examples here, a [line ending] will be used.)
be used.)
. .
foo foo
baz baz
. .
<p>foo <p>foo
baz</p> baz</p>
. .
Spaces at the end of the line and beginning of the next line are Spaces at the end of the line and beginning of the next line are
skipping to change at line 7256 skipping to change at line 7231
list_item list_item
paragraph paragraph
str "Qui " str "Qui "
emph emph
str "quodsi iracundia" str "quodsi iracundia"
list_item list_item
paragraph paragraph
str "aliquando id" str "aliquando id"
``` ```
Notice how the [line ending](#line-ending) in the first paragraph has Notice how the [line ending] in the first paragraph has
been parsed as a `softbreak`, and the asterisks in the first list item been parsed as a `softbreak`, and the asterisks in the first list item
have become an `emph`. have become an `emph`.
The document can be rendered as HTML, or in any other format, given The document can be rendered as HTML, or in any other format, given
an appropriate renderer. an appropriate renderer.
 End of changes. 153 change blocks. 
310 lines changed or deleted 284 lines changed or added
