spec.txt   spec.txt 
--- ---
title: CommonMark Spec title: CommonMark Spec
author: John MacFarlane author: John MacFarlane
version: 0.22 version: 0.23
date: 2015-08-23 date: 2015-12-29
license: '[CC-BY-SA 4.0](http://creativecommons.org/licenses/by-sa/4.0/)' license: '[CC-BY-SA 4.0](http://creativecommons.org/licenses/by-sa/4.0/)'
... ...
# Introduction # Introduction
## What is Markdown? ## What is Markdown?
Markdown is a plain text format for writing structured documents, Markdown is a plain text format for writing structured documents,
based on conventions used for indicating formatting in email and based on conventions used for indicating formatting in email and
usenet posts. It was developed in 2004 by John Gruber, who wrote usenet posts. It was developed in 2004 by John Gruber, who wrote
skipping to change at line 39 skipping to change at line 39
1. How much indentation is needed for a sublist? The spec says that 1. How much indentation is needed for a sublist? The spec says that
continuation paragraphs need to be indented four spaces, but is continuation paragraphs need to be indented four spaces, but is
not fully explicit about sublists. It is natural to think that not fully explicit about sublists. It is natural to think that
they, too, must be indented four spaces, but `Markdown.pl` does they, too, must be indented four spaces, but `Markdown.pl` does
not require that. This is hardly a "corner case," and divergences not require that. This is hardly a "corner case," and divergences
between implementations on this issue often lead to surprises for between implementations on this issue often lead to surprises for
users in real documents. (See [this comment by John users in real documents. (See [this comment by John
Gruber](http://article.gmane.org/gmane.text.markdown.general/1997).) Gruber](http://article.gmane.org/gmane.text.markdown.general/1997).)
2. Is a blank line needed before a block quote or header? 2. Is a blank line needed before a block quote or heading?
Most implementations do not require the blank line. However, Most implementations do not require the blank line. However,
this can lead to unexpected results in hard-wrapped text, and this can lead to unexpected results in hard-wrapped text, and
also to ambiguities in parsing (note that some implementations also to ambiguities in parsing (note that some implementations
put the header inside the blockquote, while others do not). put the heading inside the blockquote, while others do not).
(John Gruber has also spoken [in favor of requiring the blank (John Gruber has also spoken [in favor of requiring the blank
lines](http://article.gmane.org/gmane.text.markdown.general/2146).) lines](http://article.gmane.org/gmane.text.markdown.general/2146).)
3. Is a blank line needed before an indented code block? 3. Is a blank line needed before an indented code block?
(`Markdown.pl` requires it, but this is not mentioned in the (`Markdown.pl` requires it, but this is not mentioned in the
documentation, and some implementations do not require it.) documentation, and some implementations do not require it.)
``` markdown ``` markdown
paragraph paragraph
code? code?
skipping to change at line 88 skipping to change at line 88
[here](http://article.gmane.org/gmane.text.markdown.general/2554).) [here](http://article.gmane.org/gmane.text.markdown.general/2554).)
5. Can list markers be indented? Can ordered list markers be right-aligned? 5. Can list markers be indented? Can ordered list markers be right-aligned?
``` markdown ``` markdown
8. item 1 8. item 1
9. item 2 9. item 2
10. item 2a 10. item 2a
``` ```
6. Is this one list with a horizontal rule in its second item, 6. Is this one list with a thematic break in its second item,
or two lists separated by a horizontal rule? or two lists separated by a thematic break?
``` markdown ``` markdown
* a * a
* * * * * * * * * *
* b * b
``` ```
7. When list markers change from numbers to bullets, do we have 7. When list markers change from numbers to bullets, do we have
two lists or one? (The Markdown syntax description suggests two, two lists or one? (The Markdown syntax description suggests two,
but the perl scripts and many other implementations produce one.) but the perl scripts and many other implementations produce one.)
skipping to change at line 131 skipping to change at line 131
``` ```
10. What are the precedence rules between block-level and inline-level 10. What are the precedence rules between block-level and inline-level
structure? For example, how should the following be parsed? structure? For example, how should the following be parsed?
``` markdown ``` markdown
- `a long code span can contain a hyphen like this - `a long code span can contain a hyphen like this
- and it can screw things up` - and it can screw things up`
``` ```
11. Can list items include section headers? (`Markdown.pl` does not 11. Can list items include section headings? (`Markdown.pl` does not
allow this, but does allow blockquotes to include headers.) allow this, but does allow blockquotes to include headings.)
``` markdown ``` markdown
- # Heading - # Heading
``` ```
12. Can list items be empty? 12. Can list items be empty?
``` markdown ``` markdown
* a * a
* *
skipping to change at line 327 skipping to change at line 327
## Insecure characters ## Insecure characters
For security reasons, the Unicode character `U+0000` must be replaced For security reasons, the Unicode character `U+0000` must be replaced
with the replacement character (`U+FFFD`). with the replacement character (`U+FFFD`).
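
As a rough illustration of the rule above, a preprocessing pass could look like the following. This is only a sketch; the function name `replace_insecure` is hypothetical and not part of the spec.

```python
def replace_insecure(text: str) -> str:
    """Replace every U+0000 with the replacement character U+FFFD.

    Hypothetical helper: the spec states the requirement, not this function.
    """
    return text.replace("\u0000", "\ufffd")
```
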
# Blocks and inlines # Blocks and inlines
We can think of a document as a sequence of We can think of a document as a sequence of
[blocks](@block)---structural elements like paragraphs, block [blocks](@block)---structural elements like paragraphs, block
quotations, lists, headers, rules, and code blocks. Some blocks (like quotations, lists, headings, rules, and code blocks. Some blocks (like
block quotes and list items) contain other blocks; others (like block quotes and list items) contain other blocks; others (like
headers and paragraphs) contain [inline](@inline) content---text, headings and paragraphs) contain [inline](@inline) content---text,
links, emphasized text, images, code, and so on. links, emphasized text, images, code, and so on.
## Precedence ## Precedence
Indicators of block structure always take precedence over indicators Indicators of block structure always take precedence over indicators
of inline structure. So, for example, the following is a list with of inline structure. So, for example, the following is a list with
two items, not a list with one item containing a code span: two items, not a list with one item containing a code span:
. .
- `one - `one
- two` - two`
. .
<ul> <ul>
<li>`one</li> <li>`one</li>
<li>two`</li> <li>two`</li>
</ul> </ul>
. .
This means that parsing can proceed in two steps: first, the block This means that parsing can proceed in two steps: first, the block
structure of the document can be discerned; second, text lines inside structure of the document can be discerned; second, text lines inside
paragraphs, headers, and other block constructs can be parsed for inline paragraphs, headings, and other block constructs can be parsed for inline
structure. The second step requires information about link reference structure. The second step requires information about link reference
definitions that will be available only at the end of the first definitions that will be available only at the end of the first
step. Note that the first step requires processing lines in sequence, step. Note that the first step requires processing lines in sequence,
but the second can be parallelized, since the inline parsing of but the second can be parallelized, since the inline parsing of
one block element does not affect the inline parsing of any other. one block element does not affect the inline parsing of any other.
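
The two-step strategy can be sketched with a deliberately tiny toy parser. It assumes that every input line starting with `- ` is a list item and that the only inline construct is a single-backtick code span. The names `parse_blocks` and `parse_inlines` are hypothetical, and a real implementation handles far more cases.

```python
import re

def parse_blocks(lines):
    # Step 1 (toy): discern block structure line by line.
    # Assumption: each line beginning with "- " is its own list item.
    return [line[2:] for line in lines if line.startswith("- ")]

def parse_inlines(text):
    # Step 2 (toy): parse inline structure inside one block only.
    # Assumption: only single-backtick code spans are recognized.
    return re.sub(r"`([^`]+)`", r"<code>\1</code>", text)

# Each item is parsed for inlines independently, so the backticks in
# different items can never pair up into one code span.
items = [parse_inlines(item) for item in parse_blocks(["- `one", "- two`"])]
```

Because step 2 is applied per block, the list comprehension could in principle be replaced by a parallel map.
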
## Container blocks and leaf blocks ## Container blocks and leaf blocks
We can divide blocks into two types: We can divide blocks into two types:
[container block](@container-block)s, [container block](@container-block)s,
which can contain other blocks, and [leaf block](@leaf-block)s, which can contain other blocks, and [leaf block](@leaf-block)s,
which cannot. which cannot.
# Leaf blocks # Leaf blocks
This section describes the different kinds of leaf block that make up a This section describes the different kinds of leaf block that make up a
Markdown document. Markdown document.
## Horizontal rules ## Thematic breaks
A line consisting of 0-3 spaces of indentation, followed by a sequence A line consisting of 0-3 spaces of indentation, followed by a sequence
of three or more matching `-`, `_`, or `*` characters, each followed of three or more matching `-`, `_`, or `*` characters, each followed
optionally by any number of spaces, forms a optionally by any number of spaces, forms a
[horizontal rule](@horizontal-rule). [thematic break](@thematic-break).
. .
*** ***
--- ---
___ ___
. .
<hr /> <hr />
<hr /> <hr />
<hr /> <hr />
. .
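
The definition above can be approximated with a single regular expression. This is a sketch under stated assumptions: the line has already had its line ending removed, only spaces count as whitespace, and the name `is_thematic_break` is hypothetical. It checks the line in isolation and ignores precedence against setext headings and list items.

```python
import re

# 0-3 spaces of indentation, then one of - _ * repeated at least three
# times in total, each occurrence optionally followed by spaces.
_THEMATIC_BREAK = re.compile(r" {0,3}([-_*]) *(?:\1 *){2,}")

def is_thematic_break(line: str) -> bool:
    """Hypothetical helper: True if the line alone has thematic-break form."""
    return _THEMATIC_BREAK.fullmatch(line) is not None
```
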
skipping to change at line 492 skipping to change at line 492
a------ a------
---a--- ---a---
. .
<p>_ _ _ _ a</p> <p>_ _ _ _ a</p>
<p>a------</p> <p>a------</p>
<p>---a---</p> <p>---a---</p>
. .
It is required that all of the [non-whitespace character]s be the same. It is required that all of the [non-whitespace character]s be the same.
So, this is not a horizontal rule: So, this is not a thematic break:
. .
*-* *-*
. .
<p><em>-</em></p> <p><em>-</em></p>
. .
Horizontal rules do not need blank lines before or after: Thematic breaks do not need blank lines before or after:
. .
- foo - foo
*** ***
- bar - bar
. .
<ul> <ul>
<li>foo</li> <li>foo</li>
</ul> </ul>
<hr /> <hr />
<ul> <ul>
<li>bar</li> <li>bar</li>
</ul> </ul>
. .
Horizontal rules can interrupt a paragraph: Thematic breaks can interrupt a paragraph:
. .
Foo Foo
*** ***
bar bar
. .
<p>Foo</p> <p>Foo</p>
<hr /> <hr />
<p>bar</p> <p>bar</p>
. .
If a line of dashes that meets the above conditions for being a If a line of dashes that meets the above conditions for being a
horizontal rule could also be interpreted as the underline of a [setext thematic break could also be interpreted as the underline of a [setext
header], the interpretation as a heading], the interpretation as a
[setext header] takes precedence. Thus, for example, [setext heading] takes precedence. Thus, for example,
this is a setext header, not a paragraph followed by a horizontal rule: this is a setext heading, not a paragraph followed by a thematic break:
. .
Foo Foo
--- ---
bar bar
. .
<h2>Foo</h2> <h2>Foo</h2>
<p>bar</p> <p>bar</p>
. .
When both a horizontal rule and a list item are possible When both a thematic break and a list item are possible
interpretations of a line, the horizontal rule takes precedence: interpretations of a line, the thematic break takes precedence:
. .
* Foo * Foo
* * * * * *
* Bar * Bar
. .
<ul> <ul>
<li>Foo</li> <li>Foo</li>
</ul> </ul>
<hr /> <hr />
<ul> <ul>
<li>Bar</li> <li>Bar</li>
</ul> </ul>
. .
If you want a horizontal rule in a list item, use a different bullet: If you want a thematic break in a list item, use a different bullet:
. .
- Foo - Foo
- * * * - * * *
. .
<ul> <ul>
<li>Foo</li> <li>Foo</li>
<li> <li>
<hr /> <hr />
</li> </li>
</ul> </ul>
. .
## ATX headers ## ATX headings
An [ATX header](@atx-header) An [ATX heading](@atx-heading)
consists of a string of characters, parsed as inline content, between an consists of a string of characters, parsed as inline content, between an
opening sequence of 1--6 unescaped `#` characters and an optional opening sequence of 1--6 unescaped `#` characters and an optional
closing sequence of any number of unescaped `#` characters. closing sequence of any number of unescaped `#` characters.
The opening sequence of `#` characters cannot be followed directly by a The opening sequence of `#` characters must be followed by a
[non-whitespace character]. The optional closing sequence of `#`s must be [space] or by the end of line. The optional closing sequence of `#`s must be
preceded by a [space] and may be followed by spaces only. The opening preceded by a [space] and may be followed by spaces only. The opening
`#` character may be indented 0-3 spaces. The raw contents of the `#` character may be indented 0-3 spaces. The raw contents of the
header are stripped of leading and trailing spaces before being parsed heading are stripped of leading and trailing spaces before being parsed
as inline content. The header level is equal to the number of `#` as inline content. The heading level is equal to the number of `#`
characters in the opening sequence. characters in the opening sequence.
Simple headers: Simple headings:
. .
# foo # foo
## foo ## foo
### foo ### foo
#### foo #### foo
##### foo ##### foo
###### foo ###### foo
. .
<h1>foo</h1> <h1>foo</h1>
<h2>foo</h2> <h2>foo</h2>
<h3>foo</h3> <h3>foo</h3>
<h4>foo</h4> <h4>foo</h4>
<h5>foo</h5> <h5>foo</h5>
<h6>foo</h6> <h6>foo</h6>
. .
More than six `#` characters is not a header: More than six `#` characters is not a heading:
. .
####### foo ####### foo
. .
<p>####### foo</p> <p>####### foo</p>
. .
At least one space is required between the `#` characters and the At least one space is required between the `#` characters and the
header's contents, unless the header is empty. Note that many heading's contents, unless the heading is empty. Note that many
implementations currently do not require the space. However, the implementations currently do not require the space. However, the
space was required by the space was required by the
[original ATX implementation](http://www.aaronsw.com/2002/atx/atx.py), [original ATX implementation](http://www.aaronsw.com/2002/atx/atx.py),
and it helps prevent things like the following from being parsed as and it helps prevent things like the following from being parsed as
headers: headings:
. .
#5 bolt #5 bolt
#foobar #hashtag
. .
<p>#5 bolt</p> <p>#5 bolt</p>
<p>#foobar</p> <p>#hashtag</p>
. .
This is not a header, because the first `#` is escaped: A tab will not work:
.
#→foo
.
<p>#→foo</p>
.
This is not a heading, because the first `#` is escaped:
. .
\## foo \## foo
. .
<p>## foo</p> <p>## foo</p>
. .
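
The ATX heading rules described so far can be sketched as follows, using the 0.23 wording (the opening sequence must be followed by a space or the end of the line). The function name `parse_atx_heading` is hypothetical. The sketch returns the raw heading contents and does not perform inline parsing, so backslash escapes inside the contents are left untouched.

```python
import re

# Opening sequence: 0-3 spaces, 1-6 '#', then spaces or end of line.
_ATX_OPEN = re.compile(r" {0,3}(#{1,6})(?: +|$)")
# Optional closing sequence: '#'s preceded by a space (or at the very
# start of the contents) and followed by spaces only.
_ATX_CLOSE = re.compile(r"(?:^| +)#+ *$")

def parse_atx_heading(line):
    """Hypothetical helper: return (level, raw_contents) or None.

    Assumes the line ending has already been removed.
    """
    m = _ATX_OPEN.match(line)
    if m is None:
        return None
    level = len(m.group(1))
    contents = _ATX_CLOSE.sub("", line[m.end():])
    return level, contents.strip(" ")
```

For instance, `parse_atx_heading("#5 bolt")` and `parse_atx_heading("#\tfoo")` both give `None`, since neither a digit nor a tab satisfies the space requirement.
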
Contents are parsed as inlines: Contents are parsed as inlines:
. .
skipping to change at line 714 skipping to change at line 722
Spaces are allowed after the closing sequence: Spaces are allowed after the closing sequence:
. .
### foo ### ### foo ###
. .
<h3>foo</h3> <h3>foo</h3>
. .
A sequence of `#` characters with anything but [space]s following it A sequence of `#` characters with anything but [space]s following it
is not a closing sequence, but counts as part of the contents of the is not a closing sequence, but counts as part of the contents of the
header: heading:
. .
### foo ### b ### foo ### b
. .
<h3>foo ### b</h3> <h3>foo ### b</h3>
. .
The closing sequence must be preceded by a space: The closing sequence must be preceded by a space:
. .
skipping to change at line 743 skipping to change at line 751
. .
### foo \### ### foo \###
## foo #\## ## foo #\##
# foo \# # foo \#
. .
<h3>foo ###</h3> <h3>foo ###</h3>
<h2>foo ###</h2> <h2>foo ###</h2>
<h1>foo #</h1> <h1>foo #</h1>
. .
ATX headers need not be separated from surrounding content by blank ATX headings need not be separated from surrounding content by blank
lines, and they can interrupt paragraphs: lines, and they can interrupt paragraphs:
. .
**** ****
## foo ## foo
**** ****
. .
<hr /> <hr />
<h2>foo</h2> <h2>foo</h2>
<hr /> <hr />
skipping to change at line 766 skipping to change at line 774
. .
Foo bar Foo bar
# baz # baz
Bar foo Bar foo
. .
<p>Foo bar</p> <p>Foo bar</p>
<h1>baz</h1> <h1>baz</h1>
<p>Bar foo</p> <p>Bar foo</p>
. .
ATX headers can be empty: ATX headings can be empty:
. .
## ##
# #
### ### ### ###
. .
<h2></h2> <h2></h2>
<h1></h1> <h1></h1>
<h3></h3> <h3></h3>
. .
## Setext headers ## Setext headings
A [setext header](@setext-header) A [setext heading](@setext-heading)
consists of a line of text, containing at least one [non-whitespace character], consists of a line of text, containing at least one [non-whitespace character],
with no more than 3 spaces indentation, followed by a [setext header with no more than 3 spaces indentation, followed by a [setext heading
underline]. The line of text must be underline]. The line of text must be
one that, were it not followed by the setext header underline, one that, were it not followed by the setext heading underline,
would be interpreted as part of a paragraph: it cannot be would be interpreted as part of a paragraph: it cannot be
interpretable as a [code fence], [ATX header][ATX headers], interpretable as a [code fence], [ATX heading][ATX headings],
[block quote][block quotes], [horizontal rule][horizontal rules], [block quote][block quotes], [thematic break][thematic breaks],
[list item][list items], or [HTML block][HTML blocks]. [list item][list items], or [HTML block][HTML blocks].
A [setext header underline](@setext-header-underline) is a sequence of A [setext heading underline](@setext-heading-underline) is a sequence of
`=` characters or a sequence of `-` characters, with no more than 3 `=` characters or a sequence of `-` characters, with no more than 3
spaces indentation and any number of trailing spaces. If a line spaces indentation and any number of trailing spaces. If a line
containing a single `-` can be interpreted as an containing a single `-` can be interpreted as an
empty [list items], it should be interpreted this way empty [list items], it should be interpreted this way
and not as a [setext header underline]. and not as a [setext heading underline].
The header is a level 1 header if `=` characters are used in the The heading is a level 1 heading if `=` characters are used in the
[setext header underline], and a level 2 [setext heading underline], and a level 2
header if `-` characters are used. The contents of the header are the heading if `-` characters are used. The contents of the heading are the
result of parsing the first line as Markdown inline content. result of parsing the first line as Markdown inline content.
In general, a setext header need not be preceded or followed by a In general, a setext heading need not be preceded or followed by a
blank line. However, it cannot interrupt a paragraph, so when a blank line. However, it cannot interrupt a paragraph, so when a
setext header comes after a paragraph, a blank line is needed between setext heading comes after a paragraph, a blank line is needed between
them. them.
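
The underline definition and the level rule can be sketched as below. The name `setext_underline_level` is hypothetical. The sketch classifies a single line only; whether that line actually forms a heading still depends on the preceding line being paragraph text, and the empty-list-item exception for a single `-` is not modelled.

```python
import re

# 0-3 spaces of indentation, a run of '=' or a run of '-', trailing spaces.
_SETEXT_UNDERLINE = re.compile(r" {0,3}(=+|-+) *")

def setext_underline_level(line):
    """Hypothetical helper: 1 for '=', 2 for '-', None if not an underline."""
    m = _SETEXT_UNDERLINE.fullmatch(line)
    if m is None:
        return None
    return 1 if m.group(1)[0] == "=" else 2
```
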
Simple examples: Simple examples:
. .
Foo *bar* Foo *bar*
========= =========
Foo *bar* Foo *bar*
--------- ---------
skipping to change at line 833 skipping to change at line 841
Foo Foo
------------------------- -------------------------
Foo Foo
= =
. .
<h2>Foo</h2> <h2>Foo</h2>
<h1>Foo</h1> <h1>Foo</h1>
. .
The header content can be indented up to three spaces, and need The heading content can be indented up to three spaces, and need
not line up with the underlining: not line up with the underlining:
. .
Foo Foo
--- ---
Foo Foo
----- -----
Foo Foo
skipping to change at line 868 skipping to change at line 876
--- ---
. .
<pre><code>Foo <pre><code>Foo
--- ---
Foo Foo
</code></pre> </code></pre>
<hr /> <hr />
. .
The setext header underline can be indented up to three spaces, and The setext heading underline can be indented up to three spaces, and
may have trailing spaces: may have trailing spaces:
. .
Foo Foo
---- ----
. .
<h2>Foo</h2> <h2>Foo</h2>
. .
Four spaces is too much: Four spaces is too much:
. .
Foo Foo
--- ---
. .
<p>Foo <p>Foo
---</p> ---</p>
. .
The setext header underline cannot contain internal spaces: The setext heading underline cannot contain internal spaces:
. .
Foo Foo
= = = =
Foo Foo
--- - --- -
. .
<p>Foo <p>Foo
= =</p> = =</p>
skipping to change at line 922 skipping to change at line 930
Nor does a backslash at the end: Nor does a backslash at the end:
. .
Foo\ Foo\
---- ----
. .
<h2>Foo\</h2> <h2>Foo\</h2>
. .
Since indicators of block structure take precedence over Since indicators of block structure take precedence over
indicators of inline structure, the following are setext headers: indicators of inline structure, the following are setext headings:
. .
`Foo `Foo
---- ----
` `
<a title="a lot <a title="a lot
--- ---
of dashes"/> of dashes"/>
. .
<h2>`Foo</h2> <h2>`Foo</h2>
<p>`</p> <p>`</p>
<h2>&lt;a title=&quot;a lot</h2> <h2>&lt;a title=&quot;a lot</h2>
<p>of dashes&quot;/&gt;</p> <p>of dashes&quot;/&gt;</p>
. .
The setext header underline cannot be a [lazy continuation The setext heading underline cannot be a [lazy continuation
line] in a list item or block quote: line] in a list item or block quote:
. .
> Foo > Foo
--- ---
. .
<blockquote> <blockquote>
<p>Foo</p> <p>Foo</p>
</blockquote> </blockquote>
<hr /> <hr />
skipping to change at line 962 skipping to change at line 970
. .
- Foo - Foo
--- ---
. .
<ul> <ul>
<li>Foo</li> <li>Foo</li>
</ul> </ul>
<hr /> <hr />
. .
A setext header cannot interrupt a paragraph: A setext heading cannot interrupt a paragraph:
. .
Foo Foo
Bar Bar
--- ---
Foo Foo
Bar Bar
=== ===
. .
skipping to change at line 997 skipping to change at line 1005
Bar Bar
--- ---
Baz Baz
. .
<hr /> <hr />
<h2>Foo</h2> <h2>Foo</h2>
<h2>Bar</h2> <h2>Bar</h2>
<p>Baz</p> <p>Baz</p>
. .
Setext headers cannot be empty: Setext headings cannot be empty:
. .
==== ====
. .
<p>====</p> <p>====</p>
. .
Setext header text lines must not be interpretable as block Setext heading text lines must not be interpretable as block
constructs other than paragraphs. So, the line of dashes constructs other than paragraphs. So, the line of dashes
in these examples gets interpreted as a horizontal rule: in these examples gets interpreted as a thematic break:
. .
--- ---
--- ---
. .
<hr /> <hr />
<hr /> <hr />
. .
. .
skipping to change at line 1047 skipping to change at line 1055
. .
> foo > foo
----- -----
. .
<blockquote> <blockquote>
<p>foo</p> <p>foo</p>
</blockquote> </blockquote>
<hr /> <hr />
. .
If you want a header with `> foo` as its literal text, you can If you want a heading with `> foo` as its literal text, you can
use backslash escapes: use backslash escapes:
. .
\> foo \> foo
------ ------
. .
<h2>&gt; foo</h2> <h2>&gt; foo</h2>
. .
## Indented code blocks ## Indented code blocks
skipping to change at line 1189 skipping to change at line 1197
. .
<pre><code>foo <pre><code>foo
</code></pre> </code></pre>
<p>bar</p> <p>bar</p>
. .
And indented code can occur immediately before and after other kinds of And indented code can occur immediately before and after other kinds of
blocks: blocks:
. .
# Header # Heading
foo foo
Header Heading
------ ------
foo foo
---- ----
. .
<h1>Header</h1> <h1>Heading</h1>
<pre><code>foo <pre><code>foo
</code></pre> </code></pre>
<h2>Header</h2> <h2>Heading</h2>
<pre><code>foo <pre><code>foo
</code></pre> </code></pre>
<hr /> <hr />
. .
The first line can be indented more than four spaces: The first line can be indented more than four spaces:
. .
foo foo
bar bar
skipping to change at line 1357 skipping to change at line 1365
aaa aaa
~~~ ~~~
~~~~ ~~~~
. .
<pre><code>aaa <pre><code>aaa
~~~ ~~~
</code></pre> </code></pre>
. .
Unclosed code blocks are closed by the end of the document Unclosed code blocks are closed by the end of the document
(or the enclosing [block quote] or [list item]): (or the enclosing [block quote][block quotes] or [list item][list items]):
. .
``` ```
. .
<pre><code></code></pre> <pre><code></code></pre>
. .
. .
````` `````
skipping to change at line 1977 skipping to change at line 1985
. .
<style <style
type="text/css"> type="text/css">
h1 {color:red;} h1 {color:red;}
p {color:blue;} p {color:blue;}
</style> </style>
. .
If there is no matching end tag, the block will end at the If there is no matching end tag, the block will end at the
end of the document (or the enclosing [block quote] or end of the document (or the enclosing [block quote][block quotes]
[list item]): or [list item][list items]):
. .
<style <style
type="text/css"> type="text/css">
foo foo
. .
<style <style
type="text/css"> type="text/css">
skipping to change at line 2536 skipping to change at line 2544
Foo Foo
[bar]: /baz [bar]: /baz
[bar] [bar]
. .
<p>Foo <p>Foo
[bar]: /baz</p> [bar]: /baz</p>
<p>[bar]</p> <p>[bar]</p>
. .
However, it can directly follow other block elements, such as headers However, it can directly follow other block elements, such as headings
and horizontal rules, and it need not be followed by a blank line. and thematic breaks, and it need not be followed by a blank line.
. .
# [Foo] # [Foo]
[foo]: /url [foo]: /url
> bar > bar
. .
<h1><a href="/url">Foo</a></h1> <h1><a href="/url">Foo</a></h1>
<blockquote> <blockquote>
<p>bar</p> <p>bar</p>
</blockquote> </blockquote>
skipping to change at line 3400 skipping to change at line 3408
<pre><code>bar <pre><code>bar
</code></pre> </code></pre>
<p>baz</p> <p>baz</p>
<blockquote> <blockquote>
<p>bam</p> <p>bam</p>
</blockquote> </blockquote>
</li> </li>
</ol> </ol>
. .
A list item that contains an indented code block will preserve
empty lines within the code block verbatim, unless there are two
or more empty lines in a row (since as described above, two
blank lines end the list):
.
- Foo
bar
baz
.
<ul>
<li>
<p>Foo</p>
<pre><code>bar
baz
</code></pre>
</li>
</ul>
.
.
- Foo
bar
baz
.
<ul>
<li>
<p>Foo</p>
<pre><code>bar
</code></pre>
</li>
</ul>
<pre><code> baz
</code></pre>
.
Note that ordered list start numbers must be nine digits or less: Note that ordered list start numbers must be nine digits or less:
. .
123456789. ok 123456789. ok
. .
<ol start="123456789"> <ol start="123456789">
<li>ok</li> <li>ok</li>
</ol> </ol>
. .
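
A minimal sketch of the nine-digit limit, assuming an ordered list marker is a run of ASCII digits followed by `.` or `)` and then a space or the end of the line. The name `ordered_list_start` is hypothetical, and the sketch ignores the other list item rules.

```python
import re

# 0-3 spaces, 1-9 ASCII digits, '.' or ')', then a space or end of line.
_ORDERED_MARKER = re.compile(r" {0,3}([0-9]{1,9})[.)](?: |$)")

def ordered_list_start(line):
    """Hypothetical helper: return the start number, or None if no marker."""
    m = _ORDERED_MARKER.match(line)
    return int(m.group(1)) if m else None
```
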
skipping to change at line 3967 skipping to change at line 4016
<li> <li>
<ol start="2"> <ol start="2">
<li>foo</li> <li>foo</li>
</ol> </ol>
</li> </li>
</ul> </ul>
</li> </li>
</ol> </ol>
. .
A list item can contain a header: A list item can contain a heading:
. .
- # Foo - # Foo
- Bar - Bar
--- ---
baz baz
. .
<ul> <ul>
<li> <li>
<h1>Foo</h1> <h1>Foo</h1>
skipping to change at line 4778 skipping to change at line 4827
Escaped characters are treated as regular characters and do Escaped characters are treated as regular characters and do
not have their usual Markdown meanings: not have their usual Markdown meanings:
. .
\*not emphasized* \*not emphasized*
\<br/> not a tag \<br/> not a tag
\[not a link](/foo) \[not a link](/foo)
\`not code` \`not code`
1\. not a list 1\. not a list
\* not a list \* not a list
\# not a header \# not a heading
\[foo]: /url "not a reference" \[foo]: /url "not a reference"
. .
<p>*not emphasized* <p>*not emphasized*
&lt;br/&gt; not a tag &lt;br/&gt; not a tag
[not a link](/foo) [not a link](/foo)
`not code` `not code`
1. not a list 1. not a list
* not a list * not a list
# not a header # not a heading
[foo]: /url &quot;not a reference&quot;</p> [foo]: /url &quot;not a reference&quot;</p>
. .
If a backslash is itself escaped, the following character is not: If a backslash is itself escaped, the following character is not:
. .
\\*emphasis* \\*emphasis*
. .
<p>\<em>emphasis</em></p> <p>\<em>emphasis</em></p>
. .
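
The effect of backslash escapes on the text itself can be sketched like this, assuming that the escapable characters are exactly the ASCII punctuation characters in Python's `string.punctuation`. The name `unescape` is hypothetical. A real parser has to apply escapes during inline parsing rather than as a separate pass, because an escaped character must also lose its Markdown meaning.

```python
import re
import string

# A backslash followed by one ASCII punctuation character.
_ESCAPE = re.compile(r"\\([" + re.escape(string.punctuation) + r"])")

def unescape(text: str) -> str:
    """Hypothetical helper: drop the backslash from each escape sequence."""
    return _ESCAPE.sub(r"\1", text)
```

Matching proceeds left to right, so in `\\*emphasis*` the first backslash escapes the second and the `*` is left unescaped, as in the example above.
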
skipping to change at line 4872 skipping to change at line 4921
. .
``` foo\+bar ``` foo\+bar
foo foo
``` ```
. .
<pre><code class="language-foo+bar">foo <pre><code class="language-foo+bar">foo
</code></pre> </code></pre>
. .
## Entities ## Entity and numeric character references
With the goal of making this standard as HTML-agnostic as possible, all All valid HTML entity references and numeric character
valid HTML entities (except in code blocks and code spans) references, except those occurring in code blocks, code spans,
are recognized as such and converted into Unicode characters before and raw HTML, are recognized as such and treated as equivalent to the
they are stored in the AST. This means that renderers to formats other corresponding Unicode characters. Conforming CommonMark parsers
than HTML need not be HTML-entity aware. HTML renderers may either escape need not store information about whether a particular character
Unicode characters as entities or leave them as they are. (However, was represented in the source using a Unicode character or
`"`, `&`, `<`, and `>` must always be rendered as entities.) an entity reference.
[Named entities](@name-entities) consist of `&` + any of the valid [Entity references](@entity-references) consist of `&` + any of the valid
HTML5 entity names + `;`. The HTML5 entity names + `;`. The
[following document](https://html.spec.whatwg.org/multipage/entities.json) document <https://html.spec.whatwg.org/multipage/entities.json>
is used as an authoritative source of the valid entity names and their is used as an authoritative source for the valid entity
corresponding code points. references and their corresponding code points.
. .
&nbsp; &amp; &copy; &AElig; &Dcaron; &nbsp; &amp; &copy; &AElig; &Dcaron;
&frac34; &HilbertSpace; &DifferentialD; &frac34; &HilbertSpace; &DifferentialD;
&ClockwiseContourIntegral; &ngE; &ClockwiseContourIntegral; &ngE;
. .
<p>  &amp; © Æ Ď <p>  &amp; © Æ Ď
¾ ℋ ⅆ ¾ ℋ ⅆ
∲ ≧̸</p> ∲ ≧̸</p>
. .
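
A sketch of entity reference lookup, using the `html5` table from Python's standard `html.entities` module as a stand-in for the `entities.json` document. The name `replace_entity_refs` is hypothetical. Keys in that table include the trailing semicolon, so requiring `;` in the pattern also excludes the legacy semicolon-less forms.

```python
import re
from html.entities import html5

# '&' + a name + ';'. The semicolon is captured so that only table keys
# ending in ';' can match.
_ENTITY_REF = re.compile(r"&([A-Za-z][A-Za-z0-9]*;)")

def replace_entity_refs(text: str) -> str:
    """Hypothetical helper: replace valid entity references, keep the rest."""
    def repl(m):
        # Unknown names are returned unchanged.
        return html5.get(m.group(1), m.group(0))
    return _ENTITY_REF.sub(repl, text)
```
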
[Decimal entities](@decimal-entities) [Decimal numeric character
consist of `&#` + a string of 1--8 arabic digits + `;`. Again, these references](@decimal-numeric-character-references)
entities need to be recognised and transformed into their corresponding consist of `&#` + a string of 1--8 arabic digits + `;`. A
Unicode code points. Invalid Unicode code points will be replaced by numeric character reference is parsed as the corresponding
Unicode character. Invalid Unicode code points will be replaced by
the "unknown code point" character (`U+FFFD`). For security reasons, the "unknown code point" character (`U+FFFD`). For security reasons,
the code point `U+0000` will also be replaced by `U+FFFD`. the code point `U+0000` will also be replaced by `U+FFFD`.
. .
&#35; &#1234; &#992; &#98765432; &#0; &#35; &#1234; &#992; &#98765432; &#0;
. .
<p># Ӓ Ϡ � �</p> <p># Ӓ Ϡ � �</p>
. .
[Hexadecimal entities](@hexadecimal-entities) consist of `&#` + either [Hexadecimal numeric character
`X` or `x` + a string of 1-8 hexadecimal digits + `;`. They will also references](@hexadecimal-numeric-character-references) consist of `&#` +
be parsed and turned into the corresponding Unicode code points in the either `X` or `x` + a string of 1-8 hexadecimal digits + `;`.
AST. They too are parsed as the corresponding Unicode character (this
time specified with a hexadecimal numeral instead of decimal).
. .
&#X22; &#XD06; &#xcab; &#X22; &#XD06; &#xcab;
. .
<p>&quot; ആ ಫ</p> <p>&quot; ആ ಫ</p>
. .
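
Decimal and hexadecimal numeric character references can be handled by one routine, sketched below. The name `replace_numeric_refs` is hypothetical. Treating surrogate code points and values above `U+10FFFF` as invalid is an assumption of this sketch; the text above only says that invalid code points and `U+0000` become `U+FFFD`.

```python
import re

# '&#' + 1-8 decimal digits + ';', or '&#' + 'X'/'x' + 1-8 hex digits + ';'.
_NUMERIC_REF = re.compile(r"&#(?:([0-9]{1,8})|[Xx]([0-9A-Fa-f]{1,8}));")

def replace_numeric_refs(text: str) -> str:
    """Hypothetical helper: replace numeric character references."""
    def repl(m):
        if m.group(1) is not None:
            code_point = int(m.group(1))
        else:
            code_point = int(m.group(2), 16)
        # Assumption: zero, surrogates, and out-of-range values are invalid.
        if code_point == 0 or code_point > 0x10FFFF or 0xD800 <= code_point <= 0xDFFF:
            return "\ufffd"
        return chr(code_point)
    return _NUMERIC_REF.sub(repl, text)
```
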
Here are some nonentities: Here are some nonentities:
. .
&nbsp &x; &#; &#x; &ThisIsWayTooLongToBeAnEntityIsntIt; &hi?; &nbsp &x; &#; &#x;
&ThisIsWayTooLongToBeAnEntityIsntIt; &hi?;
. .
<p>&amp;nbsp &amp;x; &amp;#; &amp;#x; &amp;ThisIsWayTooLongToBeAnEntityIsntIt; &amp;hi?;</p> <p>&amp;nbsp &amp;x; &amp;#; &amp;#x;
&amp;ThisIsWayTooLongToBeAnEntityIsntIt; &amp;hi?;</p>
. .
Although HTML5 does accept some entities without a trailing semicolon Although HTML5 does accept some entity references
(such as `&copy`), these are not recognized as entities here, because it without a trailing semicolon (such as `&copy`), these are not
makes the grammar too ambiguous: recognized here, because it makes the grammar too ambiguous:
. .
&copy &copy
. .
<p>&amp;copy</p> <p>&amp;copy</p>
. .
Strings that are not on the list of HTML5 named entities are not Strings that are not on the list of HTML5 named entities are not
recognized as entities either: recognized as entity references either:
. .
&MadeUpEntity; &MadeUpEntity;
. .
<p>&amp;MadeUpEntity;</p> <p>&amp;MadeUpEntity;</p>
. .
Entities are recognized in any context besides code spans or Entity and numeric character references are recognized in any
code blocks, including raw HTML, URLs, [link title]s, and context besides code spans or code blocks or raw HTML, including
[fenced code block] [info string]s: URLs, [link title]s, and [fenced code block][] [info string]s:
. .
<a href="&ouml;&ouml;.html"> <a href="&ouml;&ouml;.html">
. .
<a href="&ouml;&ouml;.html"> <a href="&ouml;&ouml;.html">
. .
. .
[foo](/f&ouml;&ouml; "f&ouml;&ouml;") [foo](/f&ouml;&ouml; "f&ouml;&ouml;")
. .
skipping to change at line 4982 skipping to change at line 5035
. .
``` f&ouml;&ouml; ``` f&ouml;&ouml;
foo foo
``` ```
. .
<pre><code class="language-föö">foo <pre><code class="language-föö">foo
</code></pre> </code></pre>
. .
Entities are treated as literal text in code spans and code blocks: Entity and numeric character references are treated as literal
text in code spans and code blocks, and in raw HTML:
. .
`f&ouml;&ouml;` `f&ouml;&ouml;`
. .
<p><code>f&amp;ouml;&amp;ouml;</code></p> <p><code>f&amp;ouml;&amp;ouml;</code></p>
. .
. .
f&ouml;f&ouml; f&ouml;f&ouml;
. .
<pre><code>f&amp;ouml;f&amp;ouml; <pre><code>f&amp;ouml;f&amp;ouml;
</code></pre> </code></pre>
. .
.
<a href="f&ouml;f&ouml;"/>
.
<a href="f&ouml;f&ouml;"/>
.
## Code spans ## Code spans
A [backtick string](@backtick-string) A [backtick string](@backtick-string)
is a string of one or more backtick characters (`` ` ``) that is neither is a string of one or more backtick characters (`` ` ``) that is neither
preceded nor followed by a backtick. preceded nor followed by a backtick.
A [code span](@code-span) begins with a backtick string and ends with A [code span](@code-span) begins with a backtick string and ends with
a backtick string of equal length. The contents of the code span are a backtick string of equal length. The contents of the code span are
the characters between the two backtick strings, with leading and the characters between the two backtick strings, with leading and
trailing spaces and [line ending]s removed, and trailing spaces and [line ending]s removed, and
skipping to change at line 5269 skipping to change at line 5329
are a bit more complex than the ones given here.) are a bit more complex than the ones given here.)
The following rules define emphasis and strong emphasis: The following rules define emphasis and strong emphasis:
1. A single `*` character [can open emphasis](@can-open-emphasis) 1. A single `*` character [can open emphasis](@can-open-emphasis)
iff (if and only if) it is part of a [left-flanking delimiter run]. iff (if and only if) it is part of a [left-flanking delimiter run].
2. A single `_` character [can open emphasis] iff 2. A single `_` character [can open emphasis] iff
it is part of a [left-flanking delimiter run] it is part of a [left-flanking delimiter run]
and either (a) not part of a [right-flanking delimiter run] and either (a) not part of a [right-flanking delimiter run]
or (b) part of a [right-flanking delimeter run] or (b) part of a [right-flanking delimiter run]
preceded by punctuation. preceded by punctuation.
3. A single `*` character [can close emphasis](@can-close-emphasis) 3. A single `*` character [can close emphasis](@can-close-emphasis)
iff it is part of a [right-flanking delimiter run]. iff it is part of a [right-flanking delimiter run].
4. A single `_` character [can close emphasis] iff 4. A single `_` character [can close emphasis] iff
it is part of a [right-flanking delimiter run] it is part of a [right-flanking delimiter run]
and either (a) not part of a [left-flanking delimiter run] and either (a) not part of a [left-flanking delimiter run]
or (b) part of a [left-flanking delimeter run] or (b) part of a [left-flanking delimiter run]
followed by punctuation. followed by punctuation.
5. A double `**` [can open strong emphasis](@can-open-strong-emphasis) 5. A double `**` [can open strong emphasis](@can-open-strong-emphasis)
iff it is part of a [left-flanking delimiter run]. iff it is part of a [left-flanking delimiter run].
6. A double `__` [can open strong emphasis] iff 6. A double `__` [can open strong emphasis] iff
it is part of a [left-flanking delimiter run] it is part of a [left-flanking delimiter run]
and either (a) not part of a [right-flanking delimiter run] and either (a) not part of a [right-flanking delimiter run]
or (b) part of a [right-flanking delimeter run] or (b) part of a [right-flanking delimiter run]
preceded by punctuation. preceded by punctuation.
7. A double `**` [can close strong emphasis](@can-close-strong-emphasis) 7. A double `**` [can close strong emphasis](@can-close-strong-emphasis)
iff it is part of a [right-flanking delimiter run]. iff it is part of a [right-flanking delimiter run].
8. A double `__` [can close strong emphasis] iff 8. A double `__` [can close strong emphasis] iff
it is part of a [right-flanking delimiter run] it is part of a [right-flanking delimiter run]
and either (a) not part of a [left-flanking delimiter run] and either (a) not part of a [left-flanking delimiter run]
or (b) part of a [left-flanking delimeter run] or (b) part of a [left-flanking delimiter run]
followed by punctuation. followed by punctuation.
9. Emphasis begins with a delimiter that [can open emphasis] and ends 9. Emphasis begins with a delimiter that [can open emphasis] and ends
with a delimiter that [can close emphasis], and that uses the same with a delimiter that [can close emphasis], and that uses the same
character (`_` or `*`) as the opening delimiter. There must character (`_` or `*`) as the opening delimiter. There must
be a nonempty sequence of inlines between the open delimiter be a nonempty sequence of inlines between the open delimiter
and the closing delimiter; these form the contents of the emphasis and the closing delimiter; these form the contents of the emphasis
inline. inline.
10. Strong emphasis begins with a delimiter that 10. Strong emphasis begins with a delimiter that
skipping to change at line 6512 skipping to change at line 6572
<p><a href="foo):">link</a></p> <p><a href="foo):">link</a></p>
. .
A link can contain fragment identifiers and queries: A link can contain fragment identifiers and queries:
. .
[link](#fragment) [link](#fragment)
[link](http://example.com#fragment) [link](http://example.com#fragment)
[link](http://example.com?foo=bar&baz#fragment) [link](http://example.com?foo=3#frag)
. .
<p><a href="#fragment">link</a></p> <p><a href="#fragment">link</a></p>
<p><a href="http://example.com#fragment">link</a></p> <p><a href="http://example.com#fragment">link</a></p>
<p><a href="http://example.com?foo=bar&amp;baz#fragment">link</a></p> <p><a href="http://example.com?foo=3#frag">link</a></p>
. .
Note that a backslash before a non-escapable character is Note that a backslash before a non-escapable character is
just a backslash: just a backslash:
. .
[link](foo\bar) [link](foo\bar)
. .
<p><a href="foo%5Cbar">link</a></p> <p><a href="foo%5Cbar">link</a></p>
. .
URL-escaping should be left alone inside the destination, as all URL-escaping should be left alone inside the destination, as all
URL-escaped characters are also valid URL characters. HTML entities in URL-escaped characters are also valid URL characters. Entity and
the destination will be parsed into the corresponding Unicode numerical character references in the destination will be parsed
code points, as usual, and optionally URL-escaped when written as HTML. into the corresponding Unicode code points, as usual. These may
be optionally URL-escaped when written as HTML, but this spec
does not enforce any particular policy for rendering URLs in
HTML or other formats. Renderers may make different decisions
about how to escape or normalize URLs in the output.
. .
[link](foo%20b&auml;) [link](foo%20b&auml;)
. .
<p><a href="foo%20b%C3%A4">link</a></p> <p><a href="foo%20b%C3%A4">link</a></p>
. .
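One possible rendering policy that reproduces the output above is sketched below. It is an assumption of this example, not something the spec requires: the function name `escape_destination` and the set of characters left untouched are hypothetical, and the input is taken to be the destination after entity and numeric character references have been decoded.

```python
from urllib.parse import quote

# Characters left as-is. This set is an assumption of the sketch.
# "%" is included so that existing URL-escapes such as "%20" are not
# escaped a second time.
SAFE = "%/:?#&=@+,;"


def escape_destination(decoded: str) -> str:
    """Percent-encode a decoded link destination for use in an href."""
    # quote() encodes non-ASCII characters as UTF-8 bytes, so "ä"
    # becomes "%C3%A4"; a backslash becomes "%5C".
    return quote(decoded, safe=SAFE)
```

Other renderers may reasonably choose a different safe set or leave non-ASCII characters unescaped.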
Note that, because titles can often be parsed as destinations, Note that, because titles can often be parsed as destinations,
if you try to omit the destination and keep the title, you'll if you try to omit the destination and keep the title, you'll
get unexpected results: get unexpected results:
skipping to change at line 6561 skipping to change at line 6625
. .
[link](/url "title") [link](/url "title")
[link](/url 'title') [link](/url 'title')
[link](/url (title)) [link](/url (title))
. .
<p><a href="/url" title="title">link</a> <p><a href="/url" title="title">link</a>
<a href="/url" title="title">link</a> <a href="/url" title="title">link</a>
<a href="/url" title="title">link</a></p> <a href="/url" title="title">link</a></p>
. .
Backslash escapes and entities may be used in titles: Backslash escapes and entity and numeric character references
may be used in titles:
. .
[link](/url "title \"&quot;") [link](/url "title \"&quot;")
. .
<p><a href="/url" title="title &quot;&quot;">link</a></p> <p><a href="/url" title="title &quot;&quot;">link</a></p>
. .
Nested balanced quotes are not allowed without escaping: Nested balanced quotes are not allowed without escaping:
. .
skipping to change at line 6589 skipping to change at line 6654
. .
[link](/url 'title "and" title') [link](/url 'title "and" title')
. .
<p><a href="/url" title="title &quot;and&quot; title">link</a></p> <p><a href="/url" title="title &quot;and&quot; title">link</a></p>
. .
(Note: `Markdown.pl` did allow double quotes inside a double-quoted (Note: `Markdown.pl` did allow double quotes inside a double-quoted
title, and its test suite included a test demonstrating this. title, and its test suite included a test demonstrating this.
But it is hard to see a good rationale for the extra complexity this But it is hard to see a good rationale for the extra complexity this
brings, since there are already many ways---backslash escaping, brings, since there are already many ways---backslash escaping,
entities, or using a different quote type for the enclosing title---to entity and numeric character references, or using a different
write titles containing double quotes. `Markdown.pl`'s handling of quote type for the enclosing title---to write titles containing
titles has a number of other strange features. For example, it allows double quotes. `Markdown.pl`'s handling of titles has a number
single-quoted titles in inline links, but not reference links. And, in of other strange features. For example, it allows single-quoted
reference links but not inline links, it allows a title to begin with titles in inline links, but not reference links. And, in
`"` and end with `)`. `Markdown.pl` 1.0.1 even allows titles with no closing reference links but not inline links, it allows a title to begin
quotation mark, though 1.0.2b8 does not. It seems preferable to adopt with `"` and end with `)`. `Markdown.pl` 1.0.1 even allows
a simple, rational rule that works the same way in inline links and titles with no closing quotation mark, though 1.0.2b8 does not.
link reference definitions.) It seems preferable to adopt a simple, rational rule that works
the same way in inline links and link reference definitions.)
[Whitespace] is allowed around the destination and title: [Whitespace] is allowed around the destination and title:
. .
[link]( /uri [link]( /uri
"title" ) "title" )
. .
<p><a href="/uri" title="title">link</a></p> <p><a href="/uri" title="title">link</a></p>
. .
skipping to change at line 6728 skipping to change at line 6794
[foo<http://example.com/?search=](uri)> [foo<http://example.com/?search=](uri)>
. .
<p>[foo<a href="http://example.com/?search=%5D(uri)">http://example.com/?search=](uri)</a></p> <p>[foo<a href="http://example.com/?search=%5D(uri)">http://example.com/?search=](uri)</a></p>
. .
There are three kinds of [reference link](@reference-link)s: There are three kinds of [reference link](@reference-link)s:
[full](#full-reference-link), [collapsed](#collapsed-reference-link), [full](#full-reference-link), [collapsed](#collapsed-reference-link),
and [shortcut](#shortcut-reference-link). and [shortcut](#shortcut-reference-link).
A [full reference link](@full-reference-link) A [full reference link](@full-reference-link)
consists of a [link text], optional [whitespace], and a [link label] consists of a [link text] immediately followed by a [link label]
that [matches] a [link reference definition] elsewhere in the document. that [matches] a [link reference definition] elsewhere in the document.
A [link label](@link-label) begins with a left bracket (`[`) and ends A [link label](@link-label) begins with a left bracket (`[`) and ends
with the first right bracket (`]`) that is not backslash-escaped. with the first right bracket (`]`) that is not backslash-escaped.
Between these brackets there must be at least one [non-whitespace character]. Between these brackets there must be at least one [non-whitespace character].
Unescaped square bracket characters are not allowed in Unescaped square bracket characters are not allowed in
[link label]s. A link label can have at most 999 [link label]s. A link label can have at most 999
characters inside the square brackets. characters inside the square brackets.
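The link label constraints can be checked with a short Python sketch. The function name `is_valid_link_label` is hypothetical, and the set of whitespace characters used here (space, tab, newline, line tabulation, form feed, carriage return) is an assumption about what the spec counts as whitespace.

```python
import re

# Assumption: the characters treated as whitespace for this check.
WHITESPACE = " \t\n\x0b\x0c\r"


def is_valid_link_label(label: str) -> bool:
    """Check a candidate label, including its enclosing brackets."""
    if len(label) < 2 or label[0] != "[" or label[-1] != "]":
        return False
    inner = label[1:-1]
    # At most 999 characters inside the square brackets.
    if len(inner) > 999:
        return False
    # At least one non-whitespace character.
    if not inner.strip(WHITESPACE):
        return False
    # Drop backslash-escaped characters, then look for bare brackets.
    unescaped = re.sub(r"\\.", "", inner, flags=re.S)
    # A leftover trailing backslash means the final "]" was escaped.
    if unescaped.endswith("\\"):
        return False
    return "[" not in unescaped and "]" not in unescaped
```

For instance, `[ref\[]` passes because its inner bracket is escaped, while `[]` fails for lack of a non-whitespace character.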
One label [matches](@matches) One label [matches](@matches)
skipping to change at line 6898 skipping to change at line 6964
. .
[Foo [Foo
bar]: /url bar]: /url
[Baz][Foo bar] [Baz][Foo bar]
. .
<p><a href="/url">Baz</a></p> <p><a href="/url">Baz</a></p>
. .
There can be [whitespace] between the [link text] and the [link label]: No [whitespace] is allowed between the [link text] and the
[link label]:
. .
[foo] [bar] [foo] [bar]
[bar]: /url "title" [bar]: /url "title"
. .
<p><a href="/url" title="title">foo</a></p> <p>[foo] <a href="/url" title="title">bar</a></p>
. .
. .
[foo] [foo]
[bar] [bar]
[bar]: /url "title" [bar]: /url "title"
. .
<p><a href="/url" title="title">foo</a></p> <p>[foo]
<a href="/url" title="title">bar</a></p>
. .
This is a departure from John Gruber's original Markdown syntax
description, which explicitly allows whitespace between the link
text and the link label. It brings reference links in line with
[inline link]s, which (according to both original Markdown and
this spec) cannot have whitespace after the link text. More
importantly, it prevents inadvertent capture of consecutive
[shortcut reference link]s. If whitespace is allowed between the
link text and the link label, then in the following we will have
a single reference link, not two shortcut reference links, as
intended:
``` markdown
[foo]
[bar]
[foo]: /url1
[bar]: /url2
```
(Note that [shortcut reference link]s were introduced by Gruber
himself in a beta version of `Markdown.pl`, but never included
in the official syntax description. Without shortcut reference
links, it is harmless to allow space between the link text and
link label; but once shortcut references are introduced, it is
too dangerous to allow this, as it frequently leads to
unintended results.)
When there are multiple matching [link reference definition]s, When there are multiple matching [link reference definition]s,
the first is used: the first is used:
. .
[foo]: /url1 [foo]: /url1
[foo]: /url2 [foo]: /url2
[bar][foo] [bar][foo]
. .
skipping to change at line 6980 skipping to change at line 7075
. .
. .
[foo][ref\[] [foo][ref\[]
[ref\[]: /uri [ref\[]: /uri
. .
<p><a href="/uri">foo</a></p> <p><a href="/uri">foo</a></p>
. .
Note that in this example `]` is not backslash-escaped:
.
[bar\\]: /uri
[bar\\]
.
<p><a href="/uri">bar\</a></p>
.
A [link label] must contain at least one [non-whitespace character]: A [link label] must contain at least one [non-whitespace character]:
. .
[] []
[]: /uri []: /uri
. .
<p>[]</p> <p>[]</p>
<p>[]: /uri</p> <p>[]: /uri</p>
. .
skipping to change at line 7007 skipping to change at line 7112
. .
<p>[ <p>[
]</p> ]</p>
<p>[ <p>[
]: /uri</p> ]: /uri</p>
. .
A [collapsed reference link](@collapsed-reference-link) A [collapsed reference link](@collapsed-reference-link)
consists of a [link label] that [matches] a consists of a [link label] that [matches] a
[link reference definition] elsewhere in the [link reference definition] elsewhere in the
document, optional [whitespace], and the string `[]`. document, followed by the string `[]`.
The contents of the first link label are parsed as inlines, The contents of the first link label are parsed as inlines,
which are used as the link's text. The link's URI and title are which are used as the link's text. The link's URI and title are
provided by the matching reference link definition. Thus, provided by the matching reference link definition. Thus,
`[foo][]` is equivalent to `[foo][foo]`. `[foo][]` is equivalent to `[foo][foo]`.
. .
[foo][] [foo][]
[foo]: /url "title" [foo]: /url "title"
. .
skipping to change at line 7039 skipping to change at line 7144
The link labels are case-insensitive: The link labels are case-insensitive:
. .
[Foo][] [Foo][]
[foo]: /url "title" [foo]: /url "title"
. .
<p><a href="/url" title="title">Foo</a></p> <p><a href="/url" title="title">Foo</a></p>
. .
As with full reference links, [whitespace] is allowed As with full reference links, [whitespace] is not
between the two sets of brackets: allowed between the two sets of brackets:
. .
[foo] [foo]
[] []
[foo]: /url "title" [foo]: /url "title"
. .
<p><a href="/url" title="title">foo</a></p> <p><a href="/url" title="title">foo</a>
[]</p>
. .
A [shortcut reference link](@shortcut-reference-link) A [shortcut reference link](@shortcut-reference-link)
consists of a [link label] that [matches] a consists of a [link label] that [matches] a
[link reference definition] elsewhere in the [link reference definition] elsewhere in the
document and is not followed by `[]` or a link label. document and is not followed by `[]` or a link label.
The contents of the first link label are parsed as inlines, The contents of the first link label are parsed as inlines,
which are used as the link's text. The link's URI and title which are used as the link's text. The link's URI and title
are provided by the matching link reference definition. are provided by the matching link reference definition.
Thus, `[foo]` is equivalent to `[foo][]`. Thus, `[foo]` is equivalent to `[foo][]`.
skipping to change at line 7268 skipping to change at line 7374
. .
![](/url) ![](/url)
. .
<p><img src="/url" alt="" /></p> <p><img src="/url" alt="" /></p>
. .
Reference-style: Reference-style:
. .
![foo] [bar] ![foo][bar]
[bar]: /url [bar]: /url
. .
<p><img src="/url" alt="foo" /></p> <p><img src="/url" alt="foo" /></p>
. .
. .
![foo] [bar] ![foo][bar]
[BAR]: /url [BAR]: /url
. .
<p><img src="/url" alt="foo" /></p> <p><img src="/url" alt="foo" /></p>
. .
Collapsed: Collapsed:
. .
![foo][] ![foo][]
skipping to change at line 7311 skipping to change at line 7417
The labels are case-insensitive: The labels are case-insensitive:
. .
![Foo][] ![Foo][]
[foo]: /url "title" [foo]: /url "title"
. .
<p><img src="/url" alt="Foo" title="title" /></p> <p><img src="/url" alt="Foo" title="title" /></p>
. .
As with full reference links, [whitespace] is allowed As with reference links, [whitespace] is not allowed
between the two sets of brackets: between the two sets of brackets:
. .
![foo] ![foo]
[] []
[foo]: /url "title" [foo]: /url "title"
. .
<p><img src="/url" alt="foo" title="title" /></p> <p><img src="/url" alt="foo" title="title" />
[]</p>
. .
Shortcut: Shortcut:
. .
![foo] ![foo]
[foo]: /url "title" [foo]: /url "title"
. .
<p><img src="/url" alt="foo" title="title" /></p> <p><img src="/url" alt="foo" title="title" /></p>
skipping to change at line 7594 skipping to change at line 7701
A [single-quoted attribute value](@single-quoted-attribute-value) A [single-quoted attribute value](@single-quoted-attribute-value)
consists of `'`, zero or more consists of `'`, zero or more
characters not including `'`, and a final `'`. characters not including `'`, and a final `'`.
A [double-quoted attribute value](@double-quoted-attribute-value) A [double-quoted attribute value](@double-quoted-attribute-value)
consists of `"`, zero or more consists of `"`, zero or more
characters not including `"`, and a final `"`. characters not including `"`, and a final `"`.
An [open tag](@open-tag) consists of a `<` character, a [tag name], An [open tag](@open-tag) consists of a `<` character, a [tag name],
zero or more [attributes](@attribute], optional [whitespace], an optional `/` zero or more [attribute]s, optional [whitespace], an optional `/`
character, and a `>` character. character, and a `>` character.
A [closing tag](@closing-tag) consists of the string `</`, a A [closing tag](@closing-tag) consists of the string `</`, a
[tag name], optional [whitespace], and the character `>`. [tag name], optional [whitespace], and the character `>`.
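The closing tag rule maps naturally onto a regular expression. In the sketch below the tag name pattern (an ASCII letter followed by letters, digits, or hyphens) is an assumption, since the tag name definition is not shown in this excerpt; the name `is_closing_tag` is likewise hypothetical.

```python
import re

# Assumption: a tag name is an ASCII letter followed by zero or more
# ASCII letters, digits, or hyphens.
TAG_NAME = r"[A-Za-z][A-Za-z0-9-]*"

# "</" + tag name + optional whitespace + ">"
CLOSING_TAG = re.compile(r"</" + TAG_NAME + r"[ \t\n\x0b\x0c\r]*>")


def is_closing_tag(s: str) -> bool:
    """Return True if the whole string is a closing tag."""
    return CLOSING_TAG.fullmatch(s) is not None
```

Because nothing but whitespace may follow the tag name, a string such as `</a href="foo">` is rejected.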
An [HTML comment](@html-comment) consists of `<!--` + *text* + `-->`, An [HTML comment](@html-comment) consists of `<!--` + *text* + `-->`,
where *text* does not start with `>` or `->`, does not end with `-`, where *text* does not start with `>` or `->`, does not end with `-`,
and does not contain `--`. (See the and does not contain `--`. (See the
[HTML5 spec](http://www.w3.org/TR/html5/syntax.html#comments).) [HTML5 spec](http://www.w3.org/TR/html5/syntax.html#comments).)
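The HTML comment conditions can be written out directly. This is a sketch that checks a complete candidate string; the function name `is_html_comment` is hypothetical.

```python
def is_html_comment(s: str) -> bool:
    """Check the HTML comment rule: "<!--" + text + "-->"."""
    # The opener and closer must not overlap, so the shortest
    # possible comment is "<!---->" (7 characters, empty text).
    if len(s) < 7 or not (s.startswith("<!--") and s.endswith("-->")):
        return False
    text = s[4:-3]
    # The text must not start with ">" or "->".
    if text.startswith(">") or text.startswith("->"):
        return False
    # The text must not end with "-".
    if text.endswith("-"):
        return False
    # The text must not contain "--".
    return "--" not in text
```

A real inline parser would instead scan forward from `<!--` to find the first `-->`; the checks on the text in between stay the same.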
skipping to change at line 7662 skipping to change at line 7769
<a foo="bar" bam = 'baz <em>"</em>' <a foo="bar" bam = 'baz <em>"</em>'
_boolean zoop:33=zoop:33 /> _boolean zoop:33=zoop:33 />
. .
<p><a foo="bar" bam = 'baz <em>"</em>' <p><a foo="bar" bam = 'baz <em>"</em>'
_boolean zoop:33=zoop:33 /></p> _boolean zoop:33=zoop:33 /></p>
. .
Custom tag names can be used: Custom tag names can be used:
. .
<responsive-image src="foo.jpg" /> Foo <responsive-image src="foo.jpg" />
<My-Tag>
foo
</My-Tag>
. .
<responsive-image src="foo.jpg" /> <p>Foo <responsive-image src="foo.jpg" /></p>
<My-Tag>
foo
</My-Tag>
. .
Illegal tag names, not parsed as HTML: Illegal tag names, not parsed as HTML:
. .
<33> <__> <33> <__>
. .
<p>&lt;33&gt; &lt;__&gt;</p> <p>&lt;33&gt; &lt;__&gt;</p>
. .
skipping to change at line 7719 skipping to change at line 7819
. .
<a href='bar'title=title> <a href='bar'title=title>
. .
<p>&lt;a href='bar'title=title&gt;</p> <p>&lt;a href='bar'title=title&gt;</p>
. .
Closing tags: Closing tags:
. .
</a> </a></foo >
</foo >
. .
</a> <p></a></foo ></p>
</foo >
. .
Illegal attributes in closing tag: Illegal attributes in closing tag:
. .
</a href="foo"> </a href="foo">
. .
<p>&lt;/a href=&quot;foo&quot;&gt;</p> <p>&lt;/a href=&quot;foo&quot;&gt;</p>
. .
skipping to change at line 7785 skipping to change at line 7883
. .
CDATA sections: CDATA sections:
. .
foo <![CDATA[>&<]]> foo <![CDATA[>&<]]>
. .
<p>foo <![CDATA[>&<]]></p> <p>foo <![CDATA[>&<]]></p>
. .
Entities are preserved in HTML attributes: Entity and numeric character references are preserved in HTML
attributes:
. .
<a href="&ouml;"> foo <a href="&ouml;">
. .
<a href="&ouml;"> <p>foo <a href="&ouml;"></p>
. .
Backslash escapes do not work in HTML attributes: Backslash escapes do not work in HTML attributes:
. .
<a href="\*"> foo <a href="\*">
. .
<a href="\*"> <p>foo <a href="\*"></p>
. .
. .
<a href="\""> <a href="\"">
. .
<p>&lt;a href=&quot;&quot;&quot;&gt;</p> <p>&lt;a href=&quot;&quot;&quot;&gt;</p>
. .
## Hard line breaks ## Hard line breaks
skipping to change at line 8017 skipping to change at line 8116
## Overview {-} ## Overview {-}
Parsing has two phases: Parsing has two phases:
1. In the first phase, lines of input are consumed and the block 1. In the first phase, lines of input are consumed and the block
structure of the document---its division into paragraphs, block quotes, structure of the document---its division into paragraphs, block quotes,
list items, and so on---is constructed. Text is assigned to these list items, and so on---is constructed. Text is assigned to these
blocks but not parsed. Link reference definitions are parsed and a blocks but not parsed. Link reference definitions are parsed and a
map of links is constructed. map of links is constructed.
2. In the second phase, the raw text contents of paragraphs and headers 2. In the second phase, the raw text contents of paragraphs and headings
are parsed into sequences of Markdown inline elements (strings, are parsed into sequences of Markdown inline elements (strings,
code spans, links, emphasis, and so on), using the map of link code spans, links, emphasis, and so on), using the map of link
references constructed in phase 1. references constructed in phase 1.
At each point in processing, the document is represented as a tree of At each point in processing, the document is represented as a tree of
**blocks**. The root of the tree is a `document` block. The `document` **blocks**. The root of the tree is a `document` block. The `document`
may have any number of other blocks as **children**. These children may have any number of other blocks as **children**. These children
may, in turn, have other blocks as children. The last child of a block may, in turn, have other blocks as children. The last child of a block
is normally considered **open**, meaning that subsequent lines of input is normally considered **open**, meaning that subsequent lines of input
can alter its contents. (Blocks that are not open are **closed**.) can alter its contents. (Blocks that are not open are **closed**.)
skipping to change at line 8080 skipping to change at line 8179
2. Next, after consuming the continuation markers for existing 2. Next, after consuming the continuation markers for existing
blocks, we look for new block starts (e.g. `>` for a block quote). blocks, we look for new block starts (e.g. `>` for a block quote).
If we encounter a new block start, we close any blocks unmatched If we encounter a new block start, we close any blocks unmatched
in step 1 before creating the new block as a child of the last in step 1 before creating the new block as a child of the last
matched block. matched block.
3. Finally, we look at the remainder of the line (after block 3. Finally, we look at the remainder of the line (after block
markers like `>`, list markers, and indentation have been consumed). markers like `>`, list markers, and indentation have been consumed).
This is text that can be incorporated into the last open This is text that can be incorporated into the last open
block (a paragraph, code block, header, or raw HTML). block (a paragraph, code block, heading, or raw HTML).
Setext headers are formed when we detect that the second line of Setext headings are formed when we detect that the second line of
a paragraph is a setext header line. a paragraph is a setext heading line.
Reference link definitions are detected when a paragraph is closed; Reference link definitions are detected when a paragraph is closed;
the accumulated text lines are parsed to see if they begin with the accumulated text lines are parsed to see if they begin with
one or more reference link definitions. Any remainder becomes a one or more reference link definitions. Any remainder becomes a
normal paragraph. normal paragraph.
We can see how this works by considering how the tree above is We can see how this works by considering how the tree above is
generated by four lines of Markdown: generated by four lines of Markdown:
``` markdown ``` markdown
skipping to change at line 8192 skipping to change at line 8291
-> list_item -> list_item
-> paragraph -> paragraph
"aliquando id" "aliquando id"
``` ```
## Phase 2: inline structure {-} ## Phase 2: inline structure {-}
Once all of the input has been parsed, all open blocks are closed. Once all of the input has been parsed, all open blocks are closed.
We then "walk the tree," visiting every node, and parse raw We then "walk the tree," visiting every node, and parse raw
string contents of paragraphs and headers as inlines. At this string contents of paragraphs and headings as inlines. At this
point we have seen all the link reference definitions, so we can point we have seen all the link reference definitions, so we can
resolve reference links as we go. resolve reference links as we go.
``` tree ``` tree
document document
block_quote block_quote
paragraph paragraph
str "Lorem ipsum dolor" str "Lorem ipsum dolor"
softbreak softbreak
str "sit amet." str "sit amet."
 End of changes. 110 change blocks. 
160 lines changed or deleted 258 lines changed or added
