spec.txt   spec.txt
--- ---
title: CommonMark Spec title: CommonMark Spec
author: John MacFarlane author: John MacFarlane
version: version: 0.23
date: date: 2015-12-29
... ...
# Introduction # Introduction
## What is Markdown? ## What is Markdown?
Markdown is a plain text format for writing structured documents, Markdown is a plain text format for writing structured documents,
based on conventions used for indicating formatting in email and based on conventions used for indicating formatting in email and
usenet posts. It was developed in 2004 by John Gruber, who wrote usenet posts. It was developed in 2004 by John Gruber, who wrote
skipping to change at line 39 skipping to change at line 39
1. How much indentation is needed for a sublist? The spec says that 1. How much indentation is needed for a sublist? The spec says that
continuation paragraphs need to be indented four spaces, but is continuation paragraphs need to be indented four spaces, but is
not fully explicit about sublists. It is natural to think that not fully explicit about sublists. It is natural to think that
they, too, must be indented four spaces, but Markdown.pl does they, too, must be indented four spaces, but Markdown.pl does
not require that. This is hardly a "corner case," and divergences not require that. This is hardly a "corner case," and divergences
between implementations on this issue often lead to surprises for between implementations on this issue often lead to surprises for
users in real documents. (See [this comment by John users in real documents. (See [this comment by John
Gruber](http://article.gmane.org/gmane.text.markdown.general/1997).) Gruber](http://article.gmane.org/gmane.text.markdown.general/1997).)
2. Is a blank line needed before a block quote or head? 2. Is a blank line needed before a block quote or heading?
Most implementations do not require the blank line. However, Most implementations do not require the blank line. However,
this can lead to unexpected results in hard-wrapped text, and this can lead to unexpected results in hard-wrapped text, and
also to ambiguities in parsing (note that some implementations also to ambiguities in parsing (note that some implementations
put the head inside the blockquote, while others do not). put the heading inside the blockquote, while others do not).
(John Gruber has also spoken [in favor of requiring the blank (John Gruber has also spoken [in favor of requiring the blank
lines](http://article.gmane.org/gmane.text.markdown.general/2146).) lines](http://article.gmane.org/gmane.text.markdown.general/2146).)
3. Is a blank line needed before an indented code block? 3. Is a blank line needed before an indented code block?
(Markdown.pl requires it, but this is not mentioned in the (Markdown.pl requires it, but this is not mentioned in the
documentation, and some implementations do not require it.) documentation, and some implementations do not require it.)
 markdown  markdown
paragraph paragraph
code? code?
skipping to change at line 88 skipping to change at line 88
[here](http://article.gmane.org/gmane.text.markdown.general/2554).) [here](http://article.gmane.org/gmane.text.markdown.general/2554).)
5. Can list markers be indented? Can ordered list markers be right-aligned? 5. Can list markers be indented? Can ordered list markers be right-aligned?
 markdown  markdown
8. item 1 8. item 1
9. item 2 9. item 2
10. item 2a 10. item 2a
 
6. Is this one list with a in its second item, 6. Is this one list with a thematic break in its second item,
or two lists separated by a or two lists separated by a thematic break?
 markdown  markdown
* a * a
* * * * * * * * * *
* b * b
 
7. When list markers change from numbers to bullets, do we have 7. When list markers change from numbers to bullets, do we have
two lists or one? (The Markdown syntax description suggests two, two lists or one? (The Markdown syntax description suggests two,
but the perl scripts and many other implementations produce one.) but the perl scripts and many other implementations produce one.)
skipping to change at line 131 skipping to change at line 131
 
10. What are the precedence rules between block-level and inline-level 10. What are the precedence rules between block-level and inline-level
structure? For example, how should the following be parsed? structure? For example, how should the following be parsed?
 markdown  markdown
- a long code span can contain a hyphen like this - a long code span can contain a hyphen like this
- and it can screw things up - and it can screw things up
 
11. Can list items include section (Markdown.pl does not 11. Can list items include section headings? (Markdown.pl does not
allow this, but does allow blockquotes to include allow this, but does allow blockquotes to include headings.)
 markdown  markdown
 
12. Can list items be empty? 12. Can list items be empty?
 markdown  markdown
* a * a
* *
skipping to change at line 327 skipping to change at line 327
## Insecure characters ## Insecure characters
For security reasons, the Unicode character U+0000 must be replaced For security reasons, the Unicode character U+0000 must be replaced
with the replacement character (U+FFFD). with the replacement character (U+FFFD).
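The replacement rule above can be sketched in a few lines of Python. This is a minimal illustration, not part of the spec; the function name `replace_insecure_characters` is hypothetical.

```python
def replace_insecure_characters(text: str) -> str:
    """Replace every U+0000 in `text` with U+FFFD (REPLACEMENT CHARACTER)."""
    return text.replace("\u0000", "\ufffd")
```

A conforming parser would typically apply a step like this to the input before any block or inline parsing takes place.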
# Blocks and inlines # Blocks and inlines
We can think of a document as a sequence of We can think of a document as a sequence of
[blocks](@block)---structural elements like paragraphs, block [blocks](@block)---structural elements like paragraphs, block
quotations, lists, heads, rules, and code blocks. Some blocks (like quotations, lists, headings, rules, and code blocks. Some blocks (like
block quotes and list items) contain other blocks; others (like block quotes and list items) contain other blocks; others (like
heads and paragraphs) contain [inline](@inline) content---text, headings and paragraphs) contain [inline](@inline) content---text,
links, emphasized text, images, code, and so on. links, emphasized text, images, code, and so on.
## Precedence ## Precedence
Indicators of block structure always take precedence over indicators Indicators of block structure always take precedence over indicators
of inline structure. So, for example, the following is a list with of inline structure. So, for example, the following is a list with
two items, not a list with one item containing a code span: two items, not a list with one item containing a code span:
. .
- one - one
- two - two
. .
<ul> <ul>
<li>one</li> <li>one</li>
<li>two</li> <li>two</li>
</ul> </ul>
. .
This means that parsing can proceed in two steps: first, the block This means that parsing can proceed in two steps: first, the block
structure of the document can be discerned; second, text lines inside structure of the document can be discerned; second, text lines inside
paragraphs, heads, and other block constructs can be parsed for inline paragraphs, headings, and other block constructs can be parsed for inline
structure. The second step requires information about link reference structure. The second step requires information about link reference
definitions that will be available only at the end of the first definitions that will be available only at the end of the first
step. Note that the first step requires processing lines in sequence, step. Note that the first step requires processing lines in sequence,
but the second can be parallelized, since the inline parsing of but the second can be parallelized, since the inline parsing of
one block element does not affect the inline parsing of any other. one block element does not affect the inline parsing of any other.
## Container blocks and leaf blocks ## Container blocks and leaf blocks
We can divide blocks into two types: We can divide blocks into two types:
[container block](@container-block)s, [container block](@container-block)s,
which can contain other blocks, and [leaf block](@leaf-block)s, which can contain other blocks, and [leaf block](@leaf-block)s,
which cannot. which cannot.
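One way to picture the container/leaf distinction is as a small tree data structure. The sketch below is only an illustration: the class names `ContainerBlock` and `LeafBlock`, the `kind` strings, and the helper `leaf_blocks` are all hypothetical and are not defined by the spec.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Union


@dataclass
class LeafBlock:
    kind: str        # e.g. "paragraph", "heading", "thematic_break"
    text: str = ""   # raw inline content, to be parsed in the second step


@dataclass
class ContainerBlock:
    kind: str        # e.g. "document", "block_quote", "list_item"
    children: List[Union["ContainerBlock", LeafBlock]] = field(default_factory=list)


def leaf_blocks(block: Union[ContainerBlock, LeafBlock]) -> Iterator[LeafBlock]:
    """Yield every leaf block under `block`, in document order."""
    if isinstance(block, LeafBlock):
        yield block
    else:
        for child in block.children:
            yield from leaf_blocks(child)
```

Only container blocks carry children, so a walk such as `leaf_blocks` reaches all of the text that still needs inline parsing.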
# Leaf blocks # Leaf blocks
This section describes the different kinds of leaf block that make up a This section describes the different kinds of leaf block that make up a
Markdown document. Markdown document.
## s ## Thematic breaks
A line consisting of 0-3 spaces of indentation, followed by a sequence A line consisting of 0-3 spaces of indentation, followed by a sequence
of three or more matching -, _, or * characters, each followed of three or more matching -, _, or * characters, each followed
optionally by any number of spaces, forms a optionally by any number of spaces, forms a
[). [thematic break](@thematic-break).
. .
*** ***
--- ---
___ ___
. .
<hr /> <hr />
<hr /> <hr />
<hr /> <hr />
. .
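The line-level rule for thematic breaks can be approximated with a regular expression. The following Python sketch is an assumption-laden illustration, not the reference implementation: the name `is_thematic_break` is hypothetical, and the check looks at a single line in isolation, so it ignores interactions with other constructs such as setext heading underlines.

```python
import re

# 0-3 spaces of indentation, then three or more matching -, _, or *
# characters, each optionally followed by any number of spaces.
_THEMATIC_BREAK = re.compile(r"^ {0,3}([-_*])[ ]*(?:\1[ ]*){2,}$")


def is_thematic_break(line: str) -> bool:
    """Return True if `line`, taken alone, has the form of a thematic break."""
    return _THEMATIC_BREAK.match(line.rstrip("\n")) is not None
```

The backreference `\1` enforces that all of the non-whitespace characters are the same, which is why a line like `*-*` is rejected.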
skipping to change at line 492 skipping to change at line 492
a------ a------
---a--- ---a---
. .
<p>_ _ _ _ a</p> <p>_ _ _ _ a</p>
<p>a------</p> <p>a------</p>
<p>---a---</p> <p>---a---</p>
. .
It is required that all of the [non-whitespace character]s be the same. It is required that all of the [non-whitespace character]s be the same.
So, this is not a : So, this is not a thematic break:
. .
*-* *-*
. .
<p><em>-</em></p> <p><em>-</em></p>
. .
s do not need blank lines before or after: Thematic breaks do not need blank lines before or after:
. .
- foo - foo
*** ***
- bar - bar
. .
<ul> <ul>
<li>foo</li> <li>foo</li>
</ul> </ul>
<hr /> <hr />
<ul> <ul>
<li>bar</li> <li>bar</li>
</ul> </ul>
. .
s can interrupt a paragraph: Thematic breaks can interrupt a paragraph:
. .
Foo Foo
*** ***
bar bar
. .
<p>Foo</p> <p>Foo</p>
<hr /> <hr />
<p>bar</p> <p>bar</p>
. .
If a line of dashes that meets the above conditions for being a If a line of dashes that meets the above conditions for being a
could also be interpreted as the underline of a [setext thematic break could also be interpreted as the underline of a [setext
the interpretation as a heading], the interpretation as a
[setext takes precedence. Thus, for example, [setext heading] takes precedence. Thus, for example,
this is a setext not a paragraph followed by a this is a setext heading, not a paragraph followed by a thematic break:
. .
Foo Foo
--- ---
bar bar
. .
<h2>Foo</h2> <h2>Foo</h2>
<p>bar</p> <p>bar</p>
. .
When both a and a list item are possible When both a thematic break and a list item are possible
interpretations of a line, the takes precedence: interpretations of a line, the thematic break takes precedence:
. .
* Foo * Foo
* * * * * *
* Bar * Bar
. .
<ul> <ul>
<li>Foo</li> <li>Foo</li>
</ul> </ul>
<hr /> <hr />
<ul> <ul>
<li>Bar</li> <li>Bar</li>
</ul> </ul>
. .
If you want a in a list item, use a different bullet: If you want a thematic break in a list item, use a different bullet:
. .
- Foo - Foo
- * * * - * * *
. .
<ul> <ul>
<li>Foo</li> <li>Foo</li>
<li> <li>
<hr /> <hr />
</li> </li>
</ul> </ul>
. .
consists of a string of characters, parsed as inline content, between an consists of a string of characters, parsed as inline content, between an
opening sequence of 1--6 unescaped # characters and an optional opening sequence of 1--6 unescaped # characters and an optional
closing sequence of any number of unescaped # characters. closing sequence of any number of unescaped # characters.
The opening sequence of # characters be followed by a The opening sequence of # characters must be followed by a
The optional closing sequence of #s must be [space] or by the end of line. The optional closing sequence of #s must be
preceded by a [space] and may be followed by spaces only. The opening preceded by a [space] and may be followed by spaces only. The opening
# character may be indented 0-3 spaces. The raw contents of the # character may be indented 0-3 spaces. The raw contents of the
are stripped of leading and trailing spaces before being parsed heading are stripped of leading and trailing spaces before being parsed
as inline content. The level is equal to the number of # as inline content. The heading level is equal to the number of #
characters in the opening sequence. characters in the opening sequence.
. .
# foo # foo
## foo ## foo
### foo ### foo
#### foo #### foo
##### foo ##### foo
###### foo ###### foo
. .
<h1>foo</h1> <h1>foo</h1>
<h2>foo</h2> <h2>foo</h2>
<h3>foo</h3> <h3>foo</h3>
<h4>foo</h4> <h4>foo</h4>
<h5>foo</h5> <h5>foo</h5>
<h6>foo</h6> <h6>foo</h6>
. .
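The ATX heading rule can be sketched as a single-line recognizer. This Python example is a rough illustration under stated assumptions: the names `parse_atx_heading`, `_ATX_OPEN`, and `_ATX_CLOSE` are hypothetical, the returned text is the raw content (no inline parsing or backslash-escape processing is done), and the interaction with surrounding blocks is not considered.

```python
import re
from typing import Optional, Tuple

# 0-3 spaces, then 1-6 '#', then either end of line or spaces plus content.
_ATX_OPEN = re.compile(r"^ {0,3}(#{1,6})(?:[ ]+(.*))?$")
# Optional closing sequence: a run of '#' preceded by a space (or making up
# the whole content) and followed by spaces only.
_ATX_CLOSE = re.compile(r"(?:^|[ ]+)#+[ ]*$")


def parse_atx_heading(line: str) -> Optional[Tuple[int, str]]:
    """Return (level, raw_content) for an ATX heading line, else None."""
    match = _ATX_OPEN.match(line.rstrip("\n"))
    if match is None:
        return None
    level = len(match.group(1))           # level = number of opening '#'
    raw = match.group(2) or ""
    raw = _ATX_CLOSE.sub("", raw)         # drop the closing sequence, if any
    return level, raw.strip(" ")          # strip leading and trailing spaces
```

Because the opening run must be followed by a space or the end of the line, inputs such as `#5 bolt` and `####### foo` are not recognized as headings.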
More than six # characters is not a head: More than six # characters is not a heading:
. .
####### foo ####### foo
. .
<p>####### foo</p> <p>####### foo</p>
. .
At least one space is required between the # characters and the At least one space is required between the # characters and the
implementations currently do not require the space. However, the implementations currently do not require the space. However, the
space was required by the space was required by the
[original ATX implementation](http://www.aaronsw.com/2002/atx/atx.py), [original ATX implementation](http://www.aaronsw.com/2002/atx/atx.py),
and it helps prevent things like the following from being parsed as and it helps prevent things like the following from being parsed as
. .
#5 bolt #5 bolt
# #hashtag
. .
<p>#5 bolt</p> <p>#5 bolt</p>
<p>#</p> <p>#hashtag</p>
. .
This is not a because the first # is escaped: A tab will not work:
.
#→foo
.
<p>#→foo</p>
.
This is not a heading, because the first # is escaped:
. .
\## foo \## foo
. .
<p>## foo</p> <p>## foo</p>
. .
Contents are parsed as inlines: Contents are parsed as inlines:
. .
skipping to change at line 714 skipping to change at line 722
Spaces are allowed after the closing sequence: Spaces are allowed after the closing sequence:
. .
### foo ### ### foo ###
. .
<h3>foo</h3> <h3>foo</h3>
. .
A sequence of # characters with anything but [space]s following it A sequence of # characters with anything but [space]s following it
is not a closing sequence, but counts as part of the contents of the is not a closing sequence, but counts as part of the contents of the
. .
### foo ### b ### foo ### b
. .
<h3>foo ### b</h3> <h3>foo ### b</h3>
. .
The closing sequence must be preceded by a space: The closing sequence must be preceded by a space:
. .
skipping to change at line 743 skipping to change at line 751
. .
### foo \### ### foo \###
## foo #\## ## foo #\##
# foo \# # foo \#
. .
<h3>foo ###</h3> <h3>foo ###</h3>
<h2>foo ###</h2> <h2>foo ###</h2>
<h1>foo #</h1> <h1>foo #</h1>
. .
ATX heads need not be separated from surrounding content by blank ATX headings need not be separated from surrounding content by blank
lines, and they can interrupt paragraphs: lines, and they can interrupt paragraphs:
. .
**** ****
## foo ## foo
**** ****
. .
<hr /> <hr />
<h2>foo</h2> <h2>foo</h2>
<hr /> <hr />
skipping to change at line 766 skipping to change at line 774
. .
Foo bar Foo bar
# baz # baz
Bar foo Bar foo
. .
<p>Foo bar</p> <p>Foo bar</p>
<h1>baz</h1> <h1>baz</h1>
<p>Bar foo</p> <p>Bar foo</p>
. .
. .
## ##
# #
### ### ### ###
. .
<h2></h2> <h2></h2>
<h1></h1> <h1></h1>
<h3></h3> <h3></h3>
. .
consists of a line of text, containing at least one [non-whitespace character], consists of a line of text, containing at least one [non-whitespace character],
with no more than 3 spaces indentation, followed by a [setext head with no more than 3 spaces indentation, followed by a [setext heading
underline]. The line of text must be underline]. The line of text must be
one that, were it not followed by the setext head underline, one that, were it not followed by the setext heading underline,
would be interpreted as part of a paragraph: it cannot be would be interpreted as part of a paragraph: it cannot be
interpretable as a [code fence], [ATX interpretable as a [code fence], [ATX heading][ATX headings],
[block quote][block quotes], [block quote][block quotes], [thematic break][thematic breaks],
[list item][list items], or [HTML block][HTML blocks]. [list item][list items], or [HTML block][HTML blocks].
= characters or a sequence of - characters, with no more than 3 = characters or a sequence of - characters, with no more than 3
spaces indentation and any number of trailing spaces. If a line spaces indentation and any number of trailing spaces. If a line
containing a single - can be interpreted as an containing a single - can be interpreted as an
empty [list items], it should be interpreted this way empty [list items], it should be interpreted this way
and not as a [setext head underline]. and not as a [setext heading underline].
The is a level 1 if = characters are used in the The heading is a level 1 heading if = characters are used in the
[setext underline], and a level 2 [setext heading underline], and a level 2
if - characters are used. The contents of the are the heading if - characters are used. The contents of the heading are the
result of parsing the first line as Markdown inline content. result of parsing the first line as Markdown inline content.
In general, a setext head need not be preceded or followed by a In general, a setext heading need not be preceded or followed by a
blank line. However, it cannot interrupt a paragraph, so when a blank line. However, it cannot interrupt a paragraph, so when a
setext head comes after a paragraph, a blank line is needed between setext heading comes after a paragraph, a blank line is needed between
them. them.
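The mapping from underline character to heading level can be sketched as follows. This is a minimal Python illustration with a hypothetical function name, `setext_heading_level`. It inspects the underline line only: it does not check that the preceding line can serve as heading text, and it does not resolve the cases where the line could also be an empty list item or a thematic break.

```python
import re
from typing import Optional

# A run of '=' or a run of '-', with at most 3 spaces of indentation
# and any number of trailing spaces; no internal spaces are allowed.
_SETEXT_UNDERLINE = re.compile(r"^ {0,3}(=+|-+)[ ]*$")


def setext_heading_level(line: str) -> Optional[int]:
    """Return 1 for an '=' underline, 2 for a '-' underline, else None."""
    match = _SETEXT_UNDERLINE.match(line.rstrip("\n"))
    if match is None:
        return None
    return 1 if match.group(1)[0] == "=" else 2
```

A block parser would call a check like this only when the previous line is still an open paragraph candidate.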
Simple examples: Simple examples:
. .
Foo *bar* Foo *bar*
========= =========
Foo *bar* Foo *bar*
--------- ---------
skipping to change at line 833 skipping to change at line 841
Foo Foo
------------------------- -------------------------
Foo Foo
= =
. .
<h2>Foo</h2> <h2>Foo</h2>
<h1>Foo</h1> <h1>Foo</h1>
. .
The head content can be indented up to three spaces, and need The heading content can be indented up to three spaces, and need
not line up with the underlining: not line up with the underlining:
. .
Foo Foo
--- ---
Foo Foo
----- -----
Foo Foo
skipping to change at line 868 skipping to change at line 876
--- ---
. .
<pre><code>Foo <pre><code>Foo
--- ---
Foo Foo
</code></pre> </code></pre>
<hr /> <hr />
. .
The setext head underline can be indented up to three spaces, and The setext heading underline can be indented up to three spaces, and
may have trailing spaces: may have trailing spaces:
. .
Foo Foo
---- ----
. .
<h2>Foo</h2> <h2>Foo</h2>
. .
Four spaces is too much: Four spaces is too much:
. .
Foo Foo
--- ---
. .
<p>Foo <p>Foo
---</p> ---</p>
. .
The setext head underline cannot contain internal spaces: The setext heading underline cannot contain internal spaces:
. .
Foo Foo
= = = =
Foo Foo
--- - --- -
. .
<p>Foo <p>Foo
= =</p> = =</p>
skipping to change at line 922 skipping to change at line 930
Nor does a backslash at the end: Nor does a backslash at the end:
. .
Foo\ Foo\
---- ----
. .
<h2>Foo\</h2> <h2>Foo\</h2>
. .
Since indicators of block structure take precedence over Since indicators of block structure take precedence over
indicators of inline structure, the following are setext heads: indicators of inline structure, the following are setext headings:
. .
Foo Foo
---- ----

<a title="a lot <a title="a lot
--- ---
of dashes"/> of dashes"/>
. .
<h2>Foo</h2> <h2>Foo</h2>
<p></p> <p></p>
<h2>&lt;a title=&quot;a lot</h2> <h2>&lt;a title=&quot;a lot</h2>
<p>of dashes&quot;/&gt;</p> <p>of dashes&quot;/&gt;</p>
. .
The setext head underline cannot be a [lazy continuation The setext heading underline cannot be a [lazy continuation
line] in a list item or block quote: line] in a list item or block quote:
. .
> Foo > Foo
--- ---
. .
<blockquote> <blockquote>
<p>Foo</p> <p>Foo</p>
</blockquote> </blockquote>
<hr /> <hr />
skipping to change at line 962 skipping to change at line 970
. .
- Foo - Foo
--- ---
. .
<ul> <ul>
<li>Foo</li> <li>Foo</li>
</ul> </ul>
<hr /> <hr />
. .
A setext head cannot interrupt a paragraph: A setext heading cannot interrupt a paragraph:
. .
Foo Foo
Bar Bar
--- ---
Foo Foo
Bar Bar
=== ===
. .
skipping to change at line 997 skipping to change at line 1005
Bar Bar
--- ---
Baz Baz
. .
<hr /> <hr />
<h2>Foo</h2> <h2>Foo</h2>
<h2>Bar</h2> <h2>Bar</h2>
<p>Baz</p> <p>Baz</p>
. .
. .
==== ====
. .
<p>====</p> <p>====</p>
. .
Setext head text lines must not be interpretable as block Setext heading text lines must not be interpretable as block
constructs other than paragraphs. So, the line of dashes constructs other than paragraphs. So, the line of dashes
in these examples gets interpreted as a : in these examples gets interpreted as a thematic break:
. .
--- ---
--- ---
. .
<hr /> <hr />
<hr /> <hr />
. .
. .
skipping to change at line 1047 skipping to change at line 1055
. .
> foo > foo
----- -----
. .
<blockquote> <blockquote>
<p>foo</p> <p>foo</p>
</blockquote> </blockquote>
<hr /> <hr />
. .
If you want a head with > foo as its literal text, you can If you want a heading with > foo as its literal text, you can
use backslash escapes: use backslash escapes:
. .
\> foo \> foo
------ ------
. .
<h2>&gt; foo</h2> <h2>&gt; foo</h2>
. .
## Indented code blocks ## Indented code blocks
skipping to change at line 1189 skipping to change at line 1197
. .
<pre><code>foo <pre><code>foo
</code></pre> </code></pre>
<p>bar</p> <p>bar</p>
. .
And indented code can occur immediately before and after other kinds of And indented code can occur immediately before and after other kinds of
blocks: blocks:
. .
foo foo
------ ------
foo foo
---- ----
. .
<pre><code>foo <pre><code>foo
</code></pre> </code></pre>
<pre><code>foo <pre><code>foo
</code></pre> </code></pre>
<hr /> <hr />
. .
The first line can be indented more than four spaces: The first line can be indented more than four spaces:
. .
foo foo
bar bar
skipping to change at line 1357 skipping to change at line 1365
aaa aaa
~~~ ~~~
~~~~ ~~~~
. .
<pre><code>aaa <pre><code>aaa
~~~ ~~~
</code></pre> </code></pre>
. .
Unclosed code blocks are closed by the end of the document Unclosed code blocks are closed by the end of the document
(or the enclosing [block quote]]): (or the enclosing [block quote][block quotes] or [list item][list items]):
. .
 
. .
<pre><code></code></pre> <pre><code></code></pre>
. .
. .
 
skipping to change at line 1977 skipping to change at line 1985
. .
<style <style
type="text/css"> type="text/css">
h1 {color:red;} h1 {color:red;}
p {color:blue;} p {color:blue;}
</style> </style>
. .
If there is no matching end tag, the block will end at the If there is no matching end tag, the block will end at the
end of the document (or the enclosing [block or end of the document (or the enclosing [block quote][block quotes]
[list or [list item][list items]):
. .
<style <style
type="text/css"> type="text/css">
foo foo
. .
<style <style
type="text/css"> type="text/css">
skipping to change at line 2536 skipping to change at line 2544
Foo Foo
[bar]: /baz [bar]: /baz
[bar] [bar]
. .
<p>Foo <p>Foo
[bar]: /baz</p> [bar]: /baz</p>
<p>[bar]</p> <p>[bar]</p>
. .
However, it can directly follow other block elements, such as However, it can directly follow other block elements, such as headings
and and it need not be followed by a blank line. and thematic breaks, and it need not be followed by a blank line.
. .
# [Foo] # [Foo]
[foo]: /url [foo]: /url
> bar > bar
. .
<h1><a href="/url">Foo</a></h1> <h1><a href="/url">Foo</a></h1>
<blockquote> <blockquote>
<p>bar</p> <p>bar</p>
</blockquote> </blockquote>
skipping to change at line 3400 skipping to change at line 3408
<pre><code>bar <pre><code>bar
</code></pre> </code></pre>
<p>baz</p> <p>baz</p>
<blockquote> <blockquote>
<p>bam</p> <p>bam</p>
</blockquote> </blockquote>
</li> </li>
</ol> </ol>
. .
A list item that contains an indented code block will preserve
empty lines within the code block verbatim, unless there are two
or more empty lines in a row (since as described above, two
blank lines end the list):
.
- Foo
bar
baz
.
<ul>
<li>
<p>Foo</p>
<pre><code>bar
baz
</code></pre>
</li>
</ul>
.
.
- Foo
bar
baz
.
<ul>
<li>
<p>Foo</p>
<pre><code>bar
</code></pre>
</li>
</ul>
<pre><code> baz
</code></pre>
.
Note that ordered list start numbers must be nine digits or less: Note that ordered list start numbers must be nine digits or less:
. .
123456789. ok 123456789. ok
. .
<ol start="123456789"> <ol start="123456789">
<li>ok</li> <li>ok</li>
</ol> </ol>
. .
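The nine-digit limit on start numbers can be checked when the list marker is recognized. The Python sketch below is illustrative only: the name `ordered_list_start` is hypothetical, and it assumes (from the list-marker definition elsewhere in the spec, not shown in this excerpt) that the digits are followed by `.` or `)` and then a space or the end of the line.

```python
import re
from typing import Optional

# Up to 3 spaces of indentation, 1-9 ASCII digits, a '.' or ')' delimiter,
# then a space or the end of the line.
_ORDERED_MARKER = re.compile(r"^ {0,3}([0-9]{1,9})[.)](?: |$)")


def ordered_list_start(line: str) -> Optional[int]:
    """Return the start number of an ordered list marker, or None."""
    match = _ORDERED_MARKER.match(line)
    return int(match.group(1)) if match else None
```

With this check, `123456789. ok` yields a start number, while a ten-digit marker is left to be treated as ordinary paragraph text.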
skipping to change at line 3967 skipping to change at line 4016
<li> <li>
<ol start="2"> <ol start="2">
<li>foo</li> <li>foo</li>
</ol> </ol>
</li> </li>
</ul> </ul>
</li> </li>
</ol> </ol>
. .
A list item can contain a head: A list item can contain a heading:
. .
- # Foo - # Foo
- Bar - Bar
--- ---
baz baz
. .
<ul> <ul>
<li> <li>
<h1>Foo</h1> <h1>Foo</h1>
skipping to change at line 4778 skipping to change at line 4827
Escaped characters are treated as regular characters and do Escaped characters are treated as regular characters and do
not have their usual Markdown meanings: not have their usual Markdown meanings:
. .
\*not emphasized* \*not emphasized*
\<br/> not a tag \<br/> not a tag
\[not a link](/foo) \[not a link](/foo)
\not code \not code
1\. not a list 1\. not a list
\* not a list \* not a list
\# not a head \# not a heading
\[foo]: /url "not a reference" \[foo]: /url "not a reference"
. .
<p>*not emphasized* <p>*not emphasized*
&lt;br/&gt; not a tag &lt;br/&gt; not a tag
[not a link](/foo) [not a link](/foo)
not code not code
1. not a list 1. not a list
* not a list * not a list
# not a head # not a heading
[foo]: /url &quot;not a reference&quot;</p> [foo]: /url &quot;not a reference&quot;</p>
. .
If a backslash is itself escaped, the following character is not: If a backslash is itself escaped, the following character is not:
. .
\\*emphasis* \\*emphasis*
. .
<p>\<em>emphasis</em></p> <p>\<em>emphasis</em></p>
. .
skipping to change at line 4872 skipping to change at line 4921
. .
 foo\+bar  foo\+bar
foo foo
 
. .
<pre><code class="language-foo+bar">foo <pre><code class="language-foo+bar">foo
</code></pre> </code></pre>
. .
## Entites ## Entity and numeric character references
All valid HTML entity references and numeric character
valid HTML in code code references, except those occuring in code blocks, code spans,
are recognized as such and and raw HTML, are recognized as such and treated as equivalent to the
to corresponding Unicode characters. Conforming CommonMark parsers
need not need not store information about whether a particular character
Unicode or was represented in the source using a Unicode character or
an entity reference.
[es) consist of & + any of the valid [Entity references](@entity-references) consist of & + any of the valid
HTML5 entity names + ;. The HTML5 entity names + ;. The
document <https://html.spec.whatwg.org/multipage/entities.json>
is used as an authoritative source the valid entity and their is used as an authoritative source for the valid entity
corresponding code points. references and their corresponding code points.
. .
&nbsp; &amp; &copy; &AElig; &Dcaron; &nbsp; &amp; &copy; &AElig; &Dcaron;
&frac34; &HilbertSpace; &DifferentialD; &frac34; &HilbertSpace; &DifferentialD;
&ClockwiseContourIntegral; &ngE; &ClockwiseContourIntegral; &ngE;
. .
<p> &amp; © Æ Ď <p> &amp; © Æ Ď
¾ ℋ ⅆ ¾ ℋ ⅆ
∲ ≧̸</p> ∲ ≧̸</p>
. .
[Decimal [Decimal numeric character
consist of &# + a string of 1--8 arabic digits + ;. references](@decimal-numeric-character-references)
corresponding consist of &# + a string of 1--8 arabic digits + ;. A
Unicode Invalid Unicode code points will be replaced by numeric character reference is parsed as the corresponding
Unicode character. Invalid Unicode code points will be replaced by
the "unknown code point" character (U+FFFD). For security reasons, the "unknown code point" character (U+FFFD). For security reasons,
the code point U+0000 will also be replaced by U+FFFD. the code point U+0000 will also be replaced by U+FFFD.
. .
&#35; &#1234; &#992; &#98765432; &#0; &#35; &#1234; &#992; &#98765432; &#0;
. .
<p># Ӓ Ϡ � �</p> <p># Ӓ Ϡ � �</p>
. .
[Hexadecimal consist of &# + either [Hexadecimal numeric character
X or x + a string of 1-8 hexadecimal digits + ;. They references](@hexadecimal-numeric-character-references) consist of &# +
parsed the corresponding Unicode either X or x + a string of 1-8 hexadecimal digits + ;. They too are parsed
as the corresponding Unicode character (this time specified with a
hexadecimal numeral instead of decimal).
. .
&#X22; &#XD06; &#xcab; &#X22; &#XD06; &#xcab;
. .
<p>&quot; ആ ಫ</p> <p>&quot; ആ ಫ</p>
. .
Here are some nonentities: Here are some nonentities:
. .
&nbsp &x; &#; &#x; &ThisIsWayTooLongToBeAnEntityIsntIt; &hi?; &nbsp &x; &#; &#x; &ThisIsWayTooLongToBeAnEntityIsntIt; &hi?;
. .
<p>&amp;nbsp &amp;x; &amp;#; &amp;#x; &amp;ThisIsWayTooLongToBeAnEntityIsntIt; <p>&amp;nbsp &amp;x; &amp;#; &amp;#x; &amp;ThisIsWayTooLongToBeAnEntityIsntIt;
&amp;hi?;</p>
. .
Although HTML5 does accept some without a trailing semicolon Although HTML5 does accept some entity references
(such as &copy), these are not recognized here, because it without a trailing semicolon (such as &copy), these are not
makes the grammar too ambiguous: recognized here, because it makes the grammar too ambiguous:
. .
&copy &copy
. .
<p>&amp;copy</p> <p>&amp;copy</p>
. .
Strings that are not on the list of HTML5 named entities are not Strings that are not on the list of HTML5 named entities are not
recognized as entites either: recognized as entity references either:
. .
&MadeUpEntity; &MadeUpEntity;
. .
<p>&amp;MadeUpEntity;</p> <p>&amp;MadeUpEntity;</p>
. .
are recognized in any context besides code spans or Entity and numeric character references are recognized in any
code raw HTML, URLs, [link title]s, and context besides code spans or code blocks or raw HTML, including
[fenced code [info string]s: URLs, [link title]s, and [fenced code block][] [info string]s:
. .
<a href="&ouml;&ouml;.html"> <a href="&ouml;&ouml;.html">
. .
<a href="&ouml;&ouml;.html"> <a href="&ouml;&ouml;.html">
. .
. .
[foo](/f&ouml;&ouml; "f&ouml;&ouml;") [foo](/f&ouml;&ouml; "f&ouml;&ouml;")
. .
skipping to change at line 4982 skipping to change at line 5035
. .
 f&ouml;&ouml;  f&ouml;&ouml;
foo foo
 
. .
<pre><code class="language-föö">foo <pre><code class="language-föö">foo
</code></pre> </code></pre>
. .
are treated as literal text in code spans and code Entity and numeric character references are treated as literal
text in code spans and code blocks, and in raw HTML:
. .
f&ouml;&ouml; f&ouml;&ouml;
. .
<p><code>f&amp;ouml;&amp;ouml;</code></p> <p><code>f&amp;ouml;&amp;ouml;</code></p>
. .
. .
f&ouml;f&ouml; f&ouml;f&ouml;
. .
<pre><code>f&amp;ouml;f&amp;ouml; <pre><code>f&amp;ouml;f&amp;ouml;
</code></pre> </code></pre>
. .
.
<a href="f&ouml;f&ouml;"/>
.
<a href="f&ouml;f&ouml;"/>
.
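The decoding rules for entity and numeric character references can be sketched in Python. This is a hedged illustration rather than a reference implementation: the names `decode_references`, `_decode`, and `_REFERENCE` are hypothetical, the standard library table `html.entities.html5` stands in for the `entities.json` list, treating surrogate code points as invalid is an assumption, and the sketch does not track context (it does not skip code spans, code blocks, or raw HTML).

```python
import re
from html.entities import html5  # maps names such as "amp;" to characters

# A reference must end in ';'. Numeric forms allow 1-8 digits.
_REFERENCE = re.compile(
    r"&(?:#[xX]([0-9a-fA-F]{1,8})|#([0-9]{1,8})|([A-Za-z][A-Za-z0-9]*));"
)


def _decode(match: "re.Match[str]") -> str:
    hex_digits, dec_digits, name = match.groups()
    if name is not None:
        # Unknown names are left as literal text.
        return html5.get(name + ";", match.group(0))
    code = int(hex_digits, 16) if hex_digits is not None else int(dec_digits)
    # U+0000 and invalid code points become U+FFFD. Treating surrogates
    # as invalid is an assumption of this sketch.
    if code == 0 or code > 0x10FFFF or 0xD800 <= code <= 0xDFFF:
        return "\ufffd"
    return chr(code)


def decode_references(text: str) -> str:
    """Replace recognized references in `text` with their characters."""
    return _REFERENCE.sub(_decode, text)
```

Because the pattern requires the trailing semicolon, `&copy` is left untouched, and names that are not in the table, such as `&MadeUpEntity;`, pass through as literal text.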
## Code spans ## Code spans
A [backtick string](@backtick-string) A [backtick string](@backtick-string)
is a string of one or more backtick characters (  ) that is neither is a string of one or more backtick characters (  ) that is neither
preceded nor followed by a backtick. preceded nor followed by a backtick.
A [code span](@code-span) begins with a backtick string and ends with A [code span](@code-span) begins with a backtick string and ends with
a backtick string of equal length. The contents of the code span are a backtick string of equal length. The contents of the code span are
the characters between the two backtick strings, with leading and the characters between the two backtick strings, with leading and
trailing spaces and [line ending]s removed, and trailing spaces and [line ending]s removed, and
skipping to change at line 5269 skipping to change at line 5329
are a bit more complex than the ones given here.) are a bit more complex than the ones given here.)
The following rules define emphasis and strong emphasis: The following rules define emphasis and strong emphasis:
1. A single * character [can open emphasis](@can-open-emphasis) 1. A single * character [can open emphasis](@can-open-emphasis)
iff (if and only if) it is part of a [left-flanking delimiter run]. iff (if and only if) it is part of a [left-flanking delimiter run].
2. A single _ character [can open emphasis] iff 2. A single _ character [can open emphasis] iff
it is part of a [left-flanking delimiter run] it is part of a [left-flanking delimiter run]
and either (a) not part of a [right-flanking delimiter run] and either (a) not part of a [right-flanking delimiter run]
or (b) part of a [right-flanking delimter run] or (b) part of a [right-flanking delimiter run]
preceded by punctuation. preceded by punctuation.
3. A single * character [can close emphasis](@can-close-emphasis) 3. A single * character [can close emphasis](@can-close-emphasis)
iff it is part of a [right-flanking delimiter run]. iff it is part of a [right-flanking delimiter run].
4. A single _ character [can close emphasis] iff 4. A single _ character [can close emphasis] iff
it is part of a [right-flanking delimiter run] it is part of a [right-flanking delimiter run]
and either (a) not part of a [left-flanking delimiter run] and either (a) not part of a [left-flanking delimiter run]
or (b) part of a [left-flanking delimter run] or (b) part of a [left-flanking delimiter run]
followed by punctuation. followed by punctuation.
5. A double ** [can open strong emphasis](@can-open-strong-emphasis) 5. A double ** [can open strong emphasis](@can-open-strong-emphasis)
iff it is part of a [left-flanking delimiter run]. iff it is part of a [left-flanking delimiter run].
6. A double __ [can open strong emphasis] iff 6. A double __ [can open strong emphasis] iff
it is part of a [left-flanking delimiter run] it is part of a [left-flanking delimiter run]
and either (a) not part of a [right-flanking delimiter run] and either (a) not part of a [right-flanking delimiter run]
or (b) part of a [right-flanking delimter run] or (b) part of a [right-flanking delimiter run]
preceded by punctuation. preceded by punctuation.
7. A double ** [can close strong emphasis](@can-close-strong-emphasis) 7. A double ** [can close strong emphasis](@can-close-strong-emphasis)
iff it is part of a [right-flanking delimiter run]. iff it is part of a [right-flanking delimiter run].
8. A double __ [can close strong emphasis] 8. A double __ [can close strong emphasis]
it is part of a [right-flanking delimiter run] it is part of a [right-flanking delimiter run]
and either (a) not part of a [left-flanking delimiter run] and either (a) not part of a [left-flanking delimiter run]
or (b) part of a [left-flanking delimter run] or (b) part of a [left-flanking delimiter run]
followed by punctuation. followed by punctuation.
9. Emphasis begins with a delimiter that [can open emphasis] and ends 9. Emphasis begins with a delimiter that [can open emphasis] and ends
with a delimiter that [can close emphasis], and that uses the same with a delimiter that [can close emphasis], and that uses the same
character (_ or *) as the opening delimiter. There must character (_ or *) as the opening delimiter. There must
be a nonempty sequence of inlines between the open delimiter be a nonempty sequence of inlines between the open delimiter
and the closing delimiter; these form the contents of the emphasis and the closing delimiter; these form the contents of the emphasis
inline. inline.
10. Strong emphasis begins with a delimiter that 10. Strong emphasis begins with a delimiter that
skipping to change at line 6512 skipping to change at line 6572
<p><a href="foo):">link</a></p> <p><a href="foo):">link</a></p>
. .
A link can contain fragment identifiers and queries: A link can contain fragment identifiers and queries:
. .
[link](#fragment) [link](#fragment)
[link](http://example.com#fragment) [link](http://example.com#fragment)
[link](http://example.com?foo=) [link](http://example.com?foo=3#frag)
. .
<p><a href="#fragment">link</a></p> <p><a href="#fragment">link</a></p>
<p><a href="http://example.com#fragment">link</a></p> <p><a href="http://example.com#fragment">link</a></p>
<p><a href="http://example.com?foo=">link</a></p> <p><a href="http://example.com?foo=3#frag">link</a></p>
. .
Note that a backslash before a non-escapable character is Note that a backslash before a non-escapable character is
just a backslash: just a backslash:
. .
[link](foo\bar) [link](foo\bar)
. .
<p><a href="foo%5Cbar">link</a></p> <p><a href="foo%5Cbar">link</a></p>
. .
URL-escaping should be left alone inside the destination, as all URL-escaping should be left alone inside the destination, as all
URL-escaped characters are also valid URL characters. in URL-escaped characters are also valid URL characters.
Entity and the destination will be parsed into the corresponding Unicode numerical character references in the destination will be parsed code points, as optionally URL-escaped when written as into the corresponding Unicode code points, as usual. These may be optionally URL-escaped when written as HTML, but this spec does not enforce any particular policy for rendering URLs in HTML or other formats. Renderers may make different decisions about how to escape or normalize URLs in the output. . . [link](foo%20b&auml;) [link](foo%20b&auml;) . . <p><a href="foo%20b%C3%A4">link</a></p> <p><a href="foo%20b%C3%A4">link</a></p> . . Note that, because titles can often be parsed as destinations, Note that, because titles can often be parsed as destinations, if you try to omit the destination and keep the title, you'll if you try to omit the destination and keep the title, you'll get unexpected results: get unexpected results: skipping to change at line 6561 skipping to change at line 6625 . . [link](/url "title") [link](/url "title") [link](/url 'title') [link](/url 'title') [link](/url (title)) [link](/url (title)) . . <p><a href="/url" title="title">link</a> <p><a href="/url" title="title">link</a> <a href="/url" title="title">link</a> <a href="/url" title="title">link</a> <a href="/url" title="title">link</a></p> <a href="/url" title="title">link</a></p> . . Backslash escapes and may be used in titles: Backslash escapes and entity and numeric character references may be used in titles: . . [link](/url "title \"&quot;") [link](/url "title \"&quot;") . . <p><a href="/url" title="title &quot;&quot;">link</a></p> <p><a href="/url" title="title &quot;&quot;">link</a></p> . . Nested balanced quotes are not allowed without escaping: Nested balanced quotes are not allowed without escaping: . . skipping to change at line 6589 skipping to change at line 6654 . . [link](/url 'title "and" title') [link](/url 'title "and" title') . . 
<p><a href="/url" title="title &quot;and&quot; title">link</a></p> <p><a href="/url" title="title &quot;and&quot; title">link</a></p> . . (Note: Markdown.pl did allow double quotes inside a double-quoted (Note: Markdown.pl did allow double quotes inside a double-quoted title, and its test suite included a test demonstrating this. title, and its test suite included a test demonstrating this. But it is hard to see a good rationale for the extra complexity this But it is hard to see a good rationale for the extra complexity this brings, since there are already many ways---backslash escaping, brings, since there are already many ways---backslash escaping, or using a different quote type for the enclosing title---to entity and numeric character references, or using a different write titles containing double quotes. Markdown.pl's handling of quote type for the enclosing title---to write titles containing titles has a number of other strange features. For example, it allows double quotes. Markdown.pl's handling of titles has a number single-quoted titles in inline links, but not reference links. And, in of other strange features. For example, it allows single-quoted reference links but not inline links, it allows a title to begin with titles in inline links, but not reference links. And, in " and end with ). Markdown.pl 1.0.1 even allows titles with no closing reference links but not inline links, it allows a title to begin quotation mark, though 1.0.2b8 does not. It seems preferable to adopt with " and end with ). Markdown.pl 1.0.1 even allows a simple, rational rule that works the same way in inline links and titles with no closing quotation mark, though 1.0.2b8 does not. link reference definitions.) It seems preferable to adopt a simple, rational rule that works the same way in inline links and link reference definitions.) [Whitespace] is allowed around the destination and title: [Whitespace] is allowed around the destination and title: . . 
[link]( /uri [link]( /uri "title" ) "title" ) . . <p><a href="/uri" title="title">link</a></p> <p><a href="/uri" title="title">link</a></p> . . skipping to change at line 6728 skipping to change at line 6794 [foo<http://example.com/?search=](uri)> [foo<http://example.com/?search=](uri)> . . <p>[foo<a href="http://example.com/?search=%5D(uri)">http://example.com/?search= ](uri)</a></p> <p>[foo<a href="http://example.com/?search=%5D(uri)">http://example.com/?search= ](uri)</a></p> . . There are three kinds of [reference link](@reference-link)s: There are three kinds of [reference link](@reference-link)s: [full](#full-reference-link), [collapsed](#collapsed-reference-link), [full](#full-reference-link), [collapsed](#collapsed-reference-link), and [shortcut](#shortcut-reference-link). and [shortcut](#shortcut-reference-link). A [full reference link](@full-reference-link) A [full reference link](@full-reference-link) consists of a [link text] a [link label] consists of a [link text] immediately followed by a [link label] that [matches] a [link reference definition] elsewhere in the document. that [matches] a [link reference definition] elsewhere in the document. A [link label](@link-label) begins with a left bracket ([) and ends A [link label](@link-label) begins with a left bracket ([) and ends with the first right bracket (]) that is not backslash-escaped. with the first right bracket (]) that is not backslash-escaped. Between these brackets there must be at least one [non-whitespace character]. Between these brackets there must be at least one [non-whitespace character]. Unescaped square bracket characters are not allowed in Unescaped square bracket characters are not allowed in [link label]s. A link label can have at most 999 [link label]s. A link label can have at most 999 characters inside the square brackets. characters inside the square brackets. 
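The link label rules above can be sketched as a small scanner. This is a minimal illustration, not the reference implementation: the function name `scan_link_label` is hypothetical, and the whitespace set used for the "at least one non-whitespace character" check is an assumption (space, tab, newline, vertical tab, form feed, carriage return).

```python
SPEC_WHITESPACE = " \t\n\x0b\x0c\r"  # assumed whitespace set for this sketch


def scan_link_label(text, start=0):
    """Return the index just past the closing ']' of a link label that
    begins at text[start], or None if no valid label starts there."""
    if start >= len(text) or text[start] != "[":
        return None
    i = start + 1
    while i < len(text):
        ch = text[i]
        if ch == "\\" and i + 1 < len(text):
            i += 2  # a backslash protects the next character, e.g. \] or \[
            continue
        if ch == "[":
            return None  # unescaped '[' is not allowed inside a label
        if ch == "]":
            inner = text[start + 1:i]
            if len(inner) > 999:
                return None  # at most 999 characters inside the brackets
            if not inner.strip(SPEC_WHITESPACE):
                return None  # needs at least one non-whitespace character
            return i + 1
        i += 1
    return None  # no closing bracket found
```

For example, `scan_link_label("[foo]")` accepts the whole string, while `scan_link_label("[]")` rejects it because the label is empty.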
One label [matches](@matches) One label [matches](@matches) skipping to change at line 6898 skipping to change at line 6964 . . [Foo [Foo bar]: /url bar]: /url [Baz][Foo bar] [Baz][Foo bar] . . <p><a href="/url">Baz</a></p> <p><a href="/url">Baz</a></p> . . [whitespace] between the [link text] and the [link label]: No [whitespace] is allowed between the [link text] and the [link label]: . . [foo] [bar] [foo] [bar] [bar]: /url "title" [bar]: /url "title" . . <p></a></p> <p>[foo] <a href="/url" title="title">bar</a></p> . . . . [foo] [foo] [bar] [bar] [bar]: /url "title" [bar]: /url "title" . . href="/url" <p>[foo] <a href="/url" title="title">bar</a></p> . . This is a departure from John Gruber's original Markdown syntax description, which explicitly allows whitespace between the link text and the link label. It brings reference links in line with [inline link]s, which (according to both original Markdown and this spec) cannot have whitespace after the link text. More importantly, it prevents inadvertent capture of consecutive [shortcut reference link]s. If whitespace is allowed between the link text and the link label, then in the following we will have a single reference link, not two shortcut reference links, as intended:  markdown [foo] [bar] [foo]: /url1 [bar]: /url2  (Note that [shortcut reference link]s were introduced by Gruber himself in a beta version of Markdown.pl, but never included in the official syntax description. Without shortcut reference links, it is harmless to allow space between the link text and link label; but once shortcut references are introduced, it is too dangerous to allow this, as it frequently leads to unintended results.) When there are multiple matching [link reference definition]s, When there are multiple matching [link reference definition]s, the first is used: the first is used: . . [foo]: /url1 [foo]: /url1 [foo]: /url2 [foo]: /url2 [bar][foo] [bar][foo] . . skipping to change at line 6980 skipping to change at line 7075 . . . 
. [foo][ref\[] [foo][ref\[]
[ref\[]: /uri [ref\[]: /uri
. .
<p><a href="/uri">foo</a></p> <p><a href="/uri">foo</a></p>
. .
Note that in this example ] is not backslash-escaped:
.
[bar\\]: /uri
[bar\\]
.
<p><a href="/uri">bar\</a></p>
.
A [link label] must contain at least one [non-whitespace character]: A [link label] must contain at least one [non-whitespace character]:
. .
[] []
[]: /uri []: /uri
. .
<p>[]</p> <p>[]</p>
<p>[]: /uri</p> <p>[]: /uri</p>
. .
skipping to change at line 7007 skipping to change at line 7112
. .
<p>[ <p>[
]</p> ]</p>
<p>[ <p>[
]: /uri</p> ]: /uri</p>
. .
consists of a [link label] that [matches] a consists of a [link label] that [matches] a
[link reference definition] elsewhere in the [link reference definition] elsewhere in the
document, the string []. document, followed by the string [].
The contents of the first link label are parsed as inlines, The contents of the first link label are parsed as inlines,
which are used as the link's text. The link's URI and title are which are used as the link's text. The link's URI and title are
provided by the matching reference link definition. Thus, provided by the matching reference link definition. Thus,
[foo][] is equivalent to [foo][foo]. [foo][] is equivalent to [foo][foo].
. .
[foo][] [foo][]
[foo]: /url "title" [foo]: /url "title"
. .
skipping to change at line 7039 skipping to change at line 7144
. .
[Foo][] [Foo][]
[foo]: /url "title" [foo]: /url "title"
. .
<p><a href="/url" title="title">Foo</a></p> <p><a href="/url" title="title">Foo</a></p>
. .
As with full reference links, [whitespace] is allowed As with full reference links, [whitespace] is not
between the two sets of brackets: allowed between the two sets of brackets:
. .
[foo] [foo]
[] []
[foo]: /url "title" [foo]: /url "title"
. .
<p><a href="/url" <p><a href="/url" title="title">foo</a>
[]</p>
. .
consists of a [link label] that [matches] a consists of a [link label] that [matches] a
[link reference definition] elsewhere in the [link reference definition] elsewhere in the
document and is not followed by [] or a link label. document and is not followed by [] or a link label.
The contents of the first link label are parsed as inlines, The contents of the first link label are parsed as inlines,
which are used as the link's text. the link's URI and title which are used as the link's text. the link's URI and title
are provided by the matching link reference definition. are provided by the matching link reference definition.
Thus, [foo] is equivalent to [foo][]. Thus, [foo] is equivalent to [foo][].
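The two equivalences stated above ([foo][] behaves like [foo][foo], and [foo] behaves like [foo][]) can be sketched as a single lookup. This is an illustration under assumptions: the helper names are hypothetical, and `normalize_label` (case fold plus whitespace collapsing) stands in for the spec's label-matching rule, which is not shown in this excerpt.

```python
def normalize_label(label):
    # Assumed normalization: case-insensitive, runs of whitespace collapsed.
    return " ".join(label.casefold().split())


def add_definition(definitions, label, url, title=None):
    # When several definitions match, the first one wins.
    definitions.setdefault(normalize_label(label), (url, title))


def resolve(first_label, second_label, definitions):
    """Look up a reference link.

    second_label is None for a shortcut link ([foo]), "" for a collapsed
    link ([foo][]), or the explicit label of a full link ([text][label]).
    Returns (url, title), or None if no definition matches.
    """
    key = second_label if second_label else first_label
    return definitions.get(normalize_label(key))
```

With a single definition for `foo`, `resolve("Foo", "", defs)`, `resolve("Foo", None, defs)` and `resolve("Baz", "foo", defs)` all return the same destination and title.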
skipping to change at line 7268 skipping to change at line 7374
. .
![](/url) ![](/url)
. .
<p><img src="/url" alt="" /></p> <p><img src="/url" alt="" /></p>
. .
Reference-style: Reference-style:
. .
![foo][bar] ![foo][bar]
[bar]: /url [bar]: /url
. .
<p><img src="/url" alt="foo" /></p> <p><img src="/url" alt="foo" /></p>
. .
. .
![foo][bar] ![foo][bar]
[BAR]: /url [BAR]: /url
. .
<p><img src="/url" alt="foo" /></p> <p><img src="/url" alt="foo" /></p>
. .
Collapsed: Collapsed:
. .
![foo][] ![foo][]
skipping to change at line 7311 skipping to change at line 7417
The labels are case-insensitive: The labels are case-insensitive:
. .
![Foo][] ![Foo][]
[foo]: /url "title" [foo]: /url "title"
. .
<p><img src="/url" alt="Foo" title="title" /></p> <p><img src="/url" alt="Foo" title="title" /></p>
. .
As with allowed As with reference links, [whitespace] is not allowed
between the two sets of brackets: between the two sets of brackets:
. .
![foo] ![foo]
[] []
[foo]: /url "title" [foo]: /url "title"
. .
<p><img src="/url" alt="foo" title="title" <p><img src="/url" alt="foo" title="title" />
[]</p>
. .
Shortcut: Shortcut:
. .
![foo] ![foo]
[foo]: /url "title" [foo]: /url "title"
. .
<p><img src="/url" alt="foo" title="title" /></p> <p><img src="/url" alt="foo" title="title" /></p>
skipping to change at line 7594 skipping to change at line 7701
A [single-quoted attribute value](@single-quoted-attribute-value) A [single-quoted attribute value](@single-quoted-attribute-value)
consists of ', zero or more consists of ', zero or more
characters not including ', and a final '. characters not including ', and a final '.
A [double-quoted attribute value](@double-quoted-attribute-value) A [double-quoted attribute value](@double-quoted-attribute-value)
consists of ", zero or more consists of ", zero or more
characters not including ", and a final ". characters not including ", and a final ".
An [open tag](@open-tag) consists of a < character, a [tag name], An [open tag](@open-tag) consists of a < character, a [tag name],
zero or more [attribute, optional [whitespace], an optional / zero or more [attribute]s, optional [whitespace], an optional /
character, and a > character. character, and a > character.
A [closing tag](@closing-tag) consists of the string </, a A [closing tag](@closing-tag) consists of the string </, a
[tag name], optional [whitespace], and the character >. [tag name], optional [whitespace], and the character >.
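The quoted attribute value and closing tag definitions above translate fairly directly into regular expressions. The sketch below is illustrative only: the `TAG_NAME` pattern (an ASCII letter followed by letters, digits, or hyphens) and the whitespace class are assumptions, since neither definition appears in this excerpt.

```python
import re

# ', zero or more characters not including ', and a final '
SINGLE_QUOTED = re.compile(r"'[^']*'")
# ", zero or more characters not including ", and a final "
DOUBLE_QUOTED = re.compile(r'"[^"]*"')

TAG_NAME = r"[A-Za-z][A-Za-z0-9-]*"   # assumption, not defined in this excerpt
WHITESPACE = r"[ \t\n\x0b\x0c\r]"     # assumption, not defined in this excerpt

# The string </, a tag name, optional whitespace, and the character >
CLOSING_TAG = re.compile(r"</" + TAG_NAME + WHITESPACE + r"*>")


def is_closing_tag(s):
    """True if the whole string s is a closing tag under the assumptions above."""
    return CLOSING_TAG.fullmatch(s) is not None
```

Because a closing tag has no room for attributes, `is_closing_tag('</a href="foo">')` is false, which agrees with the "Illegal attributes in closing tag" example. Likewise, the double-quoted pattern cannot match `"\""` as a whole, since a backslash does not escape the quote.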
An [HTML comment](@html-comment) consists of <!-- + *text* + -->, An [HTML comment](@html-comment) consists of <!-- + *text* + -->,
where *text* does not start with > or ->, does not end with -, where *text* does not start with > or ->, does not end with -,
and does not contain --. (See the and does not contain --. (See the
skipping to change at line 7662 skipping to change at line 7769
<a foo="bar" bam = 'baz <em>"</em>' <a foo="bar" bam = 'baz <em>"</em>'
_boolean zoop:33=zoop:33 /> _boolean zoop:33=zoop:33 />
. .
<p><a foo="bar" bam = 'baz <em>"</em>' <p><a foo="bar" bam = 'baz <em>"</em>'
_boolean zoop:33=zoop:33 /></p> _boolean zoop:33=zoop:33 /></p>
. .
Custom tag names can be used: Custom tag names can be used:
. .
<responsive-image src="foo.jpg" /> Foo <responsive-image src="foo.jpg" />
. .
<responsive-image src="foo.jpg" <p>Foo <responsive-image src="foo.jpg" /></p>
. .
Illegal tag names, not parsed as HTML: Illegal tag names, not parsed as HTML:
. .
<33> <__> <33> <__>
. .
<p>&lt;33&gt; &lt;__&gt;</p> <p>&lt;33&gt; &lt;__&gt;</p>
. .
skipping to change at line 7719 skipping to change at line 7819
. .
<a href='bar'title=title> <a href='bar'title=title>
. .
<p>&lt;a href='bar'title=title&gt;</p> <p>&lt;a href='bar'title=title&gt;</p>
. .
Closing tags: Closing tags:
. .
</a></foo >
>
. .
<p></a></foo ></p>
. .
Illegal attributes in closing tag: Illegal attributes in closing tag:
. .
</a href="foo"> </a href="foo">
. .
<p>&lt;/a href=&quot;foo&quot;&gt;</p> <p>&lt;/a href=&quot;foo&quot;&gt;</p>
. .
skipping to change at line 7785 skipping to change at line 7883
. .
CDATA sections: CDATA sections:
. .
foo <![CDATA[>&<]]> foo <![CDATA[>&<]]>
. .
<p>foo <![CDATA[>&<]]></p> <p>foo <![CDATA[>&<]]></p>
. .
are preserved in HTML attributes: Entity and numeric character references are preserved in HTML
attributes:
. .
<a href="&ouml;"> foo <a href="&ouml;">
. .
<> <p>foo <a href="&ouml;"></p>
. .
Backslash escapes do not work in HTML attributes: Backslash escapes do not work in HTML attributes:
. .
<a href="\*"> foo <a href="\*">
. .
<> <p>foo <a href="\*"></p>
. .
. .
<a href="\""> <a href="\"">
. .
<p>&lt;a href=&quot;&quot;&quot;&gt;</p> <p>&lt;a href=&quot;&quot;&quot;&gt;</p>
. .
## Hard line breaks ## Hard line breaks
skipping to change at line 8017 skipping to change at line 8116
## Overview {-} ## Overview {-}
Parsing has two phases: Parsing has two phases:
1. In the first phase, lines of input are consumed and the block 1. In the first phase, lines of input are consumed and the block
structure of the document---its division into paragraphs, block quotes, structure of the document---its division into paragraphs, block quotes,
list items, and so on---is constructed. Text is assigned to these list items, and so on---is constructed. Text is assigned to these
blocks but not parsed. Link reference definitions are parsed and a blocks but not parsed. Link reference definitions are parsed and a
2. In the second phase, the raw text contents of paragraphs and heads 2. In the second phase, the raw text contents of paragraphs and headings
are parsed into sequences of Markdown inline elements (strings, are parsed into sequences of Markdown inline elements (strings,
code spans, links, emphasis, and so on), using the map of link code spans, links, emphasis, and so on), using the map of link
references constructed in phase 1. references constructed in phase 1.
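The value of the two-phase split is that a reference link can be resolved even when its definition appears later in the document. The toy sketch below illustrates only that point. It is not the spec's algorithm: it knows only blank-line-separated paragraphs, one-line definitions, and shortcut references, it does no HTML escaping, and every name in it is hypothetical.

```python
import re

DEF_RE = re.compile(r"^\[([^\[\]]+)\]:\s*(\S+)\s*$")   # toy one-line definition
SHORTCUT_RE = re.compile(r"\[([^\[\]]+)\]")            # toy shortcut reference


def phase1(lines):
    """Build block structure (paragraphs only) and the map of link references.
    Paragraph text is stored raw, not parsed."""
    blocks, refmap, current = [], {}, []
    for line in lines:
        m = DEF_RE.match(line)
        if m and not current:
            # First definition for a label wins.
            refmap.setdefault(m.group(1).casefold(), m.group(2))
        elif line.strip():
            current.append(line)
        elif current:
            blocks.append("\n".join(current))
            current = []
    if current:
        blocks.append("\n".join(current))
    return blocks, refmap


def phase2(blocks, refmap):
    """Parse the raw paragraph text as inlines, using the map from phase 1."""
    def link(m):
        url = refmap.get(m.group(1).casefold())
        if url is None:
            return m.group(0)  # no matching definition: leave the text as is
        return '<a href="%s">%s</a>' % (url, m.group(1))
    return ["<p>" + SHORTCUT_RE.sub(link, b) + "</p>" for b in blocks]
```

Running `phase2(*phase1(["See [foo] here.", "", "[foo]: /url"]))` links `foo` even though the definition comes after its use, because no inline parsing happens until phase 1 has seen the whole input.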
At each point in processing, the document is represented as a tree of At each point in processing, the document is represented as a tree of
**blocks**. The root of the tree is a document block. The document **blocks**. The root of the tree is a document block. The document
may have any number of other blocks as **children**. These children may have any number of other blocks as **children**. These children
may, in turn, have other blocks as children. The last child of a block may, in turn, have other blocks as children. The last child of a block
is normally considered **open**, meaning that subsequent lines of input is normally considered **open**, meaning that subsequent lines of input
can alter its contents. (Blocks that are not open are **closed**.) can alter its contents. (Blocks that are not open are **closed**.)
skipping to change at line 8080 skipping to change at line 8179
2. Next, after consuming the continuation markers for existing 2. Next, after consuming the continuation markers for existing
blocks, we look for new block starts (e.g. > for a block quote. blocks, we look for new block starts (e.g. > for a block quote.
If we encounter a new block start, we close any blocks unmatched If we encounter a new block start, we close any blocks unmatched
in step 1 before creating the new block as a child of the last in step 1 before creating the new block as a child of the last
matched block. matched block.
3. Finally, we look at the remainder of the line (after block 3. Finally, we look at the remainder of the line (after block
markers like >, list markers, and indentation have been consumed). markers like >, list markers, and indentation have been consumed).
This is text that can be incorporated into the last open This is text that can be incorporated into the last open
block (a paragraph, code block, head, or raw HTML). block (a paragraph, code block, heading, or raw HTML).
Setext are formed when we detect that the second line of Setext headings are formed when we detect that the second line of
a paragraph is a setext line. a paragraph is a setext heading line.
Reference link definitions are detected when a paragraph is closed; Reference link definitions are detected when a paragraph is closed;
the accumulated text lines are parsed to see if they begin with the accumulated text lines are parsed to see if they begin with
one or more reference link definitions. Any remainder becomes a one or more reference link definitions. Any remainder becomes a
normal paragraph. normal paragraph.
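The step of peeling definitions off the front of a closed paragraph can be sketched as follows. This is a simplified illustration: it assumes each definition fits on one line and has no title, and the names `ONE_LINE_DEF` and `close_paragraph` are hypothetical.

```python
import re

# Simplified one-line definition: up to three spaces of indentation,
# a bracketed label, a colon, and a destination with no title.
ONE_LINE_DEF = re.compile(r"^ {0,3}\[([^\[\]]+)\]:[ \t]*(\S+)[ \t]*$")


def close_paragraph(lines, refmap):
    """Consume leading definitions from a closed paragraph's lines.

    Definitions are recorded in refmap (first definition wins); the
    remaining lines are returned and form a normal paragraph."""
    i = 0
    while i < len(lines):
        m = ONE_LINE_DEF.match(lines[i])
        if m is None or not m.group(1).strip():
            break  # not a definition: everything from here on is paragraph text
        label = " ".join(m.group(1).casefold().split())
        refmap.setdefault(label, m.group(2))
        i += 1
    return lines[i:]
```

Only definitions at the start of the accumulated lines are consumed, so a line that looks like a definition but follows ordinary text stays in the paragraph.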
We can see how this works by considering how the tree above is We can see how this works by considering how the tree above is
generated by four lines of Markdown: generated by four lines of Markdown:
 markdown  markdown
skipping to change at line 8192 skipping to change at line 8291
-> list_item -> list_item
-> paragraph -> paragraph
"aliquando id" "aliquando id"
 
## Phase 2: inline structure {-} ## Phase 2: inline structure {-}
Once all of the input has been parsed, all open blocks are closed. Once all of the input has been parsed, all open blocks are closed.
We then "walk the tree," visiting every node, and parse raw We then "walk the tree," visiting every node, and parse raw
string contents of paragraphs and heads as inlines. At this string contents of paragraphs and headings as inlines. At this
point we have seen all the link reference definitions, so we can point we have seen all the link reference definitions, so we can
resolve reference links as we go. resolve reference links as we go.
 tree  tree
document document
block_quote block_quote
paragraph paragraph
str "Lorem ipsum dolor" str "Lorem ipsum dolor"
softbreak softbreak
str "sit amet." str "sit amet."
End of changes. 110 change blocks.
160 lines changed or deleted 258 lines changed or added
