spec.txt   spec.txt 
--- ---
title: CommonMark Spec title: CommonMark Spec
author: John MacFarlane author: John MacFarlane
version: 0.28 version: 0.29
date: '2017-08-01' date: '2019-04-06'
license: '[CC-BY-SA 4.0](http://creativecommons.org/licenses/by-sa/4.0/)' license: '[CC-BY-SA 4.0](http://creativecommons.org/licenses/by-sa/4.0/)'
... ...
# Introduction # Introduction
## What is Markdown? ## What is Markdown?
Markdown is a plain text format for writing structured documents, Markdown is a plain text format for writing structured documents,
based on conventions for indicating formatting in email based on conventions for indicating formatting in email
and usenet posts. It was developed by John Gruber (with and usenet posts. It was developed by John Gruber (with
skipping to change at line 251 skipping to change at line 251
[foo][] [foo][]
``` ```
In the absence of a spec, early implementers consulted `Markdown.pl` In the absence of a spec, early implementers consulted `Markdown.pl`
to resolve these ambiguities. But `Markdown.pl` was quite buggy, and to resolve these ambiguities. But `Markdown.pl` was quite buggy, and
gave manifestly bad results in many cases, so it was not a gave manifestly bad results in many cases, so it was not a
satisfactory replacement for a spec. satisfactory replacement for a spec.
Because there is no unambiguous spec, implementations have diverged Because there is no unambiguous spec, implementations have diverged
considerably. As a result, users are often surprised to find that considerably. As a result, users are often surprised to find that
a document that renders one way on one system (say, a github wiki) a document that renders one way on one system (say, a GitHub wiki)
renders differently on another (say, converting to docbook using renders differently on another (say, converting to docbook using
pandoc). To make matters worse, because nothing in Markdown counts pandoc). To make matters worse, because nothing in Markdown counts
as a "syntax error," the divergence often isn't discovered right away. as a "syntax error," the divergence often isn't discovered right away.
## About this document ## About this document
This document attempts to specify Markdown syntax unambiguously. This document attempts to specify Markdown syntax unambiguously.
It contains many examples with side-by-side Markdown and It contains many examples with side-by-side Markdown and
HTML. These are intended to double as conformance tests. An HTML. These are intended to double as conformance tests. An
accompanying script `spec_tests.py` can be used to run the tests accompanying script `spec_tests.py` can be used to run the tests
skipping to change at line 331 skipping to change at line 331
[Unicode whitespace](@) is a sequence of one [Unicode whitespace](@) is a sequence of one
or more [Unicode whitespace characters]. or more [Unicode whitespace characters].
A [space](@) is `U+0020`. A [space](@) is `U+0020`.
A [non-whitespace character](@) is any character A [non-whitespace character](@) is any character
that is not a [whitespace character]. that is not a [whitespace character].
An [ASCII punctuation character](@) An [ASCII punctuation character](@)
is `!`, `"`, `#`, `$`, `%`, `&`, `'`, `(`, `)`, is `!`, `"`, `#`, `$`, `%`, `&`, `'`, `(`, `)`,
`*`, `+`, `,`, `-`, `.`, `/`, `:`, `;`, `<`, `=`, `>`, `?`, `@`, `*`, `+`, `,`, `-`, `.`, `/` (U+0021–2F),
`[`, `\`, `]`, `^`, `_`, `` ` ``, `{`, `|`, `}`, or `~`. `:`, `;`, `<`, `=`, `>`, `?`, `@` (U+003A–0040),
`[`, `\`, `]`, `^`, `_`, `` ` `` (U+005B–0060),
`{`, `|`, `}`, or `~` (U+007B–007E).
A [punctuation character](@) is an [ASCII A [punctuation character](@) is an [ASCII
punctuation character] or anything in punctuation character] or anything in
the general Unicode categories `Pc`, `Pd`, `Pe`, `Pf`, `Pi`, `Po`, or `Ps`. the general Unicode categories `Pc`, `Pd`, `Pe`, `Pf`, `Pi`, `Po`, or `Ps`.
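As an illustration only, the two definitions above can be sketched in Python. The function names `is_ascii_punctuation` and `is_punctuation` are hypothetical and not part of the spec; the code point ranges are the ones given in the right-hand (0.29) column.

```python
import string
import unicodedata

# Inclusive code point ranges from the ASCII punctuation definition above.
ASCII_PUNCT_RANGES = ((0x21, 0x2F), (0x3A, 0x40), (0x5B, 0x60), (0x7B, 0x7E))

# General Unicode categories named in the punctuation definition above.
UNICODE_PUNCT_CATEGORIES = {"Pc", "Pd", "Pe", "Pf", "Pi", "Po", "Ps"}


def is_ascii_punctuation(ch: str) -> bool:
    """Return True if the single character ch is ASCII punctuation."""
    cp = ord(ch)
    return any(lo <= cp <= hi for lo, hi in ASCII_PUNCT_RANGES)


def is_punctuation(ch: str) -> bool:
    """Return True if ch is ASCII punctuation or in a listed P* category."""
    if is_ascii_punctuation(ch):
        return True
    return unicodedata.category(ch) in UNICODE_PUNCT_CATEGORIES
```

The four ranges cover the same 32 characters as Python's `string.punctuation`. Characters such as `$` and `+` count as punctuation here only because of the ASCII rule, since their Unicode categories (`Sc`, `Sm`) are not in the list.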
## Tabs ## Tabs
Tabs in lines are not expanded to [spaces]. However, Tabs in lines are not expanded to [spaces]. However,
in contexts where whitespace helps to define block structure, in contexts where whitespace helps to define block structure,
tabs behave as if they were replaced by spaces with a tab stop tabs behave as if they were replaced by spaces with a tab stop
skipping to change at line 514 skipping to change at line 516
paragraphs, headings, and other block constructs can be parsed for inline paragraphs, headings, and other block constructs can be parsed for inline
structure. The second step requires information about link reference structure. The second step requires information about link reference
definitions that will be available only at the end of the first definitions that will be available only at the end of the first
step. Note that the first step requires processing lines in sequence, step. Note that the first step requires processing lines in sequence,
but the second can be parallelized, since the inline parsing of but the second can be parallelized, since the inline parsing of
one block element does not affect the inline parsing of any other. one block element does not affect the inline parsing of any other.
## Container blocks and leaf blocks ## Container blocks and leaf blocks
We can divide blocks into two types: We can divide blocks into two types:
[container block](@)s, [container blocks](@),
which can contain other blocks, and [leaf block](@)s, which can contain other blocks, and [leaf blocks](@),
which cannot. which cannot.
# Leaf blocks # Leaf blocks
This section describes the different kinds of leaf block that make up a This section describes the different kinds of leaf block that make up a
Markdown document. Markdown document.
## Thematic breaks ## Thematic breaks
A line consisting of 0-3 spaces of indentation, followed by a sequence A line consisting of 0-3 spaces of indentation, followed by a sequence
of three or more matching `-`, `_`, or `*` characters, each followed of three or more matching `-`, `_`, or `*` characters, each followed
optionally by any number of spaces, forms a optionally by any number of spaces or tabs, forms a
[thematic break](@). [thematic break](@).
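A minimal sketch of this line shape as a Python regular expression, following the right-hand (0.29) wording that allows spaces or tabs after each marker. The name `is_thematic_break` is hypothetical, and the check looks at a single line only: it does not decide precedence against setext heading underlines or list items, which depends on context.

```python
import re

# 0-3 spaces of indentation, then three or more of the same marker
# character, each optionally followed by spaces or tabs, and nothing else.
THEMATIC_BREAK = re.compile(r"^ {0,3}([-_*])[ \t]*(?:\1[ \t]*){2,}$")


def is_thematic_break(line: str) -> bool:
    """Return True if the line has the shape of a thematic break."""
    return THEMATIC_BREAK.match(line.rstrip("\r\n")) is not None
```

The backreference `\1` enforces that all marker characters match, so a mixed line such as `*-*` is rejected.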
```````````````````````````````` example ```````````````````````````````` example
*** ***
--- ---
___ ___
. .
<hr /> <hr />
<hr /> <hr />
<hr /> <hr />
skipping to change at line 801 skipping to change at line 803
```````````````````````````````` ````````````````````````````````
Contents are parsed as inlines: Contents are parsed as inlines:
```````````````````````````````` example ```````````````````````````````` example
# foo *bar* \*baz\* # foo *bar* \*baz\*
. .
<h1>foo <em>bar</em> *baz*</h1> <h1>foo <em>bar</em> *baz*</h1>
```````````````````````````````` ````````````````````````````````
Leading and trailing blanks are ignored in parsing inline content: Leading and trailing [whitespace] is ignored in parsing inline content:
```````````````````````````````` example ```````````````````````````````` example
# foo # foo
. .
<h1>foo</h1> <h1>foo</h1>
```````````````````````````````` ````````````````````````````````
One to three spaces indentation are allowed: One to three spaces indentation are allowed:
```````````````````````````````` example ```````````````````````````````` example
skipping to change at line 986 skipping to change at line 988
```````````````````````````````` example ```````````````````````````````` example
Foo *bar Foo *bar
baz* baz*
==== ====
. .
<h1>Foo <em>bar <h1>Foo <em>bar
baz</em></h1> baz</em></h1>
```````````````````````````````` ````````````````````````````````
The contents are the result of parsing the heading's raw
content as inlines. The heading's raw content is formed by
concatenating the lines and removing initial and final
[whitespace].
```````````````````````````````` example
Foo *bar
baz*→
====
.
<h1>Foo <em>bar
baz</em></h1>
````````````````````````````````
The underlining can be any length: The underlining can be any length:
```````````````````````````````` example ```````````````````````````````` example
Foo Foo
------------------------- -------------------------
Foo Foo
= =
. .
<h2>Foo</h2> <h2>Foo</h2>
skipping to change at line 1501 skipping to change at line 1517
## Fenced code blocks ## Fenced code blocks
A [code fence](@) is a sequence A [code fence](@) is a sequence
of at least three consecutive backtick characters (`` ` ``) or of at least three consecutive backtick characters (`` ` ``) or
tildes (`~`). (Tildes and backticks cannot be mixed.) tildes (`~`). (Tildes and backticks cannot be mixed.)
A [fenced code block](@) A [fenced code block](@)
begins with a code fence, indented no more than three spaces. begins with a code fence, indented no more than three spaces.
The line with the opening code fence may optionally contain some text The line with the opening code fence may optionally contain some text
following the code fence; this is trimmed of leading and trailing following the code fence; this is trimmed of leading and trailing
spaces and called the [info string](@). whitespace and called the [info string](@). If the [info string] comes
The [info string] may not contain any backtick after a backtick fence, it may not contain any backtick
characters. (The reason for this restriction is that otherwise characters. (The reason for this restriction is that otherwise
some inline code would be incorrectly interpreted as the some inline code would be incorrectly interpreted as the
beginning of a fenced code block.) beginning of a fenced code block.)
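The opening-fence rules can be sketched as follows, assuming the right-hand (0.29) behaviour in which only backtick fences forbid backticks in the info string. `OpeningFence` and `parse_opening_fence` are hypothetical names, and the whitespace set is the spec's list of whitespace characters (space, tab, newline, line tabulation, form feed, carriage return).

```python
import re
from typing import NamedTuple, Optional

# Up to three spaces, then a run of at least three backticks or tildes.
OPENING_FENCE = re.compile(r"^( {0,3})(`{3,}|~{3,})(.*)$")

# Whitespace characters trimmed from the info string.
WHITESPACE = " \t\n\x0b\x0c\r"


class OpeningFence(NamedTuple):
    indent: int   # spaces before the fence (0-3)
    char: str     # "`" or "~"
    length: int   # number of fence characters
    info: str     # trimmed info string, possibly empty


def parse_opening_fence(line: str) -> Optional[OpeningFence]:
    """Return the parsed opening fence, or None if the line is not one."""
    m = OPENING_FENCE.match(line.rstrip("\r\n"))
    if m is None:
        return None
    indent, fence, rest = m.groups()
    info = rest.strip(WHITESPACE)
    # After a backtick fence, the info string may not contain a backtick.
    if fence[0] == "`" and "`" in info:
        return None
    return OpeningFence(len(indent), fence[0], len(fence), info)
```

A closing fence would then need the same `char` and at least `length` fence characters, as described in the next paragraph.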
The content of the code block consists of all subsequent lines, until The content of the code block consists of all subsequent lines, until
a closing [code fence] of the same type as the code block a closing [code fence] of the same type as the code block
began with (backticks or tildes), and with at least as many backticks began with (backticks or tildes), and with at least as many backticks
or tildes as the opening code fence. If the leading code fence is or tildes as the opening code fence. If the leading code fence is
indented N spaces, then up to N spaces of indentation are removed from indented N spaces, then up to N spaces of indentation are removed from
each line of the content (if present). (If a content line is not each line of the content (if present). (If a content line is not
skipping to change at line 1768 skipping to change at line 1784
``` ```
</code></pre> </code></pre>
```````````````````````````````` ````````````````````````````````
Code fences (opening and closing) cannot contain internal spaces: Code fences (opening and closing) cannot contain internal spaces:
```````````````````````````````` example ```````````````````````````````` example
``` ``` ``` ```
aaa aaa
. .
<p><code></code> <p><code> </code>
aaa</p> aaa</p>
```````````````````````````````` ````````````````````````````````
```````````````````````````````` example ```````````````````````````````` example
~~~~~~ ~~~~~~
aaa aaa
~~~ ~~ ~~~ ~~
. .
<pre><code>aaa <pre><code>aaa
~~~ ~~ ~~~ ~~
skipping to change at line 1816 skipping to change at line 1832
~~~ ~~~
# baz # baz
. .
<h2>foo</h2> <h2>foo</h2>
<pre><code>bar <pre><code>bar
</code></pre> </code></pre>
<h1>baz</h1> <h1>baz</h1>
```````````````````````````````` ````````````````````````````````
An [info string] can be provided after the opening code fence. An [info string] can be provided after the opening code fence.
Opening and closing spaces will be stripped, and the first word, prefixed Although this spec doesn't mandate any particular treatment of
with `language-`, is used as the value for the `class` attribute of the the info string, the first word is typically used to specify
`code` element within the enclosing `pre` element. the language of the code block. In HTML output, the language is
normally indicated by adding a class to the `code` element consisting
of `language-` followed by the language name.
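One way a renderer might apply this convention is sketched below. The helper `code_block_open_tag` is hypothetical, and the sketch only escapes the language name for HTML; it does not process backslash escapes or entity references in the info string.

```python
import html


def code_block_open_tag(info: str) -> str:
    """Build the opening tags for a fenced code block from its info string."""
    words = info.split()
    if not words:
        # No info string: no class attribute is added.
        return "<pre><code>"
    # Only the first word is used as the language name.
    lang = html.escape(words[0], quote=True)
    return f'<pre><code class="language-{lang}">'
```

For example, the info string `ruby` gives `<pre><code class="language-ruby">`, and an empty info string gives `<pre><code>`.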
```````````````````````````````` example ```````````````````````````````` example
```ruby ```ruby
def foo(x) def foo(x)
return 3 return 3
end end
``` ```
. .
<pre><code class="language-ruby">def foo(x) <pre><code class="language-ruby">def foo(x)
return 3 return 3
skipping to change at line 1863 skipping to change at line 1881
[Info strings] for backtick code blocks cannot contain backticks: [Info strings] for backtick code blocks cannot contain backticks:
```````````````````````````````` example ```````````````````````````````` example
``` aa ``` ``` aa ```
foo foo
. .
<p><code>aa</code> <p><code>aa</code>
foo</p> foo</p>
```````````````````````````````` ````````````````````````````````
[Info strings] for tilde code blocks can contain backticks and tildes:
```````````````````````````````` example
~~~ aa ``` ~~~
foo
~~~
.
<pre><code class="language-aa">foo
</code></pre>
````````````````````````````````
Closing code fences cannot have [info strings]: Closing code fences cannot have [info strings]:
```````````````````````````````` example ```````````````````````````````` example
``` ```
``` aaa ``` aaa
``` ```
. .
<pre><code>``` aaa <pre><code>``` aaa
</code></pre> </code></pre>
```````````````````````````````` ````````````````````````````````
## HTML blocks ## HTML blocks
An [HTML block](@) is a group of lines that is treated An [HTML block](@) is a group of lines that is treated
as raw HTML (and will not be escaped in HTML output). as raw HTML (and will not be escaped in HTML output).
There are seven kinds of [HTML block], which can be defined There are seven kinds of [HTML block], which can be defined by their
by their start and end conditions. The block begins with a line that start and end conditions. The block begins with a line that meets a
meets a [start condition](@) (after up to three spaces [start condition](@) (after up to three spaces optional indentation).
optional indentation). It ends with the first subsequent line that It ends with the first subsequent line that meets a matching [end
meets a matching [end condition](@), or the last line of condition](@), or the last line of the document, or the last line of
the document or other [container block]), if no line is encountered that meets the the [container block](#container-blocks) containing the current HTML
block, if no line is encountered that meets the [end condition]. If
[end condition]. If the first line meets both the [start condition] the first line meets both the [start condition] and the [end
and the [end condition], the block will contain just that line. condition], the block will contain just that line.
1. **Start condition:** line begins with the string `<script`, 1. **Start condition:** line begins with the string `<script`,
`<pre`, or `<style` (case-insensitive), followed by whitespace, `<pre`, or `<style` (case-insensitive), followed by whitespace,
the string `>`, or the end of the line.\ the string `>`, or the end of the line.\
**End condition:** line contains an end tag **End condition:** line contains an end tag
`</script>`, `</pre>`, or `</style>` (case-insensitive; it `</script>`, `</pre>`, or `</style>` (case-insensitive; it
need not match the start tag). need not match the start tag).
2. **Start condition:** line begins with the string `<!--`.\ 2. **Start condition:** line begins with the string `<!--`.\
**End condition:** line contains the string `-->`. **End condition:** line contains the string `-->`.
skipping to change at line 1917 skipping to change at line 1947
**End condition:** line contains the string `]]>`. **End condition:** line contains the string `]]>`.
6. **Start condition:** line begins with the string `<` or `</` 6. **Start condition:** line begins with the string `<` or `</`
followed by one of the strings (case-insensitive) `address`, followed by one of the strings (case-insensitive) `address`,
`article`, `aside`, `base`, `basefont`, `blockquote`, `body`, `article`, `aside`, `base`, `basefont`, `blockquote`, `body`,
`caption`, `center`, `col`, `colgroup`, `dd`, `details`, `dialog`, `caption`, `center`, `col`, `colgroup`, `dd`, `details`, `dialog`,
`dir`, `div`, `dl`, `dt`, `fieldset`, `figcaption`, `figure`, `dir`, `div`, `dl`, `dt`, `fieldset`, `figcaption`, `figure`,
`footer`, `form`, `frame`, `frameset`, `footer`, `form`, `frame`, `frameset`,
`h1`, `h2`, `h3`, `h4`, `h5`, `h6`, `head`, `header`, `hr`, `h1`, `h2`, `h3`, `h4`, `h5`, `h6`, `head`, `header`, `hr`,
`html`, `iframe`, `legend`, `li`, `link`, `main`, `menu`, `menuitem`, `html`, `iframe`, `legend`, `li`, `link`, `main`, `menu`, `menuitem`,
`meta`, `nav`, `noframes`, `ol`, `optgroup`, `option`, `p`, `param`, `nav`, `noframes`, `ol`, `optgroup`, `option`, `p`, `param`,
`section`, `source`, `summary`, `table`, `tbody`, `td`, `section`, `source`, `summary`, `table`, `tbody`, `td`,
`tfoot`, `th`, `thead`, `title`, `tr`, `track`, `ul`, followed `tfoot`, `th`, `thead`, `title`, `tr`, `track`, `ul`, followed
by [whitespace], the end of the line, the string `>`, or by [whitespace], the end of the line, the string `>`, or
the string `/>`.\ the string `/>`.\
**End condition:** line is followed by a [blank line]. **End condition:** line is followed by a [blank line].
7. **Start condition:** line begins with a complete [open tag] 7. **Start condition:** line begins with a complete [open tag]
or [closing tag] (with any [tag name] other than `script`, (with any [tag name] other than `script`,
`style`, or `pre`) followed only by [whitespace] `style`, or `pre`) or a complete [closing tag],
or the end of the line.\ followed only by [whitespace] or the end of the line.\
**End condition:** line is followed by a [blank line]. **End condition:** line is followed by a [blank line].
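A partial sketch of start-condition detection, covering only conditions 1, 2, and 6 as they appear above. The function name `html_block_start_type` is hypothetical; the tag list follows the right-hand (0.29) column, which no longer includes `meta`. Conditions 3 to 5 and 7 are deliberately left out, so a `None` result here does not mean the line starts no HTML block at all.

```python
import re
from typing import Optional

# Tag names for start condition 6 (0.29 column, without `meta`).
BLOCK_TAGS = (
    "address article aside base basefont blockquote body caption center col "
    "colgroup dd details dialog dir div dl dt fieldset figcaption figure "
    "footer form frame frameset h1 h2 h3 h4 h5 h6 head header hr html iframe "
    "legend li link main menu menuitem nav noframes ol optgroup option p "
    "param section source summary table tbody td tfoot th thead title tr "
    "track ul"
).split()

# (condition number, pattern matched at the start of the unindented line)
START_CONDITIONS = (
    (1, re.compile(r"<(?:script|pre|style)(?:\s|>|$)", re.IGNORECASE)),
    (2, re.compile(r"<!--")),
    (6, re.compile(r"</?(?:%s)(?:\s|/?>|$)" % "|".join(BLOCK_TAGS),
                   re.IGNORECASE)),
)


def html_block_start_type(line: str) -> Optional[int]:
    """Return 1, 2, or 6 if the line meets that start condition, else None."""
    stripped = line.rstrip("\r\n")
    body = stripped.lstrip(" ")
    if len(stripped) - len(body) > 3:
        return None  # four or more spaces of indentation
    for kind, pattern in START_CONDITIONS:
        if pattern.match(body):
            return kind
    return None
```

Once a block has started, only the end condition with the same number is checked, which is why a `<pre>` inside a block opened by `<table>` has no effect on where the block ends.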
HTML blocks continue until they are closed by their appropriate HTML blocks continue until they are closed by their appropriate
[end condition], or the last line of the document or other [container block]. [end condition], or the last line of the document or other [container
This means any HTML **within an HTML block** that might otherwise be recognised block](#container-blocks). This means any HTML **within an HTML
as a start condition will be ignored by the parser and passed through as-is, block** that might otherwise be recognised as a start condition will
without changing the parser's state. be ignored by the parser and passed through as-is, without changing
the parser's state.
For instance, `<pre>` within an HTML block started by `<table>` will not affect For instance, `<pre>` within an HTML block started by `<table>` will not affect
the parser state; as the HTML block was started by start condition 6, it the parser state; as the HTML block was started by start condition 6, it
will end at any blank line. This can be surprising: will end at any blank line. This can be surprising:
```````````````````````````````` example ```````````````````````````````` example
<table><tr><td> <table><tr><td>
<pre> <pre>
**Hello**, **Hello**,
skipping to change at line 1957 skipping to change at line 1988
</td></tr></table> </td></tr></table>
. .
<table><tr><td> <table><tr><td>
<pre> <pre>
**Hello**, **Hello**,
<p><em>world</em>. <p><em>world</em>.
</pre></p> </pre></p>
</td></tr></table> </td></tr></table>
```````````````````````````````` ````````````````````````````````
In this case, the HTML block is terminated by the newline — the `**hello**` In this case, the HTML block is terminated by the newline — the `**Hello**`
text remains verbatim — and regular parsing resumes, with a paragraph, text remains verbatim — and regular parsing resumes, with a paragraph,
emphasised `world` and inline and block HTML following. emphasised `world` and inline and block HTML following.
All types of [HTML blocks] except type 7 may interrupt All types of [HTML blocks] except type 7 may interrupt
a paragraph. Blocks of type 7 may not interrupt a paragraph. a paragraph. Blocks of type 7 may not interrupt a paragraph.
(This restriction is intended to prevent unwanted interpretation (This restriction is intended to prevent unwanted interpretation
of long tags inside a wrapped paragraph as starting HTML blocks.) of long tags inside a wrapped paragraph as starting HTML blocks.)
Some simple examples follow. Here are some basic HTML blocks Some simple examples follow. Here are some basic HTML blocks
of type 6: of type 6:
skipping to change at line 2462 skipping to change at line 2493
bar bar
</div> </div>
. .
<p>Foo</p> <p>Foo</p>
<div> <div>
bar bar
</div> </div>
```````````````````````````````` ````````````````````````````````
However, a following blank line is needed, except at the end of However, a following blank line is needed, except at the end of
a document, and except for blocks of types 1--5, above: a document, and except for blocks of types 1--5, [above][HTML
block]:
```````````````````````````````` example ```````````````````````````````` example
<div> <div>
bar bar
</div> </div>
*foo* *foo*
. .
<div> <div>
bar bar
</div> </div>
skipping to change at line 2602 skipping to change at line 2634
<pre><code>&lt;td&gt; <pre><code>&lt;td&gt;
Hi Hi
&lt;/td&gt; &lt;/td&gt;
</code></pre> </code></pre>
</tr> </tr>
</table> </table>
```````````````````````````````` ````````````````````````````````
Fortunately, blank lines are usually not necessary and can be Fortunately, blank lines are usually not necessary and can be
deleted. The exception is inside `<pre>` tags, but as described deleted. The exception is inside `<pre>` tags, but as described
above, raw HTML blocks starting with `<pre>` *can* contain blank [above][HTML blocks], raw HTML blocks starting with `<pre>`
lines. *can* contain blank lines.
## Link reference definitions ## Link reference definitions
A [link reference definition](@) A [link reference definition](@)
consists of a [link label], indented up to three spaces, followed consists of a [link label], indented up to three spaces, followed
by a colon (`:`), optional [whitespace] (including up to one by a colon (`:`), optional [whitespace] (including up to one
[line ending]), a [link destination], [line ending]), a [link destination],
optional [whitespace] (including up to one optional [whitespace] (including up to one
[line ending]), and an optional [link [line ending]), and an optional [link
title], which if it is present must be separated title], which if it is present must be separated
skipping to change at line 2652 skipping to change at line 2684
```````````````````````````````` example ```````````````````````````````` example
[Foo*bar\]]:my_(url) 'title (with parens)' [Foo*bar\]]:my_(url) 'title (with parens)'
[Foo*bar\]] [Foo*bar\]]
. .
<p><a href="my_(url)" title="title (with parens)">Foo*bar]</a></p> <p><a href="my_(url)" title="title (with parens)">Foo*bar]</a></p>
```````````````````````````````` ````````````````````````````````
```````````````````````````````` example ```````````````````````````````` example
[Foo bar]: [Foo bar]:
<my%20url> <my url>
'title' 'title'
[Foo bar] [Foo bar]
. .
<p><a href="my%20url" title="title">Foo bar</a></p> <p><a href="my%20url" title="title">Foo bar</a></p>
```````````````````````````````` ````````````````````````````````
The title may extend over multiple lines: The title may extend over multiple lines:
```````````````````````````````` example ```````````````````````````````` example
skipping to change at line 2714 skipping to change at line 2746
```````````````````````````````` example ```````````````````````````````` example
[foo]: [foo]:
[foo] [foo]
. .
<p>[foo]:</p> <p>[foo]:</p>
<p>[foo]</p> <p>[foo]</p>
```````````````````````````````` ````````````````````````````````
However, an empty link destination may be specified using
angle brackets:
```````````````````````````````` example
[foo]: <>
[foo]
.
<p><a href="">foo</a></p>
````````````````````````````````
The title must be separated from the link destination by
whitespace:
```````````````````````````````` example
[foo]: <bar>(baz)
[foo]
.
<p>[foo]: <bar>(baz)</p>
<p>[foo]</p>
````````````````````````````````
Both title and destination can contain backslash escapes Both title and destination can contain backslash escapes
and literal backslashes: and literal backslashes:
```````````````````````````````` example ```````````````````````````````` example
[foo]: /url\bar\*baz "foo\"bar\baz" [foo]: /url\bar\*baz "foo\"bar\baz"
[foo] [foo]
. .
<p><a href="/url%5Cbar*baz" title="foo&quot;bar\baz">foo</a></p> <p><a href="/url%5Cbar*baz" title="foo&quot;bar\baz">foo</a></p>
```````````````````````````````` ````````````````````````````````
skipping to change at line 2858 skipping to change at line 2913
# [Foo] # [Foo]
[foo]: /url [foo]: /url
> bar > bar
. .
<h1><a href="/url">Foo</a></h1> <h1><a href="/url">Foo</a></h1>
<blockquote> <blockquote>
<p>bar</p> <p>bar</p>
</blockquote> </blockquote>
```````````````````````````````` ````````````````````````````````
```````````````````````````````` example
[foo]: /url
bar
===
[foo]
.
<h1>bar</h1>
<p><a href="/url">foo</a></p>
````````````````````````````````
```````````````````````````````` example
[foo]: /url
===
[foo]
.
<p>===
<a href="/url">foo</a></p>
````````````````````````````````
Several [link reference definitions] Several [link reference definitions]
can occur one after another, without intervening blank lines. can occur one after another, without intervening blank lines.
```````````````````````````````` example ```````````````````````````````` example
[foo]: /foo-url "foo" [foo]: /foo-url "foo"
[bar]: /bar-url [bar]: /bar-url
"bar" "bar"
[baz]: /baz-url [baz]: /baz-url
[foo], [foo],
skipping to change at line 2891 skipping to change at line 2965
```````````````````````````````` example ```````````````````````````````` example
[foo] [foo]
> [foo]: /url > [foo]: /url
. .
<p><a href="/url">foo</a></p> <p><a href="/url">foo</a></p>
<blockquote> <blockquote>
</blockquote> </blockquote>
```````````````````````````````` ````````````````````````````````
Whether something is a [link reference definition] is
independent of whether the link reference it defines is
used in the document. Thus, for example, the following
document contains just a link reference definition, and
no visible content:
```````````````````````````````` example
[foo]: /url
.
````````````````````````````````
## Paragraphs ## Paragraphs
A sequence of non-blank lines that cannot be interpreted as other A sequence of non-blank lines that cannot be interpreted as other
kinds of blocks forms a [paragraph](@). kinds of blocks forms a [paragraph](@).
The contents of the paragraph are the result of parsing the The contents of the paragraph are the result of parsing the
paragraph's raw content as inlines. The paragraph's raw content paragraph's raw content as inlines. The paragraph's raw content
is formed by concatenating the lines and removing initial and final is formed by concatenating the lines and removing initial and final
[whitespace]. [whitespace].
A simple example with two paragraphs: A simple example with two paragraphs:
skipping to change at line 3013 skipping to change at line 3098
# aaa # aaa
. .
<p>aaa</p> <p>aaa</p>
<h1>aaa</h1> <h1>aaa</h1>
```````````````````````````````` ````````````````````````````````
# Container blocks # Container blocks
A [container block] is a block that has other A [container block](#container-blocks) is a block that has other
blocks as its contents. There are two basic kinds of container blocks: blocks as its contents. There are two basic kinds of container blocks:
[block quotes] and [list items]. [block quotes] and [list items].
[Lists] are meta-containers for [list items]. [Lists] are meta-containers for [list items].
We define the syntax for container blocks recursively. The general We define the syntax for container blocks recursively. The general
form of the definition is: form of the definition is:
> If X is a sequence of blocks, then the result of > If X is a sequence of blocks, then the result of
> transforming X in such-and-such a way is a container of type Y > transforming X in such-and-such a way is a container of type Y
> with these blocks as its content. > with these blocks as its content.
skipping to change at line 3449 skipping to change at line 3534
An [ordered list marker](@) An [ordered list marker](@)
is a sequence of 1--9 arabic digits (`0-9`), followed by either a is a sequence of 1--9 arabic digits (`0-9`), followed by either a
`.` character or a `)` character. (The reason for the length `.` character or a `)` character. (The reason for the length
limit is that with 10 digits we start seeing integer overflows limit is that with 10 digits we start seeing integer overflows
in some browsers.) in some browsers.)
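The marker definition can be sketched as a small Python parser. `parse_ordered_list_marker` is a hypothetical name, and the lookahead requiring a space, a tab, or the end of the line after the delimiter is an assumption taken from the list item rules rather than from the marker definition itself.

```python
import re
from typing import Optional, Tuple

# 1-9 ASCII digits, then "." or ")", then a space, a tab, or end of line.
ORDERED_MARKER = re.compile(r"([0-9]{1,9})([.)])(?=[ \t]|$)")


def parse_ordered_list_marker(text: str) -> Optional[Tuple[int, str]]:
    """Return (start number, delimiter) if text begins with a marker."""
    m = ORDERED_MARKER.match(text)
    if m is None:
        return None
    return int(m.group(1)), m.group(2)
```

Using `[0-9]` rather than `\d` keeps the match to ASCII digits, and the nine-digit cap means the start number always fits in a 32-bit signed integer.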
The following rules define [list items]: The following rules define [list items]:
1. **Basic case.** If a sequence of lines *Ls* constitute a sequence of 1. **Basic case.** If a sequence of lines *Ls* constitute a sequence of
blocks *Bs* starting with a [non-whitespace character] and not separated blocks *Bs* starting with a [non-whitespace character], and *M* is a
from each other by more than one blank line, and *M* is a list list marker of width *W* followed by 1 ≤ *N* ≤ 4 spaces, then the result
marker of width *W* followed by 1 ≤ *N* ≤ 4 spaces, then the result
of prepending *M* and the following spaces to the first line of of prepending *M* and the following spaces to the first line of
*Ls*, and indenting subsequent lines of *Ls* by *W + N* spaces, is a *Ls*, and indenting subsequent lines of *Ls* by *W + N* spaces, is a
list item with *Bs* as its contents. The type of the list item list item with *Bs* as its contents. The type of the list item
(bullet or ordered) is determined by the type of its list marker. (bullet or ordered) is determined by the type of its list marker.
If the list item is ordered, then it is also assigned a start If the list item is ordered, then it is also assigned a start
number, based on the ordered list marker. number, based on the ordered list marker.
Exceptions: Exceptions:
1. When the first list item in a [list] interrupts 1. When the first list item in a [list] interrupts
skipping to change at line 3741 skipping to change at line 3825
A start number may not be negative: A start number may not be negative:
```````````````````````````````` example ```````````````````````````````` example
-1. not ok -1. not ok
. .
<p>-1. not ok</p> <p>-1. not ok</p>
```````````````````````````````` ````````````````````````````````
2. **Item starting with indented code.** If a sequence of lines *Ls* 2. **Item starting with indented code.** If a sequence of lines *Ls*
constitute a sequence of blocks *Bs* starting with an indented code constitute a sequence of blocks *Bs* starting with an indented code
block and not separated from each other by more than one blank line, block, and *M* is a list marker of width *W* followed by
and *M* is a list marker of width *W* followed by
one space, then the result of prepending *M* and the following one space, then the result of prepending *M* and the following
space to the first line of *Ls*, and indenting subsequent lines of space to the first line of *Ls*, and indenting subsequent lines of
*Ls* by *W + 1* spaces, is a list item with *Bs* as its contents. *Ls* by *W + 1* spaces, is a list item with *Bs* as its contents.
If a line is empty, then it need not be indented. The type of the If a line is empty, then it need not be indented. The type of the
list item (bullet or ordered) is determined by the type of its list list item (bullet or ordered) is determined by the type of its list
marker. If the list item is ordered, then it is also assigned a marker. If the list item is ordered, then it is also assigned a
start number, based on the ordered list marker. start number, based on the ordered list marker.
An indented code block will have to be indented four spaces beyond An indented code block will have to be indented four spaces beyond
the edge of the region where text will be included in the list item. the edge of the region where text will be included in the list item.
skipping to change at line 4194 skipping to change at line 4277
continued here.</p> continued here.</p>
</blockquote> </blockquote>
</li> </li>
</ol> </ol>
</blockquote> </blockquote>
```````````````````````````````` ````````````````````````````````
6. **That's all.** Nothing that is not counted as a list item by rules 6. **That's all.** Nothing that is not counted as a list item by rules
#1--5 counts as a [list item](#list-items). #1--5 counts as a [list item](#list-items).
The rules for sublists follow from the general rules above. A sublist The rules for sublists follow from the general rules
must be indented the same number of spaces a paragraph would need to be [above][List items]. A sublist must be indented the same number
in order to be included in the list item. of spaces a paragraph would need to be in order to be included
in the list item.
So, in this case we need two spaces indent: So, in this case we need two spaces indent:
```````````````````````````````` example ```````````````````````````````` example
- foo - foo
- bar - bar
- baz - baz
- boo - boo
. .
<ul> <ul>
skipping to change at line 4771 skipping to change at line 4855
List items need not be indented to the same level. The following List items need not be indented to the same level. The following
list items will be treated as items at the same list level, list items will be treated as items at the same list level,
since none is indented enough to belong to the previous list since none is indented enough to belong to the previous list
item: item:
```````````````````````````````` example ```````````````````````````````` example
- a - a
- b - b
- c - c
- d - d
- e - e
- f - f
- g - g
- h
- i
. .
<ul> <ul>
<li>a</li> <li>a</li>
<li>b</li> <li>b</li>
<li>c</li> <li>c</li>
<li>d</li> <li>d</li>
<li>e</li> <li>e</li>
<li>f</li> <li>f</li>
<li>g</li> <li>g</li>
<li>h</li>
<li>i</li>
</ul> </ul>
```````````````````````````````` ````````````````````````````````
```````````````````````````````` example ```````````````````````````````` example
1. a 1. a
2. b 2. b
3. c 3. c
. .
<ol> <ol>
<li> <li>
<p>a</p> <p>a</p>
</li> </li>
<li> <li>
<p>b</p> <p>b</p>
</li> </li>
<li> <li>
<p>c</p> <p>c</p>
</li> </li>
</ol> </ol>
```````````````````````````````` ````````````````````````````````
Note, however, that list items may not be indented more than
three spaces. Here `- e` is treated as a paragraph continuation
line, because it is indented more than three spaces:
```````````````````````````````` example
- a
- b
- c
- d
- e
.
<ul>
<li>a</li>
<li>b</li>
<li>c</li>
<li>d
- e</li>
</ul>
````````````````````````````````
And here, `3. c` is treated as an indented code block,
because it is indented four spaces and preceded by a
blank line.
```````````````````````````````` example
1. a
2. b
3. c
.
<ol>
<li>
<p>a</p>
</li>
<li>
<p>b</p>
</li>
</ol>
<pre><code>3. c
</code></pre>
````````````````````````````````
This is a loose list, because there is a blank line between This is a loose list, because there is a blank line between
two of the list items: two of the list items:
```````````````````````````````` example ```````````````````````````````` example
- a - a
- b - b
- c - c
. .
<ul> <ul>
skipping to change at line 5117 skipping to change at line 5240
```````````````````````````````` example ```````````````````````````````` example
\*not emphasized* \*not emphasized*
\<br/> not a tag \<br/> not a tag
\[not a link](/foo) \[not a link](/foo)
\`not code` \`not code`
1\. not a list 1\. not a list
\* not a list \* not a list
\# not a heading \# not a heading
\[foo]: /url "not a reference" \[foo]: /url "not a reference"
\&ouml; not a character entity
. .
<p>*not emphasized* <p>*not emphasized*
&lt;br/&gt; not a tag &lt;br/&gt; not a tag
[not a link](/foo) [not a link](/foo)
`not code` `not code`
1. not a list 1. not a list
* not a list * not a list
# not a heading # not a heading
[foo]: /url &quot;not a reference&quot;</p> [foo]: /url &quot;not a reference&quot;
&amp;ouml; not a character entity</p>
```````````````````````````````` ````````````````````````````````
If a backslash is itself escaped, the following character is not: If a backslash is itself escaped, the following character is not:
```````````````````````````````` example ```````````````````````````````` example
\\*emphasis* \\*emphasis*
. .
<p>\<em>emphasis</em></p> <p>\<em>emphasis</em></p>
```````````````````````````````` ````````````````````````````````
skipping to change at line 5211 skipping to change at line 5336
``` foo\+bar ``` foo\+bar
foo foo
``` ```
. .
<pre><code class="language-foo+bar">foo <pre><code class="language-foo+bar">foo
</code></pre> </code></pre>
```````````````````````````````` ````````````````````````````````
## Entity and numeric character references ## Entity and numeric character references
All valid HTML entity references and numeric character Valid HTML entity references and numeric character references
references, except those occurring in code blocks and code spans, can be used in place of the corresponding Unicode character,
are recognized as such and treated as equivalent to the with the following exceptions:
corresponding Unicode characters. Conforming CommonMark parsers
need not store information about whether a particular character - Entity and character references are not recognized in code
was represented in the source using a Unicode character or blocks and code spans.
an entity reference.
- Entity and character references cannot stand in place of
special characters that define structural elements in
CommonMark. For example, although `&#42;` can be used
in place of a literal `*` character, `&#42;` cannot replace
`*` in emphasis delimiters, bullet list markers, or thematic
breaks.
Conforming CommonMark parsers need not store information about
whether a particular character was represented in the source
using a Unicode character or an entity reference.
[Entity references](@) consist of `&` + any of the valid [Entity references](@) consist of `&` + any of the valid
HTML5 entity names + `;`. The HTML5 entity names + `;`. The
document <https://html.spec.whatwg.org/multipage/entities.json> document <https://html.spec.whatwg.org/multipage/entities.json>
is used as an authoritative source for the valid entity is used as an authoritative source for the valid entity
references and their corresponding code points. references and their corresponding code points.
```````````````````````````````` example ```````````````````````````````` example
&nbsp; &amp; &copy; &AElig; &Dcaron; &nbsp; &amp; &copy; &AElig; &Dcaron;
&frac34; &HilbertSpace; &DifferentialD; &frac34; &HilbertSpace; &DifferentialD;
&ClockwiseContourIntegral; &ngE; &ClockwiseContourIntegral; &ngE;
. .
<p>  &amp; © Æ Ď <p>  &amp; © Æ Ď
¾ ℋ ⅆ ¾ ℋ ⅆ
∲ ≧̸</p> ∲ ≧̸</p>
```````````````````````````````` ````````````````````````````````
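The entity-reference rule can be sketched in a few lines of Python. This is only an illustration, not part of the spec: it assumes that Python's built-in `html.entities.html5` table agrees with the `entities.json` document named above, and the helper name `decode_entity_reference` is hypothetical.

```python
from html.entities import html5  # stdlib table of HTML5 named character references


def decode_entity_reference(text):
    """Return the character(s) for an entity reference, or None if not recognized.

    Hypothetical helper: `text` must be the whole candidate, e.g. "&copy;".
    The trailing semicolon is required, as the rule above states.
    """
    if len(text) < 3 or not text.startswith("&") or not text.endswith(";"):
        return None
    # The table keys keep the trailing ";" (e.g. "copy;"), so look up text[1:].
    return html5.get(text[1:])
```

A caller that gets `None` back would leave the text as literal characters, which is what happens to `&ThisIsNotDefined;` in the examples.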
[Decimal numeric character [Decimal numeric character
references](@) references](@)
consist of `&#` + a string of 1--8 arabic digits + `;`. A consist of `&#` + a string of 1--7 arabic digits + `;`. A
numeric character reference is parsed as the corresponding numeric character reference is parsed as the corresponding
Unicode character. Invalid Unicode code points will be replaced by Unicode character. Invalid Unicode code points will be replaced by
the REPLACEMENT CHARACTER (`U+FFFD`). For security reasons, the REPLACEMENT CHARACTER (`U+FFFD`). For security reasons,
the code point `U+0000` will also be replaced by `U+FFFD`. the code point `U+0000` will also be replaced by `U+FFFD`.
```````````````````````````````` example ```````````````````````````````` example
&#35; &#1234; &#992; &#98765432; &#0; &#35; &#1234; &#992; &#0;
. .
<p># Ӓ Ϡ �</p> <p># Ӓ Ϡ �</p>
```````````````````````````````` ````````````````````````````````
[Hexadecimal numeric character [Hexadecimal numeric character
references](@) consist of `&#` + references](@) consist of `&#` +
either `X` or `x` + a string of 1--8 hexadecimal digits + `;`. either `X` or `x` + a string of 1--6 hexadecimal digits + `;`.
They too are parsed as the corresponding Unicode character (this They too are parsed as the corresponding Unicode character (this
time specified with a hexadecimal numeral instead of decimal). time specified with a hexadecimal numeral instead of decimal).
```````````````````````````````` example ```````````````````````````````` example
&#X22; &#XD06; &#xcab; &#X22; &#XD06; &#xcab;
. .
<p>&quot; ആ ಫ</p> <p>&quot; ആ ಫ</p>
```````````````````````````````` ````````````````````````````````
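The decimal and hexadecimal rules can be sketched together as follows. The sketch uses the digit limits from the right-hand (newer) column, 1--7 decimal and 1--6 hexadecimal digits. Treating surrogates and values above `U+10FFFF` as the "invalid" code points is an assumption of this sketch, and the function name is hypothetical.

```python
import re

# Digit limits follow the newer column: 1-7 decimal digits, 1-6 hex digits.
_DECIMAL = re.compile(r"&#([0-9]{1,7});")
_HEXADECIMAL = re.compile(r"&#[Xx]([0-9A-Fa-f]{1,6});")


def decode_numeric_reference(text):
    """Return the character for a numeric character reference, or None.

    Hypothetical helper: `text` must be the whole candidate, e.g. "&#35;".
    """
    match = _DECIMAL.fullmatch(text)
    if match:
        code = int(match.group(1))
    else:
        match = _HEXADECIMAL.fullmatch(text)
        if not match:
            return None
        code = int(match.group(1), 16)
    # Assumption: "invalid" means zero, a surrogate, or beyond U+10FFFF.
    if code == 0 or code > 0x10FFFF or 0xD800 <= code <= 0xDFFF:
        return "\uFFFD"
    return chr(code)
```

Returning `None` signals that the input is not a reference at all, so the `&` would be rendered literally, as in the nonentity examples.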
Here are some nonentities: Here are some nonentities:
```````````````````````````````` example ```````````````````````````````` example
&nbsp &x; &#; &#x; &nbsp &x; &#; &#x;
&#987654321;
&#abcdef0;
&ThisIsNotDefined; &hi?; &ThisIsNotDefined; &hi?;
. .
<p>&amp;nbsp &amp;x; &amp;#; &amp;#x; <p>&amp;nbsp &amp;x; &amp;#; &amp;#x;
&amp;#987654321;
&amp;#abcdef0;
&amp;ThisIsNotDefined; &amp;hi?;</p> &amp;ThisIsNotDefined; &amp;hi?;</p>
```````````````````````````````` ````````````````````````````````
Although HTML5 does accept some entity references Although HTML5 does accept some entity references
without a trailing semicolon (such as `&copy`), these are not without a trailing semicolon (such as `&copy`), these are not
recognized here, because it makes the grammar too ambiguous: recognized here, because it makes the grammar too ambiguous:
```````````````````````````````` example ```````````````````````````````` example
&copy &copy
. .
skipping to change at line 5339 skipping to change at line 5478
<p><code>f&amp;ouml;&amp;ouml;</code></p> <p><code>f&amp;ouml;&amp;ouml;</code></p>
```````````````````````````````` ````````````````````````````````
```````````````````````````````` example ```````````````````````````````` example
f&ouml;f&ouml; f&ouml;f&ouml;
. .
<pre><code>f&amp;ouml;f&amp;ouml; <pre><code>f&amp;ouml;f&amp;ouml;
</code></pre> </code></pre>
```````````````````````````````` ````````````````````````````````
Entity and numeric character references cannot be used
in place of symbols indicating structure in CommonMark
documents.
```````````````````````````````` example
&#42;foo&#42;
*foo*
.
<p>*foo*
<em>foo</em></p>
````````````````````````````````
```````````````````````````````` example
&#42; foo
* foo
.
<p>* foo</p>
<ul>
<li>foo</li>
</ul>
````````````````````````````````
```````````````````````````````` example
foo&#10;&#10;bar
.
<p>foo
bar</p>
````````````````````````````````
```````````````````````````````` example
&#9;foo
.
<p>→foo</p>
````````````````````````````````
```````````````````````````````` example
[a](url &quot;tit&quot;)
.
<p>[a](url &quot;tit&quot;)</p>
````````````````````````````````
## Code spans ## Code spans
A [backtick string](@) A [backtick string](@)
is a string of one or more backtick characters (`` ` ``) that is neither is a string of one or more backtick characters (`` ` ``) that is neither
preceded nor followed by a backtick. preceded nor followed by a backtick.
A [code span](@) begins with a backtick string and ends with A [code span](@) begins with a backtick string and ends with
a backtick string of equal length. The contents of the code span are a backtick string of equal length. The contents of the code span are
the characters between the two backtick strings, with leading and the characters between the two backtick strings, normalized in the
trailing spaces and [line endings] removed, and following ways:
[whitespace] collapsed to single spaces.
- First, [line endings] are converted to [spaces].
- If the resulting string both begins *and* ends with a [space]
character, but does not consist entirely of [space]
characters, a single [space] character is removed from the
front and back. This allows you to include code that begins
or ends with backtick characters, which must be separated by
whitespace from the opening or closing backtick strings.
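The two normalization steps in the newer column can be sketched in Python. This is an illustration under the assumption that the input is the raw text between the two backtick strings; the function name `normalize_code_span` is hypothetical.

```python
def normalize_code_span(raw):
    """Normalize the raw contents of a code span (sketch of the newer rule)."""
    # Step 1: every line ending becomes a single space.
    text = raw.replace("\r\n", " ").replace("\r", " ").replace("\n", " ")
    # Step 2: strip exactly one space from each side, but only when the
    # string begins AND ends with a space and is not made up of spaces only.
    if (
        len(text) >= 2
        and text[0] == " "
        and text[-1] == " "
        and text.strip(" ") != ""
    ):
        text = text[1:-1]
    return text
```

Interior spaces are left untouched, and only the ASCII space character is stripped, so a non-breaking space at either end survives.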
This is a simple code span: This is a simple code span:
```````````````````````````````` example ```````````````````````````````` example
`foo` `foo`
. .
<p><code>foo</code></p> <p><code>foo</code></p>
```````````````````````````````` ````````````````````````````````
Here two backticks are used, because the code contains a backtick. Here two backticks are used, because the code contains a backtick.
This example also illustrates stripping of leading and trailing spaces: This example also illustrates stripping of a single leading and
trailing space:
```````````````````````````````` example ```````````````````````````````` example
`` foo ` bar `` `` foo ` bar ``
. .
<p><code>foo ` bar</code></p> <p><code>foo ` bar</code></p>
```````````````````````````````` ````````````````````````````````
This example shows the motivation for stripping leading and trailing This example shows the motivation for stripping leading and trailing
spaces: spaces:
```````````````````````````````` example ```````````````````````````````` example
` `` ` ` `` `
. .
<p><code>``</code></p> <p><code>``</code></p>
```````````````````````````````` ````````````````````````````````
[Line endings] are treated like spaces: Note that only *one* space is stripped:
```````````````````````````````` example ```````````````````````````````` example
`` ` `` `
foo
``
. .
<p><code>foo</code></p> <p><code> `` </code></p>
```````````````````````````````` ````````````````````````````````
Interior spaces and [line endings] are collapsed into The stripping only happens if the space is on both
single spaces, just as they would be by a browser: sides of the string:
```````````````````````````````` example ```````````````````````````````` example
`foo bar ` a`
baz`
. .
<p><code>foo bar baz</code></p> <p><code> a</code></p>
```````````````````````````````` ````````````````````````````````
Not all [Unicode whitespace] (for instance, non-breaking space) is Only [spaces], and not [unicode whitespace] in general, are
collapsed, however: stripped in this way:
```````````````````````````````` example ```````````````````````````````` example
`a  b` ` b `
. .
<p><code>a  b</code></p> <p><code> b </code></p>
```````````````````````````````` ````````````````````````````````
Q: Why not just leave the spaces, since browsers will collapse them No stripping occurs if the code span contains only spaces:
anyway? A: Because we might be targeting a non-HTML format, and we
shouldn't rely on HTML-specific rendering assumptions.
(Existing implementations differ in their treatment of internal ```````````````````````````````` example
spaces and [line endings]. Some, including `Markdown.pl` and ` `
`showdown`, convert an internal [line ending] into a ` `
`<br />` tag. But this makes things difficult for those who like to .
hard-wrap their paragraphs, since a line break in the midst of a code <p><code> </code>
span will cause an unintended line break in the output. Others just <code> </code></p>
leave internal spaces as they are, which is fine if only HTML is being ````````````````````````````````
targeted.)
[Line endings] are treated like spaces:
```````````````````````````````` example ```````````````````````````````` example
`foo `` bar` ``
foo
bar
baz
``
. .
<p><code>foo `` bar</code></p> <p><code>foo bar baz</code></p>
```````````````````````````````` ````````````````````````````````
```````````````````````````````` example
``
foo
``
.
<p><code>foo </code></p>
````````````````````````````````
Interior spaces are not collapsed:
```````````````````````````````` example
`foo bar
baz`
.
<p><code>foo bar baz</code></p>
````````````````````````````````
Note that browsers will typically collapse consecutive spaces
when rendering `<code>` elements, so it is recommended that
the following CSS be used:
code{white-space: pre-wrap;}
Note that backslash escapes do not work in code spans. All backslashes Note that backslash escapes do not work in code spans. All backslashes
are treated literally: are treated literally:
```````````````````````````````` example ```````````````````````````````` example
`foo\`bar` `foo\`bar`
. .
<p><code>foo\</code>bar`</p> <p><code>foo\</code>bar`</p>
```````````````````````````````` ````````````````````````````````
Backslash escapes are never needed, because one can always choose a Backslash escapes are never needed, because one can always choose a
string of *n* backtick characters as delimiters, where the code does string of *n* backtick characters as delimiters, where the code does
not contain any strings of exactly *n* backtick characters. not contain any strings of exactly *n* backtick characters.
```````````````````````````````` example
``foo`bar``
.
<p><code>foo`bar</code></p>
````````````````````````````````
```````````````````````````````` example
` foo `` bar `
.
<p><code>foo `` bar</code></p>
````````````````````````````````
Code span backticks have higher precedence than any other inline Code span backticks have higher precedence than any other inline
constructs except HTML tags and autolinks. Thus, for example, this is constructs except HTML tags and autolinks. Thus, for example, this is
not parsed as emphasized text, since the second `*` is part of a code not parsed as emphasized text, since the second `*` is part of a code
span: span:
```````````````````````````````` example ```````````````````````````````` example
*foo`*` *foo`*`
. .
<p>*foo<code>*</code></p> <p>*foo<code>*</code></p>
```````````````````````````````` ````````````````````````````````
skipping to change at line 5567 skipping to change at line 5792
The rules given below capture all of these patterns, while allowing The rules given below capture all of these patterns, while allowing
for efficient parsing strategies that do not backtrack. for efficient parsing strategies that do not backtrack.
First, some definitions. A [delimiter run](@) is either First, some definitions. A [delimiter run](@) is either
a sequence of one or more `*` characters that is not preceded or a sequence of one or more `*` characters that is not preceded or
followed by a non-backslash-escaped `*` character, or a sequence followed by a non-backslash-escaped `*` character, or a sequence
of one or more `_` characters that is not preceded or followed by of one or more `_` characters that is not preceded or followed by
a non-backslash-escaped `_` character. a non-backslash-escaped `_` character.
A [left-flanking delimiter run](@) is A [left-flanking delimiter run](@) is
a [delimiter run] that is (a) not followed by [Unicode whitespace], a [delimiter run] that is (1) not followed by [Unicode whitespace],
and (b) not followed by a [punctuation character], or and either (2a) not followed by a [punctuation character], or
(2b) followed by a [punctuation character] and
preceded by [Unicode whitespace] or a [punctuation character]. preceded by [Unicode whitespace] or a [punctuation character].
For purposes of this definition, the beginning and the end of For purposes of this definition, the beginning and the end of
the line count as Unicode whitespace. the line count as Unicode whitespace.
A [right-flanking delimiter run](@) is A [right-flanking delimiter run](@) is
a [delimiter run] that is (a) not preceded by [Unicode whitespace], a [delimiter run] that is (1) not preceded by [Unicode whitespace],
and (b) not preceded by a [punctuation character], or and either (2a) not preceded by a [punctuation character], or
(2b) preceded by a [punctuation character] and
followed by [Unicode whitespace] or a [punctuation character]. followed by [Unicode whitespace] or a [punctuation character].
For purposes of this definition, the beginning and the end of For purposes of this definition, the beginning and the end of
the line count as Unicode whitespace. the line count as Unicode whitespace.
Here are some examples of delimiter runs. Here are some examples of delimiter runs.
- left-flanking but not right-flanking: - left-flanking but not right-flanking:
``` ```
***abc ***abc
skipping to change at line 5667 skipping to change at line 5894
or (b) part of a [left-flanking delimiter run] or (b) part of a [left-flanking delimiter run]
followed by punctuation. followed by punctuation.
9. Emphasis begins with a delimiter that [can open emphasis] and ends 9. Emphasis begins with a delimiter that [can open emphasis] and ends
with a delimiter that [can close emphasis], and that uses the same with a delimiter that [can close emphasis], and that uses the same
character (`_` or `*`) as the opening delimiter. The character (`_` or `*`) as the opening delimiter. The
opening and closing delimiters must belong to separate opening and closing delimiters must belong to separate
[delimiter runs]. If one of the delimiters can both [delimiter runs]. If one of the delimiters can both
open and close emphasis, then the sum of the lengths of the open and close emphasis, then the sum of the lengths of the
delimiter runs containing the opening and closing delimiters delimiter runs containing the opening and closing delimiters
must not be a multiple of 3. must not be a multiple of 3 unless both lengths are
multiples of 3.
10. Strong emphasis begins with a delimiter that 10. Strong emphasis begins with a delimiter that
[can open strong emphasis] and ends with a delimiter that [can open strong emphasis] and ends with a delimiter that
[can close strong emphasis], and that uses the same character [can close strong emphasis], and that uses the same character
(`_` or `*`) as the opening delimiter. The (`_` or `*`) as the opening delimiter. The
opening and closing delimiters must belong to separate opening and closing delimiters must belong to separate
[delimiter runs]. If one of the delimiters can both open [delimiter runs]. If one of the delimiters can both open
and close strong emphasis, then the sum of the lengths of and close strong emphasis, then the sum of the lengths of
the delimiter runs containing the opening and closing the delimiter runs containing the opening and closing
delimiters must not be a multiple of 3. delimiters must not be a multiple of 3 unless both lengths
are multiples of 3.
11. A literal `*` character cannot occur at the beginning or end of 11. A literal `*` character cannot occur at the beginning or end of
`*`-delimited emphasis or `**`-delimited strong emphasis, unless it `*`-delimited emphasis or `**`-delimited strong emphasis, unless it
is backslash-escaped. is backslash-escaped.
12. A literal `_` character cannot occur at the beginning or end of 12. A literal `_` character cannot occur at the beginning or end of
`_`-delimited emphasis or `__`-delimited strong emphasis, unless it `_`-delimited emphasis or `__`-delimited strong emphasis, unless it
is backslash-escaped. is backslash-escaped.
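The multiple-of-3 condition in rules 9 and 10, in its newer wording, can be sketched as a small predicate. This is only an illustration: the function and parameter names are hypothetical, and the check covers the length condition alone, not the other requirements of those rules.

```python
def lengths_allow_match(opener_len, closer_len, opener_can_close, closer_can_open):
    """Length check from rules 9 and 10 (newer wording), as a sketch.

    opener_len / closer_len: lengths of the delimiter runs that contain
    the opening and closing delimiters.
    opener_can_close / closer_can_open: True when that delimiter can both
    open and close emphasis.
    """
    if opener_can_close or closer_can_open:
        if (opener_len + closer_len) % 3 == 0:
            # Allowed only when both run lengths are multiples of 3.
            return opener_len % 3 == 0 and closer_len % 3 == 0
    return True
```

With `*foo**bar*`, the leading `*` (length 1) and the interior `**` (length 2, able to open and close) sum to 3 without both being multiples of 3, so the predicate returns `False`. With `foo***bar***baz`, both runs have length 3, so it returns `True`.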
Where rules 1--12 above are compatible with multiple parsings, Where rules 1--12 above are compatible with multiple parsings,
skipping to change at line 6234 skipping to change at line 6463
Note that in the preceding case, the interpretation Note that in the preceding case, the interpretation
``` markdown ``` markdown
<p><em>foo</em><em>bar<em></em>baz</em></p> <p><em>foo</em><em>bar<em></em>baz</em></p>
``` ```
is precluded by the condition that a delimiter that is precluded by the condition that a delimiter that
can both open and close (like the `*` after `foo`) can both open and close (like the `*` after `foo`)
cannot form emphasis if the sum of the lengths of cannot form emphasis if the sum of the lengths of
the delimiter runs containing the opening and the delimiter runs containing the opening and
closing delimiters is a multiple of 3. closing delimiters is a multiple of 3 unless
both lengths are multiples of 3.
For the same reason, we don't get two consecutive
emphasis sections in this example:
```````````````````````````````` example
*foo**bar*
.
<p><em>foo**bar</em></p>
````````````````````````````````
The same condition ensures that the following The same condition ensures that the following
cases are all strong emphasis nested inside cases are all strong emphasis nested inside
emphasis, even when the interior spaces are emphasis, even when the interior spaces are
omitted: omitted:
```````````````````````````````` example ```````````````````````````````` example
***foo** bar* ***foo** bar*
. .
<p><em><strong>foo</strong> bar</em></p> <p><em><strong>foo</strong> bar</em></p>
skipping to change at line 6259 skipping to change at line 6498
. .
<p><em>foo <strong>bar</strong></em></p> <p><em>foo <strong>bar</strong></em></p>
```````````````````````````````` ````````````````````````````````
```````````````````````````````` example ```````````````````````````````` example
*foo**bar*** *foo**bar***
. .
<p><em>foo<strong>bar</strong></em></p> <p><em>foo<strong>bar</strong></em></p>
```````````````````````````````` ````````````````````````````````
When the lengths of the interior closing and opening
delimiter runs are *both* multiples of 3, though,
they can match to create emphasis:
```````````````````````````````` example
foo***bar***baz
.
<p>foo<em><strong>bar</strong></em>baz</p>
````````````````````````````````
```````````````````````````````` example
foo******bar*********baz
.
<p>foo<strong><strong><strong>bar</strong></strong></strong>***baz</p>
````````````````````````````````
Indefinite levels of nesting are possible: Indefinite levels of nesting are possible:
```````````````````````````````` example ```````````````````````````````` example
*foo **bar *baz* bim** bop* *foo **bar *baz* bim** bop*
. .
<p><em>foo <strong>bar <em>baz</em> bim</strong> bop</em></p> <p><em>foo <strong>bar <em>baz</em> bim</strong> bop</em></p>
```````````````````````````````` ````````````````````````````````
```````````````````````````````` example ```````````````````````````````` example
*foo [*bar*](/url)* *foo [*bar*](/url)*
skipping to change at line 6725 skipping to change at line 6980
than the brackets in link text. Thus, for example, than the brackets in link text. Thus, for example,
`` [foo`]` `` could not be a link text, since the second `]` `` [foo`]` `` could not be a link text, since the second `]`
is part of a code span. is part of a code span.
- The brackets in link text bind more tightly than markers for - The brackets in link text bind more tightly than markers for
[emphasis and strong emphasis]. Thus, for example, `*[foo*](url)` is a link. [emphasis and strong emphasis]. Thus, for example, `*[foo*](url)` is a link.
A [link destination](@) consists of either A [link destination](@) consists of either
- a sequence of zero or more characters between an opening `<` and a - a sequence of zero or more characters between an opening `<` and a
closing `>` that contains no spaces, line breaks, or unescaped closing `>` that contains no line breaks or unescaped
`<` or `>` characters, or `<` or `>` characters, or
- a nonempty sequence of characters that does not include - a nonempty sequence of characters that does not start with
ASCII space or control characters, and includes parentheses `<`, does not include ASCII space or control characters, and
only if (a) they are backslash-escaped or (b) they are part of includes parentheses only if (a) they are backslash-escaped or
a balanced pair of unescaped parentheses. (Implementations (b) they are part of a balanced pair of unescaped parentheses.
may impose limits on parentheses nesting to avoid performance (Implementations may impose limits on parentheses nesting to
issues, but at least three levels of nesting should be supported.) avoid performance issues, but at least three levels of nesting
should be supported.)
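The two forms of link destination in the newer column can be sketched as a validator. This is a minimal sketch under stated assumptions: the function name is hypothetical, it checks a whole candidate string rather than scanning inside a document, backslash escapes are assumed to apply only before ASCII punctuation, and no nesting limit is imposed.

```python
import string


def is_link_destination(s):
    """Return True if `s` as a whole is a link destination (newer rules)."""
    def escaped_at(i):
        # Assumption: a backslash escapes only a following ASCII punctuation char.
        return s[i] == "\\" and i + 1 < len(s) and s[i + 1] in string.punctuation

    if s.startswith("<"):
        i = 1
        while i < len(s):
            if escaped_at(i):
                i += 2
                continue
            ch = s[i]
            if ch in "\r\n" or ch == "<":
                return False
            if ch == ">":
                return i == len(s) - 1  # the closing bracket must end the string
            i += 1
        return False  # no unescaped closing ">"

    if not s:
        return False
    depth = 0
    i = 0
    while i < len(s):
        if escaped_at(i):
            i += 2
            continue
        ch = s[i]
        if ch == " " or ord(ch) < 0x20 or ord(ch) == 0x7F:
            return False  # ASCII space or control character
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                return False
        i += 1
    return depth == 0  # unescaped parentheses must be balanced
```

Under these assumptions `</my uri>` is accepted while `/my uri` is not, and `<foo\>` is rejected because its closing bracket is escaped.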
A [link title](@) consists of either A [link title](@) consists of either
- a sequence of zero or more characters between straight double-quote - a sequence of zero or more characters between straight double-quote
characters (`"`), including a `"` character only if it is characters (`"`), including a `"` character only if it is
backslash-escaped, or backslash-escaped, or
- a sequence of zero or more characters between straight single-quote - a sequence of zero or more characters between straight single-quote
characters (`'`), including a `'` character only if it is characters (`'`), including a `'` character only if it is
backslash-escaped, or backslash-escaped, or
- a sequence of zero or more characters between matching parentheses - a sequence of zero or more characters between matching parentheses
(`(...)`), including a `)` character only if it is backslash-escaped. (`(...)`), including a `(` or `)` character only if it is
backslash-escaped.
Although [link titles] may span multiple lines, they may not contain Although [link titles] may span multiple lines, they may not contain
a [blank line]. a [blank line].
An [inline link](@) consists of a [link text] followed immediately An [inline link](@) consists of a [link text] followed immediately
by a left parenthesis `(`, optional [whitespace], an optional by a left parenthesis `(`, optional [whitespace], an optional
[link destination], an optional [link title] separated from the link [link destination], an optional [link title] separated from the link
destination by [whitespace], optional [whitespace], and a right destination by [whitespace], optional [whitespace], and a right
parenthesis `)`. The link's text consists of the inlines contained parenthesis `)`. The link's text consists of the inlines contained
in the [link text] (excluding the enclosing square brackets). in the [link text] (excluding the enclosing square brackets).
skipping to change at line 6793 skipping to change at line 7050
. .
<p><a href="">link</a></p> <p><a href="">link</a></p>
```````````````````````````````` ````````````````````````````````
```````````````````````````````` example ```````````````````````````````` example
[link](<>) [link](<>)
. .
<p><a href="">link</a></p> <p><a href="">link</a></p>
```````````````````````````````` ````````````````````````````````
The destination cannot contain spaces or line breaks, The destination can only contain spaces if it is
even if enclosed in pointy brackets: enclosed in pointy brackets:
```````````````````````````````` example ```````````````````````````````` example
[link](/my uri) [link](/my uri)
. .
<p>[link](/my uri)</p> <p>[link](/my uri)</p>
```````````````````````````````` ````````````````````````````````
```````````````````````````````` example ```````````````````````````````` example
[link](</my uri>) [link](</my uri>)
. .
<p>[link](&lt;/my uri&gt;)</p> <p><a href="/my%20uri">link</a></p>
```````````````````````````````` ````````````````````````````````
The destination cannot contain line breaks,
even if enclosed in pointy brackets:
```````````````````````````````` example ```````````````````````````````` example
[link](foo [link](foo
bar) bar)
. .
<p>[link](foo <p>[link](foo
bar)</p> bar)</p>
```````````````````````````````` ````````````````````````````````
```````````````````````````````` example ```````````````````````````````` example
[link](<foo [link](<foo
bar>) bar>)
. .
<p>[link](<foo <p>[link](<foo
bar>)</p> bar>)</p>
```````````````````````````````` ````````````````````````````````
The destination can contain `)` if it is enclosed
in pointy brackets:
```````````````````````````````` example
[a](<b)c>)
.
<p><a href="b)c">a</a></p>
````````````````````````````````
Pointy brackets that enclose links must be unescaped:
```````````````````````````````` example
[link](<foo\>)
.
<p>[link](&lt;foo&gt;)</p>
````````````````````````````````
These are not links, because the opening pointy bracket
is not matched properly:
```````````````````````````````` example
[a](<b)c
[a](<b)c>
[a](<b>c)
.
<p>[a](&lt;b)c
[a](&lt;b)c&gt;
[a](<b>c)</p>
````````````````````````````````
Parentheses inside the link destination may be escaped: Parentheses inside the link destination may be escaped:
```````````````````````````````` example ```````````````````````````````` example
[link](\(foo\)) [link](\(foo\))
. .
<p><a href="(foo)">link</a></p> <p><a href="(foo)">link</a></p>
```````````````````````````````` ````````````````````````````````
Any number of parentheses are allowed without escaping, as long as they are Any number of parentheses are allowed without escaping, as long as they are
balanced: balanced:
skipping to change at line 7837 skipping to change at line 8127
<p>!<a href="/url" title="title">foo</a></p> <p>!<a href="/url" title="title">foo</a></p>
```````````````````````````````` ````````````````````````````````
## Autolinks ## Autolinks
[Autolink](@)s are absolute URIs and email addresses inside [Autolink](@)s are absolute URIs and email addresses inside
`<` and `>`. They are parsed as links, with the URL or email address `<` and `>`. They are parsed as links, with the URL or email address
as the link label. as the link label.
A [URI autolink](@) consists of `<`, followed by an A [URI autolink](@) consists of `<`, followed by an
[absolute URI] not containing `<`, followed by `>`. It is parsed as [absolute URI] followed by `>`. It is parsed as
a link to the URI, with the URI as the link's label. a link to the URI, with the URI as the link's label.
An [absolute URI](@), An [absolute URI](@),
for these purposes, consists of a [scheme] followed by a colon (`:`) for these purposes, consists of a [scheme] followed by a colon (`:`)
followed by zero or more characters other than ASCII followed by zero or more characters other than ASCII
[whitespace] and control characters, `<`, and `>`. If [whitespace] and control characters, `<`, and `>`. If
the URI includes these characters, they must be percent-encoded the URI includes these characters, they must be percent-encoded
(e.g. `%20` for a space). (e.g. `%20` for a space).
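The definition above can be approximated with a regular expression. This is a sketch under stated assumptions: the scheme pattern (an ASCII letter followed by 1 to 31 letters, digits, `+`, `.`, or `-`) is taken as given because the scheme definition is cut off in this excerpt, "control characters" is read as the ASCII range plus DEL, and the names `URI_AUTOLINK` and `uri_autolink` are hypothetical.

```python
import re
from typing import Optional

# '<', scheme, ':', then any characters other than ASCII whitespace,
# ASCII control characters, '<', and '>', then the closing '>'.
URI_AUTOLINK = re.compile(
    r"<([A-Za-z][A-Za-z0-9.+-]{1,31}:[^<>\x00-\x20\x7f]*)>"
)


def uri_autolink(text: str) -> Optional[str]:
    """Return the URI if `text` is exactly one URI autolink, else None."""
    match = URI_AUTOLINK.fullmatch(text)
    return match.group(1) if match else None
```

With this pattern a space inside the brackets prevents a match, which is why such a URI has to carry `%20` instead.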
For purposes of this spec, a [scheme](@) is any sequence For purposes of this spec, a [scheme](@) is any sequence
skipping to change at line 8031 skipping to change at line 8321
consists of optional [whitespace], consists of optional [whitespace],
a `=` character, optional [whitespace], and an [attribute a `=` character, optional [whitespace], and an [attribute
value]. value].
An [attribute value](@) An [attribute value](@)
consists of an [unquoted attribute value], consists of an [unquoted attribute value],
a [single-quoted attribute value], or a [double-quoted attribute value]. a [single-quoted attribute value], or a [double-quoted attribute value].
An [unquoted attribute value](@) An [unquoted attribute value](@)
is a nonempty string of characters not is a nonempty string of characters not
including spaces, `"`, `'`, `=`, `<`, `>`, or `` ` ``. including [whitespace], `"`, `'`, `=`, `<`, `>`, or `` ` ``.
A [single-quoted attribute value](@) A [single-quoted attribute value](@)
consists of `'`, zero or more consists of `'`, zero or more
characters not including `'`, and a final `'`. characters not including `'`, and a final `'`.
A [double-quoted attribute value](@) A [double-quoted attribute value](@)
consists of `"`, zero or more consists of `"`, zero or more
characters not including `"`, and a final `"`. characters not including `"`, and a final `"`.
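The three attribute value forms map directly onto regular expressions. The sketch below follows the revised wording, in which an unquoted value excludes all [whitespace] rather than only spaces. The names `ATTRIBUTE_VALUE` and `is_attribute_value` are hypothetical, and the whitespace set (space, tab, newline, line tabulation, form feed, carriage return) is an assumption drawn from the spec's general definition, which is not part of this excerpt.

```python
import re

# Unquoted: nonempty, no whitespace, and none of " ' = < > `
UNQUOTED = r"""[^ \t\n\x0b\x0c\r"'=<>`]+"""
# Quoted: the matching quote may not appear inside; line breaks may.
SINGLE_QUOTED = r"'[^']*'"
DOUBLE_QUOTED = r'"[^"]*"'

ATTRIBUTE_VALUE = re.compile(
    "(?:" + UNQUOTED + "|" + SINGLE_QUOTED + "|" + DOUBLE_QUOTED + ")"
)


def is_attribute_value(text: str) -> bool:
    """Return True if `text` as a whole is a valid attribute value."""
    return ATTRIBUTE_VALUE.fullmatch(text) is not None
```

Under the earlier wording, a value such as `baz` followed by a tab and more text would have been a single unquoted value; with whitespace excluded it no longer is.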
An [open tag](@) consists of a `<` character, a [tag name], An [open tag](@) consists of a `<` character, a [tag name],
skipping to change at line 8144 skipping to change at line 8434
<a href="hi'> <a href=hi'> <a href="hi'> <a href=hi'>
. .
<p>&lt;a href=&quot;hi'&gt; &lt;a href=hi'&gt;</p> <p>&lt;a href=&quot;hi'&gt; &lt;a href=hi'&gt;</p>
```````````````````````````````` ````````````````````````````````
Illegal [whitespace]: Illegal [whitespace]:
```````````````````````````````` example ```````````````````````````````` example
< a>< < a><
foo><bar/ > foo><bar/ >
<foo bar=baz
bim!bop />
. .
<p>&lt; a&gt;&lt; <p>&lt; a&gt;&lt;
foo&gt;&lt;bar/ &gt;</p> foo&gt;&lt;bar/ &gt;
&lt;foo bar=baz
bim!bop /&gt;</p>
```````````````````````````````` ````````````````````````````````
Missing [whitespace]: Missing [whitespace]:
```````````````````````````````` example ```````````````````````````````` example
<a href='bar'title=title> <a href='bar'title=title>
. .
<p>&lt;a href='bar'title=title&gt;</p> <p>&lt;a href='bar'title=title&gt;</p>
```````````````````````````````` ````````````````````````````````
skipping to change at line 8326 skipping to change at line 8620
<p><em>foo<br /> <p><em>foo<br />
bar</em></p> bar</em></p>
```````````````````````````````` ````````````````````````````````
Line breaks do not occur inside code spans Line breaks do not occur inside code spans
```````````````````````````````` example ```````````````````````````````` example
`code `code
span` span`
. .
<p><code>code span</code></p> <p><code>code span</code></p>
```````````````````````````````` ````````````````````````````````
```````````````````````````````` example ```````````````````````````````` example
`code\ `code\
span` span`
. .
<p><code>code\ span</code></p> <p><code>code\ span</code></p>
```````````````````````````````` ````````````````````````````````
or HTML tags: or HTML tags:
skipping to change at line 8731 skipping to change at line 9025
Parameter `stack_bottom` sets a lower bound to how far we Parameter `stack_bottom` sets a lower bound to how far we
descend in the [delimiter stack]. If it is NULL, we can descend in the [delimiter stack]. If it is NULL, we can
go all the way to the bottom. Otherwise, we stop before go all the way to the bottom. Otherwise, we stop before
visiting `stack_bottom`. visiting `stack_bottom`.
Let `current_position` point to the element on the [delimiter stack] Let `current_position` point to the element on the [delimiter stack]
just above `stack_bottom` (or the first element if `stack_bottom` just above `stack_bottom` (or the first element if `stack_bottom`
is NULL). is NULL).
We keep track of the `openers_bottom` for each delimiter We keep track of the `openers_bottom` for each delimiter
type (`*`, `_`). Initialize this to `stack_bottom`. type (`*`, `_`) and each length of the closing delimiter run
(modulo 3). Initialize this to `stack_bottom`.
Then we repeat the following until we run out of potential Then we repeat the following until we run out of potential
closers: closers:
- Move `current_position` forward in the delimiter stack (if needed) - Move `current_position` forward in the delimiter stack (if needed)
until we find the first potential closer with delimiter `*` or `_`. until we find the first potential closer with delimiter `*` or `_`.
(This will be the potential closer closest (This will be the potential closer closest
to the beginning of the input -- the first one in parse order.) to the beginning of the input -- the first one in parse order.)
- Now, look back in the stack (staying above `stack_bottom` and - Now, look back in the stack (staying above `stack_bottom` and
skipping to change at line 8763 skipping to change at line 9058
+ Remove any delimiters between the opener and closer from + Remove any delimiters between the opener and closer from
the delimiter stack. the delimiter stack.
+ Remove 1 (for regular emph) or 2 (for strong emph) delimiters + Remove 1 (for regular emph) or 2 (for strong emph) delimiters
from the opening and closing text nodes. If they become empty from the opening and closing text nodes. If they become empty
as a result, remove them and remove the corresponding element as a result, remove them and remove the corresponding element
of the delimiter stack. If the closing node is removed, reset of the delimiter stack. If the closing node is removed, reset
`current_position` to the next element in the stack. `current_position` to the next element in the stack.
- If none in found: - If none is found:
+ Set `openers_bottom` to the element before `current_position`. + Set `openers_bottom` to the element before `current_position`.
(We know that there are no openers for this kind of closer up to and (We know that there are no openers for this kind of closer up to and
including this point, so this puts a lower bound on future searches.) including this point, so this puts a lower bound on future searches.)
+ If the closer at `current_position` is not a potential opener, + If the closer at `current_position` is not a potential opener,
remove it from the delimiter stack (since we know it can't remove it from the delimiter stack (since we know it can't
be a closer either). be a closer either).
+ Advance `current_position` to the next element in the stack. + Advance `current_position` to the next element in the stack.
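The revised `openers_bottom` bookkeeping can be sketched as a small lookup table with one lower bound per delimiter type and per closing-run length modulo 3. This is an illustration only: the class name `OpenersBottom`, its methods, and the use of integer stack positions (with `None` standing in for NULL) are assumptions, not part of the spec. The likely motivation is that whether an opener can match depends on the closer's length modulo 3, so a failed search for one closer length should not rule out openers for another.

```python
from typing import Dict, Optional, Tuple


class OpenersBottom:
    """Lower bounds for opener searches.

    One bound is kept for each combination of delimiter type ('*' or '_')
    and closing delimiter run length modulo 3.
    """

    def __init__(self, stack_bottom: Optional[int]) -> None:
        # Every bound starts out at stack_bottom.
        self._bounds: Dict[Tuple[str, int], Optional[int]] = {
            (delimiter, remainder): stack_bottom
            for delimiter in "*_"
            for remainder in range(3)
        }

    def get(self, delimiter: str, closer_length: int) -> Optional[int]:
        """Bound that applies to a closer of this type and run length."""
        return self._bounds[(delimiter, closer_length % 3)]

    def set(self, delimiter: str, closer_length: int, position: Optional[int]) -> None:
        """Record that no opener for this kind of closer exists above `position`."""
        self._bounds[(delimiter, closer_length % 3)] = position
```

When no opener is found for a closer, a caller would record the element before `current_position` via `set`, and only later closers with the same delimiter and the same length modulo 3 would see the tightened bound.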
 End of changes. 79 change blocks. 
123 lines changed or deleted 416 lines changed or added
