spec.txt | spec.txt | |||
---|---|---|---|---|
--- | --- | |||
title: CommonMark Spec | title: CommonMark Spec | |||
author: John MacFarlane | author: John MacFarlane | |||
version: 0.28 | version: 0.29 | |||
date: '2017-08-01' | date: '2019-04-06' | |||
license: '[CC-BY-SA 4.0](http://creativecommons.org/licenses/by-sa/4.0/)' | license: '[CC-BY-SA 4.0](http://creativecommons.org/licenses/by-sa/4.0/)' | |||
... | ... | |||
# Introduction | # Introduction | |||
## What is Markdown? | ## What is Markdown? | |||
Markdown is a plain text format for writing structured documents, | Markdown is a plain text format for writing structured documents, | |||
based on conventions for indicating formatting in email | based on conventions for indicating formatting in email | |||
and usenet posts. It was developed by John Gruber (with | and usenet posts. It was developed by John Gruber (with | |||
skipping to change at line 251 ¶ | skipping to change at line 251 ¶ | |||
[foo][] | [foo][] | |||
``` | ``` | |||
In the absence of a spec, early implementers consulted `Markdown.pl` | In the absence of a spec, early implementers consulted `Markdown.pl` | |||
to resolve these ambiguities. But `Markdown.pl` was quite buggy, and | to resolve these ambiguities. But `Markdown.pl` was quite buggy, and | |||
gave manifestly bad results in many cases, so it was not a | gave manifestly bad results in many cases, so it was not a | |||
satisfactory replacement for a spec. | satisfactory replacement for a spec. | |||
Because there is no unambiguous spec, implementations have diverged | Because there is no unambiguous spec, implementations have diverged | |||
considerably. As a result, users are often surprised to find that | considerably. As a result, users are often surprised to find that | |||
a document that renders one way on one system (say, a github wiki) | a document that renders one way on one system (say, a GitHub wiki) | |||
renders differently on another (say, converting to docbook using | renders differently on another (say, converting to docbook using | |||
pandoc). To make matters worse, because nothing in Markdown counts | pandoc). To make matters worse, because nothing in Markdown counts | |||
as a "syntax error," the divergence often isn't discovered right away. | as a "syntax error," the divergence often isn't discovered right away. | |||
## About this document | ## About this document | |||
This document attempts to specify Markdown syntax unambiguously. | This document attempts to specify Markdown syntax unambiguously. | |||
It contains many examples with side-by-side Markdown and | It contains many examples with side-by-side Markdown and | |||
HTML. These are intended to double as conformance tests. An | HTML. These are intended to double as conformance tests. An | |||
accompanying script `spec_tests.py` can be used to run the tests | accompanying script `spec_tests.py` can be used to run the tests | |||
skipping to change at line 331 ¶ | skipping to change at line 331 ¶ | |||
[Unicode whitespace](@) is a sequence of one | [Unicode whitespace](@) is a sequence of one | |||
or more [Unicode whitespace characters]. | or more [Unicode whitespace characters]. | |||
A [space](@) is `U+0020`. | A [space](@) is `U+0020`. | |||
A [non-whitespace character](@) is any character | A [non-whitespace character](@) is any character | |||
that is not a [whitespace character]. | that is not a [whitespace character]. | |||
An [ASCII punctuation character](@) | An [ASCII punctuation character](@) | |||
is `!`, `"`, `#`, `$`, `%`, `&`, `'`, `(`, `)`, | is `!`, `"`, `#`, `$`, `%`, `&`, `'`, `(`, `)`, | |||
`*`, `+`, `,`, `-`, `.`, `/`, `:`, `;`, `<`, `=`, `>`, `?`, `@`, | `*`, `+`, `,`, `-`, `.`, `/` (U+0021–2F), | |||
`[`, `\`, `]`, `^`, `_`, `` ` ``, `{`, `|`, `}`, or `~`. | `:`, `;`, `<`, `=`, `>`, `?`, `@` (U+003A–0040), | |||
`[`, `\`, `]`, `^`, `_`, `` ` `` (U+005B–0060), | ||||
`{`, `|`, `}`, or `~` (U+007B–007E). | ||||
A [punctuation character](@) is an [ASCII | A [punctuation character](@) is an [ASCII | |||
punctuation character] or anything in | punctuation character] or anything in | |||
the general Unicode categories `Pc`, `Pd`, `Pe`, `Pf`, `Pi`, `Po`, or `Ps`. | the general Unicode categories `Pc`, `Pd`, `Pe`, `Pf`, `Pi`, `Po`, or `Ps`. | |||
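The 0.29 wording groups the ASCII punctuation characters into four code point ranges. A minimal Python sketch of that check follows; the name `is_ascii_punctuation` is illustrative and not something the spec defines.

```python
import string

# Code point ranges quoted in the 0.29 text:
# U+0021 to U+002F, U+003A to U+0040, U+005B to U+0060, U+007B to U+007E.
ASCII_PUNCTUATION_RANGES = ((0x21, 0x2F), (0x3A, 0x40), (0x5B, 0x60), (0x7B, 0x7E))


def is_ascii_punctuation(ch: str) -> bool:
    """Return True if the single character ch is an ASCII punctuation character."""
    cp = ord(ch)
    return any(lo <= cp <= hi for lo, hi in ASCII_PUNCTUATION_RANGES)


# The four ranges cover the same 32 characters as Python's string.punctuation.
assert {chr(c) for c in range(128) if is_ascii_punctuation(chr(c))} == set(string.punctuation)
```

The sketch covers only the ASCII definition; the broader [punctuation character] definition would additionally need a Unicode general category lookup.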
## Tabs | ## Tabs | |||
Tabs in lines are not expanded to [spaces]. However, | Tabs in lines are not expanded to [spaces]. However, | |||
in contexts where whitespace helps to define block structure, | in contexts where whitespace helps to define block structure, | |||
tabs behave as if they were replaced by spaces with a tab stop | tabs behave as if they were replaced by spaces with a tab stop | |||
skipping to change at line 514 ¶ | skipping to change at line 516 ¶ | |||
paragraphs, headings, and other block constructs can be parsed for inline | paragraphs, headings, and other block constructs can be parsed for inline | |||
structure. The second step requires information about link reference | structure. The second step requires information about link reference | |||
definitions that will be available only at the end of the first | definitions that will be available only at the end of the first | |||
step. Note that the first step requires processing lines in sequence, | step. Note that the first step requires processing lines in sequence, | |||
but the second can be parallelized, since the inline parsing of | but the second can be parallelized, since the inline parsing of | |||
one block element does not affect the inline parsing of any other. | one block element does not affect the inline parsing of any other. | |||
## Container blocks and leaf blocks | ## Container blocks and leaf blocks | |||
We can divide blocks into two types: | We can divide blocks into two types: | |||
[container block](@)s, | [container blocks](@), | |||
which can contain other blocks, and [leaf block](@)s, | which can contain other blocks, and [leaf blocks](@), | |||
which cannot. | which cannot. | |||
# Leaf blocks | # Leaf blocks | |||
This section describes the different kinds of leaf block that make up a | This section describes the different kinds of leaf block that make up a | |||
Markdown document. | Markdown document. | |||
## Thematic breaks | ## Thematic breaks | |||
A line consisting of 0-3 spaces of indentation, followed by a sequence | A line consisting of 0-3 spaces of indentation, followed by a sequence | |||
of three or more matching `-`, `_`, or `*` characters, each followed | of three or more matching `-`, `_`, or `*` characters, each followed | |||
optionally by any number of spaces, forms a | optionally by any number of spaces or tabs, forms a | |||
[thematic break](@). | [thematic break](@). | |||
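As a rough illustration of the 0.29 wording (spaces or tabs after each marker), the rule can be approximated with a single regular expression. This is a sketch only: `is_thematic_break` is a hypothetical helper, and it looks at one line in isolation, ignoring surrounding block context.

```python
import re

# 0-3 spaces of indentation, then three or more of the same marker
# character, each optionally followed by spaces or tabs.
THEMATIC_BREAK = re.compile(r"^ {0,3}([-_*])[ \t]*(?:\1[ \t]*){2,}$")


def is_thematic_break(line: str) -> bool:
    """Check a single line against the thematic break rule (context-free)."""
    return THEMATIC_BREAK.match(line.rstrip("\r\n")) is not None
```

The backreference `\1` enforces that all marker characters match, so a line such as `*-*` is rejected.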
```````````````````````````````` example | ```````````````````````````````` example | |||
*** | *** | |||
--- | --- | |||
___ | ___ | |||
. | . | |||
<hr /> | <hr /> | |||
<hr /> | <hr /> | |||
<hr /> | <hr /> | |||
skipping to change at line 801 ¶ | skipping to change at line 803 ¶ | |||
```````````````````````````````` | ```````````````````````````````` | |||
Contents are parsed as inlines: | Contents are parsed as inlines: | |||
```````````````````````````````` example | ```````````````````````````````` example | |||
# foo *bar* \*baz\* | # foo *bar* \*baz\* | |||
. | . | |||
<h1>foo <em>bar</em> *baz*</h1> | <h1>foo <em>bar</em> *baz*</h1> | |||
```````````````````````````````` | ```````````````````````````````` | |||
Leading and trailing blanks are ignored in parsing inline content: | Leading and trailing [whitespace] is ignored in parsing inline content: | |||
```````````````````````````````` example | ```````````````````````````````` example | |||
# foo | # foo | |||
. | . | |||
<h1>foo</h1> | <h1>foo</h1> | |||
```````````````````````````````` | ```````````````````````````````` | |||
One to three spaces indentation are allowed: | One to three spaces indentation are allowed: | |||
```````````````````````````````` example | ```````````````````````````````` example | |||
skipping to change at line 986 ¶ | skipping to change at line 988 ¶ | |||
```````````````````````````````` example | ```````````````````````````````` example | |||
Foo *bar | Foo *bar | |||
baz* | baz* | |||
==== | ==== | |||
. | . | |||
<h1>Foo <em>bar | <h1>Foo <em>bar | |||
baz</em></h1> | baz</em></h1> | |||
```````````````````````````````` | ```````````````````````````````` | |||
The contents are the result of parsing the heading's raw | ||||
content as inlines. The heading's raw content is formed by | ||||
concatenating the lines and removing initial and final | ||||
[whitespace]. | ||||
```````````````````````````````` example | ||||
Foo *bar | ||||
baz*→ | ||||
==== | ||||
. | ||||
<h1>Foo <em>bar | ||||
baz</em></h1> | ||||
```````````````````````````````` | ||||
The underlining can be any length: | The underlining can be any length: | |||
```````````````````````````````` example | ```````````````````````````````` example | |||
Foo | Foo | |||
------------------------- | ------------------------- | |||
Foo | Foo | |||
= | = | |||
. | . | |||
<h2>Foo</h2> | <h2>Foo</h2> | |||
skipping to change at line 1501 ¶ | skipping to change at line 1517 ¶ | |||
## Fenced code blocks | ## Fenced code blocks | |||
A [code fence](@) is a sequence | A [code fence](@) is a sequence | |||
of at least three consecutive backtick characters (`` ` ``) or | of at least three consecutive backtick characters (`` ` ``) or | |||
tildes (`~`). (Tildes and backticks cannot be mixed.) | tildes (`~`). (Tildes and backticks cannot be mixed.) | |||
A [fenced code block](@) | A [fenced code block](@) | |||
begins with a code fence, indented no more than three spaces. | begins with a code fence, indented no more than three spaces. | |||
The line with the opening code fence may optionally contain some text | The line with the opening code fence may optionally contain some text | |||
following the code fence; this is trimmed of leading and trailing | following the code fence; this is trimmed of leading and trailing | |||
spaces and called the [info string](@). | whitespace and called the [info string](@). If the [info string] comes | |||
The [info string] may not contain any backtick | after a backtick fence, it may not contain any backtick | |||
characters. (The reason for this restriction is that otherwise | characters. (The reason for this restriction is that otherwise | |||
some inline code would be incorrectly interpreted as the | some inline code would be incorrectly interpreted as the | |||
beginning of a fenced code block.) | beginning of a fenced code block.) | |||
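The opening-fence rule in the 0.29 column can be sketched as below. The `Fence` tuple and `parse_opening_fence` are hypothetical names chosen for this illustration; the backtick restriction is applied only to backtick fences, as the revised text states.

```python
import re
from typing import NamedTuple, Optional

# Up to three spaces of indentation, then at least three backticks or
# at least three tildes (never mixed), then the rest of the line.
OPENING_FENCE = re.compile(r"^( {0,3})(`{3,}|~{3,})(.*)$")


class Fence(NamedTuple):
    indent: int  # number of leading spaces
    char: str    # "`" or "~"
    length: int  # number of fence characters
    info: str    # trimmed info string (may be empty)


def parse_opening_fence(line: str) -> Optional[Fence]:
    """Return the parsed opening fence, or None if the line is not one."""
    m = OPENING_FENCE.match(line.rstrip("\r\n"))
    if m is None:
        return None
    indent, fence, rest = m.groups()
    info = rest.strip(" \t\n\x0b\x0c\r")
    # Only an info string after a backtick fence may not contain backticks.
    if fence[0] == "`" and "`" in info:
        return None
    return Fence(len(indent), fence[0], len(fence), info)
```

Under this sketch, ```` ``` aa ``` ```` is rejected as a fence, while `~~~ aa ``` ~~~` is accepted with the info string ``aa ``` ~~~``.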
The content of the code block consists of all subsequent lines, until | The content of the code block consists of all subsequent lines, until | |||
a closing [code fence] of the same type as the code block | a closing [code fence] of the same type as the code block | |||
began with (backticks or tildes), and with at least as many backticks | began with (backticks or tildes), and with at least as many backticks | |||
or tildes as the opening code fence. If the leading code fence is | or tildes as the opening code fence. If the leading code fence is | |||
indented N spaces, then up to N spaces of indentation are removed from | indented N spaces, then up to N spaces of indentation are removed from | |||
each line of the content (if present). (If a content line is not | each line of the content (if present). (If a content line is not | |||
skipping to change at line 1768 ¶ | skipping to change at line 1784 ¶ | |||
``` | ``` | |||
</code></pre> | </code></pre> | |||
```````````````````````````````` | ```````````````````````````````` | |||
Code fences (opening and closing) cannot contain internal spaces: | Code fences (opening and closing) cannot contain internal spaces: | |||
```````````````````````````````` example | ```````````````````````````````` example | |||
``` ``` | ``` ``` | |||
aaa | aaa | |||
. | . | |||
<p><code></code> | <p><code> </code> | |||
aaa</p> | aaa</p> | |||
```````````````````````````````` | ```````````````````````````````` | |||
```````````````````````````````` example | ```````````````````````````````` example | |||
~~~~~~ | ~~~~~~ | |||
aaa | aaa | |||
~~~ ~~ | ~~~ ~~ | |||
. | . | |||
<pre><code>aaa | <pre><code>aaa | |||
~~~ ~~ | ~~~ ~~ | |||
skipping to change at line 1816 ¶ | skipping to change at line 1832 ¶ | |||
~~~ | ~~~ | |||
# baz | # baz | |||
. | . | |||
<h2>foo</h2> | <h2>foo</h2> | |||
<pre><code>bar | <pre><code>bar | |||
</code></pre> | </code></pre> | |||
<h1>baz</h1> | <h1>baz</h1> | |||
```````````````````````````````` | ```````````````````````````````` | |||
An [info string] can be provided after the opening code fence. | An [info string] can be provided after the opening code fence. | |||
Opening and closing spaces will be stripped, and the first word, prefixed | Although this spec doesn't mandate any particular treatment of | |||
with `language-`, is used as the value for the `class` attribute of the | the info string, the first word is typically used to specify | |||
`code` element within the enclosing `pre` element. | the language of the code block. In HTML output, the language is | |||
normally indicated by adding a class to the `code` element consisting | ||||
of `language-` followed by the language name. | ||||
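The typical treatment described in the 0.29 column can be sketched in a few lines. `code_class` is a hypothetical helper; since the spec no longer mandates this behaviour, treat it as one common convention rather than a requirement. Backslash-escape and entity handling in the info string are deliberately left out.

```python
from typing import Optional


def code_class(info: str) -> Optional[str]:
    """Derive a class attribute value from the first word of an info string."""
    words = info.split()
    return f"language-{words[0]}" if words else None
```

An empty info string yields `None`, which corresponds to emitting a bare `<code>` element.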
```````````````````````````````` example | ```````````````````````````````` example | |||
```ruby | ```ruby | |||
def foo(x) | def foo(x) | |||
return 3 | return 3 | |||
end | end | |||
``` | ``` | |||
. | . | |||
<pre><code class="language-ruby">def foo(x) | <pre><code class="language-ruby">def foo(x) | |||
return 3 | return 3 | |||
skipping to change at line 1863 ¶ | skipping to change at line 1881 ¶ | |||
[Info strings] for backtick code blocks cannot contain backticks: | [Info strings] for backtick code blocks cannot contain backticks: | |||
```````````````````````````````` example | ```````````````````````````````` example | |||
``` aa ``` | ``` aa ``` | |||
foo | foo | |||
. | . | |||
<p><code>aa</code> | <p><code>aa</code> | |||
foo</p> | foo</p> | |||
```````````````````````````````` | ```````````````````````````````` | |||
[Info strings] for tilde code blocks can contain backticks and tildes: | ||||
```````````````````````````````` example | ||||
~~~ aa ``` ~~~ | ||||
foo | ||||
~~~ | ||||
. | ||||
<pre><code class="language-aa">foo | ||||
</code></pre> | ||||
```````````````````````````````` | ||||
Closing code fences cannot have [info strings]: | Closing code fences cannot have [info strings]: | |||
```````````````````````````````` example | ```````````````````````````````` example | |||
``` | ``` | |||
``` aaa | ``` aaa | |||
``` | ``` | |||
. | . | |||
<pre><code>``` aaa | <pre><code>``` aaa | |||
</code></pre> | </code></pre> | |||
```````````````````````````````` | ```````````````````````````````` | |||
## HTML blocks | ## HTML blocks | |||
An [HTML block](@) is a group of lines that is treated | An [HTML block](@) is a group of lines that is treated | |||
as raw HTML (and will not be escaped in HTML output). | as raw HTML (and will not be escaped in HTML output). | |||
There are seven kinds of [HTML block], which can be defined | There are seven kinds of [HTML block], which can be defined by their | |||
by their start and end conditions. The block begins with a line that | start and end conditions. The block begins with a line that meets a | |||
meets a [start condition](@) (after up to three spaces | [start condition](@) (after up to three spaces optional indentation). | |||
optional indentation). It ends with the first subsequent line that | It ends with the first subsequent line that meets a matching [end | |||
meets a matching [end condition](@), or the last line of | condition](@), or the last line of the document, or the last line of | |||
the document or other [container block]), if no line is encountered that meets the | the [container block](#container-blocks) containing the current HTML | |||
block, if no line is encountered that meets the [end condition]. If | ||||
[end condition]. If the first line meets both the [start condition] | the first line meets both the [start condition] and the [end | |||
and the [end condition], the block will contain just that line. | condition], the block will contain just that line. | |||
1. **Start condition:** line begins with the string `<script`, | 1. **Start condition:** line begins with the string `<script`, | |||
`<pre`, or `<style` (case-insensitive), followed by whitespace, | `<pre`, or `<style` (case-insensitive), followed by whitespace, | |||
the string `>`, or the end of the line.\ | the string `>`, or the end of the line.\ | |||
**End condition:** line contains an end tag | **End condition:** line contains an end tag | |||
`</script>`, `</pre>`, or `</style>` (case-insensitive; it | `</script>`, `</pre>`, or `</style>` (case-insensitive; it | |||
need not match the start tag). | need not match the start tag). | |||
2. **Start condition:** line begins with the string `<!--`.\ | 2. **Start condition:** line begins with the string `<!--`.\ | |||
**End condition:** line contains the string `-->`. | **End condition:** line contains the string `-->`. | |||
skipping to change at line 1917 ¶ | skipping to change at line 1947 ¶ | |||
**End condition:** line contains the string `]]>`. | **End condition:** line contains the string `]]>`. | |||
6. **Start condition:** line begins with the string `<` or `</` | 6. **Start condition:** line begins with the string `<` or `</` | |||
followed by one of the strings (case-insensitive) `address`, | followed by one of the strings (case-insensitive) `address`, | |||
`article`, `aside`, `base`, `basefont`, `blockquote`, `body`, | `article`, `aside`, `base`, `basefont`, `blockquote`, `body`, | |||
`caption`, `center`, `col`, `colgroup`, `dd`, `details`, `dialog`, | `caption`, `center`, `col`, `colgroup`, `dd`, `details`, `dialog`, | |||
`dir`, `div`, `dl`, `dt`, `fieldset`, `figcaption`, `figure`, | `dir`, `div`, `dl`, `dt`, `fieldset`, `figcaption`, `figure`, | |||
`footer`, `form`, `frame`, `frameset`, | `footer`, `form`, `frame`, `frameset`, | |||
`h1`, `h2`, `h3`, `h4`, `h5`, `h6`, `head`, `header`, `hr`, | `h1`, `h2`, `h3`, `h4`, `h5`, `h6`, `head`, `header`, `hr`, | |||
`html`, `iframe`, `legend`, `li`, `link`, `main`, `menu`, `menuitem`, | `html`, `iframe`, `legend`, `li`, `link`, `main`, `menu`, `menuitem`, | |||
`meta`, `nav`, `noframes`, `ol`, `optgroup`, `option`, `p`, `param`, | `nav`, `noframes`, `ol`, `optgroup`, `option`, `p`, `param`, | |||
`section`, `source`, `summary`, `table`, `tbody`, `td`, | `section`, `source`, `summary`, `table`, `tbody`, `td`, | |||
`tfoot`, `th`, `thead`, `title`, `tr`, `track`, `ul`, followed | `tfoot`, `th`, `thead`, `title`, `tr`, `track`, `ul`, followed | |||
by [whitespace], the end of the line, the string `>`, or | by [whitespace], the end of the line, the string `>`, or | |||
the string `/>`.\ | the string `/>`.\ | |||
**End condition:** line is followed by a [blank line]. | **End condition:** line is followed by a [blank line]. | |||
7. **Start condition:** line begins with a complete [open tag] | 7. **Start condition:** line begins with a complete [open tag] | |||
or [closing tag] (with any [tag name] other than `script`, | (with any [tag name] other than `script`, | |||
`style`, or `pre`) followed only by [whitespace] | `style`, or `pre`) or a complete [closing tag], | |||
or the end of the line.\ | followed only by [whitespace] or the end of the line.\ | |||
**End condition:** line is followed by a [blank line]. | **End condition:** line is followed by a [blank line]. | |||
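To make the start conditions more concrete, here is a partial Python sketch covering only conditions 1, 2, and 6 as they read in the 0.29 column (so `meta` is not in the tag list). The function name `html_block_start` is an assumption of this example; the conditions not shown in this excerpt and condition 7 are not implemented, so a `None` result here does not mean the line starts no HTML block at all.

```python
import re
from typing import Optional

# Tag names for start condition 6, as listed in the 0.29 column.
BLOCK_TAGS = """
address article aside base basefont blockquote body caption center col
colgroup dd details dialog dir div dl dt fieldset figcaption figure footer
form frame frameset h1 h2 h3 h4 h5 h6 head header hr html iframe legend li
link main menu menuitem nav noframes ol optgroup option p param section
source summary table tbody td tfoot th thead title tr track ul
""".split()

START_1 = re.compile(r"<(?:script|pre|style)(?:\s|>|$)", re.IGNORECASE)
START_2 = re.compile(r"<!--")
START_6 = re.compile(
    r"</?(?:%s)(?:\s|>|/>|$)" % "|".join(BLOCK_TAGS), re.IGNORECASE
)


def html_block_start(line: str) -> Optional[int]:
    """Return 1, 2, or 6 if the line meets that start condition, else None."""
    text = line.rstrip("\r\n")
    # Skip up to three spaces of optional indentation; a fourth space
    # stays in `rest` and prevents every pattern from matching.
    rest = text[re.match(r" {0,3}", text).end():]
    if START_1.match(rest):
        return 1
    if START_2.match(rest):
        return 2
    if START_6.match(rest):
        return 6
    return None
```

For example, `<table><tr><td>` is classified as condition 6, which is why the block in the `<pre>` example further down ends at the first blank line.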
HTML blocks continue until they are closed by their appropriate | HTML blocks continue until they are closed by their appropriate | |||
[end condition], or the last line of the document or other [container block]. | [end condition], or the last line of the document or other [container | |||
This means any HTML **within an HTML block** that might otherwise be recognised | block](#container-blocks). This means any HTML **within an HTML | |||
as a start condition will be ignored by the parser and passed through as-is, | block** that might otherwise be recognised as a start condition will | |||
without changing the parser's state. | be ignored by the parser and passed through as-is, without changing | |||
the parser's state. | ||||
For instance, `<pre>` within an HTML block started by `<table>` will not affect | For instance, `<pre>` within an HTML block started by `<table>` will not affect | |||
the parser state; as the HTML block was started by start condition 6, it | the parser state; as the HTML block was started by start condition 6, it | |||
will end at any blank line. This can be surprising: | will end at any blank line. This can be surprising: | |||
```````````````````````````````` example | ```````````````````````````````` example | |||
<table><tr><td> | <table><tr><td> | |||
<pre> | <pre> | |||
**Hello**, | **Hello**, | |||
skipping to change at line 1957 ¶ | skipping to change at line 1988 ¶ | |||
</td></tr></table> | </td></tr></table> | |||
. | . | |||
<table><tr><td> | <table><tr><td> | |||
<pre> | <pre> | |||
**Hello**, | **Hello**, | |||
<p><em>world</em>. | <p><em>world</em>. | |||
</pre></p> | </pre></p> | |||
</td></tr></table> | </td></tr></table> | |||
```````````````````````````````` | ```````````````````````````````` | |||
In this case, the HTML block is terminated by the newline — the `**hello**` | In this case, the HTML block is terminated by the newline — the `**Hello**` | |||
text remains verbatim — and regular parsing resumes, with a paragraph, | text remains verbatim — and regular parsing resumes, with a paragraph, | |||
emphasised `world` and inline and block HTML following. | emphasised `world` and inline and block HTML following. | |||
All types of [HTML blocks] except type 7 may interrupt | All types of [HTML blocks] except type 7 may interrupt | |||
a paragraph. Blocks of type 7 may not interrupt a paragraph. | a paragraph. Blocks of type 7 may not interrupt a paragraph. | |||
(This restriction is intended to prevent unwanted interpretation | (This restriction is intended to prevent unwanted interpretation | |||
of long tags inside a wrapped paragraph as starting HTML blocks.) | of long tags inside a wrapped paragraph as starting HTML blocks.) | |||
Some simple examples follow. Here are some basic HTML blocks | Some simple examples follow. Here are some basic HTML blocks | |||
of type 6: | of type 6: | |||
skipping to change at line 2462 ¶ | skipping to change at line 2493 ¶ | |||
bar | bar | |||
</div> | </div> | |||
. | . | |||
<p>Foo</p> | <p>Foo</p> | |||
<div> | <div> | |||
bar | bar | |||
</div> | </div> | |||
```````````````````````````````` | ```````````````````````````````` | |||
However, a following blank line is needed, except at the end of | However, a following blank line is needed, except at the end of | |||
a document, and except for blocks of types 1--5, above: | a document, and except for blocks of types 1--5, [above][HTML | |||
block]: | ||||
```````````````````````````````` example | ```````````````````````````````` example | |||
<div> | <div> | |||
bar | bar | |||
</div> | </div> | |||
*foo* | *foo* | |||
. | . | |||
<div> | <div> | |||
bar | bar | |||
</div> | </div> | |||
skipping to change at line 2602 ¶ | skipping to change at line 2634 ¶ | |||
<pre><code><td> | <pre><code><td> | |||
Hi | Hi | |||
</td> | </td> | |||
</code></pre> | </code></pre> | |||
</tr> | </tr> | |||
</table> | </table> | |||
```````````````````````````````` | ```````````````````````````````` | |||
Fortunately, blank lines are usually not necessary and can be | Fortunately, blank lines are usually not necessary and can be | |||
deleted. The exception is inside `<pre>` tags, but as described | deleted. The exception is inside `<pre>` tags, but as described | |||
above, raw HTML blocks starting with `<pre>` *can* contain blank | [above][HTML blocks], raw HTML blocks starting with `<pre>` | |||
lines. | *can* contain blank lines. | |||
## Link reference definitions | ## Link reference definitions | |||
A [link reference definition](@) | A [link reference definition](@) | |||
consists of a [link label], indented up to three spaces, followed | consists of a [link label], indented up to three spaces, followed | |||
by a colon (`:`), optional [whitespace] (including up to one | by a colon (`:`), optional [whitespace] (including up to one | |||
[line ending]), a [link destination], | [line ending]), a [link destination], | |||
optional [whitespace] (including up to one | optional [whitespace] (including up to one | |||
[line ending]), and an optional [link | [line ending]), and an optional [link | |||
title], which if it is present must be separated | title], which if it is present must be separated | |||
skipping to change at line 2652 ¶ | skipping to change at line 2684 ¶ | |||
```````````````````````````````` example | ```````````````````````````````` example | |||
[Foo*bar\]]:my_(url) 'title (with parens)' | [Foo*bar\]]:my_(url) 'title (with parens)' | |||
[Foo*bar\]] | [Foo*bar\]] | |||
. | . | |||
<p><a href="my_(url)" title="title (with parens)">Foo*bar]</a></p> | <p><a href="my_(url)" title="title (with parens)">Foo*bar]</a></p> | |||
```````````````````````````````` | ```````````````````````````````` | |||
```````````````````````````````` example | ```````````````````````````````` example | |||
[Foo bar]: | [Foo bar]: | |||
<my%20url> | <my url> | |||
'title' | 'title' | |||
[Foo bar] | [Foo bar] | |||
. | . | |||
<p><a href="my%20url" title="title">Foo bar</a></p> | <p><a href="my%20url" title="title">Foo bar</a></p> | |||
```````````````````````````````` | ```````````````````````````````` | |||
The title may extend over multiple lines: | The title may extend over multiple lines: | |||
```````````````````````````````` example | ```````````````````````````````` example | |||
skipping to change at line 2714 ¶ | skipping to change at line 2746 ¶ | |||
```````````````````````````````` example | ```````````````````````````````` example | |||
[foo]: | [foo]: | |||
[foo] | [foo] | |||
. | . | |||
<p>[foo]:</p> | <p>[foo]:</p> | |||
<p>[foo]</p> | <p>[foo]</p> | |||
```````````````````````````````` | ```````````````````````````````` | |||
However, an empty link destination may be specified using | ||||
angle brackets: | ||||
```````````````````````````````` example | ||||
[foo]: <> | ||||
[foo] | ||||
. | ||||
<p><a href="">foo</a></p> | ||||
```````````````````````````````` | ||||
The title must be separated from the link destination by | ||||
whitespace: | ||||
```````````````````````````````` example | ||||
[foo]: <bar>(baz) | ||||
[foo] | ||||
. | ||||
<p>[foo]: <bar>(baz)</p> | ||||
<p>[foo]</p> | ||||
```````````````````````````````` | ||||
Both title and destination can contain backslash escapes | Both title and destination can contain backslash escapes | |||
and literal backslashes: | and literal backslashes: | |||
```````````````````````````````` example | ```````````````````````````````` example | |||
[foo]: /url\bar\*baz "foo\"bar\baz" | [foo]: /url\bar\*baz "foo\"bar\baz" | |||
[foo] | [foo] | |||
. | . | |||
<p><a href="/url%5Cbar*baz" title="foo"bar\baz">foo</a></p> | <p><a href="/url%5Cbar*baz" title="foo"bar\baz">foo</a></p> | |||
```````````````````````````````` | ```````````````````````````````` | |||
skipping to change at line 2858 ¶ | skipping to change at line 2913 ¶ | |||
# [Foo] | # [Foo] | |||
[foo]: /url | [foo]: /url | |||
> bar | > bar | |||
. | . | |||
<h1><a href="/url">Foo</a></h1> | <h1><a href="/url">Foo</a></h1> | |||
<blockquote> | <blockquote> | |||
<p>bar</p> | <p>bar</p> | |||
</blockquote> | </blockquote> | |||
```````````````````````````````` | ```````````````````````````````` | |||
```````````````````````````````` example | ||||
[foo]: /url | ||||
bar | ||||
=== | ||||
[foo] | ||||
. | ||||
<h1>bar</h1> | ||||
<p><a href="/url">foo</a></p> | ||||
```````````````````````````````` | ||||
```````````````````````````````` example | ||||
[foo]: /url | ||||
=== | ||||
[foo] | ||||
. | ||||
<p>=== | ||||
<a href="/url">foo</a></p> | ||||
```````````````````````````````` | ||||
Several [link reference definitions] | Several [link reference definitions] | |||
can occur one after another, without intervening blank lines. | can occur one after another, without intervening blank lines. | |||
```````````````````````````````` example | ```````````````````````````````` example | |||
[foo]: /foo-url "foo" | [foo]: /foo-url "foo" | |||
[bar]: /bar-url | [bar]: /bar-url | |||
"bar" | "bar" | |||
[baz]: /baz-url | [baz]: /baz-url | |||
[foo], | [foo], | |||
skipping to change at line 2891 ¶ | skipping to change at line 2965 ¶ | |||
```````````````````````````````` example | ```````````````````````````````` example | |||
[foo] | [foo] | |||
> [foo]: /url | > [foo]: /url | |||
. | . | |||
<p><a href="/url">foo</a></p> | <p><a href="/url">foo</a></p> | |||
<blockquote> | <blockquote> | |||
</blockquote> | </blockquote> | |||
```````````````````````````````` | ```````````````````````````````` | |||
Whether something is a [link reference definition] is | ||||
independent of whether the link reference it defines is | ||||
used in the document. Thus, for example, the following | ||||
document contains just a link reference definition, and | ||||
no visible content: | ||||
```````````````````````````````` example | ||||
[foo]: /url | ||||
. | ||||
```````````````````````````````` | ||||
## Paragraphs | ## Paragraphs | |||
A sequence of non-blank lines that cannot be interpreted as other | A sequence of non-blank lines that cannot be interpreted as other | |||
kinds of blocks forms a [paragraph](@). | kinds of blocks forms a [paragraph](@). | |||
The contents of the paragraph are the result of parsing the | The contents of the paragraph are the result of parsing the | |||
paragraph's raw content as inlines. The paragraph's raw content | paragraph's raw content as inlines. The paragraph's raw content | |||
is formed by concatenating the lines and removing initial and final | is formed by concatenating the lines and removing initial and final | |||
[whitespace]. | [whitespace]. | |||
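The raw-content rule can be illustrated with a short sketch. It assumes the lines are supplied without their line endings and that [whitespace] means the six characters space, tab, newline, line tabulation, form feed, and carriage return; `raw_content` is a hypothetical helper name. The same rule is what the 0.29 column now states for setext headings.

```python
# The six whitespace characters assumed for this sketch.
WHITESPACE = " \t\n\x0b\x0c\r"


def raw_content(lines):
    """Concatenate the lines and strip initial and final whitespace."""
    return "\n".join(lines).strip(WHITESPACE)
```

For instance, `raw_content(["Foo *bar", "baz*\t"])` drops the trailing tab and keeps the interior line ending.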
A simple example with two paragraphs: | A simple example with two paragraphs: | |||
skipping to change at line 3013 ¶ | skipping to change at line 3098 ¶ | |||
# aaa | # aaa | |||
. | . | |||
<p>aaa</p> | <p>aaa</p> | |||
<h1>aaa</h1> | <h1>aaa</h1> | |||
```````````````````````````````` | ```````````````````````````````` | |||
# Container blocks | # Container blocks | |||
A [container block] is a block that has other | A [container block](#container-blocks) is a block that has other | |||
blocks as its contents. There are two basic kinds of container blocks: | blocks as its contents. There are two basic kinds of container blocks: | |||
[block quotes] and [list items]. | [block quotes] and [list items]. | |||
[Lists] are meta-containers for [list items]. | [Lists] are meta-containers for [list items]. | |||
We define the syntax for container blocks recursively. The general | We define the syntax for container blocks recursively. The general | |||
form of the definition is: | form of the definition is: | |||
> If X is a sequence of blocks, then the result of | > If X is a sequence of blocks, then the result of | |||
> transforming X in such-and-such a way is a container of type Y | > transforming X in such-and-such a way is a container of type Y | |||
> with these blocks as its content. | > with these blocks as its content. | |||
skipping to change at line 3449 ¶ | skipping to change at line 3534 ¶ | |||
An [ordered list marker](@) | An [ordered list marker](@) | |||
is a sequence of 1--9 arabic digits (`0-9`), followed by either a | is a sequence of 1--9 arabic digits (`0-9`), followed by either a | |||
`.` character or a `)` character. (The reason for the length | `.` character or a `)` character. (The reason for the length | |||
limit is that with 10 digits we start seeing integer overflows | limit is that with 10 digits we start seeing integer overflows | |||
in some browsers.) | in some browsers.) | |||
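The definition of an ordered list marker can be sketched as a regular expression. This is only an illustration, not part of the spec: the name `parse_ordered_list_marker` and the shape of its return value are assumptions made for the example.

```python
import re

# 1--9 ASCII digits followed by "." or ")", as in the definition above.
ORDERED_LIST_MARKER = re.compile(r"([0-9]{1,9})([.)])")


def parse_ordered_list_marker(text):
    """Return (start_number, delimiter, width), or None if `text`
    does not begin with an ordered list marker.

    Hypothetical helper for illustration only.
    """
    m = ORDERED_LIST_MARKER.match(text)
    if m is None:
        return None
    digits, delimiter = m.groups()
    # The width is the number of characters in the marker itself,
    # not counting the spaces that follow it.
    return int(digits), delimiter, len(m.group(0))
```

With this sketch, a ten-digit sequence is rejected rather than truncated, and a leading `-` prevents a match, which agrees with the rule that a start number may not be negative.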
The following rules define [list items]: | The following rules define [list items]: | |||
1. **Basic case.** If a sequence of lines *Ls* constitute a sequence of | 1. **Basic case.** If a sequence of lines *Ls* constitute a sequence of | |||
blocks *Bs* starting with a [non-whitespace character] and not separated | blocks *Bs* starting with a [non-whitespace character], and *M* is a | |||
from each other by more than one blank line, and *M* is a list | list marker of width *W* followed by 1 ≤ *N* ≤ 4 spaces, then the result | |||
marker of width *W* followed by 1 ≤ *N* ≤ 4 spaces, then the result | ||||
of prepending *M* and the following spaces to the first line of | of prepending *M* and the following spaces to the first line of | |||
*Ls*, and indenting subsequent lines of *Ls* by *W + N* spaces, is a | *Ls*, and indenting subsequent lines of *Ls* by *W + N* spaces, is a | |||
list item with *Bs* as its contents. The type of the list item | list item with *Bs* as its contents. The type of the list item | |||
(bullet or ordered) is determined by the type of its list marker. | (bullet or ordered) is determined by the type of its list marker. | |||
If the list item is ordered, then it is also assigned a start | If the list item is ordered, then it is also assigned a start | |||
number, based on the ordered list marker. | number, based on the ordered list marker. | |||
Exceptions: | Exceptions: | |||
1. When the first list item in a [list] interrupts | 1. When the first list item in a [list] interrupts | |||
skipping to change at line 3741 ¶ | skipping to change at line 3825 ¶ | |||
A start number may not be negative: | A start number may not be negative: | |||
```````````````````````````````` example | ```````````````````````````````` example | |||
-1. not ok | -1. not ok | |||
. | . | |||
<p>-1. not ok</p> | <p>-1. not ok</p> | |||
```````````````````````````````` | ```````````````````````````````` | |||
2. **Item starting with indented code.** If a sequence of lines *Ls* | 2. **Item starting with indented code.** If a sequence of lines *Ls* | |||
constitute a sequence of blocks *Bs* starting with an indented code | constitute a sequence of blocks *Bs* starting with an indented code | |||
block and not separated from each other by more than one blank line, | block, and *M* is a list marker of width *W* followed by | |||
and *M* is a list marker of width *W* followed by | ||||
one space, then the result of prepending *M* and the following | one space, then the result of prepending *M* and the following | |||
space to the first line of *Ls*, and indenting subsequent lines of | space to the first line of *Ls*, and indenting subsequent lines of | |||
*Ls* by *W + 1* spaces, is a list item with *Bs* as its contents. | *Ls* by *W + 1* spaces, is a list item with *Bs* as its contents. | |||
If a line is empty, then it need not be indented. The type of the | If a line is empty, then it need not be indented. The type of the | |||
list item (bullet or ordered) is determined by the type of its list | list item (bullet or ordered) is determined by the type of its list | |||
marker. If the list item is ordered, then it is also assigned a | marker. If the list item is ordered, then it is also assigned a | |||
start number, based on the ordered list marker. | start number, based on the ordered list marker. | |||
An indented code block will have to be indented four spaces beyond | An indented code block will have to be indented four spaces beyond | |||
the edge of the region where text will be included in the list item. | the edge of the region where text will be included in the list item. | |||
skipping to change at line 4194 ¶ | skipping to change at line 4277 ¶ | |||
continued here.</p> | continued here.</p> | |||
</blockquote> | </blockquote> | |||
</li> | </li> | |||
</ol> | </ol> | |||
</blockquote> | </blockquote> | |||
```````````````````````````````` | ```````````````````````````````` | |||
6. **That's all.** Nothing that is not counted as a list item by rules | 6. **That's all.** Nothing that is not counted as a list item by rules | |||
#1--5 counts as a [list item](#list-items). | #1--5 counts as a [list item](#list-items). | |||
The rules for sublists follow from the general rules above. A sublist | The rules for sublists follow from the general rules | |||
must be indented the same number of spaces a paragraph would need to be | [above][List items]. A sublist must be indented the same number | |||
in order to be included in the list item. | of spaces a paragraph would need to be in order to be included | |||
in the list item. | ||||
So, in this case we need an indent of two spaces: | So, in this case we need an indent of two spaces: | |||
```````````````````````````````` example | ```````````````````````````````` example | |||
- foo | - foo | |||
- bar | - bar | |||
- baz | - baz | |||
- boo | - boo | |||
. | . | |||
<ul> | <ul> | |||
skipping to change at line 4771 ¶ | skipping to change at line 4855 ¶ | |||
List items need not be indented to the same level. The following | List items need not be indented to the same level. The following | |||
list items will be treated as items at the same list level, | list items will be treated as items at the same list level, | |||
since none is indented enough to belong to the previous list | since none is indented enough to belong to the previous list | |||
item: | item: | |||
```````````````````````````````` example | ```````````````````````````````` example | |||
- a | - a | |||
- b | - b | |||
- c | - c | |||
- d | - d | |||
- e | - e | |||
- f | - f | |||
- g | - g | |||
- h | ||||
- i | ||||
. | . | |||
<ul> | <ul> | |||
<li>a</li> | <li>a</li> | |||
<li>b</li> | <li>b</li> | |||
<li>c</li> | <li>c</li> | |||
<li>d</li> | <li>d</li> | |||
<li>e</li> | <li>e</li> | |||
<li>f</li> | <li>f</li> | |||
<li>g</li> | <li>g</li> | |||
<li>h</li> | ||||
<li>i</li> | ||||
</ul> | </ul> | |||
```````````````````````````````` | ```````````````````````````````` | |||
```````````````````````````````` example | ```````````````````````````````` example | |||
1. a | 1. a | |||
2. b | 2. b | |||
3. c | 3. c | |||
. | . | |||
<ol> | <ol> | |||
<li> | <li> | |||
<p>a</p> | <p>a</p> | |||
</li> | </li> | |||
<li> | <li> | |||
<p>b</p> | <p>b</p> | |||
</li> | </li> | |||
<li> | <li> | |||
<p>c</p> | <p>c</p> | |||
</li> | </li> | |||
</ol> | </ol> | |||
```````````````````````````````` | ```````````````````````````````` | |||
Note, however, that list items may not be indented more than | ||||
three spaces. Here `- e` is treated as a paragraph continuation | ||||
line, because it is indented more than three spaces: | ||||
```````````````````````````````` example | ||||
- a | ||||
- b | ||||
- c | ||||
- d | ||||
- e | ||||
. | ||||
<ul> | ||||
<li>a</li> | ||||
<li>b</li> | ||||
<li>c</li> | ||||
<li>d | ||||
- e</li> | ||||
</ul> | ||||
```````````````````````````````` | ||||
And here, `3. c` is treated as an indented code block, | ||||
because it is indented four spaces and preceded by a | ||||
blank line. | ||||
```````````````````````````````` example | ||||
1. a | ||||
2. b | ||||
3. c | ||||
. | ||||
<ol> | ||||
<li> | ||||
<p>a</p> | ||||
</li> | ||||
<li> | ||||
<p>b</p> | ||||
</li> | ||||
</ol> | ||||
<pre><code>3. c | ||||
</code></pre> | ||||
```````````````````````````````` | ||||
This is a loose list, because there is a blank line between | This is a loose list, because there is a blank line between | |||
two of the list items: | two of the list items: | |||
```````````````````````````````` example | ```````````````````````````````` example | |||
- a | - a | |||
- b | - b | |||
- c | - c | |||
. | . | |||
<ul> | <ul> | |||
skipping to change at line 5117 ¶ | skipping to change at line 5240 ¶ | |||
```````````````````````````````` example | ```````````````````````````````` example | |||
\*not emphasized* | \*not emphasized* | |||
\<br/> not a tag | \<br/> not a tag | |||
\[not a link](/foo) | \[not a link](/foo) | |||
\`not code` | \`not code` | |||
1\. not a list | 1\. not a list | |||
\* not a list | \* not a list | |||
\# not a heading | \# not a heading | |||
\[foo]: /url "not a reference" | \[foo]: /url "not a reference" | |||
\&ouml; not a character entity | ||||
. | . | |||
<p>*not emphasized* | <p>*not emphasized* | |||
&lt;br/&gt; not a tag | &lt;br/&gt; not a tag | |||
[not a link](/foo) | [not a link](/foo) | |||
`not code` | `not code` | |||
1. not a list | 1. not a list | |||
* not a list | * not a list | |||
# not a heading | # not a heading | |||
[foo]: /url &quot;not a reference&quot;</p> | [foo]: /url &quot;not a reference&quot; | |||
&amp;ouml; not a character entity</p> | ||||
```````````````````````````````` | ```````````````````````````````` | |||
If a backslash is itself escaped, the following character is not: | If a backslash is itself escaped, the following character is not: | |||
```````````````````````````````` example | ```````````````````````````````` example | |||
\\*emphasis* | \\*emphasis* | |||
. | . | |||
<p>\<em>emphasis</em></p> | <p>\<em>emphasis</em></p> | |||
```````````````````````````````` | ```````````````````````````````` | |||
skipping to change at line 5211 ¶ | skipping to change at line 5336 ¶ | |||
``` foo\+bar | ``` foo\+bar | |||
foo | foo | |||
``` | ``` | |||
. | . | |||
<pre><code class="language-foo+bar">foo | <pre><code class="language-foo+bar">foo | |||
</code></pre> | </code></pre> | |||
```````````````````````````````` | ```````````````````````````````` | |||
## Entity and numeric character references | ## Entity and numeric character references | |||
All valid HTML entity references and numeric character | Valid HTML entity references and numeric character references | |||
references, except those occurring in code blocks and code spans, | can be used in place of the corresponding Unicode character, | |||
are recognized as such and treated as equivalent to the | with the following exceptions: | |||
corresponding Unicode characters. Conforming CommonMark parsers | ||||
need not store information about whether a particular character | - Entity and character references are not recognized in code | |||
was represented in the source using a Unicode character or | blocks and code spans. | |||
an entity reference. | ||||
- Entity and character references cannot stand in place of | ||||
special characters that define structural elements in | ||||
CommonMark. For example, although `&#42;` can be used | ||||
in place of a literal `*` character, `&#42;` cannot replace | ||||
`*` in emphasis delimiters, bullet list markers, or thematic | ||||
breaks. | ||||
Conforming CommonMark parsers need not store information about | ||||
whether a particular character was represented in the source | ||||
using a Unicode character or an entity reference. | ||||
[Entity references](@) consist of `&` + any of the valid | [Entity references](@) consist of `&` + any of the valid | |||
HTML5 entity names + `;`. The | HTML5 entity names + `;`. The | |||
document <https://html.spec.whatwg.org/multipage/entities.json> | document <https://html.spec.whatwg.org/multipage/entities.json> | |||
is used as an authoritative source for the valid entity | is used as an authoritative source for the valid entity | |||
references and their corresponding code points. | references and their corresponding code points. | |||
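```````````````````````````````` example | ```````````````````````````````` example | |||
&nbsp; &amp; &copy; &AElig; &Dcaron; | &nbsp; &amp; &copy; &AElig; &Dcaron; | |||
&frac34; &HilbertSpace; &DifferentialD; | &frac34; &HilbertSpace; &DifferentialD; | |||
&ClockwiseContourIntegral; &ngE; | &ClockwiseContourIntegral; &ngE; | |||
. | . | |||
<p>  &amp; © Æ Ď | <p>  &amp; © Æ Ď | |||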
¾ ℋ ⅆ | ¾ ℋ ⅆ | |||
∲ ≧̸</p> | ∲ ≧̸</p> | |||
```````````````````````````````` | ```````````````````````````````` | |||
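The lookup described above can be sketched with Python's standard `html.entities.html5` table, which maps HTML5 entity names (including the trailing `;`) to their characters. The function name `replace_entity_refs` and the simplified name pattern are assumptions made for this illustration, and the sketch ignores the exceptions for code blocks and code spans.

```python
import re
from html.entities import html5  # e.g. html5["amp;"] == "&"

# "&" + a candidate name + ";". Requiring the ";" here means that
# references without a trailing semicolon are never recognized.
ENTITY_REF = re.compile(r"&([A-Za-z][A-Za-z0-9]*;)")


def replace_entity_refs(text):
    """Replace valid entity references in `text` with their characters.

    Hypothetical helper: unknown names are left untouched.
    """
    def repl(match):
        return html5.get(match.group(1), match.group(0))

    return ENTITY_REF.sub(repl, text)
```

Note that the table in Python's standard library also contains a few legacy names without a semicolon (such as `copy`); the pattern above never consults them, so `&copy` is left as literal text.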
[Decimal numeric character | [Decimal numeric character | |||
references](@) | references](@) | |||
consist of `&#` + a string of 1--8 arabic digits + `;`. A | consist of `&#` + a string of 1--7 arabic digits + `;`. A | |||
numeric character reference is parsed as the corresponding | numeric character reference is parsed as the corresponding | |||
Unicode character. Invalid Unicode code points will be replaced by | Unicode character. Invalid Unicode code points will be replaced by | |||
the REPLACEMENT CHARACTER (`U+FFFD`). For security reasons, | the REPLACEMENT CHARACTER (`U+FFFD`). For security reasons, | |||
the code point `U+0000` will also be replaced by `U+FFFD`. | the code point `U+0000` will also be replaced by `U+FFFD`. | |||
```````````````````````````````` example | ```````````````````````````````` example | |||
&#35; &#1234; &#992; &#98765432; &#0; | &#35; &#1234; &#992; &#0; | |||
. | . | |||
<p># Ӓ Ϡ � �</p> | <p># Ӓ Ϡ �</p> | |||
```````````````````````````````` | ```````````````````````````````` | |||
[Hexadecimal numeric character | [Hexadecimal numeric character | |||
references](@) consist of `&#` + | references](@) consist of `&#` + | |||
either `X` or `x` + a string of 1-8 hexadecimal digits + `;`. | either `X` or `x` + a string of 1-6 hexadecimal digits + `;`. | |||
They too are parsed as the corresponding Unicode character (this | They too are parsed as the corresponding Unicode character (this | |||
time specified with a hexadecimal numeral instead of decimal). | time specified with a hexadecimal numeral instead of decimal). | |||
```````````````````````````````` example | ```````````````````````````````` example | |||
&#X22; &#XD06; &#xcab; | &#X22; &#XD06; &#xcab; | |||
. | . | |||
<p>&quot; ആ ಫ</p> | <p>&quot; ആ ಫ</p> | |||
```````````````````````````````` | ```````````````````````````````` | |||
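The two kinds of numeric character reference can be sketched together. This is an illustration under stated assumptions, not a reference implementation: it uses the newer digit limits (1--7 decimal digits, 1--6 hexadecimal digits), the names `NUMERIC_REF` and `replace_numeric_refs` are made up for the example, and treating surrogate code points as invalid is an assumption about what "invalid Unicode code points" covers.

```python
import re

# "&#" + 1--7 decimal digits + ";", or
# "&#" + "X" or "x" + 1--6 hexadecimal digits + ";".
NUMERIC_REF = re.compile(r"&#(?:([0-9]{1,7})|[Xx]([0-9A-Fa-f]{1,6}));")
REPLACEMENT_CHARACTER = "\ufffd"


def _decode(match):
    decimal, hexadecimal = match.groups()
    code = int(decimal) if decimal is not None else int(hexadecimal, 16)
    # U+0000 is replaced for security reasons; values beyond U+10FFFF
    # and (by assumption) surrogates are treated as invalid code points.
    if code == 0 or code > 0x10FFFF or 0xD800 <= code <= 0xDFFF:
        return REPLACEMENT_CHARACTER
    return chr(code)


def replace_numeric_refs(text):
    """Replace numeric character references in `text` (hypothetical helper)."""
    return NUMERIC_REF.sub(_decode, text)
```

Sequences that exceed the digit limits do not match the pattern at all, so they pass through as literal text instead of being replaced by `U+FFFD`.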
Here are some nonentities: | Here are some nonentities: | |||
```````````````````````````````` example | ```````````````````````````````` example | |||
&nbsp &x; &#; &#x; | &nbsp &x; &#; &#x; | |||
&#987654321; | ||||
&#abcdef0; | ||||
&ThisIsNotDefined; &hi?; | &ThisIsNotDefined; &hi?; | |||
. | . | |||
<p>&amp;nbsp &amp;x; &amp;#; &amp;#x; | <p>&amp;nbsp &amp;x; &amp;#; &amp;#x; | |||
&amp;#987654321; | ||||
&amp;#abcdef0; | ||||
&amp;ThisIsNotDefined; &amp;hi?;</p> | &amp;ThisIsNotDefined; &amp;hi?;</p> | |||
```````````````````````````````` | ```````````````````````````````` | |||
Although HTML5 does accept some entity references | Although HTML5 does accept some entity references | |||
without a trailing semicolon (such as `&copy`), these are not | without a trailing semicolon (such as `&copy`), these are not | |||
recognized here, because it makes the grammar too ambiguous: | recognized here, because it makes the grammar too ambiguous: | |||
```````````````````````````````` example | ```````````````````````````````` example | |||
&copy | &copy | |||
. | . | |||
skipping to change at line 5339 ¶ | skipping to change at line 5478 ¶ | |||
<p><code>f&amp;ouml;&amp;ouml;</code></p> | <p><code>f&amp;ouml;&amp;ouml;</code></p> | |||
```````````````````````````````` | ```````````````````````````````` | |||
```````````````````````````````` example | ```````````````````````````````` example | |||
f&ouml;f&ouml; | f&ouml;f&ouml; | |||
. | . | |||
<pre><code>f&amp;ouml;f&amp;ouml; | <pre><code>f&amp;ouml;f&amp;ouml; | |||
</code></pre> | </code></pre> | |||
```````````````````````````````` | ```````````````````````````````` | |||
Entity and numeric character references cannot be used | ||||
in place of symbols indicating structure in CommonMark | ||||
documents. | ||||
```````````````````````````````` example | ||||
&#42;foo&#42; | ||||
*foo* | ||||
. | ||||
<p>*foo* | ||||
<em>foo</em></p> | ||||
```````````````````````````````` | ||||
```````````````````````````````` example | ||||
&#42; foo | ||||
* foo | ||||
. | ||||
<p>* foo</p> | ||||
<ul> | ||||
<li>foo</li> | ||||
</ul> | ||||
```````````````````````````````` | ||||
```````````````````````````````` example | ||||
foo&#10;&#10;bar | ||||
. | ||||
<p>foo | ||||
bar</p> | ||||
```````````````````````````````` | ||||
```````````````````````````````` example | ||||
&#9;foo | ||||
. | ||||
<p>→foo</p> | ||||
```````````````````````````````` | ||||
```````````````````````````````` example | ||||
[a](url &quot;tit&quot;) | ||||
. | ||||
<p>[a](url &quot;tit&quot;)</p> | ||||
```````````````````````````````` | ||||
## Code spans | ## Code spans | |||
A [backtick string](@) | A [backtick string](@) | |||
is a string of one or more backtick characters (`` ` ``) that is neither | is a string of one or more backtick characters (`` ` ``) that is neither | |||
preceded nor followed by a backtick. | preceded nor followed by a backtick. | |||
A [code span](@) begins with a backtick string and ends with | A [code span](@) begins with a backtick string and ends with | |||
a backtick string of equal length. The contents of the code span are | a backtick string of equal length. The contents of the code span are | |||
the characters between the two backtick strings, with leading and | the characters between the two backtick strings, normalized in the | |||
trailing spaces and [line endings] removed, and | following ways: | |||
[whitespace] collapsed to single spaces. | ||||
- First, [line endings] are converted to [spaces]. | ||||
- If the resulting string both begins *and* ends with a [space] | ||||
character, but does not consist entirely of [space] | ||||
characters, a single [space] character is removed from the | ||||
front and back. This allows you to include code that begins | ||||
or ends with backtick characters, which must be separated by | ||||
whitespace from the opening or closing backtick strings. | ||||
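The two normalization steps in the newer wording can be sketched as follows. This is a minimal illustration, assuming the raw text between the two backtick strings has already been extracted; the name `normalize_code_span` is made up for the example.

```python
def normalize_code_span(raw):
    """Normalize the raw contents of a code span (hypothetical helper).

    `raw` is the text between the opening and closing backtick strings.
    """
    # First, line endings are converted to spaces.
    text = raw.replace("\r\n", " ").replace("\r", " ").replace("\n", " ")
    # Then a single space is removed from the front and the back, but
    # only if the string both begins and ends with a space and does
    # not consist entirely of spaces.
    if (
        len(text) >= 2
        and text[0] == " "
        and text[-1] == " "
        and text.strip(" ") != ""
    ):
        text = text[1:-1]
    return text
```

Only the ASCII space is stripped here; other Unicode whitespace, such as a non-breaking space, is left in place, and interior runs of spaces are not collapsed.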
This is a simple code span: | This is a simple code span: | |||
```````````````````````````````` example | ```````````````````````````````` example | |||
`foo` | `foo` | |||
. | . | |||
<p><code>foo</code></p> | <p><code>foo</code></p> | |||
```````````````````````````````` | ```````````````````````````````` | |||
Here two backticks are used, because the code contains a backtick. | Here two backticks are used, because the code contains a backtick. | |||
This example also illustrates stripping of leading and trailing spaces: | This example also illustrates stripping of a single leading and | |||
trailing space: | ||||
```````````````````````````````` example | ```````````````````````````````` example | |||
`` foo ` bar `` | `` foo ` bar `` | |||
. | . | |||
<p><code>foo ` bar</code></p> | <p><code>foo ` bar</code></p> | |||
```````````````````````````````` | ```````````````````````````````` | |||
This example shows the motivation for stripping leading and trailing | This example shows the motivation for stripping leading and trailing | |||
spaces: | spaces: | |||
```````````````````````````````` example | ```````````````````````````````` example | |||
` `` ` | ` `` ` | |||
. | . | |||
<p><code>``</code></p> | <p><code>``</code></p> | |||
```````````````````````````````` | ```````````````````````````````` | |||
[Line endings] are treated like spaces: | Note that only *one* space is stripped: | |||
```````````````````````````````` example | ```````````````````````````````` example | |||
`` | ` `` ` | |||
foo | ||||
`` | ||||
. | . | |||
<p><code>foo</code></p> | <p><code> `` </code></p> | |||
```````````````````````````````` | ```````````````````````````````` | |||
Interior spaces and [line endings] are collapsed into | The stripping only happens if the space is on both | |||
single spaces, just as they would be by a browser: | sides of the string: | |||
```````````````````````````````` example | ```````````````````````````````` example | |||
`foo bar | ` a` | |||
baz` | ||||
. | . | |||
<p><code>foo bar baz</code></p> | <p><code> a</code></p> | |||
```````````````````````````````` | ```````````````````````````````` | |||
Not all [Unicode whitespace] (for instance, non-breaking space) is | Only [spaces], and not [unicode whitespace] in general, are | |||
collapsed, however: | stripped in this way: | |||
```````````````````````````````` example | ```````````````````````````````` example | |||
`a b` | ` b ` | |||
. | . | |||
<p><code>a b</code></p> | <p><code> b </code></p> | |||
```````````````````````````````` | ```````````````````````````````` | |||
Q: Why not just leave the spaces, since browsers will collapse them | No stripping occurs if the code span contains only spaces: | |||
anyway? A: Because we might be targeting a non-HTML format, and we | ||||
shouldn't rely on HTML-specific rendering assumptions. | ||||
(Existing implementations differ in their treatment of internal | ```````````````````````````````` example | |||
spaces and [line endings]. Some, including `Markdown.pl` and | ` ` | |||
`showdown`, convert an internal [line ending] into a | ` ` | |||
`<br />` tag. But this makes things difficult for those who like to | . | |||
hard-wrap their paragraphs, since a line break in the midst of a code | <p><code> </code> | |||
span will cause an unintended line break in the output. Others just | <code> </code></p> | |||
leave internal spaces as they are, which is fine if only HTML is being | ```````````````````````````````` | |||
targeted.) | ||||
[Line endings] are treated like spaces: | ||||
```````````````````````````````` example | ```````````````````````````````` example | |||
`foo `` bar` | `` | |||
foo | ||||
bar | ||||
baz | ||||
`` | ||||
. | . | |||
<p><code>foo `` bar</code></p> | <p><code>foo bar baz</code></p> | |||
```````````````````````````````` | ```````````````````````````````` | |||
```````````````````````````````` example | ||||
`` | ||||
foo | ||||
`` | ||||
. | ||||
<p><code>foo </code></p> | ||||
```````````````````````````````` | ||||
Interior spaces are not collapsed: | ||||
```````````````````````````````` example | ||||
`foo bar | ||||
baz` | ||||
. | ||||
<p><code>foo bar baz</code></p> | ||||
```````````````````````````````` | ||||
Note that browsers will typically collapse consecutive spaces | ||||
when rendering `<code>` elements, so it is recommended that | ||||
the following CSS be used: | ||||
code{white-space: pre-wrap;} | ||||
Note that backslash escapes do not work in code spans. All backslashes | Note that backslash escapes do not work in code spans. All backslashes | |||
are treated literally: | are treated literally: | |||
```````````````````````````````` example | ```````````````````````````````` example | |||
`foo\`bar` | `foo\`bar` | |||
. | . | |||
<p><code>foo\</code>bar`</p> | <p><code>foo\</code>bar`</p> | |||
```````````````````````````````` | ```````````````````````````````` | |||
Backslash escapes are never needed, because one can always choose a | Backslash escapes are never needed, because one can always choose a | |||
string of *n* backtick characters as delimiters, where the code does | string of *n* backtick characters as delimiters, where the code does | |||
not contain any strings of exactly *n* backtick characters. | not contain any strings of exactly *n* backtick characters. | |||
```````````````````````````````` example | ||||
``foo`bar`` | ||||
. | ||||
<p><code>foo`bar</code></p> | ||||
```````````````````````````````` | ||||
```````````````````````````````` example | ||||
` foo `` bar ` | ||||
. | ||||
<p><code>foo `` bar</code></p> | ||||
```````````````````````````````` | ||||
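The claim that a suitable delimiter can always be found can be illustrated from the writer's side. The sketch below picks the shortest backtick string whose length does not occur as a run inside the code. It assumes non-empty code without line endings and the newer single-space stripping rule; the name `wrap_in_code_span` is made up for the example.

```python
import re


def wrap_in_code_span(code):
    """Wrap `code` in a code span without backslash escapes.

    Hypothetical helper; assumes `code` is non-empty and contains
    no line endings.
    """
    # Lengths of all maximal runs of backticks inside the code.
    run_lengths = {len(run) for run in re.findall(r"`+", code)}
    n = 1
    while n in run_lengths:
        n += 1
    fence = "`" * n
    # One space of padding keeps a leading or trailing backtick apart
    # from the delimiters. It is also needed when the code itself both
    # begins and ends with a space, since one space would otherwise be
    # stripped from each side.
    needs_padding = (
        code.startswith("`")
        or code.endswith("`")
        or (code.startswith(" ") and code.endswith(" ") and code.strip(" ") != "")
    )
    pad = " " if needs_padding else ""
    return f"{fence}{pad}{code}{pad}{fence}"
```

Because the set of run lengths in any finite string is finite, the loop always terminates with a usable delimiter length.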
Code span backticks have higher precedence than any other inline | Code span backticks have higher precedence than any other inline | |||
constructs except HTML tags and autolinks. Thus, for example, this is | constructs except HTML tags and autolinks. Thus, for example, this is | |||
not parsed as emphasized text, since the second `*` is part of a code | not parsed as emphasized text, since the second `*` is part of a code | |||
span: | span: | |||
```````````````````````````````` example | ```````````````````````````````` example | |||
*foo`*` | *foo`*` | |||
. | . | |||
<p>*foo<code>*</code></p> | <p>*foo<code>*</code></p> | |||
```````````````````````````````` | ```````````````````````````````` | |||
skipping to change at line 5567 ¶ | skipping to change at line 5792 ¶ | |||
The rules given below capture all of these patterns, while allowing | The rules given below capture all of these patterns, while allowing | |||
for efficient parsing strategies that do not backtrack. | for efficient parsing strategies that do not backtrack. | |||
First, some definitions. A [delimiter run](@) is either | First, some definitions. A [delimiter run](@) is either | |||
a sequence of one or more `*` characters that is not preceded or | a sequence of one or more `*` characters that is not preceded or | |||
followed by a non-backslash-escaped `*` character, or a sequence | followed by a non-backslash-escaped `*` character, or a sequence | |||
of one or more `_` characters that is not preceded or followed by | of one or more `_` characters that is not preceded or followed by | |||
a non-backslash-escaped `_` character. | a non-backslash-escaped `_` character. | |||
A [left-flanking delimiter run](@) is | A [left-flanking delimiter run](@) is | |||
a [delimiter run] that is (a) not followed by [Unicode whitespace], | a [delimiter run] that is (1) not followed by [Unicode whitespace], | |||
and (b) not followed by a [punctuation character], or | and either (2a) not followed by a [punctuation character], or | |||
(2b) followed by a [punctuation character] and | ||||
preceded by [Unicode whitespace] or a [punctuation character]. | preceded by [Unicode whitespace] or a [punctuation character]. | |||
For purposes of this definition, the beginning and the end of | For purposes of this definition, the beginning and the end of | |||
the line count as Unicode whitespace. | the line count as Unicode whitespace. | |||
A [right-flanking delimiter run](@) is | A [right-flanking delimiter run](@) is | |||
a [delimiter run] that is (a) not preceded by [Unicode whitespace], | a [delimiter run] that is (1) not preceded by [Unicode whitespace], | |||
and (b) not preceded by a [punctuation character], or | and either (2a) not preceded by a [punctuation character], or | |||
(2b) preceded by a [punctuation character] and | ||||
followed by [Unicode whitespace] or a [punctuation character]. | followed by [Unicode whitespace] or a [punctuation character]. | |||
For purposes of this definition, the beginning and the end of | For purposes of this definition, the beginning and the end of | |||
the line count as Unicode whitespace. | the line count as Unicode whitespace. | |||
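The two definitions can be read as a pair of boolean tests on the characters immediately before and after a delimiter run. The sketch below is an illustration only. The helper names are made up, and the character classes are assumptions, since their definitions lie outside this section: Unicode whitespace is taken to be the `Zs` category plus tab, line feed, form feed, and carriage return, and a punctuation character is taken to be ASCII punctuation or any character in a Unicode `P*` category.

```python
import string
import unicodedata


def is_unicode_whitespace(ch):
    # Assumed definition: Zs category, or tab, LF, FF, CR.
    return ch in "\t\n\f\r" or unicodedata.category(ch) == "Zs"


def is_punctuation(ch):
    # Assumed definition: ASCII punctuation or Unicode category P*.
    return ch in string.punctuation or unicodedata.category(ch).startswith("P")


def flanking(line, start, end):
    """Classify the delimiter run line[start:end].

    Returns (left_flanking, right_flanking). Hypothetical helper.
    """
    # The beginning and the end of the line count as Unicode whitespace.
    before = line[start - 1] if start > 0 else " "
    after = line[end] if end < len(line) else " "

    left = not is_unicode_whitespace(after) and (
        not is_punctuation(after)
        or is_unicode_whitespace(before)
        or is_punctuation(before)
    )
    right = not is_unicode_whitespace(before) and (
        not is_punctuation(before)
        or is_unicode_whitespace(after)
        or is_punctuation(after)
    )
    return left, right
```

A run such as the `***` in `abc***def` satisfies both tests, which is why a single delimiter run can be both left- and right-flanking.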
Here are some examples of delimiter runs. | Here are some examples of delimiter runs. | |||
- left-flanking but not right-flanking: | - left-flanking but not right-flanking: | |||
``` | ``` | |||
***abc | ***abc | |||
skipping to change at line 5667 ¶ | skipping to change at line 5894 ¶ | |||
or (b) part of a [left-flanking delimiter run] | or (b) part of a [left-flanking delimiter run] | |||
followed by punctuation. | followed by punctuation. | |||
9. Emphasis begins with a delimiter that [can open emphasis] and ends | 9. Emphasis begins with a delimiter that [can open emphasis] and ends | |||
with a delimiter that [can close emphasis], and that uses the same | with a delimiter that [can close emphasis], and that uses the same | |||
character (`_` or `*`) as the opening delimiter. The | character (`_` or `*`) as the opening delimiter. The | |||
opening and closing delimiters must belong to separate | opening and closing delimiters must belong to separate | |||
[delimiter runs]. If one of the delimiters can both | [delimiter runs]. If one of the delimiters can both | |||
open and close emphasis, then the sum of the lengths of the | open and close emphasis, then the sum of the lengths of the | |||
delimiter runs containing the opening and closing delimiters | delimiter runs containing the opening and closing delimiters | |||
must not be a multiple of 3. | must not be a multiple of 3 unless both lengths are | |||
multiples of 3. | ||||
10. Strong emphasis begins with a delimiter that | 10. Strong emphasis begins with a delimiter that | |||
[can open strong emphasis] and ends with a delimiter that | [can open strong emphasis] and ends with a delimiter that | |||
[can close strong emphasis], and that uses the same character | [can close strong emphasis], and that uses the same character | |||
(`_` or `*`) as the opening delimiter. The | (`_` or `*`) as the opening delimiter. The | |||
opening and closing delimiters must belong to separate | opening and closing delimiters must belong to separate | |||
[delimiter runs]. If one of the delimiters can both open | [delimiter runs]. If one of the delimiters can both open | |||
and close strong emphasis, then the sum of the lengths of | and close strong emphasis, then the sum of the lengths of | |||
the delimiter runs containing the opening and closing | the delimiter runs containing the opening and closing | |||
delimiters must not be a multiple of 3. | delimiters must not be a multiple of 3 unless both lengths | |||
are multiples of 3. | ||||
11. A literal `*` character cannot occur at the beginning or end of | 11. A literal `*` character cannot occur at the beginning or end of | |||
`*`-delimited emphasis or `**`-delimited strong emphasis, unless it | `*`-delimited emphasis or `**`-delimited strong emphasis, unless it | |||
is backslash-escaped. | is backslash-escaped. | |||
12. A literal `_` character cannot occur at the beginning or end of | 12. A literal `_` character cannot occur at the beginning or end of | |||
`_`-delimited emphasis or `__`-delimited strong emphasis, unless it | `_`-delimited emphasis or `__`-delimited strong emphasis, unless it | |||
is backslash-escaped. | is backslash-escaped. | |||
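The length restriction in rules 9 and 10, in its newer form with the "unless both lengths are multiples of 3" clause, can be sketched as a single predicate. This is an illustration only: the function name and parameters are made up, and the sketch assumes the caller has already established that the opener can open and the closer can close.

```python
def lengths_allow_match(opener_len, closer_len, opener_can_close, closer_can_open):
    """Check the 'multiple of 3' restriction (hypothetical helper).

    `opener_len` and `closer_len` are the lengths of the delimiter runs
    containing the opening and closing delimiters.
    """
    # The restriction only applies if one of the delimiters can both
    # open and close emphasis.
    if not (opener_can_close or closer_can_open):
        return True
    if (opener_len + closer_len) % 3 != 0:
        return True
    # Newer wording: a sum that is a multiple of 3 is still acceptable
    # when both lengths are themselves multiples of 3.
    return opener_len % 3 == 0 and closer_len % 3 == 0
```

For example, a run of length 1 cannot be closed by a run of length 2 that can both open and close, whereas two runs of length 3 can match each other.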
Where rules 1--12 above are compatible with multiple parsings, | Where rules 1--12 above are compatible with multiple parsings, | |||
skipping to change at line 6234 ¶ | skipping to change at line 6463 ¶ | |||
Note that in the preceding case, the interpretation | Note that in the preceding case, the interpretation | |||
``` markdown | ``` markdown | |||
<p><em>foo</em><em>bar<em></em>baz</em></p> | <p><em>foo</em><em>bar<em></em>baz</em></p> | |||
``` | ``` | |||
is precluded by the condition that a delimiter that | is precluded by the condition that a delimiter that | |||
can both open and close (like the `*` after `foo`) | can both open and close (like the `*` after `foo`) | |||
cannot form emphasis if the sum of the lengths of | cannot form emphasis if the sum of the lengths of | |||
the delimiter runs containing the opening and | the delimiter runs containing the opening and | |||
closing delimiters is a multiple of 3. | closing delimiters is a multiple of 3 unless | |||
both lengths are multiples of 3. | ||||
For the same reason, we don't get two consecutive | ||||
emphasis sections in this example: | ||||
```````````````````````````````` example | ||||
*foo**bar* | ||||
. | ||||
<p><em>foo**bar</em></p> | ||||
```````````````````````````````` | ||||
The same condition ensures that the following | The same condition ensures that the following | |||
cases are all strong emphasis nested inside | cases are all strong emphasis nested inside | |||
emphasis, even when the interior spaces are | emphasis, even when the interior spaces are | |||
omitted: | omitted: | |||
```````````````````````````````` example | ```````````````````````````````` example | |||
***foo** bar* | ***foo** bar* | |||
. | . | |||
<p><em><strong>foo</strong> bar</em></p> | <p><em><strong>foo</strong> bar</em></p> | |||
skipping to change at line 6259 ¶ | skipping to change at line 6498 ¶ | |||
. | . | |||
<p><em>foo <strong>bar</strong></em></p> | <p><em>foo <strong>bar</strong></em></p> | |||
```````````````````````````````` | ```````````````````````````````` | |||
```````````````````````````````` example | ```````````````````````````````` example | |||
*foo**bar*** | *foo**bar*** | |||
. | . | |||
<p><em>foo<strong>bar</strong></em></p> | <p><em>foo<strong>bar</strong></em></p> | |||
```````````````````````````````` | ```````````````````````````````` | |||
When the lengths of the interior closing and opening | ||||
delimiter runs are *both* multiples of 3, though, | ||||
they can match to create emphasis: | ||||
```````````````````````````````` example | ||||
foo***bar***baz | ||||
. | ||||
<p>foo<em><strong>bar</strong></em>baz</p> | ||||
```````````````````````````````` | ||||
```````````````````````````````` example | ||||
foo******bar*********baz | ||||
. | ||||
<p>foo<strong><strong><strong>bar</strong></strong></strong>***baz</p> | ||||
```````````````````````````````` | ||||
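The multiple-of-3 condition, as worded in the right-hand (0.29) column, can be sketched as a small predicate. This is only an illustration: the function name and its parameters are hypothetical, and deciding whether a run can open or close emphasis is assumed to have been done elsewhere.

```python
def rule_of_three_blocks(opener_len: int, closer_len: int,
                         opener_can_close: bool, closer_can_open: bool) -> bool:
    """Return True if the multiple-of-3 condition forbids this pairing.

    Hypothetical helper: lengths are those of the delimiter runs that
    contain the opening and closing delimiters.
    """
    # The condition only applies when one delimiter can both open and close.
    if not (opener_can_close or closer_can_open):
        return False
    # The pairing is only suspect when the summed lengths are a multiple of 3.
    if (opener_len + closer_len) % 3 != 0:
        return False
    # Exception in the 0.29 wording: both lengths are multiples of 3.
    return not (opener_len % 3 == 0 and closer_len % 3 == 0)
```

For `*foo**bar*`, the opening `*` (length 1) and the interior `**` (length 2, able to open and close) sum to 3, so the pairing is blocked. For `foo***bar***baz`, both runs have length 3, so the exception applies and they can match.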
Indefinite levels of nesting are possible: | Indefinite levels of nesting are possible: | |||
```````````````````````````````` example | ```````````````````````````````` example | |||
*foo **bar *baz* bim** bop* | *foo **bar *baz* bim** bop* | |||
. | . | |||
<p><em>foo <strong>bar <em>baz</em> bim</strong> bop</em></p> | <p><em>foo <strong>bar <em>baz</em> bim</strong> bop</em></p> | |||
```````````````````````````````` | ```````````````````````````````` | |||
```````````````````````````````` example | ```````````````````````````````` example | |||
*foo [*bar*](/url)* | *foo [*bar*](/url)* | |||
skipping to change at line 6725 ¶ | skipping to change at line 6980 ¶ | |||
than the brackets in link text. Thus, for example, | than the brackets in link text. Thus, for example, | |||
`` [foo`]` `` could not be a link text, since the second `]` | `` [foo`]` `` could not be a link text, since the second `]` | |||
is part of a code span. | is part of a code span. | |||
- The brackets in link text bind more tightly than markers for | - The brackets in link text bind more tightly than markers for | |||
[emphasis and strong emphasis]. Thus, for example, `*[foo*](url)` is a link. | [emphasis and strong emphasis]. Thus, for example, `*[foo*](url)` is a link. | |||
A [link destination](@) consists of either | A [link destination](@) consists of either | |||
- a sequence of zero or more characters between an opening `<` and a | - a sequence of zero or more characters between an opening `<` and a | |||
closing `>` that contains no spaces, line breaks, or unescaped | closing `>` that contains no line breaks or unescaped | |||
`<` or `>` characters, or | `<` or `>` characters, or | |||
- a nonempty sequence of characters that does not include | - a nonempty sequence of characters that does not start with | |||
ASCII space or control characters, and includes parentheses | `<`, does not include ASCII space or control characters, and | |||
only if (a) they are backslash-escaped or (b) they are part of | includes parentheses only if (a) they are backslash-escaped or | |||
a balanced pair of unescaped parentheses. (Implementations | (b) they are part of a balanced pair of unescaped parentheses. | |||
may impose limits on parentheses nesting to avoid performance | (Implementations may impose limits on parentheses nesting to | |||
issues, but at least three levels of nesting should be supported.) | avoid performance issues, but at least three levels of nesting | |||
should be supported.) | ||||
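As a rough illustration of the right-hand (0.29) definition, the following sketch checks whether a whole string is a link destination. The function name is hypothetical, a backslash is treated as an escape only before ASCII punctuation, and no limit on parenthesis nesting is imposed.

```python
import string


def is_link_destination(text: str) -> bool:
    """Check a complete string against the 0.29 link destination rules."""
    if text.startswith("<"):
        # Pointy-bracket form: no line breaks, no unescaped "<" or ">".
        i = 1
        while i < len(text):
            ch = text[i]
            if ch == "\\" and i + 1 < len(text) and text[i + 1] in string.punctuation:
                i += 2  # skip the escaped character
                continue
            if ch in "\r\n<":
                return False
            if ch == ">":
                # The first unescaped ">" must be the closing bracket.
                return i == len(text) - 1
            i += 1
        return False  # no unescaped closing ">"
    if not text:
        return False  # the bare form must be nonempty
    depth = 0
    i = 0
    while i < len(text):
        ch = text[i]
        if ch == "\\" and i + 1 < len(text) and text[i + 1] in string.punctuation:
            i += 2
            continue
        # No ASCII space or control characters in the bare form.
        if ch == " " or ord(ch) < 0x20 or ord(ch) == 0x7F:
            return False
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                return False
        i += 1
    return depth == 0  # unescaped parentheses must be balanced
```

Under these assumptions `</my uri>` and `<b)c>` are accepted, while `/my uri`, `<foo\>`, and `<b>c` are rejected.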
A [link title](@) consists of either | A [link title](@) consists of either | |||
- a sequence of zero or more characters between straight double-quote | - a sequence of zero or more characters between straight double-quote | |||
characters (`"`), including a `"` character only if it is | characters (`"`), including a `"` character only if it is | |||
backslash-escaped, or | backslash-escaped, or | |||
- a sequence of zero or more characters between straight single-quote | - a sequence of zero or more characters between straight single-quote | |||
characters (`'`), including a `'` character only if it is | characters (`'`), including a `'` character only if it is | |||
backslash-escaped, or | backslash-escaped, or | |||
- a sequence of zero or more characters between matching parentheses | - a sequence of zero or more characters between matching parentheses | |||
(`(...)`), including a `)` character only if it is backslash-escaped. | (`(...)`), including a `(` or `)` character only if it is | |||
backslash-escaped. | ||||
Although [link titles] may span multiple lines, they may not contain | Although [link titles] may span multiple lines, they may not contain | |||
a [blank line]. | a [blank line]. | |||
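The three forms of link title can be checked in a similar way. This is a minimal sketch following the right-hand (0.29) wording, in which a parenthesized title may contain `(` or `)` only when backslash-escaped. The function name is hypothetical, and a blank line is approximated as an interior line containing only spaces or tabs.

```python
import string


def is_link_title(text: str) -> bool:
    """Check a complete string against the 0.29 link title rules."""
    closers = {'"': '"', "'": "'", "(": ")"}
    if len(text) < 2 or text[0] not in closers:
        return False
    opener, closer = text[0], closers[text[0]]
    # A title may span several lines but may not contain a blank line.
    if any(line.strip(" \t") == "" for line in text.split("\n")[1:-1]):
        return False
    i = 1
    while i < len(text):
        ch = text[i]
        if ch == "\\" and i + 1 < len(text) and text[i + 1] in string.punctuation:
            i += 2  # skip the escaped character
            continue
        if ch == closer:
            # The first unescaped closer must end the title.
            return i == len(text) - 1
        if ch == opener:
            # Only reachable for "(": an unescaped "(" inside (...).
            return False
        i += 1
    return False  # no unescaped closing delimiter
```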
An [inline link](@) consists of a [link text] followed immediately | An [inline link](@) consists of a [link text] followed immediately | |||
by a left parenthesis `(`, optional [whitespace], an optional | by a left parenthesis `(`, optional [whitespace], an optional | |||
[link destination], an optional [link title] separated from the link | [link destination], an optional [link title] separated from the link | |||
destination by [whitespace], optional [whitespace], and a right | destination by [whitespace], optional [whitespace], and a right | |||
parenthesis `)`. The link's text consists of the inlines contained | parenthesis `)`. The link's text consists of the inlines contained | |||
in the [link text] (excluding the enclosing square brackets). | in the [link text] (excluding the enclosing square brackets). | |||
skipping to change at line 6793 ¶ | skipping to change at line 7050 ¶ | |||
. | . | |||
<p><a href="">link</a></p> | <p><a href="">link</a></p> | |||
```````````````````````````````` | ```````````````````````````````` | |||
```````````````````````````````` example | ```````````````````````````````` example | |||
[link](<>) | [link](<>) | |||
. | . | |||
<p><a href="">link</a></p> | <p><a href="">link</a></p> | |||
```````````````````````````````` | ```````````````````````````````` | |||
The destination cannot contain spaces or line breaks, | The destination can only contain spaces if it is | |||
even if enclosed in pointy brackets: | enclosed in pointy brackets: | |||
```````````````````````````````` example | ```````````````````````````````` example | |||
[link](/my uri) | [link](/my uri) | |||
. | . | |||
<p>[link](/my uri)</p> | <p>[link](/my uri)</p> | |||
```````````````````````````````` | ```````````````````````````````` | |||
```````````````````````````````` example | ```````````````````````````````` example | |||
[link](</my uri>) | [link](</my uri>) | |||
. | . | |||
<p>[link](</my uri>)</p> | <p><a href="/my%20uri">link</a></p> | |||
```````````````````````````````` | ```````````````````````````````` | |||
The destination cannot contain line breaks, | ||||
even if enclosed in pointy brackets: | ||||
```````````````````````````````` example | ```````````````````````````````` example | |||
[link](foo | [link](foo | |||
bar) | bar) | |||
. | . | |||
<p>[link](foo | <p>[link](foo | |||
bar)</p> | bar)</p> | |||
```````````````````````````````` | ```````````````````````````````` | |||
```````````````````````````````` example | ```````````````````````````````` example | |||
[link](<foo | [link](<foo | |||
bar>) | bar>) | |||
. | . | |||
<p>[link](<foo | <p>[link](<foo | |||
bar>)</p> | bar>)</p> | |||
```````````````````````````````` | ```````````````````````````````` | |||
The destination can contain `)` if it is enclosed | ||||
in pointy brackets: | ||||
```````````````````````````````` example | ||||
[a](<b)c>) | ||||
. | ||||
<p><a href="b)c">a</a></p> | ||||
```````````````````````````````` | ||||
Pointy brackets that enclose links must be unescaped: | ||||
```````````````````````````````` example | ||||
[link](<foo\>) | ||||
. | ||||
<p>[link](<foo>)</p> | ||||
```````````````````````````````` | ||||
These are not links, because the opening pointy bracket | ||||
is not matched properly: | ||||
```````````````````````````````` example | ||||
[a](<b)c | ||||
[a](<b)c> | ||||
[a](<b>c) | ||||
. | ||||
<p>[a](<b)c | ||||
[a](<b)c> | ||||
[a](<b>c)</p> | ||||
```````````````````````````````` | ||||
Parentheses inside the link destination may be escaped: | Parentheses inside the link destination may be escaped: | |||
```````````````````````````````` example | ```````````````````````````````` example | |||
[link](\(foo\)) | [link](\(foo\)) | |||
. | . | |||
<p><a href="(foo)">link</a></p> | <p><a href="(foo)">link</a></p> | |||
```````````````````````````````` | ```````````````````````````````` | |||
Any number of parentheses are allowed without escaping, as long as they are | Any number of parentheses are allowed without escaping, as long as they are | |||
balanced: | balanced: | |||
skipping to change at line 7837 ¶ | skipping to change at line 8127 ¶ | |||
<p>!<a href="/url" title="title">foo</a></p> | <p>!<a href="/url" title="title">foo</a></p> | |||
```````````````````````````````` | ```````````````````````````````` | |||
## Autolinks | ## Autolinks | |||
[Autolink](@)s are absolute URIs and email addresses inside | [Autolink](@)s are absolute URIs and email addresses inside | |||
`<` and `>`. They are parsed as links, with the URL or email address | `<` and `>`. They are parsed as links, with the URL or email address | |||
as the link label. | as the link label. | |||
A [URI autolink](@) consists of `<`, followed by an | A [URI autolink](@) consists of `<`, followed by an | |||
[absolute URI] not containing `<`, followed by `>`. It is parsed as | [absolute URI] followed by `>`. It is parsed as | |||
a link to the URI, with the URI as the link's label. | a link to the URI, with the URI as the link's label. | |||
An [absolute URI](@), | An [absolute URI](@), | |||
for these purposes, consists of a [scheme] followed by a colon (`:`) | for these purposes, consists of a [scheme] followed by a colon (`:`) | |||
followed by zero or more characters other than ASCII | followed by zero or more characters other than ASCII | |||
[whitespace] and control characters, `<`, and `>`. If | [whitespace] and control characters, `<`, and `>`. If | |||
the URI includes these characters, they must be percent-encoded | the URI includes these characters, they must be percent-encoded | |||
(e.g. `%20` for a space). | (e.g. `%20` for a space). | |||
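A URI autolink can be approximated with a regular expression. In this sketch the pattern for a scheme is an assumption (2 to 32 characters, starting with an ASCII letter, then letters, digits, `+`, `.`, or `-`), since the definition of a scheme is not reproduced in full in this excerpt. The names `SCHEME`, `ABSOLUTE_URI`, and `uri_autolink_target` are hypothetical.

```python
import re

# Assumed scheme syntax: see the lead-in; not taken from this excerpt.
SCHEME = r"[A-Za-z][A-Za-z0-9+.\-]{1,31}"
# After the colon: zero or more characters other than ASCII whitespace,
# ASCII control characters, "<", and ">".
ABSOLUTE_URI = SCHEME + r":[^\x00-\x20\x7f<>]*"
URI_AUTOLINK = re.compile(r"<(" + ABSOLUTE_URI + r")>")


def uri_autolink_target(text: str):
    """Return the URI if the whole string is a URI autolink, else None."""
    match = URI_AUTOLINK.fullmatch(text)
    return match.group(1) if match else None
```

With this pattern `<http://foo.bar.baz>` yields `http://foo.bar.baz`, while `<http://foo.bar/baz bim>` is rejected because of the space.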
For purposes of this spec, a [scheme](@) is any sequence | For purposes of this spec, a [scheme](@) is any sequence | |||
skipping to change at line 8031 ¶ | skipping to change at line 8321 ¶ | |||
consists of optional [whitespace], | consists of optional [whitespace], | |||
a `=` character, optional [whitespace], and an [attribute | a `=` character, optional [whitespace], and an [attribute | |||
value]. | value]. | |||
An [attribute value](@) | An [attribute value](@) | |||
consists of an [unquoted attribute value], | consists of an [unquoted attribute value], | |||
a [single-quoted attribute value], or a [double-quoted attribute value]. | a [single-quoted attribute value], or a [double-quoted attribute value]. | |||
An [unquoted attribute value](@) | An [unquoted attribute value](@) | |||
is a nonempty string of characters not | is a nonempty string of characters not | |||
including spaces, `"`, `'`, `=`, `<`, `>`, or `` ` ``. | including [whitespace], `"`, `'`, `=`, `<`, `>`, or `` ` ``. | |||
A [single-quoted attribute value](@) | A [single-quoted attribute value](@) | |||
consists of `'`, zero or more | consists of `'`, zero or more | |||
characters not including `'`, and a final `'`. | characters not including `'`, and a final `'`. | |||
A [double-quoted attribute value](@) | A [double-quoted attribute value](@) | |||
consists of `"`, zero or more | consists of `"`, zero or more | |||
characters not including `"`, and a final `"`. | characters not including `"`, and a final `"`. | |||
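The three kinds of attribute value translate fairly directly into regular expressions. The sketch below follows the right-hand (0.29) wording, where an unquoted value excludes all [whitespace] rather than only spaces. It assumes that whitespace means space, tab, newline, line tabulation, form feed, and carriage return; the names are hypothetical.

```python
import re

# Unquoted: nonempty, no whitespace, and none of " ' = < > `
UNQUOTED = r"""[^ \t\n\x0b\x0c\r"'=<>`]+"""
# Quoted forms: any characters except the enclosing quote.
SINGLE_QUOTED = r"'[^']*'"
DOUBLE_QUOTED = r'"[^"]*"'
ATTRIBUTE_VALUE = re.compile(f"(?:{UNQUOTED}|{SINGLE_QUOTED}|{DOUBLE_QUOTED})")


def is_attribute_value(text: str) -> bool:
    """Check whether the whole string is an attribute value."""
    return ATTRIBUTE_VALUE.fullmatch(text) is not None
```

Under these assumptions `baz`, `'bar'`, and `"hi"` are attribute values, while `hi'` and a value split across a line break such as `baz` followed by a newline and `bim` are not.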
An [open tag](@) consists of a `<` character, a [tag name], | An [open tag](@) consists of a `<` character, a [tag name], | |||
skipping to change at line 8144 ¶ | skipping to change at line 8434 ¶ | |||
<a href="hi'> <a href=hi'> | <a href="hi'> <a href=hi'> | |||
. | . | |||
<p><a href="hi'> <a href=hi'></p> | <p><a href="hi'> <a href=hi'></p> | |||
```````````````````````````````` | ```````````````````````````````` | |||
Illegal [whitespace]: | Illegal [whitespace]: | |||
```````````````````````````````` example | ```````````````````````````````` example | |||
< a>< | < a>< | |||
foo><bar/ > | foo><bar/ > | |||
<foo bar=baz | ||||
bim!bop /> | ||||
. | . | |||
<p>< a>< | <p>< a>< | |||
foo><bar/ ></p> | foo><bar/ > | |||
<foo bar=baz | ||||
bim!bop /></p> | ||||
```````````````````````````````` | ```````````````````````````````` | |||
Missing [whitespace]: | Missing [whitespace]: | |||
```````````````````````````````` example | ```````````````````````````````` example | |||
<a href='bar'title=title> | <a href='bar'title=title> | |||
. | . | |||
<p><a href='bar'title=title></p> | <p><a href='bar'title=title></p> | |||
```````````````````````````````` | ```````````````````````````````` | |||
skipping to change at line 8326 ¶ | skipping to change at line 8620 ¶ | |||
<p><em>foo<br /> | <p><em>foo<br /> | |||
bar</em></p> | bar</em></p> | |||
```````````````````````````````` | ```````````````````````````````` | |||
Line breaks do not occur inside code spans | Line breaks do not occur inside code spans | |||
```````````````````````````````` example | ```````````````````````````````` example | |||
`code | `code | |||
span` | span` | |||
. | . | |||
<p><code>code span</code></p> | <p><code>code span</code></p> | |||
```````````````````````````````` | ```````````````````````````````` | |||
```````````````````````````````` example | ```````````````````````````````` example | |||
`code\ | `code\ | |||
span` | span` | |||
. | . | |||
<p><code>code\ span</code></p> | <p><code>code\ span</code></p> | |||
```````````````````````````````` | ```````````````````````````````` | |||
or HTML tags: | or HTML tags: | |||
skipping to change at line 8731 ¶ | skipping to change at line 9025 ¶ | |||
Parameter `stack_bottom` sets a lower bound to how far we | Parameter `stack_bottom` sets a lower bound to how far we | |||
descend in the [delimiter stack]. If it is NULL, we can | descend in the [delimiter stack]. If it is NULL, we can | |||
go all the way to the bottom. Otherwise, we stop before | go all the way to the bottom. Otherwise, we stop before | |||
visiting `stack_bottom`. | visiting `stack_bottom`. | |||
Let `current_position` point to the element on the [delimiter stack] | Let `current_position` point to the element on the [delimiter stack] | |||
just above `stack_bottom` (or the first element if `stack_bottom` | just above `stack_bottom` (or the first element if `stack_bottom` | |||
is NULL). | is NULL). | |||
We keep track of the `openers_bottom` for each delimiter | We keep track of the `openers_bottom` for each delimiter | |||
type (`*`, `_`). Initialize this to `stack_bottom`. | type (`*`, `_`) and each length of the closing delimiter run | |||
(modulo 3). Initialize this to `stack_bottom`. | ||||
Then we repeat the following until we run out of potential | Then we repeat the following until we run out of potential | |||
closers: | closers: | |||
- Move `current_position` forward in the delimiter stack (if needed) | - Move `current_position` forward in the delimiter stack (if needed) | |||
until we find the first potential closer with delimiter `*` or `_`. | until we find the first potential closer with delimiter `*` or `_`. | |||
(This will be the potential closer closest | (This will be the potential closer closest | |||
to the beginning of the input -- the first one in parse order.) | to the beginning of the input -- the first one in parse order.) | |||
- Now, look back in the stack (staying above `stack_bottom` and | - Now, look back in the stack (staying above `stack_bottom` and | |||
skipping to change at line 8763 ¶ | skipping to change at line 9058 ¶ | |||
+ Remove any delimiters between the opener and closer from | + Remove any delimiters between the opener and closer from | |||
the delimiter stack. | the delimiter stack. | |||
+ Remove 1 (for regular emph) or 2 (for strong emph) delimiters | + Remove 1 (for regular emph) or 2 (for strong emph) delimiters | |||
from the opening and closing text nodes. If they become empty | from the opening and closing text nodes. If they become empty | |||
as a result, remove them and remove the corresponding element | as a result, remove them and remove the corresponding element | |||
of the delimiter stack. If the closing node is removed, reset | of the delimiter stack. If the closing node is removed, reset | |||
`current_position` to the next element in the stack. | `current_position` to the next element in the stack. | |||
- If none in found: | - If none is found: | |||
+ Set `openers_bottom` to the element before `current_position`. | + Set `openers_bottom` to the element before `current_position`. | |||
(We know that there are no openers for this kind of closer up to and | (We know that there are no openers for this kind of closer up to and | |||
including this point, so this puts a lower bound on future searches.) | including this point, so this puts a lower bound on future searches.) | |||
+ If the closer at `current_position` is not a potential opener, | + If the closer at `current_position` is not a potential opener, | |||
remove it from the delimiter stack (since we know it can't | remove it from the delimiter stack (since we know it can't | |||
be a closer either). | be a closer either). | |||
+ Advance `current_position` to the next element in the stack. | + Advance `current_position` to the next element in the stack. | |||
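The change to `openers_bottom` in the right-hand (0.29) column amounts to keeping one lower bound per delimiter type and per closing run length modulo 3, rather than one per delimiter type. The bookkeeping might be sketched as follows. This is not the full procedure: stack positions are modelled as integer indexes, and both function names are hypothetical.

```python
from typing import Dict, Optional, Tuple

# Key: (delimiter character, length of the closing delimiter run modulo 3)
Key = Tuple[str, int]


def new_openers_bottom(stack_bottom: Optional[int]) -> Dict[Key, Optional[int]]:
    """Initialize every entry to stack_bottom (None stands in for NULL)."""
    return {(ch, mod): stack_bottom for ch in "*_" for mod in range(3)}


def note_no_opener(openers_bottom: Dict[Key, Optional[int]],
                   closer_char: str, closer_len: int,
                   current_position: int) -> None:
    """Record that no opener was found for the closer at current_position.

    Future searches for this kind of closer need not look at or below
    the element before current_position.
    """
    openers_bottom[(closer_char, closer_len % 3)] = current_position - 1
```

Keying on the length modulo 3 matters because, under the multiple-of-3 condition, an opener that is unusable for one closer may still be usable for a later closer of a different length.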
End of changes. 79 change blocks. | ||||
123 lines changed or deleted | 416 lines changed or added | |||