spec.txt | spec.txt | |||
---|---|---|---|---|
--- | --- | |||
title: CommonMark Spec | title: CommonMark Spec | |||
author: | author: | |||
- John MacFarlane | - John MacFarlane | |||
version: 0.13 | version: 0.14 | |||
date: 2014-12-10 | date: 2014-12-10 | |||
license: '[CC-BY-SA 4.0](http://creativecommons.org/licenses/by-sa/4.0/)' | ||||
... | ... | |||
# Introduction | # Introduction | |||
## What is Markdown? | ## What is Markdown? | |||
Markdown is a plain text format for writing structured documents, | Markdown is a plain text format for writing structured documents, | |||
based on conventions used for indicating formatting in email and | based on conventions used for indicating formatting in email and | |||
usenet posts. It was developed in 2004 by John Gruber, who wrote | usenet posts. It was developed in 2004 by John Gruber, who wrote | |||
the first Markdown-to-HTML converter in perl, and it soon became | the first Markdown-to-HTML converter in perl, and it soon became | |||
skipping to change at line 131 | skipping to change at line 132 | |||
``` | ``` | |||
10. What are the precedence rules between block-level and inline-level | 10. What are the precedence rules between block-level and inline-level | |||
structure? For example, how should the following be parsed? | structure? For example, how should the following be parsed? | |||
``` markdown | ``` markdown | |||
- `a long code span can contain a hyphen like this | - `a long code span can contain a hyphen like this | |||
- and it can screw things up` | - and it can screw things up` | |||
``` | ``` | |||
11. Can list items include headers? (`Markdown.pl` does not allow this, | 11. Can list items include section headers? (`Markdown.pl` does not | |||
but headers can occur in blockquotes.) | allow this, but does allow blockquotes to include headers.) | |||
``` markdown | ``` markdown | |||
- # Heading | - # Heading | |||
``` | ``` | |||
12. Can link references be defined inside block quotes or list items? | 12. Can list items be empty? | |||
``` markdown | ||||
* a | ||||
* | ||||
* b | ||||
``` | ||||
13. Can link references be defined inside block quotes or list items? | ||||
``` markdown | ``` markdown | |||
> Blockquote [foo]. | > Blockquote [foo]. | |||
> | > | |||
> [foo]: /url | > [foo]: /url | |||
``` | ``` | |||
13. If there are multiple definitions for the same reference, which takes | 14. If there are multiple definitions for the same reference, which takes | |||
precedence? | precedence? | |||
``` markdown | ``` markdown | |||
[foo]: /url1 | [foo]: /url1 | |||
[foo]: /url2 | [foo]: /url2 | |||
[foo][] | [foo][] | |||
``` | ``` | |||
In the absence of a spec, early implementers consulted `Markdown.pl` | In the absence of a spec, early implementers consulted `Markdown.pl` | |||
skipping to change at line 192 | skipping to change at line 201 | |||
choice of HTML for the tests makes it possible to run the tests against | choice of HTML for the tests makes it possible to run the tests against | |||
an implementation without writing an abstract syntax tree renderer. | an implementation without writing an abstract syntax tree renderer. | |||
This document is generated from a text file, `spec.txt`, written | This document is generated from a text file, `spec.txt`, written | |||
in Markdown with a small extension for the side-by-side tests. | in Markdown with a small extension for the side-by-side tests. | |||
The script `spec2md.pl` can be used to turn `spec.txt` into pandoc | The script `spec2md.pl` can be used to turn `spec.txt` into pandoc | |||
Markdown, which can then be converted into other formats. | Markdown, which can then be converted into other formats. | |||
In the examples, the `→` character is used to represent tabs. | In the examples, the `→` character is used to represent tabs. | |||
# Preprocessing | # Preliminaries | |||
## Characters and lines | ||||
The input is a sequence of zero or more [lines](#line). | ||||
A [line](@line) | A [line](@line) | |||
is a sequence of zero or more [characters](#character) followed by a | is a sequence of zero or more [characters](#character) followed by a | |||
line ending (CR, LF, or CRLF) or by the end of file. | [line ending](#line-ending) or by the end of file. | |||
A [character](@character) is a unicode code point. | A [character](@character) is a unicode code point. | |||
This spec does not specify an encoding; it thinks of lines as composed | This spec does not specify an encoding; it thinks of lines as composed | |||
of characters rather than bytes. A conforming parser may be limited | of characters rather than bytes. A conforming parser may be limited | |||
to a certain encoding. | to a certain encoding. | |||
A [line ending](@line-ending) is, depending on the platform, a | ||||
newline (`U+000A`), carriage return (`U+000D`), or | ||||
carriage return + newline. | ||||
For security reasons, a conforming parser must strip or replace the | ||||
Unicode character `U+0000`. | ||||
A line containing no characters, or a line containing only spaces | ||||
(`U+0020`) or tabs (`U+0009`), is called a [blank line](@blank-line). | ||||
The following definitions of character classes will be used in this spec: | ||||
A [whitespace character](@whitespace-character) is a space | ||||
(`U+0020`), tab (`U+0009`), carriage return (`U+000D`), or | ||||
newline (`U+000A`). | ||||
[Whitespace](@whitespace) is a sequence of one or more [whitespace | ||||
characters](#whitespace-character). | ||||
A [unicode whitespace character](@unicode-whitespace-character) is | ||||
any code point in the unicode `Zs` class, or a tab (`U+0009`), | ||||
carriage return (`U+000D`), newline (`U+000A`), or form feed | ||||
(`U+000C`). | ||||
[Unicode whitespace](@unicode-whitespace) is a sequence of one | ||||
or more [unicode whitespace characters](#unicode-whitespace-character). | ||||
A [non-space character](@non-space-character) is anything but `U+0020`. | ||||
An [ASCII punctuation character](@ascii-punctuation-character) | ||||
is `!`, `"`, `#`, `$`, `%`, `&`, `'`, `(`, `)`, | ||||
`*`, `+`, `,`, `-`, `.`, `/`, `:`, `;`, `<`, `=`, `>`, `?`, `@`, | ||||
`[`, `\`, `]`, `^`, `_`, `` ` ``, `{`, `|`, `}`, or `~`. | ||||
A [punctuation character](@punctuation-character) is an [ASCII | ||||
punctuation character](#ascii-punctuation-character) or anything in | ||||
the unicode classes `Pc`, `Pd`, `Pe`, `Pf`, `Pi`, `Po`, or `Ps`. | ||||
## Tab expansion | ||||
Tabs in lines are expanded to spaces, with a tab stop of 4 characters: | Tabs in lines are expanded to spaces, with a tab stop of 4 characters: | |||
. | . | |||
→foo→baz→→bim | →foo→baz→→bim | |||
. | . | |||
<pre><code>foo baz bim | <pre><code>foo baz bim | |||
</code></pre> | </code></pre> | |||
. | . | |||
. | . | |||
a→a | a→a | |||
ὐ→a | ὐ→a | |||
. | . | |||
<pre><code>a a | <pre><code>a a | |||
ὐ a | ὐ a | |||
</code></pre> | </code></pre> | |||
. | . | |||
Line endings are replaced by newline characters (LF). | ||||
A line containing no characters, or a line containing only spaces (after | ||||
tab expansion), is called a [blank line](@blank-line). | ||||
For security reasons, a conforming parser must strip or replace the | ||||
Unicode character `U+0000`. | ||||
# Blocks and inlines | # Blocks and inlines | |||
We can think of a document as a sequence of | We can think of a document as a sequence of | |||
[blocks](@block)---structural | [blocks](@block)---structural | |||
elements like paragraphs, block quotations, | elements like paragraphs, block quotations, | |||
lists, headers, rules, and code blocks. Blocks can contain other | lists, headers, rules, and code blocks. Blocks can contain other | |||
blocks, or they can contain [inline](@inline) content: | blocks, or they can contain [inline](@inline) content: | |||
words, spaces, links, emphasized text, images, and inline code. | words, spaces, links, emphasized text, images, and inline code. | |||
## Precedence | ## Precedence | |||
skipping to change at line 397 | skipping to change at line 442 | |||
a------ | a------ | |||
---a--- | ---a--- | |||
. | . | |||
<p>_ _ _ _ a</p> | <p>_ _ _ _ a</p> | |||
<p>a------</p> | <p>a------</p> | |||
<p>---a---</p> | <p>---a---</p> | |||
. | . | |||
It is required that all of the non-space characters be the same. | It is required that all of the | |||
[non-space characters](#non-space-character) be the same. | ||||
So, this is not a horizontal rule: | So, this is not a horizontal rule: | |||
. | . | |||
*-* | *-* | |||
. | . | |||
<p><em>-</em></p> | <p><em>-</em></p> | |||
. | . | |||
Horizontal rules do not need blank lines before or after: | Horizontal rules do not need blank lines before or after: | |||
skipping to change at line 686 | skipping to change at line 732 | |||
. | . | |||
## Setext headers | ## Setext headers | |||
A [setext header](@setext-header) | A [setext header](@setext-header) | |||
consists of a line of text, containing at least one nonspace character, | consists of a line of text, containing at least one nonspace character, | |||
with no more than 3 spaces indentation, followed by a [setext header | with no more than 3 spaces indentation, followed by a [setext header | |||
underline](#setext-header-underline). The line of text must be | underline](#setext-header-underline). The line of text must be | |||
one that, were it not followed by the setext header underline, | one that, were it not followed by the setext header underline, | |||
would be interpreted as part of a paragraph: it cannot be a code | would be interpreted as part of a paragraph: it cannot be a code | |||
block, header, blockquote, horizontal rule, or list. A [setext header | block, header, blockquote, horizontal rule, or list. | |||
underline](@setext-header-underline) | ||||
is a sequence of `=` characters or a sequence of `-` characters, with no | A [setext header underline](@setext-header-underline) is a sequence of | |||
more than 3 spaces indentation and any number of trailing | `=` characters or a sequence of `-` characters, with no more than 3 | |||
spaces. The header is a level 1 header if `=` characters are used, and | spaces indentation and any number of trailing spaces. If a line | |||
a level 2 header if `-` characters are used. The contents of the header | containing a single `-` can be interpreted as an | |||
are the result of parsing the first line as Markdown inline content. | empty [list item](#list-items), it should be interpreted this way | |||
and not as a [setext header underline](#setext-header-underline). | ||||
The header is a level 1 header if `=` characters are used in the | ||||
[setext header underline](#setext-header-underline), and a level 2 | ||||
header if `-` characters are used. The contents of the header are the | ||||
result of parsing the first line as Markdown inline content. | ||||
In general, a setext header need not be preceded or followed by a | In general, a setext header need not be preceded or followed by a | |||
blank line. However, it cannot interrupt a paragraph, so when a | blank line. However, it cannot interrupt a paragraph, so when a | |||
setext header comes after a paragraph, a blank line is needed between | setext header comes after a paragraph, a blank line is needed between | |||
them. | them. | |||
Simple examples: | Simple examples: | |||
. | . | |||
Foo *bar* | Foo *bar* | |||
skipping to change at line 951 | skipping to change at line 1003 | |||
. | . | |||
\> foo | \> foo | |||
------ | ------ | |||
. | . | |||
<h2>&gt; foo</h2> | <h2>&gt; foo</h2> | |||
. | . | |||
## Indented code blocks | ## Indented code blocks | |||
An [indented code block](@indented-code-block) | An [indented code block](@indented-code-block) is composed of one or more | |||
is composed of one or more | ||||
[indented chunks](#indented-chunk) separated by blank lines. | [indented chunks](#indented-chunk) separated by blank lines. | |||
An [indented chunk](@indented-chunk) | An [indented chunk](@indented-chunk) is a sequence of non-blank lines, | |||
is a sequence of non-blank lines, each indented four or more | each indented four or more spaces. The contents of the code block are | |||
spaces. An indented code block cannot interrupt a paragraph, so | the literal contents of the lines, including trailing | |||
if it occurs before or after a paragraph, there must be an | [line endings](#line-ending), minus four spaces of indentation. | |||
intervening blank line. The contents of the code block are | An indented code block has no attributes. | |||
the literal contents of the lines, including trailing newlines, | ||||
minus four spaces of indentation. An indented code block has no | An indented code block cannot interrupt a paragraph, so there must be | |||
attributes. | a blank line between a paragraph and a following indented code block. | |||
(A blank line is not needed, however, between a code block and a following | ||||
paragraph.) | ||||
. | . | |||
a simple | a simple | |||
indented code block | indented code block | |||
. | . | |||
<pre><code>a simple | <pre><code>a simple | |||
indented code block | indented code block | |||
</code></pre> | </code></pre> | |||
. | . | |||
skipping to change at line 1742 | skipping to change at line 1795 | |||
Moreover, blank lines are usually not necessary and can be | Moreover, blank lines are usually not necessary and can be | |||
deleted. The exception is inside `<pre>` tags; here, one can | deleted. The exception is inside `<pre>` tags; here, one can | |||
replace the blank lines with `&#10;` entities. | replace the blank lines with `&#10;` entities. | |||
So there is no important loss of expressive power with the new rule. | So there is no important loss of expressive power with the new rule. | |||
## Link reference definitions | ## Link reference definitions | |||
A [link reference definition](@link-reference-definition) | A [link reference definition](@link-reference-definition) | |||
consists of a [link | consists of a [link label](#link-label), indented up to three spaces, followed | |||
label](#link-label), indented up to three spaces, followed | by a colon (`:`), optional [whitespace](#whitespace) (including up to one | |||
by a colon (`:`), optional blank space (including up to one | [line ending](#line-ending)), a [link destination](#link-destination), | |||
newline), a [link destination](#link-destination), optional | optional [whitespace](#whitespace) (including up to one | |||
blank space (including up to one newline), and an optional [link | [line ending](#line-ending)), and an optional [link | |||
title](#link-title), which if it is present must be separated | title](#link-title), which if it is present must be separated | |||
from the [link destination](#link-destination) by whitespace. | from the [link destination](#link-destination) by [whitespace](#whitespace). | |||
No further non-space characters may occur on the line. | No further [non-space characters](#non-space-character) may occur on the line. | |||
A [link reference-definition](#link-reference-definition) | A [link reference-definition](#link-reference-definition) | |||
does not correspond to a structural element of a document. Instead, it | does not correspond to a structural element of a document. Instead, it | |||
defines a label which can be used in [reference links](#reference-link) | defines a label which can be used in [reference links](#reference-link) | |||
and reference-style [images](#image) elsewhere in the document. [Link | and reference-style [images](#images) elsewhere in the document. [Link | |||
reference definitions] can come either before or after the links that use | reference definitions] can come either before or after the links that use | |||
them. | them. | |||
. | . | |||
[foo]: /url "title" | [foo]: /url "title" | |||
[foo] | [foo] | |||
. | . | |||
<p><a href="/url" title="title">foo</a></p> | <p><a href="/url" title="title">foo</a></p> | |||
. | . | |||
skipping to change at line 1866 | skipping to change at line 1919 | |||
Here is a link reference definition with no corresponding link. | Here is a link reference definition with no corresponding link. | |||
It contributes nothing to the document. | It contributes nothing to the document. | |||
. | . | |||
[foo]: /url | [foo]: /url | |||
. | . | |||
. | . | |||
This is not a link reference definition, because there are | This is not a link reference definition, because there are | |||
non-space characters after the title: | [non-space characters](#non-space-character) after the title: | |||
. | . | |||
[foo]: /url "title" ok | [foo]: /url "title" ok | |||
. | . | |||
<p>[foo]: /url &quot;title&quot; ok</p> | <p>[foo]: /url &quot;title&quot; ok</p> | |||
. | . | |||
This is not a link reference definition, because it is indented | This is not a link reference definition, because it is indented | |||
four spaces: | four spaces: | |||
skipping to change at line 1930 | skipping to change at line 1983 | |||
# [Foo] | # [Foo] | |||
[foo]: /url | [foo]: /url | |||
> bar | > bar | |||
. | . | |||
<h1><a href="/url">Foo</a></h1> | <h1><a href="/url">Foo</a></h1> | |||
<blockquote> | <blockquote> | |||
<p>bar</p> | <p>bar</p> | |||
</blockquote> | </blockquote> | |||
. | . | |||
Several [link references](#link-reference) can occur one after another, | Several [link reference definitions](#link-reference-definition) | |||
without intervening blank lines. | can occur one after another, without intervening blank lines. | |||
. | . | |||
[foo]: /foo-url "foo" | [foo]: /foo-url "foo" | |||
[bar]: /bar-url | [bar]: /bar-url | |||
"bar" | "bar" | |||
[baz]: /baz-url | [baz]: /baz-url | |||
[foo], | [foo], | |||
[bar], | [bar], | |||
[baz] | [baz] | |||
skipping to change at line 2119 | skipping to change at line 2172 | |||
The following rules define [block quotes](@block-quote): | The following rules define [block quotes](@block-quote): | |||
1. **Basic case.** If a string of lines *Ls* constitute a sequence | 1. **Basic case.** If a string of lines *Ls* constitute a sequence | |||
of blocks *Bs*, then the result of prepending a [block quote | of blocks *Bs*, then the result of prepending a [block quote | |||
marker](#block-quote-marker) to the beginning of each line in *Ls* | marker](#block-quote-marker) to the beginning of each line in *Ls* | |||
is a [block quote](#block-quote) containing *Bs*. | is a [block quote](#block-quote) containing *Bs*. | |||
2. **Laziness.** If a string of lines *Ls* constitute a [block | 2. **Laziness.** If a string of lines *Ls* constitute a [block | |||
quote](#block-quote) with contents *Bs*, then the result of deleting | quote](#block-quote) with contents *Bs*, then the result of deleting | |||
the initial [block quote marker](#block-quote-marker) from one or | the initial [block quote marker](#block-quote-marker) from one or | |||
more lines in which the next non-space character after the [block | more lines in which the next | |||
[non-space character](#non-space-character) after the [block | ||||
quote marker](#block-quote-marker) is [paragraph continuation | quote marker](#block-quote-marker) is [paragraph continuation | |||
text](#paragraph-continuation-text) is a block quote with *Bs* as | text](#paragraph-continuation-text) is a block quote with *Bs* as | |||
its content. | its content. | |||
[Paragraph continuation text](@paragraph-continuation-text) is text | [Paragraph continuation text](@paragraph-continuation-text) is text | |||
that will be parsed as part of the content of a paragraph, but does | that will be parsed as part of the content of a paragraph, but does | |||
not occur at the beginning of the paragraph. | not occur at the beginning of the paragraph. | |||
3. **Consecutiveness.** A document cannot contain two [block | 3. **Consecutiveness.** A document cannot contain two [block | |||
quotes](#block-quote) in a row unless there is a [blank | quotes](#block-quote) in a row unless there is a [blank | |||
line](#blank-line) between them. | line](#blank-line) between them. | |||
skipping to change at line 2479 | skipping to change at line 2533 | |||
A [bullet list marker](@bullet-list-marker) | A [bullet list marker](@bullet-list-marker) | |||
is a `-`, `+`, or `*` character. | is a `-`, `+`, or `*` character. | |||
An [ordered list marker](@ordered-list-marker) | An [ordered list marker](@ordered-list-marker) | |||
is a sequence of one or more digits (`0-9`), followed by either a | is a sequence of one or more digits (`0-9`), followed by either a | |||
`.` character or a `)` character. | `.` character or a `)` character. | |||
The following rules define [list items](@list-item): | The following rules define [list items](@list-item): | |||
1. **Basic case.** If a sequence of lines *Ls* constitute a sequence of | 1. **Basic case.** If a sequence of lines *Ls* constitute a sequence of | |||
blocks *Bs* starting with a non-space character and not separated | blocks *Bs* starting with a [non-space character](#non-space-character) | |||
and not separated | ||||
from each other by more than one blank line, and *M* is a list | from each other by more than one blank line, and *M* is a list | |||
marker of width *W* followed by 0 < *N* < 5 spaces, then the result | marker of width *W* followed by 0 < *N* < 5 spaces, then the result | |||
of prepending *M* and the following spaces to the first line of | of prepending *M* and the following spaces to the first line of | |||
*Ls*, and indenting subsequent lines of *Ls* by *W + N* spaces, is a | *Ls*, and indenting subsequent lines of *Ls* by *W + N* spaces, is a | |||
list item with *Bs* as its contents. The type of the list item | list item with *Bs* as its contents. The type of the list item | |||
(bullet or ordered) is determined by the type of its list marker. | (bullet or ordered) is determined by the type of its list marker. | |||
If the list item is ordered, then it is also assigned a start | If the list item is ordered, then it is also assigned a start | |||
number, based on the ordered list marker. | number, based on the ordered list marker. | |||
For example, let *Ls* be the lines | For example, let *Ls* be the lines | |||
skipping to change at line 2660 | skipping to change at line 2715 | |||
- foo | - foo | |||
bar | bar | |||
- ``` | - ``` | |||
foo | foo | |||
bar | bar | |||
``` | ``` | |||
- baz | ||||
+ ``` | ||||
foo | ||||
bar | ||||
``` | ||||
. | . | |||
<ul> | <ul> | |||
<li> | <li> | |||
<p>foo</p> | <p>foo</p> | |||
<p>bar</p> | <p>bar</p> | |||
</li> | </li> | |||
<li> | <li> | |||
<p>foo</p> | <p>foo</p> | |||
</li> | </li> | |||
</ul> | </ul> | |||
<p>bar</p> | <p>bar</p> | |||
<ul> | <ul> | |||
<li> | <li> | |||
<pre><code>foo | <pre><code>foo | |||
bar | bar | |||
</code></pre> | </code></pre> | |||
</li> | </li> | |||
<li> | ||||
<p>baz</p> | ||||
<ul> | ||||
<li> | ||||
<pre><code>foo | ||||
bar | ||||
</code></pre> | ||||
</li> | ||||
</ul> | ||||
</li> | ||||
</ul> | </ul> | |||
. | . | |||
A list item may contain any kind of block: | A list item may contain any kind of block: | |||
. | . | |||
1. foo | 1. foo | |||
``` | ``` | |||
bar | bar | |||
skipping to change at line 2855 | skipping to change at line 2929 | |||
bar | bar | |||
. | . | |||
<ul> | <ul> | |||
<li> | <li> | |||
<p>foo</p> | <p>foo</p> | |||
<p>bar</p> | <p>bar</p> | |||
</li> | </li> | |||
</ul> | </ul> | |||
. | . | |||
3. **Indentation.** If a sequence of lines *Ls* constitutes a list item | 3. **Empty list item.** A [list marker](#list-marker) followed by a | |||
according to rule #1 or #2, then the result of indenting each line | line containing only [whitespace](#whitespace) is a list item with | |||
no contents. | ||||
Here is an empty bullet list item: | ||||
. | ||||
- foo | ||||
- | ||||
- bar | ||||
. | ||||
<ul> | ||||
<li>foo</li> | ||||
<li></li> | ||||
<li>bar</li> | ||||
</ul> | ||||
. | ||||
It does not matter whether there are spaces following the | ||||
[list marker](#list-marker): | ||||
. | ||||
- foo | ||||
- | ||||
- bar | ||||
. | ||||
<ul> | ||||
<li>foo</li> | ||||
<li></li> | ||||
<li>bar</li> | ||||
</ul> | ||||
. | ||||
Here is an empty ordered list item: | ||||
. | ||||
1. foo | ||||
2. | ||||
3. bar | ||||
. | ||||
<ol> | ||||
<li>foo</li> | ||||
<li></li> | ||||
<li>bar</li> | ||||
</ol> | ||||
. | ||||
A list may start or end with an empty list item: | ||||
. | ||||
* | ||||
. | ||||
<ul> | ||||
<li></li> | ||||
</ul> | ||||
. | ||||
4. **Indentation.** If a sequence of lines *Ls* constitutes a list item | ||||
according to rule #1, #2, or #3, then the result of indenting each line | ||||
of *Ls* by 1-3 spaces (the same for each line) also constitutes a | of *Ls* by 1-3 spaces (the same for each line) also constitutes a | |||
list item with the same contents and attributes. If a line is | list item with the same contents and attributes. If a line is | |||
empty, then it need not be indented. | empty, then it need not be indented. | |||
Indented one space: | Indented one space: | |||
. | . | |||
1. A paragraph | 1. A paragraph | |||
with two lines. | with two lines. | |||
skipping to change at line 2949 | skipping to change at line 3080 | |||
. | . | |||
<pre><code>1. A paragraph | <pre><code>1. A paragraph | |||
with two lines. | with two lines. | |||
indented code | indented code | |||
&gt; A block quote. | &gt; A block quote. | |||
</code></pre> | </code></pre> | |||
. | . | |||
4. **Laziness.** If a string of lines *Ls* constitute a [list | 5. **Laziness.** If a string of lines *Ls* constitute a [list | |||
item](#list-item) with contents *Bs*, then the result of deleting | item](#list-item) with contents *Bs*, then the result of deleting | |||
some or all of the indentation from one or more lines in which the | some or all of the indentation from one or more lines in which the | |||
next non-space character after the indentation is | next [non-space character](#non-space-character) after the indentation is | |||
[paragraph continuation text](#paragraph-continuation-text) is a | [paragraph continuation text](#paragraph-continuation-text) is a | |||
list item with the same contents and attributes. The unindented | list item with the same contents and attributes. The unindented | |||
lines are called | lines are called | |||
[lazy continuation lines](@lazy-continuation-line). | [lazy continuation lines](@lazy-continuation-line). | |||
Here is an example with [lazy continuation | Here is an example with [lazy continuation | |||
lines](#lazy-continuation-line): | lines](#lazy-continuation-line): | |||
. | . | |||
1. A paragraph | 1. A paragraph | |||
skipping to change at line 3028 | skipping to change at line 3159 | |||
<li> | <li> | |||
<blockquote> | <blockquote> | |||
<p>Blockquote | <p>Blockquote | |||
continued here.</p> | continued here.</p> | |||
</blockquote> | </blockquote> | |||
</li> | </li> | |||
</ol> | </ol> | |||
</blockquote> | </blockquote> | |||
. | . | |||
5. **That's all.** Nothing that is not counted as a list item by rules | 6. **That's all.** Nothing that is not counted as a list item by rules | |||
#1--4 counts as a [list item](#list-item). | #1--5 counts as a [list item](#list-item). | |||
The rules for sublists follow from the general rules above. A sublist | The rules for sublists follow from the general rules above. A sublist | |||
must be indented the same number of spaces a paragraph would need to be | must be indented the same number of spaces a paragraph would need to be | |||
in order to be included in the list item. | in order to be included in the list item. | |||
So, in this case we need two spaces indent: | So, in this case we need two spaces indent: | |||
. | . | |||
- foo | - foo | |||
- bar | - bar | |||
skipping to change at line 3128 | skipping to change at line 3259 | |||
<li> | <li> | |||
<ol start="2"> | <ol start="2"> | |||
<li>foo</li> | <li>foo</li> | |||
</ol> | </ol> | |||
</li> | </li> | |||
</ul> | </ul> | |||
</li> | </li> | |||
</ol> | </ol> | |||
. | . | |||
A list item may be empty: | ||||
. | ||||
- foo | ||||
- | ||||
- bar | ||||
. | ||||
<ul> | ||||
<li>foo</li> | ||||
<li></li> | ||||
<li>bar</li> | ||||
</ul> | ||||
. | ||||
. | ||||
- | ||||
. | ||||
<ul> | ||||
<li></li> | ||||
</ul> | ||||
. | ||||
A list item can contain a header: | A list item can contain a header: | |||
. | . | |||
- # Foo | - # Foo | |||
- Bar | - Bar | |||
--- | --- | |||
baz | baz | |||
. | . | |||
<ul> | <ul> | |||
<li> | <li> | |||
skipping to change at line 3228 | skipping to change at line 3337 | |||
to break any existing documents. However, the spec given here should | to break any existing documents. However, the spec given here should | |||
correctly handle lists formatted with either the four-space rule or | correctly handle lists formatted with either the four-space rule or | |||
the more forgiving `Markdown.pl` behavior, provided they are laid out | the more forgiving `Markdown.pl` behavior, provided they are laid out | |||
in a way that is natural for a human to read. | in a way that is natural for a human to read. | |||
The strategy here is to let the width and indentation of the list marker | The strategy here is to let the width and indentation of the list marker | |||
determine the indentation necessary for blocks to fall under the list | determine the indentation necessary for blocks to fall under the list | |||
item, rather than having a fixed and arbitrary number. The writer can | item, rather than having a fixed and arbitrary number. The writer can | |||
think of the body of the list item as a unit which gets indented to the | think of the body of the list item as a unit which gets indented to the | |||
right enough to fit the list marker (and any indentation on the list | right enough to fit the list marker (and any indentation on the list | |||
marker). (The laziness rule, #4, then allows continuation lines to be | marker). (The laziness rule, #5, then allows continuation lines to be | |||
unindented if needed.) | unindented if needed.) | |||
This rule is superior, we claim, to any rule requiring a fixed level of | This rule is superior, we claim, to any rule requiring a fixed level of | |||
indentation from the margin. The four-space rule is clear but | indentation from the margin. The four-space rule is clear but | |||
unnatural. It is quite unintuitive that | unnatural. It is quite unintuitive that | |||
``` markdown | ``` markdown | |||
- foo | - foo | |||
bar | bar | |||
skipping to change at line 3461 | skipping to change at line 3570 | |||
blank lines: | blank lines: | |||
I need to buy | I need to buy | |||
- new shoes | - new shoes | |||
- a coat | - a coat | |||
- a plane ticket | - a plane ticket | |||
Second, we are attracted to a | Second, we are attracted to a | |||
> [principle of uniformity](@principle-of-uniformity): | > [principle of uniformity](@principle-of-uniformity): | |||
> if a span of text has a certain | > if a chunk of text has a certain | |||
> meaning, it will continue to have the same meaning when put into a list | > meaning, it will continue to have the same meaning when put into a | |||
> item. | > container block (such as a list item or blockquote). | |||
(Indeed, the spec for [list items](#list-item) presupposes this.) | (Indeed, the spec for [list items](#list-item) and | |||
[blockquotes](#block-quotes) presupposes this principle.) | ||||
This principle implies that if | This principle implies that if | |||
* I need to buy | * I need to buy | |||
- new shoes | - new shoes | |||
- a coat | - a coat | |||
- a plane ticket | - a plane ticket | |||
is a list item containing a paragraph followed by a nested sublist, | is a list item containing a paragraph followed by a nested sublist, | |||
as all Markdown implementations agree it is (though the paragraph | as all Markdown implementations agree it is (though the paragraph | |||
may be rendered without `<p>` tags, since the list is "tight"), | may be rendered without `<p>` tags, since the list is "tight"), | |||
skipping to change at line 4027 | skipping to change at line 4137 | |||
valid HTML entities in any context are recognized as such and | valid HTML entities in any context are recognized as such and | |||
converted into unicode characters before they are stored in the AST. | converted into unicode characters before they are stored in the AST. | |||
This allows implementations that target HTML output to trivially escape | This allows implementations that target HTML output to trivially escape | |||
the entities when generating HTML, and simplifies the job of | the entities when generating HTML, and simplifies the job of | |||
implementations targeting other languages, as these will only need to | implementations targeting other languages, as these will only need to | |||
handle the unicode chars and need not be HTML-entity aware. | handle the unicode chars and need not be HTML-entity aware. | |||
[Named entities](@name-entities) consist of `&` | [Named entities](@name-entities) consist of `&` | |||
+ any of the valid HTML5 entity names + `;`. The | + any of the valid HTML5 entity names + `;`. The | |||
[following document](http://www.whatwg.org/specs/web-apps/current-work/multipage/entities.json) | [following document](https://html.spec.whatwg.org/multipage/entities.json) | |||
is used as an authoritative source of the valid entity names and their | is used as an authoritative source of the valid entity names and their | |||
corresponding codepoints. | corresponding codepoints. | |||
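
The conversion of named entities described above can be sketched in Python. This is a minimal illustration, not a reference implementation: the function name `decode_named_entities` is hypothetical, Python's standard `html.entities.html5` table is assumed to be an adequate stand-in for the `entities.json` document, and numeric entities are not handled here.

```python
import re
from html.entities import html5  # maps names such as "copy;" to their characters

# A candidate named entity: "&", a name, and a terminating ";".
NAMED_ENTITY = re.compile(r"&([A-Za-z][A-Za-z0-9]*;)")

def decode_named_entities(text: str) -> str:
    """Replace each valid `&name;` with its unicode character(s).

    Anything that is not a known HTML5 entity name is left untouched.
    """
    def replace(match):
        # Look the name up with its trailing ";"; fall back to the raw text.
        return html5.get(match.group(1), match.group(0))
    return NAMED_ENTITY.sub(replace, text)
```

A renderer that targets HTML would then escape `"`, `&`, `<` and `>` again on output, while a renderer for another format can emit the unicode characters directly.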
Conforming implementations that target HTML don't need to generate | Conforming implementations that target HTML don't need to generate | |||
entities for all the valid named entities that exist, with the exception | entities for all the valid named entities that exist, with the exception | |||
of `&quot;` (`"`), `&amp;` (`&`), `&lt;` (`<`) and `&gt;` (`>`), which | of `&quot;` (`"`), `&amp;` (`&`), `&lt;` (`<`) and `&gt;` (`>`), which | |||
always need to be written as entities for security reasons. | always need to be written as entities for security reasons. | |||
. | . | |||
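&nbsp; &amp; &copy; &AElig; &Dcaron; &frac34; &HilbertSpace; &DifferentialD; &ClockwiseContourIntegral; | &nbsp; &amp; &copy; &AElig; &Dcaron; &frac34; &HilbertSpace; &DifferentialD; &ClockwiseContourIntegral; | |||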
skipping to change at line 4145 | skipping to change at line 4255 | |||
<pre><code>f&ouml;f&ouml; | <pre><code>f&ouml;f&ouml; | |||
</code></pre> | </code></pre> | |||
. | . | |||
## Code span | ## Code span | |||
A [backtick string](@backtick-string) | A [backtick string](@backtick-string) | |||
is a string of one or more backtick characters (`` ` ``) that is neither | is a string of one or more backtick characters (`` ` ``) that is neither | |||
preceded nor followed by a backtick. | preceded nor followed by a backtick. | |||
A [code span](@code-span) begins with a backtick string and ends with a backtick | A [code span](@code-span) begins with a backtick string and ends with | |||
string of equal length. The contents of the code span are the | a backtick string of equal length. The contents of the code span are | |||
characters between the two backtick strings, with leading and trailing | the characters between the two backtick strings, with leading and | |||
spaces and newlines removed, and consecutive spaces and newlines | trailing spaces and [line endings](#line-ending) removed, and | |||
collapsed to single spaces. | [whitespace](#whitespace) collapsed to single spaces. | |||
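
The normalization of code span contents described above can be sketched as follows. This is only an illustration: the function name `normalize_code_span` is hypothetical, the input is assumed to be the raw text between the two backtick strings, and whitespace is assumed here to mean space, tab, carriage return, and newline.

```python
import re

def normalize_code_span(raw: str) -> str:
    """Normalize the raw text found between two matching backtick strings."""
    # Remove leading and trailing spaces and line endings.
    stripped = raw.strip(" \r\n")
    # Collapse each interior run of whitespace to a single space.
    # Assumption: whitespace means space, tab, CR, and LF.
    return re.sub(r"[ \t\r\n]+", " ", stripped)
```

Doing this in the parser rather than leaving it to the browser keeps the result the same for non-HTML output formats.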
This is a simple code span: | This is a simple code span: | |||
. | . | |||
`foo` | `foo` | |||
. | . | |||
<p><code>foo</code></p> | <p><code>foo</code></p> | |||
. | . | |||
Here two backticks are used, because the code contains a backtick. | Here two backticks are used, because the code contains a backtick. | |||
skipping to change at line 4177 | skipping to change at line 4287 | |||
This example shows the motivation for stripping leading and trailing | This example shows the motivation for stripping leading and trailing | |||
spaces: | spaces: | |||
. | . | |||
` `` ` | ` `` ` | |||
. | . | |||
<p><code>``</code></p> | <p><code>``</code></p> | |||
. | . | |||
Newlines are treated like spaces: | [Line endings](#line-ending) are treated like spaces: | |||
. | . | |||
`` | `` | |||
foo | foo | |||
`` | `` | |||
. | . | |||
<p><code>foo</code></p> | <p><code>foo</code></p> | |||
. | . | |||
Interior spaces and newlines are collapsed into single spaces, just | Interior spaces and [line endings](#line-ending) are collapsed into | |||
as they would be by a browser: | single spaces, just as they would be by a browser: | |||
. | . | |||
`foo bar | `foo bar | |||
baz` | baz` | |||
. | . | |||
<p><code>foo bar baz</code></p> | <p><code>foo bar baz</code></p> | |||
. | . | |||
Q: Why not just leave the spaces, since browsers will collapse them | Q: Why not just leave the spaces, since browsers will collapse them | |||
anyway? A: Because we might be targeting a non-HTML format, and we | anyway? A: Because we might be targeting a non-HTML format, and we | |||
shouldn't rely on HTML-specific rendering assumptions. | shouldn't rely on HTML-specific rendering assumptions. | |||
(Existing implementations differ in their treatment of internal | (Existing implementations differ in their treatment of internal | |||
spaces and newlines. Some, including `Markdown.pl` and | spaces and [line endings](#line-ending). Some, including `Markdown.pl` and | |||
`showdown`, convert an internal newline into a `<br />` tag. | `showdown`, convert an internal [line ending](#line-ending) into a | |||
But this makes things difficult for those who like to hard-wrap | `<br />` tag. But this makes things difficult for those who like to | |||
their paragraphs, since a line break in the midst of a code | hard-wrap their paragraphs, since a line break in the midst of a code | |||
span will cause an unintended line break in the output. Others | span will cause an unintended line break in the output. Others just | |||
just leave internal spaces as they are, which is fine if only | leave internal spaces as they are, which is fine if only HTML is being | |||
HTML is being targeted.) | targeted.) | |||
. | . | |||
`foo `` bar` | `foo `` bar` | |||
. | . | |||
<p><code>foo `` bar</code></p> | <p><code>foo `` bar</code></p> | |||
. | . | |||
Note that backslash escapes do not work in code spans. All backslashes | Note that backslash escapes do not work in code spans. All backslashes | |||
are treated literally: | are treated literally: | |||
skipping to change at line 4322 | skipping to change at line 4432 | |||
Many implementations have also restricted intraword emphasis to | Many implementations have also restricted intraword emphasis to | |||
the `*` forms, to avoid unwanted emphasis in words containing | the `*` forms, to avoid unwanted emphasis in words containing | |||
internal underscores. (It is best practice to put these in code | internal underscores. (It is best practice to put these in code | |||
spans, but users often do not.) | spans, but users often do not.) | |||
``` markdown | ``` markdown | |||
internal emphasis: foo*bar*baz | internal emphasis: foo*bar*baz | |||
no emphasis: foo_bar_baz | no emphasis: foo_bar_baz | |||
``` | ``` | |||
The following rules capture all of these patterns, while allowing | The rules given below capture all of these patterns, while allowing | |||
for efficient parsing strategies that do not backtrack: | for efficient parsing strategies that do not backtrack. | |||
First, some definitions. A [delimiter run](@delimiter-run) is either | ||||
a sequence of one or more `*` characters that is not preceded or | ||||
followed by a `*` character, or a sequence of one or more `_` | ||||
characters that is not preceded or followed by a `_` character. | ||||
A [left-flanking delimiter run](@right-facing-delimiter-run) is | ||||
a [delimiter run](#delimiter-run) that is (a) not followed by [unicode | ||||
whitespace](#unicode-whitespace), and (b) either not followed by a | ||||
[punctuation character](#punctuation-character), or | ||||
preceded by [unicode whitespace](#unicode-whitespace) or | ||||
a [punctuation character](#punctuation-character). | ||||
A [right-flanking delimiter run](@left-facing-delimiter-run) is | ||||
a [delimiter run](#delimiter-run) that is (a) not preceded by [unicode | ||||
whitespace](#unicode-whitespace), and (b) either not preceded by a | ||||
[punctuation character](#punctuation-character), or | ||||
followed by [unicode whitespace](#unicode-whitespace) or | ||||
a [punctuation character](#punctuation-character). | ||||
Here are some examples of delimiter runs. | ||||
- left-flanking but not right-flanking: | ||||
``` | ||||
***abc | ||||
_abc | ||||
**"abc" | ||||
_"abc" | ||||
``` | ||||
- right-flanking but not left-flanking: | ||||
``` | ||||
abc*** | ||||
abc_ | ||||
"abc"** | ||||
"abc"_ | ||||
``` | ||||
- Both left and right-flanking: | ||||
``` | ||||
abc***def | ||||
"abc"_"def" | ||||
``` | ||||
- Neither left nor right-flanking: | ||||
``` | ||||
abc *** def | ||||
a _ b | ||||
``` | ||||
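
The two flanking definitions can be checked mechanically. The sketch below is an illustration under stated assumptions, not part of the spec: the helper names are hypothetical, the start and end of the text are assumed to count as whitespace, and a punctuation character is assumed to be any ASCII punctuation character or any character in a Unicode `P*` category.

```python
import string
import unicodedata

def is_whitespace(ch: str) -> bool:
    # Assumption: the start or end of the text (ch == "") counts as whitespace.
    return ch == "" or ch.isspace()

def is_punctuation(ch: str) -> bool:
    # Assumption: ASCII punctuation plus the Unicode P* categories.
    return ch != "" and (
        ch in string.punctuation or unicodedata.category(ch).startswith("P")
    )

def flanking(text: str, start: int, end: int) -> tuple:
    """Return (left_flanking, right_flanking) for the run text[start:end]."""
    before = text[start - 1] if start > 0 else ""
    after = text[end] if end < len(text) else ""
    # (a) not followed by whitespace, and (b) either not followed by
    # punctuation, or preceded by whitespace or punctuation.
    left = (not is_whitespace(after)) and (
        not is_punctuation(after) or is_whitespace(before) or is_punctuation(before)
    )
    # The mirror image: swap "preceded" and "followed".
    right = (not is_whitespace(before)) and (
        not is_punctuation(before) or is_whitespace(after) or is_punctuation(after)
    )
    return left, right
```

For instance, `flanking("***abc", 0, 3)` reports a run that is left-flanking only, and `flanking("abc *** def", 4, 7)` reports a run that is neither.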
(The idea of distinguishing left-flanking and right-flanking | ||||
delimiter runs based on the character before and the character | ||||
after comes from Roopesh Chander's | ||||
[vfmd](http://www.vfmd.org/vfmd-spec/specification/#procedure-for-identifying-emphasis-tags). | ||||
vfmd uses the terminology "emphasis indicator string" instead of "delimiter | ||||
run," and its rules for distinguishing left- and right-flanking runs | ||||
are a bit more complex than the ones given here.) | ||||
The following rules define emphasis and strong emphasis: | ||||
1. A single `*` character [can open emphasis](@can-open-emphasis) | 1. A single `*` character [can open emphasis](@can-open-emphasis) | |||
iff it is not followed by | iff it is part of a | |||
whitespace. | [left-flanking delimiter run](#right-facing-delimiter-run). | |||
2. A single `_` character [can open emphasis](#can-open-emphasis) iff | 2. A single `_` character [can open emphasis](#can-open-emphasis) iff | |||
it is not followed by whitespace and it is not preceded by an | it is part of a | |||
ASCII alphanumeric character. | [left-flanking delimiter run](#right-facing-delimiter-run) | |||
and is not preceded by an ASCII alphanumeric character. | ||||
3. A single `*` character [can close emphasis](@can-close-emphasis) | 3. A single `*` character [can close emphasis](@can-close-emphasis) | |||
iff it is not preceded by whitespace. | iff it is part of a | |||
[right-flanking delimiter run](#left-facing-delimiter-run). | ||||
4. A single `_` character [can close emphasis](#can-close-emphasis) iff | 4. A single `_` character [can close emphasis](#can-close-emphasis) | |||
it is not preceded by whitespace and it is not followed by an | iff it is part of a | |||
ASCII alphanumeric character. | [right-flanking delimiter run](#left-facing-delimiter-run) | |||
and it is not followed by an ASCII alphanumeric character. | ||||
5. A double `**` [can open strong emphasis](@can-open-strong-emphasis) | 5. A double `**` [can open strong emphasis](@can-open-strong-emphasis) | |||
iff it is not followed by | iff it is part of a | |||
whitespace. | [left-flanking delimiter run](#right-facing-delimiter-run). | |||
6. A double `__` [can open strong emphasis](#can-open-strong-emphasis) | 6. A double `__` [can open strong emphasis](#can-open-strong-emphasis) | |||
iff it is not followed by whitespace and it is not preceded by an | iff it is part of a | |||
ASCII alphanumeric character. | [left-flanking delimiter run](#right-facing-delimiter-run) | |||
and is not preceded by an ASCII alphanumeric character. | ||||
7. A double `**` [can close strong emphasis](@can-close-strong-emphasis) | 7. A double `**` [can close strong emphasis](@can-close-strong-emphasis) | |||
iff it is not preceded by | iff it is part of a | |||
whitespace. | [right-flanking delimiter run](#left-facing-delimiter-run). | |||
8. A double `__` [can close strong emphasis](#can-close-strong-emphasis) | 8. A double `__` [can close strong emphasis](#can-close-strong-emphasis) | |||
iff it is not preceded by whitespace and it is not followed by an | iff it is part of a | |||
ASCII alphanumeric character. | [right-flanking delimiter run](#left-facing-delimiter-run) | |||
and is not followed by an ASCII alphanumeric character. | ||||
9. Emphasis begins with a delimiter that [can open | 9. Emphasis begins with a delimiter that [can open | |||
emphasis](#can-open-emphasis) and ends with a delimiter that [can close | emphasis](#can-open-emphasis) and ends with a delimiter that [can close | |||
emphasis](#can-close-emphasis), and that uses the same | emphasis](#can-close-emphasis), and that uses the same | |||
character (`_` or `*`) as the opening delimiter. There must | character (`_` or `*`) as the opening delimiter. There must | |||
be a nonempty sequence of inlines between the open delimiter | be a nonempty sequence of inlines between the open delimiter | |||
and the closing delimiter; these form the contents of the emphasis | and the closing delimiter; these form the contents of the emphasis | |||
inline. | inline. | |||
10. Strong emphasis begins with a delimiter that [can open strong | 10. Strong emphasis begins with a delimiter that [can open strong | |||
skipping to change at line 4422 | skipping to change at line 4600 | |||
Rule 1: | Rule 1: | |||
. | . | |||
*foo bar* | *foo bar* | |||
. | . | |||
<p><em>foo bar</em></p> | <p><em>foo bar</em></p> | |||
. | . | |||
This is not emphasis, because the opening `*` is followed by | This is not emphasis, because the opening `*` is followed by | |||
whitespace: | whitespace, and hence not part of a [left-flanking delimiter | |||
run](#right-facing-delimiter-run): | ||||
. | . | |||
a * foo bar* | a * foo bar* | |||
. | . | |||
<p>a * foo bar*</p> | <p>a * foo bar*</p> | |||
. | . | |||
This is not emphasis, because the opening `*` is preceded | ||||
by an alphanumeric and followed by punctuation, and hence | ||||
not part of a [left-flanking delimiter run](#right-facing-delimiter-run): | ||||
. | ||||
a*"foo"* | ||||
. | ||||
<p>a*"foo"*</p> | ||||
. | ||||
Unicode nonbreaking spaces count as whitespace, too: | ||||
. | ||||
* a * | ||||
. | ||||
<p>* a *</p> | ||||
. | ||||
Intraword emphasis with `*` is permitted: | Intraword emphasis with `*` is permitted: | |||
. | . | |||
foo*bar* | foo*bar* | |||
. | . | |||
<p>foo<em>bar</em></p> | <p>foo<em>bar</em></p> | |||
. | . | |||
. | . | |||
5*6*78 | 5*6*78 | |||
skipping to change at line 4461 | skipping to change at line 4658 | |||
This is not emphasis, because the opening `_` is followed by | This is not emphasis, because the opening `_` is followed by | |||
whitespace: | whitespace: | |||
. | . | |||
_ foo bar_ | _ foo bar_ | |||
. | . | |||
<p>_ foo bar_</p> | <p>_ foo bar_</p> | |||
. | . | |||
This is not emphasis, because the opening `_` is preceded | ||||
by an alphanumeric and followed by punctuation: | ||||
. | ||||
a_"foo"_ | ||||
. | ||||
<p>a_"foo"_</p> | ||||
. | ||||
Emphasis with `_` is not allowed inside ASCII words: | Emphasis with `_` is not allowed inside ASCII words: | |||
. | . | |||
foo_bar_ | foo_bar_ | |||
. | . | |||
<p>foo_bar_</p> | <p>foo_bar_</p> | |||
. | . | |||
. | . | |||
5_6_78 | 5_6_78 | |||
skipping to change at line 4485 | skipping to change at line 4691 | |||
But it is permitted inside non-ASCII words: | But it is permitted inside non-ASCII words: | |||
. | . | |||
пристаням_стремятся_ | пристаням_стремятся_ | |||
. | . | |||
<p>пристаням<em>стремятся</em></p> | <p>пристаням<em>стремятся</em></p> | |||
. | . | |||
Rule 3: | Rule 3: | |||
This is not emphasis, because the closing delimiter does | ||||
not match the opening delimiter: | ||||
. | ||||
_foo* | ||||
. | ||||
<p>_foo*</p> | ||||
. | ||||
This is not emphasis, because the closing `*` is preceded by | This is not emphasis, because the closing `*` is preceded by | |||
whitespace: | whitespace: | |||
. | . | |||
*foo bar * | *foo bar * | |||
. | . | |||
<p>*foo bar *</p> | <p>*foo bar *</p> | |||
. | . | |||
This is not emphasis, because the second `*` is | ||||
preceded by punctuation and followed by an alphanumeric | ||||
(hence it is not part of a [right-flanking delimiter | ||||
run](#left-facing-delimiter-run)): | ||||
. | ||||
*(*foo) | ||||
. | ||||
<p>*(*foo)</p> | ||||
. | ||||
The point of this restriction is more easily appreciated | ||||
with this example: | ||||
. | ||||
*(*foo*)* | ||||
. | ||||
<p><em>(<em>foo</em>)</em></p> | ||||
. | ||||
Intraword emphasis with `*` is allowed: | Intraword emphasis with `*` is allowed: | |||
. | . | |||
*foo*bar | *foo*bar | |||
. | . | |||
<p><em>foo</em>bar</p> | <p><em>foo</em>bar</p> | |||
. | . | |||
Rule 4: | Rule 4: | |||
This is not emphasis, because the closing `_` is preceded by | This is not emphasis, because the closing `_` is preceded by | |||
whitespace: | whitespace: | |||
. | . | |||
_foo bar _ | _foo bar _ | |||
. | . | |||
<p>_foo bar _</p> | <p>_foo bar _</p> | |||
. | . | |||
Intraword emphasis: | This is not emphasis, because the second `_` is | |||
preceded by punctuation and followed by an alphanumeric: | ||||
. | ||||
_(_foo) | ||||
. | ||||
<p>_(_foo)</p> | ||||
. | ||||
This is emphasis within emphasis: | ||||
. | ||||
_(_foo_)_ | ||||
. | ||||
<p><em>(<em>foo</em>)</em></p> | ||||
. | ||||
Intraword emphasis is disallowed for `_`: | ||||
. | . | |||
_foo_bar | _foo_bar | |||
. | . | |||
<p>_foo_bar</p> | <p>_foo_bar</p> | |||
. | . | |||
. | . | |||
_пристаням_стремятся | _пристаням_стремятся | |||
. | . | |||
skipping to change at line 4550 | skipping to change at line 4802 | |||
This is not strong emphasis, because the opening delimiter is | This is not strong emphasis, because the opening delimiter is | |||
followed by whitespace: | followed by whitespace: | |||
. | . | |||
** foo bar** | ** foo bar** | |||
. | . | |||
<p>** foo bar**</p> | <p>** foo bar**</p> | |||
. | . | |||
This is not strong emphasis, because the opening `**` is preceded | ||||
by an alphanumeric and followed by punctuation, and hence | ||||
not part of a [left-flanking delimiter run](#right-facing-delimiter-run): | ||||
. | ||||
a**"foo"** | ||||
. | ||||
<p>a**"foo"**</p> | ||||
. | ||||
Intraword strong emphasis with `**` is permitted: | Intraword strong emphasis with `**` is permitted: | |||
. | . | |||
foo**bar** | foo**bar** | |||
. | . | |||
<p>foo<strong>bar</strong></p> | <p>foo<strong>bar</strong></p> | |||
. | . | |||
Rule 6: | Rule 6: | |||
skipping to change at line 4575 | skipping to change at line 4837 | |||
This is not strong emphasis, because the opening delimiter is | This is not strong emphasis, because the opening delimiter is | |||
followed by whitespace: | followed by whitespace: | |||
. | . | |||
__ foo bar__ | __ foo bar__ | |||
. | . | |||
<p>__ foo bar__</p> | <p>__ foo bar__</p> | |||
. | . | |||
Intraword emphasis examples: | This is not strong emphasis, because the opening `__` is preceded | |||
by an alphanumeric and followed by punctuation: | ||||
. | ||||
a__"foo"__ | ||||
. | ||||
<p>a__"foo"__</p> | ||||
. | ||||
Intraword strong emphasis is forbidden with `__`: | ||||
. | . | |||
foo__bar__ | foo__bar__ | |||
. | . | |||
<p>foo__bar__</p> | <p>foo__bar__</p> | |||
. | . | |||
. | . | |||
5__6__78 | 5__6__78 | |||
. | . | |||
skipping to change at line 4615 | skipping to change at line 4886 | |||
. | . | |||
**foo bar ** | **foo bar ** | |||
. | . | |||
<p>**foo bar **</p> | <p>**foo bar **</p> | |||
. | . | |||
(Nor can it be interpreted as an emphasized `*foo bar *`, because of | (Nor can it be interpreted as an emphasized `*foo bar *`, because of | |||
Rule 11.) | Rule 11.) | |||
This is not strong emphasis, because the second `**` is | ||||
preceded by punctuation and followed by an alphanumeric: | ||||
. | ||||
**(**foo) | ||||
. | ||||
<p>**(**foo)</p> | ||||
. | ||||
The point of this restriction is more easily appreciated | ||||
with these examples: | ||||
. | ||||
*(**foo**)* | ||||
. | ||||
<p><em>(<strong>foo</strong>)</em></p> | ||||
. | ||||
. | ||||
**Gomphocarpus (*Gomphocarpus physocarpus*, syn. | ||||
*Asclepias physocarpa*)** | ||||
. | ||||
<p><strong>Gomphocarpus (<em>Gomphocarpus physocarpus</em>, syn. | ||||
<em>Asclepias physocarpa</em>)</strong></p> | ||||
. | ||||
. | ||||
**foo "*bar*" foo** | ||||
. | ||||
<p><strong>foo "<em>bar</em>" foo</strong></p> | ||||
. | ||||
Intraword emphasis: | Intraword emphasis: | |||
. | . | |||
**foo**bar | **foo**bar | |||
. | . | |||
<p><strong>foo</strong>bar</p> | <p><strong>foo</strong>bar</p> | |||
. | . | |||
Rule 8: | Rule 8: | |||
This is not strong emphasis, because the closing delimiter is | This is not strong emphasis, because the closing delimiter is | |||
preceded by whitespace: | preceded by whitespace: | |||
. | . | |||
__foo bar __ | __foo bar __ | |||
. | . | |||
<p>__foo bar __</p> | <p>__foo bar __</p> | |||
. | . | |||
Intraword strong emphasis examples: | This is not strong emphasis, because the second `__` is | |||
preceded by punctuation and followed by an alphanumeric: | ||||
. | ||||
__(__foo) | ||||
. | ||||
<p>__(__foo)</p> | ||||
. | ||||
The point of this restriction is more easily appreciated | ||||
with this example: | ||||
. | ||||
_(__foo__)_ | ||||
. | ||||
<p><em>(<strong>foo</strong>)</em></p> | ||||
. | ||||
Intraword strong emphasis is forbidden with `__`: | ||||
. | . | |||
__foo__bar | __foo__bar | |||
. | . | |||
<p>__foo__bar</p> | <p>__foo__bar</p> | |||
. | . | |||
. | . | |||
__пристаням__стремятся | __пристаням__стремятся | |||
. | . | |||
skipping to change at line 5182 | skipping to change at line 5503 | |||
. | . | |||
__a<http://foo.bar?q=__> | __a<http://foo.bar?q=__> | |||
. | . | |||
<p>__a<a href="http://foo.bar?q=__">http://foo.bar?q=__</a></p> | <p>__a<a href="http://foo.bar?q=__">http://foo.bar?q=__</a></p> | |||
. | . | |||
## Links | ## Links | |||
A link contains [link text](#link-text) (the visible text), | A link contains [link text](#link-text) (the visible text), | |||
a [destination](#destination) (the URI that is the link destination), | a [link destination](#link-destination) (the URI that is the link destination), | |||
and optionally a [link title](#link-title). There are two basic kinds | and optionally a [link title](#link-title). There are two basic kinds | |||
of links in Markdown. In [inline links](#inline-links) the destination | of links in Markdown. In [inline links](#inline-link) the destination | |||
and title are given immediately after the link text. In [reference | and title are given immediately after the link text. In [reference | |||
links](#reference-links) the destination and title are defined elsewhere | links](#reference-link) the destination and title are defined elsewhere | |||
in the document. | in the document. | |||
A [link text](@link-text) consists of a sequence of zero or more | A [link text](@link-text) consists of a sequence of zero or more | |||
inline elements enclosed by square brackets (`[` and `]`). The | inline elements enclosed by square brackets (`[` and `]`). The | |||
following rules apply: | following rules apply: | |||
- Links may not contain other links, at any level of nesting. | - Links may not contain other links, at any level of nesting. | |||
- Brackets are allowed in the [link text](#link-text) only if (a) they | - Brackets are allowed in the [link text](#link-text) only if (a) they | |||
are backslash-escaped or (b) they appear as a matched pair of brackets, | are backslash-escaped or (b) they appear as a matched pair of brackets, | |||
skipping to change at line 5237 | skipping to change at line 5558 | |||
- a sequence of zero or more characters between straight single-quote | - a sequence of zero or more characters between straight single-quote | |||
characters (`'`), including a `'` character only if it is | characters (`'`), including a `'` character only if it is | |||
backslash-escaped, or | backslash-escaped, or | |||
- a sequence of zero or more characters between matching parentheses | - a sequence of zero or more characters between matching parentheses | |||
(`(...)`), including a `)` character only if it is backslash-escaped. | (`(...)`), including a `)` character only if it is backslash-escaped. | |||
An [inline link](@inline-link) | An [inline link](@inline-link) | |||
consists of a [link text](#link-text) followed immediately | consists of a [link text](#link-text) followed immediately | |||
by a left parenthesis `(`, optional whitespace, | by a left parenthesis `(`, optional [whitespace](#whitespace), | |||
an optional [link destination](#link-destination), | an optional [link destination](#link-destination), | |||
an optional [link title](#link-title) separated from the link | an optional [link title](#link-title) separated from the link | |||
destination by whitespace, optional whitespace, and a right | destination by [whitespace](#whitespace), optional | |||
parenthesis `)`. The link's text consists of the inlines contained | [whitespace](#whitespace), and a right parenthesis `)`. | |||
The link's text consists of the inlines contained | ||||
in the [link text](#link-text) (excluding the enclosing square brackets). | in the [link text](#link-text) (excluding the enclosing square brackets). | |||
The link's URI consists of the link destination, excluding enclosing | The link's URI consists of the link destination, excluding enclosing | |||
`<...>` if present, with backslash-escapes in effect as described | `<...>` if present, with backslash-escapes in effect as described | |||
above. The link's title consists of the link title, excluding its | above. The link's title consists of the link title, excluding its | |||
enclosing delimiters, with backslash-escapes in effect as described | enclosing delimiters, with backslash-escapes in effect as described | |||
above. | above. | |||
Here is a simple inline link: | Here is a simple inline link: | |||
. | . | |||
skipping to change at line 5413 | skipping to change at line 5735 | |||
entities, or using a different quote type for the enclosing title---to | entities, or using a different quote type for the enclosing title---to | |||
write titles containing double quotes. `Markdown.pl`'s handling of | write titles containing double quotes. `Markdown.pl`'s handling of | |||
titles has a number of other strange features. For example, it allows | titles has a number of other strange features. For example, it allows | |||
single-quoted titles in inline links, but not reference links. And, in | single-quoted titles in inline links, but not reference links. And, in | |||
reference links but not inline links, it allows a title to begin with | reference links but not inline links, it allows a title to begin with | |||
`"` and end with `)`. `Markdown.pl` 1.0.1 even allows titles with no closing | `"` and end with `)`. `Markdown.pl` 1.0.1 even allows titles with no closing | |||
quotation mark, though 1.0.2b8 does not. It seems preferable to adopt | quotation mark, though 1.0.2b8 does not. It seems preferable to adopt | |||
a simple, rational rule that works the same way in inline links and | a simple, rational rule that works the same way in inline links and | |||
link reference definitions.) | link reference definitions.) | |||
Whitespace is allowed around the destination and title: | [Whitespace](#whitespace) is allowed around the destination and title: | |||
. | . | |||
[link]( /uri | [link]( /uri | |||
"title" ) | "title" ) | |||
. | . | |||
<p><a href="/uri" title="title">link</a></p> | <p><a href="/uri" title="title">link</a></p> | |||
. | . | |||
But it is not allowed between the link text and the | But it is not allowed between the link text and the | |||
following parenthesis: | following parenthesis: | |||
skipping to change at line 5486 | skipping to change at line 5808 | |||
. | . | |||
<p>[foo <a href="/uri">bar</a>](/uri)</p> | <p>[foo <a href="/uri">bar</a>](/uri)</p> | |||
. | . | |||
. | . | |||
[foo *[bar [baz](/uri)](/uri)*](/uri) | [foo *[bar [baz](/uri)](/uri)*](/uri) | |||
. | . | |||
<p>[foo <em>[bar <a href="/uri">baz</a>](/uri)</em>](/uri)</p> | <p>[foo <em>[bar <a href="/uri">baz</a>](/uri)</em>](/uri)</p> | |||
. | . | |||
. | ||||
![[[foo](uri1)](uri2)](uri3) | ||||
. | ||||
<p><img src="uri3" alt="[foo](uri2)" /></p> | ||||
. | ||||
These cases illustrate the precedence of link text grouping over | These cases illustrate the precedence of link text grouping over | |||
emphasis grouping: | emphasis grouping: | |||
. | . | |||
*[foo*](/uri) | *[foo*](/uri) | |||
. | . | |||
<p>*<a href="/uri">foo*</a></p> | <p>*<a href="/uri">foo*</a></p> | |||
. | . | |||
. | . | |||
skipping to change at line 5527 | skipping to change at line 5855 | |||
[foo<http://example.com?search=](uri)> | [foo<http://example.com?search=](uri)> | |||
. | . | |||
<p>[foo<a href="http://example.com?search=%5D(uri)">http://example.com?search=](uri)</a></p> | <p>[foo<a href="http://example.com?search=%5D(uri)">http://example.com?search=](uri)</a></p> | |||
. | . | |||
There are three kinds of [reference links](@reference-link): | There are three kinds of [reference links](@reference-link): | |||
[full](#full-reference-link), [collapsed](#collapsed-reference-link), | [full](#full-reference-link), [collapsed](#collapsed-reference-link), | |||
and [shortcut](#shortcut-reference-link). | and [shortcut](#shortcut-reference-link). | |||
A [full reference link](@full-reference-link) | A [full reference link](@full-reference-link) | |||
consists of a [link text](#link-text), optional whitespace, and | consists of a [link text](#link-text), | |||
optional [whitespace](#whitespace), and | ||||
a [link label](#link-label) that [matches](#matches) a | a [link label](#link-label) that [matches](#matches) a | |||
[link reference definition](#link-reference-definition) elsewhere in the | [link reference definition](#link-reference-definition) elsewhere in the | |||
document. | document. | |||
A [link label](@link-label) begins with a left bracket (`[`) and ends | A [link label](@link-label) begins with a left bracket (`[`) and ends | |||
with the first right bracket (`]`) that is not backslash-escaped. | with the first right bracket (`]`) that is not backslash-escaped. | |||
Unescaped square bracket characters are not allowed in | Unescaped square bracket characters are not allowed in | |||
[link labels](#link-label). A link label can have at most 999 | [link labels](#link-label). A link label can have at most 999 | |||
characters inside the square brackets. | characters inside the square brackets. | |||
One label [matches](@matches) | One label [matches](@matches) | |||
another just in case their normalized forms are equal. To normalize a | another just in case their normalized forms are equal. To normalize a | |||
label, perform the *unicode case fold* and collapse consecutive internal | label, perform the *unicode case fold* and collapse consecutive internal | |||
whitespace to a single space. If there are multiple matching reference | [whitespace](#whitespace) to a single space. If there are multiple | |||
link definitions, the one that comes first in the document is used. (It | matching reference link definitions, the one that comes first in the | |||
is desirable in such cases to emit a warning.) | document is used. (It is desirable in such cases to emit a warning.) | |||
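
Label matching as described above can be sketched as follows. The names `normalize_label` and `build_reference_map` are hypothetical, the definitions are assumed to arrive as `(label, uri)` pairs in document order, and Python's `str.casefold` is assumed to be an acceptable implementation of the unicode case fold.

```python
import re

def normalize_label(label: str) -> str:
    """Collapse runs of whitespace to one space and apply the unicode case fold."""
    return re.sub(r"\s+", " ", label).casefold()

def build_reference_map(definitions):
    """Map normalized labels to URIs; the first matching definition wins."""
    refs = {}
    for label, uri in definitions:
        # setdefault keeps the earliest definition; a real implementation
        # might emit a warning here when the label is already present.
        refs.setdefault(normalize_label(label), uri)
    return refs
```

Two labels then match when `normalize_label` returns the same string for both.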
The contents of the first link label are parsed as inlines, which are | The contents of the first link label are parsed as inlines, which are | |||
used as the link's text. The link's URI and title are provided by the | used as the link's text. The link's URI and title are provided by the | |||
matching [link reference definition](#link-reference-definition). | matching [link reference definition](#link-reference-definition). | |||
Here is a simple example: | Here is a simple example: | |||
. | . | |||
[foo][bar] | [foo][bar] | |||
skipping to change at line 5687 | skipping to change at line 6016 | |||
Unicode case fold is used: | Unicode case fold is used: | |||
. | . | |||
[Толпой][Толпой] is a Russian word. | [Толпой][Толпой] is a Russian word. | |||
[ТОЛПОЙ]: /url | [ТОЛПОЙ]: /url | |||
. | . | |||
<p><a href="/url">Толпой</a> is a Russian word.</p> | <p><a href="/url">Толпой</a> is a Russian word.</p> | |||
. | . | |||
Consecutive internal whitespace is treated as one space for | Consecutive internal [whitespace](#whitespace) is treated as one space for | |||
purposes of determining matching: | purposes of determining matching: | |||
. | . | |||
[Foo | [Foo | |||
bar]: /url | bar]: /url | |||
[Baz][Foo bar] | [Baz][Foo bar] | |||
. | . | |||
<p><a href="/url">Baz</a></p> | <p><a href="/url">Baz</a></p> | |||
. | . | |||
There can be whitespace between the [link text](#link-text) and the | There can be [whitespace](#whitespace) between the | |||
[link label](#link-label): | [link text](#link-text) and the [link label](#link-label): | |||
. | . | |||
[foo] [bar] | [foo] [bar] | |||
[bar]: /url "title" | [bar]: /url "title" | |||
. | . | |||
<p><a href="/url" title="title">foo</a></p> | <p><a href="/url" title="title">foo</a></p> | |||
. | . | |||
. | . | |||
skipping to change at line 5786 | skipping to change at line 6115 | |||
[ref\[]: /uri | [ref\[]: /uri | |||
. | . | |||
<p><a href="/uri">foo</a></p> | <p><a href="/uri">foo</a></p> | |||
. | . | |||
A [collapsed reference link](@collapsed-reference-link) | A [collapsed reference link](@collapsed-reference-link) | |||
consists of a [link | consists of a [link | |||
label](#link-label) that [matches](#matches) a [link reference | label](#link-label) that [matches](#matches) a [link reference | |||
definition](#link-reference-definition) elsewhere in the | definition](#link-reference-definition) elsewhere in the | |||
document, optional whitespace, and the string `[]`. The contents of the | document, optional [whitespace](#whitespace), and the string `[]`. | |||
first link label are parsed as inlines, which are used as the link's | The contents of the first link label are parsed as inlines, | |||
text. The link's URI and title are provided by the matching reference | which are used as the link's text. The link's URI and title are | |||
link definition. Thus, `[foo][]` is equivalent to `[foo][foo]`. | provided by the matching reference link definition. Thus, | |||
`[foo][]` is equivalent to `[foo][foo]`. | ||||
. | . | |||
[foo][] | [foo][] | |||
[foo]: /url "title" | [foo]: /url "title" | |||
. | . | |||
<p><a href="/url" title="title">foo</a></p> | <p><a href="/url" title="title">foo</a></p> | |||
. | . | |||
. | . | |||
skipping to change at line 5817 | skipping to change at line 6147 | |||
The link labels are case-insensitive: | The link labels are case-insensitive: | |||
. | . | |||
[Foo][] | [Foo][] | |||
[foo]: /url "title" | [foo]: /url "title" | |||
. | . | |||
<p><a href="/url" title="title">Foo</a></p> | <p><a href="/url" title="title">Foo</a></p> | |||
. | . | |||
As with full reference links, whitespace is allowed | As with full reference links, [whitespace](#whitespace) is allowed | |||
between the two sets of brackets: | between the two sets of brackets: | |||
. | . | |||
[foo] | [foo] | |||
[] | [] | |||
[foo]: /url "title" | [foo]: /url "title" | |||
. | . | |||
<p><a href="/url" title="title">foo</a></p> | <p><a href="/url" title="title">foo</a></p> | |||
. | . | |||
skipping to change at line 6092 | skipping to change at line 6422 | |||
The labels are case-insensitive: | The labels are case-insensitive: | |||
. | . | |||
![Foo][] | ![Foo][] | |||
[foo]: /url "title" | [foo]: /url "title" | |||
. | . | |||
<p><img src="/url" alt="Foo" title="title" /></p> | <p><img src="/url" alt="Foo" title="title" /></p> | |||
. | . | |||
As with full reference links, whitespace is allowed | As with full reference links, [whitespace](#whitespace) is allowed | |||
between the two sets of brackets: | between the two sets of brackets: | |||
. | . | |||
![foo] | ![foo] | |||
[] | [] | |||
[foo]: /url "title" | [foo]: /url "title" | |||
. | . | |||
<p><img src="/url" alt="foo" title="title" /></p> | <p><img src="/url" alt="foo" title="title" /></p> | |||
. | . | |||
skipping to change at line 6178 | skipping to change at line 6508 | |||
They are parsed as links, with the URL or email address as the link | They are parsed as links, with the URL or email address as the link | |||
label. | label. | |||
A [URI autolink](@uri-autolink) | A [URI autolink](@uri-autolink) | |||
consists of `<`, followed by an [absolute | consists of `<`, followed by an [absolute | |||
URI](#absolute-uri) not containing `<`, followed by `>`. It is parsed | URI](#absolute-uri) not containing `<`, followed by `>`. It is parsed | |||
as a link to the URI, with the URI as the link's label. | as a link to the URI, with the URI as the link's label. | |||
An [absolute URI](@absolute-uri), | An [absolute URI](@absolute-uri), | |||
for these purposes, consists of a [scheme](#scheme) followed by a colon (`:`) | for these purposes, consists of a [scheme](#scheme) followed by a colon (`:`) | |||
followed by zero or more characters other than ASCII whitespace and | followed by zero or more characters other than ASCII | |||
control characters, `<`, and `>`. If the URI includes these characters, | [whitespace](#whitespace) and control characters, `<`, and `>`. If | |||
you must use percent-encoding (e.g. `%20` for a space). | the URI includes these characters, you must use percent-encoding | |||
(e.g. `%20` for a space). | ||||
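
A rough Python sketch of the URI autolink check follows. It makes several assumptions: `KNOWN_SCHEMES` is a small hypothetical subset of the scheme list given below, the scheme is recognized by the general RFC 3986 shape before being looked up, and the function `uri_autolink` only returns the URI without escaping or rendering it.

```python
import re
from typing import Optional

# Assumption: a small subset of the recognized schemes, for illustration.
KNOWN_SCHEMES = {"http", "https", "ftp", "mailto", "file"}

# "<", scheme, ":", then zero or more characters that are not ASCII
# whitespace, control characters, "<" or ">", then ">".
_URI_AUTOLINK = re.compile(r"<([A-Za-z][A-Za-z0-9.+-]*):([^\x00-\x20\x7f<>]*)>")


def uri_autolink(text: str) -> Optional[str]:
    """Return the URI if `text` is a URI autolink, else None."""
    m = _URI_AUTOLINK.fullmatch(text)
    if m is None:
        return None
    # Schemes are compared case-insensitively.
    if m.group(1).lower() not in KNOWN_SCHEMES:
        return None
    return text[1:-1]
```

For example, `uri_autolink("<http://foo.bar.baz>")` yields the URI, while a URI containing a literal space is rejected and would need `%20` instead.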
The following [schemes](@scheme) | The following [schemes](@scheme) | |||
are recognized (case-insensitive): | are recognized (case-insensitive): | |||
`coap`, `doi`, `javascript`, `aaa`, `aaas`, `about`, `acap`, `cap`, | `coap`, `doi`, `javascript`, `aaa`, `aaas`, `about`, `acap`, `cap`, | |||
`cid`, `crid`, `data`, `dav`, `dict`, `dns`, `file`, `ftp`, `geo`, `go`, | `cid`, `crid`, `data`, `dav`, `dict`, `dns`, `file`, `ftp`, `geo`, `go`, | |||
`gopher`, `h323`, `http`, `https`, `iax`, `icap`, `im`, `imap`, `info`, | `gopher`, `h323`, `http`, `https`, `iax`, `icap`, `im`, `imap`, `info`, | |||
`ipp`, `iris`, `iris.beep`, `iris.xpc`, `iris.xpcs`, `iris.lwz`, `ldap`, | `ipp`, `iris`, `iris.beep`, `iris.xpc`, `iris.xpcs`, `iris.lwz`, `ldap`, | |||
`mailto`, `mid`, `msrp`, `msrps`, `mtqp`, `mupdate`, `news`, `nfs`, | `mailto`, `mid`, `msrp`, `msrps`, `mtqp`, `mupdate`, `news`, `nfs`, | |||
`ni`, `nih`, `nntp`, `opaquelocktoken`, `pop`, `pres`, `rtsp`, | `ni`, `nih`, `nntp`, `opaquelocktoken`, `pop`, `pres`, `rtsp`, | |||
`service`, `session`, `shttp`, `sieve`, `sip`, `sips`, `sms`, `snmp`,` | `service`, `session`, `shttp`, `sieve`, `sip`, `sips`, `sms`, `snmp`,` | |||
skipping to change at line 6252 | skipping to change at line 6583 | |||
. | . | |||
An [email autolink](@email-autolink) | An [email autolink](@email-autolink) | |||
consists of `<`, followed by an [email address](#email-address), | consists of `<`, followed by an [email address](#email-address), | |||
followed by `>`. The link's label is the email address, | followed by `>`. The link's label is the email address, | |||
and the URL is `mailto:` followed by the email address. | and the URL is `mailto:` followed by the email address. | |||
An [email address](@email-address), | An [email address](@email-address), | |||
for these purposes, is anything that matches | for these purposes, is anything that matches | |||
the [non-normative regex from the HTML5 | the [non-normative regex from the HTML5 | |||
spec](http://www.whatwg.org/specs/web-apps/current-work/multipage/forms.html#e-mail-state-%28type=email%29): | spec](https://html.spec.whatwg.org/multipage/forms.html#e-mail-state-(type=email)): | |||
/^[a-zA-Z0-9.!#$%&'*+/=?^_`{|}~-]+@[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])? | /^[a-zA-Z0-9.!#$%&'*+/=?^_`{|}~-]+@[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])? | |||
(?:\.[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?)*$/ | (?:\.[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?)*$/ | |||
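
As a hedged illustration, the regex above can be applied in Python roughly as follows. The function name `email_autolink` is hypothetical, the anchors are replaced by `fullmatch`, and the HTML escaping via `html.escape` is an assumption about rendering rather than something stated here.

```python
import html
import re
from typing import Optional

# The HTML5 e-mail pattern quoted above, without the ^ and $ anchors
# because fullmatch() already requires the whole string to match.
EMAIL_RE = re.compile(
    r"[a-zA-Z0-9.!#$%&'*+/=?^_`{|}~-]+@[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?"
    r"(?:\.[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?)*"
)


def email_autolink(text: str) -> Optional[str]:
    """Render `<address>` as a mailto link, or return None if it is not one."""
    if len(text) < 3 or not (text.startswith("<") and text.endswith(">")):
        return None
    address = text[1:-1]
    if EMAIL_RE.fullmatch(address) is None:
        return None
    escaped = html.escape(address, quote=True)
    return '<a href="mailto:{0}">{0}</a>'.format(escaped)
```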
Examples of email autolinks: | Examples of email autolinks: | |||
. | . | |||
<foo@bar.example.com> | <foo@bar.example.com> | |||
. | . | |||
<p><a href="mailto:foo@bar.example.com">foo@bar.example.com</a></p> | <p><a href="mailto:foo@bar.example.com">foo@bar.example.com</a></p> | |||
skipping to change at line 6327 | skipping to change at line 6658 | |||
Text between `<` and `>` that looks like an HTML tag is parsed as a | Text between `<` and `>` that looks like an HTML tag is parsed as a | |||
raw HTML tag and will be rendered in HTML without escaping. | raw HTML tag and will be rendered in HTML without escaping. | |||
Tag and attribute names are not limited to current HTML tags, | Tag and attribute names are not limited to current HTML tags, | |||
so custom tags (and even, say, DocBook tags) may be used. | so custom tags (and even, say, DocBook tags) may be used. | |||
Here is the grammar for tags: | Here is the grammar for tags: | |||
A [tag name](@tag-name) consists of an ASCII letter | A [tag name](@tag-name) consists of an ASCII letter | |||
followed by zero or more ASCII letters or digits. | followed by zero or more ASCII letters or digits. | |||
An [attribute](@attribute) consists of whitespace, | An [attribute](@attribute) consists of [whitespace](#whitespace), | |||
an [attribute name](#attribute-name), and an optional | an [attribute name](#attribute-name), and an optional | |||
[attribute value specification](#attribute-value-specification). | [attribute value specification](#attribute-value-specification). | |||
An [attribute name](@attribute-name) | An [attribute name](@attribute-name) | |||
consists of an ASCII letter, `_`, or `:`, followed by zero or more ASCII | consists of an ASCII letter, `_`, or `:`, followed by zero or more ASCII | |||
letters, digits, `_`, `.`, `:`, or `-`. (Note: This is the XML | letters, digits, `_`, `.`, `:`, or `-`. (Note: This is the XML | |||
specification restricted to ASCII. HTML5 is laxer.) | specification restricted to ASCII. HTML5 is laxer.) | |||
An [attribute value specification](@attribute-value-specification) | An [attribute value specification](@attribute-value-specification) | |||
consists of optional whitespace, | consists of optional [whitespace](#whitespace), | |||
a `=` character, optional whitespace, and an [attribute | a `=` character, optional [whitespace](#whitespace), and an [attribute | |||
value](#attribute-value). | value](#attribute-value). | |||
An [attribute value](@attribute-value) | An [attribute value](@attribute-value) | |||
consists of an [unquoted attribute value](#unquoted-attribute-value), | consists of an [unquoted attribute value](#unquoted-attribute-value), | |||
a [single-quoted attribute value](#single-quoted-attribute-value), | a [single-quoted attribute value](#single-quoted-attribute-value), | |||
or a [double-quoted attribute value](#double-quoted-attribute-value). | or a [double-quoted attribute value](#double-quoted-attribute-value). | |||
An [unquoted attribute value](@unquoted-attribute-value) | An [unquoted attribute value](@unquoted-attribute-value) | |||
is a nonempty string of characters not | is a nonempty string of characters not | |||
including spaces, `"`, `'`, `=`, `<`, `>`, or `` ` ``. | including spaces, `"`, `'`, `=`, `<`, `>`, or `` ` ``. | |||
skipping to change at line 6360 | skipping to change at line 6691 | |||
A [single-quoted attribute value](@single-quoted-attribute-value) | A [single-quoted attribute value](@single-quoted-attribute-value) | |||
consists of `'`, zero or more | consists of `'`, zero or more | |||
characters not including `'`, and a final `'`. | characters not including `'`, and a final `'`. | |||
A [double-quoted attribute value](@double-quoted-attribute-value) | A [double-quoted attribute value](@double-quoted-attribute-value) | |||
consists of `"`, zero or more | consists of `"`, zero or more | |||
characters not including `"`, and a final `"`. | characters not including `"`, and a final `"`. | |||
An [open tag](@open-tag) consists of a `<` character, | An [open tag](@open-tag) consists of a `<` character, | |||
a [tag name](#tag-name), zero or more [attributes](#attribute), | a [tag name](#tag-name), zero or more [attributes](#attribute), | |||
optional whitespace, an optional `/` character, and a `>` character. | optional [whitespace](#whitespace), an optional `/` character, and a | |||
`>` character. | ||||
A [closing tag](@closing-tag) consists of the | A [closing tag](@closing-tag) consists of the | |||
string `</`, a [tag name](#tag-name), optional whitespace, and the | string `</`, a [tag name](#tag-name), optional | |||
character `>`. | [whitespace](#whitespace), and the character `>`. | |||
An [HTML comment](@html-comment) consists of the | An [HTML comment](@html-comment) consists of the | |||
string `<!--`, a string of characters not including the string `--`, and | string `<!--`, a string of characters not including the string `--`, and | |||
the string `-->`. | the string `-->`. | |||
A [processing instruction](@processing-instruction) | A [processing instruction](@processing-instruction) | |||
consists of the string `<?`, a string | consists of the string `<?`, a string | |||
of characters not including the string `?>`, and the string | of characters not including the string `?>`, and the string | |||
`?>`. | `?>`. | |||
A [declaration](@declaration) consists of the | A [declaration](@declaration) consists of the | |||
string `<!`, a name consisting of one or more uppercase ASCII letters, | string `<!`, a name consisting of one or more uppercase ASCII letters, | |||
whitespace, a string of characters not including the character `>`, and | [whitespace](#whitespace), a string of characters not including the | |||
the character `>`. | character `>`, and the character `>`. | |||
A [CDATA section](@cdata-section) consists of | A [CDATA section](@cdata-section) consists of | |||
the string `<![CDATA[`, a string of characters not including the string | the string `<![CDATA[`, a string of characters not including the string | |||
`]]>`, and the string `]]>`. | `]]>`, and the string `]]>`. | |||
An [HTML tag](@html-tag) consists of an [open | An [HTML tag](@html-tag) consists of an [open | |||
tag](#open-tag), a [closing tag](#closing-tag), an [HTML | tag](#open-tag), a [closing tag](#closing-tag), an [HTML | |||
comment](#html-comment), a [processing | comment](#html-comment), a [processing instruction](#processing-instruction), | |||
instruction](#processing-instruction), an [element type | a [declaration](#declaration), or a [CDATA section](#cdata-section). | |||
declaration](#element-type-declaration), or a [CDATA | ||||
section](#cdata-section). | ||||
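
The open tag and closing tag productions above can be approximated with regular expressions. The Python sketch below is an illustration under stated assumptions: the constant names are made up for this example, whitespace is taken to be space, tab, newline, line tabulation, form feed and carriage return, and unquoted attribute values exclude all of those characters rather than only spaces. Comments, processing instructions, declarations and CDATA sections are left out.

```python
import re

WS = r"[ \t\n\x0b\x0c\r]"
TAG_NAME = r"[A-Za-z][A-Za-z0-9]*"
ATTR_NAME = r"[A-Za-z_:][A-Za-z0-9_.:-]*"
# Assumption: any whitespace character ends an unquoted value.
UNQUOTED = r"[^ \t\n\x0b\x0c\r\"'=<>`]+"
SINGLE_QUOTED = r"'[^']*'"
DOUBLE_QUOTED = r'"[^"]*"'

ATTR_VALUE = f"(?:{UNQUOTED}|{SINGLE_QUOTED}|{DOUBLE_QUOTED})"
ATTR_VALUE_SPEC = f"{WS}*={WS}*{ATTR_VALUE}"
# An attribute always starts with at least one whitespace character.
ATTRIBUTE = f"{WS}+{ATTR_NAME}(?:{ATTR_VALUE_SPEC})?"

OPEN_TAG = re.compile(f"<{TAG_NAME}(?:{ATTRIBUTE})*{WS}*/?>")
CLOSING_TAG = re.compile(f"</{TAG_NAME}{WS}*>")


def is_open_tag(text: str) -> bool:
    return OPEN_TAG.fullmatch(text) is not None


def is_closing_tag(text: str) -> bool:
    return CLOSING_TAG.fullmatch(text) is not None
```

Because each attribute must begin with whitespace, an input such as `<a href='bar'title=title>` fails to match, which agrees with the "Missing whitespace" example further down.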
Here are some simple open tags: | Here are some simple open tags: | |||
. | . | |||
<a><bab><c2c> | <a><bab><c2c> | |||
. | . | |||
<p><a><bab><c2c></p> | <p><a><bab><c2c></p> | |||
. | . | |||
Empty elements: | Empty elements: | |||
. | . | |||
<a/><b2/> | <a/><b2/> | |||
. | . | |||
<p><a/><b2/></p> | <p><a/><b2/></p> | |||
. | . | |||
Whitespace is allowed: | [Whitespace](#whitespace) is allowed: | |||
. | . | |||
<a /><b2 | <a /><b2 | |||
data="foo" > | data="foo" > | |||
. | . | |||
<p><a /><b2 | <p><a /><b2 | |||
data="foo" ></p> | data="foo" ></p> | |||
. | . | |||
With attributes: | With attributes: | |||
skipping to change at line 6451 | skipping to change at line 6781 | |||
. | . | |||
Illegal attribute values: | Illegal attribute values: | |||
. | . | |||
<a href="hi'> <a href=hi'> | <a href="hi'> <a href=hi'> | |||
. | . | |||
<p><a href="hi'> <a href=hi'></p> | <p><a href="hi'> <a href=hi'></p> | |||
. | . | |||
Illegal whitespace: | Illegal [whitespace](#whitespace): | |||
. | . | |||
< a>< | < a>< | |||
foo><bar/ > | foo><bar/ > | |||
. | . | |||
<p>< a>< | <p>< a>< | |||
foo><bar/ ></p> | foo><bar/ ></p> | |||
. | . | |||
Missing whitespace: | Missing [whitespace](#whitespace): | |||
. | . | |||
<a href='bar'title=title> | <a href='bar'title=title> | |||
. | . | |||
<p><a href='bar'title=title></p> | <p><a href='bar'title=title></p> | |||
. | . | |||
Closing tags: | Closing tags: | |||
. | . | |||
skipping to change at line 6564 | skipping to change at line 6894 | |||
in HTML as a `<br />` tag): | in HTML as a `<br />` tag): | |||
. | . | |||
foo | foo | |||
baz | baz | |||
. | . | |||
<p>foo<br /> | <p>foo<br /> | |||
baz</p> | baz</p> | |||
. | . | |||
For a more visible alternative, a backslash before the newline may be | For a more visible alternative, a backslash before the | |||
used instead of two spaces: | [line ending](#line-ending) may be used instead of two spaces: | |||
. | . | |||
foo\ | foo\ | |||
baz | baz | |||
. | . | |||
<p>foo<br /> | <p>foo<br /> | |||
baz</p> | baz</p> | |||
. | . | |||
More than two spaces can be used: | More than two spaces can be used: | |||
skipping to change at line 6688 | skipping to change at line 7018 | |||
. | . | |||
### foo | ### foo | |||
. | . | |||
<h3>foo</h3> | <h3>foo</h3> | |||
. | . | |||
## Soft line breaks | ## Soft line breaks | |||
A regular line break (not in a code span or HTML tag) that is not | A regular line break (not in a code span or HTML tag) that is not | |||
preceded by two or more spaces is parsed as a softbreak. (A | preceded by two or more spaces is parsed as a softbreak. (A | |||
softbreak may be rendered in HTML either as a newline or as a space. | softbreak may be rendered in HTML either as a | |||
The result will be the same in browsers. In the examples here, a | [line ending](#line-ending) or as a space. The result will be the same | |||
newline will be used.) | in browsers. In the examples here, a [line ending](#line-ending) will | |||
be used.) | ||||
. | . | |||
foo | foo | |||
baz | baz | |||
. | . | |||
<p>foo | <p>foo | |||
baz</p> | baz</p> | |||
. | . | |||
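
The difference between hard and soft line breaks can be sketched as a tiny Python renderer. This is a simplification under several assumptions: the function name `render_breaks` is hypothetical, the input is a single paragraph of plain text with no code spans, HTML tags or backslash-escaped backslashes, leading spaces on continuation lines are dropped, and softbreaks are rendered as line endings.

```python
def render_breaks(paragraph: str) -> str:
    """Render the line endings of one plain-text paragraph as HTML."""
    lines = paragraph.split("\n")
    out = []
    for i, line in enumerate(lines):
        if i > 0:
            # Assumption: spaces at the start of a continuation line are dropped.
            line = line.lstrip(" ")
        if i == len(lines) - 1:
            out.append(line)  # last line: no break follows
        elif line.endswith("\\"):
            out.append(line[:-1] + "<br />\n")  # backslash hard break
        elif line.endswith("  "):
            out.append(line.rstrip(" ") + "<br />\n")  # two or more spaces
        else:
            out.append(line.rstrip(" ") + "\n")  # softbreak
    return "<p>" + "".join(out) + "</p>"
```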
Spaces at the end of the line and beginning of the next line are | Spaces at the end of the line and beginning of the next line are | |||
skipping to change at line 6925 | skipping to change at line 7256 | |||
list_item | list_item | |||
paragraph | paragraph | |||
str "Qui " | str "Qui " | |||
emph | emph | |||
str "quodsi iracundia" | str "quodsi iracundia" | |||
list_item | list_item | |||
paragraph | paragraph | |||
str "aliquando id" | str "aliquando id" | |||
``` | ``` | |||
Notice how the newline in the first paragraph has been parsed as | Notice how the [line ending](#line-ending) in the first paragraph has | |||
a `softbreak`, and the asterisks in the first list item have become | been parsed as a `softbreak`, and the asterisks in the first list item | |||
an `emph`. | have become an `emph`. | |||
The document can be rendered as HTML, or in any other format, given | The document can be rendered as HTML, or in any other format, given | |||
an appropriate renderer. | an appropriate renderer. | |||
End of changes. 82 change blocks. | ||||
164 lines changed or deleted | 496 lines changed or added | |||