The syntactic category stx
is the grammar of specifiers that may occur in the body of a Lean.Parser.Command.syntax : command
syntax
command.
String literals are parsed as atoms (including both keywords such as if
, #eval
, or where
):
stx ::=
str
Leading and trailing spaces in the strings do not affect parsing, but they cause Lean to insert spaces in the corresponding position when displaying the syntax in proof states and error messages.
Ordinarily, valid identifiers occurring as atoms in syntax rules become reserved keywords.
Preceding a string literal with an ampersand (&
) suppresses this behavior:
stx ::= ...
| &str
Identifiers specify the syntactic category expected in a given position, and may optionally provide a precedence:
stx ::= ...
| ident(:prec)?
The *
modifier is the Kleene star, matching zero or more repetitions of the preceding syntax.
It can also be written using many
.
stx ::= ...
| `p*` is shorthand for `many(p)`. It uses parser `p` 0 or more times, and produces a
`nullNode` containing the array of parsed results. This parser has arity 1.
If `p` has arity more than 1, it is auto-grouped in the items generated by the parser.
stx *
The +
modifier matches one or more repetitions of the preceding syntax.
It can also be written using many1
.
stx ::= ...
| `p+` is shorthand for `many1(p)`. It uses parser `p` 1 or more times, and produces a
`nullNode` containing the array of parsed results. This parser has arity 1.
If `p` has arity more than 1, it is auto-grouped in the items generated by the parser.
stx +
The ?
modifier makes a subterm optional, and matches zero or one, but not more, repetitions of the preceding syntax.
It can also be written as optional
.
stx ::= ...
| `(p)?` is shorthand for `optional(p)`. It uses parser `p` 0 or 1 times, and produces a
`nullNode` containing the array of parsed results. This parser has arity 1.
`p` is allowed to have arity n > 1 (in which case the node will have either 0 or n children),
but if it has arity 0 then the result will be ambiguous.
Because `?` is an identifier character, `ident?` will not work as intended.
You have to write either `ident ?` or `(ident)?` for it to parse as the `?` combinator
applied to the `ident` parser.
stx ?
stx ::= ...
| Returns `some x` if `f` succeeds with value `x`, else returns `none`.
optional(stx)
The ,*
modifier matches zero or more repetitions of the preceding syntax with interleaved commas.
It can also be written using sepBy
.
stx ::= ...
| `p,*` is shorthand for `sepBy(p, ",")`. It parses 0 or more occurrences of
`p` separated by `,`, that is: `empty | p | p,p | p,p,p | ...`.
It produces a `nullNode` containing a `SepArray` with the interleaved parser
results. It has arity 1, and auto-groups its component parser if needed.
stx ,*
The ,+
modifier matches one or more repetitions of the preceding syntax with interleaved commas.
It can also be written using sepBy1
.
stx ::= ...
| `p,+` is shorthand for `sepBy(p, ",")`. It parses 1 or more occurrences of
`p` separated by `,`, that is: `p | p,p | p,p,p | ...`.
It produces a `nullNode` containing a `SepArray` with the interleaved parser
results. It has arity 1, and auto-groups its component parser if needed.
stx ,+
The ,*,?
modifier matches zero or more repetitions of the preceding syntax with interleaved commas, allowing an optional trailing comma after the final repetition.
It can also be written using sepBy
with the allowTrailingSep
modifier.
stx ::= ...
| `p,*,?` is shorthand for `sepBy(p, ",", allowTrailingSep)`.
It parses 0 or more occurrences of `p` separated by `,`, possibly including
a trailing `,`, that is: `empty | p | p, | p,p | p,p, | p,p,p | ...`.
It produces a `nullNode` containing a `SepArray` with the interleaved parser
results. It has arity 1, and auto-groups its component parser if needed.
stx ,*,?
The ,+,?
modifier matches one or more repetitions of the preceding syntax with interleaved commas, allowing an optional trailing comma after the final repetition.
It can also be written using sepBy1
with the allowTrailingSep
modifier.
stx ::= ...
| `p,+,?` is shorthand for `sepBy1(p, ",", allowTrailingSep)`.
It parses 1 or more occurrences of `p` separated by `,`, possibly including
a trailing `,`, that is: `p | p, | p,p | p,p, | p,p,p | ...`.
It produces a `nullNode` containing a `SepArray` with the interleaved parser
results. It has arity 1, and auto-groups its component parser if needed.
stx ,+,?
The <|>
operator, which can be written orelse
, matches either syntax.
However, if the first branch consumes any tokens, then it is committed to, and failures will not be backtracked:
stx ::= ...
| `p1 <|> p2` is shorthand for `orelse(p1, p2)`, and parses either `p1` or `p2`.
It does not backtrack, meaning that if `p1` consumes at least one token then
`p2` will not be tried. Therefore, the parsers should all differ in their first
token. The `atomic(p)` parser combinator can be used to locally backtrack a parser.
(For full backtracking, consider using extensible syntax classes instead.)
On success, if the inner parser does not generate exactly one node, it will be
automatically wrapped in a `group` node, so the result will always be arity 1.
The `<|>` combinator does not generate a node of its own, and in particular
does not tag the inner parsers to distinguish them, which can present a problem
when reconstructing the parse. A well formed `<|>` parser should use disjoint
node kinds for `p1` and `p2`.
stx <|> stx
stx ::= ...
| orelse(stx, stx)
The !
operator matches the complement of its argument.
If its argument fails, then it succeeds, resetting the parsing state.
stx ::= ...
| `!p` parses the negation of `p`. That is, it fails if `p` succeeds, and
otherwise parses nothing. It has arity 0.
! stx
Syntax specifiers may be grouped using parentheses.
stx ::= ...
| (stx)
Repetitions may be defined using many
and many1
.
The latter requires at least one instance of the repeated syntax.
stx ::= ...
| many(stx)
stx ::= ...
| many1(stx)
Repetitions with separators may be defined using sepBy
and sepBy1
, which respectively match zero or more occurrences and one or more occurrences, separated by some other syntax.
They come in three varieties:
-
The two-parameter version uses the atom provided in the string literal to parse the separators, and does not allow trailing separators.
-
The three-parameter version uses the third parameter to parse the separators, using the atom for pretty-printing.
-
The four-parameter version optionally allows the separator to occur an extra time at the end of the sequence.
The fourth argument must always literally be the keyword allowTrailingSep
.
stx ::= ...
| sepBy(stx, str)
stx ::= ...
| sepBy(stx, str, stx)
stx ::= ...
| sepBy(stx, str, stx, allowTrailingSep)
stx ::= ...
| sepBy1(stx, str)
stx ::= ...
| sepBy1(stx, str, stx)
stx ::= ...
| sepBy1(stx, str, stx, allowTrailingSep)