Quasiquotation with Binders: A Lean Metaprogramming Example

In programming languages research we often explore a language feature by formally defining a tiny language that is just big enough to have the properties we want, but is still small enough to make a fully self-contained formalization feasible. These tiny languages are typically not useful for actual programming, but it is often possible to encode more expressive features in the smaller language. Thus, the formal results can in principle apply to larger languages. Small models are also valuable for exploring new implementation techniques.

One nice thing about a formal model in an interactive theorem prover is that it can be played with! Experimentation with a system is a great way to build intuitions about it. While paper formalizations often require laborious pen-and-paper calculations to check an example, the computer can do it much more quickly. Here, notations are important: the closer the mechanization is to the notation in which we communicate with each other, the easier it is to experiment with. This is why Lean and other provers have an extensible notation system.

This post demonstrates one approach to solving both the problem of encoding features in a core language and the problem of concrete syntax: binding-aware quasiquotation. First, I'll describe a small language and show how to give it a nice syntax. This nice syntax supports escaping to Lean from the embedded language, so Lean becomes a macro language for the embedded language. Re-using Lean's own binding forms to implement the embedded language's binding forms makes it possible to use bound variables from the embedded language inside these escapes, but it also leads to leaky abstractions and confusing error messages. Fortunately, Lean's macro hygiene system can be used to fix these leaks.

This post assumes a background in programming language theory and implementation. For the sake of space, the implemented language is not a full formalization, but it exhibits the key challenges that arise when implementing quasiquotation with bindings. This is the first in a series of two posts on this topic; the next post in the series will demonstrate the use of elaborators to further improve the language embedding.

The Language

The choice of example language here is not particularly important, so I'm using the untyped lambda-calculus with built-in let-bindings. Because I first hit upon this method while experimenting with an implementation of András Kovács and Olle Frederiksen's glued evaluator, the example code is based on theirs. Please see the linked file and the smalltt repository for more details on this approach.

Glued evaluation is one approach to implementing the computations that occur in a type checker and elaborator for a dependently typed language. An elaborator converts from a convenient surface syntax to a simpler core language, simultaneously type checking the input and translating it to a form suitable to be checked by a small, simple kernel. While elaborating, it is necessary to compare expressions modulo α-equivalence, β-reduction, and often also η-expansion. At the same time, the system must maintain a collection of metavariables, which represent holes in the program that the user expects the elaborator to fill in. In Lean, for example, metavariables are used to represent, among other things, implicit arguments to functions, type class instances, and explicit user uses of the placeholder _. There is a tension between the computations that are necessary to compare two terms, which may unfold many definitions, and the size of terms that are saved as metavariable solutions, which should ideally be as small as possible. The glued evaluator computes both fully- and partially-unfolded results, sharing as much intermediate work as possible.

Theoretically, the language implemented in this post has the following structure:

M, N ::= c
       | x
       | λ x . M
       | M N
       | let x := M in N

The x terminal represents locally-bound variables, while c represents defined constants drawn from some ambient global environment. To keep the size of elaborated terms manageable, unifiers computed by the elaborator will not unfold global constants, while constants can be unfolded in the process of checking definitional equality. This is implemented through a variation of normalization by evaluation (NbE), with the domain of values suitably extended to represent both unfoldings. In NbE, expressions are first evaluated to some semantic domain that includes the equalities that we'd like to check over syntax, and then quoted or read back to syntax. Because the semantic domain that is the target of evaluation includes the desired equalities, the process of quotation must necessarily compute normal forms with respect to this equality.

In NbE, the bodies of functions are also normalized. This means that the values need to include representations for unknown variables. Because these unknown variables might later be applied to arguments, these extra values must also represent the arguments that are waiting for the variable, should its value as a function become known.

For this language, values may be one of the following:

  • a closure that provides the values of the free variables in a function along with its code, resulting from evaluating a lambda

  • a local application that pairs an unknown local variable with a spine of argument values to which it can be applied when its value becomes known

  • a global application that simultaneously includes both a global constant with a spine of arguments and a lazy value that will compute the result when forced

Because global application values contain both the result of the function call and a representation of its syntax, the read-back or quotation phase of NbE can choose a more compact representation or a normal form.

To represent the glued evaluator in Lean, the first step is to define c and x from the grammar, as TName (top-level names) and LName (local names) respectively:

/-- Top-level names -/
structure TName where
  name : String
deriving BEq, Inhabited, Repr, DecidableEq

/-- Local names -/
structure LName where
  name : String
deriving BEq, Inhabited, Repr, DecidableEq

The definition of terms follows the grammar:

/-- Terms -/
inductive Tm where
  | top : TName → Tm
  | loc : LName → Tm
  | lam : LName → Tm → Tm
  | app : Tm → Tm → Tm
  | «let» : LName → Tm → Tm → Tm
deriving BEq, Repr, Inhabited

Example: Church Numerals

One of the best ways to test a normalization procedure is to use Church numerals. Church numerals are an encoding of the natural numbers in the untyped lambda calculus, in which a number n is a function that takes a function f and a base value z, applying f to z a total of n times. For instance, two is λ f. λ z. f (f z) and zero is λ f. λ z. z. Church numerals are an excellent test of normalization because they are fairly easy to recognize with a bit of practice, and because they use many similar local names, they exercise systems that rely on string representations of names. Addition of Church numerals is a function that takes two encoded numbers followed by f and z, using one number to construct the base value provided to the other: λ n. λ k. λ f. λ z. n f (k f z). Here, k f z plays the role of z for n, so f ends up applied n + k times in total.

The corresponding Tm definitions are:

def Tm.two : Tm :=
  .lam "f" (.lam "z" (.app (.loc "f") (.app (.loc "f") (.loc "z"))))

def Tm.zero : Tm :=
  .lam "f" (.lam "z" (.loc "z"))

def Tm.succ : Tm :=
  .lam "n" (.lam "f" (.lam "z"
    (.app (.loc "f") (.app (.app (.loc "n") (.loc "f")) (.loc "z")))))

def Tm.add : Tm :=
  .lam "n" (.lam "k" (.lam "f" (.lam "z"
    (.app (.app (.loc "n") (.loc "f"))
      (.app (.app (.loc "k") (.loc "f")) (.loc "z"))))))

They're already quite difficult to read. The downside of using Church numerals to test a normalization procedure is that it can be hard to know whether a surprising result is due to a mistake in the input or a mistake in the evaluator. It's worth devising a way to write ordinary numbers instead of Church numerals. One way to convert a natural number to a Church numeral is syntactic: the chain of applications of f in the body is constructed directly:

def toChurch (n : Nat) : Tm :=
  .lam "f" (.lam "z" (mkBody n))
where
  mkBody : Nat → Tm
    | 0 => .loc "z"
    | k + 1 => .app (.loc "f") (mkBody k)

Another is more semantic, using the definitions of zero and successor:

def toChurch' : Nat → Tm
  | 0 => Tm.zero
  | n + 1 => .app Tm.succ (toChurch' n)

While toChurch' is easier to check with a quick glance, toChurch generates much more readable output:

#eval toChurch 3
Tm.lam
  { name := "f" }
  (Tm.lam
    { name := "z" }
    (Tm.app
      (Tm.loc { name := "f" })
      (Tm.app (Tm.loc { name := "f" }) (Tm.app (Tm.loc { name := "f" }) (Tm.loc { name := "z" })))))

#eval toChurch' 3
Tm.app
  (Tm.lam
    { name := "n" }
    (Tm.lam
      { name := "f" }
      (Tm.lam
        { name := "z" }
        (Tm.app
          (Tm.loc { name := "f" })
          (Tm.app (Tm.app (Tm.loc { name := "n" }) (Tm.loc { name := "f" })) (Tm.loc { name := "z" }))))))
  (Tm.app
    (Tm.lam
      { name := "n" }
      (Tm.lam
        { name := "f" }
        (Tm.lam
          { name := "z" }
          (Tm.app
            (Tm.loc { name := "f" })
            (Tm.app (Tm.app (Tm.loc { name := "n" }) (Tm.loc { name := "f" })) (Tm.loc { name := "z" }))))))
    (Tm.app
      (Tm.lam
        { name := "n" }
        (Tm.lam
          { name := "f" }
          (Tm.lam
            { name := "z" }
            (Tm.app
              (Tm.loc { name := "f" })
              (Tm.app (Tm.app (Tm.loc { name := "n" }) (Tm.loc { name := "f" })) (Tm.loc { name := "z" }))))))
      (Tm.lam { name := "f" } (Tm.lam { name := "z" } (Tm.loc { name := "z" })))))
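As one more sanity check (my own addition, relying only on the derived BEq instances), toChurch builds the normal form directly, so small cases coincide syntactically with the hand-written numerals, while toChurch' builds a tower of succ applications that only becomes a numeral under evaluation:

```lean
#eval toChurch 2 == Tm.two   -- true: the same syntax tree as the hand-written numeral
#eval toChurch' 2 == Tm.two  -- false: an unreduced application of succ
```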

Defining the Evaluator

A traditional McCarthy-style eval-apply evaluator, using closures as the values of functions, works well for NbE. The eval function takes an expression and an environment of values for the expression's free variables, yielding a value. It is defined by mutual recursion with apply, which takes the value of a function and the value of its argument and computes the result of the function call. eval calls apply when it has computed the values of the function and argument positions in an application, and apply calls eval in order to compute with the body of a function.
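The value domain described in the bullet list earlier might look like this in Lean. This is my own illustration rather than the post's definitions: the constructor names are invented, and the lazily computed "glued" result of a global application is elided.

```lean
/-- A sketch of the value domain; constructor names are assumptions. -/
inductive Val where
  /-- From evaluating a lambda: the captured local values, the bound name, and the body -/
  | closure : List (LName × Val) → LName → Tm → Val
  /-- An unknown local variable paired with the spine of arguments awaiting it -/
  | localApp : LName → List Val → Val
  /-- A global constant with its argument spine (the lazy unfolded result is elided here) -/
  | topApp : TName → List Val → Val
```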

Environments

The first step in defining the evaluator is to define environments. Because global and local names are syntactically distinct, environments contain both:

/-- Local environments -/
def LocalEnv (v : Type u) := List (LName × v)

/-- Top-level environments -/
def TopEnv (v : Type u) := List (TName × v)

/-- Environments map free variables to their values -/
structure Env (v : Type u) where
  top : List (TName × v)
  loc : List (LName × v)
deriving Repr

Environments can be empty, and they can be extended with additional local or top-level values:

/-- The empty environment -/
def Env.empty : Env v := ⟨[], []⟩

/-- Add a local variable's value to an environment -/
def Env.addLocal (ρ : Env v) (x : LName) (val : v) : Env v :=
  {ρ with loc := (x, val) :: ρ.loc}

/-- Add a top-level constant's definition to an environment -/
def Env.addTop (ρ : Env v) (x : TName) (val : v) : Env v :=
  {ρ with top := (x, val) :: ρ.top}
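With environments in place, the eval-apply structure described above can be sketched. This is illustrative only, not the post's actual evaluator: it assumes a value type Val with closure, localApp, and topApp constructors along the lines of the earlier bullet list, ignores laziness and the glued representation, and uses partial to sidestep termination checking.

```lean
mutual
  /-- Evaluate a term in an environment (sketch) -/
  partial def eval (ρ : Env Val) : Tm → Val
    | .top c => (ρ.top.lookup c).getD (.topApp c [])
    | .loc x => (ρ.loc.lookup x).getD (.localApp x [])
    | .lam x body => .closure ρ.loc x body
    | .app e₁ e₂ => apply ρ (eval ρ e₁) (eval ρ e₂)
    | .«let» x e₁ e₂ => eval (ρ.addLocal x (eval ρ e₁)) e₂

  /-- Apply the value of a function to the value of its argument (sketch) -/
  partial def apply (ρ : Env Val) : Val → Val → Val
    | .closure env x body, arg => eval { ρ with loc := (x, arg) :: env } body
    | .localApp x spine, arg => .localApp x (spine ++ [arg])
    | .topApp c spine, arg => .topApp c (spine ++ [arg])
end
```

Passing the whole environment to apply is a simplification: it keeps top-level constants visible while evaluating a closure's body without storing them in every closure.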

The implementation of the language will need to be able to look up names in the environment. This can be made convenient by defining two implementations of GetElem, but the details of the implementation are not really the point of this post. If you're curious, please feel free to read the code, but otherwise, it's fine to skip it.

The details

Using two separate GetElem instances allows either form of name to be looked up, automatically selecting the correct environment.

def HasKey (k : α) : (ρ : List (α × β)) → Prop
  | [] => False
  | (k', _) :: ρ => k = k' ∨ HasKey k ρ

def decideHasKey [DecidableEq k] (x : k) : (ρ : List (k × v)) → Decidable (HasKey x ρ)
  | [] => isFalse fun h => nomatch h
  | (y, _) :: ρ' =>
    if h : x = y then
      isTrue <| by constructor; assumption
    else
      match decideHasKey x ρ' with
For example, in ```lean syntax "c" ("foo" <|> "bar") ... ``` `foo` and `bar` are indistinguishable during matching, but in ```lean syntax foo := "foo" syntax "c" (foo <|> "bar") ... ``` they are not. | isTrueDecidable.isTrue {p : Prop} (h : p) : Decidable pProve that `p` is decidable by supplying a proof of `p` yHasKey x ρ' => isTrueDecidable.isTrue {p : Prop} (h : p) : Decidable pProve that `p` is decidable by supplying a proof of `p` <| by`by tac` constructs a term of the expected type by running the tactic(s) `tac`. simpThe `simp` tactic uses lemmas and hypotheses to simplify the main goal target or non-dependent hypotheses. It has many variants: - `simp` simplifies the main goal target using lemmas tagged with the attribute `[simp]`. - `simp [h₁, h₂, ..., hₙ]` simplifies the main goal target using the lemmas tagged with the attribute `[simp]` and the given `hᵢ`'s, where the `hᵢ`'s are expressions. If an `hᵢ` is a defined constant `f`, then the equational lemmas associated with `f` are used. This provides a convenient way to unfold `f`. - `simp [*]` simplifies the main goal target using the lemmas tagged with the attribute `[simp]` and all hypotheses. - `simp only [h₁, h₂, ..., hₙ]` is like `simp [h₁, h₂, ..., hₙ]` but does not use `[simp]` lemmas. - `simp [-id₁, ..., -idₙ]` simplifies the main goal target using the lemmas tagged with the attribute `[simp]`, but removes the ones named `idᵢ`. - `simp at h₁ h₂ ... hₙ` simplifies the hypotheses `h₁ : T₁` ... `hₙ : Tₙ`. If the target or another hypothesis depends on `hᵢ`, a new simplified hypothesis `hᵢ` is introduced, but the old one remains in the local context. - `simp at *` simplifies all the hypotheses and the target. - `simp [*] at *` simplifies target and all (propositional) hypotheses using the other hypotheses. [HasKeyHasKey.{u_1, u_2} {α : Type u_1} {β : Type u_2} (k : α) (ρ : List (α × β)) : Prop, Or.inrOr.inr {a b : Prop} (h : b) : a ∨ b`Or.inr` is "right injection" into an `Or`. 
If `h : b` then `Or.inr h : a ∨ b`. , *] | isFalseDecidable.isFalse {p : Prop} (h : ¬p) : Decidable pProve that `p` is decidable by supplying a proof of `¬p` n¬HasKey x ρ' => isFalseDecidable.isFalse {p : Prop} (h : ¬p) : Decidable pProve that `p` is decidable by supplying a proof of `¬p` <| by`by tac` constructs a term of the expected type by running the tactic(s) `tac`. introIntroduces one or more hypotheses, optionally naming and/or pattern-matching them. For each hypothesis to be introduced, the remaining main goal's target type must be a `let` or function type. * `intro` by itself introduces one anonymous hypothesis, which can be accessed by e.g. `assumption`. * `intro x y` introduces two hypotheses and names them. Individual hypotheses can be anonymized via `_`, or matched against a pattern: ```lean -- ... ⊢ α × β → ... intro (a, b) -- ..., a : α, b : β ⊢ ... ``` * Alternatively, `intro` can be combined with pattern matching much like `fun`: ```lean intro | n + 1, 0 => tac | ... ``` h'HasKey x ((y, snd✝) :: ρ') casesAssuming `x` is a variable in the local context with an inductive type, `cases x` splits the main goal, producing one goal for each constructor of the inductive type, in which the target is replaced by a general instance of that constructor. If the type of an element in the local context depends on `x`, that element is reverted and reintroduced afterward, so that the case split affects that hypothesis as well. `cases` detects unreachable cases and closes them automatically. For example, given `n : Nat` and a goal with a hypothesis `h : P n` and target `Q n`, `cases n` produces one goal with hypothesis `h : P 0` and target `Q 0`, and one goal with hypothesis `h : P (Nat.succ a)` and target `Q (Nat.succ a)`. Here the name `a` is chosen automatically and is not accessible. You can use `with` to provide the variables names for each constructor. 
- `cases e`, where `e` is an expression instead of a variable, generalizes `e` in the goal, and then cases on the resulting variable. - Given `as : List α`, `cases as with | nil => tac₁ | cons a as' => tac₂`, uses tactic `tac₁` for the `nil` case, and `tac₂` for the `cons` case, and `a` and `as'` are used as names for the new variables introduced. - `cases h : e`, where `e` is a variable or an expression, performs cases on `e` as above, but also adds a hypothesis `h : e = ...` to each hypothesis, where `...` is the constructor instance for that particular case. h'HasKey x ((y, snd✝) :: ρ') <;> contradiction`contradiction` closes the main goal if its hypotheses are "trivially contradictory". - Inductive type/family with no applicable constructors ```lean example (h : False) : p := by contradiction ``` - Injectivity of constructors ```lean example (h : none = some true) : p := by contradiction -- ``` - Decidable false proposition ```lean example (h : 2 + 2 = 3) : p := by contradiction ``` - Contradictory hypotheses ```lean example (h : p) (h' : ¬ p) : q := by contradiction ``` - Other simple contradictions such as ```lean example (x : Nat) (h : x ≠ x) : p := by contradiction ``` instance : MembershipMembership.{u, v} (α : outParam (Type u)) (γ : Type v) : Type (max u v)The typeclass behind the notation `a ∈ s : Prop` where `a : α`, `s : γ`. Because `α` is an `outParam`, the "container type" `γ` determines the type of the elements of the container. LNameLName : TypeLocal names (LocalEnvLocalEnv.{u_1} (v : Type u_1) : Type u_1Local environments vType u_1) where memLName → LocalEnv v → PropThe membership relation `a ∈ s : Prop` where `a : α`, `s : γ`. xLName ρLocalEnv v := HasKeyHasKey.{u_1, u_2} {α : Type u_1} {β : Type u_2} (k : α) (ρ : List (α × β)) : Prop xLName ρLocalEnv v instance : MembershipMembership.{u, v} (α : outParam (Type u)) (γ : Type v) : Type (max u v)The typeclass behind the notation `a ∈ s : Prop` where `a : α`, `s : γ`. 
Because `α` is an `outParam`, the "container type" `γ` determines the type of the elements of the container. TNameTName : TypeTop-level names (TopEnvTopEnv.{u_1} (v : Type u_1) : Type u_1Top-level environments vType u_1) where memTName → TopEnv v → PropThe membership relation `a ∈ s : Prop` where `a : α`, `s : γ`. xTName ρTopEnv v := HasKeyHasKey.{u_1, u_2} {α : Type u_1} {β : Type u_2} (k : α) (ρ : List (α × β)) : Prop xTName ρTopEnv v instance {kType u : TypeA type universe. `Type ≡ Type 0`, `Type u ≡ Sort (u + 1)`. u} [DecidableRelDecidableRel.{u} {α : Sort u} (r : α → α → Prop) : Sort (max 1 u)A decidable relation. See `Decidable`. (α := kType u) (· = ·)] {xk : kType u} {ρList (k × v) : ListList.{u} (α : Type u) : Type u`List α` is the type of ordered lists with elements of type `α`. It is implemented as a linked list. `List α` is isomorphic to `Array α`, but they are useful for different things: * `List α` is easier for reasoning, and `Array α` is modeled as a wrapper around `List α` * `List α` works well as a persistent data structure, when many copies of the tail are shared. When the value is not shared, `Array α` will have better performance because it can do destructive updates. (kType u × vType u_1)}: DecidableDecidable (p : Prop) : Type`Decidable p` is a data-carrying class that supplies a proof that `p` is either `true` or `false`. It is equivalent to `Bool` (and in fact it has the same code generation as `Bool`) together with a proof that the `Bool` is true iff `p` is. `Decidable` instances are used to infer "computation strategies" for propositions, so that you can have the convenience of writing propositions inside `if` statements and executing them (which actually executes the inferred decidability instance instead of the proposition, which has no code). If a proposition `p` is `Decidable`, then `(by decide : p)` will prove it by evaluating the decidability instance to `isTrue h` and returning `h`. 
Because `Decidable` carries data, when writing `@[simp]` lemmas which include a `Decidable` instance on the LHS, it is best to use `{_ : Decidable p}` rather than `[Decidable p]` so that non-canonical instances can be found via unification rather than typeclass search. (HasKeyHasKey.{u_1, u_2} {α : Type u_1} {β : Type u_2} (k : α) (ρ : List (α × β)) : Prop xk ρList (k × v)) := decideHasKeydecideHasKey.{u_1, u_2} {k : Type u_1} {v : Type u_2} [inst✝ : DecidableEq k] (x : k) (ρ : List (k × v)) : Decidable (HasKey x ρ) _ _ instance {xLName : LNameLName : TypeLocal names } {ρLocalEnv v : LocalEnvLocalEnv.{u_1} (v : Type u_1) : Type u_1Local environments vType u_1} : DecidableDecidable (p : Prop) : Type`Decidable p` is a data-carrying class that supplies a proof that `p` is either `true` or `false`. It is equivalent to `Bool` (and in fact it has the same code generation as `Bool`) together with a proof that the `Bool` is true iff `p` is. `Decidable` instances are used to infer "computation strategies" for propositions, so that you can have the convenience of writing propositions inside `if` statements and executing them (which actually executes the inferred decidability instance instead of the proposition, which has no code). If a proposition `p` is `Decidable`, then `(by decide : p)` will prove it by evaluating the decidability instance to `isTrue h` and returning `h`. Because `Decidable` carries data, when writing `@[simp]` lemmas which include a `Decidable` instance on the LHS, it is best to use `{_ : Decidable p}` rather than `[Decidable p]` so that non-canonical instances can be found via unification rather than typeclass search. 
(xLName ρLocalEnv v) := decideHasKeydecideHasKey.{u_1, u_2} {k : Type u_1} {v : Type u_2} [inst✝ : DecidableEq k] (x : k) (ρ : List (k × v)) : Decidable (HasKey x ρ) _ _ instance {xTName : TNameTName : TypeTop-level names } {ρTopEnv v : TopEnvTopEnv.{u_1} (v : Type u_1) : Type u_1Top-level environments vType u_1} : DecidableDecidable (p : Prop) : Type`Decidable p` is a data-carrying class that supplies a proof that `p` is either `true` or `false`. It is equivalent to `Bool` (and in fact it has the same code generation as `Bool`) together with a proof that the `Bool` is true iff `p` is. `Decidable` instances are used to infer "computation strategies" for propositions, so that you can have the convenience of writing propositions inside `if` statements and executing them (which actually executes the inferred decidability instance instead of the proposition, which has no code). If a proposition `p` is `Decidable`, then `(by decide : p)` will prove it by evaluating the decidability instance to `isTrue h` and returning `h`. Because `Decidable` carries data, when writing `@[simp]` lemmas which include a `Decidable` instance on the LHS, it is best to use `{_ : Decidable p}` rather than `[Decidable p]` so that non-canonical instances can be found via unification rather than typeclass search. (xTName ρTopEnv v) := decideHasKeydecideHasKey.{u_1, u_2} {k : Type u_1} {v : Type u_2} [inst✝ : DecidableEq k] (x : k) (ρ : List (k × v)) : Decidable (HasKey x ρ) _ _ instance : GetElemGetElem.{u, v, w} (cont : Type u) (idx : Type v) (elem : outParam (Type w)) (dom : outParam (cont → idx → Prop)) : Type (max (max u v) w)The class `GetElem cont idx elem dom` implements the `xs[i]` notation. When you write this, given `xs : cont` and `i : idx`, Lean looks for an instance of `GetElem cont idx elem dom`. Here `elem` is the type of `xs[i]`, while `dom` is whatever proof side conditions are required to make this applicable. 
For example, the instance for arrays looks like `GetElem (Array α) Nat α (fun xs i => i < xs.size)`. The proof side-condition `dom xs i` is automatically dispatched by the `get_elem_tactic` tactic, which can be extended by adding more clauses to `get_elem_tactic_trivial`. (LocalEnvLocalEnv.{u_1} (v : Type u_1) : Type u_1Local environments vType u_1) LNameLName : TypeLocal names vType u_1 (fun ρLocalEnv v xLName => HasKeyHasKey.{u_1, u_2} {α : Type u_1} {β : Type u_2} (k : α) (ρ : List (α × β)) : Prop xLName ρLocalEnv v) where getElem(ρ : LocalEnv v) → (x : LName) → HasKey x ρ → vThe syntax `arr[i]` gets the `i`'th element of the collection `arr`. If there are proof side conditions to the application, they will be automatically inferred by the `get_elem_tactic` tactic. The actual behavior of this class is type-dependent, but here are some important implementations: * `arr[i] : α` where `arr : Array α` and `i : Nat` or `i : USize`: does array indexing with no bounds check and a proof side goal `i < arr.size`. * `l[i] : α` where `l : List α` and `i : Nat`: index into a list, with proof side goal `i < l.length`. * `stx[i] : Syntax` where `stx : Syntax` and `i : Nat`: get a syntax argument, no side goal (returns `.missing` out of range) There are other variations on this syntax: * `arr[i]`: proves the proof side goal by `get_elem_tactic` * `arr[i]!`: panics if the side goal is false * `arr[i]?`: returns `none` if the side goal is false * `arr[i]'h`: uses `h` to prove the side goal ρLocalEnv v xLName okHasKey x ρ := let rec goinstGetElemLocalEnvLNameHasKey.go.{u_1} {v : Type u_1} (x : LName) (locals : List (LName × v)) (a✝ : HasKey x locals) : v : (localsList (LName × v) : ListList.{u} (α : Type u) : Type u`List α` is the type of ordered lists with elements of type `α`. It is implemented as a linked list. 
`List α` is isomorphic to `Array α`, but they are useful for different things: * `List α` is easier for reasoning, and `Array α` is modeled as a wrapper around `List α` * `List α` works well as a persistent data structure, when many copies of the tail are shared. When the value is not shared, `Array α` will have better performance because it can do destructive updates. (LNameLName : TypeLocal names × vType u_1)) HasKeyHasKey.{u_1, u_2} {α : Type u_1} {β : Type u_2} (k : α) (ρ : List (α × β)) : Prop xLName localsList (LName × v) vType u_1 | [], impossibleHasKey x [] => nomatchEmpty match/ex falso. `nomatch e` is of arbitrary type `α : Sort u` if Lean can show that an empty set of patterns is exhaustive given `e`'s type, e.g. because it has no constructors. impossibleHasKey x [] | (yLName, vv✝) :: locals'List (LName × v✝), ok'HasKey x ((y, v) :: locals') => if"Dependent" if-then-else, normally written via the notation `if h : c then t(h) else e(h)`, is sugar for `dite c (fun h => t(h)) (fun h => e(h))`, and it is the same as `if c then t else e` except that `t` is allowed to depend on a proof `h : c`, and `e` can depend on `h : ¬c`. (Both branches use the same name for the hypothesis, even though it has different types in the two cases.) We use this to be able to communicate the if-then-else condition to the branches. For example, `Array.get arr ⟨i, h⟩` expects a proof `h : i < arr.size` in order to avoid a bounds check, so you can write `if h : i < arr.size then arr.get ⟨i, h⟩ else ...` to avoid the bounds check inside the if branch. (Of course in this case we have only lifted the check into an explicit `if`, but we could also use this proof multiple times or derive `i < arr.size` from some other proposition that we are checking in the `if`.) 
hx = y : xLName = yLName then"Dependent" if-then-else, normally written via the notation `if h : c then t(h) else e(h)`, is sugar for `dite c (fun h => t(h)) (fun h => e(h))`, and it is the same as `if c then t else e` except that `t` is allowed to depend on a proof `h : c`, and `e` can depend on `h : ¬c`. (Both branches use the same name for the hypothesis, even though it has different types in the two cases.) We use this to be able to communicate the if-then-else condition to the branches. For example, `Array.get arr ⟨i, h⟩` expects a proof `h : i < arr.size` in order to avoid a bounds check, so you can write `if h : i < arr.size then arr.get ⟨i, h⟩ else ...` to avoid the bounds check inside the if branch. (Of course in this case we have only lifted the check into an explicit `if`, but we could also use this proof multiple times or derive `i < arr.size` from some other proposition that we are checking in the `if`.) vv✝ else"Dependent" if-then-else, normally written via the notation `if h : c then t(h) else e(h)`, is sugar for `dite c (fun h => t(h)) (fun h => e(h))`, and it is the same as `if c then t else e` except that `t` is allowed to depend on a proof `h : c`, and `e` can depend on `h : ¬c`. (Both branches use the same name for the hypothesis, even though it has different types in the two cases.) We use this to be able to communicate the if-then-else condition to the branches. For example, `Array.get arr ⟨i, h⟩` expects a proof `h : i < arr.size` in order to avoid a bounds check, so you can write `if h : i < arr.size then arr.get ⟨i, h⟩ else ...` to avoid the bounds check inside the if branch. (Of course in this case we have only lifted the check into an explicit `if`, but we could also use this proof multiple times or derive `i < arr.size` from some other proposition that we are checking in the `if`.) 
goinstGetElemLocalEnvLNameHasKey.go.{u_1} {v : Type u_1} (x : LName) (locals : List (LName × v)) (a✝ : HasKey x locals) : v locals'List (LName × v✝) $ by`by tac` constructs a term of the expected type by running the tactic(s) `tac`. casesAssuming `x` is a variable in the local context with an inductive type, `cases x` splits the main goal, producing one goal for each constructor of the inductive type, in which the target is replaced by a general instance of that constructor. If the type of an element in the local context depends on `x`, that element is reverted and reintroduced afterward, so that the case split affects that hypothesis as well. `cases` detects unreachable cases and closes them automatically. For example, given `n : Nat` and a goal with a hypothesis `h : P n` and target `Q n`, `cases n` produces one goal with hypothesis `h : P 0` and target `Q 0`, and one goal with hypothesis `h : P (Nat.succ a)` and target `Q (Nat.succ a)`. Here the name `a` is chosen automatically and is not accessible. You can use `with` to provide the variables names for each constructor. - `cases e`, where `e` is an expression instead of a variable, generalizes `e` in the goal, and then cases on the resulting variable. - Given `as : List α`, `cases as with | nil => tac₁ | cons a as' => tac₂`, uses tactic `tac₁` for the `nil` case, and `tac₂` for the `cons` case, and `a` and `as'` are used as names for the new variables introduced. - `cases h : e`, where `e` is a variable or an expression, performs cases on `e` as above, but also adds a hypothesis `h : e = ...` to each hypothesis, where `...` is the constructor instance for that particular case. ok'HasKey x ((y, v) :: locals') <;> trivial`trivial` tries different simple tactics (e.g., `rfl`, `contradiction`, ...) to close the current goal. You can use the command `macro_rules` to extend the set of tactics used. 
Example: ``` macro_rules | `(tactic| trivial) => `(tactic| simp) ``` goinstGetElemLocalEnvLNameHasKey.go.{u_1} {v : Type u_1} (x : LName) (locals : List (LName × v)) (a✝ : HasKey x locals) : v ρLocalEnv v okHasKey x ρ instance : GetElemGetElem.{u, v, w} (cont : Type u) (idx : Type v) (elem : outParam (Type w)) (dom : outParam (cont → idx → Prop)) : Type (max (max u v) w)The class `GetElem cont idx elem dom` implements the `xs[i]` notation. When you write this, given `xs : cont` and `i : idx`, Lean looks for an instance of `GetElem cont idx elem dom`. Here `elem` is the type of `xs[i]`, while `dom` is whatever proof side conditions are required to make this applicable. For example, the instance for arrays looks like `GetElem (Array α) Nat α (fun xs i => i < xs.size)`. The proof side-condition `dom xs i` is automatically dispatched by the `get_elem_tactic` tactic, which can be extended by adding more clauses to `get_elem_tactic_trivial`. (EnvEnv.{u} (v : Type u) : Type uEnvironments map free variables to their values vType u_1) LNameLName : TypeLocal names vType u_1 (fun ρEnv v xLName => HasKeyHasKey.{u_1, u_2} {α : Type u_1} {β : Type u_2} (k : α) (ρ : List (α × β)) : Prop xLName ρEnv v.locEnv.loc.{u} {v : Type u} (self : Env v) : List (LName × v)) where getElem(ρ : Env v) → (x : LName) → HasKey x ρ.loc → vThe syntax `arr[i]` gets the `i`'th element of the collection `arr`. If there are proof side conditions to the application, they will be automatically inferred by the `get_elem_tactic` tactic. The actual behavior of this class is type-dependent, but here are some important implementations: * `arr[i] : α` where `arr : Array α` and `i : Nat` or `i : USize`: does array indexing with no bounds check and a proof side goal `i < arr.size`. * `l[i] : α` where `l : List α` and `i : Nat`: index into a list, with proof side goal `i < l.length`. 
* `stx[i] : Syntax` where `stx : Syntax` and `i : Nat`: get a syntax argument, no side goal (returns `.missing` out of range) There are other variations on this syntax: * `arr[i]`: proves the proof side goal by `get_elem_tactic` * `arr[i]!`: panics if the side goal is false * `arr[i]?`: returns `none` if the side goal is false * `arr[i]'h`: uses `h` to prove the side goal ρEnv v xLName okHasKey x ρ.loc := let`let` is used to declare a local definition. Example: ``` let x := 1 let y := x + 1 x + y ``` Since functions are first class citizens in Lean, you can use `let` to declare local functions too. ``` let double := fun x => 2*x double (double 3) ``` For recursive definitions, you should use `let rec`. You can also perform pattern matching using `let`. For example, assume `p` has type `Nat × Nat`, then you can write ``` let (x, y) := p x + y ``` localsLocalEnv v : LocalEnvLocalEnv.{u_1} (v : Type u_1) : Type u_1Local environments vType u_1 := ρEnv v.locEnv.loc.{u} {v : Type u} (self : Env v) : List (LName × v) localsLocalEnv v[xLName] instance : GetElemGetElem.{u, v, w} (cont : Type u) (idx : Type v) (elem : outParam (Type w)) (dom : outParam (cont → idx → Prop)) : Type (max (max u v) w)The class `GetElem cont idx elem dom` implements the `xs[i]` notation. When you write this, given `xs : cont` and `i : idx`, Lean looks for an instance of `GetElem cont idx elem dom`. Here `elem` is the type of `xs[i]`, while `dom` is whatever proof side conditions are required to make this applicable. For example, the instance for arrays looks like `GetElem (Array α) Nat α (fun xs i => i < xs.size)`. The proof side-condition `dom xs i` is automatically dispatched by the `get_elem_tactic` tactic, which can be extended by adding more clauses to `get_elem_tactic_trivial`. 
(TopEnvTopEnv.{u_1} (v : Type u_1) : Type u_1Top-level environments vType u_1) TNameTName : TypeTop-level names vType u_1 (fun ρTopEnv v xTName => HasKeyHasKey.{u_1, u_2} {α : Type u_1} {β : Type u_2} (k : α) (ρ : List (α × β)) : Prop xTName ρTopEnv v) where getElem(ρ : TopEnv v) → (x : TName) → HasKey x ρ → vThe syntax `arr[i]` gets the `i`'th element of the collection `arr`. If there are proof side conditions to the application, they will be automatically inferred by the `get_elem_tactic` tactic. The actual behavior of this class is type-dependent, but here are some important implementations: * `arr[i] : α` where `arr : Array α` and `i : Nat` or `i : USize`: does array indexing with no bounds check and a proof side goal `i < arr.size`. * `l[i] : α` where `l : List α` and `i : Nat`: index into a list, with proof side goal `i < l.length`. * `stx[i] : Syntax` where `stx : Syntax` and `i : Nat`: get a syntax argument, no side goal (returns `.missing` out of range) There are other variations on this syntax: * `arr[i]`: proves the proof side goal by `get_elem_tactic` * `arr[i]!`: panics if the side goal is false * `arr[i]?`: returns `none` if the side goal is false * `arr[i]'h`: uses `h` to prove the side goal ρTopEnv v xTName okHasKey x ρ := let rec goinstGetElemTopEnvTNameHasKey.go.{u_1} {v : Type u_1} (x : TName) (tops : List (TName × v)) (a✝ : HasKey x tops) : v : (topsList (TName × v) : ListList.{u} (α : Type u) : Type u`List α` is the type of ordered lists with elements of type `α`. It is implemented as a linked list. `List α` is isomorphic to `Array α`, but they are useful for different things: * `List α` is easier for reasoning, and `Array α` is modeled as a wrapper around `List α` * `List α` works well as a persistent data structure, when many copies of the tail are shared. When the value is not shared, `Array α` will have better performance because it can do destructive updates. 
(TName × v)) → HasKey x tops → v
      | [], impossible => nomatch impossible
      | (y, v) :: locals', ok' =>
        if h : x = y then v
        else go locals' $ by cases ok' <;> trivial
    go ρ ok

instance : GetElem (Env v) TName v (fun ρ x => HasKey x ρ.top) where
  getElem ρ x ok :=
    let tops : TopEnv v := ρ.top
    tops[x]

Values

The spine of accumulated arguments to an unknown function is best represented by a snoc-list, as it grows to the right when the value is applied to further arguments. To get the desired performance characteristics, spines should be lazy, allowing definitional-equality comparison to abort early without fully evaluating all the arguments. Thunk v is a type that wraps a nullary function that yields a v, with special support from the Lean compiler so that the value is cached rather than recomputed. Getting the value of a thunk is called forcing it.
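Forcing and caching can be observed directly using nothing beyond core `Thunk`; `t` here is a throwaway name for illustration.

```lean
-- `Thunk.mk` delays the computation; the trace fires when the thunk is
-- forced, and later `.get`s on the same thunk return the cached value.
def t : Thunk Nat := Thunk.mk fun () => dbg_trace "forced!"; 40 + 2

#eval t.get + t.get  -- 84
```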

/-- A "spine" of values accumulated as arguments to an unknown function -/
inductive Spine (v : Type u) : Type u where
  | empty
  | push : Spine v → Thunk v → Spine v

def Spine.foldl (f : β → Thunk v → β) (start : β) (s : Spine v) : β :=
  match s with
  | .empty => start
  | .push s' arg => f (s'.foldl f start) arg
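A quick check of the fold's ordering, assuming only the `Spine` definitions above (the spine `s` is illustrative): arguments come out oldest-first, matching the order in which they were pushed.

```lean
def s : Spine Nat := .push (.push .empty (Thunk.mk fun () => 1)) (Thunk.mk fun () => 2)

-- `foldl` recurses into the older spine before applying `f`, so the
-- accumulator sees arguments in application order.
#eval s.foldl (fun acc arg => acc ++ [arg.get]) ([] : List Nat)  -- [1, 2]
```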

As described, there are three kinds of values: closures (Val.lam) retain an argument name, an environment for the function's body, and the body itself; applications of local variables (Val.loc) pair a local variable with a spine of lazy argument values; and applications of global constants (Val.top) combine the constant's name with a spine of lazy arguments as well as a lazy computation that actually does the work of applying the top-level constant's value to its arguments.

inductive Val : Type where
  | lam : LName → Env Val → Tm → Val
  | loc : LName → Spine Val → Val
  | top : TName → Spine Val → Thunk Val → Val

instance : Inhabited Val where
  default := .loc "fake" .empty
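As a concrete instance of these constructors, the closure for the identity function `λ x. x` in an empty environment can be written by hand. This is a sketch that assumes `Env` is the structure with `top` and `loc` fields used above, and it relies on the same string-to-name coercion as the `Inhabited` instance.

```lean
-- A closure value: argument name, captured environment, and body.
def idVal : Val := .lam "x" { top := [], loc := [] } (.loc "x")
```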

The Evaluator

The evaluator must be partial, because this is the untyped lambda calculus! The only difference between this evaluator and an ordinary eval-apply evaluator is that apply must account for the fact that functions might be represented by three different kinds of values. Lean's coercion mechanism takes care of wrapping expressions in lazy positions in thunks automatically.

mutual
  partial def eval (ρ : Env Val) : Tm → Val
    | .lam x b => .lam x ρ b
    | .let x e b => eval (ρ.addLocal x (eval ρ e)) b
    | .app e1 e2 => (eval ρ e1).apply (eval ρ e2)
    | .top x => .top x .empty ρ[x]!
    | .loc x => ρ[x]!

  partial def Val.apply : Val → Thunk Val → Val
    | .lam x ρ b, arg => eval (ρ.addLocal x arg.get) b
    | .loc x s, arg => .loc x (s.push arg.get)
    | .top x s v, arg => .top x (s.push arg) (v.get.apply arg)
end
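A small smoke test, assuming string-coercible names and an `Env` built from its `top` and `loc` fields: evaluating the β-redex `(λ x. x) y` where `y` is bound to a neutral value should return that neutral value. Since `Val` carries no `Repr` instance here, the check pattern-matches on the result instead of printing it.

```lean
def ρ₀ : Env Val := { top := [], loc := [("y", .loc "y" .empty)] }

#eval match eval ρ₀ (.app (.lam "x" (.loc "x")) (.loc "y")) with
  | .loc .. => "neutral application"
  | .lam .. => "closure"
  | .top .. => "top-level application"
```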

From Evaluation to Normal Forms

Normalizing terms consists of first evaluating them, then converting the value back to syntax. Because the domain of values has been carefully chosen to exhibit all of the equalities desired for the term language, this process must map terms to a canonical member of the equivalence class induced by the desired equalities - that is, the normal form! But the whole point of glued evaluation is to control unfolding, so the quotation process here must take an argument that determines whether to unfold top-level constants (thus computing normal forms) or to leave them as names (thus computing smaller terms in unifiers).

inductive UnfoldTop where
  | unfold
  | noUnfold
deriving Repr

When reading back nested functions, it's important to ensure that the name of the bound variable has not already been used by a surrounding function. Otherwise, occurrences of the variable could end up pointing at the wrong binding site. Because local and global names are separate syntactic categories in this language, and functions bind only local variables, only the locals need to be checked for conflicts:

partial def LocalEnv.fresh (ρ : LocalEnv v) (x : LName) : LName :=
  if x ∈ ρ then ρ.fresh {x with name := x.name ++ "'"} else x

def Env.fresh (ρ : Env v) (x : LName) : LName :=
  let locals : LocalEnv v := ρ.loc
  locals.fresh x
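For example, freshening `x` against a local environment that already binds `x` should produce `x'`. This sketch assumes the string-to-name coercion and that a `LocalEnv` can be written as a list of name-value pairs.

```lean
-- `fresh` keeps appending primes until the candidate name is unused.
#eval (([("x", default)] : LocalEnv Val).fresh "x").name  -- "x'"
```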

The actual process of quotation replaces each value with the corresponding expression:

  • Closures represent functions, and can be quoted to lambda-expressions by inventing a fresh name for the function's argument. The body of this function is found by quoting the result of applying the closure value to a value that represents the argument.

  • Local variable values represent applications of unknown local functions. They can be quoted by constructing an application expression that applies the variable to the results of quoting each value in the spine.

  • Global constant values can be quoted in one of two ways:

    1. If normal forms are being computed, the thunk that represents the value of the application is forced, and the resulting value is quoted

    2. If smaller results are being computed, then the value is quoted as if it were a local variable application, quoting each value in the spine and constructing an application expression

mutual
  partial def Val.quote (ρ : LocalEnv Val) (unfold : UnfoldTop) : Val → Tm
    | .loc x sp => sp.quote ρ unfold (.loc x)
    | .top x sp v =>
      match unfold with
      | .unfold => v.get.quote ρ unfold
      | .noUnfold => sp.quote ρ unfold (.top x)
    | f@(.lam x ..) =>
      let x' := ρ.fresh x
      let v : Val := .loc x' .empty
      .lam x' ((f.apply v).quote ((x', v) :: ρ) unfold)

  partial def Spine.quote (ρ : LocalEnv Val) (unfold : UnfoldTop) (head : Tm) (args : Spine Val) : Tm :=
    args.foldl (mkApp ρ unfold) head

  partial def mkApp (ρ : LocalEnv Val) (unfold : UnfoldTop) (quoted : Tm) (arg : Thunk Val) : Tm :=
    .app quoted (arg.get.quote ρ unfold)
end
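To see the freshening at work end-to-end: quoting the value of `(λ x. λ y. x) y`, where `y` is free, must rename the inner binder so that it does not capture the free `y`. A sketch under the same naming and environment assumptions as the evaluator examples; the quoted term should be `λ y'. y`, so the reported binder name is `y'`.

```lean
def ρ₁ : Env Val := { top := [], loc := [("y", .loc "y" .empty)] }

-- Evaluate the redex to a closure value, then read it back with the
-- free `y` already recorded in the quotation environment.
#eval match (eval ρ₁ (.app (.lam "x" (.lam "y" (.loc "x"))) (.loc "y"))).quote
        [("y", .loc "y" .empty)] .noUnfold with
  | .lam x' _ => x'.name
  | _ => "no binder"
```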

Only a bit of glue code is needed to run the normalization procedure. Given a set of top-level constant definitions, represented as an array of name-expression pairs, `evalTop` constructs a top-level environment by evaluating each declaration's expression in the context of the prior ones, and then evaluates a result expression in the resulting top-level environment. Similarly, `nfTop` does the same thing, but it also quotes the resulting value, yielding a normal form or a partially evaluated term, as requested.

```lean
/-- Evaluate a term under some top-level declarations -/
def evalTop (topDefs : Array (TName × Tm)) (program : Tm) : Val :=
  let ρ : Env Val :=
    topDefs.foldl (fun ρ (x, e) => ρ.addTop x (eval ρ e)) .empty
  eval ρ program

/-- Normalize or semi-normalize a term under some top-level declarations -/
def nfTop (unfold : UnfoldTop) (topDefs : Array (TName × Tm)) (program : Tm) : Tm :=
  (evalTop topDefs program).quote [] unfold
```

The array `examples` packages the earlier example terms as declarations suitable for `nfTop`:

```lean
def examples : Array (TName × Tm) :=
  #[("zero", Tm.zero), ("two", Tm.two), ("add", Tm.add)]
```

Squinting a bit, two plus two is indeed four:

```lean
#eval nfTop .unfold examples
  (.app (.app (.top "add") (.top "two")) (.top "two"))
```
```
Tm.lam
  { name := "f" }
  (Tm.lam
    { name := "z" }
    (Tm.app
      (Tm.loc { name := "f" })
      (Tm.app
        (Tm.loc { name := "f" })
        (Tm.app (Tm.loc { name := "f" }) (Tm.app (Tm.loc { name := "f" }) (Tm.loc { name := "z" }))))))
```

Because `add` and `two` are top-level definitions, asking `nfTop` not to unfold top-level definitions gives back the original term:

```lean
#eval nfTop .noUnfold examples
  (.app (.app (.top "add") (.top "two")) (.top "two"))
```
```
Tm.app (Tm.app (Tm.top { name := "add" }) (Tm.top { name := "two" })) (Tm.top { name := "two" })
```

Inspecting the normal forms of the results of `toChurch` and `toChurch'` shows that they agree:

```lean
#eval nfTop .unfold examples (toChurch 3)
```
```
Tm.lam
  { name := "f" }
  (Tm.lam
    { name := "z" }
    (Tm.app
      (Tm.loc { name := "f" })
      (Tm.app (Tm.loc { name := "f" }) (Tm.app (Tm.loc { name := "f" }) (Tm.loc { name := "z" })))))
```
```lean
#eval nfTop .unfold examples (toChurch' 3)
```
```
Tm.lam
  { name := "f" }
  (Tm.lam
    { name := "z" }
    (Tm.app
      (Tm.loc { name := "f" })
      (Tm.app (Tm.loc { name := "f" }) (Tm.app (Tm.loc { name := "f" }) (Tm.loc { name := "z" })))))
```

Indeed, they are equal:

```lean
#eval nfTop .unfold examples (toChurch 3) == nfTop .unfold examples (toChurch' 3)
```
```
true
```

The glued evaluator seems to be functioning correctly. But constructing input terms and understanding the output are still tedious and error-prone due to the verbosity of the raw ASTs. The rest of this post concerns techniques for making the prototype easier to interact with.

Improving the Output

The first step in making the prototype easier to use is to make the output easier to read. The simplest way to do this is to write an improved `Repr` instance for `Tm`. A `Repr` instance uses Lean's built-in pretty-printing library to render a term. Each pretty printer takes a value and an incoming precedence as arguments; the value determines what to display, and the precedence determines whether parentheses are needed. If the precedence of the surrounding context is higher than the term's own natural precedence, then the term must be wrapped in parentheses.
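As a warm-up, here is a minimal sketch of that mechanism for a toy expression type. The type `E` and its constructors are invented purely for illustration; the point is how `Repr.addAppParen` consults the incoming precedence, just as the real `Tm` printer does below.

```lean
-- A toy expression type, invented for illustration only.
inductive E where
  | num (k : Nat)
  | neg (e : E)

def E.format (prec : Nat) : E → Std.Format
  | .num k => repr k
  -- `Repr.addAppParen` parenthesizes exactly when the context's
  -- precedence demands it, so nested `neg` applications come out right.
  | .neg e => Repr.addAppParen ("neg " ++ e.format Lean.Parser.maxPrec) prec

instance : Repr E where
  reprPrec e prec := e.format prec

-- The outer `neg` needs no parentheses; the inner one does:
#eval repr (E.neg (E.neg (E.num 3)))  -- neg (neg 3)
```

The same two ingredients reappear in the `Tm` printer: a formatting function threaded with a precedence, and an instance that hands it the context's precedence.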

```lean
open Std (Format)
open Lean.Parser (maxPrec argPrec minPrec)

def argsAndBody : Tm → List LName × Tm
  | .lam x body =>
    let (args, body') := argsAndBody body
    (x :: args, body')
  | other => ([], other)

partial def Tm.format (prec : Nat) : Tm → Format
  | .lam x body =>
    let (moreArgs, body') := argsAndBody body
    let args := x :: moreArgs
    .group <|
      "fun " ++ (Format.text " ").joinSep (args.map (Format.text ·.name)) ++ " =>" ++
      .nest 2 (.line ++ body'.format minPrec)
  | .let x e body =>
    .group <|
      "let" ++ " " ++ .group (.nest 2 (x.name ++ " :=" ++ .line ++ e.format maxPrec)) ++
      .line ++ "in" ++ .line ++
      .group (.nest 2 <| body.format minPrec)
  | .app e1 e2 =>
    fmtApp <| e1.format argPrec ++ .line ++ e2.format maxPrec
  | .top x => x.name
  | .loc x => x.name
where
  fmtApp (elts : Format) : Format :=
    Repr.addAppParen (.group (.nest 2 elts)) prec

instance : Repr Tm where
  reprPrec tm _ :=
    .group <| .nest 2 ("{{{" ++ .line ++ tm.format minPrec) ++ .line ++ "}}}"
```

The output is wrapped in `{{{` and `}}}` because this syntax is used later in this post to input embedded-language terms. With the improved `Repr Tm` instance, the result is now much more recognizably the Church numeral for four:

```lean
#eval nfTop .unfold examples
  (.app (.app (.top "add") (.top "two")) (.top "two"))
```
```
{{{ fun f z => f (f (f (f z))) }}}
```

Similarly, the version without unfolding is easier to read:

```lean
#eval nfTop .noUnfold examples
  (.app (.app (.top "add") (.top "two")) (.top "two"))
```
```
{{{ add two two }}}
```

With the output now easier to understand, it's time to see what can be done to improve the input.

Improving the Input

There are two basic strategies for making a data structure easier to write down: coercions and syntax extensions. Coercions allow Lean to automatically convert expressions of one type into another, and a coercion from `String` to `TName` or `LName` could make it a bit shorter to write names as string literals. However, coercions are mostly useful for removing noisy explicit calls to conversion operators, and that isn't the main reason `Tm` is noisy to write. For more difficult cases, syntax extensions allow the Lean parser to be extended essentially arbitrarily, allowing much greater notational flexibility. Because the desired surface syntax is so far from the `Tm` datatype, this is the right tool for the job.

The first step is to define a new syntactic category, which corresponds to adding a new nonterminal to Lean's grammar. The category is called `embedded`, because it is used to represent terms in the embedded language:

```lean
/-- Embedded language expression -/
declare_syntax_cat embedded
```

Each production in the grammar of `Tm` has a parser rule, except that both top-level and local names are written as identifiers. The `:max` and `:arg` annotations on syntax rules and subterms specify the precedence and associativity of the operators.

```lean
/-- A variable reference -/
syntax:max ident : embedded
/-- Grouping of expressions -/
syntax "(" embedded ")" : embedded
/-- Function application -/
syntax:arg embedded:arg embedded:max : embedded
/-- A function -/
syntax:max "fun" ident+ " => " embedded:arg : embedded
/-- A local definition -/
syntax:max "let" ident " := " embedded "in" embedded : embedded
```
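These rules can be smoke-tested before giving them any meaning by invoking the parser directly. This sketch uses `Lean.Parser.runParserCategory`, which parses a string in a given syntactic category against the current environment; the test string is arbitrary, and nothing is elaborated:

```lean
open Lean in
#eval show CoreM Unit from do
  let env ← getEnv
  -- Parse a string in the `embedded` category; success means only
  -- that the grammar accepts it.
  match Parser.runParserCategory env `embedded "fun f z => f (f z)" with
  | .ok _ => IO.println "parsed"
  | .error msg => IO.println s!"parse error: {msg}"
```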

In order to write functions like `toChurch`, it is also convenient to be able to embed Lean expressions that compute a `Tm`. This makes the expression syntax into a form of quasiquotation, just as `$` can be used to escape from Lean's own quotations.

```lean
syntax:max "~" term:max : embedded
```

Similarly, `embedded` needs to actually be embedded into Lean terms; otherwise, there's no way for the parser to reach it:

```lean
syntax "{{{ " embedded " }}}" : term
```

The current syntax doesn't provide a convenient way to define top-level constants. Keeping with the theme of imitating Lean's own syntax, `def` can be used for embedded-language definitions as well. This also demonstrates adding whitespace sensitivity to the syntax: `withPosition` saves a parser position, and `colGt` fails if the current column is not greater than the column of the most recent saved position. In other words, each definition's body must be indented further than its `def` keyword.

/-- Embedded-language declarations -/
declare_syntax_cat decl

/-- A top-level definition -/
syntax withPosition("def" ident " := " colGt embedded) : decl

Finally, the #norm_embedded command provides a syntax for normalizing an expression in the context of a sequence of declarations. The optional parameter unfoldTop allows users to control whether full or partial normalization should be used.

/-- Normalize an expression in the context of some top-level definitions -/
syntax withPosition("#norm_embedded"
    group("(" &"unfoldTop" ":=" ident ")")?
    (colGt decl)*
    colGt (withPosition("=>" colGt embedded))) : command

Lean's syntax has been extended, but there's still no semantics for the new syntax. Attempting to use it only results in an error:

example := {{{ fun x => x }}}
elaboration function for '«term{{{_}}}»' has not been implemented
  {{{ fun x => x }}}

Lean needs to be given instructions for translating this new syntax into something it understands.

First Attempt: Reusing Lean's Binders

Because the embedded language contains binding forms, it's important to find a way to associate the appropriate TName and LName values with the names that the user writes. In this first attempt, the code that implements embedded-language binding forms will emit Lean let-expressions that bind the surface name to the underlying value. To keep the multiple implementations separate, each is placed in its own namespace so that scoped rules won't interfere with each other:

namespace LeanBinders

The new syntax is given meaning by defining it as a macro. Each case of the macro translates a quotation of the embedded language to Lean code that constructs the appropriate AST. Whenever a variable is bound in the embedded language, it is implemented by a let-binding in the generated code. The placement of this let determines the variable's scope, so it's important that the case for let places the expansion of e outside the scope of the binding for $x.

open Lean in
scoped macro_rules
  | `({{{ ( $e ) }}}) => `({{{ $e }}})
  | `({{{ $e1 $e2 }}}) => `(Tm.app {{{ $e1 }}} {{{ $e2 }}})
  | `({{{ $x:ident }}}) => `($x)
  | `({{{ fun $x => $body }}}) => do
    -- Rather than just using `x`, create a new string
    -- without its source positions to avoid misleading
    -- hovers
    let xName : StrLit := quote (toString x.getId)
    `(let $x := Tm.loc $xName
      Tm.lam $xName {{{ $body }}})
  | `({{{ fun $x $y $xs* => $body }}}) => do
    let xName : StrLit := quote (toString x.getId)
    `(let $x := Tm.loc $xName
      Tm.lam $xName {{{ fun $y $xs* => $body }}})
  | `({{{ let $x := $e in $body }}}) => do
    let xName : StrLit := quote (toString x.getId)
    `(let e := {{{ $e }}}
      let $x := Tm.loc $xName
      Tm.let $xName e {{{ $body }}})
  | `({{{ ~$e }}}) => pure e

With this macro in place, toChurch can be implemented using quasiquotations, and the bindings of f and z can cross the boundaries of antiquotations and be used in nested quotations:

def toChurch (n : Nat) : Tm := {{{
    fun f z =>
      ~(let rec mkBody
          | 0 => {{{ z }}}
          | k + 1 => {{{ f ~(mkBody k) }}}
        mkBody n)
  }}}

#eval toChurch 5
{{{ fun f z => f (f (f (f (f z)))) }}}

This strategy has a number of nice properties:

  • Because embedded language binding forms are represented by native Lean binding forms, features like occurrence highlighting and go-to-definition work properly for quotations with no additional work.

  • The scoping rules of the embedded language can be read directly from the implementation.

  • Mistakes in variable names are caught while writing the program, rather than while running it.

  • Macros are easier to write than other kinds of metaprograms.
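For instance, a misspelled variable inside a quotation surfaces right away as an ordinary Lean error. As a sketch (the typo zz is made up for the example): because the expansion only let-binds the names that actually occur in binders, an unbound name never resolves.

```lean
-- `zz` was never let-bound by the macro expansion, so Lean
-- reports "unknown identifier 'zz'" in the editor, before
-- the term is ever normalized:
example : Tm := {{{ fun f z => f zz }}}
```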

Unfortunately, there is a key downside to this approach that undermines its usability in practice. Even though Lean and the embedded language conceptually have completely separate namespaces, Lean binders intrude into embedded language terms, leading to error messages that leak details about the encoding. One way that this can go wrong is when Lean binders are accidentally referred to from the embedded language:

example (n : Nat) : Tm := {{{ fun xyz => n }}}
application type mismatch
  Tm.lam { name := "xyz" } n
argument
  n
has type
  Nat : Type
but is expected to have type
  Tm : Type

Similarly, embedded language bindings leak into Lean, shadowing Lean's functions:

example (n : Nat) : Tm := {{{ fun n => ~(toChurch n) }}}
application type mismatch
  toChurch n
argument
  n
has type
  Tm : Type
but is expected to have type
  Nat : Type

Declarations

To implement #norm_embedded as a macro, a bit of cleverness is needed. This is because each of the declarations should be in scope for the next, and all of them should be in scope for the final expression. This means that either the macro will need to be complex, generating the necessary code in one fell swoop, or it will need to track scopes carefully while expanding step by step. Usually, the latter is easier to read and maintain. One way to do this is to create a term version of #norm_embedded that includes a position for the environment that is being constructed:

scoped syntax "#norm_embedded"
    group("(" &"unfoldTop" ":=" ident ")")
    decl* "⟨" ident "⟩=>" term : term

This new syntax extends terms rather than commands so that the strategy of using local let-bindings for embedded-language bindings is still applicable.

The command macro has two cases:

  1. If no unfolding strategy was specified, the default is added.

  2. Given an unfolding strategy, it emits a #eval command that contains the term version of #norm_embedded. The #eval keyword is given the source location of the original arrow so that the evaluation result is associated with the arrow in the editor.

open Lean in
scoped macro_rules
  | `(#norm_embedded $decls* =>%$arr $e:embedded) =>
    `(#norm_embedded (unfoldTop := UnfoldTop.unfold) $decls* =>%$arr $e)
  | `(#norm_embedded (unfoldTop := $unfold) $decls* =>%$arr $e:embedded) =>
    `(#eval%$arr
      let ρ : Array (TName × Tm) := #[]
      #norm_embedded (unfoldTop := $unfold) $decls* ⟨ ρ ⟩=> {{{ $e }}})

For the term macro, there are also two possibilities:

  1. If there are no declarations remaining, it expands to a call to nfTopnfTop (unfold : UnfoldTop) (topDefs : Array (TName × Tm)) (program : Tm) : TmNormalize or semi-normalize a term under some top-level declarations with the appropriate unfolding flag, environment, and expression.

  2. If there are declarations, the environment is extended with the first one. Then, the declaration's identifier is let-bound around a #norm_embedded term that will generate code for the remaining declarations.

The ordering of let-bindings is important. In this language, declarations are not recursive, so the environment must be extended outside the scope of the variable.

open Lean in
scoped macro_rules
  | `(#norm_embedded (unfoldTop := $unfold) ⟨ $ρ ⟩=> $e:term) =>
    `(nfTop (open UnfoldTop in $unfold) $ρ $e)
  | `(#norm_embedded (unfoldTop := $unfold)
        def $x:ident := $e1:embedded $ds:decl*
        ⟨ $ρ ⟩=>%$arr $e:term) => do
    let xName : StrLit := quote (toString x.getId)
    `(let ρ' := ($ρ).push ($xName, {{{ $e1 }}})
      let $x : Tm := Tm.top $xName
      #norm_embedded (unfoldTop := $unfold) $ds* ⟨ ρ' ⟩=> $e)

#norm_embedded
  def one := fun f z => f z
  def two := fun f z => f (f z)
  def plus := fun n k f z => n f (k f z)
  => plus one two
{{{ fun f z => f (f (f z)) }}}
end LeanBinders

Second Attempt: Distinguishing Scopes

namespace ScopedBinders
open Lean

Lean's macros are hygienic: this means that, by default, bindings introduced by a macro expansion will neither capture nor be captured by bindings introduced by other steps of macro expansion. Taking a cue from Matthew Flatt's 2016 paper, Lean distinguishes identifiers introduced at different steps of macro expansion using scopes, which are unique tokens generated for each step. The scope token is added to identifiers in quotations, so binding forms and uses introduced by the macro will refer to each other, but not to anything else. Lean's implementation of hygienic macros is described in detail in a paper by my colleagues Sebastian Ullrich and Leo de Moura.
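As a minimal illustration of hygiene in action (the macro name here is invented for the example), a binding introduced by a macro cannot capture a binding that the user wrote, even when the two share a name:

```lean
-- The macro introduces its own `two`; hygiene stamps it with a fresh
-- scope, so the user's `two` below is not captured by it.
macro "addTwo" e:term : term => `(let two := 2; $e + two)

-- With hygiene this is 10 + 2 = 12; unhygienic capture would give 2 + 2 = 4.
#eval let two := 10; addTwo two
```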

Even though the hygiene system was designed to prevent bindings introduced by macros from interfering with each other or with other bindings, it can be repurposed to prevent interference between bindings in the embedded language and bindings in Lean. With a bit of trickery, it is possible to obtain a unique scope token that can be added to all embedded language bindings and uses. Adding it to the bindings prevents Lean terms from accessing them, and adding it to the use sites causes them to avoid Lean's bindings and gives them access to the correct bindings.
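To make the trick concrete, here is a sketch of how a scope can be stamped onto an identifier (scopedIdent is a hypothetical helper, not part of this post's implementation):

```lean
open Lean in
/-- Hypothetical helper: rebuild `x` with the scope `sc` attached, so that
    it neither captures nor is captured by unscoped Lean identifiers, and
    resolves only against other identifiers carrying the same scope. -/
def scopedIdent (x : Ident) (mod : Name) (sc : MacroScope) : Ident :=
  mkIdentFrom x (addMacroScope mod x.getId sc)
```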

Each step of macro expansion needs its own scope. To implement this, Lean provides a global supply of macro scopes for each module, and one of them is distinguished as the "current" scope. Between steps of expansion, the current scope is replaced by a fresh one from the supply.

The API to access the macro scopes contains getCurrMacroScope, which accesses the current macro scope, and withFreshMacroScope, which runs a monadic action with the current scope set to a unique, fresh scope. These tools can be combined to create a Lean command that defines a top-level constant that refers to a unique scope:

```lean
macro "#define_scope" x:ident : command => do
  let sc ← withFreshMacroScope getCurrMacroScope
  let thisModName ← getMainModule
  `(def $x : Name × MacroScope := ($(quote thisModName), $(quote sc)))
```

Running this command defines `specialEmbeddedScope` to be the desired scope along with the module's name:

```lean
#define_scope specialEmbeddedScope
```
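As a quick sanity check (hypothetical, not part of the original development), the generated constant can be inspected; its first component is the current module's name, and its second is whatever scope was fresh when the command ran:

```lean
-- Hypothetical check: the scope number depends on how many macro expansions
-- ran earlier in the file, so the exact value is not meaningful on its own.
#eval specialEmbeddedScope
```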

The next step is to write a function that adds this scope to a name:

```lean
def _root_.Lean.Name.addEmbeddedScope (n : Name) : Name :=
  let (modName, sc) := specialEmbeddedScope
  Lean.addMacroScope modName n sc
```
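The effect can be observed on an example name (again a hypothetical check, not from the original post): `addMacroScope` extends the original name with hygiene components derived from the main module's name and the scope index:

```lean
-- Hypothetical check: the printed name is the original name extended with
-- hygiene components from the module name and the embedded scope.
#eval (`x).addEmbeddedScope
```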

Because macro scopes are usually added by quotations, there's no particularly easy way to add the scope to an identifier (that is, a piece of syntax that represents a name occurrence). In Lean, the datatype of syntax is essentially an encoding of arbitrary trees, which allows the same type to be used for the results of parsers even when they're extensible with arbitrary new productions. The wrapper `TSyntax` tracks the syntactic category of a piece of syntax at the type level, which can prevent accidental confusion of syntax and also enables the use of the coercion machinery to automate the translation from one category to another. `TSyntax` postdates the paper on the macro system, but it's described in detail in section 4.4 of Sebastian Ullrich's thesis. Here, the parsed name inside identifier syntax is enriched with the embedded language's designated scope:

```lean
def _root_.Lean.TSyntax.addEmbeddedScope (x : TSyntax `ident) : TSyntax `ident :=
  match x with
  | .ident info str n _ => .ident info str n.addEmbeddedScope []
  | other => other
```

The term quasiquotation macro that uses this designated scope is identical to the previous iteration, except that it calls `addEmbeddedScope` at each occurrence of an embedded-language identifier.

```lean
open Lean in
scoped macro_rules
  | `({{{ ( $e ) }}}) => `({{{ $e }}})
  | `({{{ $e1 $e2 }}}) => `(Tm.app {{{ $e1 }}} {{{ $e2 }}})
  | `({{{ $x:ident }}}) => `($(x.addEmbeddedScope))
  | `({{{ fun $x => $body }}}) => do
    let xName : StrLit := quote (toString x.getId)
    `(let $(x.addEmbeddedScope) := Tm.loc $xName
      Tm.lam $xName {{{ $body }}})
  | `({{{ fun $x $y $xs* => $body }}}) => do
    let xName : StrLit := quote (toString x.getId)
    `(let $(x.addEmbeddedScope) := Tm.loc $xName
      Tm.lam $xName {{{ fun $y $xs* => $body }}})
  | `({{{ let $x := $e in $body }}}) => do
    let xName : StrLit := quote (toString x.getId)
    `(let e := {{{ $e }}}
      let $(x.addEmbeddedScope) := Tm.loc $xName
      Tm.let $xName e {{{ $body }}})
  | `({{{ ~$e }}}) => pure e
```
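To see that the scoped quotation still cooperates with escapes, here is a hypothetical example (not from the original post) that uses embedded-language bound variables inside an escape. Plain Lean code in the escape cannot see the hygienic bindings directly, but re-entering the quotation re-attaches the embedded scope to the identifiers, so they resolve correctly:

```lean
-- Sketch, assuming the Tm type and {{{ }}} notation defined earlier in the post:
-- inside ~(...), quoting an identifier with {{{ f }}} adds the embedded scope
-- back, so it refers to the embedded binder rather than any Lean binding.
example : Tm := {{{ fun f x => ~(Tm.app {{{ f }}} {{{ x }}}) }}}
```

This elaborates to the same term as `{{{ fun f x => f x }}}`, since the application rule expands to `Tm.app` applied to the two quoted variables.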

Now, implementation details no longer leak through variable capture. The embedded-language term correctly fails to refer to the surrounding Lean binding of `n`:

```lean
example (n : Nat) : Tm := {{{ fun xyz => n }}}
```

The resulting error is:

```
unknown identifier 'n✝'
```

Furthermore, the embedded language's binding of `n` no longer captures Lean's binding, so this example now works:

```lean
example (n : Nat) : Tm := {{{ fun n => ~(toChurch n) }}}
```

The implementation of the normalization test command is identical to the version without separate scopes, except that the embedded-language scope is added to identifier names:

```lean
open Lean in
scoped macro_rules
  | `(#norm_embedded ( unfoldTop := $unfold ) $ρ ⟩=> $e:term) =>
    `(nfTop (open UnfoldTop in $unfold) $ρ $e)
  | `(#norm_embedded ( unfoldTop := $unfold )
      def $x:ident := $e1:embedded
      $ds:decl* $ρ ⟩=>%$arr $e:term) => do
    let xName : StrLit := quote (toString x.getId)
    `(let ρ' := $(ρ).push ($xName, {{{ $e1 }}})
      let $(x.addEmbeddedScope) : Tm := Tm.top $xName
      #norm_embedded ( unfoldTop := $unfold ) $ds* ρ' ⟩=> $e)
```

This version similarly keeps embedded language bindings separate from Lean's bindings:

```lean
def n : Nat := 5

#norm_embedded
  def zero := fun f z => z
  def n := ~(toChurch n)
  def n' := ~(toChurch n)
  def add := fun n k f z => n f (k f z)
  {{{ fun f z => f (f (f (f (f (f (f (f (f (f z))))))))) }}} => add n n

end ScopedBinders
```

Next Post

The techniques demonstrated in this post are relatively easy to employ, and they provide a nice user interface for embedded languages. Reusing Lean's binders to represent embedded-language binders means that features like occurrence highlighting and rename refactorings work automatically, with no further effort required. After a small amount of work, the formal artifact has become a usable prototype, and it is practical to use Lean to translate higher-level features (such as numbers) into lower-level encodings that are suitable for the formalism. For many uses, this technique is sufficient.

However, this implementation is beginning to reach the limits of what can be easily written as a macro. Maintaining the current environment while expanding the list of declarations required an explicit representation of the state. While there is a long history of encoding quite complicated control flow using continuation-passing-style macros in Scheme, the code tends to become difficult to read and maintain over time. At the same time, most languages that are interesting to programming language researchers will require more information about identifiers than just whether they're in scope, such as their type, and there's no easy way for the macro to check these features.

An upcoming post will demonstrate some techniques for associating more information with bound variables, moving from a macro-based implementation to an elaborator, which uses the compiler's API to translate terms directly to Lean's core type theory.