Like many languages, Lean uses square brackets for indexing into arrays and lists.
For instance, if
woodlandCritters is defined as follows:
def woodlandCritters : List String := ["hedgehog", "deer", "snail"]
then the individual components can be extracted:
def hedgehog := woodlandCritters def deer := woodlandCritters def snail := woodlandCritters
However, attempting to extract the fourth element results in a compile-time error, rather than a run-time error:
def oops := woodlandCritters
failed to prove index is valid, possible solutions: - Use `have`-expressions to prove the index is valid - Use `a[i]!` notation instead, runtime check is perfomed, and 'Panic' error message is produced if index is not valid - Use `a[i]?` notation instead, result is an `Option` type - Use `a[i]'h` notation instead, where `h` is a proof that index is valid ⊢ 3 < List.length woodlandCritters
This error message is saying Lean tried to automatically mathematically prove that
3 < List.length woodlandCritters, which would mean that the lookup was safe, but that it could not do so.
Out-of-bounds errors are a common class of bugs, and Lean uses its dual nature as a programming language and a theorem prover to rule out as many as possible.
Understanding how this works requires an understanding of three key ideas: propositions, proofs, and tactics.
A proposition is a statement that can be true or false. All of the following are propositions:
- 1 + 1 = 2
- Addition is commutative
- There are infinitely many prime numbers
- 1 + 1 = 15
- Paris is the capital of France
- Buenos Aires is the capital of South Korea
- All birds can fly
On the other hand, nonsense statements are not propositions. None of the following are propositions:
- 1 + green = ice cream
- All capital cities are prime numbers
- At least one gorg is a fleep
Propositions come in two varieties: those that are purely mathematical, relying only on our definitions of concepts, and those that are facts about the world. Theorem provers like Lean are concerned with the former category, and have nothing to say about the flight capabilities of penguins or the legal status of cities.
A proof is a convincing argument that a proposition is true. For mathematical propositions, these arguments make use of the definitions of the concepts that are involved as well as the rules of logical argumentation. Most proofs are written for people to understand, and leave out many tedious details. Computer-aided theorem provers like Lean are designed to allow mathematicians to write proofs while omitting many details, and it is the software's responsibility to fill in the missing explicit steps. This decreases the likelihood of oversights or mistakes.
In Lean, a program's type describes the ways it can be interacted with.
For instance, a program of type
Nat → List String is a function that takes a
Nat argument and produces a list of strings.
In other words, each type specifies what counts as a program with that type.
In Lean, propositions are in fact types. They specify what counts as evidence that the statement is true. The proposition is proved by providing this evidence. On the other hand, if the proposition is false, then it will be impossible to construct this evidence.
For example, the proposition "1 + 1 = 2" can be written directly in Lean.
The evidence for this proposition is the constructor
rfl, which is short for reflexivity:
def onePlusOneIsTwo : 1 + 1 = 2 := rfl
On the other hand,
rfl does not prove the false proposition "1 + 1 = 15":
def onePlusOneIsFifteen : 1 + 1 = 15 := rfl
type mismatch rfl has type 1 + 1 = 1 + 1 : Prop but is expected to have type 1 + 1 = 15 : Prop
This error message indicates that
rfl can prove that two expressions are equal when both sides of the equality statement are already the same number.
1 + 1 evaluates directly to
2, they are considered to be the same, which allows
onePlusOneIsTwo to be accepted.
Type describes types such as
List (Nat × String × (Int → Float)) that represent data structures and functions,
Prop describes propositions.
When a proposition has been proven, it is called a theorem.
In Lean, it is conventional to declare theorems with the
theorem keyword instead of
This helps readers see which declarations are intended to be read as mathematical proofs, and which are definitions.
Generally speaking, with a proof, what matters is that there is evidence that a proposition is true, but it's not particularly important which evidence was provided.
With definitions, on the other hand, it matters very much which particular value is selected—after all, a definition of addition that always returns
0 is clearly wrong.
The prior example could be rewritten as follows:
def OnePlusOneIsTwo : Prop := 1 + 1 = 2 theorem onePlusOneIsTwo : OnePlusOneIsTwo := rfl
Proofs are normally written using tactics, rather than by providing evidence directly. Tactics are small programs that construct evidence for a proposition. These programs run in a proof state that tracks the statement that is to be proved (called the goal) along with the assumptions that are available to prove it. Running a tactic on a goal results in a new proof state that contains new goals. The proof is complete when all goals have been proven.
To write a proof with tactics, begin the definition with
by puts Lean into tactic mode until the end of the next indented block.
While in tactic mode, Lean provides ongoing feedback about the current proof state.
Written with tactics,
onePlusOneIsTwo is still quite short:
theorem onePlusOneIsTwo : 1 + 1 = 2 := by simp
simp tactic, short for "simplify", is the workhorse of Lean proofs.
It rewrites the goal to as simple a form as possible, taking care of parts of the proof that are small enough.
In particular, it proves simple equality statements.
Behind the scenes, a detailed formal proof is constructed, but using
simp hides this complexity.
Tactics are useful for a number of reasons:
- Many proofs are complicated and tedious when written out down to the smallest detail, and tactics can automate these uninteresting parts.
- Proofs written with tactics are easier to maintain over time, because flexible automation can paper over small changes to definitions.
- Because a single tactic can prove many different theorems, Lean can use tactics behind the scenes to free users from writing proofs by hand. For instance, an array lookup requires a proof that the index is in bounds, and a tactic can typically construct that proof without the user needing to worry about it.
Behind the scenes, indexing notation uses a tactic to prove that the user's lookup operation is safe.
This tactic is
simp, configured to take certain arithmetic identities into account.
The basic building blocks of logic, such as "and", "or", "true", "false", and "not", are called logical connectives. Each connective defines what counts as evidence of its truth. For example, to prove a statement "A and B", one must prove both A and B. This means that evidence for "A and B" is a pair that contains both evidence for A and evidence for B. Similarly, evidence for "A or B" consists of either evidence for A or evidence for B.
In particular, most of these connectives are defined like datatypes, and they have constructors.
B are propositions, then "
A ∧ B) is a proposition.
A ∧ B consists of the constructor
And.intro, which has the type
A → B → A ∧ B.
B with concrete propositions, it is possible to prove
1 + 1 = 2 ∧ "Str".append "ing" = "String" with
And.intro rfl rfl.
simp is also powerful enough to find this proof:
theorem addAndAppend : 1 + 1 = 2 ∧ "Str".append "ing" = "String" := by simp
A ∨ B) has two constructors, because a proof of "
B" requires only that one of the two underlying propositions be true.
There are two constructors:
Or.inl, with type
A → A ∨ B, and
Or.inr, with type
B → A ∨ B.
Implication (if A then B) is represented using functions.
In particular, a function that transforms evidence for A into evidence for B is itself evidence that A implies B.
This is different from the usual description of implication, in which
A → B is shorthand for
¬A ∨ B, but the two formulations are equivalent.
Because evidence for an "and" is a constructor, it can be used with pattern matching. For instance, a proof that A and B implies A or B is a function that pulls the evidence of A (or of B) out of the evidence for A and B, and then uses this evidence to produce evidence of A or B:
theorem andImpliesOr : A ∧ B → A ∨ B := fun andEvidence => match andEvidence with | And.intro a b => Or.inl a
|A and B|
|A or B||Either |
|A implies B||A function that transforms evidence of A into evidence of B|
|not A||A function that would transform evidence of A into evidence of |
simp tactic can prove theorems that use these connectives.
theorem onePlusOneAndLessThan : 1 + 1 = 2 ∨ 3 < 5 := by simp theorem notTwoEqualFive : ¬(1 + 1 = 5) := by simp theorem trueIsTrue : True := True.intro theorem trueOrFalse : True ∨ False := by simp theorem falseImpliesTrue : False → True := by simp
simp does a great job proving propositions that involve equalities and inequalities of specific numbers, it is not very good at proving statements that involve variables.
simp can prove that
4 < 15, but it can't easily tell that because
x < 4, it's also true that
x < 15.
Because index notation uses
simp behind the scenes to prove that array access is safe, it can require a bit of hand-holding.
One of the easiest ways to make indexing notation work well is to have the function that performs a lookup into a data structure take the required evidence of safety as an argument. For instance, a function that returns the third entry in a list is not generally safe because lists might contain zero, one, or two entries:
def third (xs : List α) : α := xs
failed to prove index is valid, possible solutions: - Use `have`-expressions to prove the index is valid - Use `a[i]!` notation instead, runtime check is perfomed, and 'Panic' error message is produced if index is not valid - Use `a[i]?` notation instead, result is an `Option` type - Use `a[i]'h` notation instead, where `h` is a proof that index is valid α : Type ?u.3908 xs : List α ⊢ 2 < List.length xs
However, the obligation to show that the list has at least three entries can be imposed on the caller by adding an argument that consists of evidence that the indexing operation is safe:
def third (xs : List α) (ok : xs.length > 2) : α := xs
In this example,
xs.length > 2 is not a program that checks whether
xs has more than 2 entries.
It is a proposition that could be true or false, and the argument
ok must be evidence that it is true.
When the function is called on a concrete list, its length is known.
In these cases,
by simp can construct the evidence automatically:
#eval third woodlandCritters (by simp)
In cases where it's not practical to prove that an indexing operation is in bounds, there are other alternatives.
Adding a question mark results in an
Option, where the result is
some if the index is in bounds, and
def thirdOption (xs : List α) : Option α := xs? #eval thirdOption woodlandCritters
#eval thirdOption ["only", "two"]
There is also a version that crashes the program when the index is out of bounds, rather than returning an
Because code that is run with
#eval runs in the context of the Lean compiler, selecting the wrong index can crash your IDE.
In addition to the error that occurs when Lean is unable to find compile-time evidence that an indexing operation is safe, polymorphic functions that use unsafe indexing may produce the following message:
def unsafeThird (xs : List α) : α := xs!
failed to synthesize instance Inhabited α
This is due to a technical restriction that is part of keeping Lean usable as both a logic for proving theorems and a programming language. In particular, only programs whose types contain at least one value are allowed to crash. This is because a proposition in Lean is a kind of type that classifies evidence of its truth. False propositions have no such evidence. If a program with an empty type could crash, then that crashing program could be used as a kind of fake evidence for a false proposition.
Internally, Lean contains a table of types that are known to have at least one value.
This error is saying that some arbitrary type
α is not necessarily in that table.
The next chapter describes how to add to this table, and how to successfully write functions like
Adding whitespace between a list and the brackets used for lookup can cause another message:
#eval woodlandCritters 
function expected at woodlandCritters term has type List String
Adding a space causes Lean to treat the expression as a function application, and the index as a list that contains a single number.
This error message results from having Lean attempt to treat
woodlandCritters as a function.
- Prove the following theorems using
2 + 3 = 5,
15 - 8 = 7,
"Hello, ".append "world" = "Hello, world". What happens if
rflis used to prove
5 < 18? Why?
- Prove the following theorems using
2 + 3 = 5,
15 - 8 = 7,
"Hello, ".append "world" = "Hello, world",
5 < 18.
- Write a function that looks up the fifth entry in a list. Pass the evidence that this lookup is safe as an argument to the function.