To write efficient code, it is important to select appropriate data structures. Linked lists have their place: in some applications, the ability to share the tails of lists is very important. However, most use cases for a variable-length sequential collection of data are better served by arrays, which have both less memory overhead and better locality.
Arrays, however, have two drawbacks relative to lists:
- Arrays are accessed through indexing, rather than by pattern matching, which imposes proof obligations in order to maintain safety.
- A loop that processes an entire array from left to right is a tail-recursive function, but it does not have an argument that decreases on each call.
Making effective use of arrays requires knowing how to prove to Lean that an array index is in bounds, and how to prove that an array index that approaches the size of the array also causes the program to terminate. Both of these are expressed using an inequality proposition, rather than propositional equality.
Because different types have different notions of ordering, inequality is governed by two type classes, called
The table in the section on standard type classes describes how these classes relate to the syntax:
In other words, a type may customize the meaning of the
≤ operators, while
≥ derive their meanings from
LE have methods that return propositions rather than
class LE (α : Type u) where le : α → α → Prop class LT (α : Type u) where lt : α → α → Prop
The instance of
Nat delegates to
instance : LE Nat where le := Nat.le
Nat.le requires a feature of Lean that has not yet been presented: it is an inductively-defined relation.
Nat.le is an inductively-defined relation.
inductive can be used to create new datatypes, it can also be used to create new propositions.
When a proposition takes an argument, it is referred to as a predicate that may be true for some, but not all, potential arguments.
Propositions that take multiple arguments are called relations.
Each constructor of an inductively defined proposition is a way to prove it. In other words, the declaration of the proposition describes the different forms of evidence that it is true. A proposition with no arguments that has a single constructor can be quite easy to prove:
inductive EasyToProve : Prop where | heresTheProof : EasyToProve
The proof consists of using its constructor:
theorem fairlyEasy : EasyToProve := by constructor
In fact, the proposition
True, which should always be easy to prove, is defined just like
inductive True : Prop where | intro : True
Inductively-defined propositions that don't take arguments are not nearly as interesting as inductively-defined datatypes.
This is because data is interesting in its own right—the natural number
3 is different from the number
35, and someone who has ordered 3 pizzas will be upset if 35 arrive at their door 30 minutes later.
The constructors of a proposition describe ways in which the proposition can be true, but once a proposition has been proved, there is no need to know which underlying constructors were used.
This is why most interesting inductively-defined types in the
Prop universe take arguments.
The inductively-defined predicate
IsThree states that its argument is three:
inductive IsThree : Nat → Prop where | isThree : IsThree 3
The mechanism used here is just like indexed families such as
HasCol, except the resulting type is a proposition that can be proved rather than data that can be used.
Using this predicate, it is possible to prove that three is indeed three:
theorem three_is_three : IsThree 3 := by constructor
IsFive is a predicate that states that its argument is
inductive IsFive : Nat → Prop where | isFive : IsFive 5
If a number is three, then the result of adding two to it should be five. This can be expressed as a theorem statement:
theorem three_plus_two_five : IsThree n → IsFive (n + 2) := by skip
The resulting goal has a function type:
unsolved goals n : Nat ⊢ IsThree n → IsFive (n + 2)
intro tactic can be used to convert the argument into an assumption:
theorem three_plus_two_five : IsThree n → IsFive (n + 2) := by intro three
unsolved goals n : Nat three : IsThree n ⊢ IsFive (n + 2)
Given the assumption that
n is three, it should be possible to use the constructor of
IsFive to complete the proof:
theorem three_plus_two_five : IsThree n → IsFive (n + 2) := by intro three constructor
However, this results in an error:
tactic 'constructor' failed, no applicable constructor found n : Nat three : IsThree n ⊢ IsFive (n + 2)
This error occurs because
n + 2 is not definitionally equal to
In an ordinary function definition, dependent pattern matching on the assumption
three could be used to refine
The tactic equivalent of dependent pattern matching is
cases, which has a syntax similar to that of
theorem three_plus_two_five : IsThree n → IsFive (n + 2) := by intro three cases three with | isThree => skip
In the remaining case,
n has been refined to
unsolved goals case isThree ⊢ IsFive (3 + 2)
3 + 2 is definitionally equal to
5, the constructor is now applicable:
theorem three_plus_two_five : IsThree n → IsFive (n + 2) := by intro three cases three with | isThree => constructor
The standard false proposition
False has no constructors, making it impossible to provide direct evidence for.
The only way to provide evidence for
False is if an assumption is itself impossible, similarly to how
nomatch can be used to mark code that the type system can see is unreachable.
As described in the initial Interlude on proofs, the negation
Not A is short for
A → False.
Not A can also be written
It is not the case that four is three:
theorem four_is_not_three : ¬ IsThree 4 := by skip
The initial proof goal contains
unsolved goals ⊢ ¬IsThree 4
The fact that it's actually a function type can be exposed using
theorem four_is_not_three : ¬ IsThree 4 := by simp [Not]
unsolved goals ⊢ IsThree 4 → False
Because the goal is a function type,
intro can be used to convert the argument into an assumption.
There is no need to keep
intro can unfold the definition of
theorem four_is_not_three : ¬ IsThree 4 := by intro h
unsolved goals h : IsThree 4 ⊢ False
In this proof, the
cases tactic solves the goal immediately:
theorem four_is_not_three : ¬ IsThree 4 := by intro h cases h
Just as a pattern match on a
Vect String 2 doesn't need to include a case for
Vect.nil, a proof by cases over
IsThree 4 doesn't need to include a case for
The definition of
Nat.le has a parameter and an index:
inductive Nat.le (n : Nat) : Nat → Prop | refl : Nat.le n n | step : Nat.le n m → Nat.le n (m + 1)
n is the number that should be smaller, while the index is the number that should be greater than or equal to
refl constructor is used when both numbers are equal, while the
step constructor is used when the index is greater than
From the perspective of evidence, a proof that \( n \leq k \) consists of finding some number \( d \) such that \( n + d = m \).
In Lean, the proof then consists of a
Nat.le.refl constructor wrapped by \( d \) instances of
step constructor adds one to its index argument, so \( d \)
step constructors adds \( d \) to the larger number.
For example, evidence that four is less than or equal to seven consists of three
steps around a
theorem four_le_seven : 4 ≤ 7 := open Nat.le in step (step (step refl))
The strict less-than relation is defined by adding one to the number on the left:
def Nat.lt (n m : Nat) : Prop := Nat.le (n + 1) m instance : LT Nat where lt := Nat.lt
Evidence that four is strictly less than seven consists of two
step's around a
theorem four_lt_seven : 4 < 7 := open Nat.le in step (step refl)
This is because
4 < 7 is equivalent to
5 ≤ 7.
Array.map transforms an array with a function, returning a new array that contains the result of applying the function to each element of the input array.
Writing it as a tail-recursive function follows the usual pattern of delegating to a function that passes the output array in an accumulator.
The accumulator is initialized with an empty array.
The accumulator-passing helper function also takes an argument that tracks the current index into the array, which starts at
def Array.map (f : α → β) (arr : Array α) : Array β := arrayMapHelper f arr Array.empty 0
The helper should, at each iteration, check whether the index is still in bounds.
If so, it should loop again with the transformed element added to the end of the accumulator and the index incremented by
If not, then it should terminate and return the accumulator.
An initial implementation of this code fails because Lean is unable to prove that the array index is valid:
def arrayMapHelper (f : α → β) (arr : Array α) (soFar : Array β) (i : Nat) : Array β := if i < arr.size then arrayMapHelper f arr (soFar.push (f arr[i])) (i + 1) else soFar
failed to prove index is valid, possible solutions: - Use `have`-expressions to prove the index is valid - Use `a[i]!` notation instead, runtime check is perfomed, and 'Panic' error message is produced if index is not valid - Use `a[i]?` notation instead, result is an `Option` type - Use `a[i]'h` notation instead, where `h` is a proof that index is valid α : Type ?u.1704 β : Type ?u.1707 f : α → β arr : Array α soFar : Array β i : Nat ⊢ i < Array.size arr
However, the conditional expression already checks the precise condition that the array index's validity demands (namely,
i < arr.size).
Adding a name to the
if resolves the issue, because it adds an assumption that the array indexing tactic can use:
def arrayMapHelper (f : α → β) (arr : Array α) (soFar : Array β) (i : Nat) : Array β := if inBounds : i < arr.size then arrayMapHelper f arr (soFar.push (f arr[i])) (i + 1) else soFar
Lean does not, however, accept the modified program, because the recursive call is not made on an argument to one of the input constructors. In fact, both the accumulator and the index grow, rather than shrinking:
fail to show termination for arrayMapHelper with errors argument #6 was not used for structural recursion failed to eliminate recursive application arrayMapHelper f✝ arr (Array.push soFar (f✝ arr[i])) (i + 1) structural recursion cannot be used failed to prove termination, use `termination_by` to specify a well-founded relation
Nevertheless, this function terminates, so simply marking it
partial would be unfortunate.
Each iteration checks whether the index
i is still in bounds for the array
i is incremented and the loop repeats.
If not, the program terminates.
arr.size is a finite number,
i can be incremented only a finite number of times.
Even though no argument to the function decreases on each call,
arr.size - i decreases toward zero.
Lean can be instructed to use another expression for termination by providing a
termination_by clause at the end of a definition.
termination_by clause has two components: names for the function's arguments and an expression using those names that should decrease on each call.
arrayMapHelper, the final definition looks like this:
def arrayMapHelper (f : α → β) (arr : Array α) (soFar : Array β) (i : Nat) : Array β := if inBounds : i < arr.size then arrayMapHelper f arr (soFar.push (f arr[i])) (i + 1) else soFar termination_by arrayMapHelper _ arr _ i _ => arr.size - i
A similar termination proof can be used to write
Array.find, a function that finds the first element in an array that satisfies a Boolean function and returns both the element and its index:
def Array.find (arr : Array α) (p : α → Bool) : Option (Nat × α) := findHelper arr p 0
Once again, the helper function terminates because
arr.size - i decreases as
def findHelper (arr : Array α) (p : α → Bool) (i : Nat) : Option (Nat × α) := if h : i < arr.size then let x := arr[i] if p x then some (i, x) else findHelper arr p (i + 1) else none termination_by findHelper arr p i => arr.size - i
Not all termination arguments are as quite as simple as this one. However, the basic structure of identifying some expression based on the function's arguments that will decrease in each call occurs in all termination proofs. Sometimes, creativity can be required in order to figure out just why a function terminates, and sometimes Lean requires additional proofs in order to accept the termination argument.
- Implement a
ForM (Array α)instance on arrays using a tail-recursive accumulator-passing function and a
- Implement a function to reverse arrays using a tail-recursive accumulator-passing function that doesn't need a
Array.find, and the
for ... in ...loops in the identity monad and compare the resulting code.
- Reimplement array reversal using a
for ... in ...loop in the identity monad. Compare it to the tail-recursive function.