To write efficient code, it is important to select appropriate data structures.
Linked lists have their place: in some applications, the ability to share the tails of lists is very important.
However, most use cases for a variable-length sequential collection of data are better served by arrays, which have both less memory overhead and better locality.
Arrays, however, have two drawbacks relative to lists:
Arrays are accessed through indexing, rather than by pattern matching, which imposes proof obligations in order to maintain safety.
A loop that processes an entire array from left to right is a tail-recursive function, but it does not have an argument that decreases on each call.
Making effective use of arrays requires knowing how to prove to Lean that an array index is in bounds, and how to prove that an array index that approaches the size of the array also causes the program to terminate.
Both of these are expressed using an inequality proposition, rather than propositional equality.
Because different types have different notions of ordering, inequality is governed by two type classes, called LE and LT.
The table in the section on standard type classes describes how these classes relate to the syntax:
In other words, a type may customize the meaning of the < and ≤ operators, while > and ≥ derive their meanings from < and ≤.
The classes LT and LE have methods that return propositions rather than Bools:
Defining Nat.le requires a feature of Lean that has not yet been presented: it is an inductively-defined relation.
8.3.1.1. Inductively-Defined Propositions, Predicates, and Relations🔗
Nat.le is an inductively-defined relation.
Just as inductive can be used to create new datatypes, it can be used to create new propositions.
When a proposition takes an argument, it is referred to as a predicate that may be true for some, but not all, potential arguments.
Propositions that take multiple arguments are called relations.
Each constructor of an inductively defined proposition is a way to prove it.
In other words, the declaration of the proposition describes the different forms of evidence that it is true.
A proposition with no arguments that has a single constructor can be quite easy to prove:
In fact, the proposition True, which should always be easy to prove, is defined just like EasyToProve:
inductiveTrue:Propwhere|intro:True
Inductively-defined propositions that don't take arguments are not nearly as interesting as inductively-defined datatypes.
This is because data is interesting in its own right—the natural number 3 is different from the number 35, and someone who has ordered 3 pizzas will be upset if 35 arrive at their door 30 minutes later.
The constructors of a proposition describe ways in which the proposition can be true, but once a proposition has been proved, there is no need to know which underlying constructors were used.
This is why most interesting inductively-defined types in the Prop universe take arguments.
The inductively-defined predicate IsThree states that its argument is three:
The mechanism used here is just like indexed families such as HasCol, except the resulting type is a proposition that can be proved rather than data that can be used.
Using this predicate, it is possible to prove that three is indeed three:
Tactic `constructor` failed: no applicable constructor foundn:Natthree:IsThreen⊢ IsFive(n+2)
This error occurs because n+2 is not definitionally equal to 5.
In an ordinary function definition, dependent pattern matching on the assumption three could be used to refine n to 3.
The tactic equivalent of dependent pattern matching is cases, which has a syntax similar to that of induction:
The standard false proposition False has no constructors, making it impossible to provide direct evidence for.
The only way to provide evidence for False is if an assumption is itself impossible, similarly to how nomatch can be used to mark code that the type system can see is unreachable.
As described in the initial Interlude on proofs, the negation NotA is short for A→False.
NotA can also be written ¬A.
Because the goal is a function type, intro can be used to convert the argument into an assumption.
There is no need to keep unfold, as intro can unfold the definition of Not itself:
Just as a pattern match on a VectString2 doesn't need to include a case for Vect.nil, a proof by cases over IsThree4 doesn't need to include a case for isThree.
The parameter n is the number that should be smaller, while the index is the number that should be greater than or equal to n.
The refl constructor is used when both numbers are equal, while the step constructor is used when the index is greater than n.
From the perspective of evidence, a proof that n \leq k consists of finding some number d such that n + d = m.
In Lean, the proof then consists of a Nat.le.refl constructor wrapped by d instances of Nat.le.step.
Each step constructor adds one to its index argument, so dstep constructors adds d to the larger number.
For example, evidence that four is less than or equal to seven consists of three steps around a refl:
The function Array.map transforms an array with a function, returning a new array that contains the result of applying the function to each element of the input array.
Writing it as a tail-recursive function follows the usual pattern of delegating to a function that passes the output array in an accumulator.
The accumulator is initialized with an empty array.
The accumulator-passing helper function also takes an argument that tracks the current index into the array, which starts at 0:
The helper should, at each iteration, check whether the index is still in bounds.
If so, it should loop again with the transformed element added to the end of the accumulator and the index incremented by 1.
If not, then it should terminate and return the accumulator.
An initial implementation of this code fails because Lean is unable to prove that the array index is valid:
defarrayMapHelper(f:α→β)(arr:Arrayα)(soFar:Arrayβ)(i:Nat):Arrayβ:=ifi<arr.sizethenarrayMapHelperfarr(soFar.push(ffailed to prove index is valid, possible solutions: - Use `have`-expressions to prove the index is valid - Use `a[i]!` notation instead, runtime check is performed, and 'Panic' error message is produced if index is not valid - Use `a[i]?` notation instead, result is an `Option` type - Use `a[i]'h` notation instead, where `h` is a proof that index is validα:Type ?u.3051β:Type ?u.3054f:α→βarr:ArrayαsoFar:Arrayβi:Nat⊢ i<arr.sizearr[i]))(i+1)elsesoFar
failed to prove index is valid, possible solutions: - Use `have`-expressions to prove the index is valid - Use `a[i]!` notation instead, runtime check is performed, and 'Panic' error message is produced if index is not valid - Use `a[i]?` notation instead, result is an `Option` type - Use `a[i]'h` notation instead, where `h` is a proof that index is validα:Type ?u.3051β:Type ?u.3054f:α→βarr:ArrayαsoFar:Arrayβi:Nat⊢ i<arr.size
However, the conditional expression already checks the precise condition that the array index's validity demands (namely, i<arr.size).
Adding a name to the if resolves the issue, because it adds an assumption that the array indexing tactic can use:
Lean accepts the modified program, even though the recursive call is not made on an argument to one of the input constructors.
In fact, both the accumulator and the index grow, rather than shrinking.
Behind the scenes, Lean's proof automation constructs a termination proof.
Reconstructing this proof can make it easier to understand the cases that Lean cannot automatically recognize.
Why does arrayMapHelper terminate?
Each iteration checks whether the index i is still in bounds for the array arr.
If so, i is incremented and the loop repeats.
If not, the program terminates.
Because arr.size is a finite number, i can be incremented only a finite number of times.
Even though no argument to the function decreases on each call, arr.size-i decreases toward zero.
The value that decreases at each recursive call is called a measure.
Lean can be instructed to use a specific expression as the measure of termination by providing a termination_by clause at the end of a definition.
For arrayMapHelper, the explicit measure looks like this:
A similar termination proof can be used to write Array.find, a function that finds the first element in an array that satisfies a Boolean function and returns both the element and its index:
Adding a question mark to termination_by (that is, using termination_by?) causes Lean to explicitly suggest the measure that it chose.
Clicking [apply] replaces termination_by? with the suggested measure:
Not all termination arguments are as quite as simple as this one.
However, the basic structure of identifying some expression based on the function's arguments that will decrease in each call occurs in all termination proofs.
Sometimes, creativity can be required in order to figure out just why a function terminates, and sometimes Lean requires additional proofs in order to accept that the measure in fact decreases.