Theorem Proving in Industry with Lean

Sebastian Ullrich
Head of Engineering, Lean FRO

Dutch Formal Methods Day · June 4, 2026

zlib in Lean

A general-purpose AI autonomously converted zlib (C compression library) to Lean, passed the test suite, and proved the code correct

theorem zlib_decompressSingle_compress (data : ByteArray) (level : UInt8)
    (maxOutputSize : Nat) (hsize : data.size ≤ maxOutputSize) :
  ZlibDecode.decompressSingle (ZlibEncode.compress data level) maxOutputSize =
  .ok data

github.com/kim-em/lean-zip

Blog post: When AI Writes the World's Software, Who Verifies It?

What is Lean?

A programming language and a proof assistant

Same language for code and proofs

Lean is implemented in Lean, enabling unprecedented extensibility

Lean is scalable

Lean has a small trusted kernel

Multiple Independent Kernels

The reference kernel (C++)
nanoda (Rust)
lean4lean (Lean)
nanobruijn (Rust)
and more

arena.lean-lang.org hosts independent kernel implementations

Anyone can build a kernel and submit it

Disagreement between independent kernels exposes bugs that neither would catch alone

The State of Lean

250,000+ unique installations: VS Code (143K) + Open VSX (107K)

9,000+ GitHub repositories depending on Lean

Industry: AWS, Google, Microsoft, Mistral, Nethermind, Galois, others

Lean Beginnings

How did we get here?

Before Lean, there was Z3

Z3 is a state-of-the-art SMT solver created by Leonardo de Moura

Z3 powered bug-finding pipelines at Microsoft, catching thousands of defects

But for whole-program verification, proofs were too brittle: minor code edits broke solver traces

Leo started the Lean project in 2013, aimed at merging interactive and automated theorem proving

Focus on classical logic from the start, collaboration with Jeremy Avigad (CMU) on math library

Mathlib: 2.2M+ Lines of Formalized Mathematics

270,000+ formalized theorems by 750+ contributors

An unprecedented community achievement

Mathlib dependency graph

The Liquid Tensor Experiment

In 2020, Peter Scholze posed a formalization challenge

"I spent much of 2019 obsessed with the proof of this theorem, almost getting crazy over it. I still have some small lingering doubts." — Peter Scholze

Johan Commelin led a team that verified the proof, with only minor corrections

They verified and simplified it without fully understanding the proof. Lean was a guide

"The Lean Proof Assistant was really that: an assistant in navigating through the thick jungle that this proof is." — Peter Scholze

Lean Is Taking Mathematics by Storm

Six Fields Medalists engaged: Tao, Scholze, Viazovska, Gowers, Hairer, Freedman

Carleson's Theorem (completed) — van Doorn
Liquid Tensor Experiment (completed) — Commelin
Equational Theories Project (completed) — Tao
Sphere Packing — Birkbeck, Hariharan, Lee, Ma, Mehta, Viazovska
Fermat's Last Theorem — Buzzard
Inter-Universal Teichmüller Theory — Mochizuki

AI startups

"Lean has become the de facto choice for AI-based systems of mathematical reasoning." — ACM SIGPLAN Programming Languages Software Award 2025

Startups using Lean

Software Verification

Lean is not only for mathematics.

SampCert (AWS): Verified Differential Privacy

Differential privacy is the gold standard for privacy-preserving data analysis. Implementing it correctly is hard

Mironov's attack exploited floating-point to break DP. Bugs in random number generation have destroyed privacy guarantees in real systems

SampCert (Jean-Baptiste Tristan): the first comprehensive, mechanized foundation for executable DP implementations

12,000+ lines of Lean proof
Discrete Laplace and Gaussian samplers, proved correct using Fourier analysis and the Poisson summation formula from Mathlib

SampCert — Deployed at AWS

Custom code extractor: the deployed artifact does not go through the Lean compiler. TCB includes the extractor, not the compiler

Performance: SampCert outperforms diffprivlib by >2× for discrete Gaussians

Deployed in AWS Clean Rooms Differential Privacy

PLDI 2025: "Verified Foundations for Differential Privacy."

SymCrypt (Microsoft): Verified Cryptography

SymCrypt verification with Aeneas

SymCrypt — Microsoft's core cryptographic library, rewritten in Rust

Aeneas translates safe Rust to pure functional Lean code. The bridge that makes Rust verification in Lean practical

SymCrypt — Verifying Code as Engineers Write It

They verify the Rust code as written by software engineers. No special verification-friendly subset

Code is evolving: new optimizations for specific hardware (AVX-512, Arm NEON, …). Proofs adapt to rewrites

"We are verifying code faster than we can write it." — Son Ho, SymCrypt, Microsoft

Anneal (Google): Verified Unsafe Rust

Google's zerocopy team building "literate verification" for unsafe Rust using Lean and Aeneas

Unsafe Rust accounts for ~2/3 of vulnerabilities in the Rust ecosystem. Safety invariants are documented only in prose comments

Anneal: developers write Lean specifications directly in Rust doc comments

spec blocks for safe code
unsafe(axiom) blocks for opaque unsafe operations with pre/post conditions
Type invariants for struct safety properties

Designed for AI agents. Antigravity demonstrated authoring unsafe Rust and proving its soundness

Move Borrow Checker

Ilya Sergey (NUS) formalized and verified the borrow checker of Move (smart contract language) in Lean

39,000 lines of mechanized metatheory

Claude (Opus 4.5/4.6) handled proof repair, pattern propagation, routine preservation cases

Human insight still needed for novel proof strategies (e.g., weakening lemmas, inter-procedural reasoning for function calls took two days)

27 working days. Estimated 5-6 months without AI

Blog post: Verifying Move Borrow Checker in Lean: an Experiment in AI-Assisted PL Metatheory

CSLib: A Mathlib for Computer Science

CSLib aims to be a foundation for teaching, research, and new verification efforts, including AI-assisted

CSLib: adding to CS foundations

Steering committee: Clark Barrett (Stanford/Amazon), Swarat Chaudhuri (Deepmind/Austin), Jim Grundy (Amazon), Pushmeet Kohli (Deepmind), Leo de Moura (Lean FRO/Amazon), Fabrizio Montesi (U of Southern Denmark)

Signal Shot

Signal Shot is a public moonshot to verify the Signal protocol and its Rust implementation using Lean

It is a joint effort of Signal (Rolfe Schmidt), the Beneficial AI Foundation (Max Tegmark), and the Lean FRO

The pieces:

Aeneas,
Mathlib and CSLib,
Symbolic proof automation (grind) and scalable verification generation (SymM),
and AI

Proof Automation

The "classic AI" part of scaling up program verification.

Do We Still Need Proof Automation?

"I thought AI would prove all theorems for us now."

Tactics are like game moves
Better tactics = shorter proofs
Better tactics = more powerful AI
Compact proofs = better training data = better AI provers

40 lines replaced by one grind call

What Is `grind`?

New proof automation, shipped in Lean 4.22, by Leonardo de Moura and Kim Morrison

A virtual whiteboard, inspired by modern SMT solvers

Writes facts on the board. Merges equivalent terms
Cooperating engines: congruence closure, E-matching, constraint propagation, guided case analysis
Satellite theory solvers: cutsat (linear integer arithmetic), commutative rings (Gröbner), linarith, AC
Native to dependent type theory. No translation to FOL
Produces ordinary Lean proof terms. Kernel-checkable

VC Generation

A VC generator turns code and specifications into proof obligations

Lean-based VC generators:

Aeneas — Rust verification via translation to Lean
Velvet — a Dafny-style verifier built in Lean
mvcgen — Lean's VC generator for monadic programs

Issue: Verification condition generation with proof assistants has not scaled so far

Adam Chlipala's group documented this for Rocq. The same story holds in Lean

Kraken (Google): x64 Semantics in Lean

Kraken is an x64 model written in Lean

Intended for verifying sequential software that performs computations using common registers and memory

Project started after the Lean@Google Hackathon in December 2025

Andres' Challenge

Andres Erbsen (Google ISE Formal, Rocq software verification veteran)

Andres distilled the scaling problem into a minimal challenge

A small deeply embedded imperative language

inductive Cmd where
  | skip : Cmd
  | set : String → Expr → Cmd
  | seq : Cmd → Cmd → Cmd
  | input : String → Cmd
  -- .

abbrev Memory       := PartialMap Word Byte
abbrev LocalContext := PartialMap String Word

Andres' Challenge — Operational Semantics

inductive IOEvent
  | IN : Word → IOEvent
  | OUT : Word → IOEvent

abbrev Post := List IOEvent → Memory → LocalContext → Prop

inductive Exec : Cmd → List IOEvent → Memory → LocalContext → Post → Prop
  | skip  : post es m l → Exec .skip es m l post
  | seq   : Exec c1 es m l mid →
            (∀ es' m' l', mid es' m' l' → Exec c2 es' m' l' post) →
            Exec (.seq c1 c2) es m l post
  | set   : (h_eval : Expr.eval m l e = some v) →
            (h_post : post es m (PartialMap.put l x v)) →
            Exec (.set x e) es m l post
  | input : (∀ v, post (IOEvent.IN v :: es) m (PartialMap.put l x v)) →
            Exec (.input x) es m l post
-- .

theorem Exec.seq_cps (c1 c2 : Cmd) (es : List IOEvent)
    (m : Memory) (l : LocalContext) (post : Post)
    : Exec c1 es m l (fun es' m' l' => Exec c2 es' m' l' post) →
      Exec (.seq c1 c2) es m l post := byc1:Cmdc2:Cmdes:List IOEventm:Memoryl:LocalContextpost:Post⊢ (Exec c1 es m l fun es' m' l' => Exec c2 es' m' l' post) → Exec (c1.seq c2) es m l post
  intro hc1:Cmdc2:Cmdes:List IOEventm:Memoryl:LocalContextpost:Posth:Exec c1 es m l fun es' m' l' => Exec c2 es' m' l' post⊢ Exec (c1.seq c2) es m l post; apply Exec.seqac1:Cmdc2:Cmdes:List IOEventm:Memoryl:LocalContextpost:Posth:Exec c1 es m l fun es' m' l' => Exec c2 es' m' l' post⊢ Exec c1 es m l ?midac1:Cmdc2:Cmdes:List IOEventm:Memoryl:LocalContextpost:Posth:Exec c1 es m l fun es' m' l' => Exec c2 es' m' l' post⊢ ∀ (es' : List IOEvent) (m' : Memory) (l' : LocalContext), ?mid es' m' l' → Exec c2 es' m' l' postmidc1:Cmdc2:Cmdes:List IOEventm:Memoryl:LocalContextpost:Posth:Exec c1 es m l fun es' m' l' => Exec c2 es' m' l' post⊢ Post; apply hac1:Cmdc2:Cmdes:List IOEventm:Memoryl:LocalContextpost:Posth:Exec c1 es m l fun es' m' l' => Exec c2 es' m' l' post⊢ ∀ (es' : List IOEvent) (m' : Memory) (l' : LocalContext), Exec c2 es' m' l' post → Exec c2 es' m' l' post; introsac1:Cmdc2:Cmdes:List IOEventm:Memoryl:LocalContextpost:Posth:Exec c1 es m l fun es' m' l' => Exec c2 es' m' l' postes'✝:List IOEventm'✝:Memoryl'✝:LocalContexta✝:Exec c2 es'✝ m'✝ l'✝ post⊢ Exec c2 es'✝ m'✝ l'✝ post; assumptionAll goals completed! 🐙

Andres' Challenge — A Parametric Program

notation:130 a " ;;\n" b => Cmd.seq a b
notation:150 x " ::= " e:150 => Cmd.set x e
notation:160 a " +' " b:160 => Expr.op Binop.add (Expr.var a) (Expr.var b)
notation:160 a " -' " b:160 => Expr.op Binop.sub (Expr.var a) (Expr.var b)

/-
Generates a command sequence of the form
input b
a := b
a := a + a; a := a - b
..
a := a + a; a := a - b
-/
def generated_cmd (n : Nat) (a b : String) : Cmd :=
  .input b ;;
  a ::= .var b ;;
  repeated_cmds n a b

abbrev spec (m : Memory) : Post := fun es m' l =>
  m' = m ∧ ∃ v : Word, es = [IOEvent.IN v] ∧ l.get "a" = some v

def Goal (n : Nat) : Prop :=
  ∀ m l, Exec (generated_cmd n "a" "b") [] m l (spec m)

Andres' Challenge — Baseline

example : Goal 1 := by⊢ Goal 1
  intro m lm:Memoryl:LocalContext⊢ Exec (generated_cmd 1 "a" "b") [] m l (spec m); simp only [generated_cmd, repeated_cmds]m:Memoryl:LocalContext⊢ Exec
  (Cmd.input "b" ;;
    "a" ::= Expr.var "b" ;;
      "a" ::= "a" +' "a" ;;
        "a" ::= "a" -' "b" ;;
          Cmd.skip)
  [] m l (spec m); unfold specm:Memoryl:LocalContext⊢ Exec
  (Cmd.input "b" ;;
    "a" ::= Expr.var "b" ;;
      "a" ::= "a" +' "a" ;;
        "a" ::= "a" -' "b" ;;
          Cmd.skip)
  [] m l fun es m' l => m' = m ∧ ∃ v, es = [IOEvent.IN v] ∧ PartialMap.get l "a" = some v
  apply Exec.seq_cpsam:Memoryl:LocalContext⊢ Exec (Cmd.input "b") [] m l fun es' m' l' =>
  Exec
    ("a" ::= Expr.var "b" ;;
      "a" ::= "a" +' "a" ;;
        "a" ::= "a" -' "b" ;;
          Cmd.skip)
    es' m' l' fun es m' l => m' = m ∧ ∃ v, es = [IOEvent.IN v] ∧ PartialMap.get l "a" = some v; apply Exec.inputa.am:Memoryl:LocalContext⊢ ∀ (v : Word),
  Exec
    ("a" ::= Expr.var "b" ;;
      "a" ::= "a" +' "a" ;;
        "a" ::= "a" -' "b" ;;
          Cmd.skip)
    [IOEvent.IN v] m (PartialMap.put l "b" v) fun es m' l =>
    m' = m ∧ ∃ v, es = [IOEvent.IN v] ∧ PartialMap.get l "a" = some v; intro va.am:Memoryl:LocalContextv:Word⊢ Exec
  ("a" ::= Expr.var "b" ;;
    "a" ::= "a" +' "a" ;;
      "a" ::= "a" -' "b" ;;
        Cmd.skip)
  [IOEvent.IN v] m (PartialMap.put l "b" v) fun es m' l =>
  m' = m ∧ ∃ v, es = [IOEvent.IN v] ∧ PartialMap.get l "a" = some v
  apply Exec.seq_cpsa.a.am:Memoryl:LocalContextv:Word⊢ Exec ("a" ::= Expr.var "b") [IOEvent.IN v] m (PartialMap.put l "b" v) fun es' m' l' =>
  Exec
    ("a" ::= "a" +' "a" ;;
      "a" ::= "a" -' "b" ;;
        Cmd.skip)
    es' m' l' fun es m' l => m' = m ∧ ∃ v, es = [IOEvent.IN v] ∧ PartialMap.get l "a" = some v; apply Exec.seta.a.a.h_evalm:Memoryl:LocalContextv:Word⊢ Expr.eval m (PartialMap.put l "b" v) (Expr.var "b") = some ?a.a.a.v✝a.a.a.h_postm:Memoryl:LocalContextv:Word⊢ Exec
  ("a" ::= "a" +' "a" ;;
    "a" ::= "a" -' "b" ;;
      Cmd.skip)
  [IOEvent.IN v] m ((PartialMap.put l "b" v).put "a" ?a.a.a.v✝) fun es m' l =>
  m' = m ∧ ∃ v, es = [IOEvent.IN v] ∧ PartialMap.get l "a" = some va.a.a.vm:Memoryl:LocalContextv:Word⊢ Word; simp_vca.a.a.h_evalm:Memoryl:LocalContextv:Word⊢ v = ?a.a.a.v✝a.a.a.h_postm:Memoryl:LocalContextv:Word⊢ Exec
  ("a" ::= "a" +' "a" ;;
    "a" ::= "a" -' "b" ;;
      Cmd.skip)
  [IOEvent.IN v] m ((PartialMap.put l "b" v).put "a" ?a.a.a.v✝) fun es m' l =>
  m' = m ∧ ∃ v, es = [IOEvent.IN v] ∧ PartialMap.get l "a" = some va.a.a.vm:Memoryl:LocalContextv:Word⊢ Word; rfla.a.a.h_postm:Memoryl:LocalContextv:Word⊢ Exec
  ("a" ::= "a" +' "a" ;;
    "a" ::= "a" -' "b" ;;
      Cmd.skip)
  [IOEvent.IN v] m ((PartialMap.put l "b" v).put "a" v) fun es m' l =>
  m' = m ∧ ∃ v, es = [IOEvent.IN v] ∧ PartialMap.get l "a" = some v
  apply Exec.seq_cpsa.a.a.h_post.am:Memoryl:LocalContextv:Word⊢ Exec ("a" ::= "a" +' "a") [IOEvent.IN v] m ((PartialMap.put l "b" v).put "a" v) fun es' m' l' =>
  Exec
    ("a" ::= "a" -' "b" ;;
      Cmd.skip)
    es' m' l' fun es m' l => m' = m ∧ ∃ v, es = [IOEvent.IN v] ∧ PartialMap.get l "a" = some v; apply Exec.seta.a.a.h_post.a.h_evalm:Memoryl:LocalContextv:Word⊢ Expr.eval m ((PartialMap.put l "b" v).put "a" v) ("a" +' "a") = some ?a.a.a.h_post.a.v✝a.a.a.h_post.a.h_postm:Memoryl:LocalContextv:Word⊢ Exec
  ("a" ::= "a" -' "b" ;;
    Cmd.skip)
  [IOEvent.IN v] m (((PartialMap.put l "b" v).put "a" v).put "a" ?a.a.a.h_post.a.v✝) fun es m' l =>
  m' = m ∧ ∃ v, es = [IOEvent.IN v] ∧ PartialMap.get l "a" = some va.a.a.h_post.a.vm:Memoryl:LocalContextv:Word⊢ Word; simp_vca.a.a.h_post.a.h_evalm:Memoryl:LocalContextv:Word⊢ Word.add (some v) (some v) = ?a.a.a.h_post.a.v✝a.a.a.h_post.a.h_postm:Memoryl:LocalContextv:Word⊢ Exec
  ("a" ::= "a" -' "b" ;;
    Cmd.skip)
  [IOEvent.IN v] m (((PartialMap.put l "b" v).put "a" v).put "a" ?a.a.a.h_post.a.v✝) fun es m' l =>
  m' = m ∧ ∃ v, es = [IOEvent.IN v] ∧ PartialMap.get l "a" = some va.a.a.h_post.a.vm:Memoryl:LocalContextv:Word⊢ Word; rfla.a.a.h_post.a.h_postm:Memoryl:LocalContextv:Word⊢ Exec
  ("a" ::= "a" -' "b" ;;
    Cmd.skip)
  [IOEvent.IN v] m (((PartialMap.put l "b" v).put "a" v).put "a" (Word.add (some v) (some v))) fun es m' l =>
  m' = m ∧ ∃ v, es = [IOEvent.IN v] ∧ PartialMap.get l "a" = some v
  apply Exec.seq_cpsa.a.a.h_post.a.h_post.am:Memoryl:LocalContextv:Word⊢ Exec ("a" ::= "a" -' "b") [IOEvent.IN v] m (((PartialMap.put l "b" v).put "a" v).put "a" (Word.add (some v) (some v)))
  fun es' m' l' =>
  Exec Cmd.skip es' m' l' fun es m' l => m' = m ∧ ∃ v, es = [IOEvent.IN v] ∧ PartialMap.get l "a" = some v; apply Exec.seta.a.a.h_post.a.h_post.a.h_evalm:Memoryl:LocalContextv:Word⊢ Expr.eval m (((PartialMap.put l "b" v).put "a" v).put "a" (Word.add (some v) (some v))) ("a" -' "b") =
  some ?a.a.a.h_post.a.h_post.a.v✝a.a.a.h_post.a.h_post.a.h_postm:Memoryl:LocalContextv:Word⊢ Exec Cmd.skip [IOEvent.IN v] m
  ((((PartialMap.put l "b" v).put "a" v).put "a" (Word.add (some v) (some v))).put "a" ?a.a.a.h_post.a.h_post.a.v✝)
  fun es m' l => m' = m ∧ ∃ v, es = [IOEvent.IN v] ∧ PartialMap.get l "a" = some va.a.a.h_post.a.h_post.a.vm:Memoryl:LocalContextv:Word⊢ Word; simp_vca.a.a.h_post.a.h_post.a.h_evalm:Memoryl:LocalContextv:Word⊢ v = ?a.a.a.h_post.a.h_post.a.v✝a.a.a.h_post.a.h_post.a.h_postm:Memoryl:LocalContextv:Word⊢ Exec Cmd.skip [IOEvent.IN v] m
  ((((PartialMap.put l "b" v).put "a" v).put "a" (Word.add (some v) (some v))).put "a" ?a.a.a.h_post.a.h_post.a.v✝)
  fun es m' l => m' = m ∧ ∃ v, es = [IOEvent.IN v] ∧ PartialMap.get l "a" = some va.a.a.h_post.a.h_post.a.vm:Memoryl:LocalContextv:Word⊢ Word; rfla.a.a.h_post.a.h_post.a.h_postm:Memoryl:LocalContextv:Word⊢ Exec Cmd.skip [IOEvent.IN v] m ((((PartialMap.put l "b" v).put "a" v).put "a" (Word.add (some v) (some v))).put "a" v)
  fun es m' l => m' = m ∧ ∃ v, es = [IOEvent.IN v] ∧ PartialMap.get l "a" = some v
  apply Exec.skipa.a.a.h_post.a.h_post.a.h_post.am:Memoryl:LocalContextv:Word⊢ m = m ∧
  ∃ v_1,
    [IOEvent.IN v] = [IOEvent.IN v_1] ∧
      ((((PartialMap.put l "b" v).put "a" v).put "a" (Word.add (some v) (some v))).put "a" v).get "a" = some v_1
  simp only [List.cons.injEq, IOEvent.IN.injEq, and_true, PartialMap.put_put,
    PartialMap.get_put, Option.some.injEq, and_self, exists_eq']All goals completed! 🐙

With Lean's default tactic framework, MetaM, this is superlinear

Andres' Challenge tactics time by goal size

What Must Be Fast in VC Generation

The recipe is the same independent of the specific generator:

Efficient apply
Preserve term sharing
Reuse simp/grind data across VCs and across simulation steps
...

SymM: Making Verification Scale

SymM: new monadic framework for high-performance software verification. Born after the Lean@Google Hackathon
Designed for tools like Aeneas, mvcgen and Velvet that need to discharge thousands of verification conditions efficiently

Without this, verifying something as large as Signal would be too slow and painful

SymM vs MetaM benchmark

SymM with cache reuse

mvcgen — Ported to `SymM`

Andres' challenge using Lean monadic code. Linear out to n=1000

Tactic time

Kernel time

Velvet — Ported to `SymM`

A Dafny-style verifier in Lean. Ported to SymM by Vova (Lean FRO intern). Preliminary results

Before: Velvet ran ~3× slower than Dafny
After: Faster than Dafny on 24 of 27 benchmarks; competitive on the other 3

One port beat Dafny on 24 of 27. Dafny has a vast TCB

Dafny vs Velvet vs Velvet + Loom2 verification time

The Lean Focused Research Organization (FRO)

Launched in August 2023 by Leonardo de Moura and me as a non-profit

A nonprofit employing 20 engineers producing open-source software

Public roadmap | Our team

Conclusion

ITP-based verification provides highest assurances, in mathematics as in industry

"Generative AI" is not enough, "classical" AI and general scalability is as important

Program verification can require foundational framework changes to scale

Thank You

lean-lang.org | leanprover.zulipchat.com