Lean AI formalization leaderboard

lean-eval

Public results on a benchmark of hard Lean formalization problems. Expand any row to inspect solved theorems, extracted statements, and links to public proofs when available.

19 models

13 submitters

7 problem authors

55 problems

Leaderboard

Model rankings

Ranked by main benchmark problems solved. Internal test problems do not count toward the score.
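The scoring rule above can be sketched in Lean. This is only an illustration: the `Result` structure, its field names, and the `score` function are hypothetical and are not taken from the lean-eval codebase.

```lean
-- Hypothetical record of one model's result on one problem.
structure Result where
  problem : String
  isMain  : Bool   -- true for main benchmark problems, false for internal test problems
  solved  : Bool

-- Score under the stated rule: count solved main problems only.
def score (results : List Result) : Nat :=
  (results.filter fun r => r.isMain && r.solved).length

#eval score [
  ⟨"two_plus_two", false, true⟩,     -- test problem: solved, but not counted
  ⟨"bvp_comparison", true, true⟩,    -- main problem: solved, counted
  ⟨"sturm_separation", true, false⟩  -- main problem: unsolved, not counted
]  -- 1
```

Under this sketch, a model that solves all five test problems but no main problems would still score 0, which matches the entries at the bottom of the ranking.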

1. Aleph Prover (logicalintelligence.com): 27 solved

Other solved problems

Test problems: def_hole_example, instance_hole_example, ci_regenerate_main_check, list_append_singleton_length, two_plus_two (5 / 5 solved)

Submission history

First submission: May 7, 2026

Last submission: May 14, 2026

Contributors

mayorov-m-a (28), antpavzhi (4)

2. Seed Prover (ByteDance): 27 solved

Other solved problems

Submission history

First submission: May 20, 2026

Last submission: May 21, 2026

Contributors

GanjinZero (27)

3. Aristotle (Harmonic): 25 solved

Other solved problems

Test problems: def_hole_example, instance_hole_example, list_append_singleton_length, ci_regenerate_main_check, two_plus_two (5 / 5 solved)

Submission history

First submission: May 1, 2026

Last submission: May 20, 2026

Contributors

LorenzoLuccioli (19), sqrt-of-2 (12), kim-em (10), parabamoghv (1)

4. Antigravity (Multi-Model Ensemble: Gemini 3.1 Pro, Gemini 3 Flash, Claude 4.6 Sonnet/Opus): 21 solved

Other solved problems

Test problems: def_hole_example, instance_hole_example, ci_regenerate_main_check, list_append_singleton_length, two_plus_two (5 / 5 solved)

Submission history

First submission: May 13, 2026

Last submission: May 22, 2026

Contributors

daouid (26)

5. Claude Opus 4.7 (1M context): 15 solved

Other solved problems

Test problems: def_hole_example, instance_hole_example, ci_regenerate_main_check, list_append_singleton_length, two_plus_two (5 / 5 solved)

Submission history

First submission: May 2, 2026

Last submission: May 22, 2026

Contributors

rkirov (18), jzuiddam (2)

6. GPT-5.5: 12 solved

Other solved problems

Test problems: instance_hole_example, def_hole_example, ci_regenerate_main_check, list_append_singleton_length, two_plus_two (5 / 5 solved)

Submission history

First submission: May 1, 2026

Last submission: May 16, 2026

Contributors

sqrt-of-2 (14), A-M-Berns (4), kim-em (1)

7. Stealth Model: 4 solved

Other solved problems

Burnside p^a q^b theorem

finite_group_isSolvable_of_card_eq_prime_pow_mul_prime_pow

Verso theorem preview

theorem finite_group_isSolvable_of_card_eq_prime_pow_mul_prime_pow {G : Type*} [Group G] [Fintype G]
    {p q a b : ℕ}
    (hp : Nat.Prime p)
    (hq : Nat.Prime q)
    (hpq : p ≠ q)
    (hcard : Fintype.card G = p ^ a * q ^ b) :
    IsSolvable G := by
  sorry

Real cyclotomic integer with house at most 2

cyclotomic_integer_house_le_two

Verso theorem preview

theorem cyclotomic_integer_house_le_two {K : Type*} [Field K] [NumberField K] [Algebra ℚ K]
    (n : ℕ) [NeZero n] [IsCyclotomicExtension {n} ℚ K] {β : K}
    (hβ_int : IsIntegral ℤ β)
    (hβ_real : β ∈ NumberField.maximalRealSubfield K) :
    house β ≤ 2 →
      house β = 2 ∨ ∃ m : ℕ, 0 < m ∧ house β = 2 * Real.cos (Real.pi / m) := by
  sorry

How produced

Corrected model label for the comparator-accepted private submission of cyclotomic_integer_house_le_two. Same verified pinned commit as issue #234; the previous submission used the wrong model label.

pi_1 of the circle is Z

pi1_circle_mulEquiv_int

Verso theorem preview

theorem pi1_circle_mulEquiv_int :
    Nonempty (HomotopyGroup.Pi 1 Circle (1 : Circle) ≃* Multiplicative ℤ) := by
  sorry

Comparison principle for the Dirichlet BVP

bvp_comparison

Verso theorem preview

theorem bvp_comparison (J : Set ℝ) (hJ_open : IsOpen J) (hJ_sub : Set.Icc (0 : ℝ) 1 ⊆ J)
    (u v : ℝ → ℝ)
    (hu : ∀ x ∈ J, HasDerivAt u (deriv u x) x)
    (hu' : ∀ x ∈ J, HasDerivAt (deriv u) (deriv (deriv u) x) x)
    (hv : ∀ x ∈ J, HasDerivAt v (deriv v x) x)
    (hv' : ∀ x ∈ J, HasDerivAt (deriv v) (deriv (deriv v) x) x)
    (hineq : ∀ x ∈ Set.Ioo (0 : ℝ) 1, -deriv (deriv u) x ≤ -deriv (deriv v) x)
    (hu0 : u 0 ≤ v 0) (hu1 : u 1 ≤ v 1) :
    ∀ x ∈ Set.Icc (0 : ℝ) 1, u x ≤ v x := by
  sorry

Test problems: def_hole_example, instance_hole_example, list_append_singleton_length, ci_regenerate_main_check, two_plus_two (5 / 5 solved)

Submission history

First submission: May 8, 2026

Last submission: May 22, 2026

Contributors

rishistyping (9)

8. GPT-5.5 Codex: 2 solved

Other solved problems

Sturm separation theorem

sturm_separation

Verso theorem preview

theorem sturm_separation (p q y₁ y₂ : ℝ → ℝ) (a b : ℝ) (hab : a < b)
    (J : Set ℝ) (hJ_open : IsOpen J) (hJ_conn : IsPreconnected J)
    (hJ_sub : Set.Icc a b ⊆ J)
    (hp : ContinuousOn p J) (hq : ContinuousOn q J)
    (hy₁ : ∀ x ∈ J, HasDerivAt y₁ (deriv y₁ x) x)
    (hy₁' : ∀ x ∈ J, HasDerivAt (deriv y₁) (-(p x * deriv y₁ x + q x * y₁ x)) x)
    (hy₂ : ∀ x ∈ J, HasDerivAt y₂ (deriv y₂ x) x)
    (hy₂' : ∀ x ∈ J, HasDerivAt (deriv y₂) (-(p x * deriv y₂ x + q x * y₂ x)) x)
    (hW : ∃ x₀ ∈ J, y₁ x₀ * deriv y₂ x₀ - y₂ x₀ * deriv y₁ x₀ ≠ 0)
    (hza : y₁ a = 0) (hzb : y₁ b = 0)
    (hne : ∀ x ∈ Set.Ioo a b, y₁ x ≠ 0) :
    ∃! c, c ∈ Set.Ioo a b ∧ y₂ c = 0 := by
  sorry

#1 proof

Comparison principle for the Dirichlet BVP

bvp_comparison

Verso theorem preview

theorem bvp_comparison (J : Set ℝ) (hJ_open : IsOpen J) (hJ_sub : Set.Icc (0 : ℝ) 1 ⊆ J)
    (u v : ℝ → ℝ)
    (hu : ∀ x ∈ J, HasDerivAt u (deriv u x) x)
    (hu' : ∀ x ∈ J, HasDerivAt (deriv u) (deriv (deriv u) x) x)
    (hv : ∀ x ∈ J, HasDerivAt v (deriv v x) x)
    (hv' : ∀ x ∈ J, HasDerivAt (deriv v) (deriv (deriv v) x) x)
    (hineq : ∀ x ∈ Set.Ioo (0 : ℝ) 1, -deriv (deriv u) x ≤ -deriv (deriv v) x)
    (hu0 : u 0 ≤ v 0) (hu1 : u 1 ≤ v 1) :
    ∀ x ∈ Set.Icc (0 : ℝ) 1, u x ≤ v x := by
  sorry

#2 proof

Test problems: two_plus_two (1 / 5 solved)

Submission history

First submission: May 6, 2026

Last submission: May 7, 2026

Contributors

A-M-Berns (3)

9. Gemini 3.1 Pro: 2 solved

Other solved problems

Cayley graph connected iff generators generate the group

mulCayley_connected_iff_closure_eq_top

Verso theorem preview

theorem mulCayley_connected_iff_closure_eq_top {G : Type*} [Group G]
    (S : Set G) :
    (SimpleGraph.mulCayley S).Connected ↔ Subgroup.closure S = ⊤ := by
  sorry

#1 proof

Comparison principle for the Dirichlet BVP

bvp_comparison

Verso theorem preview

theorem bvp_comparison (J : Set ℝ) (hJ_open : IsOpen J) (hJ_sub : Set.Icc (0 : ℝ) 1 ⊆ J)
    (u v : ℝ → ℝ)
    (hu : ∀ x ∈ J, HasDerivAt u (deriv u x) x)
    (hu' : ∀ x ∈ J, HasDerivAt (deriv u) (deriv (deriv u) x) x)
    (hv : ∀ x ∈ J, HasDerivAt v (deriv v x) x)
    (hv' : ∀ x ∈ J, HasDerivAt (deriv v) (deriv (deriv v) x) x)
    (hineq : ∀ x ∈ Set.Ioo (0 : ℝ) 1, -deriv (deriv u) x ≤ -deriv (deriv v) x)
    (hu0 : u 0 ≤ v 0) (hu1 : u 1 ≤ v 1) :
    ∀ x ∈ Set.Icc (0 : ℝ) 1, u x ≤ v x := by
  sorry

#4 proof

Test problems: instance_hole_example, def_hole_example, list_append_singleton_length, ci_regenerate_main_check, two_plus_two (5 / 5 solved)

Submission history

First submission: May 1, 2026

Last submission: May 10, 2026

Contributors

sqrt-of-2 (7), kim-em (1)

10. [submission] aegis-of-the-unit-circle-logos: 2 solved

Other solved problems

Real cyclotomic integer with house at most 2

cyclotomic_integer_house_le_two

Verso theorem preview

theorem cyclotomic_integer_house_le_two {K : Type*} [Field K] [NumberField K] [Algebra ℚ K]
    (n : ℕ) [NeZero n] [IsCyclotomicExtension {n} ℚ K] {β : K}
    (hβ_int : IsIntegral ℤ β)
    (hβ_real : β ∈ NumberField.maximalRealSubfield K) :
    house β ≤ 2 →
      house β = 2 ∨ ∃ m : ℕ, 0 < m ∧ house β = 2 * Real.cos (Real.pi / m) := by
  sorry

How produced

Comparator-accepted Lean Eval solution for cyclotomic_integer_house_le_two. Developed and verified in a private repository. Local checks included direct Lean Eval comparator and CI-equivalent evaluate_submission.py.

pi_1 of the circle is Z

pi1_circle_mulEquiv_int

Verso theorem preview

theorem pi1_circle_mulEquiv_int :
    Nonempty (HomotopyGroup.Pi 1 Circle (1 : Circle) ≃* Multiplicative ℤ) := by
  sorry

Test problems: ci_regenerate_main_check, two_plus_two (2 / 5 solved)

Submission history

First submission: May 12, 2026

Last submission: May 12, 2026

Contributors

rishistyping (4)

11. Claude Opus 4.7: 1 solved

Other solved problems

Finite Ramsey theorem for graphs

finite_graph_ramsey_theorem

Verso theorem preview

theorem finite_graph_ramsey_theorem :
    ∀ r s : ℕ, 2 ≤ r → 2 ≤ s → ∃ n : ℕ, ∀ G : SimpleGraph (Fin n), ¬ G.CliqueFree r ∨ ¬ Gᶜ.CliqueFree s := by
  sorry

#1 proof

Test problems: list_append_singleton_length, two_plus_two (2 / 5 solved)

Submission history

First submission: Apr 30, 2026

Last submission: Apr 30, 2026

Contributors

rkirov (3), kim-em (1)

12. EVO: 1 solved

Other solved problems

Finite Ramsey theorem for graphs

finite_graph_ramsey_theorem

Verso theorem preview

theorem finite_graph_ramsey_theorem :
    ∀ r s : ℕ, 2 ≤ r → 2 ≤ s → ∃ n : ℕ, ∀ G : SimpleGraph (Fin n), ¬ G.CliqueFree r ∨ ¬ Gᶜ.CliqueFree s := by
  sorry

#1 proof

Test problems: two_plus_two (1 / 5 solved)

Submission history

First submission: May 18, 2026

Last submission: May 20, 2026

Contributors

machinelearning2014 (2)

13. Kimi K2.6: 0 solved

Test problems: two_plus_two (1 / 5 solved)

Submission history

First submission: May 1, 2026

Last submission: May 1, 2026

Contributors

kim-em (1)

14. Mistral Large 3: 0 solved

Test problems: two_plus_two (1 / 5 solved)

Submission history

First submission: May 1, 2026

Last submission: May 1, 2026

Contributors

kim-em (1)

15. DeepSeek V4 Pro: 0 solved

Test problems: two_plus_two (1 / 5 solved)

Submission history

First submission: May 1, 2026

Last submission: May 1, 2026

Contributors

kim-em (1)

16. Qwen3.6 Max: 0 solved

Test problems: two_plus_two (1 / 5 solved)

Submission history

First submission: May 1, 2026

Last submission: May 1, 2026

Contributors

kim-em (1)

17. Grok 4.3: 0 solved

Test problems: two_plus_two (1 / 5 solved)

Submission history

First submission: May 1, 2026

Last submission: May 1, 2026

Contributors

kim-em (1)

18. Claude Sonnet 4.6: 0 solved

Test problems: two_plus_two (1 / 5 solved)

Submission history

First submission: May 1, 2026

Last submission: May 1, 2026

Contributors

kim-em (1)

19. Leanstral-2603: 0 solved

Test problems: instance_hole_example, def_hole_example, list_append_singleton_length, ci_regenerate_main_check, two_plus_two (5 / 5 solved)

Submission history

First submission: May 11, 2026

Last submission: May 11, 2026

Contributors

sqrt-of-2 (5)

Coverage

Per-problem coverage

Which problems each model has solved.

Problem	Aleph Prover (logicalintelligence.com)	Seed Prover (ByteDance)	Aristotle (Harmonic)	Antigravity (Multi-Model Ensemble: Gemini 3.1 Pro, Gemini 3 Flash, Claude 4.6 Sonnet/Opus)	Claude Opus 4.7 (1M context)	GPT-5.5	Stealth Model	GPT-5.5 Codex	Gemini 3.1 Pro	[submission] aegis-of-the-unit-circle-logos	Claude Opus 4.7	EVO	Kimi K2.6	Mistral Large 3	DeepSeek V4 Pro	Qwen3.6 Max	Grok 4.3	Claude Sonnet 4.6	Leanstral-2603
Chudnovsky formula for pi inverse (main)	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—
pi_1 of the circle is Z (main)	✓	✓	✓	✓	✓	✓	✓	—	—	✓	—	—	—	—	—	—	—	—	—
Finite Ramsey theorem for graphs (main)	✓	✓	✓	✓	✓	✓	—	—	—	—	✓	✓	—	—	—	—	—	—	—
Catalan generating function via compositional inversion (main)	✓	✓	✓	✓	✓	✓	—	—	—	—	—	—	—	—	—	—	—	—	—
Lagarias criterion is equivalent to RH (main)	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—
Chen theorem for Markoff graphs (main)	✓	✓	✓	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—
Cayley graph connected iff generators generate the group (main)	✓	✓	✓	✓	✓	✓	—	—	✓	—	—	—	—	—	—	—	—	—	—
pi_3 of the 2-sphere is Z (main)	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—
pi_n of the n-sphere is Z (main)	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—
pi_(n+1) of S^n is Z/2 for n at least 3 (main)	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—
Burnside p^a q^b theorem (main)	✓	✓	✓	—	—	—	✓	—	—	—	—	—	—	—	—	—	—	—	—
Rouche theorem via zero counting (main)	✓	✓	✓	✓	✓	—	—	—	—	—	—	—	—	—	—	—	—	—	—
Minkowski-Caratheodory theorem (main)	✓	✓	✓	✓	✓	✓	—	—	—	—	—	—	—	—	—	—	—	—	—
Perron-Frobenius for irreducible nonnegative matrices (main)	✓	✓	✓	✓	—	✓	—	—	—	—	—	—	—	—	—	—	—	—	—
Complementary polynomial on the unit circle (main)	✓	✓	✓	✓	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—
Entrywise exponential of a PSD matrix is PSD (main)	✓	✓	✓	✓	✓	✓	—	—	—	—	—	—	—	—	—	—	—	—	—
Oppenheim's inequality for Hadamard products (main)	✓	✓	✓	✓	✓	—	—	—	—	—	—	—	—	—	—	—	—	—	—
von Neumann double commutant theorem (main)	✓	✓	✓	✓	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—
Schur-Weyl duality: S_k image equals centralizer of GL(V) image (main)	✓	✓	✓	—	✓	—	—	—	—	—	—	—	—	—	—	—	—	—	—
Schur-Weyl duality: GL(V) image equals centralizer of S_k image (main)	✓	✓	✓	—	✓	—	—	—	—	—	—	—	—	—	—	—	—	—	—
Existence of a 64-dim irreducible g₂-representation with 14 tensor-square isotypic components (main)	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—
Existence of a 779247-dim irreducible e₈-representation with 40 tensor-square isotypic components (main)	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—
Cerf's theorem: every self-diffeomorphism of S3 is smoothly isotopic to a linear isometry (main)	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—
Smale conjecture (Hatcher) in relative parameterized form (main)	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—
Real cyclotomic integer with house at most 2 (main)	✓	✓	✓	✓	—	—	✓	—	—	✓	—	—	—	—	—	—	—	—	—
Real cyclotomic integer with house in (2, 76/33) (main)	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—
Gleason's theorem (finite-dimensional) (main)	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—
Gleason's theorem (separable Hilbert space) (main)	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—
Jacobian of a compact Riemann surface (Buzzard challenge) (main)	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—
Jacobian of a smooth proper curve (Merten challenge) (main)	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—
Existence of a non-isotopic pair of oriented two-component links (main)	✓	✓	✓	✓	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—
Existence of a non-isotopic pair of oriented knots (main)	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—
Existence of a chiral oriented knot (main)	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—
Character values of finite groups lie in cyclotomic fields (main)	✓	✓	✓	✓	✓	✓	—	—	—	—	—	—	—	—	—	—	—	—	—
Existence of an order-10200960 group with a 22-dim irrep whose tensor square has 4 isotypic components (main)	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—
Frobenius's theorem: the Frobenius kernel is normal (main)	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—
Glauberman's Z* theorem for isolated involutions (main)	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—
Schreier's conjecture: outer automorphism group of a finite simple group is solvable (main)	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—
Possible orders of 5-transitive finite permutation groups (main)	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—
Topological classification of surfaces (main)	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—
De Branges's theorem (Bieberbach conjecture) (main)	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—
Polynomial decay rate of y' = -y^3 (main)	✓	✓	✓	✓	✓	✓	—	—	—	—	—	—	—	—	—	—	—	—	—
Baker-Wüstholz theorem on linear forms in logarithms (main)	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—
Sturm separation theorem (main)	✓	✓	✓	✓	✓	✓	—	✓	—	—	—	—	—	—	—	—	—	—	—
Dirichlet eigenvalues of -y'' = lambda y on [0,pi] are n^2 (main)	✓	✓	✓	✓	✓	✓	—	—	—	—	—	—	—	—	—	—	—	—	—
Linear ODE with negative-real-part eigenvalues is asymptotically stable (main)	✓	✓	✓	✓	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—
Comparison principle for the Dirichlet BVP (main)	✓	✓	✓	✓	✓	✓	✓	✓	✓	—	—	—	—	—	—	—	—	—	—
Gaussian heat kernel solves the 1D heat equation (main)	✓	✓	✓	✓	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—
Bing's house with two rooms is contractible (main)	✓	✓	—	✓	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—
Fermat's Last Theorem (main)	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—
Radó's theorem on Riemann surfaces (main)	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—
Neukirch–Uchida theorem (main)	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—
Balanceable k-bounded partitions (main)	✓	✓	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—
A competition programming problem about permuting a permutation to be unimodal (main)	✓	✓	✓	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—
Uniformization theorem for Riemann surfaces (main)	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—	—
2 + 2 = 4 (test)	✓	—	✓	✓	✓	✓	✓	✓	✓	✓	✓	✓	✓	✓	✓	✓	✓	✓	✓
CI regenerate-main check (test)	✓	—	✓	✓	✓	✓	✓	—	✓	✓	—	—	—	—	—	—	—	—	✓
Appending a singleton increases the list length (test)	✓	—	✓	✓	✓	✓	✓	—	✓	—	✓	—	—	—	—	—	—	—	✓
def-hole minimal example (test)	✓	—	✓	✓	✓	✓	✓	—	✓	—	—	—	—	—	—	—	—	—	✓
instance-hole minimal example (test)	✓	—	✓	✓	✓	✓	✓	—	✓	—	—	—	—	—	—	—	—	—	✓

Welcome to lean-eval, a Lean formalization benchmark and public leaderboard.

You can submit new problems for review, and solutions for existing problems. New problems will be carefully reviewed and added to future benchmark releases if they are accepted. Solutions are automatically verified using comparator and added to the public leaderboard.

This benchmark aims to capture hard Lean formalization problems: mathematical problems that can currently be stated mostly using existing Mathlib definitions, perhaps with a page or so of additional setup. They should be hard, but usually not open problems; in fact, it is preferred that a problem have a known informal solution that is publicly available.

Our hope is that at launch, the problem set will be mostly, but not entirely, out of reach for current publicly available frontier models, or simple orchestration layers built on top of these. So some genuine mathematical subtlety is required!

It's also important to say what this benchmark is not: we are not trying to capture the ability to write readable or reusable code, or to follow best practices in Lean. In particular, the only requirement for a solution to be accepted is that it is correct and passes the comparator tests.
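To make the shape of a problem and a solution concrete, here is a minimal sketch in Lean. The statement is a guess in the spirit of the `two_plus_two` test problem; the benchmark's actual declaration, and the exact checks that comparator performs, are not shown on this page and may differ. The two theorem names below are hypothetical.

```lean
-- Hypothetical problem statement: the declaration is fixed by the benchmark
-- and the proof is left as `sorry`, as in the theorem previews on this page.
theorem two_plus_two_statement : (2 + 2 : Nat) = 4 := by
  sorry

-- A candidate solution proves the same statement with no `sorry`.
-- Style does not matter for acceptance; only correctness does.
theorem two_plus_two_solution : (2 + 2 : Nat) = 4 := by
  rfl
```

A longer or less readable proof of the same statement would be equally acceptable, in line with the acceptance criterion described above.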