feat: linear-size noConfusionType construction
#8037
Conversation
This PR adds a realistic large-inductive benchmark, taken from https://github.com/opencompl/sail-riscv-lean
!bench

Here are the benchmark results for commit 92de76d.

!bench

Thank you, @nomeata. This is very promising.
Mathlib CI status (docs):
Here are the benchmark results for commit 00155d0.

Benchmark Metric Change
==============================================================
+ bv_decide_large_aig.lean branch-misses -1.6% (-33.3 σ)
+ bv_decide_large_aig.lean wall-clock -4.7% (-20.1 σ)
- channel.lean bounded0_spsc 14.3% (35.3 σ)
- channel.lean unbounded_mpsc 6.6% (12.4 σ)
- channel.lean unbounded_spsc 24.4% (14.1 σ)
+ ilean roundtrip compress -4.3% (-10.5 σ)
+ lake build clean task-clock -2.3% (-33.3 σ)
+ riscv-ast.lean branch-misses -52.9% (-169.5 σ)
+ riscv-ast.lean branches -73.4% (-2827.9 σ)
+ riscv-ast.lean instructions -74.7% (-2855.8 σ)
+ riscv-ast.lean maxrss -46.3% (-52.6 σ)
+ riscv-ast.lean task-clock -73.7% (-76.2 σ)
+ riscv-ast.lean wall-clock -72.3% (-70.7 σ)
+ stdlib blocked -2.7% (-11.0 σ)
+ stdlib maxrss -1.4% (-29.4 σ)
+ stdlib task-clock -2.3% (-18.4 σ)
+ stdlib wall-clock -2.2% (-66.8 σ)
+ unionfind task-clock -6.8% (-26.4 σ)
+ unionfind wall-clock -6.8% (-26.2 σ)
!bench

!bench
Here are the benchmark results for commit dfc1df4.

Benchmark Metric Change
===================================================
- liasolver maxrss 1.1% (24.1 σ)
- nat_repr maxrss 1.1% (24.1 σ)
- parser maxrss 1.1% (24.1 σ)
- qsort maxrss 1.1% (26.6 σ)
+ rbmap task-clock -1.2% (-27.2 σ)
+ rbmap wall-clock -1.2% (-15.7 σ)
+ riscv-ast.lean branch-misses -55.1% (-54.7 σ)
+ riscv-ast.lean branches -73.5% (-1016.6 σ)
+ riscv-ast.lean instructions -74.8% (-1021.0 σ)
+ riscv-ast.lean maxrss -45.9% (-41.8 σ)
+ riscv-ast.lean task-clock -75.5% (-99.6 σ)
+ riscv-ast.lean wall-clock -73.8% (-163.9 σ)
+ stdlib maxrss -1.4% (-59.9 σ)
Here are the benchmark results for commit 51cc087.

Benchmark Metric Change
===============================================================================
+ Init.Data.BitVec.Lemmas re-elab branch-misses -1.8% (-14.9 σ)
+ Init.Data.BitVec.Lemmas re-elab task-clock -2.4% (-12.9 σ)
+ Std.Data.Internal.List.Associative branch-misses -2.1% (-10.4 σ)
+ big_omega.lean MT task-clock -4.2% (-17.5 σ)
+ big_omega.lean MT wall-clock -4.5% (-17.1 σ)
+ bv_decide_large_aig.lean branch-misses -1.4% (-13.8 σ)
+ const_fold task-clock -2.8% (-74.9 σ)
+ const_fold wall-clock -2.8% (-67.9 σ)
+ lake build clean task-clock -1.9% (-75.1 σ)
+ omega_stress.lean async branch-misses -1.1% (-10.3 σ)
+ parser task-clock -8.4% (-12.2 σ)
+ parser wall-clock -8.4% (-12.1 σ)
+ rbmap task-clock -5.1% (-14.7 σ)
+ rbmap wall-clock -5.1% (-14.1 σ)
+ rbmap_library task-clock -5.2% (-13.2 σ)
+ rbmap_library wall-clock -5.2% (-13.2 σ)
+ riscv-ast.lean branch-misses -57.6% (-98.7 σ)
+ riscv-ast.lean branches -78.2% (-1320.3 σ)
+ riscv-ast.lean instructions -79.4% (-1297.4 σ)
+ riscv-ast.lean maxrss -53.0% (-21.6 σ)
+ riscv-ast.lean task-clock -75.9% (-284.4 σ)
+ riscv-ast.lean wall-clock -75.1% (-118.2 σ)
+ stdlib attribute application -1.3% (-15.2 σ)
+ stdlib blocked (unaccounted) -1.9% (-10.7 σ)
+ stdlib tactic execution -1.6% (-33.6 σ)
+ stdlib wall-clock -1.9% (-20.0 σ)
This seems very promising! I wonder if a similar strategy could be used for a linear …

Yes, there is hope that we can apply this technique to the other currently quadratic constructions.
@nomeata, we are currently trying to get the Lean definitions of …
The roadmap does not have it this quarter; this is mostly an experiment. I'd still have to find out whether my construction works in general, whether it can fully replace the current one or should kick in only for large inductives, and then clean it up.
!bench
Here are the benchmark results for commit a1cecfd.

Benchmark Metric Change
=====================================================================
+ const_fold task-clock -2.5% (-21.0 σ)
+ const_fold wall-clock -2.5% (-20.5 σ)
+ nat_repr branches -1.9% (-1254.0 σ)
+ omega_stress.lean async branches -1.9% (-118.4 σ)
+ omega_stress.lean async instructions -1.9% (-116.4 σ)
+ riscv-ast.lean branch-misses -56.3% (-700.2 σ)
+ riscv-ast.lean branches -77.6% (-81839.3 σ)
+ riscv-ast.lean instructions -78.9% (-248079.5 σ)
+ riscv-ast.lean maxrss -51.8% (-20.2 σ)
+ riscv-ast.lean task-clock -76.0% (-60.2 σ)
+ riscv-ast.lean wall-clock -75.1% (-60.2 σ)
+ stdlib instantiate metavars -1.6% (-20.9 σ)
OK, the Mathlib benchmarks are inconspicuous, as far as I can tell: https://speed.lean-lang.org/mathlib4/compare/58b8accd-a1ae-4b0e-9f85-674a4f295f9f/to/2e918b87-fd9c-462b-b3b9-a2b38b581368 So it's plausible to use this construction by default, not just for inductives with many constructors (to keep things uniform). I fear we still need the previous construction for …
I thought that with #7096 fixed we could do that, but it seems that at least the code I tried to use here would require even further improvements to the kernel-level normalization around …
This PR introduces a `noConfusionType` construction that is sub-quadratic in size and reduces faster. The previous `noConfusion` construction, with two nested `match` statements, is quadratic in size and in reduction behavior. Using some helper definitions, a linear-size construction is possible. With this, processing the RISC-V AST definition from https://github.com/opencompl/sail-riscv-lean takes 6s instead of 60s.
The previous construction is still used when processing the early prelude, and can be enabled elsewhere using `set_option backwards.linearNoConfusionType false`.
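
For illustration only (this is not the PR's actual construction; `Tag`, `ctorIdx`, and `ctorIdx_eq` below are hypothetical names), here is a minimal Lean sketch of why the default construction is quadratic and what a linear-style alternative can look like on a tiny inductive:

```lean
-- A small inductive type; Lean auto-generates `Tag.noConfusionType`
-- and `Tag.noConfusion` for it.
inductive Tag where
  | a | b | c

-- The default construction is essentially a match on both arguments,
-- so an inductive with n constructors produces on the order of n × n
-- cases. It still lets us derive `False` from a constructor clash:
example (h : Tag.a = Tag.b) : False :=
  Tag.noConfusion h

-- Linear-style idea (a sketch, not the PR's helper definitions):
-- map each constructor to its index once (n cases) and reuse
-- `Nat.noConfusion` to compare the indices.
def Tag.ctorIdx : Tag → Nat
  | .a => 0
  | .b => 1
  | .c => 2

theorem Tag.ctorIdx_eq {x y : Tag} (h : x = y) :
    x.ctorIdx = y.ctorIdx :=
  congrArg Tag.ctorIdx h

-- `Tag.ctorIdx_eq h` reduces to a proof of `0 = 1`, which
-- `Nat.noConfusion` refutes without a quadratic-size match on `Tag`.
example (h : Tag.a = Tag.b) : False :=
  Nat.noConfusion (Tag.ctorIdx_eq h)
```

The point of the sketch is that the per-constructor work (the `ctorIdx` match) grows linearly with the number of constructors, while the pairwise comparison is delegated to a single reusable helper.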