3.0 Release by 0x0f0f0f · Pull Request #185 · JuliaSymbolics/Metatheory.jl

0x0f0f0f · 2024-01-14T10:21:27Z

Release 3.0 when merged

codecov-commenter · 2024-01-14T10:23:40Z

⚠️ Please install the to ensure uploads and comments are reliably processed by Codecov.

Codecov Report

❌ Patch coverage is 88.40773% with 174 lines in your changes missing coverage. Please review.
✅ Project coverage is 82.66%. Comparing base (a8331d6) to head (b743e75).
⚠️ Report is 539 commits behind head on master.

Files with missing lines	Patch %	Lines
src/EGraphs/Schedulers.jl	34.78%	30 Missing ⚠️
src/EGraphs/egraph.jl	89.23%	28 Missing ⚠️
src/Syntax.jl	89.15%	18 Missing ⚠️
src/utils.jl	11.11%	16 Missing ⚠️
src/EGraphs/saturation.jl	92.85%	13 Missing ⚠️
src/Patterns.jl	88.76%	10 Missing ⚠️
src/ematch_compiler.jl	96.00%	10 Missing ⚠️
src/Rules.jl	84.61%	8 Missing ⚠️
src/optbuffer.jl	78.94%	8 Missing ⚠️
src/vecexpr.jl	84.90%	8 Missing ⚠️
... and 6 more
❗ Your organization needs to install the Codecov GitHub app to enable full functionality.

Additional details and impacted files

@@             Coverage Diff             @@
##           master     #185       +/-   ##
===========================================
+ Coverage   69.17%   82.66%   +13.48%     
===========================================
  Files          16       18        +2     
  Lines        1353     1777      +424     
===========================================
+ Hits          936     1469      +533     
+ Misses        417      308      -109

☔ View full report in Codecov by Harness.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

github-actions · 2024-01-14T10:26:27Z

Benchmark Results

	egg-sym	egg-cust	MT@b743e7597b2...	MT@dec61ce1816...	egg-sym/MT@b74...	egg-cust/MT@b7...	MT@dec61ce1816...
egraph_addexpr	1.42 ms		5.05 ms	14.2 ms	0.281		2.823
basic_maths_simpl2	14.6 ms	4.44 ms	17.7 ms	889 ms	0.824	0.251	50.288
prop_logic_freges_theorem	2.51 ms	1.43 ms	2.09 ms	35.9 ms	1.204	0.687	17.215
calc_logic_demorgan	59.5 μs	31.7 μs	79.4 μs	524 μs	0.749	0.399	6.606
calc_logic_freges_theorem	22 ms	10.4 ms	38.7 ms	1.68e+04 ms	0.569	0.269	432.910
basic_maths_simpl1	6.74 ms	2.56 ms	4.27 ms	49.6 ms	1.577	0.598	11.622
egraph_constructor	0.0833 μs		0.497 μs	0.108 μs	0.168		0.217
prop_logic_prove1	35.8 ms	12.8 ms	44.4 ms	8.95e+03 ms	0.806	0.288	201.592
prop_logic_demorgan	79.1 μs	42.6 μs	96.6 μs	757 μs	0.820	0.441	7.840
while_superinterpreter_while_10			19.2 ms	96.2 ms			5.006
prop_logic_rewrite			115 μs	114 μs			0.987
time_to_load			100 ms	159 ms			1.591

	egg-sym	egg-cust
egraph_addexpr	6769 6769 6769
basic_maths_simpl2	440 2235 2836	440 2235 2839
prop_logic_freges_theorem	316 1197 2315	316 1197 2322
calc_logic_demorgan	16 33 35	16 33 35
calc_logic_freges_theorem	1072 4289 17394	1072 4289 17280
basic_maths_simpl1	368 1910 2543	368 1910 2567
egraph_constructor
prop_logic_prove1	5510 17371 27976	4668 13644 18522
prop_logic_demorgan	16 35 42	16 35 42
while_superinterpreter_while_10
prop_logic_rewrite
time_to_load

Benchmark Plots

A plot of the benchmark results have been uploaded as an artifact to the workflow run for this PR.
Go to "Actions"->"Benchmark a pull request"->[the most recent run]->"Artifacts" (at the bottom).

0x0f0f0f · 2024-01-14T13:05:10Z

This is 14.5 times faster than egg in rust!

Julia Code

using Metatheory, BenchmarkTools

t = @theory a b begin
  a + b --> b + a
  a * b --> b * a
  a + 0 --> a
  a * 0 --> 0
  a * 1 --> a
end

using BenchmarkTools

p = SaturationParams(; timer = false)

function simpl(ex)
  g = EGraph(ex)
  saturate!(g, t, p)
  extract!(g, astsize)
end

ex = :(0 + (1 * foo) * 0 + (a * 0) + a)

simpl(ex)

@btime simpl(ex)

94.462 μs (1412 allocations: 66.08 KiB)

Rust code

use egg::{rewrite as rw, *};
//use std::time::Duration;
fn main() {
    env_logger::init();
    use egg::*;

    define_language! {
        enum SimpleLanguage {
            Num(i32),
            "+" = Add([Id; 2]),
            "*" = Mul([Id; 2]),
            Symbol(Symbol),
        }
    }

    fn make_rules() -> Vec<Rewrite<SimpleLanguage, ()>> {
        vec![
            rewrite!("commute-add"; "(+ ?a ?b)" => "(+ ?b ?a)"),
            rewrite!("commute-mul"; "(* ?a ?b)" => "(* ?b ?a)"),
            rewrite!("add-0"; "(+ ?a 0)" => "?a"),
            rewrite!("mul-0"; "(* ?a 0)" => "0"),
            rewrite!("mul-1"; "(* ?a 1)" => "?a"),
        ]
    }

    /// parse an expression, simplify it using egg, and pretty print it back out
    fn simplify(s: &str) -> String {
        // parse the expression, the type annotation tells it which Language to use
        let expr: RecExpr<SimpleLanguage> = s.parse().unwrap();

        // simplify the expression using a Runner, which creates an e-graph with
        // the given expression and runs the given rules over it
        let runner = Runner::default().with_expr(&expr).run(&make_rules());

        // the Runner knows which e-class the expression given with `with_expr` is in
        let root = runner.roots[0];

        // use an Extractor to pick the best element of the root eclass
        let extractor = Extractor::new(&runner.egraph, AstSize);
        let (best_cost, best) = extractor.find_best(root);
        println!("Simplified {} to {} with cost {}", expr, best, best_cost);
        best.to_string()
    }

    // assert_eq!(simplify("(* 0 42)"), "0");
    let apply_time: std::time::Instant = instant::Instant::now();
    // assert_eq!(simplify("(+ 0 (* 1 foo))"), "foo");
    assert_eq!(simplify("(+ (+ (+ 0 (* (* 1 foo) 0)) (* a 0)) a)"), "a");
    let apply_time = apply_time.elapsed().as_secs_f64();
    println!("simplification time {}", apply_time);
}

simplification time 0.001375786 seconds which is 1375.786microseconds

1375.786 / 94.462 = 14.56x faster

well

0x0f0f0f · 2024-03-01T23:08:54Z

@gkronber I have just updated this branch to include the latest release of https://github.com/JuliaSymbolics/TermInterface.jl - the interface for custom types has changed, please let me know if you encounter any issue

gkronber · 2024-03-02T12:58:30Z

@gkronber I have just updated this branch to include the latest release of https://github.com/JuliaSymbolics/TermInterface.jl - the interface for custom types has changed, please let me know if you encounter any issue

Thanks for the heads up. I only had to make minor changes because of the changed names for functions in VecExpr.

shashi · 2024-03-12T22:03:31Z

@0x0f0f0f Can you summarize the changes in this PR? What's the huge number of deletions from? And what's the huge performance gain from? This is awesome.

0x0f0f0f · 2024-03-19T11:30:06Z

@0x0f0f0f Can you summarize the changes in this PR? What's the huge number of deletions from? And what's the huge performance gain from? This is awesome.

@shashi I removed some unnecessary parts of the codebase, and basically changed the core types used for rewriting. The trick for performance was basically packing e-nodes in Vector{UInt64} such that loops (which happens hundreds of thousands of times) can be vectorized. Instead of having a struct with operation, hash, flags... fields, I just encode and pack everything into UInt64.

https://github.com/JuliaSymbolics/Metatheory.jl/blob/8f228fd0f8a5e1f97fd0ed986cb1c7e94cbce5c8/src/vecexpr.jl

Also some algorithms were updated, as the egg repo now has more efficient versions of parts of equality saturation (rebuilding) that were in the original paper

0x0f0f0f · 2024-03-19T11:30:39Z

I will hold a 5 min lightning talk on thursday 6 pm CET https://egraphs.org/ if interested

chriselrod · 2024-04-17T03:01:12Z

For fun, I benchmarked this PR

julia> @benchmark simpl(ex)
BenchmarkTools.Trial: 10000 samples with 1 evaluation.
 Range (min … max):  74.508 μs …   8.173 ms  ┊ GC (min … max): 0.00% … 97.73%
 Time  (median):     80.918 μs               ┊ GC (median):    0.00%
 Time  (mean ± σ):   87.298 μs ± 150.197 μs  ┊ GC (mean ± σ):  5.42% ±  3.34%

   ▂▃▂   ▄▇█▇▆▆▆▅▄▅▅▅▄▃▃▃▂▂▂▁▁▁ ▁▁▁▁ ▁                         ▂
  █████▆▇██████████████████████████████▇▇▇▇▇▇█▇▇▆▆▆▆▅▆▄▅▂▅▂▄▅▅ █
  74.5 μs       Histogram: log(frequency) by time       107 μs <

 Memory estimate: 55.08 KiB, allocs estimate: 1003.

> target/release/eggvsmetatheory                               (base) 
Simplified (+ (+ (+ 0 (* (* 1 foo) 0)) (* a 0)) a) to a with cost 1
simplification time 46.603975 us

It's still slower than egg for that example, but this is a huge improvement compared to being like 300x slower the last time I compared them.

I modified the Rust script so that we aren't benchmarking printing, and also taking the average of 1000 runs:

    let apply_time: std::time::Instant = instant::Instant::now();
    for _i in 0..1000 {
        simplify("(+ (+ (+ 0 (* (* 1 foo) 0)) (* a 0)) a)", false);
    }
    let apply_time = apply_time.elapsed().as_secs_f64();
    println!("simplification time {} us", apply_time*1e3);

where simplify takes a print: bool arg for whether or not it should print.

shashi · 2024-04-17T19:34:11Z

Let's go!!!

0x0f0f0f · 2024-04-18T12:10:23Z

It's still slower than egg for that example, but this is a huge improvement compared to being like 300x slower the last time I compared them.

I modified the Rust script so that we aren't benchmarking printing, and also taking the average of 1000 runs:

@chriselrod we just published a preprint with some results here https://arxiv.org/pdf/2404.08751.pdf

@nmheim is working on https://github.com/nmheim/egg-benchmark which should be the repository containing benchmarks for egg against the ones in the benchmarks directory in repository (1-1 comparison with same hyperparameters/rules where possible).

Statically Compiled E-Matcher

…nction for benchmarks, rollback changes to tests.

…g-benchmarks.

Ale/3.0 optimize pat

Update docs for egraph semantic analysis.

…nchmarks Update benchmarks to report the egraph size for each case

0x0f0f0f · 2026-05-21T09:21:23Z

Back at this! @gkronber @nmheim

Here are both texts:

Fixed congruence closure, ematcher correctness, lock leak, and FreezingScheduler (not used still). added theory-level predicates.

The last two commits introduce theory-level variable predicates (@theory x::Number y begin … end) that apply to every rule in a theory without repeating the annotation per rule, with documentation and tests. On the correctness side it fixes several bugs: rebuild_memo! is added to egraph.jl and called after every rebuild that produced unions — this is necessary because canonicalize! updates child ids in VecExpr nodes in-place, which changes their cached hash and makes the corresponding Dict keys unreachable, silently breaking congruence closure (the e-graph would create duplicate eclasses for the same canonical expression instead of merging them). FreezingSchedulerStat is changed from struct to mutable struct so the ban mechanism can actually write to it at runtime. direct(EqualityRule) is fixed to not share the internal stack with the original rule, preventing aliasing bugs when both are used concurrently. The ematcher stack is reset (empty!(stack)) before each call to the classical matcher to prevent leftover entries from early-callback returns accumulating indefinitely. The isliteral_bitvec bit for a pattern variable is now cleared on backtrack in the e-graph ematcher to prevent stale literal-vs-eclass bindings when a variable transitions between matching a literal and a non-literal node. Finally, eqsat_apply! is refactored into an inner _eqsat_apply_impl! so the try/finally lock guard only wraps the locking path, removing JIT optimization inhibition from the common single-threaded hot loop.

Benchmarks vs. origin/ale/3.0: most workloads improve (simpl1 −6.5 %, simpl2 −12.4 %, addexpr −5.1 %) or are invariant. prop_logic/prove1 regresses ~21 % because the congruence-closure fix causes the e-graph to correctly detect and cascade more unions, producing a larger but more complete saturation (4 765 classes vs. 3 061 in the baseline, which had 6 461 stale/unreachable memo entries masking missed merges). All tests pass.

Fixed the tests that have been broken since, like 2021? 🚀

0x0f0f0f · 2026-05-21T11:53:31Z

Latest commit adds segment variables (~~x) for e-graph matching

Added has_segment_children::Bool to Pat — precomputed in pat_expr
Added is_segment_patvar helper used by @rule macro.
RewriteRule gets has_segments::Bool, segment_patvars::BitVector, and segment_buffer::OptBuffer{UInt64}. The buffer is cleared once per rule per iteration before ematch, then read during apply. No race with the buffer's own offset values stored in ematch_buffer.
New bind_segment_expr instruction in ematch compiler. Protocol: each segment patvar gets a local seg_offset_sym initialized to typemax(UInt64) ("no prior push"). On every re-entry (including backtracks), if seg_offset_sym != typemax, seg_buf.i is restored to that value. undoing both this segment's data and any
inner segment data pushed after it. The typemax sentinel distinguishes a fresh first entry from a backtrack, preventing outer segment data from being messed up.
New ematch_compile_segment_expr! allocates σ addresses only for fixed children; the segment patvar gets de Bruijn address -2 (sentinel), plus an entry in segment_patvar_syms.
Updated yield_expr to push seg_offset_sym (buffer offset) for segment patvars instead of σaddr.
Generated ematchers have seg_buf as 6th argument with a default, so old 5-arg call sites still work.
cached_ids: segment patterns fall back to a head-hash-filtered scan of all e-classes (marked # EXPENSIVE). Non-segment rules still use the fast classes_by_op index, no regression.
instantiate_enode!: the p.has_segment_children check is one field read. When true, allocates a fresh VecExpr with the correct runtime arity and signature hash(p.name, hash(total_arity)), splatting segment children from seg_buf. WARNING: THIS IS EXPENSIVE
eqsat_search!: calls empty!(rule.segment_buffer) once before each rule's ematch, passes it as 6th arg.

tecosaur · 2026-06-19T02:57:19Z

I was thinking it's been a while since I heard about the "upcoming" v3, and so took a peek at this. Exciting to see it's been picked up again!

This was referenced Jan 14, 2024

Draft: Complete Overhaul of Egraphs (even faster) #184

Closed

Draft: Complete overhaul #177

Closed

0x0f0f0f mentioned this pull request Jan 14, 2024

Redesign Proposal JuliaSymbolics/TermInterface.jl#22

Closed

This was referenced Feb 14, 2024

Exploding egraphs - Troubles defining rule sets working with eq-sat #190

Closed

Could Julia help? marcusrossel/lean-egg#18

Closed

0x0f0f0f mentioned this pull request Mar 5, 2024

Relationship to SymbolicUtils etc JuliaSymbolics/TermInterface.jl#15

Closed

0x0f0f0f mentioned this pull request Apr 9, 2024

Q: Using TermInterface to set up Metatheory #173

Open

a and others added 12 commits April 28, 2024 15:32

finalize

a43213d

trigger CI

35d6e4b

immutable eclass

06c4998

cleanup

3ae8173

cleanup

6278476

Merge pull request #199 from JuliaSymbolics/ale/ultra-ematcher

24bc137

Statically Compiled E-Matcher

optimize extraction and application

0e5bb1a

less allocations

e88996b

adjust type

c13867c

Merge branch 'ale/3.0' into ale/immutable-eclass

c901fc2

cleanup and adjustments

410e0b3

benchmark mt vs. egg

acea4e1

gkronber and others added 13 commits February 18, 2025 16:50

Actually evaluate the benchmarked expression.

31db7e8

Change identifiers.

ddd5619

Track sizes of all steps in prove and implement a separate prove() fu…

3cfb889

…nction for benchmarks, rollback changes to tests.

Report size for egraph/addexpr.

a783920

Bugfix.

c776370

Triple-check benchmarking parameters and use same parameters as in eg…

641212a

…g-benchmarks.

Update docs for semantic analysis (for #267).

d1b050e

Remove @show statement.

e9e8244

Merge pull request #262 from JuliaSymbolics/ale/3.0-optimize-pat

57bfe22

Ale/3.0 optimize pat

Merge pull request #268 from JuliaSymbolics/267_make_analysis_docs

f687b00

Update docs for egraph semantic analysis.

Merge pull request #265 from JuliaSymbolics/report_egraph_size_for_be…

74f9d5b

…nchmarks Update benchmarks to report the egraph size for each case

improvements

9cd5da1

Lots of correctness fixes and performance improvements

e9d9263

0x0f0f0f added 4 commits May 21, 2026 11:29

cover documentation todos

917b545

Merge remote-tracking branch 'origin/master' into ale/3.0

a95340e

remove docstring extensions

d18c85b

use each revision for benchmarks

73a695e

0x0f0f0f mentioned this pull request May 21, 2026

@slots x::T #150

Closed

add segment variables in ematching

56f7923

0x0f0f0f mentioned this pull request May 21, 2026

Support segment variables ~x... in egraphs pattern matching #3

Closed

move to Memory for buffers

9720371

0x0f0f0f mentioned this pull request May 21, 2026

Perf Regression in Julia 1.12 #288

Closed

0x0f0f0f added 5 commits May 21, 2026 15:20

fixes

20f1ccb

extra deps

1baf03a

Performance optimizations

949cf31

deterministic extraction

9f0caa5

fix nightly ci

b743e75

Uh oh!

Conversation

0x0f0f0f commented Jan 14, 2024

Uh oh!

codecov-commenter commented Jan 14, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

github-actions Bot commented Jan 14, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Benchmark Results

Benchmark Plots

Uh oh!

0x0f0f0f commented Jan 14, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

0x0f0f0f commented Mar 1, 2024

Uh oh!

gkronber commented Mar 2, 2024

Uh oh!

shashi commented Mar 12, 2024

Uh oh!

0x0f0f0f commented Mar 19, 2024

Uh oh!

0x0f0f0f commented Mar 19, 2024

Uh oh!

chriselrod commented Apr 17, 2024

Uh oh!

shashi commented Apr 17, 2024

Uh oh!

0x0f0f0f commented Apr 18, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

0x0f0f0f commented May 21, 2026

Uh oh!

0x0f0f0f commented May 21, 2026

Uh oh!

tecosaur commented Jun 19, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

11 participants

codecov-commenter commented Jan 14, 2024 •

edited

Loading

github-actions Bot commented Jan 14, 2024 •

edited

Loading

0x0f0f0f commented Jan 14, 2024 •

edited

Loading

0x0f0f0f commented Apr 18, 2024 •

edited

Loading