Skip to content

3.0 Release#185

Open
0x0f0f0f wants to merge 391 commits into
masterfrom
ale/3.0
Open

3.0 Release#185
0x0f0f0f wants to merge 391 commits into
masterfrom
ale/3.0

Conversation

@0x0f0f0f
Copy link
Copy Markdown
Member

Release 3.0 when merged

@codecov-commenter
Copy link
Copy Markdown

codecov-commenter commented Jan 14, 2024

⚠️ Please install the 'codecov app svg image' to ensure uploads and comments are reliably processed by Codecov.

Codecov Report

❌ Patch coverage is 88.40000% with 174 lines in your changes missing coverage. Please review.
✅ Project coverage is 82.65%. Comparing base (a8331d6) to head (9f0caa5).
⚠️ Report is 539 commits behind head on master.

Files with missing lines Patch % Lines
src/EGraphs/Schedulers.jl 34.78% 30 Missing ⚠️
src/EGraphs/egraph.jl 89.23% 28 Missing ⚠️
src/Syntax.jl 89.15% 18 Missing ⚠️
src/utils.jl 11.11% 16 Missing ⚠️
src/EGraphs/saturation.jl 92.85% 13 Missing ⚠️
src/Patterns.jl 88.76% 10 Missing ⚠️
src/ematch_compiler.jl 96.00% 10 Missing ⚠️
src/Rules.jl 84.61% 8 Missing ⚠️
src/optbuffer.jl 78.94% 8 Missing ⚠️
src/vecexpr.jl 84.61% 8 Missing ⚠️
... and 6 more
❗ Your organization needs to install the Codecov GitHub app to enable full functionality.
Additional details and impacted files
@@             Coverage Diff             @@
##           master     #185       +/-   ##
===========================================
+ Coverage   69.17%   82.65%   +13.47%     
===========================================
  Files          16       18        +2     
  Lines        1353     1776      +423     
===========================================
+ Hits          936     1468      +532     
+ Misses        417      308      -109     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@github-actions
Copy link
Copy Markdown

github-actions Bot commented Jan 14, 2024

Benchmark Results

egg-sym egg-cust MT@b743e7597b2... MT@dec61ce1816... egg-sym/MT@b74... egg-cust/MT@b7... MT@dec61ce1816...
egraph_addexpr 1.42 ms 5.05 ms 14.2 ms 0.281 2.823
basic_maths_simpl2 14.6 ms 4.44 ms 17.7 ms 889 ms 0.824 0.251 50.288
prop_logic_freges_theorem 2.51 ms 1.43 ms 2.09 ms 35.9 ms 1.204 0.687 17.215
calc_logic_demorgan 59.5 μs 31.7 μs 79.4 μs 524 μs 0.749 0.399 6.606
calc_logic_freges_theorem 22 ms 10.4 ms 38.7 ms 1.68e+04 ms 0.569 0.269 432.910
basic_maths_simpl1 6.74 ms 2.56 ms 4.27 ms 49.6 ms 1.577 0.598 11.622
egraph_constructor 0.0833 μs 0.497 μs 0.108 μs 0.168 0.217
prop_logic_prove1 35.8 ms 12.8 ms 44.4 ms 8.95e+03 ms 0.806 0.288 201.592
prop_logic_demorgan 79.1 μs 42.6 μs 96.6 μs 757 μs 0.820 0.441 7.840
while_superinterpreter_while_10 19.2 ms 96.2 ms 5.006
prop_logic_rewrite 115 μs 114 μs 0.987
time_to_load 100 ms 159 ms 1.591
egg-sym egg-cust MT@b743e7597b2... MT@dec61ce1816... egg-sym/MT@b74... egg-cust/MT@b7... MT@dec61ce1816...
egraph_addexpr 6769 6769 6769
basic_maths_simpl2 440 2235 2836 440 2235 2839
prop_logic_freges_theorem 316 1197 2315 316 1197 2322
calc_logic_demorgan 16 33 35 16 33 35
calc_logic_freges_theorem 1072 4289 17394 1072 4289 17280
basic_maths_simpl1 368 1910 2543 368 1910 2567
egraph_constructor
prop_logic_prove1 5510 17371 27976 4668 13644 18522
prop_logic_demorgan 16 35 42 16 35 42
while_superinterpreter_while_10
prop_logic_rewrite
time_to_load

Benchmark Plots

A plot of the benchmark results have been uploaded as an artifact to the workflow run for this PR.
Go to "Actions"->"Benchmark a pull request"->[the most recent run]->"Artifacts" (at the bottom).

@0x0f0f0f
Copy link
Copy Markdown
Member Author

0x0f0f0f commented Jan 14, 2024

This is 14.5 times faster than egg in rust!

Julia Code

using Metatheory, BenchmarkTools

t = @theory a b begin
  a + b --> b + a
  a * b --> b * a
  a + 0 --> a
  a * 0 --> 0
  a * 1 --> a
end

using BenchmarkTools

p = SaturationParams(; timer = false)

function simpl(ex)
  g = EGraph(ex)
  saturate!(g, t, p)
  extract!(g, astsize)
end

ex = :(0 + (1 * foo) * 0 + (a * 0) + a)

simpl(ex)

@btime simpl(ex)

94.462 μs (1412 allocations: 66.08 KiB)

Rust code

use egg::{rewrite as rw, *};
//use std::time::Duration;
fn main() {
    env_logger::init();
    use egg::*;

    define_language! {
        enum SimpleLanguage {
            Num(i32),
            "+" = Add([Id; 2]),
            "*" = Mul([Id; 2]),
            Symbol(Symbol),
        }
    }

    fn make_rules() -> Vec<Rewrite<SimpleLanguage, ()>> {
        vec![
            rewrite!("commute-add"; "(+ ?a ?b)" => "(+ ?b ?a)"),
            rewrite!("commute-mul"; "(* ?a ?b)" => "(* ?b ?a)"),
            rewrite!("add-0"; "(+ ?a 0)" => "?a"),
            rewrite!("mul-0"; "(* ?a 0)" => "0"),
            rewrite!("mul-1"; "(* ?a 1)" => "?a"),
        ]
    }

    /// parse an expression, simplify it using egg, and pretty print it back out
    fn simplify(s: &str) -> String {
        // parse the expression, the type annotation tells it which Language to use
        let expr: RecExpr<SimpleLanguage> = s.parse().unwrap();

        // simplify the expression using a Runner, which creates an e-graph with
        // the given expression and runs the given rules over it
        let runner = Runner::default().with_expr(&expr).run(&make_rules());

        // the Runner knows which e-class the expression given with `with_expr` is in
        let root = runner.roots[0];

        // use an Extractor to pick the best element of the root eclass
        let extractor = Extractor::new(&runner.egraph, AstSize);
        let (best_cost, best) = extractor.find_best(root);
        println!("Simplified {} to {} with cost {}", expr, best, best_cost);
        best.to_string()
    }

    // assert_eq!(simplify("(* 0 42)"), "0");
    let apply_time: std::time::Instant = instant::Instant::now();
    // assert_eq!(simplify("(+ 0 (* 1 foo))"), "foo");
    assert_eq!(simplify("(+ (+ (+ 0 (* (* 1 foo) 0)) (* a 0)) a)"), "a");
    let apply_time = apply_time.elapsed().as_secs_f64();
    println!("simplification time {}", apply_time);
}

simplification time 0.001375786 seconds which is 1375.786microseconds

1375.786 / 94.462 = 14.56x faster

well

@0x0f0f0f
Copy link
Copy Markdown
Member Author

0x0f0f0f commented Mar 1, 2024

@gkronber I have just updated this branch to include the latest release of https://github.com/JuliaSymbolics/TermInterface.jl - the interface for custom types has changed, please let me know if you encounter any issue

@gkronber
Copy link
Copy Markdown
Collaborator

gkronber commented Mar 2, 2024

@gkronber I have just updated this branch to include the latest release of https://github.com/JuliaSymbolics/TermInterface.jl - the interface for custom types has changed, please let me know if you encounter any issue

Thanks for the heads up. I only had to make minor changes because of the changed names for functions in VecExpr.

@shashi
Copy link
Copy Markdown
Member

shashi commented Mar 12, 2024

@0x0f0f0f Can you summarize the changes in this PR? What's the huge number of deletions from? And what's the huge performance gain from? This is awesome.

@0x0f0f0f
Copy link
Copy Markdown
Member Author

@0x0f0f0f Can you summarize the changes in this PR? What's the huge number of deletions from? And what's the huge performance gain from? This is awesome.

@shashi I removed some unnecessary parts of the codebase, and basically changed the core types used for rewriting. The trick for performance was basically packing e-nodes in Vector{UInt64} such that loops (which happens hundreds of thousands of times) can be vectorized. Instead of having a struct with operation, hash, flags... fields, I just encode and pack everything into UInt64.

https://github.com/JuliaSymbolics/Metatheory.jl/blob/8f228fd0f8a5e1f97fd0ed986cb1c7e94cbce5c8/src/vecexpr.jl

Also some algorithms were updated, as the egg repo now has more efficient versions of parts of equality saturation (rebuilding) that were in the original paper

@0x0f0f0f
Copy link
Copy Markdown
Member Author

I will hold a 5 min lightning talk on thursday 6 pm CET https://egraphs.org/ if interested

@chriselrod
Copy link
Copy Markdown
Contributor

For fun, I benchmarked this PR

julia> @benchmark simpl(ex)
BenchmarkTools.Trial: 10000 samples with 1 evaluation.
 Range (min  max):  74.508 μs    8.173 ms  ┊ GC (min  max): 0.00%  97.73%
 Time  (median):     80.918 μs               ┊ GC (median):    0.00%
 Time  (mean ± σ):   87.298 μs ± 150.197 μs  ┊ GC (mean ± σ):  5.42% ±  3.34%

   ▂▃▂   ▄▇█▇▆▆▆▅▄▅▅▅▄▃▃▃▂▂▂▁▁▁ ▁▁▁▁ ▁                         ▂
  █████▆▇██████████████████████████████▇▇▇▇▇▇█▇▇▆▆▆▆▅▆▄▅▂▅▂▄▅▅ █
  74.5 μs       Histogram: log(frequency) by time       107 μs <

 Memory estimate: 55.08 KiB, allocs estimate: 1003.
> target/release/eggvsmetatheory                               (base) 
Simplified (+ (+ (+ 0 (* (* 1 foo) 0)) (* a 0)) a) to a with cost 1
simplification time 46.603975 us

It's still slower than egg for that example, but this is a huge improvement compared to being like 300x slower the last time I compared them.

I modified the Rust script so that we aren't benchmarking printing, and also taking the average of 1000 runs:

    let apply_time: std::time::Instant = instant::Instant::now();
    for _i in 0..1000 {
        simplify("(+ (+ (+ 0 (* (* 1 foo) 0)) (* a 0)) a)", false);
    }
    let apply_time = apply_time.elapsed().as_secs_f64();
    println!("simplification time {} us", apply_time*1e3);

where simplify takes a print: bool arg for whether or not it should print.

@shashi
Copy link
Copy Markdown
Member

shashi commented Apr 17, 2024

Let's go!!!

@0x0f0f0f
Copy link
Copy Markdown
Member Author

0x0f0f0f commented Apr 18, 2024

It's still slower than egg for that example, but this is a huge improvement compared to being like 300x slower the last time I compared them.

I modified the Rust script so that we aren't benchmarking printing, and also taking the average of 1000 runs:

@chriselrod we just published a preprint with some results here https://arxiv.org/pdf/2404.08751.pdf

@nmheim is working on https://github.com/nmheim/egg-benchmark which should be the repository containing benchmarks for egg against the ones in the benchmarks directory in repository (1-1 comparison with same hyperparameters/rules where possible).

@0x0f0f0f
Copy link
Copy Markdown
Member Author

Back at this! @gkronber @nmheim

Here are both texts:

Fixed congruence closure, ematcher correctness, lock leak, and FreezingScheduler (not used still). added theory-level predicates.

The last two commits introduce theory-level variable predicates (@theory x::Number y begin … end) that apply to every rule in a theory without repeating the annotation per rule, with documentation and tests. On the correctness side it fixes several bugs: rebuild_memo! is added to egraph.jl and called after every rebuild that produced unions — this is necessary because canonicalize! updates child ids in VecExpr nodes in-place, which changes their cached hash and makes the corresponding Dict keys unreachable, silently breaking congruence closure (the e-graph would create duplicate eclasses for the same canonical expression instead of merging them). FreezingSchedulerStat is changed from struct to mutable struct so the ban mechanism can actually write to it at runtime. direct(EqualityRule) is fixed to not share the internal stack with the original rule, preventing aliasing bugs when both are used concurrently. The ematcher stack is reset (empty!(stack)) before each call to the classical matcher to prevent leftover entries from early-callback returns accumulating indefinitely. The isliteral_bitvec bit for a pattern variable is now cleared on backtrack in the e-graph ematcher to prevent stale literal-vs-eclass bindings when a variable transitions between matching a literal and a non-literal node. Finally, eqsat_apply! is refactored into an inner _eqsat_apply_impl! so the try/finally lock guard only wraps the locking path, removing JIT optimization inhibition from the common single-threaded hot loop.

Benchmarks vs. origin/ale/3.0: most workloads improve (simpl1 −6.5 %, simpl2 −12.4 %, addexpr −5.1 %) or are invariant. prop_logic/prove1 regresses ~21 % because the congruence-closure fix causes the e-graph to correctly detect and cascade more unions, producing a larger but more complete saturation (4 765 classes vs. 3 061 in the baseline, which had 6 461 stale/unreachable memo entries masking missed merges). All tests pass.

Fixed the tests that have been broken since, like 2021? 🚀

@0x0f0f0f 0x0f0f0f mentioned this pull request May 21, 2026
@0x0f0f0f
Copy link
Copy Markdown
Member Author

Latest commit adds segment variables (~~x) for e-graph matching

  • Added has_segment_children::Bool to Pat — precomputed in pat_expr
  • Added is_segment_patvar helper used by @rule macro.
  • RewriteRule gets has_segments::Bool, segment_patvars::BitVector, and segment_buffer::OptBuffer{UInt64}. The buffer is cleared once per rule per iteration before ematch, then read during apply. No race with the buffer's own offset values stored in ematch_buffer.
  • New bind_segment_expr instruction in ematch compiler. Protocol: each segment patvar gets a local seg_offset_sym initialized to typemax(UInt64) ("no prior push"). On every re-entry (including backtracks), if seg_offset_sym != typemax, seg_buf.i is restored to that value. undoing both this segment's data and any
    inner segment data pushed after it. The typemax sentinel distinguishes a fresh first entry from a backtrack, preventing outer segment data from being messed up.
  • New ematch_compile_segment_expr! allocates σ addresses only for fixed children; the segment patvar gets de Bruijn address -2 (sentinel), plus an entry in segment_patvar_syms.
  • Updated yield_expr to push seg_offset_sym (buffer offset) for segment patvars instead of σaddr.
  • Generated ematchers have seg_buf as 6th argument with a default, so old 5-arg call sites still work.
  • cached_ids: segment patterns fall back to a head-hash-filtered scan of all e-classes (marked # EXPENSIVE). Non-segment rules still use the fast classes_by_op index, no regression.
  • instantiate_enode!: the p.has_segment_children check is one field read. When true, allocates a fresh VecExpr with the correct runtime arity and signature hash(p.name, hash(total_arity)), splatting segment children from seg_buf. WARNING: THIS IS EXPENSIVE
  • eqsat_search!: calls empty!(rule.segment_buffer) once before each rule's ematch, passes it as 6th arg.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

10 participants