MCSAT: L2O by ahmed-irfan · Pull Request #635 · SRI-CSL/yices2

ahmed-irfan · 2026-06-01T18:38:05Z

Work done by

…de improvements

Expose the l2o (learn-to-optimize) feature through a command-line/SMT2 option and gate when it runs relative to the target-cache recaching scheme. - Add a `bool l2o` mcsat option, wired as `--mcsat-l2o` / `:yices-mcsat-l2o` and auto-enabled for QF_NIA. - In the recache handler, when l2o is enabled drop the recache interval 300->50 and alternate: every even recache runs l2o_run() and clears the target cache (keeping the best cache), odd recaches fall back to trail_recache(count/2). Vary l2o's cache seeding so every 3rd run cold-starts. - Add a keep_best parameter to trail_clear_extra_cache(). - Handle finite-field terms (ARITH_FF_CONSTANT/ARITH_FF_POLY) in collect_free_vars so l2o can collect and then gracefully reject FF assertions instead of crashing.

The ff_plugin was no longer registered in mcsat_add_plugins (the ff_plugin_id field and the mcsat_add_plugin(..., ff_plugin_allocator, ...) call had been dropped on this branch). As a result no plugin owned FF_TYPE / ARITH_FF_* terms, so finite-field constraints went unenforced and QF_FFA problems could be reported sat when they are unsat. Restore the field and the registration line. Fixes the 4 failing tests/regress/mcsat/ff regressions (now unsat = gold); verified no other mcsat regression changes (differential over mcsat/{nra,bv,ff}: exactly those 4 flip to passing, zero new failures).

Harden the l2o evaluator/search against crashes: - Add l2o_term_is_supported(): walk each assertion and reject (skip l2o) any whose subterm kinds the cost function / evaluator can't handle. Checked before collect_free_vars, so neither collect_free_vars (get_composite NULL-deref) nor the evaluator (default-case longjmp into an already-returned setjmp frame) is reached for unsupported kinds. - Drop the input-sized stack VLAs: the evaluator's POWER_PRODUCT and ARITH_POLY cases now re-fetch already-memoized values instead of a vars_eval[n] array; hill_climbing's step_size[n_var] is heap-allocated. These could overflow the stack on large polynomials / many variables. - Route XOR_TERM through the unsupported path instead of silently storing INFINITY, and free eval_stack before the defensive longjmp.

A NaN cost (e.g. from Inf-Inf in a polynomial) made the search loop guard `best_cost > 0.0` false, so hill climbing aborted immediately without optimizing; an Inf/NaN seed could also leave did_improve unable to accept any move. - Loop guard `best_cost > 0.0` -> `!(best_cost <= 0.0)`: a non-finite cost is treated as unsatisfied so the search keeps running toward a finite cost. - did_improve now only moves to a finite cost, accepting any finite value when the current best is non-finite (escape a bad seed) and a strict improvement otherwise. Behavior for finite costs is unchanged.

Equality/disequality atoms were decided with bit-exact == / != on doubles produced by lossy q_get_double / pow / division, so terms that are mathematically equal could compare unequal (and vice versa), mis-scoring an assignment. Add an L2O_APPROX_EQ(x,y) macro (|x-y| within a magnitude-scaled tolerance) and use it in the evaluator (ARITH_EQ_ATOM, EQ_TERM, ARITH_BINEQ_ATOM) and in the disequality branches of l2o_calculate. Cheap branchless ops, so the hot evaluation loop stays fast. Also document that integer div/mod stays in double on purpose (exact within 2^53) for speed. Affects only the cost/hint, not soundness.

l2o_evaluator_run_term runs per candidate move during hill climbing. - Hoist the trace-tag lookup: trace_enabled(...) was called on essentially every term visited (36 sites); compute it once into a local bool. - Stop computing term_type_kind eagerly each iteration; inline it at its only (trace/assert) uses, so release builds don't compute it at all. - Skip the libm pow() call for the common small exponents in POWER_PRODUCT (exp 1 -> x, exp 2 -> x*x). Behavior-preserving (results only feed the hint); no test regression.

The evaluator's eval_map/eval_cache were double_hmap hash maps keyed by term id, but term ids are dense small integers so every lookup paid a hash. Replace them with term_double_map_t: a flat array indexed by the term id with a tracked "marked" set for O(#entries) reset -- the same sparse-array-map scheme as utils/tag_map / utils/mark_vectors. Lookups are now direct indexing, no hashing. Growth mirrors tag_map (with an extra size_t-overflow guard for the double allocation). Behavior-preserving: these maps were only ever used as term_id -> double (no key readback, no iteration). Verified identical regression output on mcsat/{nra,bv,ff}, and under a debug build l2o runs (up to 504 evaluator calls) exercise the new map with the consistency asserts active and no failures.

Record the parked Tier 1 perf idea (recompute only assertions touching the changed variable) with its design sketch and the two caveats: interaction with the evaluator eval_cache swap, and that the regression diff can't catch a wrong delta (l2o only emits a hint) so it needs a debug-only self-check assert. Benchmark before adopting.

double_hash_map (originally eval_hash_map, "adapted from int_hash_map") was added by the L2O module solely as the evaluator's term_id -> double value cache. The Tier 2 change replaced that with term_double_map, so the type now has no users. Delete it: the .c/.h, its src/Makefile entry, and the stale include in l2o.h. ~575 lines of dead, branch-introduced code. Verified: builds clean and the mcsat/{nra,bv,ff} regression set is unchanged.

Branch-introduced dead code with zero references anywhere (and not used by the FF code or anything else): - int_hset_close_and_sort (+ its now-unused int_array_sort.h include); kept int_hset_close and the is_closed debug guard, which are live. - the #if 0 iteration/remove/iterate scaffolding in int_hash_mmap (.c definitions and the matching .h declarations). Builds clean; mcsat/{nra,bv,ff} regression set unchanged.

These scratch files under examples/l2o/ were added in the original "Add L2O module" commit and never referenced by any test harness, build rule, or script.

Added by the L2O branch to int_vectors.h but never called anywhere.

get_composite now lives in terms.c and is used by both the preprocessor and l2o; the extraction note is done.

An empty libpoly feasibility set means the variable has no value consistent with the current trail -- a conflict the next propagation will report (l2o_run executes before it). l2o_search_state_create now detects this and skips the run instead of seeding a hint from a doomed assignment. Previously it passed the empty set to lp_feasibility_set_pick_value, whose emptiness assert is compiled out in release libpoly, causing a NULL deref (SIGSEGV) on many QF_NRA benchmarks (e.g. 20180501-Economics-Mulligan). Also initialize the search state before the n_var==0 early return so every early-return path leaves a valid (skippable) state.

l2o_set_hint converts each variable's value to a rational via double_to_mcsat_value -> mpq_set_d, which is undefined for NaN/Inf. Hill climbing should not produce a non-finite value (the cost functions saturate at finite points), but guard it anyway: skip the hint for that variable instead of feeding mpq_set_d a non-finite double. The hint is advisory, so leaving one variable unset is safe.

double_to_mcsat_value converted a hill-climbing double to a rational with an exact mpq_set_d, so e.g. 0.1 became a ~2^52 fraction that libpoly then carried (and amplified by polynomial degree) through exact arithmetic. Approximate instead by the simplest p/q with q <= 1024 using continued-fraction convergents: simple values are recovered exactly (0.1 -> 1/10, 2.4 -> 12/5, integers -> n/1), others are bounded. The hint is advisory and the search is O(log cap) of double/int64 ops, so the approximation is harmless and cheap. Large-magnitude (>= 2^52, already integral) values use the exact conversion; non-finite values are rejected defensively rather than reaching mpq_set_d.

No regression test ran with --mcsat-l2o, so the recent l2o crash fixes were unguarded. Add two QF_NRA benchmarks (from SMT-LIB 20180501-Economics-Mulligan) with .options enabling --mcsat-l2o: - MulliganEconomicsModel0064c (unsat): l2o_run hits a variable with an empty feasible set and bails; SIGSEGV'd before the empty-set fix. - MulliganEconomicsModel0055a (sat): l2o hill-climbing runs (~2900 eval steps) and sets hints, exercising the value->rational conversion.

Match the smtcomp2025 branch's recache heuristic: drop the extra non-base-level recache trigger at the top of the search loop, leaving only the base-level-gated trigger after propagation. The recache intervals and every-other-recache l2o split were already identical.

Change the cold-start cadence so the first two l2o runs after a recache reuse the cached (target/best) assignment as the hill-climbing start, and only every third run cold-starts. Previously the very first l2o run cold-started, discarding the warm cache already populated during search.

clang under -std=gnu23 with -Werror flags a declaration immediately following the case label (-Wc23-extensions). Wrap the case body in braces so the declaration sits inside a compound statement, matching the other cases in l2o_calculate.

…alized In release builds (-O2/-O3, NDEBUG) the asserts that guarantee a map hit are compiled out, so gcc sees term_double_map_find's out-param 'v' as possibly-unwritten in evaluator_get and evaluator_get_cache. These are static inline, so the warning propagated to every call site. The caller always guarantees a hit; initialize 'v' to 0.0 to satisfy the analysis without changing behavior.

coveralls · 2026-06-01T19:27:55Z

coverage: 68.804% (+0.01%) from 68.793% — elipparini:L2O_calc_update into SRI-CSL:master

elipparini and others added 30 commits December 13, 2024 20:07

Add L2O module

db3730d

Merge branch 'SRI-CSL:master' into L2O

d1cd257

fix bug no vars to hill_climbing

a84738c

Merge branch 'L2O' of github.com:elipparini/yices2 into L2O

05aff72

+l2o TODOs file

c38788b

Fixed some code issues.

2be8f0c

Fixed some code issues.

18272ff

Fixed code formatting for some files

0230318

Fixed memory leak in hill climbing, removed hill climbing dl list, co…

5506563

…de improvements

Removed unused files

823e91d

Moved EMPTY_KEY and DELTED_KEY of hmaps in .c files.

f80bda1

Renamed eval_hash_map to double_hash_map

a316c88

+l2o_apply_ls: classical local search cost fx

f96ac9b

assert

7aa41b1

removed redundant code with apply_l2o_ls

8e95001

Fixed some breask in l2o_apply

aad5167

Added proper conversion from double to mcsat_value

3dee3a4

minor code updates

7bcc088

Update function signature for l2o

5e9ca10

Portfolio 1

a53bfb7

Portfolio 2

b5b1df0

Portfolio 2 + delayed start

938d58d

Portfolio with start L2O

5b06a11

Start, no portfolio

b406321

Portfolio mod 13

3bb2705

L2O only at restarts

0eee5c8

Increased initial delay

8e129d9

No initial delay, cache clear every 2nd l2o run

2f8316b

Fixed cache memory leak

4ef03a9

Fixed another memory leak

d4a46ce

Ovascos and others added 27 commits August 20, 2025 14:56

fixed warnings

86c882c

Removed CNF cost fx

b0265ab

Merge branch 'master' into L2O_calc_update

17f7137

Minor fixes (missed cases and comments)

c01a070

minor

e443083

Merge branch 'SRI-CSL:master' into L2O_calc_update

b222628

Remove unused L2O example smt2 files

dc932c9

These scratch files under examples/l2o/ were added in the original "Add L2O module" commit and never referenced by any test harness, build rule, or script.

Remove unused ivector_append helper

af51f1e

Added by the L2O branch to int_vectors.h but never called anywhere.

Drop completed get_composite TODO

2f09218

get_composite now lives in terms.c and is used by both the preprocessor and l2o; the extraction note is done.

ahmed-irfan requested review from Ovascos and disteph June 1, 2026 19:32

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

MCSAT: L2O#635

MCSAT: L2O#635
ahmed-irfan wants to merge 131 commits into
SRI-CSL:masterfrom
elipparini:L2O_calc_update

ahmed-irfan commented Jun 1, 2026 •

edited

Loading

Uh oh!

coveralls commented Jun 1, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

Conversation

ahmed-irfan commented Jun 1, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

coveralls commented Jun 1, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

ahmed-irfan commented Jun 1, 2026 •

edited

Loading