CallElimCorrect: rewrite using small-step semantics by PROgram52bc · Pull Request #1306 · strata-org/Strata

PROgram52bc · 2026-06-01T17:53:55Z

Summary

This PR replaces the deprecated big-step CallElimCorrect proof with a small-step version, plus reusable infrastructure (SubstProps, CoreTransformProps) and shared-file helpers. The previous file carried a DEPRECATED notice flagging it for a small-step rewrite; this is that rewrite.

Top-level public theorem callElimStatementCorrect is preserved (now a conjunction over a _terminal arm and an _exiting arm), the contract semantics for procedure calls (EvalCommandContract.call_sem) is unified into a single failure-flag–parameterized rule, and the _terminal arm is fully polymorphic over the failure flag ({f : Bool}) — a source success trace maps to a target success trace and a source failure trace maps to a target failure trace, in one statement.

Net repository delta vs main2: +9,277 / -4,861 LoC across 19 files (the bulk being reusable infrastructure rather than CallElim-specific). Compared to the original 14,972-LoC big-step proof, this represents a ~-8k LoC reduction overall, while strictly strengthening what is proved.

What's added

The 19 files modified group into 4 layers:

Top-level proof (~10,525 LoC churn)

Strata/Transform/CallElimCorrect.lean: full rewrite against small-step semantics. Same public theorem name callElimStatementCorrect; structured as _terminal ∧ ∀ lbl, _exiting, mirroring Specification.Overapproximates and TripleBlock. The _terminal arm is polymorphic in the failure flag {f : Bool}; the _exiting arm reuses the original Heval for non-call cases and is vacuously discharged for the call case (a singleton call statement only ever reaches .terminal, never .exiting).

Reusable infrastructure (~2,747 LoC, new files)

Strata/Transform/SubstProps.lean (1,621 LoC, new): generic substitution-semantics infrastructure used by CallElim and reusable by future transforms. Includes H_asserts_zip / H_assumes_zip / H_check_block_zip walkers, singletonAssertEval / singletonAssumeEval and their _poly failure-flag variants, plus H_asserts_zip_fail (the L4 flag-flip walker that powers the polymorphic-f failure arm).
Strata/Transform/CoreTransformProps.lean (1,126 LoC, new): reusable helpers for the Core small-step layer — singleCmdToStmts / _poly, H_havocs / _poly, H_init, H_inits, genIdent* correctness, fresh-name predicates, EvalCallElim_glue and EvalCallElim_glue_fail (composing the L1–L6 segments with the L4 success/failure flip).

Core semantics adjustments (~30 LoC)

Strata/Languages/Core/StatementSemantics.lean: EvalCommandContract.call_sem unified — previously two separate path shapes (success: havoc + write-back; failure: σ' = σ and the precondition iff in the wrong direction); now a single rule parameterized by a Bool failure flag indicator, with the precondition iff narrowed to non-Free preconditions to match callElimCmd's actual emission rule. EvalCommand.call_sem propagates the callee body's terminal hasFailure flag to the caller (previously hardcoded to false, silently dropping any failures inside callee bodies).
Strata/Languages/Core/StatementSemanticsProps.lean (+297 LoC): 8 new helpers around updatedStates / readValues / evalExpressions / initStates, plus the revived 5-theorem refinement-transit chain (EvalCommandRefinesContract, StepStmt[Star]_refines_contract, EvalStatement[s]RefinesContract) over the unified call_sem shape. EvalCallBodyRefinesContract is closed (was a long-standing sorry inside a comment block).
Strata/Languages/Core/WF.lean: WFCallSiteProp.preBoolTyped and WFPrePostProp.boolTyped per-precondition Bool-totality clauses, used by the failure-arm to derive a failing-pre witness.
Strata/Transform/ProcBodyVerifyCorrect.lean: takes CalleesNoFailure π φ as a new hypothesis (every procedure body, run from a non-failing init env, terminates with hasFailure = false), threaded through core_noFailure_preserved. The hypothesis is exactly the body-noFailure clause that clause 2 of the theorem's own conclusion (ProcedureCorrect) establishes per procedure, so callers discharge it inductively.

Shared-file helpers (~250 LoC)

Strata/Languages/Core/StatementSemanticsProps.lean (above): 8 new shared helpers consumed by both CallElim and its sibling Core proofs.
Strata/DL/Util/ListUtils.lean (+150 LoC net): adds notin_append4, unzip_snd_length, disjoint_of_nodup_append_three, zip_pair_split, nodup_3_decompose, notin_3_append_of, zip_zip_unzip_{snd,fst_unzip_fst}_of_lengths, zip_notin_{fst,snd}_pair; promotes existing helpers to public.
Strata/DL/Imperative/CmdSemanticsProps.lean (new, +33 LoC): generalized pure-Imperative lemmas subst_defined_tail, subst_nodup_tail (P : PureExpr).
@[expose] attributes added where required by the new module-system reductions:
- Strata/DL/Imperative/CmdSemantics.lean: isNotDefined, substStores, substDefined, substNodup, invStores, substSwap, WellFormedSemanticEvalVal.
- Strata/DL/Lambda/LExprWF.lean: substFvar, substFvars.
- Strata/DL/Util/StringGen.lean: StringGenState.gen.
- Strata/Languages/Core/CoreGen.lean: CoreGenState.WF, CoreGenState.gen.
- Strata/Languages/Core/Procedure.lean: Procedure.Spec.getCheckExprs, Procedure.Spec.updateCheckExprs.
- Strata/Languages/Core/StatementSemantics.lean: updatedState, updatedStates', updatedStates, WellFormedCoreEvalTwoState.
- Strata/Transform/CoreTransform.lean: ~17 generator/transform defs.
- Strata/Transform/CallElim.lean: callElimCmd, plus an extracted oldTyLookupCallElim helper required for the proof to share auxiliary match definitions across two do-blocks.

Notable commits

The branch is structured so each landed commit can be reviewed independently:

286c6ce36 — initial small-step rewrite (preserves the original _terminal-only theorem at f = false).
3710bb339 — add the exit-arm to callElimStatementCorrect (review feedback from @aqjune-aws): mirrors Specification.Overapproximates shape; non-call cases reuse the original Heval, the call case is vacuously discharged via seq_reaches_exiting.
0eb0342ee — review-feedback rename *Semantics → *SemanticsProps and relocate generic lemmas to ListUtils / CmdSemanticsProps / SubstSemanticsProps.
4bfccc31b — integrate hasFailure flag into EvalCommand.call_sem (closes a soundness gap inherited from the original semantics).
e679bd560 — second review-feedback rename *SemanticsProps → *Props (the Semantics qualifier was misleading: these files hold transform machinery and substitution lemmas, not semantics defs).
44deb44b2 — close EvalCallBodyRefinesContract and revive the dormant Core refinement bridge.
ba122da08 — unify EvalCommandContract.call_sem (failure-flag-parameterized) and revive the 5-theorem refinement-transit chain.
3daec283b — cleanup pass: 6 unused theorems in CoreTransformProps, 5-helper dead chain in SubstProps, dangling docstring references, TODO disposition. Net -327 LoC.
be987fddc — lift _terminal to polymorphic {f : Bool} (PR CallElimCorrect: lift _terminal to polymorphic {f : Bool} #1340, merged into this branch).

Build status

lake build: clean (490 jobs)
0 sorries introduced (baseline sorry inventory in ProgramWF / ProcedureWF / StatementWF / CmdEval is unchanged; this PR closes one previously commented-out sorry in EvalCallBodyRefinesContract)
0 axioms introduced
#print axioms callElimStatementCorrect → [propext, Classical.choice, Quot.sound] (Lean stdlib only)

Test plan

lake build clean across all 490 jobs
#print axioms callElimStatementCorrect reports stdlib axioms only
No new sorries / admits / axioms introduced
Conjunctive shape of callElimStatementCorrect matches Specification.Overapproximates / TripleBlock
_terminal arm polymorphic in {f : Bool} (success and failure traces both covered)
Review-feedback follow-ups landed (exit arm, file renames, generic-lemma relocation)
CI passes

Warning: This repository will shortly undergo a split into several separate repositories. If you're creating a PR that crosses the boundaries between these repositories, you may want to hold off until the split is complete or be prepared to rework your PR into multiple PRs once the split is complete.

The code that will be moved includes:

Strata/DDM/*
Strata/Languages/Boole/*
Strata/Languages/Python/* along with Tools/Python/*
Tools/BoogieToStrata

Replaces the deprecated big-step `CallElimCorrect` proof with a small-step version, plus reusable infrastructure (`SubstSemantics`, `CoreTransformSemantics`) and shared-file helpers. The previous file carried a `DEPRECATED` notice flagging it for a small-step rewrite; this is that rewrite. ## Summary - `Strata/Transform/CallElimCorrect.lean`: 4597 -> 3814 LoC (-17%). Same top-level theorem name (`callElimStatementCorrect`); 6 public declarations preserved. - New `Strata/Transform/SubstSemantics.lean` (1721 LoC): generic substitution-semantics infrastructure (used by CallElim; reusable by future transforms). - New `Strata/Transform/CoreTransformSemantics.lean` (1151 LoC): reusable helpers for the Core small-step layer (`singleCmdToStmts`, `H_havocs`, `H_init`, `H_inits`, `genIdent*` correctness, fresh-name predicates, etc.). Net repository delta: +2,345 LoC (6,916 insertions / 4,571 deletions across 13 files), with the bulk being reusable infrastructure rather than CallElim-specific. Compared to the original 14,972-LoC big-step proof, this represents a -8,056 LoC reduction overall. ## Changes to shared files - `Strata/Languages/Core/StatementSemanticsProps.lean` (+159 LoC): 8 new helpers (`updatedStates_get_notin`, `readValues_get`, `evalExpressions_get`, `readValues_updatedStatesSame`, `evalExpressions_isDefined_flatMap`, `initStates_get_notin`, plus 2-/3-layer fall-through variants). - `Strata/DL/Util/ListUtils.lean` (+60 LoC net): adds `notin_append4`, `unzip_snd_length`, `disjoint_of_nodup_append_three`, `nodup_append_three_disjoint`; promotes existing helpers to `public`. - `@[expose]` attributes added where needed by the new module-system reductions: - `Strata/DL/Imperative/CmdSemantics.lean`: `isNotDefined`, `substStores`, `substDefined`, `substNodup`, `invStores`, `substSwap`, `WellFormedSemanticEvalVal`. - `Strata/DL/Lambda/LExprWF.lean`: `substFvar`, `substFvars`. - `Strata/DL/Util/StringGen.lean`: `StringGenState.gen`. - `Strata/Languages/Core/CoreGen.lean`: `CoreGenState.WF`, `CoreGenState.gen`. - `Strata/Languages/Core/Procedure.lean`: `Procedure.Spec.getCheckExprs`, `Procedure.Spec.updateCheckExprs`. - `Strata/Languages/Core/StatementSemantics.lean`: `updatedState`, `updatedStates'`, `updatedStates`, `WellFormedCoreEvalTwoState`. - `Strata/Transform/CoreTransform.lean`: ~17 generator/transform defs. - `Strata/Transform/CallElim.lean`: `callElimCmd`, plus an extracted `oldTyLookupCallElim` helper required for the proof to share auxiliary `match` definitions across two `do`-blocks. ## Testing Full repo build green (602 jobs); StrataTest green (617 jobs); zero warnings.

aqjune-aws

I left a few comments which can be applied orthogonally to our ongoing discussion in a private channel.
One extra question is that, I think the correctness of the top-level statement must also consider the case when a program may 'throw', through Imperative.exit. For example,

call f();
exit "some-block";

Note that the original program exits to "some-block", hence the final state after small-steps isn't ".terminal" (unlike what EvalStmtsSmall states, which is used transitively by EvalStatementsContract). If we apply call-elimination on this,

(omit temporary vars)
assume <pre>;
assert <post>;
exit "some-block";

it will have to exit to the right state.

@aqjune-aws

Address PR #1306 review comment from @aqjune-aws: the previous statement of `callElimStatementCorrect` only covered programs reaching `.terminal`, silently excluding programs whose top-level execution ends in an `.exiting lbl ρ` configuration (e.g., `[call f(); exit "L"]` outside an enclosing block). `EvalStatementsContract` unfolds to `StepStmtStar (.terminal _)`, so such runs failed the hypothesis and were vacuously "covered" by the existing proof. Mirror the conjunctive shape used by `Specification.Overapproximates` (Specification.lean:594) and `TripleBlock` (Specification.lean:211): return `(terminal-arm) ∧ (∀ lbl, exiting-arm)`. Refactor into three theorems for clarity: - `callElimStatementCorrect_terminal` (private, unchanged): the existing call-elim chain via `EvalCallElim_glue`. - `callElimStatementCorrect_exit` (private, new): non-call cases reuse the original `Heval` via `callElimStmt_non_call_eq` (sts = [st]); the call case is vacuously discharged because `step_cmd` only ever yields `.terminal`, so `(.stmts [.cmd (.call ...)] _) →* .exiting lbl _` is unreachable (proven by `seq_reaches_exiting` plus inversion). - `callElimStatementCorrect` (public): assembles the two arms. No label-collision hazard: callElim only introduces fresh temp idents, not block labels, so exit labels transport unchanged. Zero external callers, so the conjunctive signature change is free downstream.

@aqjune-aws

…nticsProps Address PR #1306 review feedback from @aqjune-aws: * Files holding only lemmas (not actual semantics defs) follow the existing repo convention `*SemanticsProps.lean` (e.g., `Strata/Languages/Core/StatementSemanticsProps.lean`): - `Strata/Transform/CoreTransformSemantics.lean` -> `Strata/Transform/CoreTransformSemanticsProps.lean` - `Strata/Transform/SubstSemantics.lean` -> `Strata/Transform/SubstSemanticsProps.lean` Audited `SubstSemantics.lean`'s deps: still requires `Core.Transform.createFvar(s)` and `singletonAssertEval`/ `singletonAssumeEval`, so it stays under `Strata/Transform/`. * Generic lemmas relocated to their proper homes: - Pure list lemmas -> `Strata/DL/Util/ListUtils.lean` (now in `List` namespace): `zip_pair_split`, `nodup_3_decompose`, `notin_3_append_of`, `zip_zip_unzip_snd_of_lengths`, `zip_zip_unzip_fst_unzip_fst_of_lengths`, `zip_notin_fst_pair`, `zip_notin_snd_pair` - Pure-Imperative lemmas -> new `Strata/DL/Imperative/CmdSemanticsProps.lean` (generalized from CoreStore/Expression.Ident to P : PureExpr, now in `Imperative` namespace): `subst_defined_tail`, `subst_nodup_tail` - Core-substitution helper moved out of `CallElimCorrect.lean` into `SubstSemanticsProps.lean`: `substDefined_of_app` * Imports updated: `CallElimCorrect`, `SubstSemanticsProps`, and `Strata/DL/Imperative/Imperative.lean` (added new `CmdSemanticsProps` import).

PROgram52bc · 2026-06-04T21:01:08Z

Fixed in 3710bb3. The theorem now has a conjunctive shape, refactored into three theorems: _terminal (private, unchanged), _exit (private, new for the call case), and a public conjunction. No external callers needed updating.

…ep-on-main2 # Conflicts: # Strata/Transform/CallElimCorrect.lean

Layer-A small-step semantics for procedure calls now propagates the callee body's terminal hasFailure flag to the caller. Previously, EvalCommand.call_sem hardcoded the output flag to `false`, silently dropping any failures inside callee bodies (a soundness gap inherited from the original semantics). The change exposes a previously-vacuous case in `evalCommand_failure_implies_assert_ff` (StatementSemanticsProps.lean): when call_sem yields `true`, the failing assert lives somewhere inside the callee body, not at the call site itself, so the lemma's existing existential `∃ a, coreIsAtAssert (.stmt (.cmd c) ρ) a ∧ …` cannot be discharged structurally. Closure strategy: a new precondition `CalleesNoFailure π φ` asserts that every procedure body in `π`, run from a non-failing init env, terminates with `hasFailure = false`. This is threaded through both `evalCommand_failure_implies_assert_ff` and `core_noFailure_preserved`. The call_sem arm of the former is then discharged by applying `hCallees` to the body trace bound by call_sem and contradicting the resulting `false` against the `true` index. Downstream: `procBodyVerify_procedureCorrect` (the sole consumer of `core_noFailure_preserved`) takes `CalleesNoFailure` as a new hypothesis. The hypothesis is exactly the body-noFailure clause that clause 2 of `ProcedureCorrect` (this theorem's own conclusion) establishes per procedure, so callers can discharge it inductively across all procedures in `π` once the program is verified. No new sorries, no axioms, no termination admits. Build clean (488 jobs).

aqjune-aws · 2026-06-04T23:01:17Z

Thanks for adding the exiting cases. It is good to see them proven :)

I have two high-level important comments which will more be related to what is to be proven.

The call_sem of EvalCommandContract

You would have already figured out the issue in call_sem. To talk about its proper fix, let me introduce the difference between the small-step semantics of assume and assert first:

For assume, if the program state doesn't satisfy its condition, it is simply stuck; it doesn't proceed.
For assert, even if the program state doesn't satisfy the condition, it proceeds; it only updates the hasFailure flag.
You can confirm this from EvalCmd which has eval_assert_pass, eval_assert_fail and eval_assume. This is designed in this way so that even if a program includes multiple asserts and the first assert fails, we can still talk about the second assert individually (without using hasFailure, because hasFailure will already have been marked as true in this case).

Now coming to call_sem... When we are calling a procedure and the procedure has pre-post condition, from the caller's side I think a natural flow would be (1) we assert the precondition, (2) execute the body, and (3) assume the postcondition. It means that actually this part is too strong:

    (∀ pre, (Procedure.Spec.getCheckExprs p.spec.preconditions).contains pre →
      isDefinedOver (HasVarsPure.getVars) σAO pre ∧
      δ σAO pre = .some HasBool.tt) →

This has to be something like this:

    preconditionPassed <-> (∀ pre, (Procedure.Spec.getCheckExprs p.spec.preconditions).contains pre →
      isDefinedOver (HasVarsPure.getVars) σAO pre ∧
      δ σAO pre = .some HasBool.tt) →
...
    ----
    EvalCommandContract π δ σ (.call n callArgs md) σ' (\neg preconditionPassed)

Connection between EvalCommand and EvalCommandContract

I thought EvalCommand should be the ground-truth operational semantics for procedure call in a closed program, because when we do a procedure call, its interpreter will actually get into the body of the procedure call and concretely execute stuffs. How are EvalCommand and EvalCommandContracts related? Is there a theorem stating equivalence or refinement relation betweeen these two?

…icsProps -> SubstProps Address PR #1306 follow-up review comment: the *Semantics qualifier in the prior names is misleading since CoreTransform.lean and SubstSemantics.lean do not define formal semantics — they hold transform machinery and substitution lemmas respectively. The Props-suffix convention should match the root file name without an inserted Semantics qualifier. Strata/Transform/CoreTransformSemanticsProps.lean -> Strata/Transform/CoreTransformProps.lean Strata/Transform/SubstSemanticsProps.lean -> Strata/Transform/SubstProps.lean Updated importers (CallElimCorrect.lean, SubstProps.lean) and one stale docstring reference in CmdSemanticsProps.lean.

PROgram52bc · 2026-06-04T23:32:35Z

The call_sem of EvalCommandContract

Thanks for pointing it out. That makes sense. I will weaken the precondition requirement in call_sem, and adapt the proof accordingly.

Connection between EvalCommand and EvalCommandContract

Here is a potential theorem we could prove to connect the two. It is currently commented out because the contract version doesn't create the extra variables in the store, which makes the proof a little more complicated than it needs to be. But thanks for the question -- I will try to look into that next.

@aqjune-aws

…ondition Per @aqjune-aws review feedback, model the procedure-call contract following the assert/assume distinction in EvalCmd: * Caller asserts the precondition (flag-encoded violation). * Body abstraction = havoc + postcondition assume (stuck on violation). * Caller assumes the postcondition (no flag, only stuckness). Implementation: split the existing single `call_sem` constructor into two arms mirroring `eval_assert_pass` / `eval_assert_fail`: * `call_sem` (renamed in spirit to "pass arm" via docstring) — the precondition holds (∀ pre, … = some tt); execution proceeds through havoc + postcondition assume; conclusion σ' false. Premise list and binders unchanged from the prior single rule, so all downstream destructures (notably CallElimCorrect.lean's `cases Hcc | call_sem` at lines 1393-1398) continue to work verbatim. * `call_sem_pre_fail` (new) — some precondition fails (∃ pre, … = some ff); execution halts at the assert violation; store unchanged (σ' = σ), flag = true. Mirrors eval_assert_fail. Drops post-precondition premises (HavocVars / postcondition assume / output ReadValues / UpdateStates) since execution does not reach those phases. The new fail arm is currently inert in proofs because: * CallElimCorrect's terminal arm extracts `Hcc : EvalCommandContract … σ' false` (failure index pinned to false). Lean's index discrimination during `cases Hcc` auto-discharges the new arm — its conclusion `σ true` cannot unify with the σ' false hypothesis. * core_noFailure_preserved is stated against EvalCommand (concrete semantics), not EvalCommandContract, so it is unaffected. Build clean (488 jobs); zero new sorries; no axioms. Pre-existing sorries unchanged at 22 (only 2 outside comment blocks: ProgramWF and the inactive EvalCallBodyRefinesContract bridge). Diff: Strata/Languages/Core/StatementSemantics.lean +27/-1. Postcondition violations remain stuck-on-violation (no derivation), mirroring `eval_assume`'s single-rule design. The verifier's postcondition discharge happens on the callee side via `procBodyVerify_procedureCorrect` (clause 2 of ProcedureCorrect), not at the contract rule.

…for precondition" This reverts commit b6904b0.

The dormant Core refinement bridge from concrete EvalCommand to abstract EvalCommandContract has carried a long-standing `sorry` inside a `/- ... -/` block-comment at StatementSemanticsProps.lean's EvalCallBodyRefinesContract since commit 74e7be4 (2025-08-08). This commit closes that sorry and brings the leaf bridge theorem out of the comment block. Proof sketch: given concrete `call_sem` premises (body trace `CoreStepStar π φ (.stmts p.body ⟨σAO, δ, false⟩) (.terminal ρ')`, postcondition validity `δ ρ'.store post = some HasBool.tt`, ReadValues from ρ'.store, UpdateStates writeback), construct a matching `EvalCommandContract.call_sem` derivation by: 1. Take σO := updatedStates σAO (ListMap.keys p.header.outputs) modvals where modvals is the concrete-trace ReadValues witness. 2. Discharge `HavocVars σAO outputs σO` by induction on outputs (havoc admits any value; we choose modvals[i] for output i). 3. Discharge `ReadValues σO outputs modvals` directly from σO's definition via `readValues_updatedStatesSame`. 4. Discharge `δ σO post = some HasBool.tt` via congruence: σO and ρ'.store agree at output keys (both = modvals[i] there); under `p_post_scoped`, postcondition free vars are confined to outputs; by `WellFormedSemanticEvalExprCongr`, agreement on FVs implies equal evaluation; transport concrete Hpost to σO. 5. UpdateStates is reused verbatim from the concrete trace. The theorem now takes three new outer hypotheses, each mathematically true and derivable from upstream WF predicates: * `WellFormedSemanticEvalExprCongr δ` — a standard evaluator well-formedness assumption (existing congruence machinery). * `(ListMap.keys p.header.outputs).Nodup` — output parameter names are distinct in the procedure header. * `p_post_scoped : ∀ post ∈ postconditions, getVars post ⊆ p.header.outputs.keys` — postconditions only mention output parameters (a scoping assumption typical for procedure contracts). Implementation note: the destructure `cases H | call_sem` uses the `generalize hf : (false : Bool) = b at H` trick (borrowed from `evalCommand_failure_implies_assert_ff`) because the `false` index cannot auto-substitute under `cases` when ρ' is existentially bound in the constructor. The five dependent transit theorems (EvalCommandRefinesContract, StepStmt_refines_contract, StepStmtStar_refines_contract, EvalStatementsRefinesContract, EvalStatementRefinesContract) remain inside a fresh `/- ... -/` block following the proof, with an explanatory header noting they need the three new outer hypotheses threaded through. This is a follow-up; closing the leaf bridge is the load-bearing step. Build clean (488/488 jobs); zero new sorries; no axioms; no termination admits. Net diff: +109/-12 LoC in Strata/Languages/Core/StatementSemanticsProps.lean.

Unify `EvalCommandContract.call_sem` to be parameterized by a Bool failure flag indicator. The precondition status is connected to `failed` via an iff: the call fails iff some precondition fails to evaluate to `tt` at `σAO`. On the success path (`failed = false`) the result store is produced by havoc + write-back via `UpdateStates`; on the failure path (`failed = true`) the result store is unchanged. Revive the 5 refinement-transit theorems (View B: f = false-only) threading 3 outer hypotheses (`δ_wfCong`, `π_wf` packaging `p_outputs_nodup` and `p_post_scoped`, plus a `h_no_fail` non-failure witness that back-propagates via `StepStmtStar_hasFailure_monotone`): - EvalCommandRefinesContract - StepStmt_refines_contract - StepStmtStar_refines_contract - EvalStatementsRefinesContract - EvalStatementRefinesContract Adjust the `call_sem` destructure in CallElimCorrect.lean to re-synthesize the legacy combined `Hpre` and the success-path `Hupdate` from the new bool-indicator-shaped premises (the destructure site is pinned to `failed = false`).

Resolves two conflicts introduced by upstream commit 0044067 (#1327) "Factor out theorems and lemmas into dedicated Props files": * Strata/Languages/Core/StatementSemanticsProps.lean — drop the local `import all Strata.DL.Imperative.SemanticsProps` (file deleted upstream) and adopt upstream's `Strata.DL.Imperative.StmtSemanticsProps` import, which now re-houses `StepStmtStar_hasFailure_monotone` and friends. * Strata/DL/Imperative/CmdSemanticsProps.lean — add/add conflict. Take upstream's full content (isDefined / isNotDefined / Init/UpdateState / set commutation lemmas) and append the local `subst_defined_tail` and `subst_nodup_tail` helpers, plus the Nodup import they require. Follow-up fixes after the merge: * Strata/Transform/SubstProps.lean and CallElimCorrect.lean — update `import Strata.DL.Lambda.Lambda` references to `Strata.DL.Lambda` for the Lambda.lean → Lambda.lean rename (also from #1327). * Strata/Languages/Core/StatementSemanticsProps.lean — pass explicit `Expression`, `EvalCommand π φ`, `EvalPureFunc φ` arguments to `StepStmtStar_hasFailure_monotone` (the upstream version of the lemma binds these from the surrounding section's `variable` declarations, not as implicits). `lake build` passes cleanly after resolution.

…sition Audit-driven cleanup of branch htd/callelim-smallstep-on-main2 (no logic changes; build clean at 490 jobs): DELETIONS (~327 LoC removed across 6 files): - Strata/Transform/CoreTransformProps.lean: removed 6 unused theorems (genOldExprIdentsTrip_snd, genIdentGeneratedWF, genOldExprIdentsTripGeneratedWF, genOldExprIdentsTrip_isOldTempIdent, genArgExprIdents_length, genOutExprIdents_length). Each grep-confirmed zero callers across Strata/. - Strata/Transform/SubstProps.lean: removed 5-helper dead chain (createAsserts_list, createAssumes_list, H_check_block, H_asserts, H_assumes). The *_zip variants are the ones actually used by CallElimCorrect; these unzipped variants have no consumers. - Strata/DL/Util/ListUtils.lean: removed List.nodup_append_three_disjoint (specialization of base lemma; no callers). - Strata/Languages/Core/StatementSemanticsProps.lean: removed 10-line commented InitVarsNotDefMonotone' stub (would not typecheck if uncommented). COMMENT/DOCSTRING FIXES (~12 sites): - Strata/Transform/CallElimCorrect.lean: - line 1091: 'six call-site WF clauses' → 'seven' (WFCallSiteProp has 7 fields); dropped legacy pointer to Hpre_post_lhs_disj (no longer in the codebase). - line 1221: 'two halves of legacy Hwfgenst' → 'three fields' (WFGenStateProp has 3 including wfgen). - line 1300: 'see WFCallSiteProp in Strata/Languages/Core/WF.lean' → 'see WFCallSiteProp above (line 1095 of this file)'. - line 1390: 'call_sem hardwires the failure flag to false' → accurate comment about Hcc being pinned to failed=false at this destructure (call_sem itself is failure-flag–parameterized). - Strata/Transform/SubstProps.lean: dropped 5 dangling references to removed legacy theorems (Lambda.LExpr.substFvarCorrect, ReadValuesUpdatedStates, etc.). - Strata/Transform/CoreTransformProps.lean: dropped 2 dangling references (ReadValuesUpdatedState, EvalExpressionUpdatedState). - Strata/Languages/Core/StatementSemanticsProps.lean: fixed 'assymmetry' → 'asymmetry' typo. TODO DISPOSITION: - Strata/Transform/CallElim.lean:171: replaced bare `/- TODO -/ false` with explicit comment documenting that the validation hook is intentionally conservative — call elimination replaces a callee body with its contract (over-approximation); the hook returns false until per-obligation proof witnesses are available. Logical value preserved. REFINEMENT GENERALIZATION INVESTIGATED (no change): - Investigated generalizing EvalCallBodyRefinesContract from `false → false` to a Bool failure-flag-indexed signature. Two structural obstacles make strict generalization infeasible: (1) σ' mismatch — concrete EvalCommand.call_sem always writes back via UpdateStates, while contract EvalCommandContract.call_sem forces σ' = σ when failed = true. (2) Precondition iff incompatibility — concrete failed = true (body asserts fail) carries Hpre saying ALL preconditions hold, but the contract iff says failed = false IFF all preconditions hold, so all-pre-hold forces contract failed = false. The current View B (false-only) signature is therefore retained. Bridging would require either weakening the contract's failure-path σ' = σ semantics or adding a procedure-correctness premise; both are out of scope for this cleanup pass. Build clean (490 jobs); zero new sorries; no axioms. Working-tree sorry inventory unchanged from pre-cleanup baseline (all sorries live in ProgramWF/ProcedureWF/StatementWF/CmdEval — files this cleanup did not touch).

aqjune-aws · 2026-06-05T21:51:20Z

Hi David, thanks for working on this topic - I think that including the refinement proof might be a too big work & interfere with the unstructured CFG. I will be glad to work on the task.
In parallel, I am trying to port a few updates that fixes problems in small-step semantics and extending the definition of correctness of transformation.
Could you wait for a while until the updates from my end are delivered please? :)

PROgram52bc · 2026-06-05T21:57:32Z

Hi June, Yes. I am still working on adapting the call elimination proof to work with a generic failure flag (the case where preconditions fail). So I can definitely wait for your updates while iterating on that. I think it would be good to reuse some of the definitions in your updates in the current call-elimination proof.

As for the refinement proof, I do think it is quite complicated with the failure flag integrated -- if we constrain it to failed = false, I believe the proof should be straightforward, but that's just a conjecture.

## Summary Lifts `callElimStatementCorrect_terminal` from the parent branch's `f = false → false` shape (PR #1306, success-only refinement) to fully polymorphic `{f : Bool}`: a source trace ending in failure maps to a target trace ending in failure, and a source success trace maps to a target success trace, in one statement. Downstream consumers no longer case-split on the failure flag at the call site. This is a foundational refactor + generalization, not a feature. Reviewers should follow the proof structure: Stages 1-3 are pure extractions (success theorem unchanged), Path 1 narrows a public iff for semantic alignment, Stage 6 (split into a scaffold and a fill-in) lands the actual polymorphic-f result. ## Base branch This PR sits on top of `htd/callelim-smallstep-on-main2` (PR #1306, currently open). Merge into that branch, or rebase onto `main` once #1306 lands. ## Why this matters The parent's success-only theorem suffices to argue that call-elimination preserves the *absence* of bugs, but it leaves a dual gap: it does not say anything about preserving the *presence* of bugs. Without the failure arm, callElim could in principle turn a buggy program into one that verifies. Lifting `_terminal` to `{f : Bool}` makes that argument compositional — the same lemma covers both directions, and any future transform layered on top of call-elimination inherits the polymorphic guarantee for free. The failure arm also exercises corners of the contract semantics the success arm never touches: the assert-zip on call-site preconditions, havoc of out-parameters in the failure branch, and the assume-zip on call-site postconditions all have to compose correctly under `hasFailure = true`. Closing it flushed out a real semantic mismatch in `EvalCommandContract.call_sem` (Path 1 below) that was invisible from the success-only side. ## Commit-by-commit walkthrough Read the seven commits in order: ### 1. 193dee2 — WIP: polymorphic-f infrastructure (no live consumers) Forward-looking scaffolding, kept as a separate commit so that the consumer-side proof changes in later commits aren't buried under unrelated infrastructure. Lands in: - `Strata/Languages/Core/StatementSemantics.lean` — **Change A**: collapses the two conditional `Hupdate` premises of `EvalCommandContract.call_sem` into a single unconditional `UpdateStates` premise. The failure path no longer pins `σ' = σ`, which is what makes a polymorphic-f rule shape coherent. - `Strata/Languages/Core/WF.lean` — `WFPrePostProp.boolTyped` clause added (zero migration cost, no live construction sites yet). - `Strata/Transform/CoreTransformProps.lean` and `Strata/Transform/SubstProps.lean` — new helpers `singleCmdToStmts_poly`, `singletonAssumeEval_poly`, `H_havocs_poly`, `H_check_block_zip_poly`, `H_assumes_zip_poly`, `singletonAssertFailEval`, `singletonAssertEval_poly`, plus the substantive new walker `H_asserts_zip_fail` (L4 flag-flip from `false` to `true` via a failing-pre witness) and `EvalCallElim_glue_fail` (composes L1-L6 with the L4 flip). Build clean, sorry-free, axiom-free at this commit. ### 2. 580ef2d — Stage 1: extract `HoldEval_bridge_at_σO` Pure refactor. Lifts the per-index δ-eval bridge for `mkOld` old-variable fvars at the post-havoc store `σO` out of `callElimStatementCorrect_terminal`'s call arm into a private theorem with 11 explicit hypotheses. Net diff in `Strata/Transform/CallElimCorrect.lean`: +114 / -61. Same proof body as the inline version; the signature overhead is intentional because Stage 6 will share this helper between the success and failure arms. Sorry count and axiom set unchanged. ### 3. 5406b08 — Stage 2a: extract `HoldSubBridge_at_σO` Pure refactor in `Strata/Transform/CallElimCorrect.lean`. Lifts the per-fvar bridge for `createOldVarsSubst`'s codomain at the L6 intermediate stores `σ_R1` / `σO` into a private theorem with 6 hypotheses; the inline block becomes a one-line call. Net diff: +82 / -29. Sorry count and axiom set unchanged. `HinputSubBridge` (the larger sibling, originally targeted for Stage 4) is deferred, then ultimately made redundant by the flag-generalization approach (see "Stages 4 & 5: skipped" below). ### 4. cf10129 — Stage 3: extract `b1`/`b2_var_witness_at_oldSubst` Pure refactor in `Strata/Transform/CallElimCorrect.lean`. Lifts the two class-(b) substitution-decomposition witnesses out of the call arm: - `b1_var_witness_at_oldSubst`: when a `Map.find?` lands in the `createOldVarsSubst` segment of `oldSubst_L6`, the codomain entry is a fresh-old `createFvar gen` and the witness `var ∈ getVars w` forces `var = genOldIdents[i]`. - `b2_var_witness_at_oldSubst`: when the lookup misses `createOldVarsSubst` and hits `callElim_inputOnlyOldSubst`, the codomain is a positional `inArgs` element. Both helpers take `proc'` and `args` directly so `callElim_inputOnlyOldSubst` can be referenced by name (needed for `find?_append_none_elim` unification at the consumer). 4 callsites each in `_terminal` (inv + post variants); Stage 6's failure arm reuses the same callsites with identical arguments. Net diff: +123 / -77. ### Stages 4 & 5: skipped Originally planned to extract `HinputSubBridge` and a final shared-plumbing block. The flag-generalization approach made these unnecessary — the call branch in `_terminal` `cases` on `f`, and the `false` arm reuses its existing 2200-LoC body verbatim, so we never needed shared sub-helpers between the two arms. Keeping the inline block on the `false` side was strictly cheaper than extracting it. ### 5. fdf2fc4 — Path 1: narrow `EvalCommandContract.call_sem` iff to non-Free preconditions This is the semantic alignment that makes the failure arm provable. `callElimCmd` only emits asserts for non-`Free` preconditions (Free requires are assumed by the implementation, not checked at call sites). The previous `call_sem` iff ranged over **all** preconditions, which made the polymorphic-f failure arm unattainable: a witness `pre` with `eval ≠ tt` at `σAO` could fall in the Free segment, giving no L4 entry to flip the flag. That mismatch was harmless at `f = false` (the success arm never relies on the iff direction that exposed it) but a hard blocker on the failure arm. Restricting the iff to non-Free preconditions aligned the two semantics without changing any existing proof: - `Strata/Languages/Core/StatementSemantics.lean`: `call_sem` rule's `Hpre_def` and `Hpre_iff` now range over `getCheckExprs (preconditions.filter (·.attr ≠ .Free))`. - `Strata/Languages/Core/StatementSemanticsProps.lean`: `EvalCallBodyRefinesContract` / `EvalCommandRefinesContract` / `StepStmt[Star]_refines_contract` / `EvalStatement[s]RefinesContract` gain `[LawfulBEq Expression.Expr]` instances (needed for the contains↔mem bridge in the new `presFiltered_subset` derivation that powers the rule constructor). - `Strata/Transform/CallElimCorrect.lean`: success arm's `Hpre_evalTt` / `Hpre` re-synthesis ranges over filtered preconditions; `HpreFiltered` no longer needs `filterCheck_in_getCheckExprs` — direct membership in `presFiltered` transports straight to the filtered contains-form. Failure-arm helper signature updated to take `Hpre_def` / `Hpre_iff` at the new filtered shape. Build clean (490 jobs) with one expected sorry at line 1654 (failure-arm body, filled by the next commit). ### 6. 45db77d — Stage 6 scaffold: lift `_terminal` to `{f : Bool}`, delegate failure arm Shape change without filling in the body. In `Strata/Transform/CallElimCorrect.lean`, `callElimStatementCorrect_terminal`'s signature lifts from `f = false → false` to `{f : Bool}`, with `Heval` and the conclusion both at flag `f`. Non-call cases close polymorphically via `nc_close` (`Heval` passes through unchanged for identity statements). The call branch dispatches: ```lean cases f with | false => existing 2200-LoC body verbatim (f=false case) | true => exact callElimStatementCorrect_terminal_call_arm_fail ... ``` The `_terminal_call_arm_fail` helper is declared with full signature (~25 hypotheses from the `cases Hcc` destructure) and a `sorry` body, ready for the final commit. Splitting the signature lift from the proof close means the lifted public signature is reviewable and buildable on its own, and the failure-arm proof body lands against an already-stable interface. Sorry count: 18 → 19 (temporary). ### 7. be35c7f — Stage 6: close polymorphic-f failure arm + lift public signature Replaces the sorry-stubbed `callElimStatementCorrect_terminal_call_arm_fail` in `Strata/Transform/CallElimCorrect.lean` with the full proof body (~2,148 LoC) that mirrors the success arm's L1-L3 + L5/L6 plumbing while flag-flipping at L4: - **L1, L2, L3**: identical to success (init segments produce `f = false → false`). - **L4 (asserts)**: `H_asserts_zip_fail` with failing-pre witness. - **L5 (havocs)**: `H_havocs_poly` at `f := true`. - **L6 (assumes)**: `H_assumes_zip_poly` at `f := true`. - **Glue**: `EvalCallElim_glue_fail` composing `false → false → false → false → true`. Failing-pre witness derivation (now unblocked by Path 1): 1. `Hpre_iff` at `flag = true`: `Classical.em` on "all preconditions eval to `tt`" → contrapositive gives `∃ pre ∈ presFiltered, δ σAO pre ≠ some tt`. 2. Bool-totality from new `WFCallSiteProp.preBoolTyped` field (mirrors the `boolTyped` clause on `WFPrePostProp`): `δ σAO pre = some tt ∨ ff`. 3. Combine → `∃ pre, δ σAO pre = some HasBool.ff`. 4. Transport `σAO ↦ σ_old` via `subst_fvars_correct` against `Hsubst_L4`. WF additions in `Strata/Languages/Core/WF.lean`: - `WFCallSiteProp.preBoolTyped` (~line 1156): per-precondition bool-totality. - `WFCallSiteSpec.preBoolTyped` (~line 1207): per-call specialization. - `WFCallSiteProp.specialize` forwards the new field. The exit arm is unchanged — call statements never produce `.exiting`. Sorry count: 19 → 18 (back to baseline). Build clean, no new axioms. ## Polymorphic helpers — all genuinely consumed | Helper | Source | Callsites | | --- | --- | --- | | `EvalCallElim_glue_fail` | `Strata/Transform/CoreTransformProps.lean` | 1 | | `H_asserts_zip_fail` | `Strata/Transform/SubstProps.lean` | 1 | | `H_havocs_poly` | `Strata/Transform/CoreTransformProps.lean` | 1 | | `H_assumes_zip_poly` | `Strata/Transform/CoreTransformProps.lean` | 1 | | `WFPrePostProp.preBoolTyped` | `Strata/Languages/Core/WF.lean` | 1 | Helpers are deliberately single-callsite for now: the polymorphic-f infrastructure is staged to be reusable, but until a second consumer materializes the helpers stay scoped to this proof and we avoid speculative API design. No dormant scaffolding. ## Soundness - **Build**: clean (490 jobs) - **New sorries**: 0 (baseline 18 in `Strata/Languages/Core/WF.lean` are pre-existing and unrelated to this work) - **New admits**: 0 - **New axioms**: 0 - **`#print axioms callElimStatementCorrect_terminal`**: `[propext, Classical.choice, Quot.sound]` — Lean stdlib only, no project-local axioms - **Helper consumption**: all 4 polymorphic helpers + `preBoolTyped` have ≥ 1 live callsite (verified above) - **Path 1 conservativity**: the narrowed `call_sem` iff matches `callElimCmd`'s actual emission rule (asserts only for non-Free precondition components); success-arm reproof is mechanical ## Commit list ``` be35c7f Stage 6: close polymorphic-f failure arm + lift public signature fdf2fc4 Path 1: narrow EvalCommandContract.call_sem iff to non-Free preconditions 45db77d Stage 6 scaffold: lift _terminal to {f : Bool}, delegate failure arm cf10129 Stage 3: extract b1/b2_var_witness_at_oldSubst helpers 5406b08 Stage 2a: extract HoldSubBridge_at_σO helper 580ef2d Stage 1: extract HoldEval_bridge_at_σO helper 193dee2 WIP: polymorphic-f infrastructure (no live consumers) ``` ## Test plan - [x] `lake build` clean across all 490 jobs - [x] Zero new sorries introduced (baseline 18 in `Strata/Languages/Core/WF.lean` preserved) - [x] Zero new admits/axioms introduced - [x] `#print axioms callElimStatementCorrect_terminal` → Lean stdlib only (`[propext, Classical.choice, Quot.sound]`) - [x] Each polymorphic helper genuinely consumed (`EvalCallElim_glue_fail`, `H_asserts_zip_fail`, `H_havocs_poly`, `H_assumes_zip_poly`, `WFPrePostProp.preBoolTyped` — 1 callsite each) - [ ] Reviewer: spot-check Stages 1 / 2a / 3 (580ef2d, 5406b08, cf10129) are pure refactors by diffing helper bodies against the inline blocks they replaced - [ ] Reviewer: confirm Path 1 (fdf2fc4) preserves success-arm soundness — `Hpre_evalTt` re-synthesis over filtered preconditions is the only consumer-side change in the success arm - [ ] Reviewer: confirm Change A in 193dee2 (collapsed `UpdateStates` premise on `call_sem`) preserves the parent branch's `false → false` semantics - [ ] Downstream consumers of `_terminal` (none on `htd/callelim-smallstep-on-main2` today; re-check after rebase if #1306 lands first)

## Summary Round 3 tier-1 cleanups for the polymorphic-`f` callElim work merged in #1340. These commits were authored on top of #1340's head before the squash merge but were not included; this PR brings them onto the parent branch. Net delta: **-287 LoC** (5 files, +181/-468) across 19 small, atomic commits. No new sorries, no new axioms, no behavioral change. ## Highlights | Commit | Description | LoC | |---|---|---| | 2bb66b3 | R3-1: collapse `WFCallSiteProp` into `WFCallSiteSpec` via def-wrapper | -64 | | ca67ea7 | R3-2: drop f=false corollaries; rely on poly versions | -121 | | 680e092 | R3-12: hoist `π/φ/δ` binders into file-scope `variable` block | -10 | | 26bf9ae | R3-10: extract `havocVars_3layer_lift` helper | -23 | | ed3725d | R3-8: add `Procedure.Spec.checkedPreconditions` abbrev | -12 | | 7b6fbb3 | R3-19: simplify HentryFail mem-zip pattern in fail arm | -10 | | 8e3bc3f | R3-32: flatten nested mem_append cases via rcases | -14 | Plus 12 smaller items: bool-totality cleanup, redundant hypothesis drops, helper extractions, doc fixes (stale "seven" counts, removed references to deleted `WFPrePostProp.boolTyped`, fixed inverted section header hierarchy, dropped legacy `Hwfgenst` references). ## Soundness - **Build clean**: 490 jobs, no warnings or errors - **Sorries**: 0 in modified files - **Axioms**: `callElimStatementCorrect` depends only on `[propext, Classical.choice, Quot.sound]` — Lean stdlib only - **No behavioral change**: each commit is a refactor verified by `lake build` ## Test plan - [ ] CI green - [ ] No sorry/axiom regressions vs base

github-actions Bot added the Core label Jun 1, 2026

PROgram52bc marked this pull request as ready for review June 2, 2026 21:56

PROgram52bc requested a review from a team as a code owner June 2, 2026 21:56

github-actions Bot added the Waiting-For-Review label Jun 2, 2026

PROgram52bc requested a review from atomb June 2, 2026 22:19

aqjune-aws reviewed Jun 4, 2026

View reviewed changes

PROgram52bc added 2 commits June 4, 2026 11:57

github-actions Bot added the Git conflicts label Jun 4, 2026

Merge remote-tracking branch 'origin/main2' into htd/callelim-smallst…

8a84e31

…ep-on-main2 # Conflicts: # Strata/Transform/CallElimCorrect.lean

github-actions Bot removed the Git conflicts label Jun 4, 2026

PROgram52bc requested a review from aqjune-aws June 4, 2026 21:28

PROgram52bc added 6 commits June 4, 2026 16:49

Revert "Core: split EvalCommandContract.call_sem into pass/fail arms …

6c87b84

…for precondition" This reverts commit b6904b0.

PROgram52bc mentioned this pull request Jun 7, 2026

CallElimCorrect: lift _terminal to polymorphic {f : Bool} #1340

Merged

9 tasks

PROgram52bc added 3 commits June 8, 2026 08:28

Merge branch 'main2' into htd/callelim-smallstep-on-main2

837cac1

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

CallElimCorrect: rewrite using small-step semantics#1306

CallElimCorrect: rewrite using small-step semantics#1306
PROgram52bc wants to merge 15 commits into
main2from
htd/callelim-smallstep-on-main2

PROgram52bc commented Jun 1, 2026 •

edited

Loading

Uh oh!

aqjune-aws left a comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

PROgram52bc commented Jun 4, 2026 •

edited

Loading

Uh oh!

aqjune-aws commented Jun 4, 2026

Uh oh!

PROgram52bc commented Jun 4, 2026

Uh oh!

aqjune-aws commented Jun 5, 2026

Uh oh!

PROgram52bc commented Jun 5, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

PROgram52bc commented Jun 1, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

What's added

Top-level proof (~10,525 LoC churn)

Reusable infrastructure (~2,747 LoC, new files)

Core semantics adjustments (~30 LoC)

Shared-file helpers (~250 LoC)

Notable commits

Build status

Test plan

Uh oh!

aqjune-aws left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

PROgram52bc commented Jun 4, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

aqjune-aws commented Jun 4, 2026

Uh oh!

PROgram52bc commented Jun 4, 2026

Uh oh!

aqjune-aws commented Jun 5, 2026

Uh oh!

PROgram52bc commented Jun 5, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

PROgram52bc commented Jun 1, 2026 •

edited

Loading

PROgram52bc commented Jun 4, 2026 •

edited

Loading