CallElimCorrect: rewrite using small-step semantics#1306
Conversation
Replaces the deprecated big-step `CallElimCorrect` proof with a
small-step version, plus reusable infrastructure
(`SubstSemantics`, `CoreTransformSemantics`) and shared-file helpers.
The previous file carried a `DEPRECATED` notice flagging it for a
small-step rewrite; this is that rewrite.
## Summary
- `Strata/Transform/CallElimCorrect.lean`: 4597 -> 3814 LoC (-17%).
Same top-level theorem name (`callElimStatementCorrect`); 6 public
declarations preserved.
- New `Strata/Transform/SubstSemantics.lean` (1721 LoC): generic
substitution-semantics infrastructure (used by CallElim; reusable by
future transforms).
- New `Strata/Transform/CoreTransformSemantics.lean` (1151 LoC):
reusable helpers for the Core small-step layer (`singleCmdToStmts`,
`H_havocs`, `H_init`, `H_inits`, `genIdent*` correctness, fresh-name
predicates, etc.).
Net repository delta: +2,345 LoC (6,916 insertions / 4,571 deletions
across 13 files), with the bulk being reusable infrastructure rather
than CallElim-specific. Compared to the original 14,972-LoC big-step
proof, this represents a -8,056 LoC reduction overall.
## Changes to shared files
- `Strata/Languages/Core/StatementSemanticsProps.lean` (+159 LoC): 8
new helpers (`updatedStates_get_notin`, `readValues_get`,
`evalExpressions_get`, `readValues_updatedStatesSame`,
`evalExpressions_isDefined_flatMap`, `initStates_get_notin`, plus
2-/3-layer fall-through variants).
- `Strata/DL/Util/ListUtils.lean` (+60 LoC net): adds
`notin_append4`, `unzip_snd_length`,
`disjoint_of_nodup_append_three`, `nodup_append_three_disjoint`;
promotes existing helpers to `public`.
- `@[expose]` attributes added where needed by the new module-system
reductions:
- `Strata/DL/Imperative/CmdSemantics.lean`: `isNotDefined`,
`substStores`, `substDefined`, `substNodup`, `invStores`,
`substSwap`, `WellFormedSemanticEvalVal`.
- `Strata/DL/Lambda/LExprWF.lean`: `substFvar`, `substFvars`.
- `Strata/DL/Util/StringGen.lean`: `StringGenState.gen`.
- `Strata/Languages/Core/CoreGen.lean`: `CoreGenState.WF`,
`CoreGenState.gen`.
- `Strata/Languages/Core/Procedure.lean`:
`Procedure.Spec.getCheckExprs`, `Procedure.Spec.updateCheckExprs`.
- `Strata/Languages/Core/StatementSemantics.lean`: `updatedState`,
`updatedStates'`, `updatedStates`, `WellFormedCoreEvalTwoState`.
- `Strata/Transform/CoreTransform.lean`: ~17 generator/transform
defs.
- `Strata/Transform/CallElim.lean`: `callElimCmd`, plus an extracted
`oldTyLookupCallElim` helper required for the proof to share
auxiliary `match` definitions across two `do`-blocks.
## Testing
Full repo build green (602 jobs); StrataTest green (617 jobs); zero
warnings.
aqjune-aws
left a comment
There was a problem hiding this comment.
I left a few comments which can be applied orthogonally to our ongoing discussion in a private channel.
One extra question is that, I think the correctness of the top-level statement must also consider the case when a program may 'throw', through Imperative.exit. For example,
call f();
exit "some-block";
Note that the original program exits to "some-block", hence the final state after small-steps isn't ".terminal" (unlike what EvalStmtsSmall states, which is used transitively by EvalStatementsContract). If we apply call-elimination on this,
(omit temporary vars)
assume <pre>;
assert <post>;
exit "some-block";
it will have to exit to the right state.
Address PR #1306 review comment from @aqjune-aws: the previous statement of `callElimStatementCorrect` only covered programs reaching `.terminal`, silently excluding programs whose top-level execution ends in an `.exiting lbl ρ` configuration (e.g., `[call f(); exit "L"]` outside an enclosing block). `EvalStatementsContract` unfolds to `StepStmtStar (.terminal _)`, so such runs failed the hypothesis and were vacuously "covered" by the existing proof. Mirror the conjunctive shape used by `Specification.Overapproximates` (Specification.lean:594) and `TripleBlock` (Specification.lean:211): return `(terminal-arm) ∧ (∀ lbl, exiting-arm)`. Refactor into three theorems for clarity: - `callElimStatementCorrect_terminal` (private, unchanged): the existing call-elim chain via `EvalCallElim_glue`. - `callElimStatementCorrect_exit` (private, new): non-call cases reuse the original `Heval` via `callElimStmt_non_call_eq` (sts = [st]); the call case is vacuously discharged because `step_cmd` only ever yields `.terminal`, so `(.stmts [.cmd (.call ...)] _) →* .exiting lbl _` is unreachable (proven by `seq_reaches_exiting` plus inversion). - `callElimStatementCorrect` (public): assembles the two arms. No label-collision hazard: callElim only introduces fresh temp idents, not block labels, so exit labels transport unchanged. Zero external callers, so the conjunctive signature change is free downstream.
…nticsProps Address PR #1306 review feedback from @aqjune-aws: * Files holding only lemmas (not actual semantics defs) follow the existing repo convention `*SemanticsProps.lean` (e.g., `Strata/Languages/Core/StatementSemanticsProps.lean`): - `Strata/Transform/CoreTransformSemantics.lean` -> `Strata/Transform/CoreTransformSemanticsProps.lean` - `Strata/Transform/SubstSemantics.lean` -> `Strata/Transform/SubstSemanticsProps.lean` Audited `SubstSemantics.lean`'s deps: still requires `Core.Transform.createFvar(s)` and `singletonAssertEval`/ `singletonAssumeEval`, so it stays under `Strata/Transform/`. * Generic lemmas relocated to their proper homes: - Pure list lemmas -> `Strata/DL/Util/ListUtils.lean` (now in `List` namespace): `zip_pair_split`, `nodup_3_decompose`, `notin_3_append_of`, `zip_zip_unzip_snd_of_lengths`, `zip_zip_unzip_fst_unzip_fst_of_lengths`, `zip_notin_fst_pair`, `zip_notin_snd_pair` - Pure-Imperative lemmas -> new `Strata/DL/Imperative/CmdSemanticsProps.lean` (generalized from CoreStore/Expression.Ident to P : PureExpr, now in `Imperative` namespace): `subst_defined_tail`, `subst_nodup_tail` - Core-substitution helper moved out of `CallElimCorrect.lean` into `SubstSemanticsProps.lean`: `substDefined_of_app` * Imports updated: `CallElimCorrect`, `SubstSemanticsProps`, and `Strata/DL/Imperative/Imperative.lean` (added new `CmdSemanticsProps` import).
|
Fixed in 3710bb3. The theorem now has a conjunctive shape, refactored into three theorems: |
…ep-on-main2 # Conflicts: # Strata/Transform/CallElimCorrect.lean
Layer-A small-step semantics for procedure calls now propagates the callee body's terminal hasFailure flag to the caller. Previously, EvalCommand.call_sem hardcoded the output flag to `false`, silently dropping any failures inside callee bodies (a soundness gap inherited from the original semantics). The change exposes a previously-vacuous case in `evalCommand_failure_implies_assert_ff` (StatementSemanticsProps.lean): when call_sem yields `true`, the failing assert lives somewhere inside the callee body, not at the call site itself, so the lemma's existing existential `∃ a, coreIsAtAssert (.stmt (.cmd c) ρ) a ∧ …` cannot be discharged structurally. Closure strategy: a new precondition `CalleesNoFailure π φ` asserts that every procedure body in `π`, run from a non-failing init env, terminates with `hasFailure = false`. This is threaded through both `evalCommand_failure_implies_assert_ff` and `core_noFailure_preserved`. The call_sem arm of the former is then discharged by applying `hCallees` to the body trace bound by call_sem and contradicting the resulting `false` against the `true` index. Downstream: `procBodyVerify_procedureCorrect` (the sole consumer of `core_noFailure_preserved`) takes `CalleesNoFailure` as a new hypothesis. The hypothesis is exactly the body-noFailure clause that clause 2 of `ProcedureCorrect` (this theorem's own conclusion) establishes per procedure, so callers can discharge it inductively across all procedures in `π` once the program is verified. No new sorries, no axioms, no termination admits. Build clean (488 jobs).
|
Thanks for adding the exiting cases. It is good to see them proven :) I have two high-level important comments which will more be related to what is to be proven.
You would have already figured out the issue in
Now coming to This has to be something like this:
I thought |
…icsProps -> SubstProps Address PR #1306 follow-up review comment: the *Semantics qualifier in the prior names is misleading since CoreTransform.lean and SubstSemantics.lean do not define formal semantics — they hold transform machinery and substitution lemmas respectively. The Props-suffix convention should match the root file name without an inserted Semantics qualifier. Strata/Transform/CoreTransformSemanticsProps.lean -> Strata/Transform/CoreTransformProps.lean Strata/Transform/SubstSemanticsProps.lean -> Strata/Transform/SubstProps.lean Updated importers (CallElimCorrect.lean, SubstProps.lean) and one stale docstring reference in CmdSemanticsProps.lean.
Thanks for pointing it out. That makes sense. I will weaken the precondition requirement in
Here is a potential theorem we could prove to connect the two. It is currently commented out because the contract version doesn't create the extra variables in the store, which makes the proof a little more complicated than it needs to be. But thanks for the question -- I will try to look into that next. |
…ondition Per @aqjune-aws review feedback, model the procedure-call contract following the assert/assume distinction in EvalCmd: * Caller asserts the precondition (flag-encoded violation). * Body abstraction = havoc + postcondition assume (stuck on violation). * Caller assumes the postcondition (no flag, only stuckness). Implementation: split the existing single `call_sem` constructor into two arms mirroring `eval_assert_pass` / `eval_assert_fail`: * `call_sem` (renamed in spirit to "pass arm" via docstring) — the precondition holds (∀ pre, … = some tt); execution proceeds through havoc + postcondition assume; conclusion σ' false. Premise list and binders unchanged from the prior single rule, so all downstream destructures (notably CallElimCorrect.lean's `cases Hcc | call_sem` at lines 1393-1398) continue to work verbatim. * `call_sem_pre_fail` (new) — some precondition fails (∃ pre, … = some ff); execution halts at the assert violation; store unchanged (σ' = σ), flag = true. Mirrors eval_assert_fail. Drops post-precondition premises (HavocVars / postcondition assume / output ReadValues / UpdateStates) since execution does not reach those phases. The new fail arm is currently inert in proofs because: * CallElimCorrect's terminal arm extracts `Hcc : EvalCommandContract … σ' false` (failure index pinned to false). Lean's index discrimination during `cases Hcc` auto-discharges the new arm — its conclusion `σ true` cannot unify with the σ' false hypothesis. * core_noFailure_preserved is stated against EvalCommand (concrete semantics), not EvalCommandContract, so it is unaffected. Build clean (488 jobs); zero new sorries; no axioms. Pre-existing sorries unchanged at 22 (only 2 outside comment blocks: ProgramWF and the inactive EvalCallBodyRefinesContract bridge). Diff: Strata/Languages/Core/StatementSemantics.lean +27/-1. Postcondition violations remain stuck-on-violation (no derivation), mirroring `eval_assume`'s single-rule design. The verifier's postcondition discharge happens on the callee side via `procBodyVerify_procedureCorrect` (clause 2 of ProcedureCorrect), not at the contract rule.
…for precondition" This reverts commit b6904b0.
The dormant Core refinement bridge from concrete EvalCommand to abstract EvalCommandContract has carried a long-standing `sorry` inside a `/- ... -/` block-comment at StatementSemanticsProps.lean's EvalCallBodyRefinesContract since commit 74e7be4 (2025-08-08). This commit closes that sorry and brings the leaf bridge theorem out of the comment block. Proof sketch: given concrete `call_sem` premises (body trace `CoreStepStar π φ (.stmts p.body ⟨σAO, δ, false⟩) (.terminal ρ')`, postcondition validity `δ ρ'.store post = some HasBool.tt`, ReadValues from ρ'.store, UpdateStates writeback), construct a matching `EvalCommandContract.call_sem` derivation by: 1. Take σO := updatedStates σAO (ListMap.keys p.header.outputs) modvals where modvals is the concrete-trace ReadValues witness. 2. Discharge `HavocVars σAO outputs σO` by induction on outputs (havoc admits any value; we choose modvals[i] for output i). 3. Discharge `ReadValues σO outputs modvals` directly from σO's definition via `readValues_updatedStatesSame`. 4. Discharge `δ σO post = some HasBool.tt` via congruence: σO and ρ'.store agree at output keys (both = modvals[i] there); under `p_post_scoped`, postcondition free vars are confined to outputs; by `WellFormedSemanticEvalExprCongr`, agreement on FVs implies equal evaluation; transport concrete Hpost to σO. 5. UpdateStates is reused verbatim from the concrete trace. The theorem now takes three new outer hypotheses, each mathematically true and derivable from upstream WF predicates: * `WellFormedSemanticEvalExprCongr δ` — a standard evaluator well-formedness assumption (existing congruence machinery). * `(ListMap.keys p.header.outputs).Nodup` — output parameter names are distinct in the procedure header. * `p_post_scoped : ∀ post ∈ postconditions, getVars post ⊆ p.header.outputs.keys` — postconditions only mention output parameters (a scoping assumption typical for procedure contracts). Implementation note: the destructure `cases H | call_sem` uses the `generalize hf : (false : Bool) = b at H` trick (borrowed from `evalCommand_failure_implies_assert_ff`) because the `false` index cannot auto-substitute under `cases` when ρ' is existentially bound in the constructor. The five dependent transit theorems (EvalCommandRefinesContract, StepStmt_refines_contract, StepStmtStar_refines_contract, EvalStatementsRefinesContract, EvalStatementRefinesContract) remain inside a fresh `/- ... -/` block following the proof, with an explanatory header noting they need the three new outer hypotheses threaded through. This is a follow-up; closing the leaf bridge is the load-bearing step. Build clean (488/488 jobs); zero new sorries; no axioms; no termination admits. Net diff: +109/-12 LoC in Strata/Languages/Core/StatementSemanticsProps.lean.
Unify `EvalCommandContract.call_sem` to be parameterized by a Bool failure flag indicator. The precondition status is connected to `failed` via an iff: the call fails iff some precondition fails to evaluate to `tt` at `σAO`. On the success path (`failed = false`) the result store is produced by havoc + write-back via `UpdateStates`; on the failure path (`failed = true`) the result store is unchanged. Revive the 5 refinement-transit theorems (View B: f = false-only) threading 3 outer hypotheses (`δ_wfCong`, `π_wf` packaging `p_outputs_nodup` and `p_post_scoped`, plus a `h_no_fail` non-failure witness that back-propagates via `StepStmtStar_hasFailure_monotone`): - EvalCommandRefinesContract - StepStmt_refines_contract - StepStmtStar_refines_contract - EvalStatementsRefinesContract - EvalStatementRefinesContract Adjust the `call_sem` destructure in CallElimCorrect.lean to re-synthesize the legacy combined `Hpre` and the success-path `Hupdate` from the new bool-indicator-shaped premises (the destructure site is pinned to `failed = false`).
Resolves two conflicts introduced by upstream commit 0044067 (#1327) "Factor out theorems and lemmas into dedicated Props files": * Strata/Languages/Core/StatementSemanticsProps.lean — drop the local `import all Strata.DL.Imperative.SemanticsProps` (file deleted upstream) and adopt upstream's `Strata.DL.Imperative.StmtSemanticsProps` import, which now re-houses `StepStmtStar_hasFailure_monotone` and friends. * Strata/DL/Imperative/CmdSemanticsProps.lean — add/add conflict. Take upstream's full content (isDefined / isNotDefined / Init/UpdateState / set commutation lemmas) and append the local `subst_defined_tail` and `subst_nodup_tail` helpers, plus the Nodup import they require. Follow-up fixes after the merge: * Strata/Transform/SubstProps.lean and CallElimCorrect.lean — update `import Strata.DL.Lambda.Lambda` references to `Strata.DL.Lambda` for the Lambda.lean → Lambda.lean rename (also from #1327). * Strata/Languages/Core/StatementSemanticsProps.lean — pass explicit `Expression`, `EvalCommand π φ`, `EvalPureFunc φ` arguments to `StepStmtStar_hasFailure_monotone` (the upstream version of the lemma binds these from the surrounding section's `variable` declarations, not as implicits). `lake build` passes cleanly after resolution.
…sition
Audit-driven cleanup of branch htd/callelim-smallstep-on-main2 (no logic
changes; build clean at 490 jobs):
DELETIONS (~327 LoC removed across 6 files):
- Strata/Transform/CoreTransformProps.lean: removed 6 unused theorems
(genOldExprIdentsTrip_snd, genIdentGeneratedWF,
genOldExprIdentsTripGeneratedWF, genOldExprIdentsTrip_isOldTempIdent,
genArgExprIdents_length, genOutExprIdents_length). Each grep-confirmed
zero callers across Strata/.
- Strata/Transform/SubstProps.lean: removed 5-helper dead chain
(createAsserts_list, createAssumes_list, H_check_block, H_asserts,
H_assumes). The *_zip variants are the ones actually used by
CallElimCorrect; these unzipped variants have no consumers.
- Strata/DL/Util/ListUtils.lean: removed
List.nodup_append_three_disjoint (specialization of base lemma; no
callers).
- Strata/Languages/Core/StatementSemanticsProps.lean: removed
10-line commented InitVarsNotDefMonotone' stub (would not typecheck
if uncommented).
COMMENT/DOCSTRING FIXES (~12 sites):
- Strata/Transform/CallElimCorrect.lean:
- line 1091: 'six call-site WF clauses' → 'seven' (WFCallSiteProp
has 7 fields); dropped legacy pointer to Hpre_post_lhs_disj
(no longer in the codebase).
- line 1221: 'two halves of legacy Hwfgenst' → 'three fields'
(WFGenStateProp has 3 including wfgen).
- line 1300: 'see WFCallSiteProp in Strata/Languages/Core/WF.lean'
→ 'see WFCallSiteProp above (line 1095 of this file)'.
- line 1390: 'call_sem hardwires the failure flag to false' →
accurate comment about Hcc being pinned to failed=false at this
destructure (call_sem itself is failure-flag–parameterized).
- Strata/Transform/SubstProps.lean: dropped 5 dangling references to
removed legacy theorems (Lambda.LExpr.substFvarCorrect,
ReadValuesUpdatedStates, etc.).
- Strata/Transform/CoreTransformProps.lean: dropped 2 dangling
references (ReadValuesUpdatedState, EvalExpressionUpdatedState).
- Strata/Languages/Core/StatementSemanticsProps.lean: fixed
'assymmetry' → 'asymmetry' typo.
TODO DISPOSITION:
- Strata/Transform/CallElim.lean:171: replaced bare `/- TODO -/ false`
with explicit comment documenting that the validation hook is
intentionally conservative — call elimination replaces a callee
body with its contract (over-approximation); the hook returns false
until per-obligation proof witnesses are available. Logical value
preserved.
REFINEMENT GENERALIZATION INVESTIGATED (no change):
- Investigated generalizing EvalCallBodyRefinesContract from
`false → false` to a Bool failure-flag-indexed signature. Two
structural obstacles make strict generalization infeasible:
(1) σ' mismatch — concrete EvalCommand.call_sem always writes back
via UpdateStates, while contract EvalCommandContract.call_sem
forces σ' = σ when failed = true.
(2) Precondition iff incompatibility — concrete failed = true
(body asserts fail) carries Hpre saying ALL preconditions hold,
but the contract iff says failed = false IFF all preconditions
hold, so all-pre-hold forces contract failed = false.
The current View B (false-only) signature is therefore retained.
Bridging would require either weakening the contract's failure-path
σ' = σ semantics or adding a procedure-correctness premise; both
are out of scope for this cleanup pass.
Build clean (490 jobs); zero new sorries; no axioms. Working-tree
sorry inventory unchanged from pre-cleanup baseline (all sorries live
in ProgramWF/ProcedureWF/StatementWF/CmdEval — files this cleanup did
not touch).
|
Hi David, thanks for working on this topic - I think that including the refinement proof might be a too big work & interfere with the unstructured CFG. I will be glad to work on the task. |
|
Hi June, Yes. I am still working on adapting the call elimination proof to work with a generic failure flag (the case where preconditions fail). So I can definitely wait for your updates while iterating on that. I think it would be good to reuse some of the definitions in your updates in the current call-elimination proof. As for the refinement proof, I do think it is quite complicated with the failure flag integrated -- if we constrain it to |
## Summary Lifts `callElimStatementCorrect_terminal` from the parent branch's `f = false → false` shape (PR #1306, success-only refinement) to fully polymorphic `{f : Bool}`: a source trace ending in failure maps to a target trace ending in failure, and a source success trace maps to a target success trace, in one statement. Downstream consumers no longer case-split on the failure flag at the call site. This is a foundational refactor + generalization, not a feature. Reviewers should follow the proof structure: Stages 1-3 are pure extractions (success theorem unchanged), Path 1 narrows a public iff for semantic alignment, Stage 6 (split into a scaffold and a fill-in) lands the actual polymorphic-f result. ## Base branch This PR sits on top of `htd/callelim-smallstep-on-main2` (PR #1306, currently open). Merge into that branch, or rebase onto `main` once #1306 lands. ## Why this matters The parent's success-only theorem suffices to argue that call-elimination preserves the *absence* of bugs, but it leaves a dual gap: it does not say anything about preserving the *presence* of bugs. Without the failure arm, callElim could in principle turn a buggy program into one that verifies. Lifting `_terminal` to `{f : Bool}` makes that argument compositional — the same lemma covers both directions, and any future transform layered on top of call-elimination inherits the polymorphic guarantee for free. The failure arm also exercises corners of the contract semantics the success arm never touches: the assert-zip on call-site preconditions, havoc of out-parameters in the failure branch, and the assume-zip on call-site postconditions all have to compose correctly under `hasFailure = true`. Closing it flushed out a real semantic mismatch in `EvalCommandContract.call_sem` (Path 1 below) that was invisible from the success-only side. ## Commit-by-commit walkthrough Read the seven commits in order: ### 1. 193dee2 — WIP: polymorphic-f infrastructure (no live consumers) Forward-looking scaffolding, kept as a separate commit so that the consumer-side proof changes in later commits aren't buried under unrelated infrastructure. Lands in: - `Strata/Languages/Core/StatementSemantics.lean` — **Change A**: collapses the two conditional `Hupdate` premises of `EvalCommandContract.call_sem` into a single unconditional `UpdateStates` premise. The failure path no longer pins `σ' = σ`, which is what makes a polymorphic-f rule shape coherent. - `Strata/Languages/Core/WF.lean` — `WFPrePostProp.boolTyped` clause added (zero migration cost, no live construction sites yet). - `Strata/Transform/CoreTransformProps.lean` and `Strata/Transform/SubstProps.lean` — new helpers `singleCmdToStmts_poly`, `singletonAssumeEval_poly`, `H_havocs_poly`, `H_check_block_zip_poly`, `H_assumes_zip_poly`, `singletonAssertFailEval`, `singletonAssertEval_poly`, plus the substantive new walker `H_asserts_zip_fail` (L4 flag-flip from `false` to `true` via a failing-pre witness) and `EvalCallElim_glue_fail` (composes L1-L6 with the L4 flip). Build clean, sorry-free, axiom-free at this commit. ### 2. 580ef2d — Stage 1: extract `HoldEval_bridge_at_σO` Pure refactor. Lifts the per-index δ-eval bridge for `mkOld` old-variable fvars at the post-havoc store `σO` out of `callElimStatementCorrect_terminal`'s call arm into a private theorem with 11 explicit hypotheses. Net diff in `Strata/Transform/CallElimCorrect.lean`: +114 / -61. Same proof body as the inline version; the signature overhead is intentional because Stage 6 will share this helper between the success and failure arms. Sorry count and axiom set unchanged. ### 3. 5406b08 — Stage 2a: extract `HoldSubBridge_at_σO` Pure refactor in `Strata/Transform/CallElimCorrect.lean`. Lifts the per-fvar bridge for `createOldVarsSubst`'s codomain at the L6 intermediate stores `σ_R1` / `σO` into a private theorem with 6 hypotheses; the inline block becomes a one-line call. Net diff: +82 / -29. Sorry count and axiom set unchanged. `HinputSubBridge` (the larger sibling, originally targeted for Stage 4) is deferred, then ultimately made redundant by the flag-generalization approach (see "Stages 4 & 5: skipped" below). ### 4. cf10129 — Stage 3: extract `b1`/`b2_var_witness_at_oldSubst` Pure refactor in `Strata/Transform/CallElimCorrect.lean`. Lifts the two class-(b) substitution-decomposition witnesses out of the call arm: - `b1_var_witness_at_oldSubst`: when a `Map.find?` lands in the `createOldVarsSubst` segment of `oldSubst_L6`, the codomain entry is a fresh-old `createFvar gen` and the witness `var ∈ getVars w` forces `var = genOldIdents[i]`. - `b2_var_witness_at_oldSubst`: when the lookup misses `createOldVarsSubst` and hits `callElim_inputOnlyOldSubst`, the codomain is a positional `inArgs` element. Both helpers take `proc'` and `args` directly so `callElim_inputOnlyOldSubst` can be referenced by name (needed for `find?_append_none_elim` unification at the consumer). 4 callsites each in `_terminal` (inv + post variants); Stage 6's failure arm reuses the same callsites with identical arguments. Net diff: +123 / -77. ### Stages 4 & 5: skipped Originally planned to extract `HinputSubBridge` and a final shared-plumbing block. The flag-generalization approach made these unnecessary — the call branch in `_terminal` `cases` on `f`, and the `false` arm reuses its existing 2200-LoC body verbatim, so we never needed shared sub-helpers between the two arms. Keeping the inline block on the `false` side was strictly cheaper than extracting it. ### 5. fdf2fc4 — Path 1: narrow `EvalCommandContract.call_sem` iff to non-Free preconditions This is the semantic alignment that makes the failure arm provable. `callElimCmd` only emits asserts for non-`Free` preconditions (Free requires are assumed by the implementation, not checked at call sites). The previous `call_sem` iff ranged over **all** preconditions, which made the polymorphic-f failure arm unattainable: a witness `pre` with `eval ≠ tt` at `σAO` could fall in the Free segment, giving no L4 entry to flip the flag. That mismatch was harmless at `f = false` (the success arm never relies on the iff direction that exposed it) but a hard blocker on the failure arm. Restricting the iff to non-Free preconditions aligned the two semantics without changing any existing proof: - `Strata/Languages/Core/StatementSemantics.lean`: `call_sem` rule's `Hpre_def` and `Hpre_iff` now range over `getCheckExprs (preconditions.filter (·.attr ≠ .Free))`. - `Strata/Languages/Core/StatementSemanticsProps.lean`: `EvalCallBodyRefinesContract` / `EvalCommandRefinesContract` / `StepStmt[Star]_refines_contract` / `EvalStatement[s]RefinesContract` gain `[LawfulBEq Expression.Expr]` instances (needed for the contains↔mem bridge in the new `presFiltered_subset` derivation that powers the rule constructor). - `Strata/Transform/CallElimCorrect.lean`: success arm's `Hpre_evalTt` / `Hpre` re-synthesis ranges over filtered preconditions; `HpreFiltered` no longer needs `filterCheck_in_getCheckExprs` — direct membership in `presFiltered` transports straight to the filtered contains-form. Failure-arm helper signature updated to take `Hpre_def` / `Hpre_iff` at the new filtered shape. Build clean (490 jobs) with one expected sorry at line 1654 (failure-arm body, filled by the next commit). ### 6. 45db77d — Stage 6 scaffold: lift `_terminal` to `{f : Bool}`, delegate failure arm Shape change without filling in the body. In `Strata/Transform/CallElimCorrect.lean`, `callElimStatementCorrect_terminal`'s signature lifts from `f = false → false` to `{f : Bool}`, with `Heval` and the conclusion both at flag `f`. Non-call cases close polymorphically via `nc_close` (`Heval` passes through unchanged for identity statements). The call branch dispatches: ```lean cases f with | false => existing 2200-LoC body verbatim (f=false case) | true => exact callElimStatementCorrect_terminal_call_arm_fail ... ``` The `_terminal_call_arm_fail` helper is declared with full signature (~25 hypotheses from the `cases Hcc` destructure) and a `sorry` body, ready for the final commit. Splitting the signature lift from the proof close means the lifted public signature is reviewable and buildable on its own, and the failure-arm proof body lands against an already-stable interface. Sorry count: 18 → 19 (temporary). ### 7. be35c7f — Stage 6: close polymorphic-f failure arm + lift public signature Replaces the sorry-stubbed `callElimStatementCorrect_terminal_call_arm_fail` in `Strata/Transform/CallElimCorrect.lean` with the full proof body (~2,148 LoC) that mirrors the success arm's L1-L3 + L5/L6 plumbing while flag-flipping at L4: - **L1, L2, L3**: identical to success (init segments produce `f = false → false`). - **L4 (asserts)**: `H_asserts_zip_fail` with failing-pre witness. - **L5 (havocs)**: `H_havocs_poly` at `f := true`. - **L6 (assumes)**: `H_assumes_zip_poly` at `f := true`. - **Glue**: `EvalCallElim_glue_fail` composing `false → false → false → false → true`. Failing-pre witness derivation (now unblocked by Path 1): 1. `Hpre_iff` at `flag = true`: `Classical.em` on "all preconditions eval to `tt`" → contrapositive gives `∃ pre ∈ presFiltered, δ σAO pre ≠ some tt`. 2. Bool-totality from new `WFCallSiteProp.preBoolTyped` field (mirrors the `boolTyped` clause on `WFPrePostProp`): `δ σAO pre = some tt ∨ ff`. 3. Combine → `∃ pre, δ σAO pre = some HasBool.ff`. 4. Transport `σAO ↦ σ_old` via `subst_fvars_correct` against `Hsubst_L4`. WF additions in `Strata/Languages/Core/WF.lean`: - `WFCallSiteProp.preBoolTyped` (~line 1156): per-precondition bool-totality. - `WFCallSiteSpec.preBoolTyped` (~line 1207): per-call specialization. - `WFCallSiteProp.specialize` forwards the new field. The exit arm is unchanged — call statements never produce `.exiting`. Sorry count: 19 → 18 (back to baseline). Build clean, no new axioms. ## Polymorphic helpers — all genuinely consumed | Helper | Source | Callsites | | --- | --- | --- | | `EvalCallElim_glue_fail` | `Strata/Transform/CoreTransformProps.lean` | 1 | | `H_asserts_zip_fail` | `Strata/Transform/SubstProps.lean` | 1 | | `H_havocs_poly` | `Strata/Transform/CoreTransformProps.lean` | 1 | | `H_assumes_zip_poly` | `Strata/Transform/CoreTransformProps.lean` | 1 | | `WFPrePostProp.preBoolTyped` | `Strata/Languages/Core/WF.lean` | 1 | Helpers are deliberately single-callsite for now: the polymorphic-f infrastructure is staged to be reusable, but until a second consumer materializes the helpers stay scoped to this proof and we avoid speculative API design. No dormant scaffolding. ## Soundness - **Build**: clean (490 jobs) - **New sorries**: 0 (baseline 18 in `Strata/Languages/Core/WF.lean` are pre-existing and unrelated to this work) - **New admits**: 0 - **New axioms**: 0 - **`#print axioms callElimStatementCorrect_terminal`**: `[propext, Classical.choice, Quot.sound]` — Lean stdlib only, no project-local axioms - **Helper consumption**: all 4 polymorphic helpers + `preBoolTyped` have ≥ 1 live callsite (verified above) - **Path 1 conservativity**: the narrowed `call_sem` iff matches `callElimCmd`'s actual emission rule (asserts only for non-Free precondition components); success-arm reproof is mechanical ## Commit list ``` be35c7f Stage 6: close polymorphic-f failure arm + lift public signature fdf2fc4 Path 1: narrow EvalCommandContract.call_sem iff to non-Free preconditions 45db77d Stage 6 scaffold: lift _terminal to {f : Bool}, delegate failure arm cf10129 Stage 3: extract b1/b2_var_witness_at_oldSubst helpers 5406b08 Stage 2a: extract HoldSubBridge_at_σO helper 580ef2d Stage 1: extract HoldEval_bridge_at_σO helper 193dee2 WIP: polymorphic-f infrastructure (no live consumers) ``` ## Test plan - [x] `lake build` clean across all 490 jobs - [x] Zero new sorries introduced (baseline 18 in `Strata/Languages/Core/WF.lean` preserved) - [x] Zero new admits/axioms introduced - [x] `#print axioms callElimStatementCorrect_terminal` → Lean stdlib only (`[propext, Classical.choice, Quot.sound]`) - [x] Each polymorphic helper genuinely consumed (`EvalCallElim_glue_fail`, `H_asserts_zip_fail`, `H_havocs_poly`, `H_assumes_zip_poly`, `WFPrePostProp.preBoolTyped` — 1 callsite each) - [ ] Reviewer: spot-check Stages 1 / 2a / 3 (580ef2d, 5406b08, cf10129) are pure refactors by diffing helper bodies against the inline blocks they replaced - [ ] Reviewer: confirm Path 1 (fdf2fc4) preserves success-arm soundness — `Hpre_evalTt` re-synthesis over filtered preconditions is the only consumer-side change in the success arm - [ ] Reviewer: confirm Change A in 193dee2 (collapsed `UpdateStates` premise on `call_sem`) preserves the parent branch's `false → false` semantics - [ ] Downstream consumers of `_terminal` (none on `htd/callelim-smallstep-on-main2` today; re-check after rebase if #1306 lands first)
## Summary Round 3 tier-1 cleanups for the polymorphic-`f` callElim work merged in #1340. These commits were authored on top of #1340's head before the squash merge but were not included; this PR brings them onto the parent branch. Net delta: **-287 LoC** (5 files, +181/-468) across 19 small, atomic commits. No new sorries, no new axioms, no behavioral change. ## Highlights | Commit | Description | LoC | |---|---|---| | 2bb66b3 | R3-1: collapse `WFCallSiteProp` into `WFCallSiteSpec` via def-wrapper | -64 | | ca67ea7 | R3-2: drop f=false corollaries; rely on poly versions | -121 | | 680e092 | R3-12: hoist `π/φ/δ` binders into file-scope `variable` block | -10 | | 26bf9ae | R3-10: extract `havocVars_3layer_lift` helper | -23 | | ed3725d | R3-8: add `Procedure.Spec.checkedPreconditions` abbrev | -12 | | 7b6fbb3 | R3-19: simplify HentryFail mem-zip pattern in fail arm | -10 | | 8e3bc3f | R3-32: flatten nested mem_append cases via rcases | -14 | Plus 12 smaller items: bool-totality cleanup, redundant hypothesis drops, helper extractions, doc fixes (stale "seven" counts, removed references to deleted `WFPrePostProp.boolTyped`, fixed inverted section header hierarchy, dropped legacy `Hwfgenst` references). ## Soundness - **Build clean**: 490 jobs, no warnings or errors - **Sorries**: 0 in modified files - **Axioms**: `callElimStatementCorrect` depends only on `[propext, Classical.choice, Quot.sound]` — Lean stdlib only - **No behavioral change**: each commit is a refactor verified by `lake build` ## Test plan - [ ] CI green - [ ] No sorry/axiom regressions vs base
Summary
This PR replaces the deprecated big-step
CallElimCorrectproof with a small-step version, plus reusable infrastructure (SubstProps,CoreTransformProps) and shared-file helpers. The previous file carried aDEPRECATEDnotice flagging it for a small-step rewrite; this is that rewrite.Top-level public theorem
callElimStatementCorrectis preserved (now a conjunction over a_terminalarm and an_exitingarm), the contract semantics for procedure calls (EvalCommandContract.call_sem) is unified into a single failure-flag–parameterized rule, and the_terminalarm is fully polymorphic over the failure flag ({f : Bool}) — a source success trace maps to a target success trace and a source failure trace maps to a target failure trace, in one statement.Net repository delta vs
main2: +9,277 / -4,861 LoC across 19 files (the bulk being reusable infrastructure rather than CallElim-specific). Compared to the original 14,972-LoC big-step proof, this represents a ~-8k LoC reduction overall, while strictly strengthening what is proved.What's added
The 19 files modified group into 4 layers:
Top-level proof (~10,525 LoC churn)
Strata/Transform/CallElimCorrect.lean: full rewrite against small-step semantics. Same public theorem namecallElimStatementCorrect; structured as_terminal ∧ ∀ lbl, _exiting, mirroringSpecification.OverapproximatesandTripleBlock. The_terminalarm is polymorphic in the failure flag{f : Bool}; the_exitingarm reuses the originalHevalfor non-call cases and is vacuously discharged for the call case (a singleton call statement only ever reaches.terminal, never.exiting).Reusable infrastructure (~2,747 LoC, new files)
Strata/Transform/SubstProps.lean(1,621 LoC, new): generic substitution-semantics infrastructure used by CallElim and reusable by future transforms. IncludesH_asserts_zip/H_assumes_zip/H_check_block_zipwalkers,singletonAssertEval/singletonAssumeEvaland their_polyfailure-flag variants, plusH_asserts_zip_fail(the L4 flag-flip walker that powers the polymorphic-f failure arm).Strata/Transform/CoreTransformProps.lean(1,126 LoC, new): reusable helpers for the Core small-step layer —singleCmdToStmts/_poly,H_havocs/_poly,H_init,H_inits,genIdent*correctness, fresh-name predicates,EvalCallElim_glueandEvalCallElim_glue_fail(composing the L1–L6 segments with the L4 success/failure flip).Core semantics adjustments (~30 LoC)
Strata/Languages/Core/StatementSemantics.lean:EvalCommandContract.call_semunified — previously two separate path shapes (success: havoc + write-back; failure:σ' = σand the precondition iff in the wrong direction); now a single rule parameterized by a Bool failure flag indicator, with the precondition iff narrowed to non-Freepreconditions to matchcallElimCmd's actual emission rule.EvalCommand.call_sempropagates the callee body's terminalhasFailureflag to the caller (previously hardcoded tofalse, silently dropping any failures inside callee bodies).Strata/Languages/Core/StatementSemanticsProps.lean(+297 LoC): 8 new helpers aroundupdatedStates/readValues/evalExpressions/initStates, plus the revived 5-theorem refinement-transit chain (EvalCommandRefinesContract,StepStmt[Star]_refines_contract,EvalStatement[s]RefinesContract) over the unifiedcall_semshape.EvalCallBodyRefinesContractis closed (was a long-standingsorryinside a comment block).Strata/Languages/Core/WF.lean:WFCallSiteProp.preBoolTypedandWFPrePostProp.boolTypedper-precondition Bool-totality clauses, used by the failure-arm to derive a failing-pre witness.Strata/Transform/ProcBodyVerifyCorrect.lean: takesCalleesNoFailure π φas a new hypothesis (every procedure body, run from a non-failing init env, terminates withhasFailure = false), threaded throughcore_noFailure_preserved. The hypothesis is exactly the body-noFailure clause that clause 2 of the theorem's own conclusion (ProcedureCorrect) establishes per procedure, so callers discharge it inductively.Shared-file helpers (~250 LoC)
Strata/Languages/Core/StatementSemanticsProps.lean(above): 8 new shared helpers consumed by both CallElim and its sibling Core proofs.Strata/DL/Util/ListUtils.lean(+150 LoC net): addsnotin_append4,unzip_snd_length,disjoint_of_nodup_append_three,zip_pair_split,nodup_3_decompose,notin_3_append_of,zip_zip_unzip_{snd,fst_unzip_fst}_of_lengths,zip_notin_{fst,snd}_pair; promotes existing helpers topublic.Strata/DL/Imperative/CmdSemanticsProps.lean(new, +33 LoC): generalized pure-Imperative lemmassubst_defined_tail,subst_nodup_tail(P : PureExpr).@[expose]attributes added where required by the new module-system reductions:Strata/DL/Imperative/CmdSemantics.lean:isNotDefined,substStores,substDefined,substNodup,invStores,substSwap,WellFormedSemanticEvalVal.Strata/DL/Lambda/LExprWF.lean:substFvar,substFvars.Strata/DL/Util/StringGen.lean:StringGenState.gen.Strata/Languages/Core/CoreGen.lean:CoreGenState.WF,CoreGenState.gen.Strata/Languages/Core/Procedure.lean:Procedure.Spec.getCheckExprs,Procedure.Spec.updateCheckExprs.Strata/Languages/Core/StatementSemantics.lean:updatedState,updatedStates',updatedStates,WellFormedCoreEvalTwoState.Strata/Transform/CoreTransform.lean: ~17 generator/transform defs.Strata/Transform/CallElim.lean:callElimCmd, plus an extractedoldTyLookupCallElimhelper required for the proof to share auxiliarymatchdefinitions across twodo-blocks.Notable commits
The branch is structured so each landed commit can be reviewed independently:
286c6ce36— initial small-step rewrite (preserves the original_terminal-only theorem atf = false).3710bb339— add the exit-arm tocallElimStatementCorrect(review feedback from @aqjune-aws): mirrorsSpecification.Overapproximatesshape; non-call cases reuse the originalHeval, the call case is vacuously discharged viaseq_reaches_exiting.0eb0342ee— review-feedback rename*Semantics → *SemanticsPropsand relocate generic lemmas toListUtils/CmdSemanticsProps/SubstSemanticsProps.4bfccc31b— integratehasFailureflag intoEvalCommand.call_sem(closes a soundness gap inherited from the original semantics).e679bd560— second review-feedback rename*SemanticsProps → *Props(theSemanticsqualifier was misleading: these files hold transform machinery and substitution lemmas, not semantics defs).44deb44b2— closeEvalCallBodyRefinesContractand revive the dormant Core refinement bridge.ba122da08— unifyEvalCommandContract.call_sem(failure-flag-parameterized) and revive the 5-theorem refinement-transit chain.3daec283b— cleanup pass: 6 unused theorems inCoreTransformProps, 5-helper dead chain inSubstProps, dangling docstring references, TODO disposition. Net -327 LoC.be987fddc— lift_terminalto polymorphic{f : Bool}(PR CallElimCorrect: lift _terminal to polymorphic {f : Bool} #1340, merged into this branch).Build status
lake build: clean (490 jobs)sorryinventory inProgramWF/ProcedureWF/StatementWF/CmdEvalis unchanged; this PR closes one previously commented-outsorryinEvalCallBodyRefinesContract)#print axioms callElimStatementCorrect→[propext, Classical.choice, Quot.sound](Lean stdlib only)Test plan
lake buildclean across all 490 jobs#print axioms callElimStatementCorrectreports stdlib axioms onlycallElimStatementCorrectmatchesSpecification.Overapproximates/TripleBlock_terminalarm polymorphic in{f : Bool}(success and failure traces both covered)Warning: This repository will shortly undergo a split into several separate repositories. If you're creating a PR that crosses the boundaries between these repositories, you may want to hold off until the split is complete or be prepared to rework your PR into multiple PRs once the split is complete.
The code that will be moved includes: