feat(core): Add Sequence.empty<T>() syntax for creating empty sequences#1036
feat(core): Add Sequence.empty<T>() syntax for creating empty sequences#1036atomb wants to merge 4 commits into
Conversation
Enable parsing Sequence.empty from .st files by adding a DDM fn declaration with an explicit type parameter. Previously, the 0-ary polymorphic function could not be parsed because the DDM grammar had no value arguments to infer the type parameter from. Users now write Sequence.empty<int>() instead of the workaround of declaring a variable and assuming its length is zero. - Grammar.lean: Add fn seq_empty with explicit Type argument - Translate.lean: Add translation case for seq_empty - FormatCore.lean: Format Sequence.empty back to <T>() syntax - Factory.lean: Update seqEmptyFunc comment (now parseable) - Seq.lean: Add verification tests for the new syntax Resolves #1027
tautschnig
left a comment
There was a problem hiding this comment.
Nice, focused fix that lifts a real user-facing restriction on Sequence.empty. The happy path is well exercised via the #eval verify test (parse → translate → format → typecheck → VC → SMT).
Before approving I'd like to see:
- Tighter behavior in
handleZeroaryOpsfor the.seq .Emptyfallback (see inline). - Slightly decoupled pattern match using the existing
seqTymatch_patternhelper. - Broader test coverage: non-
intelement types (bool,string), nested sequences, andSequence.empty<Map _ _>(). I verified manually that all three work with the current code, but the regressions would slip past CI with only the currentinttest. - A few tiny doc / commit-message nits (commit says "Resolves #1027", PR body / issue scope say it only addresses half of it).
- Some proof /
#guardcoverage of the translation shape (see inline onTranslate.lean).
Refactoring notes:
handleZeroaryOpsis now asymmetric — only one of four cases consumesopTy. If we ever gain another 0-ary polymorphic op, growing this match further will get ugly. Worth a small helper that normalizesopTy→ element type where relevant.- The silent
CoreType.tvar default unknownTypeVarfallback mirrors the other unknown-type paths in this file, but elsewhere those paths callToCSTM.logError. Sticking to that convention here would help catch well-typed-but-malformed inputs.
| let ety ← translateLMonoTy bindings _atp | ||
| let fn : LExpr Core.CoreLParams.mono := | ||
| Core.coreOpExpr (.seq .Empty) | ||
| (.some (Core.seqTy ety)) |
There was a problem hiding this comment.
Would you consider adding a small #guard-style test (in StrataTest/Languages/Core/Tests/ — CoreOpTests.lean or a new TranslateTests.lean) that exercises the expected structural shape of this translation for a couple of element types? Something along the lines of:
/-- The translation of `Sequence.empty<T>()` produces a `.seq .Empty` operator
annotated with `Sequence T`. -/
#guard
match translateExpr (mkTestPgm "Sequence.empty<int>()") with
| .ok (.op _ name (.some ty)) =>
name.name == "Sequence.empty" && ty == Core.seqTy .int
| _ => falseThe #eval verify test in Seq.lean exercises this path end-to-end, but a direct structural test would catch regressions (missing seqTy wrapper, wrong kind tag, etc.) without relying on an SMT call.
For an actual theorem, the nice property to prove is the round-trip with FormatCore: for any T in the Core grammar, format ∘ translate on Sequence.empty<T>() is the identity (up to formatting whitespace). That would tie the three pieces of this PR together.
| Obligation: build_on_empty_elem | ||
| Property: assert | ||
| Result: ✅ pass | ||
| -/ |
There was a problem hiding this comment.
Test coverage is narrow — only Sequence.empty<int>(). I verified manually that each of the following also works end-to-end with the PR as-is, so please extend the test to include them (otherwise we have no regression protection for these cases):
- Non-
intprimitive element:Sequence.empty<bool>() - Nested sequences:
Sequence.empty<Sequence int>() - Compound element type:
Sequence.empty<Map int bool>()
Example follow-up test (feel free to consolidate into one procedure):
| -/ | |
| #eval verify seqEmptyPgm | |
| ---------------------------------------------------------------------- | |
| -- Exercise various element types for Sequence.empty<T>(). | |
| private def seqEmptyTypesPgm := | |
| #strata | |
| program Core; | |
| procedure SeqEmptyTypes() | |
| { | |
| var sb : Sequence bool; | |
| var sss : Sequence (Sequence int); | |
| var smi : Sequence (Map int bool); | |
| sb := Sequence.empty<bool>(); | |
| sss := Sequence.empty<Sequence int>(); | |
| smi := Sequence.empty<Map int bool>(); | |
| assert [bool_len]: Sequence.length(sb) == 0; | |
| assert [seq_seq_len]: Sequence.length(sss) == 0; | |
| assert [seq_map_len]: Sequence.length(smi) == 0; | |
| }; | |
| #end | |
| /-- info: true -/ | |
| #guard_msgs in | |
| #eval TransM.run Inhabited.default (translateProgram seqEmptyTypesPgm) |>.snd |>.isEmpty |
Also worth adding: a negative test showing that Sequence.empty() (no type argument) is rejected at parse time, so we can't silently regress to the old polymorphism-inference behavior if the grammar rule is ever weakened.
Related proof suggestion for SeqModel.lean: you already prove length_empty : ([] : List α).length = 0. It would be nice to extend the axiom-model block with a short lemma tying Sequence.build(Sequence.empty, v) to [v], matching the build_on_empty_* obligations this PR's test introduces.
| #eval verify seqOpsPgm | ||
|
|
||
| --------------------------------------------------------------------- | ||
|
|
There was a problem hiding this comment.
Note that the PR-description / commit-message wording is inconsistent with the actual scope: the commit message on 271fb5d68 says "Resolves #1027", but the PR body correctly says "Addresses but does not completely resolve #1027" (the Map part of the issue is not touched). Would you mind amending the commit message to match, so that squash-merging the PR doesn't auto-close #1027?
There was a problem hiding this comment.
Doesn't squash-merging use the PR description as the commit message? I don't think it includes messages from the intermediate commits on the branch.
Fix pattern match in handleZeroaryOps to use Core.seqTy instead of raw tcons string matching for Sequence.empty element type extraction. Add error logging when the op-type annotation is missing or malformed. - Use structured Core.seqTy pattern in FormatCore.lean - Add ToCSTM.logError for the fallback path - Add comment in Grammar.lean explaining why seq_empty needs explicit type syntax - Clarify seqEmptyFunc doc comment in Factory.lean - Add test exercising Sequence.empty with bool, nested Sequence, and Map element types
| // Unlike other seq_* ops, seq_empty has no value arguments from which DDM can | ||
| // infer the element type, so the type argument must be explicit in surface syntax. | ||
| fn seq_empty (A : Type) : Sequence A => | ||
| "Sequence.empty" "<" A ">" "(" ")"; |
There was a problem hiding this comment.
We need also the roundtrip of pretty-printing it that way
| // the type parameter from). The Factory definition exists for programmatic use. | ||
| // Unlike other seq_* ops, seq_empty has no value arguments from which DDM can | ||
| // infer the element type, so the type argument must be explicit in surface syntax. | ||
| fn seq_empty (A : Type) : Sequence A => |
There was a problem hiding this comment.
This PR's functionality is also implemented in #1100 but could be merged before.
Cherry-pick the verification tests (seqEmptyPgm, seqEmptyTypesPgm) that exercise Sequence.empty with the SMT solver, covering basic usage and various element types (bool, nested Sequence, Map).
Cherry-pick the verification tests (seqEmptyPgm, seqEmptyTypesPgm) that exercise Sequence.empty with the SMT solver, covering basic usage and various element types (bool, nested Sequence, Map). Co-authored-by: Aaron Tomb <aarotomb@amazon.com>
|
Everything unique in this PR was imported into #1100 which was just merged |
…structs (strata-org#1165) Fixes strata-org#1158 Supersedes strata-org#1036 ## Summary Fixes several issues with `Core.formatProgram` that prevented round-trip parsing: 1. **`inline function` formatted without space** — The `inline` grammar op now includes a trailing space, preventing `inlinefunction` concatenation. The formatter also now emits the `inline` attribute when present on functions. 2. **Quantifier variable names** — The formatter now uses the `prettyName` field from `LExpr.quant` instead of generating `__qN` names, preserving the original variable names through formatting. 3. **Overflow predicates** — Added grammar entries, formatter handling, parser entries, and Factory operations for all bitvector overflow predicates (`SNegOverflow`, `UNegOverflow`, `SAddOverflow`, `SSubOverflow`, `SMulOverflow`, `SDivOverflow`, `UAddOverflow`, `USubOverflow`, `UMulOverflow`). `SNegOverflow` and `UNegOverflow` use distinct grammar entries (`Bv.SNegOverflow` / `Bv.UNegOverflow`) to preserve their different semantics through roundtrip. The translate direction correctly dispatches overflow ops by type parameter (bv1/bv8/bv16/bv32/bv64), logging an error for unsupported widths. 4. **Array assignment (`lhsArray`)** — Implemented `m[k] := v` translation in both directions: the parser decomposes nested LHS into base identifier + indices and builds `map_update` expressions; the formatter detects the `map_update(var, idx, val)` pattern and produces `lhsArray` syntax. 5. **`Sequence.empty`** — Added grammar entry with explicit type annotation syntax (`Sequence.empty<int>()`), plus translate and format handling, resolving the 0-ary polymorphic function limitation for this operation. Includes solver verification tests (cherry-picked from strata-org#1036) covering basic usage and various element types. 6. **Datatype roundtrip** — Verified that datatype formatting roundtrips correctly (the extra-parentheses issue noted earlier is resolved). 7. **Roundtrip test infrastructure** — Added `RoundtripTest.lean` that verifies parse → format → re-parse → re-format produces stable output for types, functions, procedures, inline functions, parameterized type arguments, datatypes, array assignments, and `Sequence.empty`. 8. **`getLFuncCall` reuse** — Refactored `lappToExpr` and `decomposeMapUpdate` to use `getLFuncCall` for decomposing nested applications, reducing code duplication with the existing utility. ## Testing - All existing tests pass (with expected output updates for the formatting improvements) - New roundtrip tests verify types, functions, procedures, inline functions, parameterized type arguments, datatypes, lhsArray, and Sequence.empty - Solver verification tests for Sequence.empty (from strata-org#1036) pass ## Remaining limitations - Bitvector operations with widths beyond 1/8/16/32/64 log an error and fall back to bv64 (requires grammar-level type entries per width) --------- Co-authored-by: Aaron Tomb <aarotomb@amazon.com>
Enable parsing Sequence.empty from .st files by adding a DDM fn declaration with an explicit type parameter. Previously, the 0-ary polymorphic function could not be parsed because the DDM grammar had no value arguments to infer the type parameter from.
Users now write Sequence.empty() instead of the workaround of declaring a variable and assuming its length is zero.
Addresses but does not completely resolve #1027
By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.