Skip to content

feat(core): Add Sequence.empty<T>() syntax for creating empty sequences#1036

Closed
atomb wants to merge 4 commits into
mainfrom
atomb/fix-issue-1027
Closed

feat(core): Add Sequence.empty<T>() syntax for creating empty sequences#1036
atomb wants to merge 4 commits into
mainfrom
atomb/fix-issue-1027

Conversation

@atomb

@atomb atomb commented Apr 23, 2026

Copy link
Copy Markdown
Contributor

Enable parsing Sequence.empty from .st files by adding a DDM fn declaration with an explicit type parameter. Previously, the 0-ary polymorphic function could not be parsed because the DDM grammar had no value arguments to infer the type parameter from.

Users now write Sequence.empty() instead of the workaround of declaring a variable and assuming its length is zero.

  • Grammar.lean: Add fn seq_empty with explicit Type argument
  • Translate.lean: Add translation case for seq_empty
  • FormatCore.lean: Format Sequence.empty back to () syntax
  • Factory.lean: Update seqEmptyFunc comment (now parseable)
  • Seq.lean: Add verification tests for the new syntax

Addresses but does not completely resolve #1027

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

Enable parsing Sequence.empty from .st files by adding a DDM fn
declaration with an explicit type parameter. Previously, the 0-ary
polymorphic function could not be parsed because the DDM grammar had
no value arguments to infer the type parameter from.

Users now write Sequence.empty<int>() instead of the workaround of
declaring a variable and assuming its length is zero.

- Grammar.lean: Add fn seq_empty with explicit Type argument
- Translate.lean: Add translation case for seq_empty
- FormatCore.lean: Format Sequence.empty back to <T>() syntax
- Factory.lean: Update seqEmptyFunc comment (now parseable)
- Seq.lean: Add verification tests for the new syntax

Resolves #1027
aqjune-aws
aqjune-aws previously approved these changes May 6, 2026
MikaelMayer
MikaelMayer previously approved these changes May 6, 2026

@tautschnig tautschnig left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See comment above.

@tautschnig tautschnig left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice, focused fix that lifts a real user-facing restriction on Sequence.empty. The happy path is well exercised via the #eval verify test (parse → translate → format → typecheck → VC → SMT).

Before approving I'd like to see:

  • Tighter behavior in handleZeroaryOps for the .seq .Empty fallback (see inline).
  • Slightly decoupled pattern match using the existing seqTy match_pattern helper.
  • Broader test coverage: non-int element types (bool, string), nested sequences, and Sequence.empty<Map _ _>(). I verified manually that all three work with the current code, but the regressions would slip past CI with only the current int test.
  • A few tiny doc / commit-message nits (commit says "Resolves #1027", PR body / issue scope say it only addresses half of it).
  • Some proof / #guard coverage of the translation shape (see inline on Translate.lean).

Refactoring notes:

  • handleZeroaryOps is now asymmetric — only one of four cases consumes opTy. If we ever gain another 0-ary polymorphic op, growing this match further will get ugly. Worth a small helper that normalizes opTy → element type where relevant.
  • The silent CoreType.tvar default unknownTypeVar fallback mirrors the other unknown-type paths in this file, but elsewhere those paths call ToCSTM.logError. Sticking to that convention here would help catch well-typed-but-malformed inputs.

Comment thread Strata/Languages/Core/DDMTransform/FormatCore.lean
Comment thread Strata/Languages/Core/DDMTransform/FormatCore.lean
Comment thread Strata/Languages/Core/DDMTransform/Grammar.lean
let ety ← translateLMonoTy bindings _atp
let fn : LExpr Core.CoreLParams.mono :=
Core.coreOpExpr (.seq .Empty)
(.some (Core.seqTy ety))

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would you consider adding a small #guard-style test (in StrataTest/Languages/Core/Tests/CoreOpTests.lean or a new TranslateTests.lean) that exercises the expected structural shape of this translation for a couple of element types? Something along the lines of:

/-- The translation of `Sequence.empty<T>()` produces a `.seq .Empty` operator
    annotated with `Sequence T`. -/
#guard
  match translateExpr (mkTestPgm "Sequence.empty<int>()") with
  | .ok (.op _ name (.some ty)) =>
    name.name == "Sequence.empty" && ty == Core.seqTy .int
  | _ => false

The #eval verify test in Seq.lean exercises this path end-to-end, but a direct structural test would catch regressions (missing seqTy wrapper, wrong kind tag, etc.) without relying on an SMT call.

For an actual theorem, the nice property to prove is the round-trip with FormatCore: for any T in the Core grammar, format ∘ translate on Sequence.empty<T>() is the identity (up to formatting whitespace). That would tie the three pieces of this PR together.

Comment thread Strata/Languages/Core/Factory.lean Outdated
Obligation: build_on_empty_elem
Property: assert
Result: ✅ pass
-/

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Test coverage is narrow — only Sequence.empty<int>(). I verified manually that each of the following also works end-to-end with the PR as-is, so please extend the test to include them (otherwise we have no regression protection for these cases):

  1. Non-int primitive element: Sequence.empty<bool>()
  2. Nested sequences: Sequence.empty<Sequence int>()
  3. Compound element type: Sequence.empty<Map int bool>()

Example follow-up test (feel free to consolidate into one procedure):

Suggested change
-/
#eval verify seqEmptyPgm
----------------------------------------------------------------------
-- Exercise various element types for Sequence.empty<T>().
private def seqEmptyTypesPgm :=
#strata
program Core;
procedure SeqEmptyTypes()
{
var sb : Sequence bool;
var sss : Sequence (Sequence int);
var smi : Sequence (Map int bool);
sb := Sequence.empty<bool>();
sss := Sequence.empty<Sequence int>();
smi := Sequence.empty<Map int bool>();
assert [bool_len]: Sequence.length(sb) == 0;
assert [seq_seq_len]: Sequence.length(sss) == 0;
assert [seq_map_len]: Sequence.length(smi) == 0;
};
#end
/-- info: true -/
#guard_msgs in
#eval TransM.run Inhabited.default (translateProgram seqEmptyTypesPgm) |>.snd |>.isEmpty

Also worth adding: a negative test showing that Sequence.empty() (no type argument) is rejected at parse time, so we can't silently regress to the old polymorphism-inference behavior if the grammar rule is ever weakened.

Related proof suggestion for SeqModel.lean: you already prove length_empty : ([] : List α).length = 0. It would be nice to extend the axiom-model block with a short lemma tying Sequence.build(Sequence.empty, v) to [v], matching the build_on_empty_* obligations this PR's test introduces.

#eval verify seqOpsPgm

---------------------------------------------------------------------

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that the PR-description / commit-message wording is inconsistent with the actual scope: the commit message on 271fb5d68 says "Resolves #1027", but the PR body correctly says "Addresses but does not completely resolve #1027" (the Map part of the issue is not touched). Would you mind amending the commit message to match, so that squash-merging the PR doesn't auto-close #1027?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't squash-merging use the PR description as the commit message? I don't think it includes messages from the intermediate commits on the branch.

Fix pattern match in handleZeroaryOps to use Core.seqTy instead of raw
tcons string matching for Sequence.empty element type extraction. Add
error logging when the op-type annotation is missing or malformed.

- Use structured Core.seqTy pattern in FormatCore.lean
- Add ToCSTM.logError for the fallback path
- Add comment in Grammar.lean explaining why seq_empty needs explicit
  type syntax
- Clarify seqEmptyFunc doc comment in Factory.lean
- Add test exercising Sequence.empty with bool, nested Sequence, and
  Map element types
@atomb atomb dismissed stale reviews from MikaelMayer and aqjune-aws via f22f601 May 7, 2026 22:59
// Unlike other seq_* ops, seq_empty has no value arguments from which DDM can
// infer the element type, so the type argument must be explicit in surface syntax.
fn seq_empty (A : Type) : Sequence A =>
"Sequence.empty" "<" A ">" "(" ")";

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need also the roundtrip of pretty-printing it that way

// the type parameter from). The Factory definition exists for programmatic use.
// Unlike other seq_* ops, seq_empty has no value arguments from which DDM can
// infer the element type, so the type argument must be explicit in surface syntax.
fn seq_empty (A : Type) : Sequence A =>

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR's functionality is also implemented in #1100 but could be merged before.

MikaelMayer added a commit that referenced this pull request May 18, 2026
Cherry-pick the verification tests (seqEmptyPgm, seqEmptyTypesPgm)
that exercise Sequence.empty with the SMT solver, covering basic usage
and various element types (bool, nested Sequence, Map).
MikaelMayer pushed a commit that referenced this pull request May 18, 2026
Cherry-pick the verification tests (seqEmptyPgm, seqEmptyTypesPgm)
that exercise Sequence.empty with the SMT solver, covering basic usage
and various element types (bool, nested Sequence, Map).

Co-authored-by: Aaron Tomb <aarotomb@amazon.com>
@MikaelMayer

Copy link
Copy Markdown
Contributor

Everything unique in this PR was imported into #1100 which was just merged

keyboardDrummer pushed a commit to keyboardDrummer/Strata that referenced this pull request May 20, 2026
…structs (strata-org#1165)

Fixes strata-org#1158
Supersedes strata-org#1036

## Summary

Fixes several issues with `Core.formatProgram` that prevented round-trip
parsing:

1. **`inline function` formatted without space** — The `inline` grammar
op now includes a trailing space, preventing `inlinefunction`
concatenation. The formatter also now emits the `inline` attribute when
present on functions.

2. **Quantifier variable names** — The formatter now uses the
`prettyName` field from `LExpr.quant` instead of generating `__qN`
names, preserving the original variable names through formatting.

3. **Overflow predicates** — Added grammar entries, formatter handling,
parser entries, and Factory operations for all bitvector overflow
predicates (`SNegOverflow`, `UNegOverflow`, `SAddOverflow`,
`SSubOverflow`, `SMulOverflow`, `SDivOverflow`, `UAddOverflow`,
`USubOverflow`, `UMulOverflow`). `SNegOverflow` and `UNegOverflow` use
distinct grammar entries (`Bv.SNegOverflow` / `Bv.UNegOverflow`) to
preserve their different semantics through roundtrip. The translate
direction correctly dispatches overflow ops by type parameter
(bv1/bv8/bv16/bv32/bv64), logging an error for unsupported widths.

4. **Array assignment (`lhsArray`)** — Implemented `m[k] := v`
translation in both directions: the parser decomposes nested LHS into
base identifier + indices and builds `map_update` expressions; the
formatter detects the `map_update(var, idx, val)` pattern and produces
`lhsArray` syntax.

5. **`Sequence.empty`** — Added grammar entry with explicit type
annotation syntax (`Sequence.empty<int>()`), plus translate and format
handling, resolving the 0-ary polymorphic function limitation for this
operation. Includes solver verification tests (cherry-picked from strata-org#1036)
covering basic usage and various element types.

6. **Datatype roundtrip** — Verified that datatype formatting roundtrips
correctly (the extra-parentheses issue noted earlier is resolved).

7. **Roundtrip test infrastructure** — Added `RoundtripTest.lean` that
verifies parse → format → re-parse → re-format produces stable output
for types, functions, procedures, inline functions, parameterized type
arguments, datatypes, array assignments, and `Sequence.empty`.

8. **`getLFuncCall` reuse** — Refactored `lappToExpr` and
`decomposeMapUpdate` to use `getLFuncCall` for decomposing nested
applications, reducing code duplication with the existing utility.

## Testing

- All existing tests pass (with expected output updates for the
formatting improvements)
- New roundtrip tests verify types, functions, procedures, inline
functions, parameterized type arguments, datatypes, lhsArray, and
Sequence.empty
- Solver verification tests for Sequence.empty (from strata-org#1036) pass

## Remaining limitations

- Bitvector operations with widths beyond 1/8/16/32/64 log an error and
fall back to bv64 (requires grammar-level type entries per width)

---------

Co-authored-by: Aaron Tomb <aarotomb@amazon.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Make creating empty sequences and maps easier

4 participants