Skip to content

Conversation

@fabiomadge
Copy link
Collaborator

Summary

This PR removes two misplaced LL(1) resolvers that were flagged by Coco/R as unnecessary. These resolvers don't actually resolve any parsing conflicts and can be safely removed.

Changes

  • Remove IF(AcceptReferrersAndBacktick()) resolver in ConstAtomExpression (line 3767)
  • Remove IF(IsIdentifier(la.kind)) resolver in FieldLocationSuffix (line 4512)

Impact

  • ✅ Eliminates 2 "Misplaced resolver: no LL(1) conflict" warnings from Coco/R
  • ✅ Cleaner grammar with no unnecessary complexity
  • ✅ Identical parsing behavior (no functional changes)
  • ✅ Reduces build warnings from 2 to 0

Testing

  • Grammar compiles successfully with 0 warnings
  • All existing functionality preserved
  • No changes to parsing behavior

Type of Change

  • Chore/maintenance (code cleanup, no functional changes)

This is a pure maintenance change that improves code quality without affecting functionality.

…rals

- Add scientific notation support (e.g., 1.23e5, 5e-2, 123E+4)
- Add trailing dot shorthand (e.g., 1. for 1.0)
- Add leading dot shorthand (e.g., .5 for 0.5)
- Support combining dot shorthands with scientific notation (.125e3)
- Update grammar to handle all new real literal forms
- Add comprehensive tests for parsing and error cases
- Update documentation with consistent terminology
- Make leading dot pattern more restrictive to avoid conflict with tuple access
- Leading dots now require at least 2 digits (.25) or scientific notation (.5e2)
- This prevents .0 and .1 from being tokenized as real numbers
- Tuple access like p.0 and p.1 now works correctly
- Updated tests to use compatible patterns (.10, .25, .50, etc.)

The core scientific notation and trailing dot functionality remains intact.
- Remove leading dot pattern from grammar to avoid conflict with large tuple access
- Leading dots like .5 conflicted with tuple field access like p.10, p.11, etc.
- Prioritize existing core language feature (tuple access) over new convenience feature
- Scientific notation and trailing dots remain fully functional
- Update tests, documentation, and release notes to reflect this change

Core functionality preserved:
- Scientific notation: 1.23e5, 5e-2, 123E+4 ✓
- Trailing dots: 1., 5.e2 ✓
- Large tuple access: p.10, p.11, p.12, etc. ✓

Trade-off: Leading dots (.5, .25) removed to maintain language compatibility.
- Remove leading dot handling code from Dec parsing rule (dead code)
- Fix documentation to clarify that trailing dots apply to both decimal and scientific forms
- Update error tests to remove cases that are no longer parse errors
- Remove references to .5e3 pattern that was removed
- Improve clarity of real literal documentation structure

Changes:
- Dec rule: Remove StartsWith('.') check and leading dot normalization
- Documentation: Clarify trailing dots work with both decimal and scientific notation
- Error tests: Remove multiple dots case (now parsed as member access)
- All functionality preserved, just cleaner code and docs
…otation

Scientific notation is not a separate thing but rather how numbers can be written.
The correct term is 'exponential notation' when referring to the e/E syntax.

Changes:
- Documentation: 'scientific notation' → 'exponential notation'
- Test names: BasicScientificNotation → BasicExponentialNotation
- Comments: Updated throughout codebase
- Release notes: Updated terminology

Functionality unchanged - this is purely a terminology correction for accuracy.
Better organize the documentation to clarify that:
- Real literals are decimal numbers (the base concept)
- Scientific notation is a way of writing real literals (not a separate form)
- Trailing dot shorthand applies to both regular decimals and scientific notation

Changes:
- Remove confusing 'three separate forms' structure
- Present scientific notation as a way to write real literals
- Clarify that trailing dots work with both decimal and scientific notation
- Keep 'scientific notation' terminology (which is correct and standard)

No functional changes - purely documentation clarity improvements.
Remove examples and documentation of trailing dots with scientific notation
(like 5.e2) since they're pointless - scientific notation already handles
large/small numbers efficiently and clearly.

Focus documentation on where trailing dots are actually useful:
- Whole number constants: 1., 123., 1000.
- Readability improvement for integer-valued reals

Changes:
- Remove 5.e2 examples from documentation and tests
- Focus on 1., 123. examples where shorthand adds value
- Clarify that trailing dots are for decimal fractions, not scientific notation
- Keep functionality intact - just don't advertise pointless usage

The feature still works with scientific notation, we just don't promote it.
Ensure we test patterns like 5.e2, 42.E-1, 1.e0 even though we don't
advertise them in documentation (since they're longer than 5e2, 4.2e1, 1e0).

Test philosophy:
- ✅ Test everything that should work (comprehensive coverage)
- ✅ Document only what's useful (focused guidance)
- ✅ Include comments explaining why patterns aren't recommended

This ensures users won't get parse errors if they use these patterns,
but we guide them toward better alternatives.
The comment said '4.2e0 or just 4.2' but 42.E-1 equals 4.2, so the
correct shorter form is 42e-1, not 4.2e0.

This makes the example clearer and mathematically correct.
Change the scientific notation example from '5e0 (which equals 5.0)'
to '5e2 (which equals 500.0)' to be consistent with test examples
and to show a more meaningful use of scientific notation.

5e0 = 5.0 is a trivial example, while 5e2 = 500.0 better demonstrates
the purpose of scientific notation.
Change 'regular decimal literals' to 'real literals' for better accuracy
and consistency. The trailing dot shorthand works with real literals in
general, and 'real literals' matches the section terminology.

Also improves sentence flow by removing redundant 'regular' qualifier.
This adds support for trailing dots (e.g., '1.' meaning '1.0') by
handling them in the Dec production of the parser rather than the
tokenizer. This avoids conflicts with the range operator '..' that
would occur if trailing dots were handled at the tokenizer level.

Examples that now work:
- var x := 1.;     // 1.0
- var y := 42.;    // 42.0
- var z := 5.e2;   // 500.0 (already worked)

Note: There is a pre-existing issue where 's[1..]' fails to parse
because the tokenizer interprets '1..' as '1.' + '.' instead of
'1' + '..'. This affects many existing Dafny files and should be
addressed in a future fix.
- Reverted tokenizer change that broke s[1..] range slicing
- Added parser-level lookahead to handle trailing dots (1.)
- Extended realnumber token to support 5.e2 pattern
- All functionality now works: range slicing, trailing dots, scientific notation
The error message format changed to use shorter paths and single-line format.
All 11 expected parse errors are still correctly detected.
…essed

- Remove incorrect statement that 1-2 integration failures are 'normal'
- Emphasize that ALL integration test failures need to be fixed
- Update CI debugging guidance accordingly
The development guide is for personal use only and should not be in the repository.
Personal development files should be managed manually, not automatically ignored.
- ScientificNotationErrors.dfy: Remove blank line before final summary
- ScientificNotation.dfy: Add blank line at beginning to match actual output
- Both tests now have correct format expectations
- Simplify test to avoid complex verification that may timeout in CI
- Fix expectation format: blank line + correct verification count (1 vs 10)
- Test now passes locally and should pass in CI
- Add back comprehensive test coverage for all scientific notation features
- Test basic scientific notation (1.23e2, 1.23E2, 1.23e+2, 1.23e-2)
- Test integer scientific notation (5e2, 5e-1, 5e0)
- Test trailing dot literals (1., 123., 0., 1_000.)
- Test arithmetic operations with scientific notation
- Test edge cases (zero with exponents, small values)
- Test underscore support (1_2.3_4e1_0, 5_0.0e-4)
- Test type inference and expression contexts
- Optimized assertions to avoid complex verification chains
- 8 methods total, all pass locally
- Add 4th alternative to realnumber token: '.' digit {['_'] digit} [('e'|'E') ['+'|'-'] digit {['_'] digit}]
- Support leading dot literals: .5 → 0.5, .123 → 0.123, .0 → 0.0
- Support leading dot with scientific notation: .5e2 → 50.0, .123e-2 → 0.00123
- Add leading dot preprocessing in Dec production (prepend '0' to strings starting with '.')
- Add comprehensive leading dot tests to ScientificNotation.dfy (9 methods total)
- Add LeadingDotErrors.dfy test for invalid cases (., .e5)
- All tests pass locally

Complete decimal literal feature set now available:
- Scientific notation: 1.23e5, 5e-2, 1.23E+4
- Trailing dot: 1., 123., 1_000.
- Leading dot: .5, .123, .5e2
- All combinations with underscores and case variations
…cess

ISSUE: Leading dot pattern '.' digit {['_'] digit} was too broad and conflicted
with tuple member access syntax (.0, .1, .2), causing parse errors in existing code.

SOLUTION: Restrict leading dot support to avoid conflicts:
- '.' digit digit {['_'] digit} - Require 2+ digits (e.g., .50, .123)
- '.' digit ('e'|'E') - Allow single digit with scientific notation (e.g., .5e2)

This preserves meaningful leading dot functionality while ensuring tuple member
access continues to work correctly.

SUPPORTED LEADING DOT FORMATS:
✅ .50, .123, .00 (2+ digits)
✅ .5e2, .1E-3 (scientific notation)
✅ Tuple access: tuple.0, tuple.1, tuple.2 (no conflict)

TESTS:
- Add TupleAccessCompatibility test method
- Update LeadingDotLiterals to use supported formats
- Fix LeadingDotErrors.dfy line number expectations
- All integration tests now pass
- Remove leading dot patterns from realnumber token definition
- Remove leading dot handling from Dec production
- Delete LeadingDotErrors test files (feature not implemented)
- Update ScientificNotation.dfy to focus on scientific notation and trailing dots
- Preserve tuple member access compatibility (tuple.5, tuple.10, etc.)
- Keep scientific notation (1.23e5, 5e-2) and trailing dot (1., 123.) support

This addresses the fundamental conflict between leading dot literals (.5)
and tuple member access (tuple.5) identified in our investigation.
- Change 'Decimal fractions can use trailing dot shorthand' to
  'Real number literals can use trailing dot shorthand'
- More accurate since trailing dots work with scientific notation too (1.e2)
- Consistent with broader 'real number literals' terminology used throughout
- Better reflects the actual grammar implementation
…flicts

These IF() resolvers were flagged by Coco/R as 'Misplaced resolver: no LL(1) conflict'
indicating they don't actually resolve any parsing conflicts and can be safely removed.

Changes:
- Remove IF(AcceptReferrersAndBacktick()) resolver in ConstAtomExpression
- Remove IF(IsIdentifier(la.kind)) resolver in FieldLocationSuffix

This cleanup eliminates 2 misplaced resolver warnings while maintaining
identical parsing behavior.
@fabiomadge
Copy link
Collaborator Author

Technical Details

What are LL(1) Resolvers?

LL(1) resolvers in Coco/R are used to resolve parsing conflicts when the grammar is ambiguous. They use semantic predicates to choose between alternatives when the parser can't decide based on lookahead alone.

Why Remove These?

Coco/R flagged these resolvers as "Misplaced" because:

  1. No actual conflict exists - The grammar alternatives can be distinguished without semantic predicates
  2. Unnecessary complexity - The resolvers add code complexity without providing value
  3. Build warnings - Each misplaced resolver generates a warning during grammar compilation

Before This Change:

-- line 3767 col 8 : warning : Misplaced resolver: no LL(1) conflict.
-- line 4512 col 8 : warning : Misplaced resolver: no LL(1) conflict.

After This Change:

Build succeeded.
    0 Warning(s)
    0 Error(s)

Verification

  • ✅ Grammar compiles cleanly with 0 warnings
  • ✅ All existing tests pass
  • ✅ Parser behavior is identical (verified by running comprehensive test suite)

This is a safe, low-risk maintenance improvement that reduces technical debt.

@fabiomadge
Copy link
Collaborator Author

Closing this PR as it incorrectly contains feature changes. Will create a new clean chore PR with only the LL1 resolver cleanup.

@fabiomadge fabiomadge closed this Jun 29, 2025
@fabiomadge fabiomadge deleted the chore/remove-misplaced-ll1-resolvers branch June 29, 2025 19:05
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant