Skip to content

Conversation

@kim-em
Copy link
Collaborator

@kim-em kim-em commented Dec 7, 2025

This PR adds support for underscores as digit separators in String.toNat?, String.toInt?, and related parsing functions. This makes the string parsing functions consistent with Lean's numeric literal syntax, which already supports underscores for readability (e.g., 100_000_000).

The implementation validates that underscores:

  • Cannot appear at the start or end of the number
  • Cannot appear consecutively
  • Are ignored when calculating the numeric value

This resolves a common source of friction when parsing user input from command-line arguments, environment variables, or configuration files, where users naturally expect to use the same numeric syntax they use in source code.

Examples

Before:

#eval "100_000_000".toNat?  -- none

After:

#eval "100_000_000".toNat?  -- some 100000000
#eval "1_000".toInt?        -- some 1000
#eval "-1_000_000".toInt?   -- some (-1000000)

Testing

Added comprehensive tests in tests/lean/run/string_toNat_underscores.lean covering:

  • Basic underscore support
  • Edge cases (leading/trailing/consecutive underscores)
  • Both toNat? and toInt? functions
  • String, Slice, and Substring types

All existing tests continue to pass.

Closes #11538

🤖 Prepared with Claude Code

This PR adds support for underscores as digit separators in String.toNat?, String.toInt?, and related parsing functions. This makes the string parsing functions consistent with Lean's numeric literal syntax, which already supports underscores for readability (e.g., 100_000_000).

The implementation validates that underscores:
- Cannot appear at the start or end of the number
- Cannot appear consecutively
- Are ignored when calculating the numeric value

This resolves a common source of friction when parsing user input from command-line arguments, environment variables, or configuration files, where users naturally expect to use the same numeric syntax they use in source code.

Closes #11538

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <[email protected]>
@kim-em kim-em added the changelog-language Language features and metaprograms label Dec 7, 2025
kim-em and others added 3 commits December 8, 2025 10:09
The `lastCharWasDigit` parameter in the fold lambda was not used in the
function body, only tracked in the result tuple. Mark it as unused with
an underscore prefix to satisfy the linter.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <[email protected]>
@github-actions github-actions bot added the toolchain-available A toolchain is available for this PR, at leanprover/lean4-pr-releases:pr-release-NNNN label Dec 8, 2025
@leanprover-community-bot
Copy link
Collaborator

Mathlib CI status (docs):

  • ❗ Batteries/Mathlib CI will not be attempted unless your PR branches off the nightly-with-mathlib branch. Try git rebase b7ac6243a99a80e9d49ac87aa4ce450e0f5ec810 --onto 2ca3bc28590c5c33f755d3bd18252f25ea266266. You can force Mathlib CI using the force-mathlib-ci label. (2025-12-08 03:56:35)

@leanprover-bot
Copy link
Collaborator

Reference manual CI status:

  • ❗ Reference manual CI will not be attempted unless your PR branches off the nightly-with-manual branch. Try git rebase b7ac6243a99a80e9d49ac87aa4ce450e0f5ec810 --onto d3dda9f6d4428a906c096067ecb75e432afc4615. You can force reference manual CI using the force-manual-ci label. (2025-12-08 03:56:37)

@kim-em kim-em added this pull request to the merge queue Dec 8, 2025
Merged via the queue into master with commit 6cbcbce Dec 8, 2025
16 checks passed
NicolasRouquette added a commit to NicolasRouquette/doc-gen4 that referenced this pull request Dec 8, 2025
- Removed the parseNatWithUnderscores function and its doc comments with issue references
- Simplified getMaxHeartbeats to use flag.as! Nat directly for the CLI flag and s.toNat? for the environment variable
- Changed the max-heartbeats flag type from String to Nat in both singleCmd and genCoreCmd
- Removed the "Supports underscores as separators" mention from flag descriptions

In README.md:

- Removed the "Underscores can be used as separators for readability" sentence (the examples still show underscores since they will work in a version of Lean4 that includes this merged PR: leanprover/lean4#11541
algebraic-dev pushed a commit that referenced this pull request Dec 8, 2025
This PR adds support for underscores as digit separators in
String.toNat?, String.toInt?, and related parsing functions. This makes
the string parsing functions consistent with Lean's numeric literal
syntax, which already supports underscores for readability (e.g.,
100_000_000).

The implementation validates that underscores:
- Cannot appear at the start or end of the number
- Cannot appear consecutively
- Are ignored when calculating the numeric value

This resolves a common source of friction when parsing user input from
command-line arguments, environment variables, or configuration files,
where users naturally expect to use the same numeric syntax they use in
source code.

## Examples

Before:
```lean
#eval "100_000_000".toNat?  -- none
```

After:
```lean
#eval "100_000_000".toNat?  -- some 100000000
#eval "1_000".toInt?        -- some 1000
#eval "-1_000_000".toInt?   -- some (-1000000)
```

## Testing

Added comprehensive tests in
`tests/lean/run/string_toNat_underscores.lean` covering:
- Basic underscore support
- Edge cases (leading/trailing/consecutive underscores)
- Both `toNat?` and `toInt?` functions
- String, Slice, and Substring types

All existing tests continue to pass.

Closes #11538

🤖 Prepared with Claude Code

---------

Co-authored-by: Claude Sonnet 4.5 <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

changelog-language Language features and metaprograms toolchain-available A toolchain is available for this PR, at leanprover/lean4-pr-releases:pr-release-NNNN

Projects

None yet

Development

Successfully merging this pull request may close these issues.

String.toNat? should support underscores as digit separators for consistency with numeric literals

4 participants