Improved Character Range and Special Sequence Support #527

Open
@cdwmhcc

🆒 Character Range Issues

| Description | Current | Generated Pattern | Expected Pattern |
| --- | --- | --- | --- |
| Number Range | `charIn('1-9')` | `/[1\-9]/` | `/[1-9]/` |
| Alternatives | `charIn('123456789')` | `/[123456789]/` | `/[1-9]/` |
| Ideal API | `charIn('1-9')` | n/a | `/[1-9]/` |
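
The difference between the generated and expected patterns can be checked with plain `RegExp` literals, independent of the library. This is only a minimal sketch using the two patterns from the table; the variable names are illustrative.

```typescript
// Pattern the issue reports as currently generated: the hyphen is escaped,
// so the class contains exactly three characters: "1", "-", and "9".
const escapedHyphen = /[1\-9]/;

// Pattern the issue expects: an unescaped hyphen between two characters
// denotes the range "1" through "9".
const digitRange = /[1-9]/;

console.log(escapedHyphen.test("5")); // false: "5" is not one of "1", "-", "9"
console.log(escapedHyphen.test("-")); // true: the hyphen is a literal member
console.log(digitRange.test("5"));    // true: "5" falls inside the range
console.log(digitRange.test("-"));    // false: the hyphen only marks the range
```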

Whitespace Character Class Issues

| Description | Current | Generated Pattern | Expected Pattern |
| --- | --- | --- | --- |
| Escaped \s in String | `charIn('abc\\s')` | `/[abc\\s]/` | `/[abc\s]/` |
| Alternatives | `charIn('abc').or(whitespace)` | `/(?:[abc]\|\s)/` | `/[abc\s]/` |
| Ideal API Option 1 | `charIn('abc\\s')` | n/a | `/[abc\s]/` |
| Ideal API Option 2 | `charIn('abc${whitespace}')` | n/a | `/[abc\s]/` |
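
To make the three patterns in this table concrete, the sketch below compares them with plain `RegExp` literals. It does not call the library; the patterns are copied from the table and the variable names are illustrative.

```typescript
// Reported current output for charIn('abc\\s'): the backslash is escaped,
// so the class matches "a", "b", "c", a literal backslash, or the letter "s".
const literalBackslash = /[abc\\s]/;

// Expected output: \s inside the class keeps its whitespace meaning.
const withWhitespace = /[abc\s]/;

// Reported output of the workaround charIn('abc').or(whitespace).
const alternation = /(?:[abc]|\s)/;

console.log(literalBackslash.test(" ")); // false: whitespace is not matched
console.log(literalBackslash.test("s")); // true: "s" is a literal member
console.log(withWhitespace.test(" "));   // true
console.log(withWhitespace.test("s"));   // false

// The alternation accepts the same single characters as the merged class;
// it is just a more verbose pattern.
const samples = ["a", "b", "c", " ", "\t", "\n", "s", "\\", "d"];
const sameResults = samples.every(
  (ch) => alternation.test(ch) === withWhitespace.test(ch),
);
console.log(sameResults); // true
```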

Complex Lookbehind or Lookahead Structure Issues

| Description | Current | Generated Pattern | Expected Pattern |
| --- | --- | --- | --- |
| Lookbehind | `exactly('').after(anyOf(exactly('').at.lineStart(), charIn('-_(:')))` | `/(?<=(?:^\|[\-_(:]))/` | `/(?<=(?:^\|[-_(:]))/` |
| Ideal API | `after(anyOf(lineStart, charIn('-_(:')))` | n/a | `/(?<=(?:^\|[-_(:]))/` |
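
As a rough illustration of what this lookbehind does, the sketch below builds both variants with `new RegExp` and appends `[a-z]+` so the zero-width assertion has something to capture. The helper `wordsAfterDelimiter` and the sample input are hypothetical and exist only for this example.

```typescript
// Pattern sources from the table, written as ordinary strings
// (each "\\" below is a single backslash in the resulting pattern).
const currentSource = "(?<=(?:^|[\\-_(:]))";  // hyphen escaped
const expectedSource = "(?<=(?:^|[-_(:]))";   // hyphen first in the class, unescaped

// Hypothetical helper: returns every lowercase word that starts at the
// beginning of the input or directly after one of "-", "_", "(", ":".
function wordsAfterDelimiter(lookbehind: string, input: string): string[] {
  const re = new RegExp(lookbehind + "[a-z]+", "g");
  return input.match(re) ?? [];
}

const input = "foo-bar_baz(qux:quux zap";
const fromCurrent = wordsAfterDelimiter(currentSource, input);
const fromExpected = wordsAfterDelimiter(expectedSource, input);

// "zap" follows a space, so neither variant matches it.
console.log(fromCurrent);  // ["foo", "bar", "baz", "qux", "quux"]
console.log(fromExpected); // ["foo", "bar", "baz", "qux", "quux"]
```

A hyphen placed first in a character class is already literal, which is why both variants return the same matches here.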

ℹ️ Additional info

  1. Character Range Interpretation:

    • The library interprets '1-9' literally as the characters "1", "-", and "9" instead of the range from 1 to 9
    • Proper character ranges need to be enumerated manually
  2. Escaped Character Handling:

    • Escape sequences like \\s in strings are not correctly translated to regex character classes
    • The library creates unnecessary alternation when combining regular characters with special classes

Suggested Improvements

  1. Implement proper character range parsing in charIn(): a hyphen (-) between two characters should create a range
  2. Support proper escape sequence handling in character classes
  3. Introduce more concise helper functions for common patterns (e.g., lineStart, after)
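
One possible shape for the first two suggestions is sketched below. `buildCharClass` is a hypothetical standalone function, not the library's `charIn()` and not its actual implementation; it assumes that ranges are only recognised between two ASCII letters or digits, and that only the escapes `\s`, `\S`, `\d`, `\D`, `\w`, and `\W` are passed through.

```typescript
// Hypothetical sketch: turn a spec string into a character-class RegExp,
// keeping ranges and a few class escapes intact and escaping everything
// else that is special inside [...].
function isAlphanumeric(ch: string): boolean {
  return /^[0-9A-Za-z]$/.test(ch);
}

function buildCharClass(spec: string): RegExp {
  let body = "";
  let i = 0;
  while (i < spec.length) {
    const ch = spec.charAt(i);
    const next = spec.charAt(i + 1);      // "" when past the end
    const afterNext = spec.charAt(i + 2); // "" when past the end

    if (ch === "\\" && /^[sSdDwW]$/.test(next)) {
      // Recognised class escape such as \s: copy it through unchanged.
      body += ch + next;
      i += 2;
    } else if (
      isAlphanumeric(ch) && next === "-" &&
      isAlphanumeric(afterNext) && ch <= afterNext
    ) {
      // Ascending range such as 1-9 or a-z: keep the hyphen unescaped.
      body += ch + "-" + afterNext;
      i += 3;
    } else if (ch === "-" && (i === 0 || i === spec.length - 1)) {
      // A hyphen at either end of the class is already literal.
      body += "-";
      i += 1;
    } else {
      // Escape the remaining characters that are special inside a class.
      body += /^[\\\]^-]$/.test(ch) ? "\\" + ch : ch;
      i += 1;
    }
  }
  return new RegExp("[" + body + "]");
}

console.log(buildCharClass("1-9").source);    // [1-9]
console.log(buildCharClass("abc\\s").source); // [abc\s]
console.log(buildCharClass("-_(:").source);   // [-_(:]
```

A trade-off of this approach is that a caller who wants the three literal characters "1", "-", and "9" can no longer write `'1-9'` and would need to reorder the input (for example `'19-'`) or be given an explicit opt-out.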

Labels: enhancement (New feature or request)
