Open
Description
Using a non-NFC, single-scalar code point like U+F900 as the start of a character class range causes an error:
1 | let r = #/[\u{F900}-\u{FDCF}]/#
| `- error: cannot parse regular expression: invalid bound for character class range
Tested with Swift 5.10 and 6.0 (Xcode 16b2 16A5171r):
swift-driver version: 1.90.11.1 Apple Swift version 5.10 (swiftlang-5.10.0.13 clang-1500.3.9.4)
Target: arm64-apple-macosx14.0
swift-driver version: 1.110 Apple Swift version 6.0 (swiftlang-6.0.0.4.52 clang-1600.0.21.1.3)
Target: arm64-apple-macosx14.0
This seems to be because U+F900 is not in NFC: it normalizes to U+8C48. I find this surprising because, while this code point is not in NFC, this character class range isn't ambiguous in the way other non-NFC cases might be (e.g. using a decomposed combination, or U+F900 as a literal instead of with the \u escape).
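To make the normalization claim concrete, here is a minimal sketch that uses Foundation's `precomposedStringWithCanonicalMapping` to apply NFC to U+F900. It only illustrates the normalization itself; whether this is the exact check the regex parser performs is an assumption on my part.

```swift
import Foundation

// U+F900 is a CJK compatibility ideograph with a singleton canonical
// decomposition, so it cannot appear in NFC-normalized text.
let original = "\u{F900}"

// Foundation's NFC conversion (canonical decomposition, then composition).
let nfc = original.precomposedStringWithCanonicalMapping

// Print the scalar values of the normalized string in hex.
let hexScalars = nfc.unicodeScalars.map { String($0.value, radix: 16, uppercase: true) }
print(hexScalars) // ["8C48"]

// Swift String equality uses canonical equivalence, so the two strings
// compare equal even though their scalars differ.
print(original == nfc)                                              // true
print(original.unicodeScalars.elementsEqual(nfc.unicodeScalars))    // false
```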
I am trying to port older code that uses NSRegularExpression, and this seems to be a blocker to moving away from the old APIs (short of expanding ranges like this into non-range classes of thousands of individual scalars).
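For anyone hitting the same thing, a possible stopgap for the simplest cases is to test the scalar range directly instead of going through a regex. This is only a sketch: `containsScalar(in:_:)` is a hypothetical helper I made up for illustration, and it does not replace a real pattern where the range is one piece of a larger expression.

```swift
// Hypothetical helper (not part of the standard library or swift-experimental-string-processing):
// returns true if any Unicode scalar of `text` falls inside `range`.
func containsScalar(in range: ClosedRange<Unicode.Scalar>, _ text: String) -> Bool {
    // Iterating `unicodeScalars` looks at the stored scalars as-is,
    // without applying any normalization.
    text.unicodeScalars.contains { range.contains($0) }
}

// The same bounds as the failing character class range.
let compatibilityRange: ClosedRange<Unicode.Scalar> = "\u{F900}"..."\u{FDCF}"

print(containsScalar(in: compatibilityRange, "abc\u{FA10}def")) // true
print(containsScalar(in: compatibilityRange, "abc"))            // false
```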