Description
All identified problems (most have been addressed in Pomsky 0.10):
- .NET doesn't support code points (in hexadecimal notation) outside the BMP – must be converted to two UTF-16 surrogates
  - make it work in string literals (e.g. `'𐌰'`)
  - make it work for hexadecimal code points above U+FFFF (e.g. `U+10330`) instead of producing an error
- .NET doesn't support arbitrary code points (`.` or `C`) outside the BMP #89
- `\pL` as shorthand for `\p{L}` doesn't work
- `\p{LC}` doesn't work
  - polyfill?
- scripts and boolean properties don't work at all
- needs investigation to see if all blocks are supported
  - check if block names are correctly normalized: underscores must be removed, but dashes preserved
- `\v` and `\h` aren't supported
- .NET: `\w` (and by extension `\b` and `\B`) don't conform to Unicode #88
- need to check if backreferences like `\80` are too high (doc)
- any further bugs may surface during fuzzing
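
For illustration, the surrogate conversion mentioned in the first item could look roughly like the sketch below. This is not Pomsky's actual code: `dotnet_escape` is a hypothetical helper name, and the `\uXXXX` output format is an assumption about what the generated .NET regex should contain. It relies only on `char::encode_utf16` from the Rust standard library.

```rust
/// Hypothetical helper (not taken from Pomsky): renders a code point as
/// `\uXXXX` escapes that a UTF-16 based regex engine such as .NET can match.
fn dotnet_escape(c: char) -> String {
    let mut buf = [0u16; 2];
    // `encode_utf16` yields a single unit for characters inside the BMP and
    // a high/low surrogate pair for code points above U+FFFF.
    c.encode_utf16(&mut buf)
        .iter()
        .map(|unit| format!("\\u{:04X}", unit))
        .collect()
}
```

With this sketch, `dotnet_escape('𐌰')` (U+10330) yields `\uD800\uDF30`, while a BMP character such as `'é'` yields the single escape `\u00E9`.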
To Reproduce
The `regex-test` crate was expanded to run .NET tests and run in CI (currently only on Ubuntu).
Expected behavior
The .NET flavor works reliably; using unsupported features produces an error.