Editorial: Fix incorrect use of UnicodeMatchPropertyValue#3587
I have another question. If you search for "scx" in https://unicode.org/Public/UCD/latest/ucd/PropertyValueAliases.txt, you will find that there is no record for it. And according to the spec,
All RegExp that have the form of
Nice observation, @Jack-Works! Unicode property Script_Extensions (scx) is unusual in being set-valued rather than scalar-valued, and as such needs special consideration in our spec. I have added editorial corrections to this PR and opened #3590 for a potential followup.
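To illustrate the set-valued behavior, here is a small sketch in JavaScript. It assumes an engine that supports Unicode property escapes and uses U+30FC KATAKANA-HIRAGANA PROLONGED SOUND MARK, whose Script is Common but whose Script_Extensions value lists both Hiragana and Katakana in the Unicode Character Database. The variable names are illustrative only.

```javascript
// U+30FC KATAKANA-HIRAGANA PROLONGED SOUND MARK
const mark = "\u30FC";

// Script (sc) is scalar-valued: each code point has exactly one Script value.
const scHira = /\p{sc=Hira}/u.test(mark);   // Script is not Hiragana
const scCommon = /\p{sc=Zyyy}/u.test(mark); // Script is Common (Zyyy)

// Script_Extensions (scx) is set-valued: a code point matches when the
// requested script is a member of its scx set.
const scxHira = /\p{scx=Hira}/u.test(mark);
const scxKana = /\p{Script_Extensions=Katakana}/u.test(mark);

console.log({ scHira, scCommon, scxHira, scxKana });
```

Note that the values accepted after `scx=` are the same names and aliases that are accepted after `sc=`, which is why the lookup has to go through the "Script" rows of `PropertyValueAliases.txt`.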
spec.html
Outdated
1. If _p_ is `Script_Extensions`, then
  1. Assert: _vs_ is a property value or property value alias for property “Script” listed in <a href="https://unicode.org/Public/UCD/latest/ucd/PropertyValueAliases.txt"><code>PropertyValueAliases.txt</code></a>.
  1. Let _v_ be the Set containing the “short name”, “long name”, and any other aliases corresponding with value _vs_ for property “Script” in <a href="https://unicode.org/Public/UCD/latest/ucd/PropertyValueAliases.txt"><code>PropertyValueAliases.txt</code></a>.
  1. Return the CharSet containing all Unicode code points whose character database definition includes the property “Script_Extensions” with value having a non-empty intersection with _v_.
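The steps quoted above can be sketched roughly as follows. This is a minimal illustration, not the spec's or any engine's implementation: `SCRIPT_ALIASES`, `SCRIPT_EXTENSIONS`, and `scriptExtensionsCharSet` are hypothetical names, and the two tables are tiny hand-written excerpts standing in for the real Unicode data files.

```javascript
// Hypothetical excerpt of the "sc" rows of PropertyValueAliases.txt:
// each row holds the short name, long name, and any other aliases of one Script value.
const SCRIPT_ALIASES = [
  ["Hira", "Hiragana"],
  ["Kana", "Katakana"],
  ["Latn", "Latin"],
];

// Hypothetical excerpt of Script_Extensions data: code point -> set of script short names.
const SCRIPT_EXTENSIONS = new Map([
  [0x3042, new Set(["Hira"])],         // HIRAGANA LETTER A
  [0x30a2, new Set(["Kana"])],         // KATAKANA LETTER A
  [0x30fc, new Set(["Hira", "Kana"])], // KATAKANA-HIRAGANA PROLONGED SOUND MARK
  [0x0041, new Set(["Latn"])],         // LATIN CAPITAL LETTER A
]);

function scriptExtensionsCharSet(vs) {
  // Assert: vs is a property value or alias for property "Script".
  // (The spec asserts this; the sketch throws so that misuse is visible.)
  const row = SCRIPT_ALIASES.find((names) => names.includes(vs));
  if (row === undefined) throw new Error(`not a Script value or alias: ${vs}`);

  // Let v be the set of all names (short, long, other aliases) for that value.
  const v = new Set(row);

  // Return every code point whose Script_Extensions value intersects v.
  const result = new Set();
  for (const [codePoint, scx] of SCRIPT_EXTENSIONS) {
    if ([...scx].some((name) => v.has(name))) result.add(codePoint);
  }
  return result;
}

// Long name and short name resolve to the same set of names, so both give the same result.
console.log([...scriptExtensionsCharSet("Hiragana")].map((cp) => cp.toString(16)));
// prints [ '3042', '30fc' ]
```

Collecting every alias into the set before intersecting means the result does not depend on which spelling of the script name the data file happens to use.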
Do you need to call MaybeSimpleCaseFolding here?
Hmm, that would affect any code point that case-folds across script (or to/from Common). I don't know if there are any, but it's easy enough to accommodate. Done.
Fixes #3586
Closes #3590
Also includes commits with incidental fixes in nearby algorithms and steps.