UnicodeSet parser does not support all code points #3893

Open
@skius

Description

The unescaping code in icu_unicodeset_parser only works for Unicode scalar values (Rust chars), but all code points should be supported (any u32 less than or equal to char::MAX). This should be relatively straightforward to fix by replacing chars with u32s and, in parse_escaped_char, using a val <= char::MAX as u32 check instead of char::try_from.

This currently fails, but should pass: icu_unicodeset_parser::parse(r"[^\uD800-\uE0FF]")
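
A minimal sketch of the suggested range check, assuming the escape's hex digits have already been extracted into a string. The helper name parse_hex_code_point is hypothetical and is not the crate's actual parse_escaped_char; it only illustrates accepting any code point up to char::MAX, surrogates included, where char::try_from would reject them.

```rust
/// Hypothetical helper (not the real icu_unicodeset_parser code):
/// parses the hex digits of an escape such as `\uD800` into a code point.
/// Unlike `char::try_from`, it accepts surrogates (U+D800..=U+DFFF),
/// because it only checks the upper bound of the code point range.
fn parse_hex_code_point(hex: &str) -> Option<u32> {
    // Reject empty input and anything that is not a plain hex digit
    // (from_str_radix alone would also accept a leading '+').
    if hex.is_empty() || !hex.bytes().all(|b| b.is_ascii_hexdigit()) {
        return None;
    }
    // Overflowing inputs fail here and become None.
    let val = u32::from_str_radix(hex, 16).ok()?;
    // Any u32 up to char::MAX (0x10FFFF) is a valid code point.
    if val <= char::MAX as u32 {
        Some(val)
    } else {
        None
    }
}
```

With this check, parse_hex_code_point("D800") yields Some(0xD800) even though char::try_from(0xD800u32) is an error, while "110000" is still rejected as out of range.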

Metadata

Labels

C-transliterator (Component: transliterator), T-bug (Type: Bad behavior, security, privacy), good first issue (Good for newcomers), help wanted (Issue needs an assignee)