Character names #1397

Open
@js-choi

Hey, I just spoke with @sffc at an Ecma TC39 meeting, and he referred me here.

ICU4X will eventually need to support lookup of Unicode character names (the Name property of code points, as well as character name aliases and named character sequences).

I’ve been working on compressing the collection of Unicode character names into a succinct data structure that occupies as little space as possible (my goal is 100–200 kB) while maintaining efficient bidirectional random access. To that end, I’ve been spending my spare time on an (unfinished) article describing the data structure, as well as an (unfinished) reference JavaScript implementation.

I don’t expect my work to be done for at least several months. But @sffc tells me that work on character names in ICU4X hasn’t started yet either. When ICU4X does reach the point where character names start to get implemented, I’d love to help out however I can. In the meantime, I will continue to work on the JavaScript library before porting it to Rust.

Metadata

Assignees

No one assigned

    Labels

    A-design (Area: Architecture or design)
    C-unicode (Component: Props, sets, tries)
    S-epic (Size: Major project (create smaller child issues))
    T-enhancement (Type: Nice-to-have but not required)
    help wanted (Issue needs an assignee)
