Character names

Hey, I just spoke with @sffc at an Ecma TC39 meeting, and he referred me here.

ICU4X will eventually need to support lookup of Unicode character names (the `Name` property of code points, as well as character name aliases and named character sequences).

I’ve been trying to compress the collection of Unicode character names into a [succinct data structure](https://en.wikipedia.org/wiki/Succinct_data_structure) that occupies as little space as possible (my goal is 100–200 kB) while maintaining efficient bidirectional random access. To that end, I’ve been spending my spare time on an [(unfinished) article describing the data structure](https://gist.github.com/js-choi/320275d05d6f252f6bd55199f76489a6), as well as an [(unfinished) reference JavaScript implementation](https://github.com/js-choi/uniname/).
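To make “bidirectional random access” concrete, here is a minimal Rust sketch of the two lookups I have in mind. It is not the succinct structure itself and not an ICU4X API: the `NAMES` table and the `name_of` and `char_named` functions are hypothetical names for illustration, and the table is a tiny uncompressed sample rather than data generated from the Unicode Character Database.

```rust
// Hypothetical toy table, sorted by code point. A real implementation would
// be generated from the Unicode Character Database and stored compressed.
const NAMES: &[(u32, &str)] = &[
    (0x0041, "LATIN CAPITAL LETTER A"),
    (0x00E9, "LATIN SMALL LETTER E WITH ACUTE"),
    (0x03A9, "GREEK CAPITAL LETTER OMEGA"),
    (0x2603, "SNOWMAN"),
];

/// Code point -> name: binary search over the table, which is sorted by code point.
fn name_of(c: char) -> Option<&'static str> {
    NAMES
        .binary_search_by_key(&(c as u32), |&(cp, _)| cp)
        .ok()
        .map(|i| NAMES[i].1)
}

/// Name -> code point: a linear scan here for brevity. A real structure would
/// need an index (or a compressed equivalent) to make this direction fast too.
fn char_named(name: &str) -> Option<char> {
    NAMES
        .iter()
        .find(|&&(_, n)| n == name)
        .and_then(|&(cp, _)| char::from_u32(cp))
}
```

With this sample table, `name_of('☃')` yields `Some("SNOWMAN")` and `char_named("GREEK CAPITAL LETTER OMEGA")` yields `Some('Ω')`. Anything outside the table returns `None`. The hard part, which this sketch skips entirely, is supporting both directions efficiently without storing every name in full.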

I don’t expect my work to be done for several months at the earliest. But @sffc tells me that work on character names in ICU4X hasn’t started yet either. When ICU4X reaches the point where character names start to be implemented, I’d love to help out however I can. In the meantime, I will continue working on the JavaScript library before porting it to Rust.

Character names #1397

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Character names #1397

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions