Description
Hey, I just spoke with @sffc at an Ecma TC39 meeting, and he referred me here.
ICU4X will eventually need to support lookup of Unicode character names (the Name property of code points, as well as character name aliases and named character sequences).
I’ve been working on compressing the collection of Unicode character names into a succinct data structure that occupies as little space as possible (my goal is 100–200 kB) while maintaining efficient bidirectional random access, that is, lookup from code point to name and from name to code point. To that end, in my spare time I’ve been writing an (unfinished) article describing the data structure, as well as an (unfinished) reference JavaScript implementation.
I don’t expect my work to be done for at least several months. But @sffc tells me that work on character names in ICU4X hasn’t started yet either. When ICU4X reaches the point where character names are being implemented, I’d love to help out however I can. In the meantime, I will continue to work on the JavaScript library before porting it to Rust.