This issue tracks the implementation of Unicode algorithms in Rust, specifically with respect to compatibility with ICU4X.
The Unicode Technical Reports are listed at https://www.unicode.org/reports/. Each is designated a Unicode Standard Annex (UAX), a Unicode Technical Standard (UTS), or a Unicode Technical Report (UTR).
| Report | Name | Status |
|---|---|---|
| UAX 9 | Unicode Bidirectional Algorithm | ✅ implemented in unicode-bidi, can use icu::properties data |
| UTS 10 | Unicode Collation Algorithm | ✅ implemented in icu::collator |
| UAX 11 | East Asian Width | unicode-width, cannot use icu::properties data |
| UAX 14 | Unicode Line Breaking Algorithm | icu::segmenter, outdated |
| UAX 15 | Unicode Normalization Forms | ✅ implemented in icu::normalizer |
| UTS 18 | Unicode Regular Expressions | regex, cannot use icu::properties data |
| UAX 24 | Unicode Script Property | ✅ implemented in icu::properties; unicode-script, cannot use icu::properties data, limited interop |
| UAX 29 | Unicode Text Segmentation | icu::segmenter, outdated |
| UAX 31 | Unicode Identifiers and Syntax | ❌ partial implementation in unicode-xid and unicode-script, cannot use icu::properties data |
| UAX 34 | Unicode Named Character Sequences | ❌ not implemented, data not in icu::properties |
| UTS 35 | Unicode Locale Data Markup Language (LDML) | icu::calendar, icu::datetime, icu::decimal, icu::list, icu::locale, icu::pattern, icu::plurals, icu::time |
| UTR 36 | Unicode Security Considerations | probably superseded by UAX 31, UTS 39, UTS 55 |
| UTS 37 | Unicode Ideographic Variation Database | does not specify algorithms |
| UAX 38 | Unicode Han Database (Unihan) | does not specify algorithms |
| UTS 39 | Unicode Security Mechanisms | unicode-security, cannot use icu::properties data |
| UAX 41 | Common References for Unicode Standard Annexes | does not specify algorithms |
| UAX 42 | Unicode Character Database in XML | does not specify algorithms |
| UAX 44 | Unicode Character Database | does not specify algorithms |
| UAX 45 | U-Source Ideographs | does not specify algorithms |
| UTS 46 | Unicode IDNA Compatibility Processing | ✅ implemented in idna, uses icu |
| UAX 50 | Unicode Vertical Text Layout | harfbuzz, however the relevant properties are not used through harfbuzz-traits (e.g. icu); harfrust, does not support external Unicode data sources at all |
| UTS 51 | Unicode Emoji | icu::properties |
| UAX 53 | Unicode Arabic Mark Rendering | harfbuzz, however the relevant properties are not used through harfbuzz-traits (e.g. icu); harfrust, does not support external Unicode data sources at all |
| UTS 55 | Unicode Source Code Handling | ❌ not implemented |
| UAX 57 | Unicode Egyptian Hieroglyph Database (Unikemet) | does not specify algorithms |
| UTS 58 | Draft Unicode Link Detection and Serialization | ❌ not implemented |
| UTR 59 | Proposed Draft East Asian Spacing | ❌ not implemented |
| UAX 60 | Draft Data for Non Han Ideographic Scripts | does not specify algorithms |
| UTS 61 | Proposed Draft Unicode Set Notation | |