Handle the `sr-Cyrl-ME` -> `sr-ME` collation-only fallback by robertbastian · Pull Request #7867 · unicode-org/icu4x

robertbastian · 2026-04-13T09:23:42Z

This handles the sr-Cyrl-ME -> sr-ME fallback the same way we handle the other collation-only fallbacks: by explicitly adding the data for sr-Cyrl-ME so we don't go through the default fallback mechanism for that locale.
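As a rough illustration of the approach described above, here is a minimal sketch using only the Rust standard library. Everything in it is hypothetical: `COLLATION_ONLY_ALIASES`, `add_alias_entries`, `lookup`, and the string payloads are made up for illustration and are not ICU4X's actual datagen code, and the "drop the last subtag" loop is a simplified stand-in for the real fallback algorithm.

```rust
use std::collections::HashMap;

/// Hypothetical table of collation-only aliases:
/// (requested locale, locale whose collation data should be reused).
const COLLATION_ONLY_ALIASES: &[(&str, &str)] = &[("sr-Cyrl-ME", "sr-ME")];

/// Copy the target's payload under the alias key, so that a lookup for the
/// alias finds an exact match and never enters the generic fallback chain.
fn add_alias_entries(data: &mut HashMap<String, String>) {
    for (alias, target) in COLLATION_ONLY_ALIASES {
        if let Some(payload) = data.get(*target).cloned() {
            data.insert((*alias).to_string(), payload);
        }
    }
}

/// Simplified stand-in for default fallback (an assumption, not the real
/// algorithm): try the exact locale, then repeatedly drop the last subtag,
/// and finally fall back to "und".
fn lookup<'a>(data: &'a HashMap<String, String>, locale: &str) -> Option<&'a str> {
    let mut current = locale;
    loop {
        if let Some(payload) = data.get(current) {
            return Some(payload.as_str());
        }
        match current.rfind('-') {
            Some(idx) => current = &current[..idx],
            None => return data.get("und").map(String::as_str),
        }
    }
}
```

In this toy model, a store holding entries for `und`, `sr`, and `sr-ME` resolves `sr-Cyrl-ME` to the `sr` entry until `add_alias_entries` has run; afterwards the same request resolves to a copy of the `sr-ME` entry.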

Changelog

N/A

sffc

Would be nice for the test to actually run the Collator constructor but it is a bit tricky... good for a follow-up.

hsivonen · 2026-04-15T08:22:19Z

Do I understand correctly that explicitly asking for Cyrillic collation for Montenegrin now results in the Latin collation, similarly to how asking for Cyrillic collation for Croatian results in the Latin collation, but unlike Bosnian and Serbian, which do allow an explicit request for the non-default script?

Why? What's the upstream CLDR issue motivating this change?

For reference, here's the Firefox/SpiderMonkey test case that documents the situation before this PR: https://searchfox.org/firefox-main/rev/23974e2d947e31e4ae42ae2758a4416c9a6d8671/js/src/tests/non262/Intl/Collator/bcms.js

Also, AFAICT, there is no technical reason why we couldn't merge the Latin and Cyrillic collation data for Bosnian-Croatian-Montenegrin-Serbian and make Latn vs. Cyrl a matter of script reordering on top.

Are users of Bosnian-Croatian-Montenegrin-Serbian actually better served by having the other script collate according to root as opposed to having the other script also collate according to language-specific rules?
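To make the script-reordering idea above concrete, here is a toy sketch in plain Rust. It is not how ICU4X or CLDR collation works internally: `Script`, `script_of`, and `sort_key` are hypothetical names, the code-point ranges are rough, and raw code points stand in for the language-specific weights that merged tailoring data would provide. The only point is that one shared ordering can serve both scripts, with the preferred script chosen by a parameter.

```rust
/// Coarse script classification for the toy example.
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
enum Script {
    Latin,
    Cyrillic,
    Other,
}

/// Rough code-point ranges, sufficient only for this illustration.
fn script_of(c: char) -> Script {
    match c as u32 {
        0x0041..=0x024F => Script::Latin,
        0x0400..=0x04FF => Script::Cyrillic,
        _ => Script::Other,
    }
}

/// Build a sort key in which the preferred script gets the lowest rank.
/// The second tuple element is the raw code point, a placeholder for a
/// tailored within-script weight.
fn sort_key(word: &str, first: Script) -> Vec<(u8, u32)> {
    word.chars()
        .map(|c| {
            let script = script_of(c);
            let rank = if script == first {
                0
            } else if script == Script::Other {
                2
            } else {
                1
            };
            (rank, c as u32)
        })
        .collect()
}
```

Sorting a mixed list with `sort_key(w, Script::Latin)` places Latin-script words first, and switching the argument to `Script::Cyrillic` flips the two groups without touching the within-script weights.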

robertbastian · 2026-04-15T09:58:14Z

Apparently these come from upstreaming ICU behaviour to CLDR: unicode-org/cldr#2664, unicode-org/cldr#3504.

I believe it was initially added to ICU in icu4c/source/data/icu-coll-deprecates.xml, which was committed as "Merge CLDR25 data into trunk". However, I cannot find any reference to sr_Cyrl_ME in CLDR 25, so I believe that file was handwritten. sr_ME is listed there with the other sr variants (and other multi-script languages), and it's the only one where the added script tag is not the likely one¹. It looks suspiciously like a typo.

The initial aliases from that file have since evolved through

and the ones that are just likely subtags have disappeared, leaving just sr-Cyrl-ME -> sr-ME. Along the way, helpful comments like

It is not at all clear why this is being done (we expect "sr_Latn_ME" normally).

have been added and removed. It still says

TODO: Find out and document this properly

today, but that work is not being tracked anywhere, and apparently wasn't enough to have someone look at this before upstreaming it into CLDR.

¹ Note that Cyrl was the likely script for sr-ME until CLDR-2203, but that was way before CLDR 25.

robertbastian requested review from a team, Manishearth and sffc as code owners April 13, 2026 09:23

fallback

4360e86

robertbastian force-pushed the sr-ME branch from 5ea2fe7 to 4360e86 Compare April 13, 2026 09:25

sffc approved these changes Apr 14, 2026

robertbastian merged commit 0f348cc into unicode-org:main Apr 14, 2026
34 checks passed

robertbastian deleted the sr-ME branch April 14, 2026 15:37

sffc mentioned this pull request Apr 14, 2026

Test the collator fallback behavior in datagen #7873

Open

robertbastian added a commit to robertbastian/icu4x that referenced this pull request Apr 15, 2026

Revert "Handle the sr-Cyrl-ME -> sr-ME collation-only fallback (u…

7294b27

…nicode-org#7867)" This reverts commit 0f348cc.

robertbastian mentioned this pull request Apr 15, 2026

Revert the sr-Cyrl-ME -> sr-ME collation-only fallback #7876

Merged

robertbastian added a commit that referenced this pull request Apr 15, 2026

Revert the sr-Cyrl-ME -> sr-ME collation-only fallback (#7876)

478e8c9

Reverts #7867 ## Changelog N/A
