Member

@younies younies commented Jan 15, 2026

Description

- Adjusted various timezone data files to reflect updated sizes and identifiers, including `fingerprints.csv`, `timezone_names_cities_override_v1.rs.data`, and others.
- Enhanced the `script_locale_group` function in `cldr_cache.rs` to include additional test cases for language script grouping, ensuring accurate mapping for languages like N'Ko and Bambara.
- Removed redundant exemplar city entries in the `timezone_names_locations_root_v1.rs.data` file to streamline data structure.
@Manishearth Manishearth removed their request for review January 15, 2026 21:49
Member

@robertbastian robertbastian left a comment

what is the "fix" here? this PR increases data size

timezone/names/cities/override/v1, <lookup>, 855B, 167 identifiers
timezone/names/cities/override/v1, <total>, 268467B, 261786B, 151 unique payloads
timezone/names/cities/override/v1, <lookup>, 871B, 169 identifiers
timezone/names/cities/override/v1, <total>, 283519B, 276751B, 153 unique payloads
Member

+15'052B

timezone/names/generic/short/v1, zh-SG, -> ta-MY
timezone/names/locations/override/v1, <lookup>, 1114B, 222 identifiers
timezone/names/locations/override/v1, <total>, 1000284B, 971921B, 208 unique payloads
timezone/names/locations/override/v1, <total>, 1037724B, 1009365B, 208 unique payloads
Member

+37'440B
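
For reference, the two deltas quoted in the inline comments can be reproduced from the `<total>` rows shown in the thread. The sketch below is illustrative only: the helper names `first_size_bytes` and `size_delta` are hypothetical, and it assumes that the first `...B` field of each row is the size being compared and that the smaller row of each pair is the pre-PR value.

```rust
/// Extracts the first `<digits>B` field from a fingerprint-style CSV row.
/// Assumption: that field is the size the review comments compare.
fn first_size_bytes(line: &str) -> Option<u64> {
    line.split(',')
        .map(str::trim)
        // Fields that do not end in `B`, or are not numeric before it, are skipped.
        .find_map(|field| field.strip_suffix('B')?.parse::<u64>().ok())
}

/// Returns `new - old` in bytes, or `None` if either row has no size field.
fn size_delta(old_line: &str, new_line: &str) -> Option<i64> {
    let old = first_size_bytes(old_line)?;
    let new = first_size_bytes(new_line)?;
    Some(new as i64 - old as i64)
}
```

Applied to the two `timezone/names/cities/override/v1` `<total>` rows this gives 15052, and applied to the two `timezone/names/locations/override/v1` `<total>` rows it gives 37440, matching the figures in the comments.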

@robertbastian robertbastian removed their assignment Jan 16, 2026
Member

@sffc sffc left a comment

This seems to impact mixed-script languages and causes fallback to occur across script boundaries, which I don't think is desirable. For example, with yue, currently we go

"yue" -> "yue-Hant-HK" -> "und-Hant" -> "zh-Hant-HK" -> "zh-Hant"

But now we go

"yue" -> "yue-Hant-HK" -> "und-Hant" -> "zh-Hant-HK" -> "zh"

which results in Simplified Chinese instead of Traditional Chinese going into the "und-Hant" data key.
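
The difference between the two chains above can be checked mechanically. The following is a minimal sketch, not ICU4X code: all function names are hypothetical, and the `default_script` table is a toy stand-in for CLDR likely-subtags data, hard-coding only the assumption that bare `zh` implies `Hans` and bare `yue` implies `Hant`.

```rust
/// Returns the explicit script subtag of a locale string, if present.
/// A script subtag is four ASCII letters and is never the first subtag.
fn explicit_script(locale: &str) -> Option<&str> {
    locale
        .split('-')
        .skip(1)
        .find(|s| s.len() == 4 && s.chars().all(|c| c.is_ascii_alphabetic()))
}

/// Toy default-script table (assumption for illustration only;
/// real data would come from CLDR likely subtags).
fn default_script(language: &str) -> Option<&'static str> {
    match language {
        "zh" => Some("Hans"),
        "yue" => Some("Hant"),
        _ => None,
    }
}

/// Explicit script if present, otherwise the toy default for the language.
fn effective_script(locale: &str) -> Option<&str> {
    explicit_script(locale).or_else(|| default_script(locale.split('-').next()?))
}

/// Index of the first chain entry whose effective script is not `expected`,
/// or `None` if the whole chain stays within that script.
fn first_script_crossing(chain: &[&str], expected: &str) -> Option<usize> {
    chain
        .iter()
        .position(|loc| effective_script(loc) != Some(expected))
}
```

Under these assumptions, the first chain stays entirely within `Hant`, while the second is flagged at its last entry, `"zh"`, whose implied script is `Hans`.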
