Don't output conflicting fields in datagen #6477
provider/source/data/debug/datetime/patterns/date/chinese/v1/ej/en-001.json
@@ -11,7 +11,7 @@
"elements": [
"LLLL U",
"MM.y",
"r(U) MMMM",
"r(U) LLLL",
Curious.
The data is strange. availableFormats has:
"Gy": "U",
"GyMMM": "LLL U",
"GyMMMd": "d MMM U",
"GyMMMEd": "E, d MMM U",
"GyMMMM": "r(U) MMMM",
"GyMMMMd": "r(U) MMMM d",
"GyMMMMEd": "r(U) MMMM d, E",
...
"y": "U",
"yyyy": "U",
"yyyyM": "MM.y",
"yyyyMd": "dd.MM.y",
"yyyyMEd": "E, dd.MM.y",
"yyyyMMM": "LLL U",
"yyyyMMMd": "d MMM U",
"yyyyMMMEd": "E, d MMM U",
"yyyyMMMM": "LLLL U",
"yyyyMMMMd": "r(U) MMMM d",
"yyyyMMMMEd": "r(U) MMMM d, E",
So, we get variant patterns for LLLL U (without era) and r(U) MMMM (with era), but only for the long form: the medium (abbreviated) form has LLL U for both.
Not sure what's right. Seems like more of a CLDR problem than an ICU4X problem at the current time.
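The mismatch described above can be checked mechanically. The following is a minimal sketch, not ICU4X datagen code: `month_symbol` and `month_symbols_differ` are hypothetical helpers, and treating `GyMMMM` and `yyyyMMMM` as the with-era and without-era variants of one pattern is an assumption taken from the comment.

```rust
use std::collections::BTreeMap;

/// Returns the first month field symbol in a pattern: 'M' (format) or
/// 'L' (stand-alone). Text inside single quotes is treated as a literal.
fn month_symbol(pattern: &str) -> Option<char> {
    let mut in_quote = false;
    for c in pattern.chars() {
        match c {
            '\'' => in_quote = !in_quote,
            'M' | 'L' if !in_quote => return Some(c),
            _ => {}
        }
    }
    None
}

/// Hypothetical check: do the two skeletons map to patterns that disagree
/// on the month symbol? Returns None if either skeleton is missing or has
/// no month field.
fn month_symbols_differ(
    formats: &BTreeMap<&str, &str>,
    with_era: &str,
    without_era: &str,
) -> Option<bool> {
    let a = month_symbol(formats.get(with_era).copied()?)?;
    let b = month_symbol(formats.get(without_era).copied()?)?;
    Some(a != b)
}
```

With the entries quoted above, the `GyMMMM`/`yyyyMMMM` pair disagrees (MMMM against LLLL) while the `GyMMM`/`yyyyMMM` pair agrees (LLL for both), which matches the observation that only the long form has conflicting variants.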
@@ -15,6 +15,6 @@
"ccc, dd.MM.y г.",
"EEEE, d MMMM y г. GGG",
"E, d MMM y г. GGG",
"E, dd MMM y г. GGG"
"ccc, dd MMM y г. GGG"
This one is the motivating issue.
General shape LGTM. From your comments, it looks like the algorithm still needs some work.
@@ -19,6 +21,12 @@ use icu_provider::prelude::*;

use super::DatagenCalendar;

type Trio<'a> = (
nit: some docs on what this is?
I renamed it to VariantPatterns, made it a struct, and gave it some docs.
for pattern_item in pattern_items.iter_mut() {
let PatternItem::Field(field) = pattern_item else {
continue; // nothing to do
while let Err(e) = names.load_for_pattern(&DebugProvider, &date_time_pattern) {
issue: can we guarantee that this terminates? Perhaps we should document why this terminates OR have some kind of debug tracker that ensures this only runs for X iterations.
Hmm, I think it terminates, assuming that the code is written correctly. The loop body only runs if there was a ConflictingField error. Each time it runs, it should reduce the number of ConflictingField errors by one, because it replaces the conflicting field with a non-conflicting field.
I'll add a comment about this.
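The termination argument can also be enforced with an explicit iteration bound, as the reviewer suggested. This is a minimal sketch under stated assumptions: `Field`, `find_conflict`, and `resolve_conflicts` are hypothetical stand-ins, not the ICU4X `load_for_pattern` API, and a "conflict" is simplified here to mean the same symbol appearing with a different length. Each pass repairs exactly one conflicting field, so the loop can run at most once per field.

```rust
/// Hypothetical stand-in for a pattern field: a symbol and its length.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Field {
    symbol: char,
    length: u8,
}

/// Returns the index of the first field that conflicts with an earlier
/// field (same symbol, different length), if any.
fn find_conflict(fields: &[Field]) -> Option<usize> {
    for (i, f) in fields.iter().enumerate() {
        let conflicts = fields[..i]
            .iter()
            .any(|p| p.symbol == f.symbol && p.length != f.length);
        if conflicts {
            return Some(i);
        }
    }
    None
}

/// Repairs conflicts one at a time and returns the number of passes.
/// Each pass removes one conflicting field and introduces none, so the
/// number of passes cannot exceed the number of fields; the assertion
/// turns a logic error into a panic instead of an infinite loop.
fn resolve_conflicts(fields: &mut [Field]) -> usize {
    let max_iterations = fields.len();
    let mut iterations = 0;
    while let Some(i) = find_conflict(fields) {
        assert!(
            iterations < max_iterations,
            "conflict resolution did not converge"
        );
        let symbol = fields[i].symbol;
        // Adopt the length of the first field with the same symbol.
        let first_length = fields
            .iter()
            .find(|p| p.symbol == symbol)
            .map(|p| p.length)
            .unwrap();
        fields[i].length = first_length;
        iterations += 1;
    }
    iterations
}
```

The bound is cheap to keep in debug builds and documents the invariant in code rather than only in a comment.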
@@ -213,6 +213,9 @@ impl SourceDataProvider {
})
.map(|trio| {
enforce_consistent_field_lengths(trio, |previous_field, field| {
if attributes.as_str() == "ej" {
return; // too noisy
question: what's going on here?
I changed it to filter by quality instead of a blanket filter on "ej".
@@ -11,7 +11,7 @@
"elements": [
"LLLL U",
"MM.y",
"r(U) MMMM",
"r(U) LLLL",
Note: Same issue as in Chinese.
@Manishearth Let me know if you want me to improve the commit history.
The warnings in full datagen are:
Yes, just squashing commits such that there is one for the code changes and one for the data changes would be great.
It's a bit hard to squash commits since there are merge commits involved. I'll rewrite the branch.
Thank you so much!
Fixes #6205