Correctly check the RD range for Date::try_from_fields and Date::try_add*#7676
Correctly check the RD range for Date::try_from_fields and Date::try_add*#7676Manishearth merged 18 commits intounicode-org:mainfrom
Conversation
| ) -> Result<Self, DateAddError> { | ||
| // We preemptively protect ourselves from overly-large values | ||
| // | ||
| // This is mostly needed for avoiding overflows on Duration arithmetic, |
There was a problem hiding this comment.
Open to different approaches here. It feels a bit strange to have all of this nice infrastructure range-checking years and we still need a coarse preemptive check because the duration math overflows.
One thing we could do is have surpasses_inner returning a YearOverflowError, and surpasses() returning a bool where the error is turned into true.
There was a problem hiding this comment.
I don't know this code enough to comment on that. However, if you're establishing an invariant here, propagate it in documentation to the relevant functions (I guess add_year_to, month_from_ordinal?)
There was a problem hiding this comment.
I think the "coarse preemptive check" should be the only check, until the final RD check at the end.
76b730c to
2390e3f
Compare
| ) -> Result<Self, DateAddError> { | ||
| // We preemptively protect ourselves from overly-large values | ||
| // | ||
| // This is mostly needed for avoiding overflows on Duration arithmetic, |
There was a problem hiding this comment.
I don't know this code enough to comment on that. However, if you're establishing an invariant here, propagate it in documentation to the relevant functions (I guess add_year_to, month_from_ordinal?)
| base_month, | ||
| DateFromFieldsOptions::from_add_options(options), | ||
| ) | ||
| .map_err(|e| { |
There was a problem hiding this comment.
Why? In this case the errors map cleanly. If the errors stop mapping cleanly we can introduce a new match the way we did before.
There was a problem hiding this comment.
this is less readable if you're trying to figure out how an error was produced. rust-analyzer cannot tell you which ?s implicitly use the From implementation
There was a problem hiding this comment.
That's just a general problem with Rust errors: that's an argument against using ? period. These errors are going to go through ? a half dozen more times before hitting the user. Changing that here won't do anything.
There are Rust error libraries that will attach context to ? for you but they're heavier so we don't use them.
There was a problem hiding this comment.
That's just a general problem with Rust errors: that's an argument against using
?period.
yep. doesn't mean it's not an argument here.
These errors are going to go through ? a half dozen more times before hitting the user.
that's not our problem. they don't go through another ? before being returned from our public API, and appearing in our unit tests where it's often needed to figure out where they're from
There was a problem hiding this comment.
Wait, hold on, I already fixed this instance of this pattern because the errors do not map cleanly.. I thought you were talking about the other instance of this pattern.
But still, I disagree that we should avoid ?. It's annoying in unit tests, but it's also annoying to wade through extra match statements.
There was a problem hiding this comment.
well it's not an extra match statement, the MonthError to LunisolarDateError conversion only happens in once, let's just keep it there and not move it to another file
There was a problem hiding this comment.
Okay, so you are talking about the other instance of this pattern.
It's an extra match statement within complicated code. Moving the match statement elsewhere is better. All of the calendar arithmetic/construction code is already messy.
I could introduce a .map_err(MonthError::into)? if you really want but I still think that is kind of silly. The Rust community has pretty universally settled on the style of using ? wherever it works and I don't plan on changing how I use ? unless you argue for and land a style rule for it.
| cal, | ||
| )) | ||
| )?; | ||
| // We early checked for a generous range of years, now we must check |
There was a problem hiding this comment.
ah, here we go. move this check into new_balanced as that calls ArithmeticDate::new_unchecked and violates its invariant
There was a problem hiding this comment.
We cannot, new_balanced must be able to produce slightly-out-of-range years.
I can't see a way to make the code work without it. I documented this new invariant on new_balanced, and I'm about to push a commit that documents it more. I also updated the documentation on new_unchecked with the new invariant.
When I wrote the tests my core desired property was that for arithmetic should work as long as the input and output were in RD range. The arithmetic algorithm needs to be able to temporarily "peek" out of range to do so, see the end_of_month calculation here (and basically all of the surpasses stuff used in until). So new_balanced needs to work for near-edge dates on either side of the range boundary, which means we cannot have it follow the strict RD range.
If we want to give up that property we need to more carefully figure out what the desired behavior is here. I find that property to be rather sensible and nice to have.
We could add new_unchecked_with_assertion that debug asserts in all of the other call sites that it is within RD range. But I don't really think it's necessary.
3f44cd3 to
867db09
Compare
| /// use icu::calendar::options::{DateAddOptions, Overflow}; | ||
| /// | ||
| /// // Hebrew year 5784 is a leap year, 5785 is not. | ||
| /// // Adar I (the leap month) is month 5 in a leap year. |
| #[inline] | ||
| fn year_info_from_extended(&self, extended_year: i32) -> Self::YearInfo { | ||
| debug_assert!(crate::calendar_arithmetic::VALID_YEAR_RANGE.contains(&extended_year)); | ||
| debug_assert!(crate::calendar_arithmetic::GENEROUS_YEAR_RANGE.contains(&extended_year)); |
There was a problem hiding this comment.
Thought: This check doesn't really achieve anything since GENEROUS_YEAR_RANGE is bigger than the RD range. I guess it's harmless
There was a problem hiding this comment.
Yeah, this was added deliberately because previously it was quite haphazard, it's nice to have a check right before we call the tricky calendrical APIs.
| let extended_year = if let Some(era) = era { | ||
| calendar.extended_year_from_era_year( | ||
| era.as_bytes(), | ||
| range_check(year, "era_year", CONSTRUCTOR_YEAR_RANGE)?, |
There was a problem hiding this comment.
Issue: I think this one should check GENEROUS_YEAR_RANGE because it seems really weird for era year and extended year ranges to be double-checked. It results in a smaller effective range. This is especially noticeable in a calendar where the era and extended years are not the same, like Ethiopian with input era "aa"; only ~15000 dates can be constructed instead of ~20000 dates when the era and extended year are the same.
Alternatively, don't check the extended year range if the input was an era year.
There was a problem hiding this comment.
This is existing documented behavior (on from_codes). Separate discussion to change if desired.
| let rd = C::to_rata_die_inner(year, month, day); | ||
|
|
||
| // We early checked for a generous range of years, now we must check | ||
| // to ensure we are actually in range for our core invariant. | ||
| if !VALID_RD_RANGE.contains(&rd) { | ||
| return Err(DateFromFieldsError::Overflow); | ||
| } |
There was a problem hiding this comment.
Issue: I don't love converting to RD just to check the RD range. I would rather have a calendar-specific check. This function is already generic in the calendar. (ok for follow-up)
There was a problem hiding this comment.
I don't really wish to maintain calendar-specific range checking code, it's too easy to get wrong.
There was a problem hiding this comment.
Fine for follow-up. to_rata_die is slow and we otherwise don't need it. Maintaining a const calendar-specific min and max date where we use Ord to check if in-range is fine and better.
| year, // == year_info.to_extended_year() + offset | ||
| "year", | ||
| (VALID_YEAR_RANGE.start() + offset)..=(VALID_YEAR_RANGE.end() + offset), | ||
| (CONSTRUCTOR_YEAR_RANGE.start() + offset)..=(CONSTRUCTOR_YEAR_RANGE.end() + offset), |
There was a problem hiding this comment.
Issue (here and elsewhere, similar to above): I really don't like checking the year against CONSTRUCTOR_YEAR_RANGE multiple times
There was a problem hiding this comment.
Preexisting behavior, often (but not always?) documented. @robertbastian introduced this I believe.
| resolved_year = | ||
| cal.year_info_from_extended_checked(resolved_year.to_extended_year() - 1)?; |
There was a problem hiding this comment.
Issue: I thought we were going to make sure the input to this function (new_balanced) was such that we never cause calendar-specific integer overflow. A precondition should be that year is in the generous range, ordinal_month does not exceed the generous range converted to months, and day does not exceed the generous range converted to days.
| ) -> Result<Self, DateAddError> { | ||
| // We preemptively protect ourselves from overly-large values | ||
| // | ||
| // This is mostly needed for avoiding overflows on Duration arithmetic, |
There was a problem hiding this comment.
I think the "coarse preemptive check" should be the only check, until the final RD check at the end.
|
Regarding the coarse presumtpive checks and the other checks: My main worry is that YearInfo calculations should never get a year that is too large for them to handle, and should be able to preemptively assert that they get a good year. I don't want calendar-specific code to need to get into the business of worrying too much about range. My first commit here introduced I also want to limit overly-slow APIs. #7077 is the issue for fixing that, and I'm hoping to work on it, but it's not a release blocker.
I made the change. It's a separate commit. If people later disagree about this architecture, please open a PR with a revert of that commit. |
eabc756 to
6994df0
Compare
Progress on #7076
This does a bunch of things:
GENEROUS_YEAR_RANGE, which is a year range that is slightly wider thanVALID_RD_RANGEfor all calendars, "good enough" for early checks.GENEROUS_YEAR_RANGEas a precondition intry_from_fields, and also checksVALID_RD_RANGEas a postcondition.try_from_fields.YearOverflowErrorused for internal functions which can only error when out of year rangeDateFieldsResolver::year_info_from_eratoextended_year_from_era_year. This splits its responsibilities more cleanly: it doesn't have to bother with generating a YearInfo (which is risky), it just adjusts the era year.DateFieldsResolver::year_info_from_extended_checkedwhich checks the generous range and uses it infrom_fieldsand arithmetic code.DateAddErrorbased on consensus in Consider narrower error type between try_new_from_codes and try_from_fields #7010.If there are issues in DateAddError or the test, I would prefer to work on them separately, since this PR is getting pretty large.