Bug: Fix for overflow at edge cases in normalize() by raising exception #61083

Open
wants to merge 5 commits into
base: main
1 change: 1 addition & 0 deletions doc/source/whatsnew/v3.0.0.rst
@@ -643,6 +643,7 @@ Datetimelike
 - Bug in :attr:`is_year_start` where a DateTimeIndex constructed via a date_range with frequency 'MS' wouldn't have the correct year or quarter start attributes (:issue:`57377`)
 - Bug in :class:`DataFrame` raising ``ValueError`` when ``dtype`` is ``timedelta64`` and ``data`` is a list containing ``None`` (:issue:`60064`)
 - Bug in :class:`Timestamp` constructor failing to raise when ``tz=None`` is explicitly specified in conjunction with timezone-aware ``tzinfo`` or data (:issue:`48688`)
+- Bug in :meth:`Timestamp.normalize` overflowing silently for timestamps near :attr:`Timestamp.min` instead of raising ``OutOfBoundsDatetime`` (:issue:`60583`)
 - Bug in :func:`date_range` where the last valid timestamp would sometimes not be produced (:issue:`56134`)
 - Bug in :func:`date_range` where using a negative frequency value would not include all points between the start and end values (:issue:`56147`)
 - Bug in :func:`tseries.api.guess_datetime_format` would fail to infer time format when "%Y" == "%H%M" (:issue:`57452`)
19 changes: 15 additions & 4 deletions pandas/_libs/tslibs/timestamps.pyx
@@ -1269,7 +1269,7 @@ cdef class _Timestamp(ABCTimestamp):
             int64_t ppd = periods_per_day(self._creso)
             _Timestamp ts

-        normalized = normalize_i8_stamp(local_val, ppd)
+        normalized = normalize_i8_stamp(self, local_val, ppd)
         ts = type(self)._from_value_and_reso(normalized, reso=self._creso, tz=None)
         return ts.tz_localize(self.tzinfo)

@@ -3438,9 +3438,9 @@ Timestamp.daysinmonth = Timestamp.days_in_month
 # ----------------------------------------------------------------------
 # Scalar analogues to functions in vectorized.pyx


 @cython.overflowcheck(True)
 @cython.cdivision(False)
-cdef int64_t normalize_i8_stamp(int64_t local_val, int64_t ppd) noexcept nogil:
+cdef normalize_i8_stamp(self, int64_t local_val, int64_t ppd):
     """
     Round the localized nanosecond timestamp down to the previous midnight.

@@ -3454,4 +3454,15 @@ cdef int64_t normalize_i8_stamp(int64_t local_val, int64_t ppd) noexcept nogil:
     -------
     int64_t
     """
-    return local_val - (local_val % ppd)
+    cdef:
+        int64_t remainder
+        int64_t result
+    try:
+        remainder = local_val % ppd
+        result = local_val - remainder
+    except (OverflowError, OutOfBoundsDatetime) as err:
+        raise OutOfBoundsDatetime(
+            f"Cannot normalize {self} to midnight without overflow"
+        ) from err
+
+    return result
14 changes: 14 additions & 0 deletions pandas/tests/scalar/timestamp/methods/test_normalize.py
@@ -2,6 +2,7 @@

 from pandas._libs.tslibs import Timestamp
 from pandas._libs.tslibs.dtypes import NpyDatetimeUnit
+from pandas._libs.tslibs.np_datetime import OutOfBoundsDatetime


 class TestTimestampNormalize:
@@ -19,3 +20,16 @@ def test_normalize_pre_epoch_dates(self):
         result = Timestamp("1969-01-01 09:00:00").normalize()
         expected = Timestamp("1969-01-01 00:00:00")
         assert result == expected
+
+    def test_normalize_edge_cases(self):
+        # GH: 60583
+        expected_msg = (
+            r"Cannot normalize 1677-09-21 00:12:43\.145224193 to midnight "
+            "without overflow"
+        )
+        with pytest.raises(OutOfBoundsDatetime, match=expected_msg):
+            Timestamp.min.normalize()
+
+        result = Timestamp.max.normalize()
+        expected = Timestamp("2262-04-11 00:00:00")
+        assert result == expected
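
For readers following the arithmetic, here is a minimal pure-Python sketch of why flooring to midnight can overflow near `Timestamp.min` but not near `Timestamp.max`. It is not the pandas implementation: `floor_to_midnight`, `INT64_MIN`, `INT64_MAX` and `NS_PER_DAY` are hypothetical names chosen for illustration, the built-in `OverflowError` stands in for `OutOfBoundsDatetime`, and nanosecond resolution is assumed.

```python
# Illustrative sketch only; the names below are assumptions, not pandas APIs.
INT64_MIN = -(2**63)
INT64_MAX = 2**63 - 1
NS_PER_DAY = 24 * 60 * 60 * 10**9  # periods per day at nanosecond resolution


def floor_to_midnight(local_val: int, ppd: int = NS_PER_DAY) -> int:
    """Round an epoch offset down to the previous midnight, or raise."""
    # Python's % returns a non-negative remainder for a positive divisor,
    # which is the behaviour cdivision(False) asks Cython to preserve.
    remainder = local_val % ppd
    result = local_val - remainder
    if result < INT64_MIN:
        # Stand-in for OutOfBoundsDatetime: the floored value no longer
        # fits in a signed 64-bit integer.
        raise OverflowError(
            f"cannot normalize {local_val} to midnight without overflow"
        )
    return result


# The largest int64 value floors to an earlier, still representable midnight.
print(floor_to_midnight(INT64_MAX))

# A value just above the int64 minimum would floor below the representable range.
try:
    floor_to_midnight(INT64_MIN + 1)
except OverflowError as err:
    print(err)
```

Because flooring always moves a value toward negative infinity, only inputs within one day of the lower bound can fall out of range; inputs near the upper bound simply move further inside it, which is consistent with the two cases exercised by `test_normalize_edge_cases`.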