
Conversation

@chengzhuzhang
Collaborator

Description

Checklist

  • My code follows the style guidelines of this project
  • I have performed a self-review of my own code
  • My changes generate no new warnings
  • Any dependent changes have been merged and published in downstream modules

If applicable:

  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass with my changes (locally and CI/CD build)
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • I have noted that this is a breaking change for a major release (fix or feature that would cause existing functionality to not work as expected)

@github-actions github-actions bot added the type: bug Inconsistencies or issues which will cause an issue or problem for users or implementors. label May 20, 2025
@codecov

codecov bot commented May 20, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 100.00%. Comparing base (14f876b) to head (4dc841a).

Additional details and impacted files
@@            Coverage Diff            @@
##              main      #761   +/-   ##
=========================================
  Coverage   100.00%   100.00%           
=========================================
  Files           16        16           
  Lines         1702      1702           
=========================================
  Hits          1702      1702           


@chengzhuzhang
Collaborator Author

@tomvothecoder this fix works okay for the e3sm_diags PR E3SM-Project/e3sm_diags#982, though I don't know if there are other tests needed. Please review, thanks.

@tomvothecoder tomvothecoder requested a review from Copilot May 20, 2025 19:42
@tomvothecoder tomvothecoder moved this from Todo to In Review in xCDAT Development May 20, 2025

Copilot AI left a comment


Pull Request Overview

This PR fixes a ZeroDivisionError when computing hourly intervals by using total_seconds() instead of the seconds attribute on timedelta objects.

  • Uses diff.total_seconds() for accurate hour calculation
  • Maintains downstream daily sub-frequency logic
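The core of the fix can be illustrated with a minimal sketch (hypothetical values; the real code lives in xcdat/bounds.py):

```python
from datetime import timedelta

# For a one-day delta, the `.seconds` attribute holds only the seconds
# *within* the day component, while `.total_seconds()` covers the whole
# duration -- which is what the hour calculation actually needs.
diff = timedelta(days=1)
print(diff.seconds)          # 0 -> hrs == 0, so 24 / hrs raised ZeroDivisionError
print(diff.total_seconds())  # 86400.0 -> hrs == 24.0, so int(24 / hrs) == 1
```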
Comments suppressed due to low confidence (1)

xcdat/bounds.py:652

  • Add unit tests for edge cases where diff is zero, less than one hour, exactly 24 hours, and greater than 24 hours to verify daily_subfreq is computed correctly and no division-by-zero or invalid frequency occurs.
daily_subfreq = int(24 / hrs)  # type: ignore

  diff = pd.to_timedelta(diff)

- hrs = diff.seconds / 3600
+ hrs = diff.total_seconds() / 3600

Copilot AI May 20, 2025


If diff exceeds 24 hours, hrs will be >=24 and daily_subfreq = int(24 / hrs) becomes zero, leading to downstream errors. Consider adding a guard to ensure hrs is within (0, 24] or handle diff > 24h explicitly before computing daily_subfreq.

Suggested change

  hrs = diff.total_seconds() / 3600
+ if not (0 < hrs <= 24):
+     raise ValueError(
+         f"Invalid time difference: {hrs} hours. Expected a value in the range (0, 24]."
+     )
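A sketch of how the suggested guard would behave (`hours_from_diff` is a hypothetical helper name, not part of xcdat):

```python
from datetime import timedelta

def hours_from_diff(diff: timedelta) -> float:
    # Mirrors the suggested guard: reject deltas outside (0, 24] hours
    # before they reach `int(24 / hrs)` and truncate to zero.
    hrs = diff.total_seconds() / 3600
    if not (0 < hrs <= 24):
        raise ValueError(
            f"Invalid time difference: {hrs} hours. "
            "Expected a value in the range (0, 24]."
        )
    return hrs

print(hours_from_diff(timedelta(hours=3)))  # 3.0 -> daily_subfreq == 8
# hours_from_diff(timedelta(days=2))        # raises ValueError (48.0 hours)
```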

@pochedls
Collaborator

@chengzhuzhang – is there a compact explanation for why .total_seconds() works and .seconds does not? It seems like .seconds gives you the seconds into a single day (i.e., <86400), whereas .total_seconds() can go beyond a day (see the caution statement here).

I can see how .seconds could create problems (and we probably should have been using .total_seconds all along), but this seems to arise when xcdat is processing bounds for temporal resolution that is finer than daily resolution (e.g., hourly or 3-hourly). Shouldn't .seconds and .total_seconds give the same result in this case?

It might be helpful to add a little bit of context if we ever revisit this section of code.

@chengzhuzhang
Collaborator Author

chengzhuzhang commented May 20, 2025

@pochedls yes, I should have written a better description... I'm just copying over the caution statement you linked:

It is a somewhat common bug for code to unintentionally use this attribute when it is actually intended to get a total_seconds() value instead:

>>> from datetime import timedelta
>>> duration = timedelta(seconds=11235813)
>>> duration.days, duration.seconds
(130, 3813)
>>> duration.total_seconds()
11235813.0

In my case, I'm processing daily data. For example, I have a timedelta object like the following:

>>> diff
datetime.timedelta(days=1)
>>> diff.seconds    # hrs = diff.seconds / 3600 == 0 here, so 24 / hrs raised the ZeroDivisionError
0
>>> diff.total_seconds()
86400.0

@pochedls
Collaborator

I realize it is daily data, so I would have thought the inferred frequency is daily and it would have skipped the problematic lines of code altogether.

@chengzhuzhang
Collaborator Author

I realize it is daily data, so I would have thought the inferred frequency is daily and it would have skipped the problematic lines of code altogether.

Interesting! I only looked at the lines of code that caused trouble and didn't go through the full logic. Yes, it is interesting that this was caught by the freq = "hour" condition.

@tomvothecoder
Collaborator

I realize it is daily data, so I would have thought the inferred frequency is daily and it would have skipped the problematic lines of code altogether.

Interesting! I only looked at the lines of code that caused trouble and didn't go through the full logic. Yes, it is interesting that this was caught by the freq = "hour" condition.

xCDAT's add_missing_bounds() will attempt to add time bounds by inferring the frequency. In this case, it looks like the inferred frequency is hour, but we actually need day.

The call stack looks like:

  1. add_missing_bounds() call to self._create_time_bounds():
  2. freq is inferred with _infer_freq()
  3. _infer_freq() implementation:
    • xcdat/xcdat/temporal.py

      Lines 2080 to 2111 in 14f876b

      def _infer_freq(time_coords: xr.DataArray) -> Frequency:
          """Infers the time frequency from the coordinates.

          This method infers the time frequency from the coordinates by
          calculating the minimum delta and comparing it against a set of
          conditionals.

          The native ``xr.infer_freq()`` method does not work for all cases
          because the frequency can be irregular (e.g., different hour
          measurements), which ends up returning None.

          Parameters
          ----------
          time_coords : xr.DataArray
              A DataArray for the time dimension coordinate variable.

          Returns
          -------
          Frequency
              The time frequency.
          """
          # TODO: Raise exception if the frequency cannot be inferred.
          min_delta = pd.to_timedelta(np.diff(time_coords).min(), unit="ns")

          if min_delta < pd.Timedelta(days=1):
              return "hour"
          elif min_delta >= pd.Timedelta(days=1) and min_delta < pd.Timedelta(days=21):
              return "day"
          elif min_delta >= pd.Timedelta(days=21) and min_delta < pd.Timedelta(days=300):
              return "month"
          else:
              return "year"

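The misclassification can be reproduced with a minimal sketch (hypothetical timestamps mirroring the MVCE, using plain pandas timestamps rather than cftime):

```python
import numpy as np
import pandas as pd

# One 23:45 gap in otherwise-daily data drags the *minimum* delta under one
# day, so the threshold logic in _infer_freq() returns "hour" instead of "day".
times = pd.to_datetime(
    ["2002-02-27 12:00", "2002-02-28 12:15", "2002-03-01 12:00"]
)
min_delta = pd.to_timedelta(np.diff(times).min())
print(min_delta)                         # 0 days 23:45:00
print(min_delta < pd.Timedelta(days=1))  # True -> classified as "hour"
```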
Is there a bug in _infer_freq()? @pochedls @chengzhuzhang

@chengzhuzhang
Collaborator Author

chengzhuzhang commented May 21, 2025

Yep, probably the inference mechanism (bullet #3) is not very reliable. The following is based on data from the MVCE.

>>> time_coords
<xarray.DataArray 'time' (time: 7305)> Size: 58kB
array([cftime.DatetimeGregorian(2001, 1, 1, 12, 0, 0, 0, has_year_zero=False),
       cftime.DatetimeGregorian(2001, 1, 2, 12, 0, 0, 0, has_year_zero=False),
       cftime.DatetimeGregorian(2001, 1, 3, 12, 0, 0, 0, has_year_zero=False),
       ...,
       cftime.DatetimeGregorian(2020, 12, 29, 12, 0, 0, 0, has_year_zero=False),
       cftime.DatetimeGregorian(2020, 12, 30, 12, 0, 0, 0, has_year_zero=False),
       cftime.DatetimeGregorian(2020, 12, 31, 12, 0, 0, 0, has_year_zero=False)],
      shape=(7305,), dtype=object)
Coordinates:
  * time     (time) object 58kB 2001-01-01 12:00:00 ... 2020-12-31 12:00:00
Attributes:
    axis:           T
    cell_methods:   time: mean
    standard_name:  time
>>> min_delta = pd.to_timedelta(np.diff(time_coords).min(), unit="ns")
>>> min_delta
Timedelta('0 days 23:45:00') 

Changing min_delta = pd.to_timedelta(np.diff(time_coords).min(), unit="ns") to min_delta = pd.to_timedelta(np.diff(time_coords).max(), unit="ns") would classify this dataset as day.
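A quick sketch of that min-vs-max difference on the same hypothetical timestamps as the MVCE:

```python
import numpy as np
import pandas as pd

# With one short 23:45 gap in otherwise-daily data, the minimum delta lands
# under one day ("hour") while the maximum lands at or above it ("day").
times = pd.to_datetime(
    ["2002-02-27 12:00", "2002-02-28 12:15", "2002-03-01 12:00"]
)
deltas = np.diff(times)
print(pd.to_timedelta(deltas.min()))  # 0 days 23:45:00 -> "hour"
print(pd.to_timedelta(deltas.max()))  # 1 days 00:15:00 -> "day"
```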

@chengzhuzhang
Collaborator Author

Having a closer look at the data, I think it might be a data problem. The minimum time difference shorter than one day is actually caused by odd time values around Feb of a leap year:

>>> time_coords[424].values
array(cftime.DatetimeGregorian(2002, 3, 1, 12, 0, 0, 0, has_year_zero=False),
      dtype=object)
>>> time_coords[423].values
array(cftime.DatetimeGregorian(2002, 2, 28, 12, 15, 0, 0, has_year_zero=False),
      dtype=object)
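The 23:45 gap follows directly from those two timestamps (plain datetime stand-ins for time_coords[423] and time_coords[424], instead of cftime objects):

```python
from datetime import datetime

# The odd 12:15 timestamp makes the next step only 23 hours 45 minutes long,
# which is the min_delta that _infer_freq() saw.
t_prev = datetime(2002, 2, 28, 12, 15)
t_next = datetime(2002, 3, 1, 12, 0)
print(t_next - t_prev)  # 23:45:00
```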


Labels

type: bug Inconsistencies or issues which will cause an issue or problem for users or implementors.

Projects

Status: Done

Development

Successfully merging this pull request may close these issues.

[Bug]: Error:ZeroDivisionError: float division by zero when adding missing bounds for daily frequency data

4 participants