-
Notifications
You must be signed in to change notification settings - Fork 174
Fix SPI calculation for zero precipitation inputs #595
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Previously, SPI calculation for time steps with 100% zero precipitation history resulted in +infinity (clipped to 3.09), incorrectly indicating extreme wetness. This change ensures that such cases return NaN (undefined), which is statistically correct for degenerate distributions. Updated tests/test_compute.py to allow NaN values for zero-precipitation inputs when the historical probability of zero is 1.0.
Addresses code review feedback on SPI zero-precipitation handling: - Add _replace_zeros_with_nan() helper to eliminate duplicated zero-to-NaN conversion logic between gamma_parameters() and transform_fitted_gamma() - Compute zero_mask once and reuse it, avoiding redundant equality operations (performance improvement) - Add detailed comment explaining all-zero column behavior: produces -infinity (extreme drought) not NaN, which is the correct interpretation for drought monitoring - Enhance test assertions to verify zero values produce finite results (not NaN or +infinity) - Add dedicated test for all-zeros edge case to guard against future regressions Reduces code duplication and improves maintainability while clarifying the semantic meaning of all-zero precipitation in drought indices. Refs #533
Reviewer's GuideRefactors SPI gamma-fitting to centralize zero-handling, ensures zeros are excluded from parameter fitting, and adjusts probability calculations so zero and all-zero precipitation inputs yield finite, climatologically consistent SPI values, with tests updated and expanded to cover these edge cases. Sequence diagram for SPI transform_fitted_gamma zero-handling and probability flowsequenceDiagram
actor Caller
participant TransformFittedGamma
participant _validate_array
participant _replace_zeros_with_nan
participant gamma_parameters
participant scipy_stats_gamma
participant scipy_stats_norm
Caller->>TransformFittedGamma: transform_fitted_gamma(values, alphas, betas, periodicity)
TransformFittedGamma->>_validate_array: _validate_array(values, periodicity)
_validate_array-->>TransformFittedGamma: validated_values
TransformFittedGamma->>_replace_zeros_with_nan: _replace_zeros_with_nan(validated_values)
_replace_zeros_with_nan-->>TransformFittedGamma: zero_mask, values_for_fitting
TransformFittedGamma->>TransformFittedGamma: zeros = zero_mask.sum(axis=0)
TransformFittedGamma->>TransformFittedGamma: probabilities_of_zero = zeros / n_samples
TransformFittedGamma->>TransformFittedGamma: set probabilities_of_zero == 1.0 to 0.0
alt alphas or betas not provided
TransformFittedGamma->>gamma_parameters: gamma_parameters(values_for_fitting, data_start_year, data_end_year, periodicity)
gamma_parameters->>_validate_array: _validate_array(values_for_fitting, periodicity)
gamma_parameters->>_replace_zeros_with_nan: _replace_zeros_with_nan(validated_values)
_replace_zeros_with_nan-->>gamma_parameters: zero_mask_fit, values_no_zeros
gamma_parameters-->>TransformFittedGamma: alphas, betas
end
TransformFittedGamma->>scipy_stats_gamma: gamma.cdf(values_for_fitting, alphas, scale=betas)
scipy_stats_gamma-->>TransformFittedGamma: gamma_probabilities
TransformFittedGamma->>TransformFittedGamma: gamma_probabilities[zero_mask] = 0.0
TransformFittedGamma->>TransformFittedGamma: probabilities = probabilities_of_zero + (1.0 - probabilities_of_zero) * gamma_probabilities
TransformFittedGamma->>scipy_stats_norm: norm.ppf(probabilities)
scipy_stats_norm-->>TransformFittedGamma: spi_values
TransformFittedGamma-->>Caller: spi_values (finite, -infinity for all-zero history)
Class diagram for compute module gamma fitting helpers and SPI transformationclassDiagram
class ComputeModule {
+gamma_parameters(values, data_start_year, data_end_year, periodicity)
+transform_fitted_gamma(values, data_start_year, data_end_year, periodicity, alphas, betas)
+_replace_zeros_with_nan(values)
}
class gamma_parameters {
+values
+data_start_year
+data_end_year
+periodicity
+returns alphas
+returns betas
}
class transform_fitted_gamma {
+values
+data_start_year
+data_end_year
+periodicity
+alphas
+betas
+uses zero_mask
+uses probabilities_of_zero
+returns spi_values
}
class _replace_zeros_with_nan {
+values
+returns zero_mask
+returns values_copy
}
ComputeModule ..> gamma_parameters : defines
ComputeModule ..> transform_fitted_gamma : defines
ComputeModule ..> _replace_zeros_with_nan : defines
transform_fitted_gamma --> _replace_zeros_with_nan : calls
gamma_parameters --> _replace_zeros_with_nan : calls
transform_fitted_gamma --> gamma_parameters : optionally calls
File-Level Changes
Possibly linked issues
Tips and commandsInteracting with Sourcery
Customizing Your ExperienceAccess your dashboard to:
Getting Help
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey - I've found 1 issue, and left some high level feedback:
- In
transform_fitted_gamma, the all-zeros case is currently handled indirectly by forcing NaNs through the gamma pipeline and then relying on theprobabilities_of_zero == 1.0adjustment; consider an explicit early branch for the all-zero time-step case to make the behavior clearer and less dependent on downstream side effects. - The
_replace_zeros_with_nanhelper always copies the input array, andgamma_parameters/transform_fitted_gammaalso call_validate_arraywhich may already copy; consider documenting or optimizing this to avoid unnecessary duplication for large inputs.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- In `transform_fitted_gamma`, the all-zeros case is currently handled indirectly by forcing NaNs through the gamma pipeline and then relying on the `probabilities_of_zero == 1.0` adjustment; consider an explicit early branch for the all-zero time-step case to make the behavior clearer and less dependent on downstream side effects.
- The `_replace_zeros_with_nan` helper always copies the input array, and `gamma_parameters`/`transform_fitted_gamma` also call `_validate_array` which may already copy; consider documenting or optimizing this to avoid unnecessary duplication for large inputs.
## Individual Comments
### Comment 1
<location> `tests/test_compute.py:158-167` </location>
<code_context>
+def test_transform_fitted_gamma_all_zeros_produces_finite_spi():
</code_context>
<issue_to_address>
**issue (testing):** Test name/docstring imply finite SPI, but the assertions also allow -inf; align the test with the intended behavior
The MR description says all-zero time steps should yield `-∞` SPI, while the test name/docstring talk about "finite SPI" and "negative infinity or large negative values", so the assertions (`not np.any(np.isnan(result))` and `np.all(result < 0)`) don’t uniquely specify what’s correct.
Please either:
- Rename the test and update the docstring to explicitly allow `-inf`, or
- Tighten the assertions to enforce the exact contract (e.g. `np.all(np.isneginf(result))` if `-inf` is required, or explicitly allow "finite or -inf" and assert that via a mask).
This will make the expected behavior for all-zero histories precise and prevent regressions from slipping through.
</issue_to_address>Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
| def test_transform_fitted_gamma_all_zeros_produces_finite_spi(): | ||
| """ | ||
| Test that all-zero precipitation produces finite SPI values, not NaN. | ||
| When all precipitation values are zero, SPI should indicate extreme drought | ||
| (negative infinity or large negative values), not NaN. | ||
| """ | ||
| # one year of daily data (366 days) | ||
| n_years = 1 | ||
| n_days_per_year = 366 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
issue (testing): Test name/docstring imply finite SPI, but the assertions also allow -inf; align the test with the intended behavior
The MR description says all-zero time steps should yield -∞ SPI, while the test name/docstring talk about "finite SPI" and "negative infinity or large negative values", so the assertions (not np.any(np.isnan(result)) and np.all(result < 0)) don’t uniquely specify what’s correct.
Please either:
- Rename the test and update the docstring to explicitly allow
-inf, or - Tighten the assertions to enforce the exact contract (e.g.
np.all(np.isneginf(result))if-infis required, or explicitly allow "finite or -inf" and assert that via a mask).
This will make the expected behavior for all-zero histories precise and prevent regressions from slipping through.
Replace exact equality check (== 1.0) with np.isclose() to handle floating-point arithmetic precision issues when detecting all-zero time steps. Prevents edge cases where probability_of_zero is 0.9999999... from being missed due to rounding errors. Refs #533
|



Summary
Fixes incorrect SPI (Standardized Precipitation Index) output when precipitation data contains zeros. Previously, zero precipitation values could produce NaN results due to issues in the gamma distribution fitting process. This MR ensures zeros are handled correctly and produce meaningful drought indicators.
Problem
When computing SPI using gamma distribution fitting (transform_fitted_gamma), zero precipitation values caused two issues:
Solution
Core fix: Proper zero-value handling in gamma fitting
All-zero time steps produce extreme drought (-∞)
When probability_of_zero == 1.0 (all historical values are zero for a time step), the SPI now correctly returns -∞ (extreme drought) rather than NaN or +∞. This is the correct climatological interpretation: a location with no recorded precipitation is in perpetual extreme drought.
Code quality improvements
Changes
Test plan
Behavior summary
Summary by Sourcery
Handle zero-precipitation values correctly in SPI gamma-distribution transformation and centralize zero-handling logic.
Bug Fixes:
Enhancements:
Tests: