Skip to content

(stale) add NSE function #217

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 16 commits into from
Closed
Show file tree
Hide file tree
Changes from 3 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Binary file added .DS_Store
Binary file not shown.
104 changes: 104 additions & 0 deletions src/scores/continuous/standard_impl.py
Original file line number Diff line number Diff line change
Expand Up @@ -5,6 +5,8 @@
from typing import Optional

import xarray as xr
import numpy as np
import pandas as pd

import scores.functions
import scores.utils
Expand Down Expand Up @@ -335,3 +337,105 @@ def multiplicative_bias(
fcst, obs = broadcast_and_match_nan(fcst, obs)
multi_bias = fcst.mean(dim=reduce_dims) / obs.mean(dim=reduce_dims)
return multi_bias


# add NSE code


def lst_to_array(fcst):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this is run on obs data too, perhaps the arg name should be updated to something more general.

Can you also add a docstring and typehint?

We need to consider if we want to support lists as inputs to scores functions. Do you have a use case for this? If you don't have a use case for this, then I consider dropping support for lists and allow the user to convert their data to xarray or pandas. @tennlee do you have an opinion on scores supporting lists as inputs?

# Convert lists to xarray DataArrays
if not isinstance(fcst, xr.DataArray):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is problematic if the input is an xr.Dataset

fcst = xr.DataArray(fcst)
return fcst


def nse(fcst, obs, reduce_dims=None, preserve_dims=None, weights=None, angular=False):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you please add type hints?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also add a "*" after obs, to force the keyword arguments to be keyword-only.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you make all args (apart from fcst and obs) to be keyword args?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We are now using is_angular instead of angular in the other metrics. Can you update it to be is_angular?

"""
Calculate Nash-Sutcliffe Efficiency (NSE) for forecast and observed data.

Args:
fcst (FlexibleArrayType or list): Forecast or predicted variables.
obs (FlexibleArrayType or list): Observed variables.
reduce_dims (FlexibleDimensionTypes, optional): Dimensions to reduce along. Defaults to None.
preserve_dims (FlexibleDimensionTypes, optional): Dimensions to preserve. Defaults to None.
weights (xr.DataArray, optional): Weights to apply to error calculation. Defaults to None.
angular (bool, optional): Whether to treat data as angular (circular). Defaults to False.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
angular (bool, optional): Whether to treat data as angular (circular). Defaults to False.
is_angular (bool, optional): Whether to treat data as angular (circular). Defaults to False.

Comment on lines +354 to +359
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you add in type hints, you can remove the types from the docstring.


Returns:
FlexibleArrayType: Nash-Sutcliffe Efficiency (NSE) value.

Raises:
ValueError: If the input arrays are of different lengths or incompatible types.

References:
- references
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
- references

- https://en.wikipedia.org/wiki/Nash–Sutcliffe_model_efficiency_coefficient

Examples:
# Case 1: NumPy arrays
>>> fcst_np = np.array([3, 4, 5, 6, 7])
>>> obs_np = np.array([2, 3, 4, 5, 6])
>>> nse(fcst_np, obs_np)
0.5

# Case 2: Pandas Series
>>> fcst_series = pd.Series([3, 4, 5, 6, 7])
>>> obs_series = pd.Series([2, 3, 4, 5, 6])
>>> nse(fcst_series, obs_series)
0.50

# Case 3: Xarray DataArray
>>> fcst_xr = xr.DataArray([3, 4, 5, 6, 7])
>>> obs_xr = xr.DataArray([2, 3, 4, 5, 6])
>>> nse(fcst_xr, obs_xr)
0.50

# Case 4: Lists
>>> fcst_list = [3, 4, 5, 6, 7]
>>> obs_list = [2, 3, 4, 5, 6]
>>> nse(fcst_list, obs_list)
0.50

# Case 5: Mixed types - NumPy array and Pandas Series
>>> fcst_mix = np.array([3, 4, 5, 6, 7])
>>> obs_mix = pd.Series([2, 3, 4, 5, 6])
>>> nse(fcst_mix, obs_mix)
0.50

"""
# convert to array
fcst = lst_to_array(fcst)

obs = lst_to_array(obs)
# if not isinstance(obs, xr.DataArray):
# obs = xr.DataArray(obs)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Delete commented out code


# Check for input compatibility
if angular:
error = scores.functions.angular_difference(fcst, obs)
mean_obs = np.mean(obs)
diff_mean = scores.functions.angular_difference(obs, mean_obs)
else:
error = fcst - obs
mean_obs = np.mean(obs)
diff_mean = obs - mean_obs
# calculate nse
nse = 1 - np.sum((error) ** 2) / np.sum((diff_mean) ** 2)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are redefining the name from out of scope (this object has the same name as the function)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, this is not handling dimensions correctly. If you run this code trying to preserve a dimension, it will produce an error later on when scores.utils.gather_dimensions is called.


# Apply weights
if weights is not None:
error = error * weights
diff_mean = diff_mean * weights
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This weighting isn't doing anything as diff_mean isn't used anywhere


# Reduce dimensions
if preserve_dims or reduce_dims:
reduce_dims = scores.utils.gather_dimensions(
fcst.dims, obs.dims, reduce_dims=reduce_dims, preserve_dims=preserve_dims
)

if reduce_dims is not None:
_nse = nse.mean(dim=reduce_dims)
else:
_nse = nse

return _nse
Loading