compatibility with xr.DataTree #607

Merged
merged 15 commits into from
Mar 31, 2025

mathause
Member

  • Closes #xxx
  • Tests added
  • Fully documented, including CHANGELOG.rst

Adds compatibility with xr.DataTree. Works (at least locally) but is not finished and needs some ugly workarounds due to some design decisions in the new version... The new code path is only tested in upstream-dev ci.

I am not sure we should support both, datatree and xarray datatree, but this was actually the smaller part of the whole operation.

codecov bot commented Jan 31, 2025

Codecov Report

Attention: Patch coverage is 79.68750% with 13 lines in your changes missing coverage. Please review.

Project coverage is 80.29%. Comparing base (615a0c7) to head (77d950a).
Report is 1 commits behind head on main.

Files with missing lines Patch % Lines
mesmer/core/_datatreecompat.py 43.47% 13 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #607      +/-   ##
==========================================
- Coverage   80.44%   80.29%   -0.16%     
==========================================
  Files          49       50       +1     
  Lines        3079     3121      +42     
==========================================
+ Hits         2477     2506      +29     
- Misses        602      615      +13     
Flag Coverage Δ
unittests 80.29% <79.68%> (-0.16%) ⬇️

@mathause mathause closed this Jan 31, 2025
@mathause mathause reopened this Jan 31, 2025
@mathause
Member Author

The upstream packages seem flaky

Collaborator

@veni-vidi-vici-dormivi veni-vidi-vici-dormivi left a comment

Wow thanks, that is a lot of work! I think it's okay not to support xarray-datatree at all then, since it is not recommended to use it anyway.

Member Author

@mathause mathause left a comment

Thanks. It was mostly annoying because 98 % of the time I did not understand what was happening 🤷‍♂️

As long as we need to patch xr.map_over_datasets we can just as well support datatree (because we need to import from _datatreecompat anyway).

It's annoying that the new version has become so bothersome to work with 😒


# https://github.com/pydata/xarray/issues/10013
# tas_anoms = dt - ref.ds
tas_anoms = map_over_datasets(operator.sub, dt, ref.ds)
Member Author

This takes all the fun out of using xr.DataTree and cannot be expected from a user. The other 'issues' (missing methods) will probably be fixed in time but this seems to be a design decision. 👎

Collaborator

Yes I agree.

Member Author

I think what we have to do is write a helper function calc_anomaly(dt, ref_period) with this inside.

@mathause
Member Author

mathause commented Feb 3, 2025

I have not yet updated the example scripts and we need to fix the pin for xarray (so it's not only tested in upstream-dev ci).

@mathause
Member Author

I opened pydata/xarray#10042 - there seems to be some willingness to enable skipping empty nodes. With this change we could avoid the skip_empty_nodes helper function and I'd say that's the moment we can "fully" support xr.DataTree.

But not super convinced this PR gets a fast turnaround 😒
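
For context, a helper of this kind can be written as a small decorator. The following is a minimal sketch and not necessarily the implementation in this PR. It assumes the mapped function receives a dataset-like object that is falsy when it holds no data variables (as an xr.Dataset is); the plain dicts below are only stand-ins for datasets, and `double` is a hypothetical example function.

```python
import functools


def skip_empty_nodes(func):
    """Wrap func so that nodes without data are passed through unchanged."""

    @functools.wraps(func)
    def wrapper(ds, *args, **kwargs):
        # An empty node (e.g. a root that only holds child nodes) has no
        # data variables, so func would have nothing to operate on.
        if not ds:
            return ds
        return func(ds, *args, **kwargs)

    return wrapper


@skip_empty_nodes
def double(ds):
    # Stand-in for a real per-dataset computation.
    return {name: 2 * value for name, value in ds.items()}


empty_result = double({})            # returned as-is, double's body never runs
data_result = double({"tas": 1.5})   # double's body runs on the non-empty input
```

With xarray, a wrapper like this would be applied to the function handed to map_over_datasets, so that functions which fail on empty datasets can still be mapped over a tree with data-less nodes.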

@mathause
Member Author

Ah yes and after 507cd1c this definitely requires xarray main, i.e. we require

xarray >=2023.07, < 2024.10, > 2025.1

(that's probably not a valid specifier)
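
A comma in a version specifier means "and", so the three clauses cannot all hold at once; the intended constraint is a union of two ranges. A minimal sketch of such a check done in code instead, assuming plain numeric release versions (no dev or rc suffixes). The function names are hypothetical and not part of mesmer.

```python
def parse_release(version: str) -> tuple:
    """Turn '2024.7' or '2024.07.0' into a 3-tuple of ints, e.g. (2024, 7, 0)."""
    parts = [int(part) for part in version.split(".")]
    # Pad with zeros so that '2025.1' and '2025.1.0' compare equal.
    parts += [0] * (3 - len(parts))
    return tuple(parts[:3])


def xarray_version_supported(version: str) -> bool:
    """Check the union of the two ranges from the comment above."""
    v = parse_release(version)
    # Range 1: old standalone datatree package works with these versions.
    old_datatree_ok = (2023, 7, 0) <= v < (2024, 10, 0)
    # Range 2: versions newer than 2025.1 (the comment refers to xarray main).
    new_datatree_ok = v > (2025, 1, 0)
    return old_datatree_ok or new_datatree_ok


checks = {v: xarray_version_supported(v) for v in ("2024.7.0", "2024.10.0", "2025.3.0")}
```

Here the gap between the two ranges (2024.10 up to and including 2025.1) is rejected, which a single comma-separated specifier cannot express.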

@yquilcaille
Collaborator

Should we merge this branch? I'm quite blocked now, because I'm trying to use the former datatree, but knowing that we will use this PR...

@yquilcaille
Collaborator

yquilcaille commented Feb 26, 2025

Actually, there are no imports in the __init__.py of mesmer.core. Should we add a commit doing the following?

# flake8: noqa

from ._data import *
from .datatree import *
from .geospatial import *
from .grid import *
from .mask import *
from .regionmaskcompat import *
from .utils import *
from .volc import *
from .weighted import *

@mathause
Member Author

> Should we merge this branch? I'm quite blocked now, because I'm trying to use the former datatree, but knowing that we will use this PR...

I can merge (but see my next comment), or you can use the old version and I'll update it later.

@mathause
Member Author

mathause commented Feb 27, 2025

> Actually, there are no imports in the __init__.py of mesmer.core. Should we add a commit doing the following?
>
> # flake8: noqa
>
> from ._data import *
> from .datatree import *
> from .geospatial import *
> from .grid import *
> from .mask import *
> from .regionmaskcompat import *
> from .utils import *
> from .volc import *
> from .weighted import *

Users don't really need to access functions in core, so there is no need to add these, except for the ones in _datatreecompat. We should make them top-level functions (mesmer.DataTree). (However, I would very much prefer that we don't need to release this and am still hoping for pydata/xarray#10042.)

@mathause mathause mentioned this pull request Mar 14, 2025
3 tasks
@mathause
Member Author

This is not moving forward on xarray's side, so I am merging. I will merge more or less as is, then open a PR for a calc_anomaly helper function, and then drop support for the old datatree package (i.e. require xarray >= 2025.02).

@mathause
Member Author

I also have to find out if map_over_datasets is only used internally. If not I'll make it a top-level function...

@mathause
Member Author

And we can always remove the compatibility map_over_datasets if that gets resolved before we release.

@mathause mathause merged commit 40b7941 into MESMER-group:main Mar 31, 2025
9 of 10 checks passed
@mathause mathause deleted the xr.DataTree branch March 31, 2025 08:44
3 participants