Adds open_datatree and load_datatree to the tutorial module #10082
Conversation
I've got two comments. Other than that, we'll probably have to refactor the way we call pooch, since we now basically duplicated the code of open_dataset (doesn't have to be in this PR, though).
xarray/tests/test_tutorial.py
cache_dir = tmp_path / tutorial._default_cache_dir_name
ds = tutorial.open_dataset(self.testfile, cache_dir=cache_dir).load()
ds = tutorial.open_dataset(testfile, cache_dir=cache_dir).load()
it's probably better to just hard-code the dataset name into the test, there's no point in parametrizing this (to be clear, this part of the test suite is pretty old):
ds = tutorial.open_dataset(testfile, cache_dir=cache_dir).load()
ds = tutorial.open_dataset("tiny", cache_dir=cache_dir).load()
    url = external_urls[name]
else:
    path = pathlib.Path(name)
    if not path.suffix:
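To make the snippet above easier to follow, here is a self-contained, illustrative sketch of that name-resolution logic. It is not xarray's actual implementation: `external_urls` and `resolve_name` are hypothetical stand-ins, and the only assumption taken from the thread is that a bare name defaults to a ``.nc`` extension.

```python
import pathlib

# Hypothetical lookup table; the real one lives inside xarray.tutorial.
external_urls: dict[str, str] = {}


def resolve_name(name: str, default_extension: str = ".nc") -> str:
    """Return the file name (or URL) to fetch for a tutorial dataset name."""
    if name in external_urls:
        return external_urls[name]
    path = pathlib.Path(name)
    if not path.suffix:
        # No extension given: assume a netCDF file.
        path = path.with_suffix(default_extension)
    return path.name


print(resolve_name("tiny"))              # tiny.nc
print(resolve_name("imerghh_830.hdf5"))  # imerghh_830.hdf5
```

This is also why, after the xarray-data change mentioned later in the thread, names without an extension resolve to ``.nc`` files.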
do the hdf5 files work with both netcdf4 and h5netcdf? Otherwise we might need to specialize, like we do with grib
Yes, imerghh_730.HDF5 and imerghh_830.HDF5 work with both engines. I think if we wanted to add hdf5 files without named dimensions we would want to specify the engine as h5netcdf.

EDIT: Since pydata/xarray-data#32 was merged, we do have to explicitly add the extension, e.g. xr.tutorial.open_datatree('imerghh_830.hdf5'), otherwise it defaults to .nc
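The engine specialization discussed above could be handled with a small suffix-based dispatch. The following is a hypothetical sketch, not xarray code: `guess_engine` is an invented helper, and the ``h5netcdf``-for-hdf5 and ``cfgrib``-for-grib choices are taken from the thread's suggestions rather than from any shipped behavior.

```python
import pathlib
from typing import Optional


def guess_engine(name: str) -> Optional[str]:
    """Pick an engine hint from the file extension (illustrative only)."""
    suffix = pathlib.Path(name).suffix.lower()
    if suffix in {".h5", ".hdf5"}:
        # Assumption from the thread: hdf5 files without named
        # dimensions would need the h5netcdf engine.
        return "h5netcdf"
    if suffix in {".grib", ".grib2"}:
        return "cfgrib"
    # No hint: let xarray's backend autodetection decide.
    return None


print(guess_engine("imerghh_830.hdf5"))  # h5netcdf
print(guess_engine("air_temperature"))   # None
```

Whether such a dispatch is worth adding depends on whether any tutorial hdf5 files actually fail under the netcdf4 engine, which the reply above suggests is not currently the case.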
from xarray import DataArray, tutorial
from xarray.tests import assert_identical, network
from xarray import DataArray, DataTree, tutorial
from xarray.testing import assert_identical
Updated this to use the xarray.testing module's assert_identical because xarray.tests didn't support DataTree objects.
FYI these checks look like they are failing in main 😬
yep, that's a change to
Sure! Is there an issue for this yet? I can get a PR started and you can jump in when you're free.
there's #10084, but nothing else. I think you can just open the PR
* ``"imerghh_730"``: GPM IMERG Final Precipitation L3 Half Hourly 0.1 degree x 0.1 degree V07 from 2021-08-29T07:30:00.000Z
* ``"imerghh_830"``: GPM IMERG Final Precipitation L3 Half Hourly 0.1 degree x 0.1 degree V07 from 2021-08-29T08:30:00.000Z
I think this is good to go, though I do think that for tutorial documentation on xarray.DataTree we might want to add some other possible datasets because these two don't really have a structure that fully requires/shows off the use of DataTree (as I discussed with @eni-awowale the other day).
Yeah that makes sense to me! I think we can modify the IMERG dataset as you suggested and maybe note that the modified version is derived from this original product.
* main: (85 commits)
  * Adds open_datatree and load_datatree to the tutorial module (pydata#10082)
  * Fix version in requires_zarr_v3 fixture (pydata#10145)
  * Fix `open_datatree` when `decode_cf=False` (pydata#10141)
  * [docs] `DataTree` cannot be constructed from `DataArray` (pydata#10142)
  * Refactor datetime and timedelta encoding for increased robustness (pydata#9498)
  * Fix test_distributed::test_async (pydata#10138)
  * Refactor concat / combine / merge into `xarray/structure` (pydata#10134)
  * Split `apply_ufunc` out of `computation.py` (pydata#10133)
  * Refactor modules from `core` into `xarray.computation` (pydata#10132)
  * Refactor compatibility modules into xarray.compat package (pydata#10131)
  * Fix type issues from pandas stubs (pydata#10128)
  * Don't skip tests when on a `mypy` branch (pydata#10129)
  * Change `python_files` in `pyproject.toml` to a list (pydata#10127)
  * Better `uv` compatibility (pydata#10124)
  * explicitly cast the dtype of `where`'s condition parameter to `bool` (pydata#10087)
  * Use `to_numpy` in time decoding (pydata#10081)
  * Pin pandas stubs (pydata#10119)
  * Fix broken Zarr test (pydata#10109)
  * Update asv badge url in README.md (pydata#10113)
  * fix and supress some test warnings (pydata#10104)
  * ...
Adds open_datatree and load_datatree to the tutorial module
whats-new.rst
api.rst