Consider pulling real data from cloud in tutorials with zarr #662

Open
@nicholasloveday

Some of the tutorials use real data. These don't work well in Binder, because you first need to download the data and save it to disk.

An alternative approach would be to pull the data from the cloud into memory.

E.g.,

import pandas as pd
import xarray as xr
hres = xr.open_zarr('gs://weatherbench2/datasets/hres/2016-2022-0012-1440x721.zarr')
hres["2m_temperature"].sel(time="2020-01-01T00:00:00", prediction_timedelta=pd.Timedelta("1 days")).plot()

This takes about 1.5 seconds to pull the ECMWF forecast down from the cloud and plot it.
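To check how long the fetch takes in a given environment (Binder included), something like the sketch below might help. It reuses the store URL and selection from the snippet above. The helper names `timed` and `load_forecast_slice` are hypothetical, and the call to `.load()` is my assumption about how to force the selected field into memory without plotting it.

```python
import time

# Store URL taken from the example in this issue.
HRES_URL = "gs://weatherbench2/datasets/hres/2016-2022-0012-1440x721.zarr"


def timed(func, *args, **kwargs):
    """Call func and return (result, elapsed wall-clock seconds)."""
    start = time.perf_counter()
    result = func(*args, **kwargs)
    return result, time.perf_counter() - start


def load_forecast_slice(url=HRES_URL):
    """Open the store lazily, select one 2 m temperature field, load it into memory.

    Hypothetical helper. Needs xarray, pandas, zarr and gcsfs plus network
    access; the imports are local so that defining the function needs neither.
    """
    import pandas as pd
    import xarray as xr

    hres = xr.open_zarr(url)  # lazy: only metadata is read at this point
    field = hres["2m_temperature"].sel(
        time="2020-01-01T00:00:00",
        prediction_timedelta=pd.Timedelta("1 days"),
    )
    return field.load()  # assumption: .load() pulls just the selected field
```

Calling `field, seconds = timed(load_forecast_slice)` in a notebook cell would then report the download time for that single field. The result will depend on the network, so the 1.5 seconds quoted above should be treated as indicative only.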

The dependencies that would need to be added for the tutorials are zarr and gcsfs.
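A tutorial notebook could start with a quick check that these packages are importable, so that readers get a clear message instead of a backend error from `open_zarr`. This is only a sketch: `missing_packages` and `REQUIRED` are hypothetical names, and the list assumes the tutorial also uses xarray and pandas as in the example above.

```python
import importlib.util

# Packages the cloud-backed example relies on (zarr and gcsfs are the new ones).
REQUIRED = ("xarray", "pandas", "zarr", "gcsfs")


def missing_packages(names=REQUIRED):
    """Return the top-level package names that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]
```

In a notebook, `missing_packages()` returning a non-empty list would be the cue to tell the reader which packages to install before running the remaining cells.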

There are also reanalysis data and data-driven models that can be pulled down from the cloud (see https://weatherbench2.readthedocs.io/en/latest/data-guide.html).

You can get the data on the same grid, which makes verification with scores super easy!

Something to discuss

Labels

enhancement: New feature or request
