In the past PI we demonstrated how to create virtual icechunk stores for NASA data in NetCDF/HDF format. This improved performance and can potentially save large amounts of time and storage, because no data is duplicated.
But these advantages come with more friction, mostly because access to both the virtual store and the virtual chunk locations is complex. This task outlines steps to contribute to earthaccess to streamline this process for NASA datasets available in CMR.
While general 'magic' or 'behind the scenes' authentication to arbitrary cloud buckets is a very bad idea, we think it makes sense in the context of earthaccess, where we will always have a set of 'trusted' sources rather than an unbounded list. It should enhance the user experience and hopefully drive adoption of virtual Zarr within the NASA user community.
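To make the 'trusted sources' idea concrete, here is a minimal sketch of an allowlist check on virtual chunk locations. It is only an illustration under stated assumptions: the bucket names and both helper functions are hypothetical, and nothing here is part of the earthaccess or icechunk API.

```python
from urllib.parse import urlparse

# Hypothetical allowlist: placeholder bucket names, not real NASA buckets.
TRUSTED_BUCKETS = frozenset({"example-daac-protected", "example-daac-public"})


def is_trusted_chunk_location(url, trusted=TRUSTED_BUCKETS):
    """Return True only for s3:// URLs whose bucket is in the allowlist."""
    parsed = urlparse(url)
    return parsed.scheme == "s3" and parsed.netloc in trusted


def untrusted_locations(urls, trusted=TRUSTED_BUCKETS):
    """Collect chunk locations that should not receive automatic credentials."""
    return [u for u in urls if not is_trusted_chunk_location(u, trusted)]
```

The point of the design is that automatic authentication would only ever apply to a bounded, known set of buckets; any chunk location outside that set would be reported back to the user instead of being authenticated silently.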
Roadmap