ITS_LIVE is a NASA MEaSURES project that derives glacier velocity data from different optical and radar satellites and provides a comprehensive record of glacier velocities for the whole planet. This article talks a bit about how the project works and some of the rationale behind its different datasets: https://nasa-jpl.github.io/itslive-ieee/
We are in the process of updating our processing pipeline, and now that IceChunk and VirtualiZarr exist, we are wondering how we could use these technologies to make our data production more efficient and to improve the data access patterns for researchers. At a high level, we produce millions of NetCDF granules from Landsat and Sentinel, we provide annual mosaics built from these granules, and we also create Zarr cubes for time series analyses.
I think we have 2 key areas where we would like to explore the use of VirtualiZarr and IceChunk. One is our Zarr cube production and the way we update the cubes, which I think we are not doing very efficiently at the moment. This comes down to how to efficiently update time-optimized Zarr stores. With spatially aligned chunking it is simpler, because an update is basically just placing a few new chunks on top of the Zarr store, but in our case the rechunking and update steps are causing some issues.
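To make the update problem a bit more concrete, here is a minimal back-of-the-envelope sketch in plain Python. The cube shape and chunk sizes are made-up illustrative numbers, not our real cube layout, and `append_cost` is a hypothetical helper, not part of Zarr, IceChunk or VirtualiZarr. It only counts how many chunks an append along time has to write, and how many existing chunks it has to rewrite.

```python
from math import ceil


def append_cost(shape, chunks, n_new):
    """Count chunks written when n_new time steps are appended along axis 0.

    shape and chunks are (time, y, x) tuples. All values used below are
    illustrative assumptions, not the real ITS_LIVE cube layout.
    """
    nt, ny, nx = shape
    ct, cy, cx = chunks

    # Number of chunks needed to tile one full spatial slab.
    spatial_chunks = ceil(ny / cy) * ceil(nx / cx)

    # Index of the time chunk that receives the first and last new step.
    first_time_chunk = nt // ct
    last_time_chunk = (nt + n_new - 1) // ct
    time_chunks = last_time_chunk - first_time_chunk + 1

    # If the current last time chunk is only partially filled, every
    # spatial chunk in that slab already exists and must be rewritten.
    rewritten = spatial_chunks if nt % ct != 0 else 0

    return {"written": time_chunks * spatial_chunks, "rewritten": rewritten}


cube_shape = (1000, 800, 800)  # (time, y, x), hypothetical

# Time-optimized: long chunks along time, tiny spatial tiles.
print(append_cost(cube_shape, (2000, 10, 10), n_new=10))

# Spatially aligned: one chunk per time step covering the whole footprint.
print(append_cost(cube_shape, (1, 800, 800), n_new=10))
```

With these assumed numbers, the time-optimized layout has to rewrite all 6400 existing chunks to add 10 time steps, while the spatially aligned layout just adds 10 new chunks and leaves the existing ones alone. That asymmetry is roughly the issue we are running into.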
The other area is improving access patterns, perhaps by creating virtual Zarr stores that point to the original NetCDF files for specific regions (almost the same as the data cubes, but built with IceChunk + VirtualiZarr). One of the reasons is that our cubes cover a small spatial footprint, and a lot of science use cases require a "balanced" chunking approach. For now we are not contemplating providing data cubes with multiple chunking schemes.
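As a rough illustration of what "balanced" means here, the sketch below counts how many chunks two typical requests would have to read under three chunking schemes. Again, the shape, chunk sizes and the `chunks_read` helper are hypothetical and only meant to show the trade-off. It assumes the requested window starts on a chunk boundary, which keeps the arithmetic simple.

```python
from math import ceil, prod


def chunks_read(chunks, window):
    """Chunks intersected by a window that starts on a chunk boundary.

    chunks and window are (time, y, x) tuples. Hypothetical helper for
    illustration only, not a Zarr or xarray function.
    """
    return prod(ceil(w / c) for w, c in zip(window, chunks))


# Illustrative layouts for a hypothetical (1000, 800, 800) cube.
layouts = {
    "time-optimized": (1000, 10, 10),
    "space-optimized": (1, 800, 800),
    "balanced": (100, 100, 100),
}

# Two common access patterns.
point_time_series = (1000, 1, 1)  # full time series at one pixel
single_time_map = (1, 800, 800)   # full map at one time step

for name, chunks in layouts.items():
    print(
        name,
        "time series:", chunks_read(chunks, point_time_series),
        "map:", chunks_read(chunks, single_time_map),
    )
```

Under these assumptions the time-optimized layout reads 1 chunk for a time series but 6400 for a map, the space-optimized layout is the reverse (1000 and 1), and the balanced layout sits in between (10 and 64). Neither extreme serves both kinds of users well, which is why the question of how to offer a balanced view without duplicating the data keeps coming up.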
I drew a diagram with some thoughts on what we are doing. If anyone is interested, please check it out and give some feedback.
ITS_LIVE schematics
I wonder if this could be a Pangeo post... thanks in advance for the feedback!
cc @mliukis @alex-s-gardner @jhkennedy @TomNicholas