Consider recommendations from recent talks from the HDF Group and community on optimizing HDF5 data in the cloud, separating array data from metadata, and integrating with Zarr. Some of these may be relevant as background for our paper.
- Cloud Ready HDF5 – Matt Larson, John Readey, and Aleksandar Jelenak, The HDF Group (HUG24) (August 7, 2024): https://www.youtube.com/watch?v=2Iqv-adMF-U
- Use larger chunk sizes (4-8 MB/chunk)
- Use CoHDF5
- They note that reading chunks that contain variable-length data can result in many small remote reads, which performs poorly
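
To make the chunk-size recommendation above concrete, here is a minimal sketch of picking a chunk shape that lands near a target size. The helper name `chunk_rows_for_target`, the example array shape, and the choice to chunk only along the first axis are assumptions for illustration, not something taken from the talk. It also assumes the 4-8 MB target refers to uncompressed chunk size.

```python
import math


def chunk_rows_for_target(row_shape, itemsize, target_bytes=4 * 1024**2):
    """Return how many leading-axis rows fit in one chunk of about target_bytes.

    row_shape: shape of a single row, i.e. the dataset shape without its first axis.
    itemsize: bytes per element (e.g. 4 for float32).
    Hypothetical helper: always returns at least 1 row, even if one row exceeds the target.
    """
    bytes_per_row = itemsize * math.prod(row_shape)
    return max(1, target_bytes // bytes_per_row)


# Example (assumed data layout): float32 frames of 512 x 512 stacked along axis 0.
rows = chunk_rows_for_target((512, 512), itemsize=4)
chunk_shape = (rows, 512, 512)
chunk_bytes = rows * 512 * 512 * 4

# With h5py, the resulting shape would be passed as the `chunks` argument, e.g.
#   f.create_dataset("data", shape=(n_frames, 512, 512), dtype="f4", chunks=chunk_shape)
```

Chunking along a single axis is only one option. The best shape depends on the expected read pattern, so this should be treated as a starting point for benchmarking rather than a rule.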
- "Cloud-Optimized HDF5 Files – Aleksandar Jelenak, The HDF Group #HUG23 (September 5, 2023)": https://www.youtube.com/watch?v=bDH59YTXpkc
- Guidance https://ntrs.nasa.gov/api/citations/20240008354/downloads/Cloud_Optimized.pdf
- "Benchmarking Cloud Optimized HDF5 files locally and on AWS S3 - Aleksandar Jelenak, Call the Doctor": https://www.youtube.com/watch?v=lKNS9H0GKKg
- Evaluating Cloud-Optimized HDF5 for NASA’s ICESat-2 Mission: https://nsidc.github.io/cloud-optimized-icesat2/
And some other interesting talks from recent HDF5 User Group conferences:
- Using a HDF5 File as a Zarr v3 Shard - Mark Kittisopikul (Howard Hughes Medical Institute) HUG25: https://www.youtube.com/watch?v=c4b_yfIeHJc
- Down to the bytes: can we simplify alternative access to HDF5? - Thomas Kluyver (European XFEL) HUG25: https://www.youtube.com/watch?v=EgtAiYslNGg
- Shiver Me Timbers: The Design of Sharded Storage for HDF5 - Quincey Koziol (NVIDIA) HUG25: https://www.youtube.com/watch?v=UfwoiOG3S9E
- Uncharted Territory – Exploring New Frontiers for HDF5 – Quincey Koziol, NVIDIA - HUG24: https://www.youtube.com/watch?v=dFxhxMOxYY0
- Blast-off: GPU Accelerated HDF5
- We’re Breaking Up! Disaggregated HDF5 Containers on Object Storage Systems
- Not for Spotify: Streaming HDF5 Data