A wishlist of features after the upcoming releases:
Near-Term
- Implement openPMD 2.0
- Custom hierarchies (see description in "Allow user to store non-openPMD information", openPMD-standard#115)
- New ADIOS2 schema with support for modifiable attributes (ADIOS2 schema 2022_07_26, based on ADIOS2 modifiable attributes, #1310)
- non-MPI parallel reads in DASK with ADIOS (Open file with in-memory metadata, ornladios/ADIOS2#3651)
- New ADIOS2 JoinedArray support for particles (Particles: Support Auto-Shape Counting, #1374; Joined arrays in ADIOS2, #1382; Added support for Joined Arrays in the BP4 format and engine, ornladios/ADIOS2#3466)
- Deprecation and later removal of `RecordComponent::SCALAR` (Remove necessity for RecordComponent::SCALAR, #1154)
- Full support for steps in ADIOS2, full support for variable-based iteration encoding. This implies:
- Default enabled steps
- Support random-accessing steps
- Skip duplicate iterations at read time
- Support reopening closed iterations (Support Reopen of Closed Iterations, #1606)
- Julia bindings (#1025)
- Better object model for default attributes (Scientific default values, #1439)
- Distributed initialization of a read-access dataset in ADIOS2 within non-MPI contexts (e.g. DASK; see Open file with in-memory metadata, ornladios/ADIOS2#3651)
- `openpmd-pipe`: modularize into visitor pattern
- `openpmd-pipe`: reuse in new CLI tools such as `openpmd-coarsen` (fields, particles; [WIP] Script: Filter & Copy Particles, #1390)
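The visitor-pattern item for `openpmd-pipe` could look roughly like the sketch below. It is not the `openpmd-pipe` implementation: a nested `dict` stands in for an openPMD series, and `Visitor`, `traverse`, `CopyVisitor` and `CoarsenVisitor` are hypothetical names chosen for illustration. The point is only that one traversal can be shared by a plain copy and by a coarsening tool.

```python
from typing import Any, Dict, List


class Visitor:
    """Hypothetical visitor interface: one hook per dataset (leaf)."""

    def visit_dataset(self, path: str, data: List[float]) -> None:
        raise NotImplementedError


def traverse(node: Dict[str, Any], visitor: Visitor, prefix: str = "") -> None:
    """Walk a nested dict (stand-in for a series) and dispatch on leaves."""
    for name, child in node.items():
        path = f"{prefix}/{name}"
        if isinstance(child, dict):
            traverse(child, visitor, path)
        else:
            visitor.visit_dataset(path, child)


class CopyVisitor(Visitor):
    """Copies every dataset unchanged, like a plain pipe."""

    def __init__(self) -> None:
        self.out: Dict[str, List[float]] = {}

    def visit_dataset(self, path: str, data: List[float]) -> None:
        self.out[path] = list(data)


class CoarsenVisitor(CopyVisitor):
    """Keeps every `stride`-th element; reuses the same traversal."""

    def __init__(self, stride: int) -> None:
        super().__init__()
        self.stride = stride

    def visit_dataset(self, path: str, data: List[float]) -> None:
        self.out[path] = list(data[:: self.stride])


# Toy hierarchy; names and values are made up for the example.
series = {
    "meshes": {"E": {"x": [0.0, 1.0, 2.0, 3.0]}},
    "particles": {"e": {"position": {"x": [0.5, 1.5]}}},
}

copy_visitor = CopyVisitor()
traverse(series, copy_visitor)

coarsen_visitor = CoarsenVisitor(stride=2)
traverse(series, coarsen_visitor)
```

A new CLI tool would then only need to supply its own visitor, while the traversal logic stays in one place.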
Mid-Term
- Support for joined arrays in backends other than ADIOS2
- `async for iteration in s.read_iterations()` for Python
- More flexible reads in C++17 (A Variant-Based LoadChunk, #1372) and Python
- MPI-wise logging of IO actions
- Performance optimization: Long-running simulations (many iterations, reading and writing)
- Specify default attributes upon closing rather than upon construction; clean up the logic for specifying defaults, constructors and destructors of our object model
- Context: crashing simulations
- If a standard attribute is missing on read, warn and add a reasonable default (e.g., a 3-value `axisLabels` for a 3D mesh)
- Support for PIConGPU-style dataset-specific JSON/TOML configuration (https://picongpu.readthedocs.io/en/0.6.0/usage/plugins/openPMD.html#cfg-file), also for iteration-specific configuration, e.g. for `InitialBufferSize` per file
- Maybe Flag for writing attributes only from rank 0
- Python docstrings (docstrings in python version / Python documentation, #1328)
- Maybe SoA <-> AoS flexibility (affects the standard)
- Probably requires struct-type fields
- Project structure: Separate MPI headers from serial headers
- With this change: Provide openPMD-api via Linux package managers
- Compression and plugins in HDF5
- HDF5 hardlinks, maybe with softlinks as a fallback (Generalize Record Definition, openPMD-standard#283)
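The "warn and add a reasonable default" behavior for missing standard attributes, listed above, might be sketched as follows. This is a minimal illustration on a plain `dict`, not openPMD-api code: the function name `fill_missing_axis_labels` and the chosen default labels are assumptions.

```python
import warnings
from typing import Any, Dict


def fill_missing_axis_labels(attributes: Dict[str, Any], ndim: int) -> Dict[str, Any]:
    """Return a copy of `attributes` with `axisLabels` defaulted if absent.

    The default labels ("x", "y", "z", then "dim3", ...) are an assumption
    made for this sketch.
    """
    result = dict(attributes)
    if "axisLabels" not in result:
        names = ["x", "y", "z"]
        default = [names[i] if i < 3 else f"dim{i}" for i in range(ndim)]
        warnings.warn(
            f"Missing standard attribute 'axisLabels'; using default {default}"
        )
        result["axisLabels"] = default
    return result


# A 3D mesh whose writer crashed before 'axisLabels' was written.
repaired = fill_missing_axis_labels({"geometry": "cartesian"}, ndim=3)
```

Returning a copy keeps the data as read from disk untouched, so a caller can still tell which attributes were actually present in the file.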
Long-Term / Ideas
- Synchronous mode: Avoid UB for store and load chunks
- In both C++ and Python, we would like to prevent the user from interacting with allocated but non-flushed data (UB).
- For this, we could rename `storeChunk`/`loadChunk` to `...Async()`, which returns a `std::future` (C++) or `asyncio.Future` (Python).
- We need to keep track of which of these in-flight objects we created and will set them to valid on `series.flush()`.
- If a future is awaited before flush was called, we throw a runtime exception, which allows recovery in interactive use. Futures also allow us to check whether they are valid without having to catch exceptions.
- The existing APIs would be sync.
- Maybe Use ADIOS2 group feature in reading (-> faster parsing)
- Maybe Chunk distribution algorithms
- Maybe Async I/O (especially Python)
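The synchronous-mode idea above (futures that become valid only on `series.flush()`) can be illustrated with a toy model. Everything here is hypothetical: `ChunkFuture`, `ToySeries` and `load_chunk_async` are illustration names, a `dict` of lists replaces real storage, and a hand-written future class is used instead of `asyncio.Future` to keep the sketch free of an event loop.

```python
from typing import Any, Dict, List, Tuple


class ChunkFuture:
    """Hypothetical handle to a chunk that is only readable after flush."""

    def __init__(self) -> None:
        self._valid = False
        self._value: Any = None

    def valid(self) -> bool:
        """Check readiness without having to catch an exception."""
        return self._valid

    def result(self) -> Any:
        if not self._valid:
            raise RuntimeError("chunk accessed before flush() was called")
        return self._value

    def _set(self, value: Any) -> None:
        self._value = value
        self._valid = True


class ToySeries:
    """Stand-in for a series: records pending loads, resolves them on flush."""

    def __init__(self, storage: Dict[str, List[int]]) -> None:
        self._storage = storage
        self._pending: List[Tuple[ChunkFuture, str, int, int]] = []

    def load_chunk_async(self, name: str, offset: int, extent: int) -> ChunkFuture:
        future = ChunkFuture()
        self._pending.append((future, name, offset, extent))  # track in-flight object
        return future

    def flush(self) -> None:
        for future, name, offset, extent in self._pending:
            future._set(self._storage[name][offset : offset + extent])
        self._pending.clear()


toy = ToySeries({"E_x": [1, 2, 3, 4, 5]})
chunk = toy.load_chunk_async("E_x", offset=1, extent=3)
ready_before_flush = chunk.valid()   # False: nothing has been read yet
toy.flush()
values = chunk.result()              # safe only after flush()
```

Calling `chunk.result()` before `toy.flush()` raises `RuntimeError` instead of exposing unflushed memory, which is the recoverable failure mode the list describes for interactive use.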