Description
Describe the bug
If the JSON backend is used and the calling code makes N calls to storeChunk() and then calls flush(), the produced JSON file is deleted and rewritten N times. Expected behavior is one JSON serialization per flush() call, regardless of the number of storeChunk() calls.
To Reproduce
Build & run WarpX against the laser accelerator example with default build options
- Get WarpX source
- Compile with default configuration (description here)
- Run WarpX against the 3D laser accelerator example: warpx.3d inputs_test_3d_laser_acceleration
- Observe that it takes 60 seconds (or longer) to generate a ~50MB JSON file with ~2 million doubles
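As a rough illustration of why the run is so slow, the total volume written can be bounded with simple arithmetic. The sketch below assumes (this is not measured) that every rewrite emits the full final file; totalBytesWritten is a hypothetical helper, not part of openPMD-api or WarpX.

```cpp
#include <cassert>
#include <cstdint>

// Upper bound on bytes written if each of `rewrites` rewrites emits the
// whole file of `fileBytes` bytes. Assumption: the file is already at its
// final size for every rewrite, so this overestimates if the file grows.
constexpr std::uint64_t totalBytesWritten(std::uint64_t rewrites,
                                          std::uint64_t fileBytes) {
    return rewrites * fileBytes;
}
```

With roughly 1000 rewrites of a ~50MB file, this bound is about 50GB of serialized output, compared with ~50MB for a single serialization per flush().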
Expected behavior
Expected behavior is one JSON serialization per flush() call, regardless of the number of storeChunk() calls.
Software Environment
- version of openPMD-api: 0.16.1
- installed openPMD-api via: WarpX cmake build system dependency
- operating system: macOS Sequoia 15.6.1
- machine: MacBook Pro 2024 / M4 Pro
- name and version of Python implementation: N/A
- version of HDF5: N/A
- version of ADIOS2: N/A
- name and version of MPI: N/A
Additional context
Removing the call to putJsonContents(file); in JSONIOHandlerImpl::writeDataset mostly solves the issue. The JSON file is still re-serialized about 8 times each time WarpX writes its output, but that is a significant improvement over the 1000+ times it is re-serialized currently. Adding a configuration option or flag for "only serialize to JSON on flush" would be acceptable for my use case.
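A minimal sketch of the "only serialize on flush" idea, written as a toy model rather than actual openPMD-api code. All names here (DeferredJsonBackend, flushToDisk, deferredSerializations) are hypothetical, and the "file" is just a string in memory. Writes only mark the in-memory document as dirty, and the full serialization happens once when the flush finishes.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Toy stand-in for a JSON backend that defers serialization (hypothetical).
struct DeferredJsonBackend {
    std::vector<double> values;      // in-memory document contents
    std::string file;                // stands in for the file on disk
    std::size_t serializations = 0;  // number of full rewrites performed
    bool dirty = false;              // true if memory is newer than the file

    // Update memory only; no serialization happens here.
    void writeDataset(std::vector<double> const& chunk) {
        values.insert(values.end(), chunk.begin(), chunk.end());
        dirty = true;
    }

    // Serialize the whole document once, and only if something changed.
    void flushToDisk() {
        if (!dirty) return;
        std::string out = "[";
        for (std::size_t i = 0; i < values.size(); ++i) {
            if (i != 0) out += ",";
            out += std::to_string(values[i]);
        }
        out += "]";
        file = std::move(out);
        ++serializations;
        dirty = false;
    }
};

// Number of full serializations after n chunk writes and two flushes.
std::size_t deferredSerializations(std::size_t n) {
    DeferredJsonBackend backend;
    for (std::size_t i = 0; i < n; ++i) {
        double const x = static_cast<double>(i);
        backend.writeDataset({x, 2.0 * x});
    }
    backend.flushToDisk();
    backend.flushToDisk();  // nothing new was written, so no rewrite
    return backend.serializations;
}
```

In this model, deferredSerializations(1000) performs a single rewrite, which matches the expected behavior described above. Whether a dirty flag fits the real JSONIOHandlerImpl is an open question; this only illustrates the requested semantics.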
Code flow:
- WarpX stores data in openPMD via storeChunk/storeChunkRaw calls (I think the main caller is here https://github.com/BLAST-WarpX/warpx/blob/development/Source/Diagnostics/WarpXOpenPMD.cpp#L942). For many simulations in default mode, the particle data is not all in a single contiguous buffer
- Each call to storeChunk creates a WRITE_DATASET IOTask
- WarpX calls openPMD flush()
- flush() iterates through the IOTasks and handles them one by one
- When handling WRITE_DATASET IO tasks, it eventually invokes JSONIOHandlerImpl::writeDataset
- Each call to JSONIOHandlerImpl::writeDataset re-serializes the entire JSON file by invoking putJsonContents
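The flow above can be sketched as a toy model that counts full-file rewrites. This is not openPMD-api code: ToySeries, ToyJsonBackend, serializeAll and serializationsForChunks are hypothetical names that only mirror the roles of the task queue, writeDataset and putJsonContents as described in this report.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Toy stand-in for the in-memory JSON document and the file on disk.
struct ToyJsonBackend {
    std::vector<double> values;      // all data stored so far
    std::string file;                // stands in for the file on disk
    std::size_t serializations = 0;  // number of full rewrites performed

    // Plays the role of putJsonContents: rewrite the whole file.
    void serializeAll() {
        std::string out = "[";
        for (std::size_t i = 0; i < values.size(); ++i) {
            if (i != 0) out += ",";
            out += std::to_string(values[i]);
        }
        out += "]";
        file = std::move(out);
        ++serializations;
    }

    // Plays the role of writeDataset: update memory, then rewrite everything.
    void writeDataset(std::vector<double> const& chunk) {
        values.insert(values.end(), chunk.begin(), chunk.end());
        serializeAll();
    }
};

// Toy task queue: storeChunk only enqueues, flush handles tasks one by one.
struct ToySeries {
    ToyJsonBackend backend;
    std::vector<std::vector<double>> pending;  // queued write tasks

    void storeChunk(std::vector<double> chunk) {
        pending.push_back(std::move(chunk));
    }

    void flush() {
        for (auto const& chunk : pending) backend.writeDataset(chunk);
        pending.clear();
    }
};

// Number of full serializations for n storeChunk calls and one flush.
std::size_t serializationsForChunks(std::size_t n) {
    ToySeries series;
    for (std::size_t i = 0; i < n; ++i) {
        double const x = static_cast<double>(i);
        series.storeChunk({x, 2.0 * x});
    }
    series.flush();
    return series.backend.serializations;
}
```

In this model a single flush() after N storeChunk() calls triggers N full rewrites, which is the behavior reported here.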