
Series Flush Type #484


Open — wants to merge 8 commits into dev from draft-flushType

Conversation


@ax3l ax3l commented Feb 19, 2019

Add a new potential data flush mode to Series.

Although we implement a more general model of data passing and accumulating, allowing a "directly synchronous" mode lets us target the typical h5py Python user who just wants to quickly explore small data sets without frustration ("why is my data all zeros?").

We definitely have a continued need for our generalized, deferred, asynchronous chunk load/store mechanisms for our applications in parallel I/O and upcoming staging (think: streaming) modes. Just let us enable them explicitly as an "advanced" feature (therefore switch the default to the simple DIRECT mode). This is purely a user-interface change in order to target a wider audience.

The new DIRECT mode should also help somewhat in debugging.
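The intended difference between the two modes can be sketched with a small mock. Note that `MockSeries`, `FlushType`, and the `storeChunk` signature below are simplified stand-ins for illustration, not the actual openPMD-api declarations:

```cpp
#include <functional>
#include <utility>
#include <vector>

// Hypothetical sketch of the two flush modes proposed in this PR.
enum class FlushType { DIRECT, DEFER };

class MockSeries {
public:
    explicit MockSeries(FlushType t) : m_flushType(t) {}

    // In DIRECT mode the operation runs immediately (synchronous, h5py-like);
    // in DEFER mode it is queued until flush() is called.
    void storeChunk(std::function<void()> op) {
        if (m_flushType == FlushType::DIRECT)
            op();                              // data hits the backend right away
        else
            m_queue.push_back(std::move(op));  // collected for a later flush()
    }

    void flush() {
        for (auto &op : m_queue)
            op();
        m_queue.clear();
    }

private:
    FlushType m_flushType;
    std::vector<std::function<void()>> m_queue;
};
```

In this sketch, a user who forgets to call `flush()` in DEFER mode would observe unwritten (or unread) data, which is exactly the "why is my data all zeros?" confusion that the DIRECT default is meant to avoid.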

Stylistic note: I think AccessType and FlushType might better be called ...Mode?

@franzpoeschel @C0nsultant what do you think? I hope that description is not too brief; otherwise, let's chat :-)
PRs to this branch that implement this mode, even partially, are very welcome.

To Do

@ax3l ax3l added the api: breaking (breaking API changes) and api: new (additions to the API) labels Feb 19, 2019
* Series::flush is called.
* @todo implement: if the shared pointers' `use_count` drops to one
* before Series::flush is called, the load will be skipped during
* Series::flush. (check if this works with our `shareRaw()`

this will become possible when #470 is implemented

* Series::flush is called.
* @todo implement: if the shared pointers' `use_count` drops to one
* before Series::flush is called, the store will be skipped during
* Series::flush. (check if this works with our `shareRaw()`

this will become possible when #470 is implemented
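The `@todo` in the quoted doc comment above can be sketched as follows. The `QueuedChunkOp` struct and `flushQueue` function are hypothetical illustrations of the idea, not openPMD-api internals, and as the comment notes, whether this interacts correctly with `shareRaw()` still needs checking:

```cpp
#include <memory>
#include <vector>

// Sketch: a queued chunk operation holding a buffer shared with the user.
// If the user has dropped their copy before flush, use_count() == 1
// (only the queue itself still references the buffer) and the
// load/store can be skipped.
struct QueuedChunkOp {
    std::shared_ptr<double> buffer;
};

// Returns how many queued operations were actually performed.
inline int flushQueue(std::vector<QueuedChunkOp> &queue) {
    int performed = 0;
    for (auto &op : queue) {
        if (op.buffer.use_count() <= 1)
            continue;    // user no longer holds the buffer: skip the I/O
        ++performed;     // stand-in for the real load/store call
    }
    queue.clear();
    return performed;
}
```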


ax3l commented Feb 22, 2019

@C0nsultant do you think the right location to store the flush mode would be with AbstractIOHandler?

@ax3l ax3l force-pushed the draft-flushType branch 4 times, most recently from adaf209 to b10925c Compare February 22, 2019 13:58
@@ -86,6 +86,7 @@ TEST_CASE( "hdf5_write_test", "[parallel][hdf5]" )
uint64_t mpi_size = static_cast<uint64_t>(mpi_s);
uint64_t mpi_rank = static_cast<uint64_t>(mpi_r);
Series o = Series("../samples/parallel_write.h5", AccessType::CREATE, MPI_COMM_WORLD);
o.setFlush(FlushType::DEFER); // @todo crashes in DIRECT flush mode

see bug #490

@@ -120,6 +123,7 @@ TEST_CASE( "hdf5_write_test_zero_extent", "[parallel][hdf5]" )
uint64_t size = static_cast<uint64_t>(mpi_s);
uint64_t rank = static_cast<uint64_t>(mpi_r);
Series o = Series("../samples/parallel_write_zero_extent.h5", AccessType::CREATE, MPI_COMM_WORLD);
o.setFlush(FlushType::DEFER); // @todo crashes in DIRECT flush mode

see bug #490


anokfireball commented Feb 22, 2019

@C0nsultant do you think the right location to store the flush mode would be with AbstractIOHandler?

There is (currently) a 1:1 mapping of Series to IOHandlers, but in theory this could become n:1 in the future (i.e., multiple versions of a Series in the same directory with the same access properties). One could think of scenarios where the different versions have different flush strategies, but I would not worry much about that case, as there are one too many `if`s involved to justify serious dedication.

Strictly speaking, in terms of separation of concerns, the I/O backend (and in this case, the handler) is a perfectly suitable place for it. The flush() calls in the front- and backend have slightly different intentions. The one you care about (actually moving bytes between RAM and NV storage, without traversing and syncing the whole frontend tree) is the one the backend is concerned with.
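Storing the mode on the handler could look roughly like this. The member and class shapes below are assumptions for illustration only (the real `AbstractIOHandler` and `Series` in openPMD-api differ):

```cpp
#include <memory>

// Sketch: keep the flush mode on the IO handler, so the backend (which
// performs the actual byte movement) can consult it directly.
enum class FlushType { DIRECT, DEFER };

struct AbstractIOHandler {
    // Hypothetical member; deferred chosen as the pre-change default here.
    FlushType flushType = FlushType::DEFER;
    virtual ~AbstractIOHandler() = default;
};

struct MockSeries {
    std::shared_ptr<AbstractIOHandler> ioHandler =
        std::make_shared<AbstractIOHandler>();

    // The frontend setter simply forwards to the handler, preserving the
    // separation of concerns: the frontend owns the user-facing API, the
    // backend owns the flush behavior.
    void setFlush(FlushType t) { ioHandler->flushType = t; }
};
```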


ax3l commented Feb 22, 2019

Thanks, sounds good.

I actually pushed a first working draft in this PR, for in-code comments/review :)

@ax3l ax3l changed the title [Draft] Data Flush Mode [WIP] Data Flush Mode Feb 22, 2019
@ax3l ax3l removed the discussion label Feb 22, 2019
@ax3l ax3l force-pushed the draft-flushType branch 4 times, most recently from af7c8b2 to 0ba666a Compare February 23, 2019 12:17
@ax3l ax3l changed the title [WIP] Data Flush Mode Data Flush Mode Feb 23, 2019
@ax3l ax3l force-pushed the draft-flushType branch 2 times, most recently from 6bf4c46 to c63f74b Compare February 23, 2019 12:21
@ax3l ax3l changed the title Data Flush Mode Data Flush Type Feb 23, 2019
@ax3l
Copy link
Member Author

ax3l commented Feb 23, 2019

@C0nsultant ready for review :)
Any idea why direct writing in 3_write_serial might be segfaulting? Do we need to flush some path creations? It seems the path creation in Record::flush_impl happens a bit too late, as does the parent/writable registration for new record components when they are not scalar.

@ax3l ax3l changed the title Data Flush Type Series Flush Type Feb 23, 2019
@ax3l ax3l mentioned this pull request Mar 9, 2019
@ax3l ax3l force-pushed the draft-flushType branch from a0f5e7b to 396c879 Compare March 9, 2019 15:12
@ax3l ax3l force-pushed the draft-flushType branch from 396c879 to fc99125 Compare March 9, 2019 22:20

ax3l commented May 13, 2019

@C0nsultant if you have the time, the flushing logic here and in #490 needs to be refactored by us to be more flexible with flushes triggered from a record component (instead of top-down from series → iteration → record → ...) :-)

@ax3l ax3l force-pushed the draft-flushType branch from fc99125 to 6983e8d Compare July 1, 2019 14:01
@ax3l ax3l force-pushed the draft-flushType branch from 6983e8d to 50f808b Compare August 28, 2019 17:38
@ax3l ax3l force-pushed the draft-flushType branch 2 times, most recently from 650b9c9 to d316e66 Compare November 5, 2019 00:52
@ax3l ax3l mentioned this pull request Nov 19, 2019
ax3l added 8 commits January 27, 2020 13:23
Add a new data flush mode to Series.

Although we implement a more general model of data passing
and accumulating, allowing a "directly synchronous" mode will
allow us to target the typical h5py Python user who just quickly
wants to explore small data sets without frustration.

We definitely have a continued need for our generalized, deferred,
asynchronous chunk load/store mechanisms for our applications in
parallel I/O and upcoming staging (think: streaming) modes. Just
let us enable them explicitly as an "advanced" feature.
Try to use deferred mode for most tests, but add some
direct ones as well.

In particular, with fixed defaults for particle position/positionOffset
creation, we should add direct-mode tests for CREATE of those as
well.
Document the new Series flush type and how to upgrade from
previous releases, as this is a breaking change.
dynamic casts can return nullptrs ;-)

  openPMD#471
Labels
api: breaking (breaking API changes), api: new (additions to the API), help wanted
3 participants