Skip to content

Conversation

@rhaas80
Copy link
Member

@rhaas80 rhaas80 commented Aug 8, 2024

@rhaas80's pull request 34 https://bitbucket.org/eschnett/carpet/pull-requests/34

  • CarpetIOHDF5: keep outND_var files open until all data is written

    this reduces file system open/close calls from once per variable per reflevel per iteration of output to once per variable per iteration of output or less (if one_file_per_group or one_file_per_proc is used).

    It does however keep more files open which can be an issue if the OS limit on open files is too small. Usually this limit is per process though and as long as one process does not open many files, the whole simulation can still open very many.

  • CarpetIOHDF5: be more carefull closing all open hdf5 files

Copy link
Member Author

@rhaas80 rhaas80 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

transferred comments from bitbucket

if (CCTK_EQUALS(flush_to_disk, "immediate"))
error_count += CloseFile(cctkGH, file);

if (nioprocs > 1) {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@eschnett says

This would do the wrong thing if there are n processes, but only 1 I/O process. MPI_Allreduce for dist::comm() is superfluous only if nprocs==1.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not quite sure I follow. If (I quote) “is superfluous only if nprocs==1“ and the check is if (nioprocs > 1) is the condition used (and nioprocs >= 1) isn’t if(nioprocs > 1) the same as if(nioprocs == 1).
I suspect that for this IO thorn there is either nioprocs == 1 or nioprocs == nprocs. In the first case then the ioproc already has “good” values in io_files at least that seems to be the assumption in the place where I copied this code from, namely:
I would think that AllReduce can be safe if io_files and io_bytes are initialized to 0 on al ranks initially. Should have tests.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This code looks correct to me, or at least as correct as the it was when it was still part of the CloseFile function (which is where it was moved from).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants