Set subfiling vfd default enabled=ON #5518

Open: lrknox wants to merge 3 commits into develop

Collaborator

@lrknox lrknox commented May 6, 2025

Sets the Subfiling VFD to be enabled by default and makes the option available only when parallel is enabled and the platform is not Win32.

Also turned on sorting for brief lists in doxygen files.

lrknox added 2 commits May 6, 2025 10:38
Subfiling VFD option appears only when parallel is enabled and not on Windows.
Turned on sorting for brief lists in doxygen pages.
if (HDF5_ENABLE_SUBFILING_VFD)
if (WIN32)
Collaborator

I believe this check is still needed. cmake_dependent_option just hides the option, but someone could still set it on Win32.

Contributor

@byrnHDF byrnHDF May 6, 2025

I'd like to know for sure - as I believe the documentation states:
"condition": A boolean expression involving other CMake options. It must evaluate to true or false.
Of course, NOT WIN32 might not evaluate correctly, in which case we would need to wrap the whole block in if (NOT WIN32).

Collaborator

What I mean is that CMake hides the option, but I don't think that means it's actually gone. I think it's similar to just being marked with mark_as_advanced().

Contributor

Yes, it hides the option, but when the condition is false it also forces the option to the fallback value.

Collaborator Author

I ran "ccmake -DMPIEXEC_PREFLAGS:STRING="-n;6" -DHDF5_ENABLE_SUBFILING_VFD=ON ../lrk-hdf5", then configured without any other changes. The Subfiling VFD option was missing, and libhdf5.settings reported Subfiling VFD OFF.
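
The behavior discussed in this thread can be checked in isolation. The following is a minimal sketch, assuming a throwaway project (the project name DependentOptionDemo is hypothetical, and the option names are simply borrowed from this PR). It uses the semicolon-separated condition form so that it also works with CMake 3.18. Per the CMakeDependentOption documentation, when the condition is false the option is hidden from the GUI and its value is set to the fallback (the last argument), while a user-supplied value is kept internally for when the condition becomes true again.

```cmake
# Hypothetical standalone project, not part of the HDF5 build files.
cmake_minimum_required (VERSION 3.18)
project (DependentOptionDemo NONE)

include (CMakeDependentOption)

# Stand-in for the real HDF5 parallel option; defaults to OFF here.
option (HDF5_ENABLE_PARALLEL "Enable parallel build" OFF)

# Default ON when both conditions hold; forced to OFF (last argument) otherwise.
cmake_dependent_option (HDF5_ENABLE_SUBFILING_VFD "Build Parallel HDF5 Subfiling VFD" ON "HDF5_ENABLE_PARALLEL;NOT WIN32" OFF)

# Report the value the rest of the build would actually see.
message (STATUS "HDF5_ENABLE_SUBFILING_VFD = ${HDF5_ENABLE_SUBFILING_VFD}")
```

Configuring this sketch with -DHDF5_ENABLE_SUBFILING_VFD=ON while HDF5_ENABLE_PARALLEL is OFF should report OFF, which would be consistent with the ccmake result described above.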

@jhendersonHDF
Collaborator

The last I tried to do this, nvhpc had issues due to MPI_Init_thread() failing, so I guess we'll see if that's still the case.

CMakeLists.txt Outdated
@@ -835,32 +836,18 @@ set (HDF5_SRC_INCLUDE_DIRS
${HDF5_SRC_INCLUDE_DIRS}
${H5FD_SUBFILING_DIR}
)
option (HDF5_ENABLE_SUBFILING_VFD "Build Parallel HDF5 Subfiling VFD" OFF)
cmake_dependent_option(HDF5_ENABLE_SUBFILING_VFD "Build Parallel HDF5 Subfiling VFD" ON "HDF5_ENABLE_PARALLEL AND NOT WIN32" OFF)
Collaborator

Also, it looks like this syntax for the condition is only supported as of CMake 3.22 and otherwise has to be a semicolon-separated list.

Contributor

Since we are still on CMake 3.18, we need to use the semicolon-separated form.

Collaborator Author

I will fix this.
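
As a sketch of the fix being discussed, the same condition can be written in the semicolon-separated form that older CMake versions accept. This is an illustration only and may differ from what the PR finally commits. The full condition syntax inside a single string is tied to policy CMP0127, introduced in CMake 3.22.

```cmake
include (CMakeDependentOption)

# CMake 3.22 and later (policy CMP0127 set to NEW): full condition syntax.
# cmake_dependent_option (HDF5_ENABLE_SUBFILING_VFD "Build Parallel HDF5 Subfiling VFD" ON "HDF5_ENABLE_PARALLEL AND NOT WIN32" OFF)

# Works with CMake 3.18: each list element is checked separately,
# and all of them must be true for the option to be available.
cmake_dependent_option (HDF5_ENABLE_SUBFILING_VFD "Build Parallel HDF5 Subfiling VFD" ON "HDF5_ENABLE_PARALLEL;NOT WIN32" OFF)
```

The semicolon form behaves like an implicit AND across the list elements, so it expresses the same intent as the single-string condition in the diff.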

@lrknox
Collaborator Author

lrknox commented May 7, 2025

> The last I tried to do this, nvhpc had issues due to MPI_Init_thread() failing, so I guess we'll see if that's still the case.

Yes:

*** An error occurred in MPI_Init_thread
2155: *** on a NULL communicator
Test #2155: H5WATCH_ARGS-h5watch-w-err-dset2 ...........................................***Failed

1 of 110 tests failed. Also with this error message:

[fv-az2211-563:37654] [[INVALID],INVALID] ORTE_ERROR_LOG: A system-required
2162: executable either could not be found or was not executable by this user in
2162: file ../../../../../orte/mca/ess/singleton/ess_singleton_module.c at line
2162: 172

@jhendersonHDF
Collaborator

> Yes, *** An error occurred in MPI_Init_thread
> 2155: *** on a NULL communicator
> Test #2155: H5WATCH_ARGS-h5watch-w-err-dset2 ...........................................***Failed
> 1 of 110 failed tests. Also with this error message:
> [fv-az2211-563:37654] [[INVALID],INVALID] ORTE_ERROR_LOG: A system-required
> 2162: executable either could not be found or was not executable by this user in
> 2162: file ../../../../../orte/mca/ess/singleton/ess_singleton_module.c at line
> 2162: 172

Yep, this is related to the description I added for #5415. One issue is that the tools VFD fallback mechanism attempts to initialize and use the Subfiling VFD for tests where the file can't be opened with the sec2 VFD (the file doesn't exist, it's in a different VFD's format, etc.). That normally wouldn't be a problem, except that NVHPC's MPI seems (I believe) to have issues when NULL is passed for the arguments to MPI_Init_thread. Removing the VFD fallback mechanism could "fix" this, but it's really just avoiding the problem.

There's also the question of testing. Subfiling is tested in the daily VFD build workflow currently, but we only run the subfiling-specific tests and none of the tools tests, for example, so is that enough testing?

The last issue is that turning on Subfiling by default means threads become a default requirement for parallel builds. Probably not really a big issue, but something to consider.
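
To illustrate that last point, here is a hypothetical sketch (not taken from the HDF5 build files) of how a build could make the thread dependency explicit when the Subfiling VFD is enabled. It assumes at least one language such as C is already enabled in the project, which the standard FindThreads module needs.

```cmake
# Hypothetical sketch, not the actual HDF5 logic.
if (HDF5_ENABLE_PARALLEL AND HDF5_ENABLE_SUBFILING_VFD)
  # FindThreads is a standard CMake module; REQUIRED stops configuration
  # with an error if no usable thread library is found.
  find_package (Threads REQUIRED)
  message (STATUS "Subfiling VFD enabled; thread support found")
endif ()
```

With the option defaulting to ON, a check like this would run for every parallel build unless the user explicitly turns the Subfiling VFD off.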

@lrknox lrknox added the Component - Parallel, Component - Build, and Priority - 1. High labels May 9, 2025