Skip to content

Conversation

@dssgabriel
Copy link
Collaborator

Description

This PR is a significant rework of the CMake build system, primarily in preparation for a follow-up PR that will introduce CMake presets to streamline the development workflow.

Exhaustive changes list

  • Bump CMake minimum version to 3.25 (in preparation for CMake workflow presets)
  • Make MPI a required dependency by default
    Since the NCCL backend also requires MPI for setup, we must require MPI by default
  • Removed all-caps CMake variables overloading existing CMake options with different capitalization
    Align with CMake best practices and clearly separate usage:
    • Capitalized KokkosComm_ variables are reserved for CMake
    • All-caps KOKKOSCOMM_ variables are reserved for source code and pre-processor
  • Renamed options that build additional targets to use BUILD_ in their name
  • Ensure that all options enabling a feature/behavior use ENABLE_ in their name
  • Move the KokkosComm library target declaration and install command to the top-level CML
    • Also removed named file sets since KC is header-only, thus all headers must be installed regardless
  • Reworked packaging logic to align with CMake best practices:
    • Enable installing the CMake config-file package using an option
      • ON if the project is built as top-level, OFF for projects consuming KokkosComm
    • Removed the configure-time CMake-generated KokkosComm_config.hpp header
      • Clearer separation of concerns: CMake defines all the necessary variables at configure-time; this is not delegated to a generated header that must be included everywhere
      • Let CMake define all necessary PP variables instead
    • Enforce ExactVersion compatibility instead of SameMajorVersion
      • Since KC is still in development and pre-v1.0, the API is not stable from version to version
  • Removed all compile flags applied to the KokkosComm target
    • Rationale: because KokkosComm is header-only, it is an INTERFACE library in CMake jargon. This means that any targets that link against KokkosComm would inherit the compile flags/definitions/features/etc that we had set here. This is often undesirable and considered bad practice.
    • Note: since KokkosComm is a C++20 project and relies on some of its additions, we still enforce the cxx_std_20 feature (w/o GNU extensions) so that targets linking against KokkosComm inherit it.
  • Update docs with new CMake options
  • Update project description & bump KokkosComm version to 0.3.0 to reflect:

This will let us provide workflow presets (added in CMake 3.25) in a
follow-up PR.
Removed all-caps CMake vars that "overloaded" existing CMake options.
This aligns w/ CMake best practices and lets us have a clear separation
of usage. Proposed usage rules are:
- Capitalized `KokkosComm` names reserved for CMake;
- All-caps `KOKKOSCOMM` names reserved for source code (i.e. PP vars).

Additionnally, rename options that _build_ additional targets to use
`_BUILD_` instead of `_ENABLE_` (again, consistency w/ standard CMake
best practices).
Also ensured that all options that _enable_ some feature/behavior have
the word `_ENABLE_` in them.
If (ever) KokkosComm becomes a non-header-only library, this should
minimize code movement because we'll just be able to add more
`add_subdirectory` commands without having to worry about the KokkosComm
target being declared in a nested CML.
Since KokkosComm is a header-only library, even implementation details
headers must be installed on the system. Hence, there is no need for
named file sets as all of them will be installed anyway.
Rationale: because KokkosComm is header-only, it is an `INTERFACE`
library in CMake jargon. This means that any targets that link against
KokkosComm would inherit the compile flags/definitions/features/etc that
we had set here. This is often undesirable, and considered bad practice.
In a follow-up PR, we will provide CMake configure/build/test/workflow
presets to allow developpers to run custom builds of KokkosComm for,
e.g., debugging, testing, benchmarking or packaging/distributing
purposes.

Note: since KokkosComm is a C++20 project and relies on some of its
additions, we still enforce the `cxx_std_20` feature (w/o GNU
extensions) so that targets linking against KokkosComm inherit it.
CMake should be responsible for doing all the PP vars definitions,
rather than relying on a configure-time generated header. This
simplifies library packaging and sets a clear separation of concerns.
Rationale: MPI is required for running NCCL unit tests because it is
used for setting up the environment and NCCL objects/communicators.
Thus, we must require it even when the MPI backend isn't enabled.
@dssgabriel dssgabriel self-assigned this Oct 25, 2025
@dssgabriel dssgabriel added E-help-wanted Call for participation: help is requested and/or extra attention is needed A-core Area: KokkosComm core library implementation T-build Topic: something related to the build system C-maintenance Category: a PR that cleans something up labels Oct 25, 2025
@dssgabriel dssgabriel force-pushed the refactor/cmake-rework branch 3 times, most recently from 22859b1 to a0c3e6e Compare October 25, 2025 20:39
@dssgabriel dssgabriel force-pushed the refactor/cmake-rework branch from a0c3e6e to 4932028 Compare October 25, 2025 20:41
Copy link
Member

@cedricchevalier19 cedricchevalier19 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks Gabriel, I think it is quite an improvement.

If you have some arguments for removing the header config file, I am listening :).

CMakeLists.txt Outdated
Comment on lines 18 to 22
option(
KokkosComm_INSTALL_CMAKE_PACKAGE
"Install a CMake config-file package. Default: ${PROJECT_IS_TOP_LEVEL}. Values: { ON, OFF }."
${PROJECT_IS_TOP_LEVEL}
)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you describe a use case for that?
I can understand why one can add Kokkos_Comm directly into their project to ease development but do we want them to install their software like this?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is aimed at improving the workflow for users who install KokkosComm on their system (i.e., what we test in the linux-install CI workflow).

Producing a CMake config-file package and installing it in a directory that is in $PATH will let CMake's find_package() automatically find KokkosComm without having to pass -DKokkosComm_ROOT on the CMake configuration command.
Additionally, this is the recommended, standard way of packaging libraries in modern CMake (see, e.g., this talk from a CMake developer).

# Generate package version-file
write_basic_package_version_file(
${PROJECT_BINARY_DIR}/KokkosCommVersion.cmake
COMPATIBILITY ExactVersion
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why this change?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function generates (in part) the CMake config-file package. The COMPATIBILITY parameter controls the version compatibility check when CMake calls find_package.
Since KokkosComm is still in active development and pre-v1.0, the API is not stable from version to version and may introduce breaking changes; hence, the change to an exact version compatibility requirement.

Comment on lines +90 to +107
# Definitions
if(KokkosComm_ENABLE_MPI)
target_compile_definitions(KokkosComm INTERFACE KOKKOSCOMM_ENABLE_MPI)
if(KokkosComm_IMPL_MPI_IS_MPICH)
target_compile_definitions(KokkosComm INTERFACE KOKKOSCOMM_IMPL_MPI_IS_MPICH)
elseif(KokkosComm_IMPL_MPI_IS_OPENMPI)
target_compile_definitions(KokkosComm INTERFACE KOKKOSCOMM_IMPL_MPI_IS_OPENMPI)
endif()
if(KokkosComm_IMPL_MPI_HAS_MPI_EXT_H)
target_compile_definitions(KokkosComm INTERFACE KOKKOSCOMM_IMPL_MPIEXT_H)
endif()
endif()
if(KokkosComm_ENABLE_NCCL)
target_compile_definitions(KokkosComm INTERFACE KOKKOSCOMM_ENABLE_NCCL)
endif()
if(KokkosComm_ENABLE_ABORT_ON_ERROR)
target_compile_definitions(KokkosComm INTERFACE KOKKOSCOMM_ABORT_ON_ERROR)
endif()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a personal opinion, but I think having a KokkosCommConfig.hfile with all this logic is easier to follow. It can also help to check if the configuration is correct, without having to compile something.

I think having KokkosComm_IMPL_MPI_HAS_MPI_EXT_H defined from CMake is better than the test in the header file we previously had.

if (0 == rank) {
std::cerr << argv[0] << " (KokkosComm " << KOKKOSCOMM_VERSION_MAJOR << "." << KOKKOSCOMM_VERSION_MINOR << "."
<< KOKKOSCOMM_VERSION_PATCH << ")\n";
std::cerr << argv[0] << "\n";
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is convenient to have the version of KokkosComm printed when you want to monitor a CI. Ideally having the git hash can be great too!

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can be retrieved through CMake and printed in the CI output, so I am unsure why it is necessary to have it available in the executable? Reliable and robust library packaging using CMake config-file packages is the current best solution for exposing such information, IMHO.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it is just more convenient to have just one log file to read. It also allows to test if these macros are properly defined.

@dssgabriel
Copy link
Collaborator Author

I am not sure I understand why the MPICH-Debug job fails...
The failing test is mpi/test_sendrecv: all contiguous tests are skipped because there is only one rank (but the program is correctly launched with 2 processes, so that's already weird), and non-contiguous tests directly fail (probably because we don't check that there are two ranks).

I do not understand why there is only one rank since we run the test with:

/usr/bin/mpiexec "-n" "2" "./test-sendrecv"

And why do all other tests pass correctly, test_isendirecv included?

Signed-off-by: Gabriel Dos Santos <[email protected]>
@dssgabriel dssgabriel force-pushed the refactor/cmake-rework branch from fb82dff to c988de5 Compare November 3, 2025 10:32
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

A-core Area: KokkosComm core library implementation C-maintenance Category: a PR that cleans something up E-help-wanted Call for participation: help is requested and/or extra attention is needed T-build Topic: something related to the build system

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants