
aarch64 op: Enable two‑stage SVE detection in component configuration #13203


Closed
vogma wants to merge 6 commits from sve_op_refactoring

Conversation

vogma
Contributor

@vogma vogma commented Apr 22, 2025

This pull request refactors the build configuration for the Open MPI aarch64 op component, aligning it with the approach already used by the avx component.

Currently, the avx component configuration systematically tests SIMD instruction support (e.g., AVX2, AVX512) by incrementally applying compiler flags until compilation succeeds, independently of the host CPU's capabilities. In contrast, the existing aarch64 configuration lacks this mechanism: it performs only basic compilation checks and does not try additional flags or function attributes.

To address this gap, this pull request extends the aarch64 configuration with checks that use compiler function attributes (specifically, __attribute__((__target__("+sve")))). SVE code can then be compiled and included in Open MPI regardless of which compiler flags are passed explicitly on the command line, just as the avx component already does.
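The two-stage decision described above can be modelled in plain C. This is only a sketch of the logic, not the actual m4 code from the PR: `probe_sve`, `compile_probe_fn`, and the three stand-in "compilers" are hypothetical names, and the assumption that the attribute test runs only when the default compile test fails is mine. The struct fields mirror the cache variables op_cv_sve_support and op_cv_sve_add_flags named in the commit message.

```c
#include <stdbool.h>

/* Hypothetical stand-in for "does the SVE test program compile?".
 * with_target_attr selects whether the test function carries
 * __attribute__((__target__("+sve"))). */
typedef bool (*compile_probe_fn)(bool with_target_attr);

/* Mirrors op_cv_sve_support / op_cv_sve_add_flags (assumed meaning). */
typedef struct {
    bool sve_support;   /* SVE code can be compiled at all */
    bool sve_add_flags; /* the target attribute is needed to do so */
} sve_probe_result;

/* Stage 1: default compile test. Stage 2: retry with the attribute. */
sve_probe_result probe_sve(compile_probe_fn try_compile)
{
    sve_probe_result r = { false, false };

    if (try_compile(false)) {
        r.sve_support = true;   /* SVE already enabled, e.g. via -march */
        return r;
    }
    if (try_compile(true)) {
        r.sve_support = true;   /* works only with the function attribute */
        r.sve_add_flags = true;
    }
    return r;
}

/* Three illustrative compilers for exercising the logic. */
bool compiler_needs_attr(bool with_attr)     { return with_attr; }
bool compiler_sve_by_default(bool with_attr) { (void)with_attr; return true; }
bool compiler_without_sve(bool with_attr)    { (void)with_attr; return false; }
```

Under these assumptions, the extra attribute is recorded only when the default test fails, so a build that already passes an SVE-enabling -march flag compiles unchanged.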

If compilation with the function attribute succeeds, the attribute is automatically inserted via a macro into the SVE function templates within op_aarch64_functions.c. Runtime detection of processor capabilities (NEON or SVE) remains unchanged. No APIs have changed; only the build system is modified, along with a macro definition in the source files.
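As a rough illustration of how a compile-time attribute macro and a runtime capability check fit together, consider the sketch below. It is not the code from op_aarch64_functions.c. The `demo_*` names are hypothetical, the loop bodies are plain C rather than the SVE code the component actually contains, OMPI_MCA_OP_SVE_EXTRA_FLAGS is assumed to be defined to 1 when the attribute probe succeeded, and the `getauxval(AT_HWCAP)` check is one common Linux approach, not necessarily the one Open MPI uses.

```c
#include <stddef.h>
#include <stdint.h>

#if defined(__linux__) && defined(__aarch64__)
#include <sys/auxv.h> /* getauxval, AT_HWCAP, HWCAP_SVE */
#endif

/* Assumption: configure defines this macro to 1 when the attribute probe
 * succeeded. Without it, SVE_ATTR expands to nothing. */
#if defined(__aarch64__) && defined(OMPI_MCA_OP_SVE_EXTRA_FLAGS) && OMPI_MCA_OP_SVE_EXTRA_FLAGS
#define SVE_ATTR __attribute__((__target__("+sve")))
#define DEMO_BUILT_WITH_SVE 1
#else
#define SVE_ATTR
#define DEMO_BUILT_WITH_SVE 0
#endif

/* Hypothetical helper: 1 if the running CPU advertises SVE, otherwise 0. */
int demo_cpu_has_sve(void)
{
#if defined(__linux__) && defined(__aarch64__) && defined(HWCAP_SVE)
    return (getauxval(AT_HWCAP) & HWCAP_SVE) != 0;
#else
    return 0;
#endif
}

/* Baseline variant, safe on every CPU. */
void demo_sum_i32_generic(const int32_t *in, int32_t *inout, size_t count)
{
    for (size_t i = 0; i < count; ++i) {
        inout[i] += in[i];
    }
}

/* Variant that the compiler may build with SVE instructions when
 * SVE_ATTR is non-empty; it must only run on SVE-capable CPUs. */
SVE_ATTR
void demo_sum_i32_sve(const int32_t *in, int32_t *inout, size_t count)
{
    for (size_t i = 0; i < count; ++i) {
        inout[i] += in[i];
    }
}

typedef void (*demo_sum_fn)(const int32_t *, int32_t *, size_t);

/* Pick the SVE variant only if it was built AND the CPU supports it. */
demo_sum_fn demo_select_sum_i32(void)
{
    if (DEMO_BUILT_WITH_SVE && demo_cpu_has_sve()) {
        return demo_sum_i32_sve;
    }
    return demo_sum_i32_generic;
}
```

The point of the split is that the attribute only decides what gets compiled. Whether the SVE variant is ever called still depends on the runtime check, so a binary built this way should stay usable on CPUs without SVE.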

There is a discussion that goes into more detail in the developer user group.

vogma added 2 commits April 21, 2025 14:16
- Introduce AC_CACHE_CHECK probes for ARM Scalable Vector Extension (SVE)
  using both a default compile test and a second test with __attribute__((__target__("+sve"))).
- Define variables op_cv_sve_support and op_cv_sve_add_flags
- Update AM_CONDITIONAL and AC_DEFINE to expose SVE support macros
  (OMPI_MCA_OP_HAVE_SVE, OMPI_MCA_OP_SVE_EXTRA_FLAGS).
- Extend final AS_IF to enable the component when either NEON or SVE is available.

Signed-off-by: Marco Vogel <[email protected]>
- Add a preprocessor guard around SVE-specific function attributes
- Encapsulate the +sve attribute behind OMPI_MCA_OP_SVE_EXTRA_FLAGS, ensuring
  that only builds which detected and enabled compiler SVE support will compile with
  SVE-targeted code paths.
- Simplify later code by using SVE_ATTR in function declarations instead of
  repeating the attribute clause.

Signed-off-by: Marco Vogel <[email protected]>

Hello! The Git Commit Checker CI bot found a few problems with this PR:

4355a6f: removed tabs in configuration.m4

  • check_signed_off: does not contain a valid Signed-off-by line

76528bf: remove unused configuration

  • check_signed_off: does not contain a valid Signed-off-by line

e9b166d: apply SVE_ATTR macro in C source for conditional +...

  • check_signed_off: does not contain a valid Signed-off-by line

Please fix these problems and, if necessary, force-push new commits back up to the PR branch. Thanks!

vogma and others added 4 commits April 22, 2025 11:59
- Ensures that SVE-specific attributes are only applied when OMPI_MCA_OP_SVE_EXTRA_FLAGS
  is set, avoiding illegal instructions on non-SVE builds

Signed-off-by: Marco Vogel <[email protected]>
Signed-off-by: Marco Vogel <[email protected]>
Turns out I forgot to run the PCVS checker when adding MPI_T event stubs
in PR open-mpi#13086 and missed a couple of event-related functions.  Also, it
looks like these were not included in PR open-mpi#8057.

With this patch, the PCVS MPI API checker passes for MPI 4.0 standard.

The PCVS MPI API checker is described here https://dl.acm.org/doi/abs/10.1145/3615318.3615329

Signed-off-by: Howard Pritchard <[email protected]>
@vogma vogma force-pushed the sve_op_refactoring branch from 7c531c4 to 0ff8517 Compare April 22, 2025 10:00
@vogma vogma closed this Apr 22, 2025