
Use SST_MPI_Comm_spawn_multiple in Ariel PIN backend #2638

Open

plavin wants to merge 6 commits into sstsimulator:master from plavin:ariel-fixup

Conversation

@plavin
Contributor

@plavin plavin commented Mar 4, 2026

This PR updates Ariel to use MPI_Comm_spawn_multiple for launching applications when the SST core is compiled with MPI support. This simplifies the build system, as the only parts of Ariel that now depend on an MPI compiler are the test applications. It also makes launching applications more robust, since MPI applications are not supposed to call fork. Fixes #2624.

This PR also adds new functionality to the Ariel API. Two new functions are added:

void ariel_output_stats_begin_region(const char *name);
void ariel_output_stats_end_region(const char *name);

These will each cause a message to be output to stdout, with the region name and the current simulation timestamp. This can be used to correlate stat dumps with locations in the source app.

Justification for changes to fesimple.cc

The file ariel/frontend/pin3/fesimple.cc required extensive changes. One tricky part of tracing MPI applications with PIN is that the MPI library will typically spawn its own threads when MPI_Init is called. This causes two issues: (1) the program has more threads than the user specified in their Ariel config, so the shared-memory tunnel is not big enough, and (2) if MPI_Init is called before all of the application threads are launched, the MPI threads will receive lower IDs than the application threads, which makes ignoring them harder.

The first solution, which this PR removes, was to place an OpenMP parallel region before MPI_Init so that the application threads would always be numbered 0..N-1. This only works for OpenMP programs, however, and meant that the Ariel API had to be compiled with -fopenmp.

The new approach is to detect MPI threads by checking whether libmpi.so appears in their call stack. This works, but we now have to maintain a map of the thread IDs (typically [0,3,4,...,N+1] -> [0,...,N-1]) and consult it on every write to the tunnel, since PIN won't let us change how it numbers threads. In a future update, I hope to move this functionality into a class that wraps the tunnel so that the map can be queried in a single location instead of all over fesimple.cc.

Known Issues

  • There is a non-deterministic bug that will sometimes cause messages sent across the tunnel to appear in the wrong order to the arielcore. This will lead to errors such as FATAL: ArielComponent[arielcore.cc:486:refillQueue] Error: Ariel did not understand command (128) provided during instruction queue refill. or it may cause the program to hang indefinitely. This error seems to mostly affect test_Ariel_test_ivb_pin.
  • Calling MPI_Comm_spawn_multiple will cause an error if there are not enough slots in the allocation to run all the SST ranks and the application ranks in their own slots. To remedy this, we set OMPI_MCA_rmaps_base_oversubscribe=1 in the MPI testsuite file.
  • The EPA backend needs to be updated to use SST_MPI_Comm_spawn_multiple instead of fork. Trying to launch an MPI app with the EPA backend will cause an error.
  • The remap functionality in fesimple.cc may break if more application threads are launched than the corecount parameter passed to the ariel.ariel component.
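The oversubscription workaround mentioned above amounts to one Open MPI environment setting; in the MPI testsuite file it would look like this (shown here standalone as a sketch):

```shell
# Allow Open MPI to oversubscribe slots so that the SST ranks plus the
# application ranks spawned via MPI_Comm_spawn_multiple can all be placed
# even when the allocation has fewer slots than total ranks.
export OMPI_MCA_rmaps_base_oversubscribe=1
```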

@gvoskuilen
Contributor

This PR needs to be retargeted at the devel branch



Development

Successfully merging this pull request may close these issues.

Error "arielapi.c: MPI_Init called in arielapi.c but this file was compiled without MPI." although Ariel was compiled with MPI
