Skip to content

ukb_atlas demo random activation is not MPI-global and may fail or over-stimulate #61

@hrehmaan

Description

@hrehmaan

Hello @finsberg

I noticed a possible MPI-related issue in the ukb_atlas.py demo for the random endocardial activation.

In the demo, the code selects a fixed number of local endocardial facets on each MPI rank:

num_points = 900

lv_endo_facets = geo.ffun.find(geo.markers["LV"][0])
rv_endo_facets = geo.ffun.find(geo.markers["RV"][0])
endo_facets = np.concatenate([lv_endo_facets, rv_endo_facets])
np.random.shuffle(endo_facets)
endo_facets_stim = endo_facets[:num_points]

midpoints = dolfinx.mesh.compute_midpoints(geo.mesh, 2, endo_facets_stim)

delays = np.random.uniform(0, activation_duration, num_points)
  1. Over-stimulation when every rank has enough local facets
    On a finer mesh, the demo may not crash because each rank has at least 900 local endocardial facets. However, in that case each rank selects 900 points independently.

For example, with 10 MPI ranks, I observed:

[rank 0] stim_points=900, delays=900
[rank 1] stim_points=900, delays=900
...
[rank 9] stim_points=900, delays=900

This means the simulation is using about 900 × number_of_ranks global stimulus points. With 10 ranks, this becomes approximately 9000 stimulus points, not 900 total.

This makes the MPI result different from the serial result and makes the total amount of stimulation depend on the number of MPI ranks

  1. Crash when a rank has fewer than num_points local facets

On a coarse mesh, or with enough MPI ranks, some ranks may have fewer than 900 local endocardial facets. In that case, midpoints has fewer entries than delays, because delays is always generated with length num_points.

For example, I observed:

[rank 0] stim_points=900, delays=900
[rank 1] stim_points=900, delays=900
...
[rank 9] stim_points=900, delays=900
rank 3] endo_facets=1426, stim_points=900, delays=900
Running model
Get steady states
[rank 2] endo_facets=876, stim_points=876, delays=900
Traceback (most recent call last):
  File "/root/shared/Fenics_work/base_run/test_ukb_atlas/ukb_atlas.py", line 263, in <module>
Running model
Get steady states
[rank 6] endo_facets=1478, stim_points=900, delays=900
Running model
Get steady states
[rank 0] endo_facets=1373, stim_points=900, delays=900
    stim_expr = beat.stimulation.generate_random_activation(
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/shared/github_contributions/fenicsx-beat/src/beat/stimulation.py", line 331, in generate_random_activation
    assert len(points) == len(delays), "Points and delays must have the same length"
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
AssertionError: Points and delays must have the same length
Running model
Get steady states
[rank 1] endo_facets=930, stim_points=900, delays=900
Running model
Get steady states
[rank 4] endo_facets=1459, stim_points=900, delays=900
Running model
Get steady states
[rank 5] endo_facets=1477, stim_points=900, delays=900

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions