Open
Description
Traditionally, the Python processes spawned by an mpicasa call split into roles when casampi.private.start_mpi is executed: rank 0 becomes the MPIclient, while all other ranks are placed in a holding pattern as MPIServers.
Builds using conda-forge mpi4py>=4.0 with openmpi=5.0.4 appear to alter this behavior: non-rank-0 processes no longer assume their server roles after the casampi.private.start_mpi call. In the output below, all eight ranks print the session banner and report is_mpi_enabled: False. I have confirmed that the mpi4py version bump alone triggers the issue, in both CASA 6.6.1 and 6.6.6.
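To narrow down which rank sees what, a small per-rank diagnostic may help. The sketch below is not part of casampi; the helper names `expected_role` and `format_report` are hypothetical, and the role labels simply mirror the description above. It relies only on standard mpi4py calls (`MPI.COMM_WORLD`, `MPI.get_vendor()`, `MPI.Query_thread()`), and it assumes it is launched under the same MPI launcher as the failing job.

```python
"""Per-rank diagnostic sketch (hypothetical helper, not part of casampi)."""


def expected_role(rank):
    """Role described in this issue: rank 0 is the client, others are servers."""
    return "MPIclient" if rank == 0 else f"MPIServer-{rank}"


def format_report(rank, size, mpi4py_version, vendor, thread_level):
    """Build a one-line summary of what this process sees."""
    return (
        f"rank {rank}/{size} expected={expected_role(rank)} "
        f"mpi4py={mpi4py_version} vendor={vendor} thread_level={thread_level}"
    )


if __name__ == "__main__":
    try:
        import mpi4py
        from mpi4py import MPI
    except ImportError:
        print("mpi4py is not importable in this environment")
    else:
        comm = MPI.COMM_WORLD
        print(
            format_report(
                comm.Get_rank(),
                comm.Get_size(),
                mpi4py.__version__,
                MPI.get_vendor(),     # e.g. ('Open MPI', (5, 0, 4))
                MPI.Query_thread(),   # thread support level granted at init
            )
        )
```

Comparing the per-rank lines from an mpi4py<4 environment with those from an mpi4py>=4.0 environment would show whether the ranks, world size, or thread level differ before casampi.private.start_mpi is even called.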
(pipe1669py38) rxue@xenon:~/Workspace/nvme/nrao/tickets/PIPE-1669/working$ casa6mpi_xvfb pipeline_flag -c ../scripts/test_working.py
====================== ALLOCATED NODES ======================
xenon: slots=1 max_slots=0 slots_inuse=0 state=UP
Flags: DAEMON_LAUNCHED:LOCATION_VERIFIED
aliases: xenon
=================================================================
====================== ALLOCATED NODES ======================
xenon: slots=18 max_slots=0 slots_inuse=0 state=UP
Flags: DAEMON_LAUNCHED:LOCATION_VERIFIED:SLOTS_GIVEN
aliases: xenon
=================================================================
====================== ALLOCATED NODES ======================
xenon: slots=18 max_slots=0 slots_inuse=0 state=UP
Flags: DAEMON_LAUNCHED:LOCATION_VERIFIED:SLOTS_GIVEN
aliases: xenon
=================================================================
======================== JOB MAP ========================
Data for JOB prterun-xenon-3573009@1 offset 0 Total slots allocated 18
Mapping policy: BYCORE:OVERSUBSCRIBE Ranking policy: FILL Binding policy: NONE
Cpu set: N/A PPR: N/A Cpus-per-rank: N/A Cpu Type: CORE
Data for node: xenon Num slots: 18 Max slots: 0 Num procs: 8
Process jobid: prterun-xenon-3573009@1 App: 0 Process rank: 0 Bound: N/A
Process jobid: prterun-xenon-3573009@1 App: 0 Process rank: 1 Bound: N/A
Process jobid: prterun-xenon-3573009@1 App: 0 Process rank: 2 Bound: N/A
Process jobid: prterun-xenon-3573009@1 App: 0 Process rank: 3 Bound: N/A
Process jobid: prterun-xenon-3573009@1 App: 0 Process rank: 4 Bound: N/A
Process jobid: prterun-xenon-3573009@1 App: 0 Process rank: 5 Bound: N/A
Process jobid: prterun-xenon-3573009@1 App: 0 Process rank: 6 Bound: N/A
Process jobid: prterun-xenon-3573009@1 App: 0 Process rank: 7 Bound: N/A
=============================================================
Using configuration file ~/.casa/config.py
Using configuration file ~/.casa/config.py
Using configuration file ~/.casa/config.py
Using configuration file ~/.casa/config.py
Using configuration file ~/.casa/config.py
Using configuration file ~/.casa/config.py
Using configuration file ~/.casa/config.py
Using configuration file ~/.casa/config.py
Using user-supplied startup.py at ~/.casa/startup.py
Using user-supplied startup.py at ~/.casa/startup.py
Using user-supplied startup.py at ~/.casa/startup.py
Using user-supplied startup.py at ~/.casa/startup.py
Using user-supplied startup.py at ~/.casa/startup.py
Using user-supplied startup.py at ~/.casa/startup.py
Using user-supplied startup.py at ~/.casa/startup.py
Using user-supplied startup.py at ~/.casa/startup.py
No event loop hook running.
No event loop hook running.
No event loop hook running.
No event loop hook running.
No event loop hook running.
No event loop hook running.
No event loop hook running.
No event loop hook running.
/zfs/nvme/Workspace/nrao/tickets/PIPE-1669/working
/zfs/nvme/Workspace/nrao/tickets/PIPE-1669/working
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Start a pipeline processing session:
casalog.num_cpus: 36
casalog.total_memory: 131798098
casalog.omp_num_threads: 1
is_mpi_enabled: False
/zfs/nvme/Workspace/nrao/tickets/PIPE-1669/working
/zfs/nvme/Workspace/nrao/tickets/PIPE-1669/working
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Start a pipeline processing session:
casalog.num_cpus: 36
casalog.total_memory: 131798098
casalog.omp_num_threads: 1
is_mpi_enabled: False
/zfs/nvme/Workspace/nrao/tickets/PIPE-1669/working
/zfs/nvme/Workspace/nrao/tickets/PIPE-1669/working
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Start a pipeline processing session:
casalog.num_cpus: 36
casalog.total_memory: 131798098
casalog.omp_num_threads: 1
is_mpi_enabled: False
/zfs/nvme/Workspace/nrao/tickets/PIPE-1669/working
/zfs/nvme/Workspace/nrao/tickets/PIPE-1669/working
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Start a pipeline processing session:
casalog.num_cpus: 36
casalog.total_memory: 131798098
casalog.omp_num_threads: 1
is_mpi_enabled: False
/zfs/nvme/Workspace/nrao/tickets/PIPE-1669/working
/zfs/nvme/Workspace/nrao/tickets/PIPE-1669/working
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Start a pipeline processing session:
casalog.num_cpus: 36
casalog.total_memory: 131798098
casalog.omp_num_threads: 1
is_mpi_enabled: False
/zfs/nvme/Workspace/nrao/tickets/PIPE-1669/working
/zfs/nvme/Workspace/nrao/tickets/PIPE-1669/working
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Start a pipeline processing session:
casalog.num_cpus: 36
casalog.total_memory: 131798098
casalog.omp_num_threads: 1
is_mpi_enabled: False
/zfs/nvme/Workspace/nrao/tickets/PIPE-1669/working
/zfs/nvme/Workspace/nrao/tickets/PIPE-1669/working
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Start a pipeline processing session:
casalog.num_cpus: 36
casalog.total_memory: 131798098
casalog.omp_num_threads: 1
is_mpi_enabled: False
/zfs/nvme/Workspace/nrao/tickets/PIPE-1669/working
/zfs/nvme/Workspace/nrao/tickets/PIPE-1669/working
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Start a pipeline processing session:
casalog.num_cpus: 36
casalog.total_memory: 131798098
casalog.omp_num_threads: 1
is_mpi_enabled: False
openmpi version: ('Open MPI', (5, 0, 4))
--------------------------------------------------------------------------------
openmpi version: ('Open MPI', (5, 0, 4))
--------------------------------------------------------------------------------
openmpi version: ('Open MPI', (5, 0, 4))
--------------------------------------------------------------------------------
openmpi version: ('Open MPI', (5, 0, 4))
--------------------------------------------------------------------------------
openmpi version: ('Open MPI', (5, 0, 4))
--------------------------------------------------------------------------------
openmpi version: ('Open MPI', (5, 0, 4))
--------------------------------------------------------------------------------
openmpi version: ('Open MPI', (5, 0, 4))
--------------------------------------------------------------------------------
openmpi version: ('Open MPI', (5, 0, 4))
--------------------------------------------------------------------------------
CASA 6.6.1.15 -- Common Astronomy Software Applications
CASA 6.6.1.15 -- Common Astronomy Software Applications
CASA 6.6.1.15 -- Common Astronomy Software Applications
CASA 6.6.1.15 -- Common Astronomy Software Applications
CASA 6.6.1.15 -- Common Astronomy Software Applications
CASA 6.6.1.15 -- Common Astronomy Software Applications
2024-10-12 20:16:52 INFO ::casa::MPIServer-1 Using configuration file ~/.casa/config.py
2024-10-12 20:16:52 INFO ::casa::MPIServer-5 Using configuration file ~/.casa/config.py
2024-10-12 20:16:52 INFO ::casa::MPIServer-5 Using user-supplied startup.py at ~/.casa/startup.py
2024-10-12 20:16:52 INFO ::casa::MPIServer-1 Using user-supplied startup.py at ~/.casa/startup.py
2024-10-12 20:16:52 INFO ::casa::MPIServer-1
2024-10-12 20:16:52 INFO ::casa::MPIServer-1 Checking Measures tables in data repository sub-directory /home/rxue/Workspace/nvme/nrao/casa_dist/casarundata/geodetic
2024-10-12 20:16:52 INFO ::casa Using configuration file ~/.casa/config.py
2024-10-12 20:16:52 INFO ::casa Using user-supplied startup.py at ~/.casa/startup.py
2024-10-12 20:16:52 INFO ::casa::MPIServer-2 Using configuration file ~/.casa/config.py
2024-10-12 20:16:52 INFO ::casa::MPIServer-2 Using user-supplied startup.py at ~/.casa/startup.py
2024-10-12 20:16:52 INFO ::casa::MPIServer-2
2024-10-12 20:16:52 INFO ::casa
2024-10-12 20:16:52 INFO ::casa::MPIServer-5
2024-10-12 20:16:52 INFO ::casa::MPIServer-2 Checking Measures tables in data repository sub-directory /home/rxue/Workspace/nvme/nrao/casa_dist/casarundata/geodetic
2024-10-12 20:16:52 INFO ::casa::MPIServer-5 Checking Measures tables in data repository sub-directory /home/rxue/Workspace/nvme/nrao/casa_dist/casarundata/geodetic
2024-10-12 20:16:52 INFO ::casa Checking Measures tables in data repository sub-directory /home/rxue/Workspace/nvme/nrao/casa_dist/casarundata/geodetic
CASA 6.6.1.15 -- Common Astronomy Software Applications
CASA 6.6.1.15 -- Common Astronomy Software Applications
2024-10-12 20:16:52 INFO ::casa::MPIServer-7 Using configuration file ~/.casa/config.py
2024-10-12 20:16:52 INFO ::casa::MPIServer-4 Using configuration file ~/.casa/config.py
2024-10-12 20:16:52 INFO ::casa::MPIServer-6 Using configuration file ~/.casa/config.py
2024-10-12 20:16:52 INFO ::casa::MPIServer-7 Using user-supplied startup.py at ~/.casa/startup.py
2024-10-12 20:16:52 INFO ::casa::MPIServer-3 Using configuration file ~/.casa/config.py
2024-10-12 20:16:52 INFO ::casa::MPIServer-4 Using user-supplied startup.py at ~/.casa/startup.py
2024-10-12 20:16:52 INFO ::casa::MPIServer-6 Using user-supplied startup.py at ~/.casa/startup.py
2024-10-12 20:16:52 INFO ::casa::MPIServer-3 Using user-supplied startup.py at ~/.casa/startup.py
2024-10-12 20:16:52 INFO ::casa::MPIServer-7
2024-10-12 20:16:52 INFO ::casa::MPIServer-7 Checking Measures tables in data repository sub-directory /home/rxue/Workspace/nvme/nrao/casa_dist/casarundata/geodetic
2024-10-12 20:16:52 INFO ::casa::MPIServer-3
2024-10-12 20:16:52 INFO ::casa::MPIServer-6
2024-10-12 20:16:52 INFO ::casa::MPIServer-4
2024-10-12 20:16:52 INFO ::casa::MPIServer-3 Checking Measures tables in data repository sub-directory /home/rxue/Workspace/nvme/nrao/casa_dist/casarundata/geodetic
2024-10-12 20:16:52 INFO ::casa::MPIServer-6 Checking Measures tables in data repository sub-directory /home/rxue/Workspace/nvme/nrao/casa_dist/casarundata/geodetic
2024-10-12 20:16:52 INFO ::casa::MPIServer-4 Checking Measures tables in data repository sub-directory /home/rxue/Workspace/nvme/nrao/casa_dist/casarundata/geodetic
2024-10-12 20:16:52 INFO ::casa::MPIServer-1 IERSeop2000 (version date, last date in table (UTC)): 2024/10/05/15:00, 2024/09/05/00:00:00
2024-10-12 20:16:52 INFO ::casa IERSeop2000 (version date, last date in table (UTC)): 2024/10/05/15:00, 2024/09/05/00:00:00
2024-10-12 20:16:52 INFO ::casa::MPIServer-5 IERSeop2000 (version date, last date in table (UTC)): 2024/10/05/15:00, 2024/09/05/00:00:00
2024-10-12 20:16:52 INFO ::casa::MPIServer-7 IERSeop2000 (version date, last date in table (UTC)): 2024/10/05/15:00, 2024/09/05/00:00:00
2024-10-12 20:16:52 INFO ::casa::MPIServer-3 IERSeop2000 (version date, last date in table (UTC)): 2024/10/05/15:00, 2024/09/05/00:00:00
2024-10-12 20:16:52 INFO ::casa::MPIServer-4 IERSeop2000 (version date, last date in table (UTC)): 2024/10/05/15:00, 2024/09/05/00:00:00
2024-10-12 20:16:52 INFO ::casa::MPIServer-6 IERSeop2000 (version date, last date in table (UTC)): 2024/10/05/15:00, 2024/09/05/00:00:00
2024-10-12 20:16:52 INFO ::casa::MPIServer-2 IERSeop2000 (version date, last date in table (UTC)): 2024/10/05/15:00, 2024/09/05/00:00:00
2024-10-12 20:16:52 INFO ::casa::MPIServer-5 IERSeop97 (version date, last date in table (UTC)): 2024/10/05/15:00, 2024/09/05/00:00:00
2024-10-12 20:16:52 INFO ::casa::MPIServer-7 IERSeop97 (version date, last date in table (UTC)): 2024/10/05/15:00, 2024/09/05/00:00:00
2024-10-12 20:16:52 INFO ::casa::MPIServer-4 IERSeop97 (version date, last date in table (UTC)): 2024/10/05/15:00, 2024/09/05/00:00:00
2024-10-12 20:16:52 INFO ::casa::MPIServer-3 IERSeop97 (version date, last date in table (UTC)): 2024/10/05/15:00, 2024/09/05/00:00:00
2024-10-12 20:16:52 INFO ::casa IERSeop97 (version date, last date in table (UTC)): 2024/10/05/15:00, 2024/09/05/00:00:00
2024-10-12 20:16:52 INFO ::casa::MPIServer-6 IERSeop97 (version date, last date in table (UTC)): 2024/10/05/15:00, 2024/09/05/00:00:00
2024-10-12 20:16:52 INFO ::casa::MPIServer-1 IERSeop97 (version date, last date in table (UTC)): 2024/10/05/15:00, 2024/09/05/00:00:00
2024-10-12 20:16:52 INFO ::casa::MPIServer-7 IERSpredict (version date, last date in table (UTC)): 2024/10/11/15:00, 2025/01/09/00:00:00
2024-10-12 20:16:52 INFO ::casa::MPIServer-5 IERSpredict (version date, last date in table (UTC)): 2024/10/11/15:00, 2025/01/09/00:00:00
2024-10-12 20:16:52 INFO ::casa::MPIServer-6 IERSpredict (version date, last date in table (UTC)): 2024/10/11/15:00, 2025/01/09/00:00:00
2024-10-12 20:16:52 INFO ::casa::MPIServer-3 IERSpredict (version date, last date in table (UTC)): 2024/10/11/15:00, 2025/01/09/00:00:00
2024-10-12 20:16:52 INFO ::casa::MPIServer-4 IERSpredict (version date, last date in table (UTC)): 2024/10/11/15:00, 2025/01/09/00:00:00
2024-10-12 20:16:52 INFO ::casa IERSpredict (version date, last date in table (UTC)): 2024/10/11/15:00, 2025/01/09/00:00:00
2024-10-12 20:16:52 INFO ::casa::MPIServer-1 IERSpredict (version date, last date in table (UTC)): 2024/10/11/15:00, 2025/01/09/00:00:00
2024-10-12 20:16:52 INFO ::casa::MPIServer-2 IERSeop97 (version date, last date in table (UTC)): 2024/10/05/15:00, 2024/09/05/00:00:00
2024-10-12 20:16:52 INFO ::casa::MPIServer-6 TAI_UTC (version date, last date in table (UTC)): 2024/09/29/15:00, 2017/01/01/00:00:00
2024-10-12 20:16:52 INFO ::casa::MPIServer-3 TAI_UTC (version date, last date in table (UTC)): 2024/09/29/15:00, 2017/01/01/00:00:00
2024-10-12 20:16:52 INFO ::casa::MPIServer-7 TAI_UTC (version date, last date in table (UTC)): 2024/09/29/15:00, 2017/01/01/00:00:00
2024-10-12 20:16:52 INFO ::casa::MPIServer-5 TAI_UTC (version date, last date in table (UTC)): 2024/09/29/15:00, 2017/01/01/00:00:00
2024-10-12 20:16:52 INFO ::casa::MPIServer-4 TAI_UTC (version date, last date in table (UTC)): 2024/09/29/15:00, 2017/01/01/00:00:00
2024-10-12 20:16:52 INFO ::casa TAI_UTC (version date, last date in table (UTC)): 2024/09/29/15:00, 2017/01/01/00:00:00
2024-10-12 20:16:52 INFO ::casa::MPIServer-1 TAI_UTC (version date, last date in table (UTC)): 2024/09/29/15:00, 2017/01/01/00:00:00
2024-10-12 20:16:52 INFO ::casa::MPIServer-2 IERSpredict (version date, last date in table (UTC)): 2024/10/11/15:00, 2025/01/09/00:00:00
2024-10-12 20:16:52 INFO ::casa::MPIServer-2 TAI_UTC (version date, last date in table (UTC)): 2024/09/29/15:00, 2017/01/01/00:00:00
working? False
working? False
working? False
working? False
working? False
working? False
working? False
working? False
--------------------------------------------------------------------------
As an interim solution, we should continue using mpi4py<4+openmpi=5.0.3 for the time being.
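While the interim pin is in place, a startup check could flag environments that have drifted to the newer mpi4py. This is a minimal sketch using only the standard library; the function names `major_version` and `within_interim_pin` are hypothetical, and it checks only the mpi4py side of the pin, not the openmpi build.

```python
from importlib import metadata


def major_version(version_string):
    """Return the leading integer of a version string such as '4.0.1'."""
    return int(version_string.split(".")[0])


def within_interim_pin(version_string):
    """True when the version satisfies the interim mpi4py<4 constraint."""
    return major_version(version_string) < 4


if __name__ == "__main__":
    try:
        installed = metadata.version("mpi4py")
    except metadata.PackageNotFoundError:
        print("mpi4py is not installed in this environment")
    else:
        status = "ok" if within_interim_pin(installed) else "outside the interim pin"
        print(f"mpi4py {installed}: {status}")
```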