Merged

Changes from 5 commits
1 change: 1 addition & 0 deletions cime_config/machines/config_batch.xml
@@ -553,6 +553,7 @@

<batch_system MACH="aurora" type="pbspro">
<batch_submit>/lus/flare/projects/E3SM_Dec/tools/qsub/throttle</batch_submit>
<jobid_pattern>(\d+)\.aurora-pbs</jobid_pattern>
Member Author

Occasionally, a job submission's stdout contains more output than just the job's id, e.g.

ERROR: Couldn't match jobid_pattern '^(\d+)' within submit output:
 'auth: error returned: 15007
auth: Failed to receive auth token
No Permission.
qstat: cannot connect to server aurora-pbs-0001.hostmgmt.cm.aurora.alcf.anl.gov (errno=15007)
5435693.aurora-pbs-0001.hostmgmt.cm.aurora.alcf.anl.gov'

This regex update extracts the job id from such longer strings so that CIME can continue with its job stages.
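
As a concrete illustration (a standalone snippet, not CIME's actual submit code), the old anchored pattern fails on such output while the new pattern still recovers the job id:

import re

submit_output = """auth: error returned: 15007
auth: Failed to receive auth token
No Permission.
qstat: cannot connect to server aurora-pbs-0001.hostmgmt.cm.aurora.alcf.anl.gov (errno=15007)
5435693.aurora-pbs-0001.hostmgmt.cm.aurora.alcf.anl.gov"""

print(re.search(r"^(\d+)", submit_output))                      # None: the output no longer starts with the job id
print(re.search(r"(\d+)\.aurora-pbs", submit_output).group(1))  # '5435693'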

<directives>
<directive> -l filesystems=home:flare </directive>
</directives>
78 changes: 43 additions & 35 deletions cime_config/machines/config_machines.xml
@@ -3603,66 +3603,68 @@
<MAX_MPITASKS_PER_NODE compiler="oneapi-ifxgpu">12</MAX_MPITASKS_PER_NODE>
<PROJECT_REQUIRED>FALSE</PROJECT_REQUIRED>
<mpirun mpilib="default">
<executable>mpiexec</executable>
<!--executable>numactl -m 2-3 mpiexec</executable--><!--for HBM runs-->
<arguments>
<arg name="total_num_tasks">-np {{ total_tasks }} --label</arg>
<arg name="ranks_per_node">-ppn {{ tasks_per_node }}</arg>
<arg name="ranks_bind">--cpu-bind $ENV{RANKS_BIND}</arg>
<arg name="threads_per_rank">-d $ENV{OMP_NUM_THREADS}</arg>
<arg name="gpu_maps">$ENV{GPU_TILE_COMPACT}</arg>
</arguments>
<executable>mpiexec</executable>
<!--executable>numactl -m 2-3 mpiexec</executable--><!--for HBM runs-->
<arguments>
<arg name="total_num_tasks">-np {{ total_tasks }} --label</arg>
<arg name="ranks_per_node">-ppn {{ tasks_per_node }}</arg>
<arg name="ranks_bind">--cpu-bind $ENV{RANKS_BIND}</arg>
<arg name="threads_per_rank">-d $ENV{OMP_NUM_THREADS} $ENV{RLIMITS}</arg>
<arg name="gpu_maps">$ENV{GPU_TILE_COMPACT}</arg>
Contributor

Shouldn't we remove this since we are defining

--gpu-bind list:0.0:0.1:1.0:1.1:2.0:2.1:3.0:3.1:4.0:4.1:5.0:5.1

Member Author

@abagusetty what's the equivalent of OpenMPI's OMPI_COMM_WORLD_LOCAL_RANK on Aurora-mpich?
Kokkos raises

`Warning: unable to detect local MPI rank. Falling back to the first GPU available for execution. Raised by Kokkos::initialize()`

if the $MPI_LOCALRANKID env-var is undefined. $PALS_LOCAL_RANKID also appears to be empty.

Contributor

Shouldn't we remove this since we are defining

--gpu-bind list:0.0:0.1:1.0:1.1:2.0:2.1:3.0:3.1:4.0:4.1:5.0:5.1

I am assuming you are asking about the last line. Yes, <arg name="gpu_maps">$ENV{GPU_TILE_COMPACT}</arg> can be removed, since --gpu-bind takes care of binding tiles to MPI processes and doesn't need a script.
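
For reference, a small illustrative snippet (not MPICH internals) of how that explicit list pairs the 12 local ranks on an Aurora node with its 6 GPUs x 2 tiles, i.e. device = rank // 2, tile = rank % 2:

bind_list = "0.0:0.1:1.0:1.1:2.0:2.1:3.0:3.1:4.0:4.1:5.0:5.1".split(":")
for local_rank, entry in enumerate(bind_list):
    device, tile = (int(x) for x in entry.split("."))
    assert (device, tile) == (local_rank // 2, local_rank % 2)
    print(f"local rank {local_rank:2d} -> GPU {device}, tile {tile}")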

Contributor

@amametjanov Aurora MPICH doesn't have an equivalent, but PALS should work the same: PALS_LOCAL_RANKID. It depends on where you would be using this var. Not sure why it is empty; I believe these warnings were fixed a while back.

As long as PALS_LOCAL_RANKID is referenced after the mpiexec launch command, it should be defined. For instance: mpiexec ... $PALS_LOCAL_RANKID $EXE ...

Member Author

Ok, thanks. Simply removing the gpu_tile_compact.sh script was raising those warnings. I'll push a commit to export the correct $MPI_LOCALRANKID.

Member

Is there a reason to prefer the --gpu-bind argument over the script?

Contributor

--gpu-bind is more of an official MPICH option that handles topology-aware bindings internally, as opposed to the script. The script was a temporary workaround from the early days when we didn't have a GPU-binding mechanism for Aurora.

Member Author

Passing mpiexec ... --genv MPI_LOCALRANKID=${PALS_LOCAL_RANKID} ... isn't helping: MPI_LOCALRANKID stays empty.
I added a Kokkos mod that adds PALS_LOCAL_RANKID to the list of recognized env-vars at E3SM-Project/EKAT#372 . Once that PR makes it into E3SM master, I can remove the call to gpu_tile_compact.sh in a separate PR. If that's okay, then maybe this PR can go in without that mod.
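
A rough sketch of the idea behind such a per-rank wrapper (hypothetical Python; the real gpu_tile_compact.sh is a shell script and may do more): forward the PALS-provided local rank under the env-var name Kokkos already recognizes, then exec the application so the setting applies to the rank's own process.

#!/usr/bin/env python3
import os
import sys

local_rank = os.environ.get("PALS_LOCAL_RANKID")
if local_rank is not None:
    # Kokkos falls back to GPU 0 with a warning when it cannot detect a local rank,
    # so forward the PALS value under the env-var name it already recognizes.
    os.environ["MPI_LOCALRANKID"] = local_rank

# Replace this process with the application, e.g. used as "mpiexec ... wrapper.py e3sm.exe ...".
os.execvp(sys.argv[1], sys.argv[1:])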

</arguments>
</mpirun>
<module_system type="module" allow_error="true">
<init_path lang="sh">/usr/share/lmod/lmod/init/sh</init_path>
<init_path lang="csh">/usr/share/lmod/lmod/init/csh</init_path>
<init_path lang="python">/usr/share/lmod/lmod/init/env_modules_python.py</init_path>
<cmd_path lang="sh">module</cmd_path>
<cmd_path lang="csh">module</cmd_path>
<cmd_path lang="python">/usr/share/lmod/lmod/libexec/lmod python</cmd_path>
<modules>
<command name="load">cmake/3.30.5</command>
<command name="load">oneapi/release/2025.0.5</command>
</modules>
</module_system>
<RUNDIR>$CIME_OUTPUT_ROOT/$CASE/run</RUNDIR>
<EXEROOT>$CIME_OUTPUT_ROOT/$CASE/bld</EXEROOT>
<MAX_GB_OLD_TEST_DATA>0</MAX_GB_OLD_TEST_DATA>
<environment_variables>
<init_path lang="sh">/usr/share/lmod/lmod/init/sh</init_path>
<init_path lang="csh">/usr/share/lmod/lmod/init/csh</init_path>
<init_path lang="python">/usr/share/lmod/lmod/init/env_modules_python.py</init_path>
<cmd_path lang="sh">module</cmd_path>
<cmd_path lang="csh">module</cmd_path>
<cmd_path lang="python">/usr/share/lmod/lmod/libexec/lmod python</cmd_path>
<modules>
<command name="load">cmake/3.30.5</command>
<command name="load">oneapi/release/2025.0.5</command>
<command name="load">mpich-config/collective-tuning/1024</command>
</modules>
</module_system>
<RUNDIR>$CIME_OUTPUT_ROOT/$CASE/run</RUNDIR>
<EXEROOT>$CIME_OUTPUT_ROOT/$CASE/bld</EXEROOT>
<MAX_GB_OLD_TEST_DATA>0</MAX_GB_OLD_TEST_DATA>
<environment_variables>
<env name="NETCDF_PATH">/lus/flare/projects/E3SM_Dec/soft/netcdf/4.9.2c-4.6.1f/oneapi.eng.2024.07.30.002</env>
<env name="PNETCDF_PATH">/lus/flare/projects/E3SM_Dec/soft/pnetcdf/1.14.0/oneapi.eng.2024.07.30.002</env>
<env name="LD_LIBRARY_PATH">/lus/flare/projects/E3SM_Dec/soft/pnetcdf/1.14.0/oneapi.eng.2024.07.30.002/lib:/lus/flare/projects/E3SM_Dec/soft/netcdf/4.9.2c-4.6.1f/oneapi.eng.2024.07.30.002/lib:$ENV{LD_LIBRARY_PATH}</env>
<env name="PATH">/lus/flare/projects/E3SM_Dec/soft/pnetcdf/1.14.0/oneapi.eng.2024.07.30.002/bin:/lus/flare/projects/E3SM_Dec/soft/netcdf/4.9.2c-4.6.1f/oneapi.eng.2024.07.30.002/bin:$ENV{PATH}</env>
<env name="FI_CXI_DEFAULT_CQ_SIZE">131072</env>
<env name="FI_CXI_CQ_FILL_PERCENT">20</env>
<env name="RLIMITS"> </env>
</environment_variables>
<environment_variables compiler="oneapi-ifxgpu">
<env name="ONEAPI_DEVICE_SELECTOR">level_zero:gpu</env>
<env name="MPIR_CVAR_CH4_COLL_SELECTION_TUNING_JSON_FILE"></env>
<env name="MPIR_CVAR_COLL_SELECTION_TUNING_JSON_FILE"></env>
<env name="MPIR_CVAR_CH4_POSIX_COLL_SELECTION_TUNING_JSON_FILE"></env>
<env name="UR_L0_USE_DRIVER_INORDER_LISTS">1</env>
<env name="UR_L0_ENABLE_RELAXED_ALLOCATION_LIMITS">1</env>
<env name="UR_L0_USE_COPY_ENGINE_FOR_IN_ORDER_QUEUE">1</env>
<!--<env name="FI_PROVIDER">cxi</env>-->
<env name="FI_MR_CACHE_MONITOR">disabled</env>
<env name="FI_MR_CACHE_MONITOR">disabled</env>
<env name="FI_CXI_OVFLOW_BUF_SIZE">8388608</env>
<env name="PALS_PING_PERIOD">240</env>
<env name="PALS_RPC_TIMEOUT">240</env>
<env name="SYCL_PI_LEVEL_ZERO_SINGLE_THREAD_MODE">1</env>
<env name="SYCL_PI_LEVEL_ZERO_DISABLE_USM_ALLOCATOR">1</env>
<env name="SYCL_PI_LEVEL_ZERO_USM_RESIDENT">0x001</env>
<env name="UR_L0_USE_DRIVER_INORDER_LISTS">1</env>
<env name="UR_L0_USE_COPY_ENGINE_FOR_IN_ORDER_QUEUE">1</env>

<env name="MPIR_CVAR_ENABLE_GPU">1</env>
<env name="MPIR_CVAR_ENABLE_GPU">1</env>
<env name="romio_cb_read">disable</env>
<env name="romio_cb_write">disable</env>
<env name="SYCL_CACHE_PERSISTENT">1</env>
<env name="GATOR_INITIAL_MB">4000MB</env>
<env name="GATOR_DISABLE">0</env>
<env name="GPU_TILE_COMPACT">/lus/flare/projects/E3SM_Dec/tools/mpi_wrapper_utils/gpu_tile_compact.sh</env>
<env name="RANKS_BIND">list:1-8:9-16:17-24:25-32:33-40:41-48:53-60:61-68:69-76:77-84:85-92:93-100 --gpu-bind list:0.0:0.1:1.0:1.1:2.0:2.1:3.0:3.1:4.0:4.1:5.0:5.1 --mem-bind list:0:0:0:0:0:0:1:1:1:1:1:1</env>
<env name="ZES_ENABLE_SYSMAN">1</env>
<!-- default is ZE_FLAT_DEVICE_HIERARCHY=COMPOSITE: enable this to run 4 MPI/tile or 48 MPI/node
<env name="ZEX_NUMBER_OF_CCS">0:4,1:4,2:4,3:4:4:4,5:4</env>-->
<!-- <env name="ZE_FLAT_DEVICE_HIERARCHY">FLAT</env>
<env name="ZEX_NUMBER_OF_CCS">0:4,1:4,2:4,3:4:4:4,5:4,6:4,7:4,8:4,9:4,10:4,11:4</env>-->
</environment_variables>
<environment_variables compiler="oneapi-ifx">
<env name="LIBOMPTARGET_DEBUG">0</env><!--default 0, max 5 -->
@@ -3675,6 +3677,12 @@
<env name="KMP_AFFINITY">granularity=core,balanced</env>
<env name="OMP_STACKSIZE">128M</env>
</environment_variables>
<environment_variables DEBUG="TRUE">
<env name="RLIMITS">--rlimits CORE</env>
</environment_variables>
<resource_limits DEBUG="TRUE">
<resource name="RLIMIT_CORE">-1</resource>
</resource_limits>
<resource_limits>
<resource name="RLIMIT_STACK">-1</resource>
</resource_limits>
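
For context on the DEBUG-only additions above (--rlimits CORE via $ENV{RLIMITS}, plus RLIMIT_CORE set to -1): a minimal Python sketch of the assumed mechanism, not CIME's exact implementation, where -1 corresponds to resource.RLIM_INFINITY on Linux, i.e. unlimited core-dump size so a crashing debug run can write a full core file.

import resource

def apply_core_limit(value: int) -> None:
    soft, hard = resource.getrlimit(resource.RLIMIT_CORE)
    # Interpret -1 as "unlimited", matching resource.RLIM_INFINITY on Linux.
    target = resource.RLIM_INFINITY if value == -1 else value
    # An unprivileged process cannot raise its soft limit above the hard limit.
    if hard != resource.RLIM_INFINITY and (target == resource.RLIM_INFINITY or target > hard):
        target = hard
    resource.setrlimit(resource.RLIMIT_CORE, (target, hard))

apply_core_limit(-1)
print(resource.getrlimit(resource.RLIMIT_CORE))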