Use automatic script generation #322
Conversation
This PR uses automatic script generation to set the module and environment variable commands (the mod_env_commands variable).

Demonstration of the issue: Here's a printout of what is returned by get_modules_env_vars_and_mpi_compilers:

from mache.spack.env import get_modules_env_vars_and_mpi_compilers

mpicc, _, _, modules = get_modules_env_vars_and_mpi_compilers("pm-cpu", "gnu", "mpich", "sh")
print(modules)
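For context (this snippet is not part of the PR, and the file name load_build_env.sh is only an illustration), one way the returned commands string can be used is to write it into a shell script that gets sourced before building MPI-dependent packages, which is why an empty or unparsed mod_env_commands leads to broken MPI linking:

# Reuse the call from the demonstration above and dump the generated
# module/export commands to a script that can be sourced before building
# packages (e.g. mpi4py) that must link against the system MPI.
from mache.spack.env import get_modules_env_vars_and_mpi_compilers

mpicc, _, _, modules = get_modules_env_vars_and_mpi_compilers(
    "pm-cpu", "gnu", "mpich", "sh")

with open("load_build_env.sh", "w") as handle:
    handle.write(modules)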
(Initial) Testing

Running the same command as above, with this branch, produces:

source /usr/share/lmod/8.3.1/init/sh
module rm \
cpe \
PrgEnv-gnu \
PrgEnv-intel \
PrgEnv-nvidia \
PrgEnv-cray \
PrgEnv-aocc \
gcc-native \
intel \
intel-oneapi \
nvidia \
aocc \
cudatoolkit \
climate-utils \
cray-libsci \
matlab \
craype-accel-nvidia80 \
craype-accel-host \
perftools-base \
perftools \
darshan &> /dev/null
module load \
PrgEnv-gnu/8.5.0 \
gcc-native/13.2 \
cray-libsci/24.07.0 \
craype-accel-host \
craype/2.7.32 \
cray-mpich/8.1.30 \
cmake/3.30.2
export MPICH_ENV_DISPLAY="1"
export MPICH_VERSION_DISPLAY="1"
export MPICH_MPIIO_DVS_MAXNODES="1"
export HDF5_USE_FILE_LOCKING="FALSE"
export FI_MR_CACHE_MONITOR="kdreg2"
export MPICH_COLL_SYNC="MPI_Bcast"
export GATOR_INITIAL_MB="4000MB"
export LD_LIBRARY_PATH="${CRAY_LD_LIBRARY_PATH}:${LD_LIBRARY_PATH}"
export MPICH_SMP_SINGLE_COPY_MODE="CMA"
export PKG_CONFIG_PATH="/global/cfs/cdirs/e3sm/3rdparty/protobuf/21.6/gcc-native-12.3/lib/pkgconfig:${PKG_CONFIG_PATH}"
export BLA_VENDOR="Generic"
if [ -z "${NERSC_HOST:-}" ]; then
# happens when building spack environment
export NERSC_HOST="perlmutter"
fi

I'll work on deploying with this to see if it fixes our mpi4py issue.
Testing (Perlmutter)

This appears to fix the issue related to linking MPI and mpi4py:

salloc --nodes 1 --qos interactive --time 00:15:00 --constraint cpu --account e3sm
source /global/common/software/e3sm/anaconda_envs/test_e3sm_unified_1.12.0rc3_pm-cpu.sh
python -c "from mpi4py import MPI"

Output:
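Beyond the import check above (this was not part of the PR's testing, just a hypothetical extra check), mpi4py can also report which MPI library it ended up linked against, which makes a bad link easier to spot than an import alone:

# Hypothetical sanity check: print the MPI vendor and library version that
# mpi4py is linked against at run time.
from mpi4py import MPI

print(MPI.get_vendor())           # e.g. ('MPICH', ...)
print(MPI.Get_library_version())  # full MPI_Get_library_version() string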
xylar left a comment:
@andrewdnolan, this makes a ton of sense. I clearly didn't think about this and I'm sorry I missed it.
@xylar I was curious how I was able to successfully deploy (and test) on Aurora.

Starting a fresh shell on Aurora:

$ which mpicc
/opt/aurora/25.190.0/spack/unified/0.10.1/install/linux-sles15-x86_64/oneapi-2025.2.0/mpich-develop-git.6037a7a-cym6jg6/bin/mpicc

Within the unified environment:

$ source /lus/flare/projects/E3SMinput/soft/e3sm-unified/test_e3sm_unified_1.12.0rc3_aurora.sh
$ which mpicc
/opt/aurora/25.190.0/spack/unified/0.10.1/install/linux-sles15-x86_64/oneapi-2025.2.0/mpich-develop-git.6037a7a-cym6jg6/bin/mpicc

So, I was able to successfully test on Aurora because the default modules are the same as what's in unified. So even without loading them as part of the build script, we were able to properly link MPI and mpi4py.
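As a hypothetical follow-up check (not something run in this PR): since which mpicc only shows which compiler wrapper is on the PATH, one could also inspect the installed mpi4py extension directly to see which MPI shared library it actually resolves to:

# Hypothetical check (Linux only): locate the compiled mpi4py extension and
# list the MPI shared libraries it is linked against.
import subprocess

import mpi4py.MPI

result = subprocess.run(["ldd", mpi4py.MPI.__file__],
                        capture_output=True, text=True, check=True)
for line in result.stdout.splitlines():
    if "mpi" in line.lower():
        print(line.strip())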
With #303, setting the mod_env_commands variable returned by get_modules_env_vars_and_mpi_compilers was broken: none of the module or environment variable commands were parsed. This was causing an MPI linking issue with mpi4py on (at least) pm-cpu and compy during the deployment of e3sm-unified 1.12.0rc3.

Checklist
Testing comment, if appropriate, in the PR documents testing used to verify the changes
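Not part of the checklist above, but as an illustration of the regression being fixed, a minimal check along these lines (the machine, compiler, and MPI choices are just examples) would catch mod_env_commands coming back without any parsed commands:

# Hypothetical regression check: the last return value should contain the
# generated module and export commands for a known machine/compiler/MPI combo.
from mache.spack.env import get_modules_env_vars_and_mpi_compilers

_, _, _, mod_env_commands = get_modules_env_vars_and_mpi_compilers(
    "pm-cpu", "gnu", "mpich", "sh")

assert "module load" in mod_env_commands
assert "export " in mod_env_commands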