Simple mpi-serial case on Casper failing in setup #143

@ekluzek

Hello, running this test:

SMS_D_Lm1_Mmpi-serial.CLM_USRDAT.I1PtClm50SpRs.casper_intel.clm-USUMB_nuopc

with these externals updates in ctsm5.1.dev158:

diff --git a/Externals.cfg b/Externals.cfg
index a17f8e2ec..b29af5c64 100644
--- a/Externals.cfg
+++ b/Externals.cfg
@@ -34,7 +34,7 @@ hash = 34723c2
 required = True
 
 [ccs_config]
-tag = ccs_config_cesm0.0.84
+tag = ccs_config_cesm0.0.87
 protocol = git
 repo_url = https://github.com/ESMCI/ccs_config_cesm.git
 local_path = ccs_config
@@ -44,11 +44,11 @@ required = True
 local_path = cime
 protocol = git
 repo_url = https://github.com/ESMCI/cime
-tag = cime6.0.175
+tag = cime6.0.198
 required = True
 
 [cmeps]
-tag = cmeps0.14.43
+tag = cmeps0.14.47
 protocol = git
 repo_url = https://github.com/ESCOMP/CMEPS.git
 local_path = components/cmeps

fails for me in the SETUP phase, as follows:

qcmd -- ./create_test SMS_D_Lm1_Mmpi-serial.CLM_USRDAT.I1PtClm50SpRs.casper_intel.clm-USUMB_nuopc -r . 
Waiting on job launch; 9351378.casper-pbs with qsub arguments:
    qsub  -l select=1:ncpus=1:mem=10GB -A P93300606 -q casper@casper-pbs -l walltime=01:00:00

Warning: no access to tty (Inappropriate ioctl for device).
Thus no job control in this shell.
Testnames: ['SMS_D_Lm1_Mmpi-serial.CLM_USRDAT.I1PtClm50SpRs.casper_intel.clm-USUMB_nuopc']
Using project from .cesm_proj: P93300041
create_test will do up to 1 tasks simultaneously
create_test will use up to 45 cores simultaneously
Creating test directory /glade/work/erik/ctsm_worktrees/external_updates/cime/scripts/SMS_D_Lm1_Mmpi-serial.CLM_USRDAT.I1PtClm50SpRs.casper_intel.clm-USUMB_nuopc.20240112_170555_qa7smi
RUNNING TESTS:
SMS_D_Lm1_Mmpi-serial.CLM_USRDAT.I1PtClm50SpRs.casper_intel.clm-USUMB_nuopc
Starting CREATE_NEWCASE for test SMS_D_Lm1_Mmpi-serial.CLM_USRDAT.I1PtClm50SpRs.casper_intel.clm-USUMB_nuopc with 1 procs
Finished CREATE_NEWCASE for test SMS_D_Lm1_Mmpi-serial.CLM_USRDAT.I1PtClm50SpRs.casper_intel.clm-USUMB_nuopc in 186.715961 seconds (PASS)
Starting XML for test SMS_D_Lm1_Mmpi-serial.CLM_USRDAT.I1PtClm50SpRs.casper_intel.clm-USUMB_nuopc with 1 procs
Finished XML for test SMS_D_Lm1_Mmpi-serial.CLM_USRDAT.I1PtClm50SpRs.casper_intel.clm-USUMB_nuopc in 119.385811 seconds (PASS)
Starting SETUP for test SMS_D_Lm1_Mmpi-serial.CLM_USRDAT.I1PtClm50SpRs.casper_intel.clm-USUMB_nuopc with 1 procs
Finished SETUP for test SMS_D_Lm1_Mmpi-serial.CLM_USRDAT.I1PtClm50SpRs.casper_intel.clm-USUMB_nuopc in 1.602533 seconds (FAIL). [COMPLETED 1 of 1]
Case dir: /glade/work/erik/ctsm_worktrees/external_updates/cime/scripts/SMS_D_Lm1_Mmpi-serial.CLM_USRDAT.I1PtClm50SpRs.casper_intel.clm-USUMB_nuopc.20240112_170555_qa7smi
Errors were:
ERROR: module command /glade/u/apps/casper/23.10/spack/opt/spack/lmod/8.7.24/gcc/7.5.0/m4jx/lmod/lmod/libexec/lmod python load ncarenv/23.10 cmake/3.26.3 intel/2023.2.1 mkl/2023.2.0 netcdf/4.9.2 ncarcompilers/0.5.0 parallelio/2.6.2 esmf/8.5.0 ncarcompilers/1.0.0 failed with message:
Lmod has detected the following error: The following module(s) are unknown:
"ncarcompilers/0.5.0"

Please check the spelling or version number. Also try "module spider ..."
It is also possible your cache file is out-of-date; it may help to try:
$ module --ignore_cache load "ncarcompilers/0.5.0"

Also make sure that all modulefiles written in TCL start with the string
#%Module

Waiting for tests to finish
FAIL SMS_D_Lm1_Mmpi-serial.CLM_USRDAT.I1PtClm50SpRs.casper_intel.clm-USUMB_nuopc (phase SETUP)
Case dir: /glade/work/erik/ctsm_worktrees/external_updates/cime/scripts/SMS_D_Lm1_Mmpi-serial.CLM_USRDAT.I1PtClm50SpRs.casper_intel.clm-USUMB_nuopc.20240112_170555_qa7smi
Due to presence of batch system, create_test will exit before tests are complete.
To force create_test to wait for full completion, use --wait
test-scheduler took 380.3022334575653 seconds
casper-login1 cime/scripts> cd SMS_D_Lm1_Mmpi-serial.CLM_USRDAT.I1PtClm50SpRs.casper_intel.clm-USUMB_nuopc.20240112_170555_qa7smi/
Directory: /glade/work/erik/ctsm_worktrees/external_updates/cime/scripts/SMS_D_Lm1_Mmpi-serial.CLM_USRDAT.I1PtClm50SpRs.casper_intel.clm-USUMB_nuopc.20240112_170555_qa7smi
casper-login1 scripts/SMS_D_Lm1_Mmpi-serial.CLM_USRDAT.I1PtClm50SpRs.casper_intel.clm-USUMB_nuopc.20240112_170555_qa7smi> cat TestStatus
PASS SMS_D_Lm1_Mmpi-serial.CLM_USRDAT.I1PtClm50SpRs.casper_intel.clm-USUMB_nuopc CREATE_NEWCASE
PASS SMS_D_Lm1_Mmpi-serial.CLM_USRDAT.I1PtClm50SpRs.casper_intel.clm-USUMB_nuopc XML
FAIL SMS_D_Lm1_Mmpi-serial.CLM_USRDAT.I1PtClm50SpRs.casper_intel.clm-USUMB_nuopc SETUP
casper-login1 scripts/SMS_D_Lm1_Mmpi-serial.CLM_USRDAT.I1PtClm50SpRs.casper_intel.clm-USUMB_nuopc.20240112_170555_qa7smi> ./case.setup 
ERROR: module command /glade/u/apps/casper/23.10/spack/opt/spack/lmod/8.7.24/gcc/7.5.0/m4jx/lmod/lmod/libexec/lmod python load ncarenv/23.10 cmake/3.26.3 intel/2023.2.1 mkl/2023.2.0 netcdf/4.9.2 ncarcompilers/0.5.0 parallelio/2.6.2 esmf/8.5.0 ncarcompilers/1.0.0 failed with message:
Lmod has detected the following error: The following module(s) are unknown: "ncarcompilers/0.5.0"

Please check the spelling or version number. Also try "module spider ..."
It is also possible your cache file is out-of-date; it may help to try:
  $ module --ignore_cache load "ncarcompilers/0.5.0"

Also make sure that all modulefiles written in TCL start with the string #%Module
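
Note that the failing `module load` line requests `ncarcompilers` twice, once as 0.5.0 and once as 1.0.0. Whether that duplication is the root cause is not established here, but it is easy to check for. Below is a minimal sketch that flags modules requested with more than one version in a load command. The helper name `find_conflicting_modules` is hypothetical and is not part of CIME or Lmod; the argument string is copied from the error message above.

```python
from collections import defaultdict


def find_conflicting_modules(load_args):
    """Return {module_name: [versions]} for modules requested with more than one version.

    Hypothetical helper for illustration only; not part of CIME or Lmod.
    """
    versions = defaultdict(list)
    for spec in load_args.split():
        # Each spec looks like "name/version"; skip anything without a version.
        name, sep, version = spec.partition("/")
        if sep and version not in versions[name]:
            versions[name].append(version)
    # Keep only modules that appear with two or more distinct versions.
    return {name: v for name, v in versions.items() if len(v) > 1}


# Module list taken verbatim from the failing "lmod python load" command.
load_args = (
    "ncarenv/23.10 cmake/3.26.3 intel/2023.2.1 mkl/2023.2.0 netcdf/4.9.2 "
    "ncarcompilers/0.5.0 parallelio/2.6.2 esmf/8.5.0 ncarcompilers/1.0.0"
)
print(find_conflicting_modules(load_args))
# prints {'ncarcompilers': ['0.5.0', '1.0.0']}
```

The same check could be run against the module list for any machine and compiler combination before calling `./case.setup`.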
