@mjschmidt271
Collaborator

@mjschmidt271 mjschmidt271 commented May 16, 2025

This PR enables building for Intel GPUs and SYCL and is associated with the Haero PR #491.

A few things to note:

  • I've modified the build-haero.sh script to work for different GPU types.
    • This required adding an extra argument for DEVICE_ARCH.
      • This works the same as in Haero's setup script or the generated config.sh, so take a look there if you'd like to know what argument to provide.
    • I've not used this script in a while, so I am not confident it will keep working on NVIDIA cards unless the correct DEVICE_ARCH arg is provided. Let me know if you have a good workaround.
      • Does this work for NVIDIA with the arg?
      • Does this work for AMD?
      • Does this work for NVIDIA without the arg?
  • The grumpy Intel compilers did not like passing a function pointer as a function argument, as was done in conversions.hpp:relative_humidity_from_vapor_mixing_ratio().
    • As far as I could tell, the optional function-pointer argument is never used in practice, so I just hard-coded the default value.
      • I left the original version, commented, for now.
    • If this functionality is important, I'm sure we can figure something out with a template or just an if-statement.
  • Also likely related to the Intel compiler's particular nature, the mode_averages unit tests are failing.
    • The failure message is below, as is a compile-time warning that is not especially informative.
  • We've picked up some compile-time warnings that I haven't seen before.
    • Output is below.
    • There's another one that's Kokkos-related, so I haven't included it.
    • Pretty sure at least 2 of them are from me 😬
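
If the flexibility turns out to matter, the template route mentioned above could look roughly like the sketch below. It is a host-only illustration, not the Haero implementation: `PlaceholderWsat` and `relative_humidity_sketch` are hypothetical names, the placeholder formula is not the Hardy parameterization, and the `w / ws` body is an assumption about what the real function computes. In the real code the function would presumably keep its `KOKKOS_INLINE_FUNCTION` qualifier.

```cpp
#include <cassert>

using Real = double;  // assumption: stands in for the project's Real typedef

// Hypothetical placeholder functor. This is NOT the Hardy parameterization;
// it only gives the sketch something to call.
struct PlaceholderWsat {
  Real operator()(Real T, Real p) const { return T / p; }
};

// The saturation formula is a template parameter, so the call is resolved at
// compile time and no function pointer is passed as an argument.
template <typename Wsat = PlaceholderWsat>
Real relative_humidity_sketch(Real w, Real T, Real p, Wsat wsat = Wsat{}) {
  const Real ws = wsat(T, p);
  assert(ws != 0.0);  // host-only guard against dividing by zero
  return w / ws;      // assumption: relative humidity is w over saturation w
}
```

Callers that want the default just omit the last argument. Callers that need a different formula can pass any callable taking `(Real, Real)`, such as a lambda or another functor.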
Output from `mode_averages` unit test failure

19 - mode_averages (Failed)

mam4xx/src/tests/mode_averages_unit_tests.cpp:27: FAILED:
due to unexpected exception with message:
  The program was built for 1 devices
  Build program log for 'Intel(R) Data Center GPU Max 1550':
  Module <0x819ccc0>:  Unresolved Symbol <nan>
  Module <0x819ccc0>:  Unresolved Symbol <nan>
mam4xx/src/tests/mode_averages_unit_tests.cpp:27:1: warning: unused variable 'autoRegistrar1' [-Wunused-variable]
   27 | TEST_CASE("modal_averages", "modal_averages_unit_tests") {
      | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
mam4xx/haero-build/include/catch2/catch.hpp:17702:26: note: expanded from macro 'TEST_CASE'
 17702 | #define TEST_CASE( ... ) INTERNAL_CATCH_TESTCASE( __VA_ARGS__ )
       |                          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
mam4xx/haero-build/include/catch2/catch.hpp:1055:9: note: expanded from macro 'INTERNAL_CATCH_TESTCASE'
 1055 |         INTERNAL_CATCH_TESTCASE2( INTERNAL_CATCH_UNIQUE_NAME( C_A_T_C_H_T_E_S_T_ ), __VA_ARGS__ )
      |         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
mam4xx/haero-build/include/catch2/catch.hpp:1051:35: note: expanded from macro 'INTERNAL_CATCH_TESTCASE2'
 1051 |         namespace{ Catch::AutoReg INTERNAL_CATCH_UNIQUE_NAME( autoRegistrar )( Catch::makeTestInvoker( &TestName ), CATCH_INTERNAL_LINEINFO, Catch::StringRef(), Catch::NameAndTags{ __VA_ARGS__ } ); } /* NOLINT */ \
      |                                   ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
note: (skipping 1 expansions in backtrace; use -fmacro-backtrace-limit=0 to see all)
mam4xx/haero-build/include/catch2/catch.hpp:469:55: note: expanded from macro 'INTERNAL_CATCH_UNIQUE_NAME_LINE'
  469 | #define INTERNAL_CATCH_UNIQUE_NAME_LINE( name, line ) INTERNAL_CATCH_UNIQUE_NAME_LINE2( name, line )
      |                                                       ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
mam4xx/haero-build/include/catch2/catch.hpp:468:56: note: expanded from macro 'INTERNAL_CATCH_UNIQUE_NAME_LINE2'
  468 | #define INTERNAL_CATCH_UNIQUE_NAME_LINE2( name, line ) name##line
      |                                                        ^~~~~~~~~~
<scratch space>:77:1: note: expanded from here
   77 | autoRegistrar1
      | ^~~~~~~~~~~~~~
Compiler Warnings
mam4xx/src/tests/mam4_rename_unit_tests.cpp:20:1: warning: unused variable 'autoRegistrar1' [-Wunused-variable]
   20 | TEST_CASE("test_constructor", "mam4_rename_process") {
      | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
mam4xx/src/tests/mam4_ndrop_unit_tests.cpp:232:12: warning: variable length arrays in C++ are a Clang extension [-Wvla-cxx-extension]
  232 |   Real eta[nmodes];
      |            ^~~~~~
mam4xx/src/tests/mam4_ndrop_unit_tests.cpp:232:12: note: read of non-const variable 'nmodes' is not allowed in a constant expression
mam4xx/src/tests/mam4_ndrop_unit_tests.cpp:228:7: note: declared here
  228 |   int nmodes = AeroConfig::num_modes();
      |       ^
mam4xx/src/tests/mam4_ndrop_unit_tests.cpp:233:12: warning: variable length arrays in C++ are a Clang extension [-Wvla-cxx-extension]
  233 |   Real smc[nmodes];
      |            ^~~~~~
mam4xx/src/tests/mam4_ndrop_unit_tests.cpp:233:12: note: read of non-const variable 'nmodes' is not allowed in a constant expression
mam4xx/src/tests/mam4_ndrop_unit_tests.cpp:228:7: note: declared here
  228 |   int nmodes = AeroConfig::num_modes();
mam4xx/src/validation/mo_setinv/setinv_test_single_level.cpp:42:21: warning: variable length arrays in C++ are a Clang extension [-Wvla-cxx-extension]
   42 |     Real invariants[nfs];
      |                     ^~~
mam4xx/src/validation/mo_setinv/setinv_test_single_level.cpp:42:21: note: initializer of 'nfs' is not a constant expression
mam4xx/src/validation/mo_setinv/setinv_test_single_level.cpp:37:15: note: declared here
   37 |     const int nfs = input.get_array("nfs")[0];
mam4xx/src/validation/aerosol_optics/modal_aero_lw.cpp:167:35: warning: variable length arrays in C++ are a Clang extension [-Wvla-cxx-extension]
  167 |     View1DHost refrtablw_host[N1][N3];
      |                                   ^~
mam4xx/src/validation/aerosol_optics/modal_aero_lw.cpp:167:35: note: read of non-const variable 'N3' is not allowed in a constant expression
mam4xx/src/validation/aerosol_optics/modal_aero_lw.cpp:166:9: note: declared here
  166 |     int N3 = nlwbands;
      |         ^
mam4xx/src/validation/aerosol_optics/modal_aero_lw.cpp:167:31: warning: variable length arrays in C++ are a Clang extension [-Wvla-cxx-extension]
  167 |     View1DHost refrtablw_host[N1][N3];
      |                               ^~
mam4xx/src/validation/aerosol_optics/modal_aero_lw.cpp:167:31: note: read of non-const variable 'N1' is not allowed in a constant expression
mam4xx/src/validation/aerosol_optics/modal_aero_lw.cpp:164:9: note: declared here
  164 |     int N1 = ntot_amode;
mam4xx/src/validation/modal_aero_amicphys_subareas/set_subarea_rh.cpp:59:42: warning: braces around scalar initializer [-Wbraced-scalar-init]
   59 |           Real relhumsub[subarea_max] = {{0.0}};
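
For the `-Wvla-cxx-extension` and `-Wbraced-scalar-init` warnings above, one possible cleanup is sketched below. It is only an illustration under assumptions: `AeroConfigSketch` is a hypothetical stand-in for `AeroConfig`, and it assumes `num_modes()` can be evaluated at compile time. The `std::vector` option is meant for host-side test and validation code only, not for device kernels.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Real = double;  // assumption: stands in for the project's Real typedef

// Hypothetical stand-in for AeroConfig, assuming num_modes() is constexpr.
struct AeroConfigSketch {
  static constexpr int num_modes() { return 4; }
};

// Option 1: the size is known at compile time, so declare it constexpr and
// the array is an ordinary fixed-size array rather than a VLA.
inline Real sum_mode_weights() {
  constexpr int nmodes = AeroConfigSketch::num_modes();
  Real eta[nmodes] = {0.0};  // one brace level, which also avoids
                             // -Wbraced-scalar-init
  Real total = 0.0;
  for (int m = 0; m < nmodes; ++m) {
    eta[m] = 1.0;
    total += eta[m];
  }
  return total;
}

// Option 2: the size is only known at run time (for example, read from an
// input file), so use a std::vector on the host instead of a VLA.
inline std::vector<Real> make_invariants(int nfs) {
  assert(nfs >= 0);
  return std::vector<Real>(static_cast<std::size_t>(nfs), 0.0);
}
```

Option 1 would fit the `nmodes` case in `mam4_ndrop_unit_tests.cpp` if the size really is a compile-time constant. Option 2 fits cases like `nfs`, which comes from input data.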

@codecov

codecov bot commented May 16, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 93.45%. Comparing base (7721254) to head (829d12e).
Report is 7 commits behind head on main.

Additional details and impacted files
@@           Coverage Diff           @@
##             main     #445   +/-   ##
=======================================
  Coverage   93.45%   93.45%           
=======================================
  Files         303      303           
  Lines       25151    25151           
  Branches     2784     2784           
=======================================
  Hits        23505    23505           
  Misses       1646     1646           

@mjschmidt271 mjschmidt271 force-pushed the mjs/m4x/intel-sycl-build branch from 840fae3 to b3e7df8 Compare May 20, 2025 00:43
fix precision case error
@mjschmidt271
Collaborator Author

Ok! It looks like I've finally gotten this ironed out.

As it turns out, our mam4xx/haero build is highly sensitive to the CUDA architecture set in the nvcc_wrapper found in kokkos/bin.

  • This could be attributable to the modifications I've recently made or something I'm missing, but things appear to be working reliably now, nonetheless.
  • My current fix involves the same hack I use in the autotester workflow--using sed to replace the default value with the correct value provided via command line arg.
    • This is far from an elegant fix, so I'm open to feedback on other ideas.

In the long term, it may be best to have a conversation about bringing haero's machinery into mam4xx since setting up the builds for other GPU architectures was a little more challenging than expected. I've also introduced extra complexity with my patches, and I would very much like to avoid making it even more arcane.

However, I'd say there's also an even chance that we're past that hurdle and things should remain stable for a while 🤞

@singhbalwinder @odiazib @overfelt @jaelynlitz @jeff-cohere

Collaborator

@jeff-cohere jeff-cohere left a comment

Nice work! I think your comment about folding haero into mam4xx makes sense at this point. We built haero for a previous imagining of a C++ aerosol model, so having it separate from mam4xx hasn't made a whole lot of sense for a while (though it did give us a nice standalone testing environment that I think is worth keeping).

Contributor

@odiazib odiazib left a comment

It looks great! I was also able to compile MAM4xx on Aurora using this branch. I only recommend a few changes to build-haero so that we can use this script on all the machines.

build-haero.sh Outdated
# NOTE: if CXX is set to nvcc_wrapper, then this must be the same path used
# in the `sed` command below
# This happens by default via the $nvcw variable
CXX="$(pwd)/.haero/ext/ekat/extern/kokkos/bin/nvcc_wrapper"
Contributor

We need build-haero to be independent of the GPU type. Perhaps we should pass a new input parameter for the GPU type.

Collaborator Author

Ah, yeah. I originally meant for that to be commented out. In line with your following comment, though, I did a big rewrite of build-haero, and it now works for all three GPU types via command-line args. It does, however, require an additional arg.

I won't say it's pretty, and it's getting to the point where it's feeling too complicated to be long-lived... but it does appear to be reasonably capable for the time being.

Tested on blake (NVIDIA H100), caraway (AMD MI200), and aurora (Intel Ponte Vecchio), and things are mostly good. I did, however, run into a few failing tests on AMD. Test names and output are below.

The following tests FAILED:
        512 - validate_chm_diags_ts_355 (Failed)
        544 - validate_gas_washout_merged (Failed)
        586 - validate_aer_rad_props_lw_ts_355 (Failed)
        588 - validate_aer_rad_props_sw_ts_355 (Failed)
Output from failing tests
Start testing: May 21 17:56 MDT
----------------------------------------------------------
512/646 Testing: validate_chm_diags_ts_355
512/646 Test: validate_chm_diags_ts_355
Command: "/projects/x86-64-zen-rocky8/utilities/python/3.10.12/gcc/8.5.0/base/jliu53k/bin/python3" "compare_mam4xx_mam4.py" "mam4xx_chm_diags_ts_355.py" "mam_chm_diags_ts_355.py" "True" "9e-2"
Directory: /home/mjschm/cara_mam4xx/build/src/validation/mo_chm_diags
"validate_chm_diags_ts_355" start time: May 21 17:56 MDT
Output:
----------------------------------------------------------
area
L1 5.304647879e+93
L2 5.304647879e+93
Linf 5.304647879e+93

L1 rel_error 1.0
L2 rel_error 1.0
Linf rel_error 1.0
df_nhx
L1 0.0
L2 0.0
Linf 0.0
df_noy
L1 0.0
L2 0.0
Linf 0.0
df_sox
L1 2.1127595761198383e-12
L2 2.1127595761198383e-12
Linf 2.1127595761198383e-12
drymass
L1 5.3344978817229e+97
L2 8.124504007622111e+96
Linf 2.087694766e+96

L1 rel_error 25.55209683235322
L2 rel_error 3.8916148758606943
Linf rel_error 1.0
mass
L1 5.3634051141167e+97
L2 8.167922012785719e+96
Linf 2.100738982e+96

L1 rel_error 25.531040077194607
L2 rel_error 3.888118458681374
Linf rel_error 1.0
mass_bc
L1 9.60847203014767e-10
L2 1.620904641972315e-10
Linf 4.939643042681813e-11
mass_dst
L1 5.645219727811883e-10
L2 9.138351106323364e-11
Linf 3.0207107116600906e-11
mass_mom
L1 2.642731209640951e-10
L2 4.486074748302841e-11
Linf 1.131394327747087e-11
mass_ncl
L1 1.26390509960511e-08
L2 2.1489251095492187e-09
Linf 5.552468649288639e-10

mass_pom
L1 6.112871615165164e-09
L2 9.78679018295419e-10
Linf 2.7110255655831923e-10

mass_so4
L1 2.722059208483874e-08
L2 3.873118373798847e-09
Linf 8.841201006732738e-10

mass_soa
L1 4.294473015494709e-08
L2 6.028376698333038e-09
Linf 1.0898451258843572e-09

mmr_nhx
L1 0.0
L2 0.0
Linf 0.0
mmr_noy
L1 0.0
L2 0.0
Linf 0.0
mmr_sox
L1 1.8649615594912017e-08
L2 3.95657473703572e-09
Linf 1.7420350123242822e-09

net_chem
L1 8.704525419491872e+98
L2 1.6843361180595725e+98
Linf 6.728531663e+97

L1 rel_error 12.936738437834519
L2 rel_error 2.5032744176886137
Linf rel_error 1.0
ozone_col
L1 188.36638731991184
L2 188.36638731991184
Linf 188.36638731991184

L1 rel_error 0.4220976935224353
L2 rel_error 0.4220976935224353
Linf rel_error 0.4220976935224353
ozone_strat
L1 194.8069847646648
L2 194.8069847646648
Linf 194.8069847646648

L1 rel_error 0.45500197248116503
L2 rel_error 0.45500197248116503
Linf rel_error 0.45500197248116503
ozone_trop
L1 22.84425401524706
L2 22.84425401524706
Linf 22.84425401524706

L1 rel_error 0.4819257021003174
L2 rel_error 0.4819257021003174
Linf rel_error 0.4819257021003174
vmr_brox
L1 0.0
L2 0.0
Linf 0.0
vmr_broy
L1 0.0
L2 0.0
Linf 0.0
vmr_clox
L1 0.0
L2 0.0
Linf 0.0
vmr_cloy
L1 0.0
L2 0.0
Linf 0.0
vmr_nox
L1 0.0
L2 0.0
Linf 0.0
vmr_noy
L1 0.0
L2 0.0
Linf 0.0
vmr_tcly
L1 0.0
L2 0.0
Linf 0.0
final pass array = [False  True  True  True False False  True  True  True  True  True  True
  True  True  True  True False False False False  True  True  True  True
  True  True  True]
Traceback (most recent call last):
  File "/home/mjschm/cara_mam4xx/build/src/validation/mo_chm_diags/compare_mam4xx_mam4.py", line 136, in <module>
    assert(np.all(pass_all_tests))
AssertionError
<end of output>
Test time =   0.40 sec
----------------------------------------------------------
Test Failed.
"validate_chm_diags_ts_355" end time: May 21 17:56 MDT
"validate_chm_diags_ts_355" time elapsed: 00:00:00
----------------------------------------------------------

544/646 Testing: validate_gas_washout_merged
544/646 Test: validate_gas_washout_merged
Command: "/projects/x86-64-zen-rocky8/utilities/python/3.10.12/gcc/8.5.0/base/jliu53k/bin/python3" "compare_mam4xx_mam4.py" "mam4xx_gas_washout_merged.py" "mam_gas_washout_merged.py" "True" "9e-8"
Directory: /home/mjschm/cara_mam4xx/build/src/validation/mo_sethet
"validate_gas_washout_merged" start time: May 21 17:56 MDT
Output:
----------------------------------------------------------
xgas
L1 8090334167.5109825
L2 566043812.1790568
Linf 91848861.27848816

L1 rel_error 0.3770747033609994
L2 rel_error 0.026382198577640813
Linf rel_error 0.004280896363924451
final pass array = [False]
Traceback (most recent call last):
  File "/home/mjschm/cara_mam4xx/build/src/validation/mo_sethet/compare_mam4xx_mam4.py", line 136, in <module>
    assert(np.all(pass_all_tests))
AssertionError
<end of output>
Test time =   0.26 sec
----------------------------------------------------------
Test Failed.
"validate_gas_washout_merged" end time: May 21 17:56 MDT
"validate_gas_washout_merged" time elapsed: 00:00:00
----------------------------------------------------------

586/646 Testing: validate_aer_rad_props_lw_ts_355
586/646 Test: validate_aer_rad_props_lw_ts_355
Command: "/projects/x86-64-zen-rocky8/utilities/python/3.10.12/gcc/8.5.0/base/jliu53k/bin/python3" "compare_mam4xx_mam4.py" "mam4xx_aer_rad_props_lw_ts_355.py" "mam_aer_rad_props_lw_ts_355.py" "True" "7e-11"
Directory: /home/mjschm/cara_mam4xx/build/src/validation/aerosol_optics
"validate_aer_rad_props_lw_ts_355" start time: May 21 17:56 MDT
Output:
----------------------------------------------------------
odap_aer
L1 8.149720146521368e-11
L2 1.2350049868414222e-11
Linf 5.907174489144795e-12


L1 rel_error 1.5365303214624192e-07
qqcw
L1 0.004725936932030295
L2 0.0029179972368972426
Linf 0.0028033703565597534

L1 rel_error 6.392993991358082e-11
L2 rel_error 3.947310146237891e-11
Linf rel_error 3.792249050885764e-11
final pass array = [False  True]
Traceback (most recent call last):
  File "/home/mjschm/cara_mam4xx/build/src/validation/aerosol_optics/compare_mam4xx_mam4.py", line 136, in <module>
    assert(np.all(pass_all_tests))
AssertionError
<end of output>
Test time =   0.32 sec
----------------------------------------------------------
Test Failed.
"validate_aer_rad_props_lw_ts_355" end time: May 21 17:56 MDT
"validate_aer_rad_props_lw_ts_355" time elapsed: 00:00:00
----------------------------------------------------------

588/646 Testing: validate_aer_rad_props_sw_ts_355
588/646 Test: validate_aer_rad_props_sw_ts_355
Command: "/projects/x86-64-zen-rocky8/utilities/python/3.10.12/gcc/8.5.0/base/jliu53k/bin/python3" "compare_mam4xx_mam4.py" "mam4xx_aer_rad_props_sw_ts_355.py" "mam_aer_rad_props_sw_ts_355.py" "True" "8e-11"
Directory: /home/mjschm/cara_mam4xx/build/src/validation/aerosol_optics
"validate_aer_rad_props_sw_ts_355" start time: May 21 17:56 MDT
Output:
----------------------------------------------------------
qqcw
L1 0.004725936932030295
L2 0.0029179972368972426
Linf 0.0028033703565597534

L1 rel_error 6.392993991358082e-11
L2 rel_error 3.947310146237891e-11
Linf rel_error 3.792249050885764e-11
tau
L1 5.680481426327253e-09
L2 8.136824862342645e-10
Linf 3.305621523036484e-10

L1 rel_error 4.1424822308460483e-07
L2 rel_error 5.933766854960738e-08
Linf rel_error 2.4106193460321693e-08
tau_w
L1 5.493229332899513e-09
L2 7.97048022772819e-10
Linf 3.2231628650791766e-10

L1 rel_error 5.9386263058373104e-09
L2 rel_error 8.616735381327773e-10
Linf rel_error 3.484500394680191e-10
tau_w_f
L1 1.8698060176701675e-09
L2 3.4550021959878085e-10
Linf 1.659684395124983e-10

L1 rel_error 2.5879668064638994e-09
L2 rel_error 4.782009959844717e-10
Linf rel_error 2.297141031314855e-10
tau_w_g
L1 3.2453379731259033e-09
L2 5.356132473482768e-10
Linf 2.3360460787991144e-10

L1 rel_error 3.8180446742657685e-09
L2 rel_error 6.301332321744433e-10
Linf rel_error 2.7482895044695463e-10
final pass array = [ True False False False False]
Traceback (most recent call last):
  File "/home/mjschm/cara_mam4xx/build/src/validation/aerosol_optics/compare_mam4xx_mam4.py", line 136, in <module>
    assert(np.all(pass_all_tests))
AssertionError
<end of output>
Test time =   0.39 sec
----------------------------------------------------------
Test Failed.
"validate_aer_rad_props_sw_ts_355" end time: May 21 17:56 MDT
"validate_aer_rad_props_sw_ts_355" time elapsed: 00:00:00
----------------------------------------------------------

End testing: May 21 17:56 MDT

Collaborator Author

Late update: some tests also fail on Aurora.

The following tests FAILED:
         19 - mode_averages (Failed)
         64 - validate_stand_modal_aero_calcsize_sub_update_ptend (Failed)
         66 - validate_stand_calcsize_aero_model_wetdep_ts_379 (Failed)
Output from tests failing on Aurora
Start testing: May 22 00:28 UTC
----------------------------------------------------------
19/646 Testing: mode_averages
19/646 Test: mode_averages
Command: "/usr/bin/sh" "-c" "/home/mjschm/mam4xx/build/bin/test-launcher -- ./mode_averages --use-colour no"
Directory: /home/mjschm/mam4xx/build/src/tests
"mode_averages" start time: May 22 00:28 UTC
Output:
----------------------------------------------------------
Calling initialize_kokkos
 ExecSpace name: SYCL
 ExecSpace initialized: yes
 active avx set: 
 compiler id: IntelLLVM
 FPE support is enabled, current FPE mask: 0 (NONE)
 #host threads: 1


~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
mode_averages is a Catch v2.13.8 host application.
Run with -? for options

-------------------------------------------------------------------------------
modal_averages
-------------------------------------------------------------------------------
/home/mjschm/mam4xx/src/tests/mode_averages_unit_tests.cpp:27
...............................................................................

/home/mjschm/mam4xx/src/tests/mode_averages_unit_tests.cpp:27: FAILED:
due to unexpected exception with message:
  The program was built for 1 devices
  Build program log for 'Intel(R) Data Center GPU Max 1550':
  Module <0x3d8fd10>:  Unresolved Symbol <nan>
  Module <0x3d8fd10>:  Unresolved Symbol <nan>

===============================================================================
test cases: 1 | 1 failed
assertions: 1 | 1 failed

EKAT is not managing resources.
RUN: OMP_PROC_BIND=spread OMP_PLACES=threads ./mode_averages --use-colour no
FROM: /home/mjschm/mam4xx/build/src/tests
<end of output>
Test time =   1.50 sec
----------------------------------------------------------
Test Failed.
"mode_averages" end time: May 22 00:28 UTC
"mode_averages" time elapsed: 00:00:01
----------------------------------------------------------

64/646 Testing: validate_stand_modal_aero_calcsize_sub_update_ptend
64/646 Test: validate_stand_modal_aero_calcsize_sub_update_ptend
Command: "/opt/aurora/24.347.0/spack/unified/0.9.2/install/linux-sles15-x86_64/gcc-13.3.0/python-venv-1.0-a4pusmc/bin/python3" "compare_mam4xx_mam4.py" "mam4xx_stand_modal_aero_calcsize_sub_update_ptend.py" "mam_stand_modal_aero_calcsize_sub_update_ptend.py" "True" "3e-5"
Directory: /home/mjschm/mam4xx/build/src/validation/calcsize
"validate_stand_modal_aero_calcsize_sub_update_ptend" start time: May 22 00:28 UTC
Output:
----------------------------------------------------------
dgnumdry_m
L1 4.758090000438264e-12
L2 6.120652745099863e-13
Linf 1.8500000014169075e-13
ptend_q
L1 0.0016484514219205918
L2 0.0010593508120839688
Linf 0.0007477462949000001
L1 rel_error 0.0003181953256389337
L2 rel_error 0.000204483111928284
Linf rel_error 0.00014433508481784857
qqcw
L1 0.0
L2 0.0
Linf 0.0
final pass array = [ True False  True]
Traceback (most recent call last):
  File "/home/mjschm/mam4xx/build/src/validation/calcsize/compare_mam4xx_mam4.py", line 136, in <module>
    assert(np.all(pass_all_tests))
AssertionError
<end of output>
Test time =   0.13 sec
----------------------------------------------------------
Test Failed.
"validate_stand_modal_aero_calcsize_sub_update_ptend" end time: May 22 00:28 UTC
"validate_stand_modal_aero_calcsize_sub_update_ptend" time elapsed: 00:00:00
----------------------------------------------------------

66/646 Testing: validate_stand_calcsize_aero_model_wetdep_ts_379
66/646 Test: validate_stand_calcsize_aero_model_wetdep_ts_379
Command: "/opt/aurora/24.347.0/spack/unified/0.9.2/install/linux-sles15-x86_64/gcc-13.3.0/python-venv-1.0-a4pusmc/bin/python3" "compare_mam4xx_mam4.py" "mam4xx_stand_calcsize_aero_model_wetdep_ts_379.py" "mam_stand_calcsize_aero_model_wetdep_ts_379.py" "True" "1.5e-3"
Directory: /home/mjschm/mam4xx/build/src/validation/calcsize
"validate_stand_calcsize_aero_model_wetdep_ts_379" start time: May 22 00:28 UTC
Output:
----------------------------------------------------------
dgnumdry_m
L1 4.383719999403568e-12
L2 5.946998775934814e-13
Linf 1.860000004629556e-13
ptend_q
L1 14.419584504071418
L2 10.195808596672368
Linf 7.209525398
L1 rel_error 0.19365119759473054
L2 rel_error 0.1369270067826688
Linf rel_error 0.09682201501840247
qqcw
L1 91388.94299998647
L2 25167.88959375477
Linf 9074.523000000045
L1 rel_error 0.0010769036541811079
L2 rel_error 0.0002965718978886232
Linf rel_error 0.0001069318306772845
final pass array = [ True False  True]
Traceback (most recent call last):
  File "/home/mjschm/mam4xx/build/src/validation/calcsize/compare_mam4xx_mam4.py", line 136, in <module>
    assert(np.all(pass_all_tests))
AssertionError
<end of output>
Test time =   0.11 sec
----------------------------------------------------------
Test Failed.
"validate_stand_calcsize_aero_model_wetdep_ts_379" end time: May 22 00:28 UTC
"validate_stand_calcsize_aero_model_wetdep_ts_379" time elapsed: 00:00:00
----------------------------------------------------------

End testing: May 22 00:28 UTC

Contributor

I can take a look at the tests that are failing on Aurora. I wonder if we should merge this PR, @singhbalwinder @jeff-cohere. @mjschmidt271, can you create an issue for the failing tests on Aurora? For the tests on AMD, I have PR 442 that fixes a few tests for Frontier. For Caraway, I will need to build and run the tests on that machine.

const auto ws = wsat(T, p);
// KOKKOS_INLINE_FUNCTION Real relative_humidity_from_vapor_mixing_ratio(
// Real w, Real T, Real p,
// Real (*wsat)(Real, Real) = saturation_mixing_ratio_hardy) {
Contributor

I also noted a compilation error on Aurora because of this function. However, we have compiled MAM4xx in EAMxx with SYCL, and this error did not show up. Thus, I wonder if we are only using relative_humidity_from_vapor_mixing_ratio in MAM4xx.

Collaborator Author

That would make sense to me. However, unless we have reason for the flexibility to be available, I think we're ok without it.

@singhbalwinder - do you know if this calculation would ever need a different formula for the saturation mixing ratio?

Contributor

Let's just keep this commented out until we find out that we need it.

@mjschmidt271 mjschmidt271 merged commit eca4a54 into main May 27, 2025
14 checks passed
@mjschmidt271 mjschmidt271 deleted the mjs/m4x/intel-sycl-build branch May 27, 2025 21:20