Skip to content

Conversation

@tcclevenger
Copy link
Contributor

@tcclevenger tcclevenger commented Aug 27, 2025

This is in an effort to continually update Kokkos with new releases, instead of large subversion jumps each time a component needs a new version.

[BFB]

@tcclevenger tcclevenger requested a review from bartgol August 27, 2025 19:49
@tcclevenger tcclevenger added the Testing Anything related to unit/system tests label Aug 27, 2025
Copy link
Contributor Author

@tcclevenger tcclevenger left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@bartgol I added comments for every place this differs from eamxx-sa. Is there an easy way for me to manually test a .github workflow?

Copy link
Contributor

@bartgol bartgol left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that, unless the workflow is in master, it must run at least once to appear on the actions page. If you put back the PR trigger, it may run, and at that point, you could manually test it form the actions tab via workflow dispatch...

@tcclevenger
Copy link
Contributor Author

@bartgol Looks like it doesn't pick it up even with the PR tab.

@bartgol bartgol added the CI: workflow change approved Allow gh action PR testing on ghci-snl-* machines for PRs that alter a worfklow file label Aug 28, 2025
@tcclevenger tcclevenger added the CI: approved Allow gh actions PR testing on ghci-snl-* machines label Aug 28, 2025
@tcclevenger tcclevenger marked this pull request as draft August 28, 2025 16:13
@tcclevenger tcclevenger marked this pull request as ready for review September 2, 2025 13:52
@tcclevenger tcclevenger requested a review from bartgol September 2, 2025 13:52
@tcclevenger tcclevenger force-pushed the tcclevenger/eamxx/add_kokkos_dev_test branch 5 times, most recently from 8845fd6 to a98fe3e Compare September 2, 2025 20:23
@tcclevenger tcclevenger marked this pull request as draft September 3, 2025 01:15
@tcclevenger
Copy link
Contributor Author

Previously, C++20 was not being used (EAMxx overwrote user defined CMAKE_CXX_STANDARD var) and tests were passing. Now that we are using C++20, the builds are failing, so I changed to draft while I address the errors.

@tcclevenger tcclevenger force-pushed the tcclevenger/eamxx/add_kokkos_dev_test branch from c16f131 to 5da2ddc Compare September 3, 2025 15:04
Copy link
Contributor

@bartgol bartgol left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One thought I just had: why don't we make the new workflow do the following:

  • create a temporary branch like kokkos-develop-testing/<DATE>
  • update kokkos submod of ekat in this branch, and push to the remote
  • call the eamxx-v1 and eamxx-sa (or just one) workflow

the AI bot suggests this step for triggering another workflow (not sure if it works)

          - name: Trigger target workflow
            uses: actions/github-script@v6
            with:
              github-token: ${{ secrets.PAT_TOKEN }} # Use a PAT with 'repo' scope
              script: |
                await github.rest.actions.createWorkflowDispatch({
                  owner: context.repo.owner,
                  repo: context.repo.repo,
                  workflow_id: 'target-workflow.yml', // The file name of the target workflow
                  ref: 'main', // The branch on which the workflow should be triggered
                  inputs: {
                    # Optional inputs for the dispatched workflow
                    my_input_key: 'my_input_value',
                    another_input: 'another_value'
                  }
                });

But whatever the syntax, I'm thinking that calling the existing eamxx-sa/eamxx-v1 workflows (to run on the newly created "kokkos-update" branch) is more robust, as we only have to maintain the testing framework in one workflow.

@tcclevenger
Copy link
Contributor Author

One thought I just had: why don't we make the new workflow do the following:

* create a temporary branch like `kokkos-develop-testing/<DATE>`

* update kokkos submod of ekat in this branch, and push to the remote

* call the eamxx-v1 and eamxx-sa (or just one) workflow

the AI bot suggests this step for triggering another workflow (not sure if it works)

          - name: Trigger target workflow
            uses: actions/github-script@v6
            with:
              github-token: ${{ secrets.PAT_TOKEN }} # Use a PAT with 'repo' scope
              script: |
                await github.rest.actions.createWorkflowDispatch({
                  owner: context.repo.owner,
                  repo: context.repo.repo,
                  workflow_id: 'target-workflow.yml', // The file name of the target workflow
                  ref: 'main', // The branch on which the workflow should be triggered
                  inputs: {
                    # Optional inputs for the dispatched workflow
                    my_input_key: 'my_input_value',
                    another_input: 'another_value'
                  }
                });

But whatever the syntax, I'm thinking that calling the existing eamxx-sa/eamxx-v1 workflows (to run on the newly created "kokkos-update" branch) is more robust, as we only have to maintain the testing framework in one workflow.

I can try, the only difference is that new cmake options must be added in, so maybe if I can hack those in this will delete a lot of duplicated code.

@tcclevenger tcclevenger force-pushed the tcclevenger/eamxx/add_kokkos_dev_test branch from 8c1a746 to 3d33736 Compare September 8, 2025 20:05
@tcclevenger tcclevenger force-pushed the tcclevenger/eamxx/add_kokkos_dev_test branch from 9fa0823 to ad21ae7 Compare September 10, 2025 01:32
tcclevenger added a commit that referenced this pull request Sep 11, 2025
#7691)

In C++20 you cannot aggregate initialize a stuct which has a default
constructor defined. We shouldn't need these constructors since views
themselves have default constructors.

Also, allow EAMxx to take a CMAKE_CXX_STANDARD option (default to C++17).

Creating this PR so that #7643 won't have to run full CI while I test.

[BFB]
tcclevenger added a commit that referenced this pull request Sep 11, 2025
In C++20 you cannot aggregate initialize a stuct which has a default
constructor defined. We shouldn't need these constructors since views
themselves have default constructors.
    
Also, allow EAMxx to take a CMAKE_CXX_STANDARD option (default to C++17).
    
Creating this PR so that #7643 won't have to run full CI while I test.
    
[BFB]
@tcclevenger tcclevenger force-pushed the tcclevenger/eamxx/add_kokkos_dev_test branch from ad21ae7 to 2ab2f9a Compare September 11, 2025 00:39
@tcclevenger tcclevenger force-pushed the tcclevenger/eamxx/add_kokkos_dev_test branch 3 times, most recently from 9b8dc53 to 61eeff8 Compare September 19, 2025 17:18
@tcclevenger tcclevenger force-pushed the tcclevenger/eamxx/add_kokkos_dev_test branch from 61eeff8 to 66530c9 Compare September 19, 2025 17:33
Copy link
Contributor Author

@tcclevenger tcclevenger left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@bartgol This is ready to take a look. I opted to make changes inside the test-all-eamxx action instead of having a new update-kokkos-to-dev action which checks out the branch and edits the .cmake files with 3 needed options. I did this because I needed to modify test-all-eamxx anyways to alter the name of the cdash report to distinguish from the normal nightlies.

I manually ran and submitted to CDASH. Here's one of the tests: https://my.cdash.org/build/3099376

Note, CUDA fails because we need to update version 12.1 -> 12.2+ to satisfy Kokkos' version requirements.

@tcclevenger tcclevenger marked this pull request as ready for review September 19, 2025 20:16
@tcclevenger tcclevenger requested a review from bartgol September 19, 2025 20:16
Copy link
Contributor

@bartgol bartgol left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good. I left just one comment on the test-all arg.

Also, our ghci container for CUDA will be soon updated to be ubi9 based, with cuda 12.4. It used to have 12.1 to mimic pm-gpu, but our pm-gpu defaults have since been updated, so the container should be updated as well. I already submitted the image to the AT2 folks, so it's ready to change any moment (prob this week).

@tcclevenger tcclevenger force-pushed the tcclevenger/eamxx/add_kokkos_dev_test branch 2 times, most recently from 009ed75 to 39b7825 Compare September 22, 2025 17:30
@bartgol bartgol added CI: workflow change approved Allow gh action PR testing on ghci-snl-* machines for PRs that alter a worfklow file and removed CI: workflow change approved Allow gh action PR testing on ghci-snl-* machines for PRs that alter a worfklow file labels Sep 22, 2025
@tcclevenger
Copy link
Contributor Author

Manually submitted to CDASH, and went as expected. Merging.

tcclevenger added a commit that referenced this pull request Sep 22, 2025
)

This is in an effort to continually update Kokkos with new releases, instead
of large subversion jumps each time a component needs a new version.

[BFB]
@tcclevenger tcclevenger merged commit 65476b0 into E3SM-Project:master Sep 22, 2025
22 of 28 checks passed
@tcclevenger tcclevenger deleted the tcclevenger/eamxx/add_kokkos_dev_test branch September 22, 2025 19:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

CI: approved Allow gh actions PR testing on ghci-snl-* machines CI: workflow change approved Allow gh action PR testing on ghci-snl-* machines for PRs that alter a worfklow file Testing Anything related to unit/system tests

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants