Sets Up Autotesting on SNL Machines #426
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,27 @@ | ||
| name: 'Show workflow trigger' | ||
| description: 'Prints what triggered this workflow' | ||
|
|
||
| runs: | ||
| using: "composite" | ||
| steps: | ||
| - name: Print trigger info | ||
| uses: actions/github-script@v7 | ||
| with: | ||
| script: | | ||
| const eventName = context.eventName; | ||
| const actor = context.actor || 'unknown'; // Default to 'unknown' if actor is not defined | ||
| let eventAction = 'N/A'; | ||
|
|
||
| // Determine the event action based on the event type | ||
| if (eventName === 'pull_request') { | ||
| eventAction = context.payload.action || 'N/A'; | ||
| } else if (eventName === 'pull_request_review') { | ||
| eventAction = context.payload.review.state || 'N/A'; | ||
| } else if (eventName === 'workflow_dispatch') { | ||
| eventAction = 'manual trigger'; | ||
| } else if (eventName === 'schedule') { | ||
| eventAction = 'scheduled trigger'; | ||
| } | ||
| console.log(`The job was triggered by a ${eventName} event.`); | ||
| console.log(` - Event action: ${eventAction}`); | ||
| console.log(` - Triggered by: ${actor}`); |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,104 @@ | ||
| name: gcc-cuda | ||
|
|
||
| on: | ||
| workflow_call | ||
|
|
||
| jobs: | ||
| gcc-cuda: | ||
| runs-on: [self-hosted, m4xci-snl-cuda, cuda, gcc] | ||
| # NOTE: with fail-fast: true, the remaining matrix jobs are cancelled as soon as one fails | ||
| # set fail-fast: false to run every combination (more info, but slower) | ||
| continue-on-error: false | ||
| strategy: | ||
| fail-fast: true | ||
| matrix: | ||
| build-type: [Debug, Release] | ||
| fp-precision: [single, double] | ||
| name: gcc-cuda / ${{ matrix.build-type }} - ${{ matrix.fp-precision }} | ||
| steps: | ||
| - name: Check out the repository | ||
| uses: actions/checkout@v4 | ||
| with: | ||
| persist-credentials: false | ||
| show-progress: false | ||
| submodules: recursive | ||
| - name: Cloning Haero | ||
| uses: actions/checkout@v3 | ||
|
||
| with: | ||
| repository: eagles-project/haero | ||
| submodules: recursive | ||
| path: haero_src | ||
| - name: Show action trigger | ||
| uses: ./.github/actions/show-workflow-trigger | ||
| - name: Get CUDA Arch | ||
| # NOTE: for now, only running on an H100 machine, but keep anyway | ||
| run: | | ||
| # Ensure nvidia-smi is available | ||
| if ! command -v nvidia-smi &> /dev/null; then | ||
| echo "nvidia-smi could not be found. Please ensure you have Nvidia drivers installed." | ||
| exit 1 | ||
| fi | ||
|
|
||
| # Get the GPU model from nvidia-smi, and set env for next step | ||
| gpu_model=$(nvidia-smi --query-gpu=name --format=csv,noheader | head -n 1) | ||
| case "$gpu_model" in | ||
| *"H100"*) | ||
| echo "H100 detected--setting Hopper90 architecture" | ||
| echo "Hopper=ON" >> $GITHUB_ENV | ||
| echo "CUDA_ARCH=90" >> $GITHUB_ENV | ||
| ARCH=90 | ||
| ;; | ||
| *"A100"*) | ||
| echo "A100 detected--setting Ampere80 architecture" | ||
| echo "Ampere=ON" >> $GITHUB_ENV | ||
| echo "CUDA_ARCH=80" >> $GITHUB_ENV | ||
| ;; | ||
| *"V100"*) | ||
| echo "V100 detected--setting Volta70 architecture" | ||
| echo "Volta=ON" >> $GITHUB_ENV | ||
| echo "CUDA_ARCH=70" >> $GITHUB_ENV | ||
| ;; | ||
| *) | ||
| echo "Unsupported GPU model: $gpu_model" | ||
| exit 1 | ||
| ;; | ||
| esac | ||
| - name: Building Haero (${{ matrix.build-type }}, ${{ matrix.fp-precision }} precision) | ||
| run: | | ||
| cmake -S haero_src -B haero_build \ | ||
| -DCMAKE_BUILD_TYPE=${{ matrix.build-type }} \ | ||
| -DCMAKE_INSTALL_PREFIX="haero_install" \ | ||
| -DCMAKE_C_COMPILER=gcc \ | ||
| -DCMAKE_CXX_COMPILER=g++ \ | ||
| -DHAERO_ENABLE_MPI=OFF \ | ||
| -DHAERO_ENABLE_GPU=ON \ | ||
| -DHAERO_PRECISION=${{ matrix.fp-precision }} | ||
| cd haero_build | ||
| make -j | ||
| make install | ||
| - name: Set nvcc_wrapper Arch | ||
| run: | | ||
| sed -i "s/default_arch=\"sm_70\"/default_arch=\"sm_${CUDA_ARCH}\"/g" "$(pwd)/haero_install/bin/nvcc_wrapper" | ||
| echo "====================================" | ||
| grep -i "default_arch=" `pwd`/haero_install/bin/nvcc_wrapper | ||
| - name: Configuring MAM4xx (${{ matrix.build-type }}, ${{ matrix.fp-precision }} precision) | ||
| run: | | ||
| cmake -S . -B build \ | ||
| -DCMAKE_CXX_COMPILER=`pwd`/haero_install/bin/nvcc_wrapper \ | ||
| -DCMAKE_C_COMPILER=gcc \ | ||
| -DCMAKE_INSTALL_PREFIX=`pwd`/install \ | ||
| -DCMAKE_BUILD_TYPE=${{ matrix.build-type }} \ | ||
| -DMAM4XX_HAERO_DIR=`pwd`/haero_install \ | ||
| -DNUM_VERTICAL_LEVELS=72 \ | ||
| -DENABLE_COVERAGE=OFF \ | ||
| -DENABLE_SKYWALKER=ON \ | ||
| -DCMAKE_CUDA_ARCHITECTURES=${CUDA_ARCH} \ | ||
|
||
| -G "Unix Makefiles" | ||
| - name: Building MAM4xx (${{ matrix.build-type }}, ${{ matrix.fp-precision }} precision) | ||
| run: | | ||
| cd build | ||
| make | ||
| - name: Running tests (${{ matrix.build-type }}, ${{ matrix.fp-precision }} precision) | ||
| run: | | ||
| cd build | ||
| ctest -V --output-on-failure | ||
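
The "Get CUDA Arch" step above maps a GPU model string to a compute capability inside the workflow itself. As a minimal sketch, the same mapping could live in a small shell function so it can be tried without a GPU. The function name `cuda_arch_for_gpu` and the sample model strings are hypothetical and not part of this PR; only the three mappings come from the workflow.

```shell
# Hypothetical helper (not part of the PR): map a GPU model string to the
# CUDA compute capability used by the workflow's case statement.
cuda_arch_for_gpu() {
  case "$1" in
    *H100*) echo 90 ;;  # Hopper
    *A100*) echo 80 ;;  # Ampere
    *V100*) echo 70 ;;  # Volta
    *)
      echo "Unsupported GPU model: $1" >&2
      return 1
      ;;
  esac
}

# Usage with a hard-coded model string, so no GPU or nvidia-smi is needed.
arch=$(cuda_arch_for_gpu "NVIDIA H100 80GB HBM3")
echo "CUDA_ARCH=${arch}"
```

In the workflow, the argument would come from the `nvidia-smi --query-gpu=name` call, and the `CUDA_ARCH=...` line would be appended to `$GITHUB_ENV` rather than printed.
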
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,36 @@ | ||
| name: SNL-AT2 | ||
|
Contributor
I wonder if we should name this 'AT' only?
Collaborator, Author
The name of the software product that facilitates this is "AT2 (Autotester2)". However, what we call it on our side doesn't make a bit of difference to me.
||
|
|
||
| on: | ||
| # Runs on PRs against main | ||
| pull_request: | ||
| branches: [ main ] | ||
| types: [opened, synchronize, ready_for_review, reopened] | ||
| paths: | ||
| # first, yes to these | ||
| - '.github/workflows/at2_snl.yml' | ||
| - 'src/mam4xx/**' | ||
| - 'src/tests/**' | ||
| - 'src/validation/**' | ||
| # second, no to these | ||
| - '!src/tests/data/**' | ||
| # not sure whether this should be disabled--keep for now | ||
| # - '!src/validation/mam_x_validation/**' | ||
|
|
||
| # Manual run | ||
| workflow_dispatch: | ||
|
|
||
| # # Add schedule trigger for nightly runs at midnight MT (Standard Time) | ||
| # schedule: | ||
| # - cron: '0 7 * * *' # Runs at 7 AM UTC, which is midnight MT during Standard Time | ||
|
|
||
| concurrency: | ||
| # Two runs are in the same group if they are testing the same git ref | ||
| # - if trigger=pull_request, the ref is refs/pull/<PR_NUMBER>/merge | ||
| # - for other triggers, the ref is the branch tested | ||
| group: ${{ github.workflow }}-${{ github.ref }} | ||
| cancel-in-progress: true | ||
|
|
||
| jobs: | ||
| gcc-cuda: | ||
| uses: | ||
| ./.github/workflows/at2_gcc-cuda.yml | ||
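
The commented-out `schedule` trigger notes that 7 AM UTC is midnight Mountain Time only during Standard Time. GitHub Actions evaluates cron expressions in UTC, so the local start time moves by an hour when daylight saving time is in effect. The sketch below checks this; it assumes GNU `date` (for `-d`) and uses a POSIX `TZ` string for US Mountain Time. The helper name `utc_to_mt` and the two sample dates are arbitrary choices for illustration.

```shell
# POSIX TZ string for US Mountain Time (an assumption made for this sketch;
# it avoids depending on the tz database being installed).
MT='MST7MDT,M3.2.0,M11.1.0'

# Hypothetical helper: convert a UTC timestamp to Mountain Time.
# Requires GNU date, since -d is not portable.
utc_to_mt() {
  TZ="$MT" date -d "$1 UTC" '+%H:%M %Z'
}

utc_to_mt '2024-01-15 07:00'   # a winter date (standard time)
utc_to_mt '2024-07-15 07:00'   # a summer date (daylight time)
```

The winter date lands on midnight MST and the summer date on 1 AM MDT, which matches the caveat in the workflow comment.
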
What versions of gcc and CUDA are we using?
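
One way to make the answer visible in the CI logs would be an extra step that prints the toolchain versions before the build. This is only a sketch and not part of the PR: the helper name `report_version` is hypothetical, and it assumes `gcc` and `nvcc` are on the runner's `PATH`, which the workflow does not confirm.

```shell
# Hypothetical helper: print a tool's --version output,
# or a note if the tool is not on PATH.
report_version() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "== $1 =="
    "$1" --version
  else
    echo "$1: not found on PATH"
  fi
}

report_version gcc
report_version nvcc
```

The full `--version` output is printed rather than only the first line, because `nvcc` reports its release number a few lines into its output.
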