ci(feat): use AWS ephemeral runners for external contributors #1892
Open
Signed-off-by: oliver könig <okoenig@nvidia.com>
Replace nemoci.azurecr.io with 766267172432.dkr.ecr.us-east-1.amazonaws.com. Remove all Azure CLI install, login, and ACR login steps from the build-container and test-template actions. Drop `environment: nemo-ci` (Azure-backed) from all jobs. Route the CPU unit-test job from linux-amd64-cpu16 to the AWS runner selected by is-not-external-contributor.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Signed-off-by: oliver könig <okoenig@nvidia.com>
Line numbers shifted due to the new is-not-external-contributor job; regenerated the baseline with detect-secrets scan.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Signed-off-by: oliver könig <okoenig@nvidia.com>
Please take a look at the feedback for the ephemeral runners on MBridge. It should all apply here as well.
Remove the `is-not-external-contributor` job and the copied `check-nvidia-sso-membership` action: the pre-flight workflow already handles SSO membership checks and exposes `runner_prefix` as an output for exactly this purpose. Pass `nemo-ci-aws-gpu-x2` / `nemo-ci-aws-gpu-x2-ephemeral` as the `default_runner_prefix` / `non_nvidia_runner_prefix` inputs to pre-flight and use `needs.pre-flight.outputs.runner_prefix` in all GPU jobs.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Signed-off-by: oliver könig <okoenig@nvidia.com>
Summary
Routes external contributors to isolated, ephemeral runners while NVIDIA maintainers keep the persistent ones, leveraging the SSO membership check already built into the pre-flight workflow.
Changes:
- Pass `nemo-ci-aws-gpu-x2` / `nemo-ci-aws-gpu-x2-ephemeral` as the `default_runner_prefix` / `non_nvidia_runner_prefix` inputs to the pre-flight workflow
- Use `needs.pre-flight.outputs.runner_prefix` in all GPU jobs (cicd-container-build, unit tests, e2e tests); the pre-flight workflow already handles SSO membership and emits the correct runner based on contributor status
- Remove the `is-not-external-contributor` job and the copied `check-nvidia-sso-membership` composite action

Example
```yaml
pre-flight:
  uses: NVIDIA-NeMo/FW-CI-templates/.github/workflows/_cicd_preflight.yml@v0.80.1
  with:
    default_runner_prefix: nemo-ci-aws-gpu-x2
    non_nvidia_runner_prefix: nemo-ci-aws-gpu-x2-ephemeral
```
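
A downstream GPU job could then consume the pre-flight output roughly as follows. This is a minimal sketch, not the PR's actual job definition: the job name `cicd-unit-tests-gpu` and the placeholder step are hypothetical, and it assumes the emitted `runner_prefix` can be used directly as the `runs-on` label (the real workflow may append a suffix or pass it to a template instead).

```yaml
# Sketch only: job name and step contents are hypothetical.
cicd-unit-tests-gpu:
  needs: [pre-flight]
  # Assumption: the emitted prefix is usable verbatim as the runner label.
  # NVIDIA members resolve to nemo-ci-aws-gpu-x2, everyone else to
  # nemo-ci-aws-gpu-x2-ephemeral, as decided by the pre-flight workflow.
  runs-on: ${{ needs.pre-flight.outputs.runner_prefix }}
  steps:
    - name: Checkout
      uses: actions/checkout@v4
    - name: Show selected runner
      run: echo "Selected runner ${{ needs.pre-flight.outputs.runner_prefix }}"
```

Keeping the selection in pre-flight means each GPU job only needs the `needs` dependency and the one expression, with no membership check of its own.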
Test plan
- PR from an NVIDIA member: `nemo-ci-aws-gpu-x2` runner is selected
- PR from an external contributor: `nemo-ci-aws-gpu-x2-ephemeral` runner is selected