@@ -5,54 +5,75 @@ This directory contains GitHub Actions workflows for CI/CD automation.
55## Workflows Overview
66
77
8- | Workflow | Trigger | Description |
9- | -------------------------------------------------- | ------------------- | ----------------------------------------------------- |
10- | [ci.yml](ci.yml) | Push to `main`, PRs | Lint, format check, and tests with coverage |
11- | [conventional-commit.yml](conventional-commit.yml) | PRs | Validates PR titles follow conventional commit format |
12- | [copyright-check.yml](copyright-check.yml) | PRs | Validates NVIDIA copyright headers on Python files |
13- | [dco-assistant.yml](dco-assistant.yml) | PRs, Comments | Manages DCO signing via PR comments |
14- | [release.yml](release.yml) | Manual dispatch | Builds and publishes package to PyPI |
15- | [secrets-detector.yml](secrets-detector.yml) | PRs | Scans for accidentally committed secrets |
8+ | Workflow | Trigger | Description |
9+ | -------------------------------------------------- | ------------------------------------- | ---------------------------------------------------- |
10+ | [ci-checks.yml](ci-checks.yml) | Push to `main`, PRs, manual | Format, lint, typecheck, and unit tests (CPU) |
11+ | [gpu-tests.yml](gpu-tests.yml) | Push to `main`/`pull-request/*`, manual | GPU E2E tests (A100) |
12+ | [conventional-commit.yml](conventional-commit.yml) | PRs | Validates PR titles follow conventional commit format |
13+ | [copyright-check.yml](copyright-check.yml) | Push to `main`/`pull-request/*` | Validates NVIDIA copyright headers on Python files |
14+ | [docs.yml](docs.yml) | Push to `main` (docs paths) | Builds and deploys documentation to GitHub Pages |
15+ | [release.yml](release.yml) | Manual dispatch | Builds and publishes package to PyPI |
16+ | [secrets-detector.yml](secrets-detector.yml) | PRs | Scans for accidentally committed secrets |
1617
1718
19+ ## Pull Request Testing (copy-pr-bot)
20+
21+ GPU tests (`gpu-tests.yml`) run on NVIDIA self-hosted runners, which block `pull_request`-triggered jobs. They use the [copy-pr-bot](https://docs.gha-runners.nvidia.com/platform/apps/copy-pr-bot/) pattern instead:
22+
23+ 1. When a PR is opened by a trusted user with trusted changes, `copy-pr-bot` automatically copies the code to a `pull-request/<number>` branch
24+ 2. The push to `pull-request/<number>` triggers the GPU workflow
25+ 3. Untrusted PRs require a vetter to comment `/ok to test <SHA>` before GPU tests run
26+ 4. Draft PRs do **not** auto-sync (`auto_sync_draft: false`), saving GPU resources
27+
28+ Configuration: [`.github/copy-pr-bot.yaml`](../copy-pr-bot.yaml)
29+
30+ CPU checks (`ci-checks.yml`) run on GitHub-hosted `ubuntu-latest` runners and use standard `pull_request` triggers.
31+
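The trigger side of this pattern can be sketched as follows. This is a minimal sketch, not the contents of `gpu-tests.yml`: the branch filter `pull-request/[0-9]+` follows the convention described in the copy-pr-bot documentation, while the job id `e2e`, the checkout step, and the test command are placeholders assumed for illustration.

```yaml
# Hypothetical excerpt; the real gpu-tests.yml may differ.
name: GPU Tests

on:
  push:
    branches:
      - main
      - "pull-request/[0-9]+"   # branches created by copy-pr-bot
  workflow_dispatch:            # manual runs

jobs:
  e2e:                          # placeholder job id
    runs-on: linux-amd64-gpu-a100-latest-1
    steps:
      - uses: actions/checkout@v4
      - name: Run GPU E2E tests
        run: pytest tests/e2e   # placeholder command and path
```

Because the workflow listens for `push` rather than `pull_request`, no GPU job starts until a `pull-request/<number>` branch exists, either through automatic sync for trusted PRs or after a vetter's `/ok to test <SHA>` comment.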
1832## Workflow Diagram
1933
2034``` mermaid
2135flowchart LR
2236 subgraph triggers [Triggers]
2337 push[Push to main]
24- pr[Pull Request]
25- comment[PR Comment]
38+ cpb[copy-pr-bot push to pull-request/*]
39+ pr[Pull Request event]
2640 manual[Manual Dispatch]
2741 end
2842
29- subgraph ci [CI Workflow]
43+ subgraph ci [CI Checks - GitHub-hosted runners]
44+ changes_ci[Detect Changes]
45+ format[Format]
3046 lint[Lint]
31- format[Format Check]
3247 typecheck[Typecheck]
33- test[Unit Tests]
34- test --> coverage[Coverage Report]
48+ unit[Unit Tests]
49+ ci_status[CI Status]
50+ changes_ci --> format & lint & typecheck & unit
51+ format & lint & typecheck & unit --> ci_status
52+ end
53+
54+ subgraph gpu [GPU Tests - on-prem runners]
55+ changes_gpu[Detect Changes]
56+ e2e[GPU E2E Tests]
57+ gpu_status[GPU CI Status]
58+ changes_gpu --> e2e --> gpu_status
3559 end
3660
3761 subgraph compliance [Compliance Workflows]
38- dco[DCO Assistant]
3962 conventional[Conventional Commit]
4063 secrets[Secrets Detector]
4164 copyright[Copyright Check]
4265 end
4366
4467 subgraph release [Release Workflow]
4568 buildWheel[Build Wheel]
46- bumpVersion[Bump Version]
4769 publishPyPI[Publish to PyPI]
4870 ghRelease[GitHub Release]
4971 slackNotify[Slack Notification]
5072 end
5173
52- push --> ci
53- pr --> ci
54- pr --> compliance
55- comment --> dco
74+ push --> ci & gpu
75+ cpb --> gpu & copyright
76+ pr --> ci & conventional & secrets
5677 manual --> release
5778
5879 buildWheel --> publishPyPI --> ghRelease --> slackNotify
@@ -63,17 +84,37 @@ flowchart LR
6384 release -.->|reuses| FW-CI-templates
6485```
6586
66- ## CI Workflow
87+ ## CI Checks Workflow
88+
89+ The `ci-checks.yml` workflow runs on every push to `main` and on pull requests, and can also be dispatched manually:
90+
91+ - **Detect Changes**: Uses `dorny/paths-filter` to skip jobs when only non-source files change
92+ - **Format**: Verifies code formatting with `ruff format --check`
93+ - **Lint**: Lints with `ruff check`
94+ - **Typecheck**: Runs `ty` type checks
95+ - **Unit Tests**: Runs pytest with coverage
96+ - **CI Status**: Aggregation job that provides a single required check for branch protection
97+
98+ All jobs run on `ubuntu-latest` (GitHub-hosted).
99+
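The change-detection and aggregation jobs described above follow a common GitHub Actions pattern, sketched here under stated assumptions: the job ids (`changes`, `lint`, `ci-status`), the filter name `src`, and the path globs are hypothetical, and only one of the gated jobs is shown. This is not the actual content of `ci-checks.yml`.

```yaml
# Hypothetical excerpt illustrating the pattern; names and paths are assumptions.
jobs:
  changes:
    runs-on: ubuntu-latest
    outputs:
      src: ${{ steps.filter.outputs.src }}
    steps:
      - uses: actions/checkout@v4
      - uses: dorny/paths-filter@v3
        id: filter
        with:
          filters: |
            src:
              - "src/**"
              - "tests/**"
              - "pyproject.toml"

  lint:
    needs: changes
    # Skipped when no source files changed.
    if: needs.changes.outputs.src == 'true'
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: python -m pip install ruff
      - run: ruff check .

  ci-status:
    needs: [changes, lint]
    # Always run, so the required check reports even when jobs are skipped.
    if: always()
    runs-on: ubuntu-latest
    steps:
      - name: Fail if any upstream job failed or was cancelled
        if: contains(needs.*.result, 'failure') || contains(needs.*.result, 'cancelled')
        run: exit 1
      - run: echo "All required jobs passed or were skipped"
```

Requiring only the aggregation job in branch protection means a skipped job (for example, on a docs-only change) does not leave the PR waiting on a check that never reports.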
100+ ## GPU Tests Workflow
101+
102+ The `gpu-tests.yml` workflow runs on pushes to `main` and to `pull-request/*` branches (created by copy-pr-bot):
103+
104+ - **GPU E2E Tests**: Runs end-to-end tests on `linux-amd64-gpu-a100-latest-1` (A100) with a 60-minute job timeout and a 45-minute step timeout
105+ - **GPU CI Status**: Aggregation job that provides a single required check for branch protection
67106
68- The main CI workflow runs on every push to `main` and on pull requests:
107+ ### Runners
69108
70- - **Lint**: Runs `ruff check` and `ty` type checks
71- - **Format Check**: Verifies code formatting with `ruff format --check`
72- - **Test**: Runs pytest with coverage across Python 3.11. more to come.
109+ | Workflow | Job | Runner Label | Type |
110+ | --- | --- | --- | --- |
111+ | CI Checks | All jobs | `ubuntu-latest` | GitHub-hosted |
112+ | GPU Tests | GPU E2E Tests | `linux-amd64-gpu-a100-latest-1` | NVIDIA self-hosted GPU (A100) |
113+ | GPU Tests | Detect Changes, GPU CI Status | `linux-amd64-cpu4` | NVIDIA self-hosted CPU (4-core) |
73114
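The runner labels in this table, together with the 60-minute job and 45-minute step timeouts noted for the GPU E2E job, map onto the `runs-on` and `timeout-minutes` keys. A sketch, assuming hypothetical job ids (`changes`, `e2e`, `gpu-status`) and placeholder commands; the real `gpu-tests.yml` may be organized differently:

```yaml
# Hypothetical excerpt; job ids and commands are placeholders.
jobs:
  changes:
    runs-on: linux-amd64-cpu4             # NVIDIA self-hosted CPU runner
    steps:
      - run: echo "change detection goes here"

  e2e:
    needs: changes
    runs-on: linux-amd64-gpu-a100-latest-1  # NVIDIA self-hosted A100 runner
    timeout-minutes: 60                     # job-level limit
    steps:
      - uses: actions/checkout@v4
      - name: Run GPU E2E tests
        timeout-minutes: 45                 # step-level limit
        run: pytest tests/e2e               # placeholder command and path

  gpu-status:
    needs: [changes, e2e]
    if: always()
    runs-on: linux-amd64-cpu4
    steps:
      - name: Fail if any upstream job failed or was cancelled
        if: contains(needs.*.result, 'failure') || contains(needs.*.result, 'cancelled')
        run: exit 1
      - run: echo "GPU jobs passed or were skipped"
```

Setting the step timeout below the job timeout leaves headroom for checkout and teardown, so a hung test run is cut off by the step limit before the whole job is killed.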
74115### Coverage
75116
76- Coverage reports are uploaded as artifacts.
117+ Coverage reports are uploaded as artifacts from both workflows.
77118
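An upload of this kind is typically done with `actions/upload-artifact`. The sketch below assumes `pytest-cov` is installed and that the report is written as `coverage.xml`; the artifact name and step names are hypothetical and not taken from the workflows in this directory.

```yaml
# Hypothetical steps; artifact name and report format are assumptions.
steps:
  - uses: actions/checkout@v4
  - name: Run tests with coverage
    run: pytest --cov --cov-report=xml   # writes coverage.xml (requires pytest-cov)
  - name: Upload coverage report
    if: always()                         # upload even when tests fail
    uses: actions/upload-artifact@v4
    with:
      name: coverage-report
      path: coverage.xml
```

With `actions/upload-artifact@v4`, artifact names must be unique within a workflow run, so jobs that each upload coverage need distinct names.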
78119## Compliance Workflows
79120