Commit 94404b1

chore(ci): remove redundant DRA test steps from inference workflow
The secure-accelerator-access conformance check is now self-contained: it creates its own DRA test resources (namespace, ResourceClaim, Pod), validates DRA access patterns, and cleans up automatically.

Remove the separate "Deploy DRA GPU test" and "DRA GPU test cleanup" CI steps that previously applied docs/conformance/cncf/manifests/dra-gpu-test.yaml as a prerequisite. The static manifest is retained for documentation but is no longer needed in the CI pipeline.

Verified: CI run 22286049313 shows TestSecureAcceleratorAccess PASS (3.61s) in the "Validate cluster" step, confirming the self-contained check works correctly on GPU hardware.
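The check itself is a Go test (TestSecureAcceleratorAccess) whose source is not part of this commit. As a rough illustration of the create, validate, clean-up lifecycle the message describes, here is a minimal shell sketch. The function name `run_dra_check` and the `KUBECTL` override variable are hypothetical and exist only for this example; the kubectl invocations mirror the ones in the removed CI steps, minus the `--context` flag.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of a self-contained check: apply, wait, always clean up.
# KUBECTL and run_dra_check are illustrative names, not taken from the repository.
KUBECTL="${KUBECTL:-kubectl}"

# Usage: run_dra_check <manifest> <namespace> <pod-name>
run_dra_check() {
  local manifest="$1"
  local ns="$2"
  local pod="$3"
  local rc=0

  # Create the test resources, then wait for the pod to reach phase Succeeded.
  if ! "$KUBECTL" apply -f "$manifest"; then
    rc=1
  elif ! "$KUBECTL" -n "$ns" wait \
      --for=jsonpath='{.status.phase}'=Succeeded "pod/$pod" --timeout=120s; then
    rc=1
  fi

  # Cleanup runs on both success and failure, so no separate CI step is needed.
  "$KUBECTL" delete -f "$manifest" --ignore-not-found >/dev/null 2>&1 || true

  return "$rc"
}
```

With the resource names that appear in the removed steps, a call would look like `run_dra_check docs/conformance/cncf/manifests/dra-gpu-test.yaml dra-test dra-gpu-test`. Keeping cleanup inside the function is what makes the separate `if: always()` cleanup step unnecessary in this sketch.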
1 parent 770d132 commit 94404b1

1 file changed: 2 additions, 36 deletions

.github/workflows/gpu-h100-inference-test.yaml

Lines changed: 2 additions & 36 deletions
@@ -31,7 +31,6 @@ on:
       - '.github/actions/load-versions/**'
       - 'tests/manifests/**'
       - 'tests/chainsaw/ai-conformance/**'
-      - 'docs/conformance/cncf/**'
       - 'recipes/components/dynamo-platform/**'
       - 'recipes/components/prometheus-adapter/**'
       - 'recipes/overlays/kind.yaml'
@@ -109,38 +108,14 @@ jobs:
           fi
           echo "Snapshot correctly detected ${GPU_COUNT}x ${GPU_MODEL}"

-      # --- Deploy DRA test pod (prerequisite for secure-accelerator-access check) ---
-
-      - name: Deploy DRA GPU test
-        run: |
-          kubectl --context="kind-${KIND_CLUSTER_NAME}" apply \
-            -f docs/conformance/cncf/manifests/dra-gpu-test.yaml
-
-          echo "Waiting for DRA GPU test pod to complete..."
-          if kubectl --context="kind-${KIND_CLUSTER_NAME}" -n dra-test \
-            wait --for=jsonpath='{.status.phase}'=Succeeded pod/dra-gpu-test --timeout=120s; then
-            echo "DRA GPU allocation test passed."
-          else
-            echo "::error::DRA GPU test pod did not succeed"
-            kubectl --context="kind-${KIND_CLUSTER_NAME}" -n dra-test \
-              logs pod/dra-gpu-test 2>/dev/null || true
-            kubectl --context="kind-${KIND_CLUSTER_NAME}" -n dra-test \
-              get pod/dra-gpu-test -o yaml 2>/dev/null || true
-            exit 1
-          fi
-
-          echo "=== DRA GPU test logs ==="
-          kubectl --context="kind-${KIND_CLUSTER_NAME}" -n dra-test \
-            logs pod/dra-gpu-test
-
       # --- Install Karpenter before validation so cluster-autoscaling check passes ---

       - name: Install Karpenter + KWOK (setup)
         run: bash kwok/scripts/validate-cluster-autoscaling.sh --setup

       # --- Validate cluster (Go conformance checks run inside K8s Jobs) ---
-      # Replaces previous bash assertion steps for: inference-gateway,
-      # accelerator-metrics, pod-autoscaling, secure-accelerator-access.
+      # Includes self-contained secure-accelerator-access check (creates its own
+      # DRA test resources, validates, and cleans up automatically).

       - name: Validate cluster
         run: |
@@ -258,12 +233,6 @@ jobs:
       - name: Cluster Autoscaling (Karpenter + KWOK)
         run: bash kwok/scripts/validate-cluster-autoscaling.sh --exercise

-      - name: DRA GPU test cleanup
-        if: always()
-        run: |
-          kubectl --context="kind-${KIND_CLUSTER_NAME}" delete \
-            -f docs/conformance/cncf/manifests/dra-gpu-test.yaml --ignore-not-found 2>/dev/null || true
-
       # --- Evidence collection ---

       - name: Collect AI conformance evidence
@@ -337,9 +306,6 @@ jobs:
           kubectl --context="kind-${KIND_CLUSTER_NAME}" -n monitoring get pods -o wide 2>/dev/null || true
           echo "=== DRA ResourceSlices ==="
           kubectl --context="kind-${KIND_CLUSTER_NAME}" get resourceslices -o wide 2>/dev/null || true
-          echo "=== DRA test pod spec ==="
-          kubectl --context="kind-${KIND_CLUSTER_NAME}" -n dra-test \
-            get pod/dra-gpu-test -o yaml 2>/dev/null || true
           echo "=== Node status ==="
           kubectl --context="kind-${KIND_CLUSTER_NAME}" get nodes -o wide 2>/dev/null || true
