feat: add conformance evidence renderer and fix check false-positives #187

Merged
dims merged 1 commit into NVIDIA:main from dims:feat/evidence-renderer-conformance-fixes on Feb 23, 2026

dims commented on Feb 23, 2026

Summary

  • Add CNCF AI Conformance evidence rendering to aicr validate via new --evidence-dir and --result flags
  • New pkg/evidence package generates per-check markdown evidence files from conformance validation results
  • Fix false-positive paths in 5 conformance checks:
    • cluster-autoscaling: baseline node count before test; only count Running/Succeeded pods
    • gang-scheduling: verify PodScheduled timestamps within co-scheduling window
    • pod-autoscaling: add scale-down verification after scale-up
    • robust-controller: use k8serrors type predicates instead of string matching
  • Update H100 GPU CI workflows to upload evidence artifacts alongside validation results
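
The cluster-autoscaling fix described above has two parts: take a node baseline before the test, and count only pods that actually ran. A minimal sketch of that logic follows. It is not the PR's code: `PodPhase` is a local stand-in for `corev1.PodPhase`, and `countScheduled` and `nodesAdded` are hypothetical helper names chosen for illustration.

```go
package main

import "fmt"

// PodPhase is a local stand-in for corev1.PodPhase so this sketch has no
// external dependencies. The real check presumably uses the Kubernetes type.
type PodPhase string

const (
	PodPending   PodPhase = "Pending"
	PodRunning   PodPhase = "Running"
	PodSucceeded PodPhase = "Succeeded"
	PodFailed    PodPhase = "Failed"
	PodUnknown   PodPhase = "Unknown"
)

// countScheduled (hypothetical helper) counts only pods whose phase shows
// they ran on a node. Pending, Failed, and Unknown pods are excluded, so
// they cannot inflate the result.
func countScheduled(phases []PodPhase) int {
	n := 0
	for _, p := range phases {
		if p == PodRunning || p == PodSucceeded {
			n++
		}
	}
	return n
}

// nodesAdded (hypothetical helper) reports how many nodes appeared since the
// baseline was captured. Nodes that existed before the test never count as
// evidence of a scale-up.
func nodesAdded(baseline, current int) int {
	if current <= baseline {
		return 0
	}
	return current - baseline
}

func main() {
	phases := []PodPhase{PodRunning, PodSucceeded, PodFailed, PodUnknown, PodPending}
	fmt.Println("scheduled pods:", countScheduled(phases))
	fmt.Println("nodes added since baseline:", nodesAdded(3, 5))
}
```

Without the baseline, nodes left over from an earlier run could make the check pass even if the autoscaler did nothing, which is the kind of false positive the summary describes.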

Test plan

  • Unit tests for all modified conformance checks pass with -race
  • New pkg/evidence renderer tests pass
  • Evidence metadata completeness tests in registration_test.go
  • make test passes (73.4% coverage)
  • make qualify passes apart from a pre-existing lint config issue
  • H100 inference workflow produces evidence artifacts
  • H100 training workflow produces evidence artifacts

Add CNCF AI Conformance evidence rendering to `aicr validate`:
- New `--evidence-dir` flag generates per-check markdown evidence files
  when used with `--phase conformance`
- New `--result` flag renders evidence from a saved validation result
- New `pkg/evidence` package with templates, types, and renderer
- Each conformance check now declares evidence metadata (EvidenceFile,
  SubmissionRequirement, TestName) in the check registry
- Checks in `--no-cluster` mode report "skipped" instead of "pass"
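
To illustrate the evidence metadata that each check declares, here is a rough sketch of a completeness check like the one the test plan mentions for `registration_test.go`. Only the field names `EvidenceFile`, `SubmissionRequirement`, and `TestName` come from the commit message. The struct shape, the string field types, the `missingFields` helper, and all values in the sample registry are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// EvidenceMeta uses the three field names listed in the commit message.
// The struct itself and its string-typed fields are assumptions.
type EvidenceMeta struct {
	EvidenceFile          string
	SubmissionRequirement string
	TestName              string
}

// missingFields (hypothetical helper) returns the names of metadata fields
// that are empty or whitespace-only, in declaration order.
func missingFields(m EvidenceMeta) []string {
	var missing []string
	if strings.TrimSpace(m.EvidenceFile) == "" {
		missing = append(missing, "EvidenceFile")
	}
	if strings.TrimSpace(m.SubmissionRequirement) == "" {
		missing = append(missing, "SubmissionRequirement")
	}
	if strings.TrimSpace(m.TestName) == "" {
		missing = append(missing, "TestName")
	}
	return missing
}

func main() {
	// Sample registry with made-up values. The real registry lives in the
	// repository and may look quite different.
	registry := map[string]EvidenceMeta{
		"gang-scheduling": {
			EvidenceFile:          "gang-scheduling.md",
			SubmissionRequirement: "example-requirement",
			TestName:              "ExampleGangScheduling",
		},
		"pod-autoscaling": {EvidenceFile: "pod-autoscaling.md"},
	}
	for name, meta := range registry {
		if missing := missingFields(meta); len(missing) > 0 {
			fmt.Printf("%s: missing %s\n", name, strings.Join(missing, ", "))
		}
	}
}
```

A check of this kind fails fast when a newly registered conformance check omits metadata the renderer needs.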

Fix false-positive paths in 5 conformance checks:
- cluster-autoscaling: capture baseline Karpenter node count before
  creating test resources; only count Running/Succeeded pods as
  scheduled (not Failed/Unknown)
- gang-scheduling: verify PodScheduled timestamps are within a
  co-scheduling window (30s) to prove gang semantics
- pod-autoscaling: add scale-down verification after scale-up by
  patching HPA target to unreachable value and confirming replica
  reduction (with 0s stabilization window for fast feedback)
- robust-controller: replace brittle string matching with k8serrors
  type predicates (IsForbidden/IsInvalid) for webhook rejection
  detection, with explicit RBAC exclusion

Update H100 GPU CI workflows to upload conformance evidence artifacts
alongside validation results.