feat(evidence): add artifact capture for conformance evidence by dims · Pull Request #201 · NVIDIA/aicr

dims · 2026-02-24T04:52:30Z

Summary

- Add a diagnostic artifact capture mechanism so that conformance checks record rich evidence (deployment status, metrics samples, test results) during execution; the evidence flows through the pipeline into the evidence markdown.
- Artifacts are ephemeral (json:"-"). They are transported as base64-encoded ARTIFACT: lines in test output, decoded in phases.go, and rendered as labeled code blocks in evidence templates.
- All 9 submission requirement checks now record diagnostic artifacts covering deployment status, metrics presence, behavioral test results, and more.

Design

Check → ctx.Artifacts.Record(label, data)
  → Cancel() emits t.Logf("ARTIFACT:<base64>")
    → phases.go extracts ARTIFACT: lines, populates CheckResult.Artifacts
      → evidence renderer outputs #### Label + fenced code block

Key constraints:

- Artifact type lives in checks/ (leaf package, no import cycles)
- Caps: max 8KB of data per artifact, max 20 artifacts per check; each base64 line stays under the bufio.Scanner 64KB limit
- Artifacts are ephemeral (json:"-" yaml:"-") and are never persisted in saved results
- Cancel() nil-guards both r.ctx and r.ctx.Artifacts
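
The constraints above can be sketched in Go roughly as follows. This is a minimal illustration, not the PR's code: the field names, the JSON envelope inside the base64 payload, and the error returned at the count cap are all assumptions. Only the 8KB and 20-artifact caps, the thread-safe Record()/Drain() pair, and the ARTIFACT: prefix come from the description.

```go
package main

import (
	"encoding/base64"
	"encoding/json"
	"fmt"
	"strings"
	"sync"
)

const (
	maxArtifactBytes = 8 * 1024 // per-artifact data cap from the PR description
	maxArtifacts     = 20       // per-check count cap from the PR description
	artifactPrefix   = "ARTIFACT:"
)

// Artifact is a labeled piece of diagnostic evidence (assumed shape).
type Artifact struct {
	Label string `json:"label"`
	Data  string `json:"data"`
}

// ArtifactCollector accumulates artifacts and is safe for concurrent use.
type ArtifactCollector struct {
	mu    sync.Mutex
	items []Artifact
}

// Record stores an artifact, truncating oversized data and returning an
// error once the count cap has been reached.
func (c *ArtifactCollector) Record(label, data string) error {
	c.mu.Lock()
	defer c.mu.Unlock()
	if len(c.items) >= maxArtifacts {
		return fmt.Errorf("artifact limit of %d reached, dropping %q", maxArtifacts, label)
	}
	if len(data) > maxArtifactBytes {
		data = data[:maxArtifactBytes] // byte-level cut; may split a UTF-8 rune
	}
	c.items = append(c.items, Artifact{Label: label, Data: data})
	return nil
}

// Drain returns the recorded artifacts and resets the collector.
func (c *ArtifactCollector) Drain() []Artifact {
	c.mu.Lock()
	defer c.mu.Unlock()
	out := c.items
	c.items = nil
	return out
}

// Encode renders one artifact as a single "ARTIFACT:<base64>" line.
func (a Artifact) Encode() (string, error) {
	raw, err := json.Marshal(a)
	if err != nil {
		return "", err
	}
	return artifactPrefix + base64.StdEncoding.EncodeToString(raw), nil
}

// DecodeArtifact reverses Encode for a line that starts with the prefix.
func DecodeArtifact(line string) (Artifact, error) {
	var a Artifact
	raw, err := base64.StdEncoding.DecodeString(strings.TrimPrefix(line, artifactPrefix))
	if err != nil {
		return a, err
	}
	err = json.Unmarshal(raw, &a)
	return a, err
}

func main() {
	c := &ArtifactCollector{}
	_ = c.Record("Deployment status", "replicas: 2/2")
	for _, a := range c.Drain() {
		line, _ := a.Encode()
		fmt.Println(line)
	}
}
```

Base64 expands data by about a third, so 8KB of plain text encodes to roughly 11KB; with a small envelope a line typically stays well below the 64KB scanner limit.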

Files Changed

| Area | Files | Change |
|---|---|---|
| Infrastructure | `checks/artifact.go`, `checks/registry.go`, `checks/runner.go`, `result.go`, `phases.go` | Artifact type, collector, transport, pipeline extraction |
| Evidence | `evidence/types.go`, `evidence/renderer.go`, `evidence/templates.go` | Pass artifacts through to markdown rendering |
| Static checks (4) | `dra_support_check.go`, `accelerator_metrics_check.go`, `ai_service_metrics_check.go`, `inference_gateway_check.go` | Record deployment/metrics/CRD evidence |
| Behavioral checks (5) | `robust_controller_check.go`, `secure_access_check.go`, `gang_scheduling_check.go`, `pod_autoscaling_check.go`, `cluster_autoscaling_check.go` | Record behavioral test results |
| Tests | `artifact_test.go`, `runner_test.go`, `renderer_test.go`, `dra_support_check_unit_test.go` | Round-trip, cap enforcement, thread safety, Cancel() emit, renderer with/without artifacts |

Test plan

- make test passes with race detector (73.9% coverage)
- Artifact encode/decode round-trip test
- Cap enforcement (count limit, data truncation)
- Thread safety test (concurrent Record)
- Cancel() with nil ctx/Artifacts doesn't panic
- Cancel() emits artifacts via t.Logf
- Evidence renderer with artifacts: labeled code blocks appear
- Evidence renderer without artifacts: identical to current output
- All existing conformance tests pass unchanged
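
The two renderer cases in the test plan can be illustrated with a small sketch. It assumes a two-field Artifact and uses a strings.Builder instead of the templates in `evidence/templates.go`, whose contents are not shown here; `renderArtifacts` is a hypothetical name.

```go
package main

import (
	"fmt"
	"strings"
)

// Artifact is an assumed minimal shape: a label plus raw text data.
type Artifact struct{ Label, Data string }

// fence is a markdown code fence, built at runtime so that this example
// can itself sit inside a fenced block.
var fence = strings.Repeat("`", 3)

// renderArtifacts emits one "#### Label" heading plus a fenced code block
// per artifact. With no artifacts it returns the empty string, so the
// surrounding evidence document is unchanged.
func renderArtifacts(arts []Artifact) string {
	var b strings.Builder
	for _, a := range arts {
		data := strings.TrimRight(a.Data, "\n")
		fmt.Fprintf(&b, "#### %s\n\n%s\n%s\n%s\n\n", a.Label, fence, data, fence)
	}
	return b.String()
}

func main() {
	fmt.Print(renderArtifacts([]Artifact{
		{Label: "ResourceSlices", Data: "count: 4\n"},
	}))
}
```

Returning an empty string for an empty slice is one simple way to keep output byte-identical for checks that record nothing.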

mchmarny

Clean architecture, good test coverage, and the ARTIFACT: transport mirrors the existing CONSTRAINT_RESULT: pattern well. Four items to address — see inline comments.

pkg/validator/checks/conformance/accelerator_metrics_check.go

pkg/validator/checks/artifact.go

pkg/validator/checks/conformance/dra_support_check.go

pkg/validator/checks/conformance/cluster_autoscaling_check.go

pkg/validator/checks/conformance/helpers.go

Superseded by review with inline comments

Add an artifact capture mechanism so conformance checks record rich diagnostic evidence during execution, flowing it through the pipeline into evidence markdown. Single command, rich output.

Infrastructure:
- Artifact type, ArtifactCollector with thread-safe Record()/Drain(), base64 encode/decode, 8KB per-artifact / 20 per-check caps
- Pipeline: runner.go Cancel() emits via t.Logf → phases.go extracts using Contains+SplitN (handles t.Logf source prefixes) → evidence renderer emits labeled code blocks in markdown
- Artifacts are ephemeral (json:"-") and never persisted in saved results
- Failed artifact decodes log a warning and preserve the line in Reason

Conformance checks instrumented (9 checks):
- dra_support_check: controller, kubelet plugin, ResourceSlices
- accelerator_metrics_check: DCGM metrics sample, required metrics
- ai_service_metrics_check: Prometheus query, custom metrics API
- inference_gateway_check: GatewayClass, Gateway, CRDs, data plane
- robust_controller_check: Dynamo operator, webhook, rejection test
- secure_access_check: DRA test pod, access patterns, isolation test
- gang_scheduling_check: KAI scheduler, GPU availability, gang results
- pod_autoscaling_check: custom/external metrics API, HPA test
- cluster_autoscaling_check: Karpenter, NodePools, autoscaling test

Testing:
- Artifact encode/decode round-trip, cap enforcement, thread safety
- extractArtifacts() with realistic source-prefixed t.Logf lines
- Evidence renderer with/without artifacts
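
The Contains+SplitN extraction mentioned in this commit message might look roughly like the sketch below. The payload format (base64 of a JSON object), the `encodeLine` helper, and the return shape are assumptions for illustration. The commit only states that extraction tolerates t.Logf source prefixes and that undecodable lines are preserved.

```go
package main

import (
	"bufio"
	"encoding/base64"
	"encoding/json"
	"fmt"
	"strings"
)

const artifactMarker = "ARTIFACT:"

// Artifact is an assumed payload shape.
type Artifact struct {
	Label string `json:"label"`
	Data  string `json:"data"`
}

// encodeLine builds a marker line; used here only to produce sample input.
func encodeLine(a Artifact) string {
	raw, _ := json.Marshal(a) // cannot fail for a struct of two strings
	return artifactMarker + base64.StdEncoding.EncodeToString(raw)
}

// extractArtifacts scans test output and decodes marker lines. Lines
// without the marker, and marker lines that fail to decode, are returned
// in rest so that nothing is silently dropped.
func extractArtifacts(output string) (arts []Artifact, rest []string) {
	// The default Scanner stops at a line longer than 64KB, which is why
	// encoded lines need to stay under that size.
	sc := bufio.NewScanner(strings.NewReader(output))
	for sc.Scan() {
		line := sc.Text()
		if !strings.Contains(line, artifactMarker) {
			rest = append(rest, line)
			continue
		}
		// t.Logf prepends "    file.go:NN: ", so split on the marker
		// instead of requiring it at the start of the line.
		parts := strings.SplitN(line, artifactMarker, 2)
		raw, err := base64.StdEncoding.DecodeString(strings.TrimSpace(parts[1]))
		var a Artifact
		if err == nil {
			err = json.Unmarshal(raw, &a)
		}
		if err != nil {
			rest = append(rest, line) // keep the undecodable line visible
			continue
		}
		arts = append(arts, a)
	}
	return arts, rest
}

func main() {
	out := "=== RUN   TestCheck\n" +
		"    runner.go:88: " + encodeLine(Artifact{Label: "DCGM metrics sample", Data: "DCGM_FI_DEV_GPU_UTIL 42"}) + "\n" +
		"    runner.go:88: " + artifactMarker + "%%%not-base64%%%\n" +
		"--- PASS: TestCheck (0.01s)\n"
	arts, rest := extractArtifacts(out)
	fmt.Println(len(arts), "decoded;", len(rest), "lines kept as-is")
}
```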

…checks

LoadValidationContext() used DiagnosticTimeout (2 minutes) as the parent context for all conformance checks. Behavioral checks like DRA secure access need time for pod creation, CUDA image pull, GPU allocation, and isolation verification; 2 minutes was insufficient, causing consistent TIMEOUT failures.

Add CheckExecutionTimeout (10 minutes) for the check execution context, bounded below ValidateConformanceTimeout (15 minutes).
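
The reason the 2-minute parent caused TIMEOUT failures is that a derived context in Go can never outlive its parent's deadline. The sketch below demonstrates this with the three timeout values from the commit message; `effectiveCheckTimeout` is a hypothetical helper, not a function from the repository.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// Values taken from the commit message.
const (
	DiagnosticTimeout          = 2 * time.Minute
	CheckExecutionTimeout      = 10 * time.Minute
	ValidateConformanceTimeout = 15 * time.Minute
)

// effectiveCheckTimeout reports how long a check context actually gets
// when it is derived from a parent with the given timeout.
func effectiveCheckTimeout(parentTimeout time.Duration) time.Duration {
	parent, cancelParent := context.WithTimeout(context.Background(), parentTimeout)
	defer cancelParent()
	// WithTimeout keeps the earlier of the parent deadline and the new one.
	checkCtx, cancel := context.WithTimeout(parent, CheckExecutionTimeout)
	defer cancel()
	deadline, _ := checkCtx.Deadline()
	return time.Until(deadline)
}

func main() {
	// Old behavior: the 2-minute parent clamps every check.
	fmt.Println(effectiveCheckTimeout(DiagnosticTimeout).Round(time.Minute))
	// New behavior: checks get the full CheckExecutionTimeout because it is
	// shorter than the 15-minute outer bound.
	fmt.Println(effectiveCheckTimeout(ValidateConformanceTimeout).Round(time.Minute))
}
```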

mchmarny · 2026-02-24T17:09:25Z

Thanks for resolving the previous comments; most of this looks good now.
A few test artifacts still appear to be hardcoded congratulatory strings rather than observed state:

cluster_autoscaling_check.go:127 — “HPA: scaling intent detected\nKarpenter: new node(s) provisioned” — these are assertions restated as evidence, not captured state
robust_controller_check.go:141 — “Result: PASS — webhook rejected invalid DynamoGraphDeployment”
pod_autoscaling_check.go:152 — “Scale-up: PASS — HPA computed desiredReplicas > currentReplicas”
secure_access_check.go:124 — “Result: PASS — pod without DRA claims cannot see GPU devices”

The DRA support check now does it right: it captures actual replica counts, image versions, and ResourceSlice counts.

May be better to capture actual HPA .status.desiredReplicas/.status.currentReplicas values, the actual pod scheduling timestamps, or the actual Karpenter node provisioning events rather than static pass/fail strings. Just an idea.
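
One way to read this suggestion, as a sketch only: format the observed values into the artifact and let the numbers carry the verdict. `hpaStatus` and `formatHPAEvidence` are hypothetical names; real code would presumably populate the fields from the HPA object returned by the Kubernetes API.

```go
package main

import (
	"fmt"
	"strings"
)

// hpaStatus holds the two HPA status fields named in the review.
// Hypothetical local type for illustration.
type hpaStatus struct {
	Name            string
	CurrentReplicas int32
	DesiredReplicas int32
}

// formatHPAEvidence renders the observed values as artifact text instead
// of a fixed "PASS" string.
func formatHPAEvidence(s hpaStatus) string {
	var b strings.Builder
	fmt.Fprintf(&b, "HPA: %s\n", s.Name)
	fmt.Fprintf(&b, "status.currentReplicas: %d\n", s.CurrentReplicas)
	fmt.Fprintf(&b, "status.desiredReplicas: %d\n", s.DesiredReplicas)
	return b.String()
}

func main() {
	fmt.Print(formatHPAEvidence(hpaStatus{Name: "demo-hpa", CurrentReplicas: 1, DesiredReplicas: 3}))
}
```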

dims · 2026-02-24T17:10:43Z

> May be better to capture actual HPA .status.desiredReplicas/.status.currentReplicas values, the actual pod scheduling timestamps, or the actual Karpenter node provisioning events rather than static pass/fail strings. Just an idea.

will iterate for sure! it's not done done yet :)

dims requested a review from a team as a code owner February 24, 2026 04:52

github-actions bot added area/validator size/XL area/cli labels Feb 24, 2026

dims force-pushed the worktree-conformance-artifacts branch from b56d2dc to 52262f0 Compare February 24, 2026 13:05

dims force-pushed the worktree-conformance-artifacts branch from 52262f0 to 10db91c Compare February 24, 2026 13:21

mchmarny requested changes Feb 24, 2026

dims force-pushed the worktree-conformance-artifacts branch from 10db91c to f100479 Compare February 24, 2026 13:24

dims requested a review from a team as a code owner February 24, 2026 13:24

github-actions bot added the area/ci label Feb 24, 2026

dims force-pushed the worktree-conformance-artifacts branch 4 times, most recently from dd75d61 to c6df7b5 Compare February 24, 2026 14:54

github-actions bot added the area/bundler label Feb 24, 2026

dims force-pushed the worktree-conformance-artifacts branch from c6df7b5 to 2c2ae52 Compare February 24, 2026 15:25

dims force-pushed the worktree-conformance-artifacts branch from 2c2ae52 to 1d15151 Compare February 24, 2026 15:52

dims merged commit 766d7c1 into NVIDIA:main Feb 24, 2026
30 of 31 checks passed

dims mentioned this pull request Feb 25, 2026

feat(conformance): capture observed state in evidence artifacts #204

Merged

4 tasks

dims merged 2 commits into NVIDIA:main from dims:worktree-conformance-artifacts

2 participants
