Releases: NVIDIA/aicr
v0.7.11
v0.7.10
Immutable release. Only release title and notes can be modified.
Changelog
Features
- 044852e: feat(site): landing page refresh, dark mode, and version dropdown (@mchmarny)
- 5ce1740: feat(uat): AWS UAT pipeline with Chainsaw CUJ tests (#203) (@mchmarny)
- 3bfe565: feat(validator): add ComponentResult types for deployment materialization (@mchmarny)
- 56b61ae: feat(validator): add ComponentResult types for deployment materialization (@mchmarny)
- 03f77d4: feat(validator): implement component materialization with tests (@mchmarny)
- 8efd4be: feat(validator): integrate component materialization into deployment phase (@mchmarny)
- 4ff1fab: feat: integrate CNCF submission evidence collection into aicr validate (#214) (@yuanchen8911)
Tasks
Others
- 60501c4: fix(ci): add missing contents:read permission to PR comment job (@mchmarny)
- a477f43: fix(install): improve UX with supply chain security messaging (@mchmarny)
- 1bdf2d8: fix(validator): address lint issues in deployment materialization (@mchmarny)
- 7fb0222: test(chainsaw): add deployment materialization e2e tests (@mchmarny)
- 117db7e: test(chainsaw): update CUJ1 mock snapshot with full helm data (@mchmarny)
- 16368e1: test(kwok): add deployment materialization verification step (@mchmarny)
v0.7.9
Immutable release. Only release title and notes can be modified.
v0.7.7
Immutable release. Only release title and notes can be modified.
Changelog
Features
- d474757: feat(ci): add metrics-driven cluster autoscaling validation with Karpenter + KWOK (#168) (@dims)
- 1e3474d: feat(ci): binary attestation with SLSA Build Provenance v1 (#194) (@lockwobr)
- a3037f7: feat(collector): add Helm release and ArgoCD Application collectors (#196) (@mchmarny)
- 308b381: feat(validator): add Go-based CNCF AI conformance checks (#180) (@dims)
- 29b512b: feat(validator): replace helm CLI subprocess with Helm Go SDK for chart rendering (#186) (@xdu31)
- 770d132: feat(validator): self-contained DRA conformance check with EKS overlays (#182) (@dims)
- 2acf5d0: feat(validator): self-contained gang scheduling conformance check (#184) (@dims)
- f1411b6: feat(validator): upgrade conformance checks from static to behavioral validation (#185) (@dims)
- 811643c: feat: add HPA pod autoscaling evidence for CNCF AI Conformance (#191) (@yuanchen8911)
- edbe268: feat: add cluster autoscaling evidence for CNCF AI Conformance (#193) (@yuanchen8911)
- 361120e: feat: add conformance evidence renderer and fix check false-positives (#187) (@dims)
Tasks
- 3c1bd69: chore(ci): remove redundant DRA test steps from inference workflow (#183) (@dims)
- d320928: chore: upgrade Go to 1.26.0 (#190) (@mchmarny)
Others
- 39fbfbd: ci: harden workflows and reduce duplication (#188) (@mchmarny)
- fa0446c: fix(e2e): update deploy-agent test for current snapshot CLI (#198) (@mchmarny)
- e89fbe6: fix: guard against empty path in NewFileReader after filepath.Clean (@mchmarny)
- 4637a80: fix: pass cluster K8s version to Helm SDK chart rendering (#197) (@mchmarny)
- 399a2dc: fix: prevent snapshot agent Job from nesting agent deployment (#200) (@mchmarny)
- 504e77d: fix: resolve gosec lint issues and bump golangci-lint to v2.10.1 (@mchmarny)
- 46602b1: refactor(validator): remove Job-based checks from readiness phase, keep constraint-only gate (#195) (@xdu31)
- 1cf8020: test(recipe): add conformance recipe invariant tests (#181) (@dims)
v0.7.6
Immutable release. Only release title and notes can be modified.
v0.7.5
v0.7.4
Immutable release. Only release title and notes can be modified.
Changelog
Features
- 4defc33: feat(ci): add CNCF AI conformance validations to inference workflow (#162) (@dims)
- 06bccdf: feat(ci): add ClamAV malware scanning GitHub Action (#171) (@dims)
- 9da2501: feat(ci): add DRA GPU allocation test to H100 smoke test (#153) (@dims)
- e55f9a2: feat(ci): add HPA pod autoscaling validation to inference workflow (#163) (@dims)
- 0c435fb: feat(ci): add OSS community automation workflows (@mchmarny)
- 1f39bce: feat(ci): collect AI conformance evidence in H100 smoke test (#147) (@dims)
- 60023fd: feat(skyhook): temporarily remove skyhook tuning due to bugs (#154) (@ayuskauskas)
- 4ddf3b8: feat: add CNCF AI Conformance evidence collection (#158) (@yuanchen8911)
- 9a96d23: feat: add CUJ2 inference demo chat UI and update CUJ2 instructions (#151) (@yuanchen8911)
- 0c53d8e: feat: add DRA and gang scheduling test manifests for CNCF AI conformance (#150) (@yuanchen8911)
- f04d3e5: feat: add GPU training CI workflow with gang scheduling test (#155) (@dims)
- 0463e2d: feat: add expected-resources deployment check for validating Kubernetes resources exist (#149) (@xdu31)
- 2a922bc: feat: add support for workload-gate and workload-selector (#166) (@ayuskauskas)
- f176162: feat: add two-phase expected resource auto-discovery to validator (#164) (@xdu31)
Tasks
- 69c37d0: chore: improve consistency across GPU CI workflows (#160) (@dims)
- f9f1ec0: chore: update cuj1 (@mchmarny)
- 4962b9b: chore: update demos (@mchmarny)
- 84bf48c: chore: update demos (@mchmarny)
- 1701080: chore: update e2e demo (@mchmarny)
- 2a0f22c: chore: update e2e demo (@mchmarny)
- 0fa18e1: chore: update e2e demo (@mchmarny)
- 84d975e: chore: update e2e demo (@mchmarny)
- 54eceaa: chore: update s3c demo (@mchmarny)
Others
- 592e640: fix(ci): add pull_request trigger to vuln-scan workflow (@mchmarny)
- ca16886: fix(ci): break long lines in welcome workflow to pass yamllint (#148) (@dims)
- bcd26bd: fix(ci): combine path and size label workflows to prevent race condition (#161) (@yuanchen8911)
- 177e92e: fix(ci): harden workflows and improve CI/CD hygiene (@mchmarny)
- a40f754: fix(ci): lower vuln scan threshold to MEDIUM and add container image scanning (#172) (@dims)
- 60a2adc: fix(ci): re-enable CDI for H100 kind smoke test (#143) (@dims)
- 02e7c1c: fix(ci): run attestation and vuln scan concurrently in release workflow (#173) (@dims)
- 49f1333: fix(ci): use PR number in KWOK concurrency group (@mchmarny)
- 5be5a93: fix(ci): use pull_request_target for write-permission workflows (@mchmarny)
- f38d7b2: fix(docs): update bundle commands with correct tolerations in CUJ demos (#176) (@yuanchen8911)
- f93e618: fix: add kube-prometheus-stack as gpu-operator dependency (#170) (@yuanchen8911)
- 490aa0f: fix: add markdown rendering to chat UI and update CUJ2 documentation (#159) (@yuanchen8911)
- 20c3e4a: fix: enable DCGM exporter ServiceMonitor for Prometheus scraping (#157) (@yuanchen8911)
- ade7ff7: fix: move DRA controller nodeAffinity override to EKS overlay (#174) (@yuanchen8911)
- fa58fba: fix: remove admission.cdi from kai-scheduler values (#146) (@yuanchen8911)
- cbcc3e9: fix: remove nodeSelector from EBS CSI node DaemonSet scheduling (#175) (@yuanchen8911)
- f9cfa47: fix: remove trailing quote from skyhook no-op package version (#177) (@yuanchen8911)
- 4e15a0e: fix: skip --wait for KAI scheduler in deploy script (#169) (@yuanchen8911)
- 9a3cd3f: fix: update inference stack versions and enable Grove for dynamo workloads (#145) (@yuanchen8911)
- 03b5483: refactor: move examples/demos to project root demos directory (@mchmarny)
- ed4973b: refactor: move kai-scheduler and DRA driver to base overlay for CNCF AI conformance (#139) (@yuanchen8911)
- f15c6b3: refactor: rename PreDeployment to Readiness across codebase and docs (#156) (@xdu31)
- ea8f626: rename: eidos → aicr (AI Cluster Runtime) (@mchmarny)
v0.7.3
v0.7.2
v0.7.0
Immutable release. Only release title and notes can be modified.
Changelog
Features
Others
- 63749cf: Feat/adding smi test (#117) (@iamkhaledh)
- c8b61c0: fix: disable CDI in GPU Operator for dynamo inference recipes (#134) (@yuanchen8911)
- 914f052: fix: remove fullnameOverride from dynamo-platform values (#135) (@yuanchen8911)