Skip to content

Commit ec66806

Browse files
committed
merge: resolve conflicts with main (keep homebrew install, our image)
2 parents 4a0f2b4 + c5a9115 commit ec66806

File tree

13 files changed

+167
-14
lines changed

13 files changed

+167
-14
lines changed

README.md

Lines changed: 6 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -23,9 +23,14 @@ Every AICR recipe is:
2323
Install and generate your first recipe in under two minutes:
2424

2525
```bash
26-
# Install the CLI
26+
# Install the CLI (Homebrew)
27+
brew tap NVIDIA/aicr
28+
brew install aicr
29+
30+
# Or use the install script
2731
curl -sfL https://raw.githubusercontent.com/NVIDIA/aicr/main/install | bash -s --
2832

33+
2934
# Capture your cluster's current state
3035
aicr snapshot --output snapshot.yaml
3136

demos/cuj1.md renamed to demos/cuj1-eks.md

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -53,7 +53,7 @@ aicr bundle \
5353
--output bundle
5454
```
5555

56-
> Both options allow for comma delamination to supply multiple values. See the [bundle](../docs/user/cli-reference.md#aicr-bundle) section for more information.
56+
> Both options allow for comma-separated values to supply multiple values. See the [bundle](../docs/user/cli-reference.md#aicr-bundle) section for more information.
5757
5858
## Install Bundle into the Cluster
5959

demos/cuj2-eks.md

Lines changed: 111 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,111 @@
1+
# AICR - Critical User Journey (CUJ) 2 — EKS Inference
2+
3+
## Assumptions
4+
5+
* Assuming user is already authenticated to an EKS cluster with 2+ H100 (p5.48xlarge) nodes.
6+
* Values used in `--accelerated-node-selector`, `--accelerated-node-toleration`, `--system-node-toleration` flags are only for example purposes. Assuming user will update these to match their cluster.
7+
8+
## Snapshot
9+
10+
```shell
11+
aicr snapshot \
12+
--namespace aicr-validation \
13+
--node-selector nodeGroup=gpu-worker \
14+
--toleration dedicated=worker-workload:NoSchedule \
15+
--toleration dedicated=worker-workload:NoExecute \
16+
--output snapshot.yaml
17+
```
18+
19+
## Gen Recipe
20+
21+
```shell
22+
aicr recipe \
23+
--service eks \
24+
--accelerator h100 \
25+
--intent inference \
26+
--os ubuntu \
27+
--platform dynamo \
28+
--output recipe.yaml
29+
```
30+
31+
## Validate Recipe Constraints
32+
33+
```shell
34+
aicr validate \
35+
--recipe recipe.yaml \
36+
--snapshot snapshot.yaml \
37+
--no-cluster \
38+
--phase deployment \
39+
--output dry-run.json
40+
```
41+
42+
## Generate Bundle
43+
44+
```shell
45+
aicr bundle \
46+
--recipe recipe.yaml \
47+
--accelerated-node-selector nodeGroup=gpu-worker \
48+
--accelerated-node-toleration dedicated=worker-workload:NoSchedule \
49+
--accelerated-node-toleration dedicated=worker-workload:NoExecute \
50+
--system-node-selector nodeGroup=system-worker \
51+
--system-node-toleration dedicated=system-workload:NoSchedule \
52+
--system-node-toleration dedicated=system-workload:NoExecute \
53+
--output bundle
54+
```
55+
56+
> Both options allow for comma-separated values to supply multiple values. See the [bundle](../docs/user/cli-reference.md#aicr-bundle) section for more information.
57+
58+
## Install Bundle into the Cluster
59+
60+
```shell
61+
cd ./bundle && chmod +x deploy.sh && ./deploy.sh
62+
```
63+
64+
## Validate Cluster
65+
66+
```shell
67+
aicr validate \
68+
--recipe recipe.yaml \
69+
--toleration dedicated=worker-workload:NoSchedule \
70+
--toleration dedicated=worker-workload:NoExecute \
71+
--phase all \
72+
--output report.json
73+
```
74+
75+
## Deploy Inference Workload
76+
77+
Deploy an inference serving graph using the Dynamo platform:
78+
79+
```shell
80+
# Deploy the vLLM aggregation workload (includes KAI queue + DynamoGraphDeployment)
81+
kubectl apply -f demos/workloads/inference/vllm-agg.yaml
82+
83+
# Monitor the deployment
84+
kubectl get dynamographdeployments -n dynamo-workload
85+
kubectl get pods -n dynamo-workload -o wide -w
86+
87+
# Verify the inference gateway routes to the workload
88+
kubectl get gateway inference-gateway -n kgateway-system
89+
kubectl get inferencepool -n dynamo-workload
90+
```
91+
92+
## Chat with the Model
93+
94+
Once the workload is running, start a local chat server:
95+
96+
```shell
97+
# Start the chat server (port-forwards to the inference gateway)
98+
bash demos/workloads/inference/chat-server.sh
99+
100+
# Open the chat UI in your browser
101+
open demos/workloads/inference/chat.html
102+
```
103+
104+
## Success
105+
106+
* Bundle deployed with 16 components (inference recipe)
107+
* CNCF conformance: 9/9 requirements pass
108+
* DRA Support, Gang Scheduling, Secure GPU Access, Accelerator Metrics,
109+
AI Service Metrics, Inference Gateway, Robust Controller (Dynamo),
110+
Pod Autoscaling (HPA), Cluster Autoscaling
111+
* Dynamo inference workload serving requests via inference gateway

demos/valid.md

Lines changed: 5 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -5,6 +5,11 @@
55
Download the latest binary and verify version:
66

77
```shell
8+
# Homebrew (macOS/Linux)
9+
brew tap NVIDIA/aicr
10+
brew install aicr
11+
12+
# Or use the install script
813
curl -sfL https://raw.githubusercontent.com/NVIDIA/aicr/main/install | bash -s --
914
```
1015

docs/README.md

Lines changed: 5 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -155,6 +155,11 @@ For engineers integrating AICR into CI/CD pipelines, GitOps workflows, or larger
155155
### Install CLI
156156

157157
```shell
158+
# Homebrew (macOS/Linux)
159+
brew tap NVIDIA/aicr
160+
brew install aicr
161+
162+
# Or use the install script
158163
curl -sfL https://raw.githubusercontent.com/NVIDIA/aicr/main/install | bash -s --
159164
```
160165

docs/user/README.md

Lines changed: 5 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -23,7 +23,11 @@ This section is for users who:
2323
## Quick Start
2424

2525
```shell
26-
# Install aicr CLI
26+
# Install aicr CLI (Homebrew)
27+
brew tap NVIDIA/aicr
28+
brew install aicr
29+
30+
# Or use the install script
2731
curl -sfL https://raw.githubusercontent.com/NVIDIA/aicr/main/install | bash -s --
2832

2933
# Generate a recipe for your environment

docs/user/installation.md

Lines changed: 10 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -13,7 +13,14 @@ This guide describes how to install the AI Cluster Runtime (AICR) CLI tool (`aic
1313

1414
## Install aicr CLI
1515

16-
### Option 1: Automated Installation (Recommended)
16+
### Option 1: Homebrew (macOS/Linux)
17+
18+
```shell
19+
brew tap NVIDIA/aicr
20+
brew install aicr
21+
```
22+
23+
### Option 2: Automated Installation
1724

1825
Install the latest version using the installation script:
1926

@@ -38,7 +45,7 @@ This script:
3845

3946
> **Supply Chain Security**: AICR includes SLSA Build Level 3 compliance with signed SBOMs and verifiable attestations. See [SECURITY](../../SECURITY.md#supply-chain-security) for verification instructions.
4047
41-
### Option 2: Manual Installation
48+
### Option 3: Manual Installation
4249

4350
1. **Download the latest release**
4451

@@ -64,7 +71,7 @@ sudo chmod +x /usr/local/bin/aicr
6471
aicr --version
6572
```
6673

67-
### Option 3: Build from Source
74+
### Option 4: Build from Source
6875

6976
**Requirements:**
7077
- Go 1.26 or higher

recipes/overlays/h100-eks-ubuntu-training.yaml

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -58,7 +58,7 @@ spec:
5858
- nccl-all-reduce-bw
5959
constraints:
6060
- name: nccl-all-reduce-bw
61-
value: ">= 100"
61+
value: ">= 300"
6262
conformance:
6363
checks:
6464
- platform-health
@@ -70,4 +70,4 @@ spec:
7070
- pod-autoscaling
7171
- cluster-autoscaling
7272
- robust-controller
73-
- secure-accelerator-access
73+
- secure-accelerator-access

site/docs/getting-started/index.md

Lines changed: 5 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -152,6 +152,11 @@ For engineers integrating AICR into CI/CD pipelines, GitOps workflows, or larger
152152
### Install CLI
153153

154154
```shell
155+
# Homebrew (macOS/Linux)
156+
brew tap NVIDIA/aicr
157+
brew install aicr
158+
159+
# Or use the install script
155160
curl -sfL https://raw.githubusercontent.com/NVIDIA/aicr/main/install | bash -s --
156161
```
157162

site/docs/user/index.md

Lines changed: 5 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -30,7 +30,11 @@ This section is for users who:
3030
## Quick Start
3131

3232
```shell
33-
# Install aicr CLI
33+
# Install aicr CLI (Homebrew)
34+
brew tap NVIDIA/aicr
35+
brew install aicr
36+
37+
# Or use the install script
3438
curl -sfL https://raw.githubusercontent.com/NVIDIA/aicr/main/install | bash -s --
3539

3640
# Generate a recipe for your environment

0 commit comments

Comments
 (0)