Commit 3cb4813

fix: Quickstart Guide Fix (llm-d#921)
* update quickstart
* update quickstart

Signed-off-by: vezio <tyler.rimaldi@ibm.com>
1 parent deadde4 commit 3cb4813

2 files changed

Lines changed: 66 additions & 3 deletions


README.md

Lines changed: 6 additions & 0 deletions
@@ -137,6 +137,12 @@ The `experiment` command goes even further by providing an interface to vary in
See [workload/README.md](workload/README.md) for the full experiment file format and all pre-built experiments, as well as advanced functionality.
## Get started without accelerators
No GPU? No problem. The **[Quickstart](docs/quickstart.md)** walks you through the full `standup → smoketest → run → teardown` lifecycle on a local [Kind](https://kind.sigs.k8s.io/) cluster using a simulated inference engine — no accelerators, no cloud account, no cluster operator required. It uses the same `cicd/kind-sim` scenario that CI runs on every PR, so if it works locally it works in CI.
All you need is Docker (or Podman/Colima) with **4 CPUs / 8 GiB RAM** and Python 3.11+.
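The lifecycle can be sketched end to end as below. This is a hedged sketch: only the `plan` and `run` subcommands appear verbatim in this quickstart's troubleshooting section, so `standup`, `smoketest`, and `teardown` are assumed here to mirror the lifecycle names; `DRY_RUN=echo` makes the script print each command instead of running it.

```bash
# Hedged sketch of the quickstart lifecycle. Assumption: subcommands named
# standup/smoketest/teardown exist (only plan/run appear verbatim in this doc).
DRY_RUN=echo                 # remove this line to actually execute
NS=llmd-quickstart           # example namespace, not mandated by the guide
$DRY_RUN llmdbenchmark --spec cicd/kind-sim standup -p "$NS"
$DRY_RUN llmdbenchmark --spec cicd/kind-sim smoketest -p "$NS"
$DRY_RUN llmdbenchmark --spec cicd/kind-sim run -p "$NS"
$DRY_RUN llmdbenchmark --spec cicd/kind-sim teardown -p "$NS"
```

The quickstart itself walks through these phases one at a time, with the authoritative command for each.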
## Next Steps
| Topic | Where to look |

docs/quickstart.md

Lines changed: 60 additions & 3 deletions
@@ -51,6 +51,9 @@ You need these installed before starting:
| Docker or Podman | any recent version | `docker info` or `podman info` |
| Python | 3.11+ | `python3 --version` |
| `git` | any | `git --version` |
| Container runtime resources | **4 CPUs / 8 GiB RAM** | `docker info \| grep -E "CPUs\|Total Memory"` |

> **Resource note:** The `cicd/kind-sim` scenario deploys ~7 pods on a single Kind node. With the default 2 CPUs that Docker Desktop, Colima, and Podman ship with, the harness pod (and sometimes the gateway) cannot schedule due to `Insufficient cpu`. Bump your container runtime to **4 CPUs** before creating the Kind cluster. See [Troubleshooting](#pods-stuck-in-pending-during-standup-or-run) if you hit this.
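As a scripted variant of the table's check, the snippet below warns before `kind create cluster`. It is a sketch that assumes Docker's `{{.NCPU}}` info template field; on Podman, `podman info --format '{{.Host.CPUs}}'` is the rough equivalent.

```bash
# Warn if the container runtime exposes fewer than 4 CPUs to Kind.
cpus="$(docker info --format '{{.NCPU}}' 2>/dev/null || true)"
cpus="${cpus:-0}"   # treat "docker unavailable" as 0
if [ "$cpus" -lt 4 ]; then
  echo "Only ${cpus} CPU(s) visible to the container runtime; raise it to 4 before creating the cluster."
fi
```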
Everything else — `kubectl`, `helm`, `helmfile`, `kind`, `skopeo`, `crane`, `helm-diff`, `jq`, `yq`, `kustomize` — will be installed for you by `./install.sh` in [step 3](#3-install-llmdbenchmark), with one exception: `kind` itself, which we install first below because we want the cluster up before the installer runs.

@@ -232,7 +235,49 @@ kind delete cluster --name llmd-quickstart
- **Low disk space**: Kind needs free space in `/tmp` and `/var/lib/docker`. `docker system prune -a` frees cache space.
- **Previous cluster still around**: `kind get clusters` then `kind delete cluster --name <name>`.

### Pods stuck in `Pending` during standup or run
- **Insufficient CPU or memory on the Kind node**: this is the most common issue on laptops. Run `kubectl describe pod -n "$NS" <pod>` and look for events like:

```
Warning FailedScheduling 0/1 nodes are available: 1 Insufficient cpu, 1 Insufficient memory.
```

The `cicd/kind-sim` scenario needs roughly **2.5 CPU** across all pods (decode, prefill, EPP, gateway, harness). If your container runtime (Docker Desktop, Colima, Podman) defaults to 2 CPUs, the harness pod won't fit alongside everything else.
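To make the arithmetic concrete, here is an illustrative tally in millicores. The per-pod numbers are hypothetical, chosen only to add up to the rough 2.5 CPU total above; the scenario file has the real requests.

```bash
# Hypothetical per-pod CPU requests in millicores (illustration only).
decode=1000; prefill=500; epp=250; gateway=250; harness=500
total=$((decode + prefill + epp + gateway + harness))
echo "requested: ${total}m"   # 2500m, i.e. ~2.5 CPU
if [ "$total" -gt 2000 ]; then
  echo "does not fit on a 2 CPU (2000m) node"
fi
```

Kubernetes schedules on requests, not actual usage, so the last pod stays `Pending` even if the node is mostly idle.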
**Check your current allocation:**

```bash
# Docker Desktop / Colima / Podman — any of these will work:
docker info 2>/dev/null | grep -E "CPUs|Total Memory"
podman info 2>/dev/null | grep -E "cpus|memTotal"
colima status 2>/dev/null

# Or check what Kubernetes actually sees:
kubectl describe node | grep -A6 "Allocated resources"
```

**Fix — increase CPUs to at least 4 (8 GiB RAM recommended):**
```bash
# Docker Desktop: Settings > Resources > CPUs: 4, Memory: 8 GiB
# (no CLI option — must be done through the GUI)

# Colima
colima stop && colima start --cpu 4 --memory 8

# Podman
podman machine stop && podman machine set --cpus 4 --memory 8192 && podman machine start
```

After changing resources, **recreate the Kind cluster** (the kubelet captures allocatable resources at node boot):
```bash
kind delete cluster --name llmd-quickstart
kind create cluster --name llmd-quickstart
```

Then re-run standup from scratch.
- **PVC stuck**: `kubectl get pvc -n "$NS"` — the `standard` Kind storage class should provision immediately. If it does not, you're probably out of disk; see above.
- **Image pull backoff**: check `kubectl describe pod -n "$NS" <pod>` for the failing image and make sure your machine has network access to `ghcr.io`.
@@ -264,9 +309,21 @@ The `facebook/opt-125m` model is public and small. If the download fails, you mo
- **No network access from inside Kind pods** (corporate proxy, air-gapped laptop): run `kubectl logs -n "$NS" job/download-model --tail=50` to see the actual error.
- **HuggingFace rate limiting**: retry after a short wait, or set a `HUGGING_FACE_HUB_TOKEN` via `-v HUGGING_FACE_HUB_TOKEN=<token>`.

### Run phase hangs on `waiting for harness pod` or reports `No pods deployed`
- `kubectl get pods -n "$NS"` — check if the harness pod is `Pending`. If `kubectl describe pod -n "$NS" <harness-pod>` shows `Insufficient cpu` or `Insufficient memory`, see [Pods stuck in Pending](#pods-stuck-in-pending-during-standup-or-run) above.
- If a previous run failed and left a stale harness pod, clean it up before retrying:
```bash
kubectl delete pod -n "$NS" -l app=llmdbench-harness-launcher --ignore-not-found
```

- If you edited `harness.resources` in your scenario to reduce requests, you must re-run `plan` before `run` (no standup needed — the cluster infra is unchanged):
```bash
llmdbenchmark --spec cicd/kind-sim plan -p "$NS"
llmdbenchmark --spec cicd/kind-sim run -p "$NS"
```
### Anything else