See [workload/README.md](workload/README.md) for the full experiment file format and all pre-built experiments, as well as advanced functionality.

## Get started without accelerators

No GPU? No problem. The **[Quickstart](docs/quickstart.md)** walks you through the full `standup → smoketest → run → teardown` lifecycle on a local [Kind](https://kind.sigs.k8s.io/) cluster using a simulated inference engine — no accelerators, no cloud account, no cluster operator required. It uses the same `cicd/kind-sim` scenario that CI runs on every PR, so if it works locally it works in CI.

All you need is Docker (or Podman/Colima) with **4 CPUs / 8 GiB RAM** and Python 3.11+.

> **Resource note:** The `cicd/kind-sim` scenario deploys ~7 pods on a single Kind node. With the default 2 CPUs that Docker Desktop, Colima, and Podman ship with, the harness pod (and sometimes the gateway) cannot schedule due to `Insufficient cpu`. Bump your container runtime to **4 CPUs** before creating the Kind cluster. See [Troubleshooting](#pods-stuck-in-pending-during-standup-or-run) if you hit this.
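As a quick preflight, you can parse `docker info` to confirm the allocation before creating the Kind cluster. A minimal sketch — the sample output below is simulated for illustration; in practice, pipe in the real `docker info` output instead:

```shell
# Preflight: verify the container runtime exposes >= 4 CPUs and ~8 GiB RAM.
# Simulated `docker info` output; replace with: docker info 2>/dev/null
sample_info='CPUs: 4
Total Memory: 7.654GiB'

cpus=$(echo "$sample_info" | awk -F': ' '/CPUs/ {print $2}')
mem=$(echo "$sample_info" | awk -F': ' '/Total Memory/ {print $2}')

if [ "$cpus" -ge 4 ]; then
  echo "CPUs ok ($cpus)"
else
  echo "CPUs too low ($cpus) -- bump to 4 in your runtime settings"
fi
echo "Memory reported: $mem (want ~8GiB)"
```

Colima and Podman users can check the same numbers with `colima list` or `podman info` respectively.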
Everything else — `kubectl`, `helm`, `helmfile`, `skopeo`, `crane`, `helm-diff`, `jq`, `yq`, `kustomize` — will be installed for you by `./install.sh` in [step 3](#3-install-llmdbenchmark). The one exception is `kind` itself, which we install first below because we want the cluster up before the installer runs.
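If you want to see which of these tools are already on your `PATH` before running the installer, a quick loop over the list above works (this only reports; `./install.sh` does the actual installing):

```shell
# Report which required tools are already installed.
for tool in kubectl helm helmfile kind skopeo crane jq yq kustomize; do
  if command -v "$tool" >/dev/null 2>&1; then
    echo "found:   $tool"
  else
    echo "missing: $tool"
  fi
done
```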
- **Low disk space**: Kind needs free space in `/tmp` and `/var/lib/docker`. `docker system prune -a` frees cache space.
- **Previous cluster still around**: `kind get clusters` then `kind delete cluster --name <name>`.

### Pods stuck in `Pending` during standup or run
- **Insufficient CPU or memory on the Kind node**: this is the most common issue on laptops. Run `kubectl describe pod -n "$NS" <pod>` and look for `Insufficient cpu` or `Insufficient memory` events.

The `cicd/kind-sim` scenario needs roughly **2.5 CPU** across all pods (decode, prefill, EPP, gateway, harness). If your container runtime (Docker Desktop, Colima, Podman) defaults to 2 CPUs, the harness pod won't fit alongside everything else.

**Check your current allocation:**

```bash
# Docker Desktop / Colima / Podman — any of these will work:
docker info 2>/dev/null | grep -E "CPUs|Total Memory"
```
After changing resources, **recreate the Kind cluster** (the kubelet captures allocatable resources at node boot):

```bash
kind delete cluster --name llmd-quickstart
kind create cluster --name llmd-quickstart
```

Then re-run standup from scratch.
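To confirm the new allocation actually reached the node, check its allocatable resources after recreation. A sketch over a simulated excerpt of `kubectl describe node` output — run the real command against your cluster (the node name `llmd-quickstart-control-plane` is Kind's default for this cluster name):

```shell
# Simulated excerpt of `kubectl describe node llmd-quickstart-control-plane`;
# replace with the real command once the cluster is up.
allocatable='Allocatable:
  cpu:                4
  memory:             8027876Ki'

node_cpu=$(echo "$allocatable" | awk '/cpu:/ {print $2}')
echo "node allocatable cpu: $node_cpu"  # must cover the ~2.5 CPU the scenario requests
```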
- **PVC stuck**: `kubectl get pvc -n "$NS"` — the `standard` Kind storage class should provision immediately. If it does not, you're probably out of disk; see above.
- **Image pull backoff**: check `kubectl describe pod -n "$NS" <pod>` for the failing image and make sure your machine has network access to `ghcr.io`.
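To spot failing pods at a glance, filter the pod list for anything not `Running`. A sketch over simulated `kubectl get pods` output — the pod names below are illustrative; pipe the real command (`kubectl get pods -n "$NS"`) instead:

```shell
# Simulated `kubectl get pods -n "$NS"` output; pod names are made up.
pods='NAME                        READY   STATUS             RESTARTS   AGE
decode-7d9f                 1/1     Running            0          5m
harness-launcher-abc        0/1     Pending            0          5m
gateway-xyz                 0/1     ImagePullBackOff   0          5m'

# Print every pod whose STATUS column is not Running.
echo "$pods" | awk 'NR > 1 && $3 != "Running" {print $1 ": " $3}'
```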
The `facebook/opt-125m` model is public and small. If the download fails:

- **No network access from inside Kind pods** (corporate proxy, air-gapped laptop): run `kubectl logs -n "$NS" job/download-model --tail=50` to see the actual error.
- **HuggingFace rate limiting**: retry after a short wait, or set a `HUGGING_FACE_HUB_TOKEN` via `-v HUGGING_FACE_HUB_TOKEN=<token>`.
### Run phase hangs on `waiting for harness pod` or reports `No pods deployed`
- `kubectl get pods -n "$NS"` — check if the harness pod is `Pending`. If `kubectl describe pod -n "$NS" <harness-pod>` shows `Insufficient cpu` or `Insufficient memory`, see [Pods stuck in Pending](#pods-stuck-in-pending-during-standup-or-run) above.
- If a previous run failed and left a stale harness pod, clean it up before retrying:

  ```bash
  kubectl delete pod -n "$NS" -l app=llmdbench-harness-launcher --ignore-not-found
  ```
- If you edited `harness.resources` in your scenario to reduce requests, you must re-run `plan` before `run` (no standup needed — the cluster infra is unchanged).