Commit 25623d5

marcusds authored and mckornfield committed
chore(deps): bump vllm 0.18→0.20 + torch 2.10→2.11 stack
vllm 0.20.0 bumps numba 0.61.2→0.65.0 (which lifts numpy<2.3 → <2.5), unblocking downstream consumers that need newer numpy. The bump requires moving the torch stack in lockstep: vllm 0.20.0 pins torch==2.11.0 and torchvision==0.26.0.

- vllm 0.18.0 → 0.20.0
- torch 2.10.0 → 2.11.0
- torchvision 0.25.0 → 0.26.0
- torchao 0.16.0 → 0.17.0
- xformers 0.0.34 → 0.0.35 (open torch>=2.10)

Stays on transformers 4.57.3 (vllm 0.20.0 allows it; v5 path explicitly out of scope here).

This supersedes the dependabot-only PR #443, which couldn't lock because torch was left on 2.10 while vllm wanted 2.11.

Signed-off-by: mschwab <mschwab@nvidia.com>
Signed-off-by: Matt Kornfield <mkornfield@nvidia.com>
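As a rough illustration of the lockstep requirement, the sketch below compares the packages installed in an environment against the versions listed in this commit message. It is not part of the commit: `EXPECTED_VERSIONS`, `public_version`, `installed_version`, and `find_mismatches` are hypothetical names, and the check assumes that a local version suffix such as `+cu129` should be ignored when comparing.

```python
from importlib import metadata
from typing import Callable, Dict, Optional, Tuple

# Versions copied from the commit message; treating all five as exact
# targets is an assumption made for this sketch.
EXPECTED_VERSIONS = {
    "vllm": "0.20.0",
    "torch": "2.11.0",
    "torchvision": "0.26.0",
    "torchao": "0.17.0",
    "xformers": "0.0.35",
}


def public_version(version: str) -> str:
    """Drop a local version segment such as '+cu129'."""
    return version.split("+", 1)[0]


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a package, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def find_mismatches(
    expected: Dict[str, str] = EXPECTED_VERSIONS,
    lookup: Callable[[str], Optional[str]] = installed_version,
) -> Dict[str, Tuple[str, Optional[str]]]:
    """Map each out-of-step package to (expected, found)."""
    mismatches = {}
    for package, wanted in expected.items():
        found = lookup(package)
        if found is None or public_version(found) != wanted:
            mismatches[package] = (wanted, found)
    return mismatches


if __name__ == "__main__":
    for package, (wanted, found) in find_mismatches().items():
        print(f"{package}: expected {wanted}, found {found or 'not installed'}")
```

Run inside the synced environment (for example via `uv run python`), it prints one line per package that is missing or out of step, and nothing when the stack matches.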
1 parent 58446ee commit 25623d5

29 files changed

Lines changed: 1218 additions & 1079 deletions

.agents/skills/diagnose-failures/SKILL.md

Lines changed: 4 additions & 4 deletions
@@ -55,7 +55,7 @@ Common `ty` error patterns:

 | Error | Likely cause | Fix |
 |-------|-------------|-----|
-| `unresolved-import` | Missing extra in venv | Run `uv sync --frozen --extra cu128 --extra engine --group dev` |
+| `unresolved-import` | Missing extra in venv | Run `uv sync --frozen --extra cu129 --extra engine --group dev` |
 | `unresolved-attribute` | Computed property treated as config field | Check if the attribute is a `@property`, not a Pydantic field |
 | `possibly-unbound` | Variable assigned only in one branch | Add an `else` branch or initialise before the conditional |
 | `invalid-argument-type` | Wrong type passed to function | Check the function signature; use `cast()` only as a last resort |
@@ -87,8 +87,8 @@ gh run view <run-id> --log-failed

 ## Import / Dependency Errors

-- Check if the import requires an extras gate: `cpu`, `cu128`, or `engine`
-- Common: `vllm`, `torch`, `unsloth` need `cpu` or `cu128` extra
+- Check if the import requires an extras gate: `cpu`, `cu129`, or `engine`
+- Common: `vllm`, `torch`, `unsloth` need `cpu` or `cu129` extra
 - Use the `diagnose-deps` skill for lockfile diff diagnosis after `uv lock`
 - Run: `uv run tools/diff-lockfile.py` to see what changed

@@ -97,7 +97,7 @@ gh run view <run-id> --log-failed
 | Error | Likely Cause | Fix |
 |-------|-------------|-----|
 | `CUDA out of memory` | Batch too large or model too big | Reduce `batch_size` or use quantization |
-| `CUDA not available` | Wrong extra installed | Reinstall with `make bootstrap-nss cu128` |
+| `CUDA not available` | Wrong extra installed | Reinstall with `make bootstrap-nss cu129` |
 | `NCCL error` | Multi-GPU issues | Use `CUDA_VISIBLE_DEVICES=0` for single-GPU |

 ### Running GPU / e2e Tests

.agents/skills/git-worktrees/SKILL.md

Lines changed: 2 additions & 2 deletions
@@ -29,10 +29,10 @@ uv sync --frozen

 This creates a local `.venv` in the worktree. With uv's cache the install takes ~2-3 seconds on a warm cache.

-If you need different extras (e.g. `cu128` vs `cpu`), pass them explicitly:
+If you need different extras (e.g. `cu129` vs `cpu`), pass them explicitly:

 ```bash
-uv sync --frozen --extra cu128 --extra engine --group dev
+uv sync --frozen --extra cu129 --extra engine --group dev
 ```

 Never run bare `uv sync` without `--frozen` -- it re-locks `uv.lock` and creates dirty state.

.agents/skills/uv-build/SKILL.md

Lines changed: 5 additions & 5 deletions
@@ -19,7 +19,7 @@ make bootstrap-tools && make bootstrap-nss cpu
 # Pick a variant:
 make bootstrap-nss dev # dev tools only (no engine/torch)
 make bootstrap-nss cpu # + engine + CPU PyTorch
-make bootstrap-nss cu128 # + engine + CUDA 12.8 PyTorch
+make bootstrap-nss cu129 # + engine + CUDA 12.9 PyTorch
 make bootstrap-nss engine # + engine (no torch)
 ```

@@ -30,11 +30,11 @@ Under the hood: `uv sync --frozen --extra <extra> [--extra engine] --group dev`
 | Extra | What it installs |
 |-------|------------------|
 | `cpu` | PyTorch CPU, faiss-cpu, flashinfer (Linux only) |
-| `cu128` | PyTorch+CUDA 12.8, faiss-gpu, flashinfer-jit-cache |
+| `cu129` | PyTorch+CUDA 12.9, faiss-gpu, flashinfer-jit-cache |
 | `engine` | ML pipeline deps (outlines, wandb, tiktoken, etc.) -- no torch |
 | `microservices` | `nemo-microservices` from local path |

-`cpu` and `cu128` conflict -- you must pick one, never both. Enforced in `[tool.uv] conflicts`.
+`cpu` and `cu129` conflict -- you must pick one, never both. Enforced in `[tool.uv] conflicts`.

 ## Index Management

@@ -43,9 +43,9 @@ PyTorch wheels come from dedicated indexes, not PyPI:
 | Index | URL | Used for |
 |-------|-----|----------|
 | `pytorch-cpu` | `download.pytorch.org/whl/cpu` | torch, torchvision (CPU, Linux) |
-| `pytorch-cu128` | `download.pytorch.org/whl/cu128` | torch, torchvision, triton, xformers (CUDA) |
+| `pytorch-cu129` | `download.pytorch.org/whl/cu129` | torch, torchvision, triton (CUDA) |
 | `nv-shared-pypi-local` | NVIDIA Artifactory | Internal NVIDIA packages |
-| `flashinfer-jit-cache` | `flashinfer.ai/whl/cu128` | FlashInfer JIT cache |
+| `flashinfer-jit-cache` | `flashinfer.ai/whl/cu129` | FlashInfer JIT cache |
 | `nvidia-pypi-public` | `pypi.nvidia.com` | Public NVIDIA packages |

 All indexes are `explicit = true` (only used when a package is mapped to them in `[tool.uv.sources]`).

.claude/commands/bootstrap.md

Lines changed: 2 additions & 2 deletions
@@ -13,9 +13,9 @@ Set up the development environment from scratch.
 2. Install Python dependencies (choose one):
 ```bash
 make bootstrap-nss cpu # CPU-only (macOS or Linux without GPU)
-make bootstrap-nss cuda # CUDA 12.8 (Linux with NVIDIA GPU)
+make bootstrap-nss cuda # CUDA 12.9 (Linux with NVIDIA GPU)
 make bootstrap-nss engine # Engine dependencies only (no torch)
 make bootstrap-nss dev # Minimal dev dependencies only
 ```

-Note: `cuda` is an alias for `cu128`. Both are equivalent.
+Note: `cuda` is an alias for `cu129`. Both are equivalent.

.cursor/setup-worktree.sh

Lines changed: 2 additions & 2 deletions
@@ -16,10 +16,10 @@ fi

 # Bare --frozen installs the base environment. For GPU dev work (ty, import
 # checks, GPU tests) run the full command manually after setup:
-# uv sync --frozen --extra cu128 --extra engine --group dev
+# uv sync --frozen --extra cu129 --extra engine --group dev
 uv sync --frozen
 echo "Venv ready: $(pwd)/.venv"
-echo "Note: for GPU extras run: uv sync --frozen --extra cu128 --extra engine --group dev"
+echo "Note: for GPU extras run: uv sync --frozen --extra cu129 --extra engine --group dev"

 for _envfile in .env .env.local mise.local.toml .local.envrc; do
 if [ -f "$ROOT_WORKTREE_PATH/$_envfile" ]; then

.github/actions/setup-gpu-test-env/action.yml

Lines changed: 1 addition & 1 deletion
@@ -27,7 +27,7 @@ inputs:
 cuda-extra:
 description: "CUDA dependency extra to bootstrap"
 required: false
-default: "cu128"
+default: "cu129"

 runs:
 using: "composite"

.github/workflows/gpu-tests.yml

Lines changed: 18 additions & 0 deletions
@@ -126,8 +126,26 @@ jobs:
 with:
 fetch-depth: 0

+<<<<<<< HEAD
 - name: Setup GPU test environment
 uses: ./.github/actions/setup-gpu-test-env
+=======
+- name: Install make
+run: apt-get update && apt-get install -y --no-install-recommends make
+
+- name: Setup Python environment
+uses: ./.github/actions/setup-python-env
+with:
+python-version: "3.11"
+bootstrap-tools: "true"
+
+- name: Bootstrap CUDA environment
+run: make bootstrap-nss cu129
+
+- name: Check GPU availability
+run: |
+uv run python -c "import torch; print('cuda available:', torch.cuda.is_available()); print('device count:', torch.cuda.device_count())"
+>>>>>>> 4a11f2bd (chore(deps): bump vllm 0.18→0.20 + torch 2.10→2.11 stack)

 - name: Run GPU E2E tests
 timeout-minutes: 45

AGENTS.md

Lines changed: 1 addition & 1 deletion
@@ -24,7 +24,7 @@ Common commands: `make test` (unit tests), `make format` (auto-fix formatting +
 The canonical `uv sync` command for a full GPU/dev environment is:

 ```bash
-uv sync --frozen --extra cu128 --extra engine --group dev
+uv sync --frozen --extra cu129 --extra engine --group dev
 ```

 Bare `uv sync --frozen` (without extras) installs an incomplete environment -- `ty`, import checks, and GPU tests will fail.

CONTRIBUTING.md

Lines changed: 1 addition & 1 deletion
@@ -59,7 +59,7 @@ Please read our [Code of Conduct](CODE_OF_CONDUCT.md) before contributing.

 # Install Python dependencies (choose one)
 make bootstrap-nss cpu # CPU-only (macOS or Linux without GPU)
-make bootstrap-nss cuda # CUDA 12.8 (Linux with NVIDIA GPU)
+make bootstrap-nss cuda # CUDA 12.9 (Linux with NVIDIA GPU)
 make bootstrap-nss engine # Engine dependencies only
 make bootstrap-nss dev # Minimal dev dependencies only
 ```

Makefile

Lines changed: 7 additions & 7 deletions
@@ -13,7 +13,7 @@ NSS_ROOT_PATH := $(shell pwd)
 # Normalize architecture names
 ifeq ($(ARCH),x86_64)
 ARCH := amd64
-PYTORCH_DEPS := cu128
+PYTORCH_DEPS := cu129
 export BUILD_ARCH ?= linux/amd64
 endif
 ifeq ($(ARCH),aarch64)
@@ -82,12 +82,12 @@ verify-python-version: ## Verify Python version and install if necessary
 uv venv --seed --allow-existing --python 3.11

 .PHONY: bootstrap-python
-bootstrap-python: .venv ## Bootstrap Python dependencies. Set PYTORCH_DEPS to 'cpu' or 'cu128'. Here mostly for legacy usage.
+bootstrap-python: .venv ## Bootstrap Python dependencies. Set PYTORCH_DEPS to 'cpu' or 'cu129'. Here mostly for legacy usage.
 uv sync --frozen --extra ${PYTORCH_DEPS} --extra engine --group dev

 # Dynamic targets for bootstrap-nss
 # Usage: make bootstrap-nss {dev,engine,cpu,cuda}
-BOOTSTRAP_EXTRAS := dev engine cpu cuda cu128
+BOOTSTRAP_EXTRAS := dev engine cpu cuda cu129
 $(BOOTSTRAP_EXTRAS):
 @:

@@ -97,9 +97,9 @@ bootstrap-nss: .venv ## Bootstrap Python dependencies. Usage: make bootstrap-nss
 @echo "~~~~~~"
 @echo "attempting to install nss package with primary extra: $(EXTRA)"
 @if [ "$(EXTRA)" = "cuda" ]; then \
-uv sync --frozen --extra cu128 --extra engine --group dev; \
-elif [ "$(EXTRA)" = "cu128" ]; then \
-uv sync --frozen --extra cu128 --extra engine --group dev; \
+uv sync --frozen --extra cu129 --extra engine --group dev; \
+elif [ "$(EXTRA)" = "cu129" ]; then \
+uv sync --frozen --extra cu129 --extra engine --group dev; \
 elif [ "$(EXTRA)" = "cpu" ]; then \
 uv sync --frozen --extra cpu --extra engine --group dev; \
 elif [ "$(EXTRA)" = "engine" ]; then \
@@ -469,7 +469,7 @@ NSS_DATASETS := clinc_oos dow_jones_index

 define nss_combo_test
 test-nss-$(1)-$(2)-ci: ## Run pytest test for $(shell echo $(1) | tr '_' '-') config with $(shell echo $(2) | tr '_' '-') dataset
-$(MAKE) bootstrap-nss cu128
+$(MAKE) bootstrap-nss cu129
 $(PYTEST_NO_XDIST_CMD) -vv $(PYTEST_CI_OPTS) $(NSS_ROOT_PATH)/tests/e2e/test_dataset_config.py -k "test_$(2)_dataset[$(subst _,-,$(1))]"
 endef
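The `bootstrap-nss` dispatch in this Makefile can be read as a small mapping from a variant name to a `uv sync` command line. The Python sketch below models that mapping for illustration only. The `cuda`, `cu129`, and `cpu` branches follow the diff; the `engine` and `dev` branches are assumptions, because the hunk is cut off before their commands appear. `ALIASES`, `TORCH_EXTRAS`, and `uv_sync_args` are hypothetical names that do not exist in the repository.

```python
from typing import List

# `cuda` is documented as an alias for `cu129`.
ALIASES = {"cuda": "cu129"}

# Torch-bearing extras; the docs state that `cpu` and `cu129` conflict,
# so exactly one of them is selected per call.
TORCH_EXTRAS = {"cpu", "cu129"}


def uv_sync_args(variant: str) -> List[str]:
    """Build the argument list that a bootstrap variant would run."""
    resolved = ALIASES.get(variant, variant)
    args = ["uv", "sync", "--frozen"]
    if resolved in TORCH_EXTRAS:
        # Matches the diff: the torch extra plus the engine extra.
        args += ["--extra", resolved, "--extra", "engine"]
    elif resolved == "engine":
        # Assumption: engine only, no torch extra.
        args += ["--extra", "engine"]
    elif resolved != "dev":
        # Retired names such as `cu128` are rejected rather than guessed at.
        raise ValueError(f"unknown bootstrap variant: {variant!r}")
    # Assumption for `dev` and `engine`: the dev group is always included.
    args += ["--group", "dev"]
    return args


if __name__ == "__main__":
    for name in ("cuda", "cu129", "cpu"):
        print(name, "->", " ".join(uv_sync_args(name)))
```

Keeping the alias in one table means a future CUDA bump would touch a single entry, instead of the two duplicated `if`/`elif` branches changed in this diff.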
