feat: add direct vLLM provider support #265

Open
sozercan wants to merge 14 commits into main from vllm-nightly
@sozercan sozercan commented May 4, 2026

Member

Description

Adds a first-class Direct vLLM inference provider to AI Runway. Direct vLLM renders a ModelDeployment straight into native Kubernetes Deployment + Service objects (no upstream operator/CRD required), pins the container image to an immutable registry digest, and integrates with a new vLLM "recipe" system that imports tuned launch arguments from recipes.vllm.ai. It also introduces the shared CRD plumbing (spec.engine.image, spec.engine.extraArgs, status.image) that this and other providers consume.

AI Prompt (Optional)

🤖 AI Prompt Used
N/A - Manual implementation (with AI-assisted multi-agent code review and rebase/CI fixups)

AI Tool: Claude

Type of Change

  • ✨ New feature (non-breaking change that adds functionality)
  • 📚 Documentation update
  • 🎨 UI/UX improvement
  • 🧪 Test update
  • 🔧 Build/CI configuration
  • 🐛 Bug fix (non-breaking change that fixes an issue)
  • 💥 Breaking change (fix or feature that would cause existing functionality to change)
  • ♻️ Refactoring (no functional changes)

Related Issues

Relates to the Direct vLLM provider work (PR #265).

Changes Made

  • New providers/vllm controller — a standalone Kubebuilder controller (controller.go, transformer.go, config.go, status.go, image_resolver.go, cmd/main.go) that transforms a ModelDeployment into an apps/v1 Deployment + Service. Supports tensor-parallel sizing from resources.gpu.count, a memory-backed /dev/shm volume for multi-GPU, spec.model.storage PVC mounts, HF-token secret injection, and reserved host/port arg protection. Ships Dockerfile, Makefile, RBAC, and deploy/vllm.yaml shim manifests.
  • Image digest resolution + provenance: RemoteImageResolver (via go-containerregistry) pins tag-based images to digests and records status.image (Requested/Resolved/Digest/Source/InNightly/Verified/…). Resolution is reused once a digest is cached. The default image is pinned for reproducibility.
  • CRD API additions (controller/api/v1alpha1) — new spec.engine.image (preferred) and spec.engine.extraArgs fields, a status.image (ImageStatus) block, and an ImageResolved condition. ValidateImageFields() rejects conflicting spec.image vs spec.engine.image, and ImageOverride() centralizes precedence (engine image wins over the legacy top-level image). Wired into the core reconciler and validating webhook; CRD/deepcopy regenerated.
  • Cross-provider spec.engine.image adoption: dynamo, kaito, kuberay, and llmd transformers now read ImageOverride() so the new engine-image field is honored consistently (previously only spec.image was read).
  • vLLM recipe backend — new backend/src/services/vllmRecipesClient.ts and vllmRecipeResolver.ts plus the backend/src/routes/vllmRecipes.ts routes (GET /vllm/recipes, GET /:org/:model, POST /resolve). Includes strict HF model-ID validation (rejects path traversal), HTTPS-only + origin/path-prefix pinning for recipe references, an AbortController fetch timeout, a TTL in-memory cache (stale-on-error), a response-size bound, and typed errors mapped to 400/502/504.
  • Frontend integration: DeployPage.tsx/DeploymentForm.tsx add the Direct vLLM deployment method (nightly/stable/custom launch images, recipe apply flow, FP8 precision controls), deploymentDisplay.ts centralizes engine/provider display names, and ModelCard/HfModelCard/DeploymentList/DeploymentDetailsPage surface the new provider and engine labels.
  • Shared types/API: shared/types/vllmRecipes.ts, shared/api/vllmRecipes.ts, and shared/types/deployment.ts add recipe types, engine.image/extraArgs, recipeProvenance annotations, and env to EnvVar[] conversion.
  • Docs — new docs/providers/vllm.md, plus updates to README.md, docs/api.md, docs/architecture.md, docs/crd-reference.md, docs/providers.md, docs/versioning-upgrades.md, and agents.md.
  • Provider behavior decisions — Direct vLLM is explicit-only (SelectionRules: nil, never auto-selected) and advertises aggregated serving only (validateCompatibility rejects disaggregated).
  • Build/CI & rebase fixups: Makefile wiring for the new provider, resolution of stale-rebase build blockers (duplicate keys, stray brace, validateSpec arity), and CI lint/test fixes.
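
To make the "typed errors mapped to 400/502/504" item concrete, here is a minimal TypeScript sketch of one way such a mapping could look. The `RecipeError` class, its `kind` field, and `statusForRecipeError` are hypothetical names for illustration only; the actual types in `vllmRecipesClient.ts` and `vllmRecipes.ts` may differ. The fallback to 500 for untyped errors is also an assumption.

```typescript
// Hypothetical error kinds, mirroring the validation/timeout/upstream split
// described in the PR. The real class names may differ.
type RecipeErrorKind = "validation" | "timeout" | "upstream";

class RecipeError extends Error {
  readonly kind: RecipeErrorKind;
  constructor(kind: RecipeErrorKind, message: string) {
    super(message);
    this.kind = kind;
  }
}

// Duck-typed guard so the check does not depend on the prototype chain.
function isRecipeError(err: unknown): err is RecipeError {
  return err instanceof Error && "kind" in err;
}

// Map an error to an HTTP status: bad input is the caller's fault (400),
// a timed-out upstream is 504, any other upstream failure is 502.
function statusForRecipeError(err: unknown): number {
  if (!isRecipeError(err)) return 500; // assumption: untyped errors are internal
  switch (err.kind) {
    case "validation":
      return 400;
    case "timeout":
      return 504;
    case "upstream":
      return 502;
    default:
      return 500;
  }
}
```

Keeping the mapping in one function means a route handler only needs a single `catch` that calls it, instead of a blanket 502 for every failure.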

Testing

  • Unit tests pass (bun run test)
  • Manual testing performed
  • Tested with a Kubernetes cluster

Test coverage added across the stack: providers/vllm/*_test.go (transformer args/dedup, storage mounts, reserved-arg guard, image status, controller happy-path/ownership-conflict/deletion, real-resolver guards), controller webhook/validation tests, backend vllmRecipesClient/vllmRecipeResolver/deployments/shared-deployment tests, frontend DeploymentForm tests, and updated dynamo/kaito/llmd transformer tests.

cd controller && go test ./...
cd providers/vllm && go test ./...
cd providers/llmd && go test ./...
bun run test

Checklist

  • My code follows the project's style guidelines
  • I have run bun run lint
  • I have added tests that prove my fix/feature works
  • New and existing unit tests pass locally
  • I have updated documentation if needed
  • My changes generate no new warnings

Screenshots

N/A

Additional Notes

  • Scope of base: this branch is 11 commits ahead of main (merge-base c5a4422), spanning the initial feature, several review-feedback rounds, a rebase onto updated main, and CI fixups — totaling 79 files (+10,472/−175).
  • docs/plans/vllm-provider-full-plan.md describes forward-looking features (cosign verification, image catalog, broader disaggregated support) that are not all implemented in this PR — treat it as a roadmap, not shipped scope.
  • All previously-raised review threads have been addressed and resolved.

Copilot AI review requested due to automatic review settings May 4, 2026 16:17
@sozercan sozercan requested a review from a team as a code owner May 4, 2026 16:17

Copilot AI left a comment

Contributor

Pull request overview

Adds end-to-end “Direct vLLM” support across the Airunway stack (CRD/schema + controller/provider + backend recipe resolution APIs + frontend deploy UX), including image provenance / resolution status and recipe-derived deployment settings.

Changes:

  • Introduces vLLM Recipes APIs (shared types + backend routes/resolver + frontend client/mocks/tests).
  • Extends the ModelDeployment API surface for Direct vLLM: spec.engine.image, spec.engine.extraArgs, recipe provenance annotations, and status.image.
  • Adds a new providers/vllm provider controller + deployment manifests, and updates llm-d compatibility to prefer spec.engine.image + support extraArgs.
| File | Description |
| --- | --- |
| shared/types/vllmRecipes.ts | Adds shared TS contract for listing/fetching/resolving vLLM recipes. |
| shared/types/index.ts | Re-exports new vLLM recipe types. |
| shared/types/deployment.ts | Adds recipe provenance + engine image/extraArgs + image status; updates manifest/spec conversion. |
| shared/api/vllmRecipes.ts | Adds shared API client wrapper for vLLM recipes endpoints. |
| shared/api/index.ts | Wires vLLM recipes API into the shared API client. |
| providers/vllm/transformer.go | Implements Direct vLLM resource generation (Deployments/Services) from ModelDeployment. |
| providers/vllm/status.go | Translates upstream Deployment status into ModelDeployment provider status. |
| providers/vllm/status_test.go | Unit tests for Deployment→phase/status translation. |
| providers/vllm/Makefile | Adds build/deploy helpers for the vLLM provider. |
| providers/vllm/image_status_test.go | Tests for image resolution/provenance status behavior. |
| providers/vllm/image_resolver.go | Implements remote digest resolution + best-effort OCI provenance extraction. |
| providers/vllm/go.mod | New Go module for the vLLM provider. |
| providers/vllm/Dockerfile | Builds and packages the vLLM provider controller image. |
| providers/vllm/deploy/vllm.yaml | Generated deploy manifest for installing the vLLM provider. |
| providers/vllm/controller.go | Core vLLM provider reconciler (SSA apply, image resolution status, finalizer). |
| providers/vllm/controller_test.go | Unit tests for compatibility checks and reconcile early-exit behavior. |
| providers/vllm/config/rbac/service_account.yaml | ServiceAccount for vLLM provider installation. |
| providers/vllm/config/rbac/role.yaml | ClusterRole for vLLM provider controller permissions. |
| providers/vllm/config/rbac/role_binding.yaml | ClusterRoleBinding for vLLM provider controller. |
| providers/vllm/config/rbac/kustomization.yaml | Kustomize RBAC bundle. |
| providers/vllm/config/manager/manager.yaml | Manager Deployment template for provider install. |
| providers/vllm/config/manager/kustomization.yaml | Kustomize image override for provider manager. |
| providers/vllm/config/default/kustomization.yaml | Default kustomize bundle wiring RBAC + manager. |
| providers/vllm/config.go | Self-registration/heartbeat for InferenceProviderConfig + install info. |
| providers/vllm/config_test.go | Tests for provider config spec + installation info. |
| providers/vllm/cmd/main.go | Provider controller main entrypoint (controller-runtime manager). |
| providers/llmd/transformer.go | Adds engine.extraArgs support + prefers ImageOverride() for image selection. |
| providers/llmd/transformer_test.go | Tests engine image precedence + extraArgs ordering. |
| Makefile | Includes vLLM provider in provider test target. |
| frontend/src/test/mocks/handlers.ts | Adds MSW handlers for vLLM recipes endpoints. |
| frontend/src/pages/DeployPage.tsx | Uses shared engine display naming for badges. |
| frontend/src/pages/DeploymentDetailsPage.tsx | Improves provider/engine display labels and naming. |
| frontend/src/lib/deploymentDisplay.ts | Adds provider/engine display name helpers. |
| frontend/src/lib/api.ts | Adds frontend vLLM recipes API wrapper and exports shared recipe types. |
| frontend/src/components/models/ModelCard.tsx | Uses engine display naming helper. |
| frontend/src/components/models/HfModelCard.tsx | Uses engine display naming helper. |
| frontend/src/components/deployments/DeploymentList.tsx | Uses provider/engine display naming helpers. |
| frontend/src/components/deployments/DeploymentForm.test.tsx | Adds Direct vLLM deploy flow coverage (launch image + recipe apply + submission). |
| docs/versioning-upgrades.md | Updates provider compatibility matrix with llm-d / Direct vLLM entries. |
| docs/providers.md | Updates provider selection docs and capability matrix; adds Direct vLLM row. |
| docs/crd-reference.md | Documents spec.engine.image + spec.engine.extraArgs and Direct vLLM usage. |
| docs/architecture.md | Updates architecture narrative for provider/runtime registration. |
| docs/api.md | Updates REST API docs for engine image/extraArgs and Direct vLLM semantics. |
| deploy/controller.yaml | Updates published CRD schema (engine.image/extraArgs, provider name, image status). |
| controller/internal/webhook/v1alpha1/modeldeployment_webhook.go | Adds webhook validation for conflicting image override fields. |
| controller/internal/webhook/v1alpha1/modeldeployment_webhook_test.go | Tests webhook rejection/admission for image override conflicts. |
| controller/internal/controller/modeldeployment_validation_test.go | Adds reconciliation-time validation tests for image override conflicts. |
| controller/internal/controller/modeldeployment_controller.go | Enforces image override conflict validation before selection and during validation. |
| controller/internal/controller/gateway_reconciler.go | Adds shared helper functions for label merging/setting in gateway reconciler. |
| controller/config/crd/bases/airunway.ai_modeldeployments.yaml | CRD base schema updated for new engine/image fields + image status. |
| controller/api/v1alpha1/zz_generated.deepcopy.go | Deepcopy updates for new fields (engine.extraArgs, status.image). |
| controller/api/v1alpha1/modeldeployment_validation.go | Adds ValidateImageFields() and ImageOverride() helpers. |
| controller/api/v1alpha1/modeldeployment_types.go | Adds EngineSpec.image/extraArgs, ImageStatus, and new condition constants. |
| backend/src/shared-deployment.test.ts | Tests shared manifest conversion for vLLM image mapping, env, extraArgs, recipe provenance annotations. |
| backend/src/services/vllmRecipesClient.ts | Fetches recipe index/raw payloads from recipes.vllm.ai (configurable base URL). |
| backend/src/services/vllmRecipeResolver.ts | Resolves recipes into engine args/resources/image/env/annotations + provenance/warnings. |
| backend/src/services/vllmRecipeResolver.test.ts | Unit tests for recipe materialization behavior. |
| backend/src/services/kubernetes.ts | Improves provider display names for runtime status reporting. |
| backend/src/routes/vllmRecipes.ts | Adds /api/vllm/recipes endpoints (list/get/resolve). |
| backend/src/routes/index.ts | Exports vLLM recipes routes. |
| backend/src/routes/deployments.ts | Extends create schema to accept recipe provenance, env, engineExtraArgs. |
| backend/src/routes/deployments.test.ts | Adds preview/create tests for env + Direct vLLM recipe provenance materialization. |
| backend/src/hono-app.ts | Registers vLLM recipes routes. |
| backend/scripts/embed-assets.ts | Adds @ts-nocheck to generated embed module header for Bun file imports. |

Copilot's findings

  • Files reviewed: 67/68 changed files
  • Comments generated: 9

Comment thread shared/types/deployment.ts
Comment thread providers/vllm/transformer.go
Comment thread providers/vllm/config/manager/kustomization.yaml Outdated
Comment thread docs/providers.md Outdated
Comment thread docs/architecture.md Outdated
Comment thread docs/versioning-upgrades.md Outdated
Comment thread frontend/src/test/mocks/handlers.ts Outdated
Comment thread frontend/src/components/deployments/DeploymentForm.test.tsx
Comment thread backend/src/routes/deployments.ts Outdated
Comment thread providers/vllm/config.go Outdated
Comment thread providers/vllm/config.go Outdated
Comment thread providers/vllm/controller.go
Comment thread backend/src/services/vllmRecipesClient.ts Outdated
Comment thread controller/api/v1alpha1/modeldeployment_validation.go
Comment thread providers/vllm/config.go Outdated
Copilot AI review requested due to automatic review settings May 8, 2026 04:27
@sozercan sozercan requested a review from robert-cronin May 8, 2026 04:30

Copilot AI left a comment

Contributor

Copilot's findings

  • Files reviewed: 80/81 changed files
  • Comments generated: 3

Comment thread shared/types/deployment.ts Outdated
Comment thread backend/src/services/vllmRecipesClient.ts Outdated
Comment thread backend/src/services/vllmRecipesClient.ts Outdated
Copilot AI review requested due to automatic review settings May 8, 2026 05:26

Copilot AI left a comment

Contributor

Copilot's findings

  • Files reviewed: 80/81 changed files
  • Comments generated: 6

Comment thread controller/internal/controller/gateway_reconciler.go Outdated
Comment thread providers/vllm/Dockerfile Outdated
Comment thread frontend/src/test/mocks/handlers.ts
Comment thread providers/vllm/transformer.go Outdated
Comment thread providers/vllm/transformer.go Outdated
Comment thread providers/vllm/status_test.go Outdated
Copilot AI review requested due to automatic review settings May 12, 2026 05:53

Copilot AI left a comment

Contributor

Copilot's findings

Files not reviewed (1)
  • controller/api/v1alpha1/zz_generated.deepcopy.go: Language not supported
  • Files reviewed: 76/78 changed files
  • Comments generated: 2

Comment thread backend/src/services/vllmRecipesClient.ts Outdated
Comment thread backend/src/routes/vllmRecipes.ts Outdated
@sozercan sozercan added this to the 0.7.0 milestone May 19, 2026
@surajssd surajssd self-assigned this Jun 10, 2026
Copilot AI review requested due to automatic review settings June 12, 2026 00:27

Copilot AI left a comment

Contributor

Pull request overview

Copilot reviewed 76 out of 78 changed files in this pull request and generated no new comments.

Files not reviewed (1)
  • controller/api/v1alpha1/zz_generated.deepcopy.go: Generated file

sozercan and others added 8 commits June 15, 2026 12:03
- **Recipe client SSRF / path-traversal hardening**
  (`backend/src/services/vllmRecipesClient.ts`,
  `backend/src/routes/vllmRecipes.ts`): validate Hugging Face model IDs
  as exactly `<org>/<model>` and `encodeURIComponent` each segment,
  restrict the `/:org/:model` route to a single path segment, require
  `https:` for recipe references, and add a 10s `AbortController`
  timeout to `fetchJson`
- **Make Direct vLLM explicit-only** (`providers/vllm/config.go`):
  remove the selection rule so the provider is never auto-selected, and
  migrate capabilities to the per-engine `EngineCapability` shape
- **Reject disaggregated serving** (`providers/vllm/controller.go`):
  `validateCompatibility` now rejects `disaggregated` mode to match the
  advertised aggregated-only capability
- **KubeRay honors `spec.engine.image`**
  (`providers/kuberay/transformer.go`): use `ImageOverride()` so the
  engine image field is not silently ignored
- **Drop empty recipe-provenance annotations**
  (`shared/types/deployment.ts`): trim string values and skip empty
  strings/arrays so blank provenance no longer emits
  `airunway.ai/recipe.*` annotations or a false `generated-by` marker
- Update docs (`docs/providers.md`, `docs/providers/vllm.md`) and the
  related backend/provider tests to match the above

Signed-off-by: Suraj Deshmukh <suraj.deshmukh@microsoft.com>
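
The model-ID hardening described in the commit above can be sketched roughly as follows. This is an illustrative TypeScript example, not the code from `vllmRecipesClient.ts`: the function names `parseModelId` and `buildRecipeUrl`, the allowed character set, the `https://recipes.example` base, and the `/<org>/<model>` URL layout are all assumptions made for the sketch.

```typescript
// Assumed per-segment character set: must start with a letter or digit,
// then letters, digits, '.', '_' or '-'. This rejects "", "." and "..".
const SEGMENT = /^[A-Za-z0-9][A-Za-z0-9._-]*$/;

// Accept exactly "<org>/<model>"; anything with more or fewer segments fails.
function parseModelId(modelId: string): { org: string; model: string } {
  const parts = modelId.split("/");
  if (parts.length !== 2) {
    throw new Error(`model id must be exactly <org>/<model>: ${modelId}`);
  }
  const [org = "", model = ""] = parts;
  if (!SEGMENT.test(org) || !SEGMENT.test(model)) {
    throw new Error(`model id contains an invalid segment: ${modelId}`);
  }
  return { org, model };
}

// Build a recipe URL from a base (https only) and a validated model id.
// Each segment is encoded separately so it can never introduce a new "/".
function buildRecipeUrl(baseUrl: string, modelId: string): string {
  if (!baseUrl.startsWith("https://")) {
    throw new Error("recipe base URL must use https:");
  }
  const { org, model } = parseModelId(modelId);
  const base = baseUrl.replace(/\/+$/, ""); // drop trailing slashes
  return `${base}/${encodeURIComponent(org)}/${encodeURIComponent(model)}`;
}
```

For example, `buildRecipeUrl("https://recipes.example", "meta-llama/Llama-3.1-8B")` yields a two-segment path, while inputs such as `../etc/passwd` or `org/..` throw before any request would be made.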
- **`SettingsPage.tsx`**: remove a stray extra `}` after
  `selectDefaultRuntimeId` that broke the entire frontend build
  (`TS1128`)
- **`DeploymentForm.tsx`**: drop the duplicate `vllm` keys in
  `RUNTIME_INFO` and `RUNTIME_ENGINES` (`TS1117`); the canonical `Direct
  vLLM` entries now win instead of the stale `vLLM`/native ones
- **`kubernetes.ts`**: delete the local `getProviderDisplayName`
  redeclaration that shadowed the import from `../lib/providers`
  (`TS2440`)
- **`modeldeployment_validation_test.go`**: update the `validateSpec`
  call to the current 5-arg signature so the controller test package
  compiles
- **`DeploymentList.tsx`, `DeploymentDetailsPage.tsx`**: remove unused
  `generateAynaUrl` and `MessageSquare` imports that failed lint under
  `--max-warnings 0`

Signed-off-by: Suraj Deshmukh <suraj.deshmukh@microsoft.com>

Copilot AI left a comment

Contributor

Pull request overview

Copilot reviewed 76 out of 78 changed files in this pull request and generated 1 comment.

Files not reviewed (1)
  • controller/api/v1alpha1/zz_generated.deepcopy.go: Generated file

Comment thread providers/vllm/go.mod Outdated
surajssd added 2 commits June 15, 2026 12:29
- **`deployments.test.ts`**: type `capturedConfig` as `DeploymentConfig
  | undefined` instead of `any` (`@typescript-eslint/no-explicit-any`)
- **`vllmRecipeResolver.ts`**: remove the unused `findRecordAtPath`
  helper, and use `const` for the never-reassigned `result` in
  `applyExplicitFeatureSelection` (`prefer-const`)
- **`DeploymentForm.test.tsx`**: update the vLLM runtime card assertions
  to expect the current `Direct vLLM` description text instead of the
  stale `native vLLM provider` copy

Signed-off-by: Suraj Deshmukh <suraj.deshmukh@microsoft.com>
- **`transformer.go`**: skip a derived flag (`--tensor-parallel-size`,
  `--model`, `--max-model-len`, etc.) when its key is already set in
  `spec.engine.args`, so an edited GPU count can no longer emit a
  conflicting duplicate `--tensor-parallel-size`
- **`transformer.go`**: render `--enforce-eager` and
  `--enable-prefix-caching` (the `spec.engine` toggles were silently
  dropped)
- **`transformer.go`**: mount `spec.model.storage` PVC volumes
  (`volumes` + `volumeMounts`) alongside `/dev/shm`
- **`transformer.go`**: reject reserved `host`/`port` engine args in the
  `--key` and `--key=value` forms, and guard nil
  `Decode.GPU`/`Prefill.GPU` in `transformDisaggregated`
- **`vllmRecipesClient.ts`**: add typed errors
  (validation/timeout/upstream), a TTL in-memory cache with
  stale-on-error fallback, and a 5 MiB response-size bound
- **`vllmRecipes.ts`**: map recipe errors to `400`/`504`/`502` instead
  of a blanket `502`
- **`controller.go`**: name the conflicting owner in
  `resourceConflictError`
- Add controller and transformer tests for the above, and document the
  registry-coupling / nightly-digest behavior in
  `docs/providers/vllm.md`
- Note the new `spec.engine.image`/`extraArgs` fields and
  `providers/vllm` in `agents.md`
- Bump `providers/vllm` dependencies (`go.mod`/`go.sum`)

Signed-off-by: Suraj Deshmukh <suraj.deshmukh@microsoft.com>
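
The derived-flag dedup and reserved-argument guard from the commit above live in Go (`providers/vllm/transformer.go`). The TypeScript below is only a language-neutral illustration of the idea, written in TypeScript to match the other sketches in this thread. `flagKey` and `mergeArgs` are hypothetical names, and modelling user arguments as a flat string list is an assumption about the shape of `spec.engine.args`.

```typescript
// Extract the key from "--key" or "--key=value"; undefined for non-flags.
function flagKey(arg: string): string | undefined {
  if (!arg.startsWith("--")) return undefined;
  const eq = arg.indexOf("=");
  return eq === -1 ? arg.slice(2) : arg.slice(2, eq);
}

// Flags the provider sets itself and users may not override.
const RESERVED_KEYS = new Set(["host", "port"]);

// Merge provider-derived flag groups (e.g. ["--tensor-parallel-size", "2"])
// with user-supplied args. A derived group is skipped when the user already
// set the same key, so the final command line never carries a duplicate.
function mergeArgs(derived: string[][], userArgs: string[]): string[] {
  const userKeys = new Set<string>();
  for (const arg of userArgs) {
    const key = flagKey(arg);
    if (key === undefined) continue; // a value or positional argument
    if (RESERVED_KEYS.has(key)) {
      throw new Error(`engine arg --${key} is reserved`);
    }
    userKeys.add(key);
  }
  const out: string[] = [];
  for (const group of derived) {
    const key = flagKey(group[0] ?? "");
    if (key !== undefined && userKeys.has(key)) continue; // user value wins
    out.push(...group);
  }
  return [...out, ...userArgs];
}
```

With this shape, a user-supplied `--tensor-parallel-size=4` suppresses the derived `--tensor-parallel-size 2`, and both `--port 9000` and `--port=9000` are rejected because the key is extracted before the check.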
Copilot AI review requested due to automatic review settings June 15, 2026 19:57

Copilot AI left a comment

Contributor

Pull request overview

Copilot reviewed 77 out of 79 changed files in this pull request and generated no new comments.

Files not reviewed (1)
  • controller/api/v1alpha1/zz_generated.deepcopy.go: Generated file

Copilot AI left a comment

Contributor

Pull request overview

Copilot reviewed 77 out of 79 changed files in this pull request and generated no new comments.

Files not reviewed (1)
  • controller/api/v1alpha1/zz_generated.deepcopy.go: Generated file

Copilot AI left a comment

Contributor

Pull request overview

Copilot reviewed 77 out of 79 changed files in this pull request and generated no new comments.

Files not reviewed (1)
  • controller/api/v1alpha1/zz_generated.deepcopy.go: Generated file

surajssd added 3 commits June 15, 2026 15:37
- **`vllmRecipesClient.ts`**: cap the per-model cache with LRU eviction
  (`MAX_CACHE_ENTRIES`) so the unauthenticated recipe route cannot grow
  it without bound, and stream-bound the response body via
  `readBoundedBody` so a chunked / no-`content-length` reply is aborted
  at the 5 MiB cap instead of being fully buffered first
- **`vllmRecipeResolver.ts`**: compute GPUs-per-pod as `tensor-parallel
  × pipeline-parallel` only (data-parallel/decode-context scale
  replicas, not GPUs), and stop `stripVllmServePrefix` from dropping a
  leading `--model` flag as if it were the positional model id
- **`providers/vllm/controller.go`**: enforce the finalizer timeout even
  when the owned Deployment is stuck Terminating (Delete returns nil),
  and skip the `Deploying` phase downgrade when `syncStatus` failed so a
  transient API error cannot flip a `Running` deployment
- **`providers/vllm/image_resolver.go`**: bound the registry resolve
  with a `context.WithTimeout` so a hung registry cannot stall the
  reconcile worker
- **`providers/vllm/transformer.go`**: extend derived-flag dedup to
  `spec.engine.extraArgs` (not just `engine.args`), and drop a
  user-supplied `HF_TOKEN` from `spec.env` when the token secret is
  injected to avoid a duplicate env entry
- **`website/docusaurus.config.js`**: exclude `docs/plans/**` from the
  published site so internal planning docs are not rendered as public
  pages
- **`docs/providers/vllm.md`**: document the `status.image.source`
  classification and the `spec.provider.overrides` trust boundary
- Add tests covering cache eviction, the streaming size bound, extraArgs
  dedup, HF_TOKEN dedup, the finalizer-timeout path, GPU-per-pod
  derivation, the `--model` guard, and disaggregated-mode detection

Signed-off-by: Suraj Deshmukh <suraj.deshmukh@microsoft.com>
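
As a rough illustration of the capped cache described in the commit above, the sketch below combines a TTL with LRU eviction using the insertion order of a JavaScript `Map`. The class name `TtlLruCache`, its method signatures, and passing `now` explicitly are choices made for this example. Only `MAX_CACHE_ENTRIES` is named in the commit, and its actual value is not stated there.

```typescript
interface CacheEntry<V> {
  value: V;
  expiresAt: number; // epoch milliseconds
}

// A Map iterates in insertion order, so re-inserting a key on every access
// keeps the least recently used key at the front for cheap eviction.
class TtlLruCache<V> {
  private readonly entries = new Map<string, CacheEntry<V>>();
  private readonly maxEntries: number;
  private readonly ttlMs: number;

  constructor(maxEntries: number, ttlMs: number) {
    this.maxEntries = maxEntries;
    this.ttlMs = ttlMs;
  }

  // Returns the cached value plus a freshness flag. Stale entries are still
  // returned so a caller can fall back to them when a refetch fails.
  get(key: string, now: number): { value: V; fresh: boolean } | undefined {
    const entry = this.entries.get(key);
    if (entry === undefined) return undefined;
    this.entries.delete(key); // move to the most recently used position
    this.entries.set(key, entry);
    return { value: entry.value, fresh: now < entry.expiresAt };
  }

  set(key: string, value: V, now: number): void {
    this.entries.delete(key);
    this.entries.set(key, { value, expiresAt: now + this.ttlMs });
    // Evict least recently used entries until the cap is respected.
    while (this.entries.size > this.maxEntries) {
      const oldest = this.entries.keys().next();
      if (oldest.done) break;
      this.entries.delete(oldest.value);
    }
  }

  get size(): number {
    return this.entries.size;
  }
}
```

Because the cap is enforced on every `set`, requests for many distinct model IDs on an unauthenticated route can only churn the cache, not grow it without bound.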
- **`versions.env`** / **`shared/types/versions.generated.ts`**: add
  `VLLM_VERSION` (`cu130-nightly`) as the single source of truth for the
  Direct vLLM default image tag, and regenerate the TS export
- **`providers/vllm/transformer.go`**: make `VLLMVersion` an
  ldflags-injectable `var` and compute `DefaultVLLMImage` from
  `officialVLLMImageRepository` + `VLLMVersion` so the default tracks
  `versions.env`
- **`providers/vllm/Makefile`**: include `versions.env`, inject
  `VLLMVersion` via `-ldflags`, and add the missing
  `verify-versions`/`vet`/`test` targets to match the dynamo/kaito
  providers
- **`providers/vllm/Dockerfile`**: require a `VLLM_VERSION` build-arg
  and inject it via `-ldflags` so the in-image default cannot drift from
  `versions.env`
- **`Makefile`**: add a `verify-versions` check asserting the
  `transformer.go` `VLLMVersion` fallback literal matches `versions.env`
- **`docs/providers/vllm.md`**: document `make controller-deploy` +
  `make -C providers/vllm deploy` as the in-repo install path alongside
  the published-manifest `kubectl apply` path

Signed-off-by: Suraj Deshmukh <suraj.deshmukh@microsoft.com>
- **`Makefile`**: replace the hardcoded `versions in sync` echo with an
  `awk` line that lists every `KEY=VALUE` from `versions.env`, so the
  summary stays current automatically as keys are added (and now prints
  all versions, not just three)
- **`hack/test-verify-versions.sh`**: add a
  `providers/vllm/transformer.go` mutation case so the guard self-test
  also exercises the `VLLMVersion` check

Signed-off-by: Suraj Deshmukh <suraj.deshmukh@microsoft.com>
Copilot AI review requested due to automatic review settings June 15, 2026 23:33

Copilot AI left a comment

Contributor

Pull request overview

Copilot reviewed 81 out of 83 changed files in this pull request and generated no new comments.

Files not reviewed (1)
  • controller/api/v1alpha1/zz_generated.deepcopy.go: Generated file

4 participants