You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
|[**KAITO**](https://github.com/kaito-project/kaito)| vLLM (GPU) and llama.cpp (CPU/GPU) support |[kaito.yaml](providers/kaito/deploy/kaito.yaml)|
33
33
|[**LLM-D**](https://github.com/llm-d/llm-d)| vLLM (GPU) with aggregated or disaggregated serving |[llmd.yaml](providers/llmd/deploy/llmd.yaml)|
34
+
|[**Direct vLLM**](docs/providers/vllm.md)| Direct OpenAI-compatible vLLM Deployments for newest model support |[vllm.yaml](providers/vllm/deploy/vllm.yaml)|
Copy file name to clipboardExpand all lines: agents.md
+5-2Lines changed: 5 additions & 2 deletions
Display the source diff
Display the rich diff
Original file line number
Diff line number
Diff line change
@@ -2,7 +2,7 @@
2
2
3
3
## WHY: Project Purpose
4
4
5
-
**AI Runway** is a platform for deploying and managing machine learning models on Kubernetes. It provides a unified CRD abstraction (`ModelDeployment`) that works across multiple inference providers (KAITO, Dynamo, KubeRay, llm-d, etc.).
5
+
**AI Runway** is a platform for deploying and managing machine learning models on Kubernetes. It provides a unified CRD abstraction (`ModelDeployment`) that works across multiple inference providers (KAITO, Dynamo, KubeRay, llm-d, Direct vLLM, etc.).
6
6
7
7
## WHAT: Tech Stack & Structure
8
8
@@ -18,6 +18,7 @@
18
18
-`controller/config/` - Kustomize manifests for CRDs/RBAC
19
19
-`frontend/src/` - React components, hooks, pages
20
20
-`backend/src/` - Hono app, providers, services
21
+
-`providers/` - Standalone provider controllers/shims (`dynamo`, `kaito`, `kuberay`, `llmd`, `vllm`); each renders `ModelDeployment` into its upstream resource. `providers/vllm` is the in-repo Direct vLLM provider (renders native `Deployment`+`Service`, selected via `provider.name: vllm`).
21
22
-`shared/types/` - Shared TypeScript definitions
22
23
-`plugins/headlamp/` - Headlamp dashboard plugin
23
24
-`docs/` - Detailed documentation (read as needed; also the source rendered on the website)
@@ -103,7 +104,9 @@ Unified API for deploying ML models. Key fields:
103
104
-`spec.model.id` - HuggingFace model ID or custom identifier
104
105
-`spec.model.source` - `huggingface` or `custom`
105
106
-`spec.engine.type` - `vllm`, `sglang`, `trtllm`, or `llamacpp` (optional, auto-selected from provider capabilities)
-`spec.engine.image` - Optional engine-specific container image override (preferred over legacy top-level `spec.image`; used by Direct vLLM/custom images)
108
+
-`spec.engine.extraArgs` - Optional list of raw engine flags appended verbatim
0 commit comments