2 changes: 1 addition & 1 deletion .github/workflows/ci-pr-checks.yaml
@@ -450,4 +450,4 @@ jobs:
E2E_LABEL_FILTER: ${{ matrix.suite.label-filter }}
LOAD_VLLM_RENDER_IMAGE: ${{ matrix.suite.needs-renderer }}
PULL_VLLM_RENDER_IMAGE: "false"
run: make test-e2e-scheduler-run
run: make test-e2e-router-run
16 changes: 8 additions & 8 deletions Makefile
@@ -304,24 +304,24 @@ test-e2e-gaie-run: image-pull ## Ensure images are present, then run GAIE e2e te
$(CONTAINER_RUNTIME) run $(BUILDER_RUN_FLAGS) $(BUILDER_E2E_FLAGS) \
-e EPP_IMAGE=$(GAIE_E2E_IMAGE) \
-e USE_KIND=true \
$(BUILDER_IMAGE) ./hack/test-e2e.sh
$(BUILDER_IMAGE) ./test/scripts/test-e2e-gaie.sh

.PHONY: test-e2e-gaie
test-e2e-gaie: image-build-builder image-build ## Build images and run GAIE e2e tests
$(MAKE) test-e2e-gaie-run

.PHONY: test-e2e-scheduler-run
test-e2e-scheduler-run: image-pull ## Ensure images are present, then run scheduler e2e tests
.PHONY: test-e2e-router-run
test-e2e-router-run: image-pull ## Ensure images are present, then run router e2e tests
@printf "\033[33;1m==== Running End to End Tests ====\033[0m\n"
$(CONTAINER_RUNTIME) run $(BUILDER_RUN_FLAGS) $(BUILDER_E2E_FLAGS) \
$(BUILDER_IMAGE) ./test/scripts/run_e2e.sh
$(BUILDER_IMAGE) ./test/scripts/test-e2e-router.sh

.PHONY: test-e2e-scheduler
test-e2e-scheduler: image-build-builder image-build ## Build images and run scheduler e2e tests
$(MAKE) test-e2e-scheduler-run
.PHONY: test-e2e-router
test-e2e-router: image-build-builder image-build ## Build images and run router e2e tests
$(MAKE) test-e2e-router-run

.PHONY: test-e2e
test-e2e: test-e2e-gaie test-e2e-scheduler ## Run all end-to-end tests sequentially
test-e2e: test-e2e-gaie test-e2e-router ## Run all end-to-end tests sequentially


.PHONY: bench-tokenizer
3 changes: 2 additions & 1 deletion README.md
@@ -39,7 +39,7 @@ A lightweight deployment where a self-managed Envoy proxy runs alongside the EPP
### 2. Gateway Mode (Inference Gateway)
The recommended mode for production environments, leveraging the official [Gateway API]. In this mode, the EPP acts as a backend for an `InferencePool`, which is referenced by an `HTTPRoute` on a shared `Gateway`. This enables advanced traffic management, multi-cluster load balancing, and shared infrastructure for both inference and traditional workloads.

For more details on the router architecture, routing logic, and different plugins (filters and scorers), see the [Architecture Documentation].
For more details on the router architecture, routing logic, and different plugins (filters and scorers), see the [Architecture Documentation]. For resource provisioning and container sizing recommendations under heavy or long-context workloads, see the [EPP Container Sizing Guide].

---

@@ -61,6 +61,7 @@ To ensure clarity across the project, we use the following standard terminology:
[Kubernetes Gateway API]:https://gateway-api.sigs.k8s.io/
[Architecture Documentation]:docs/architecture.md
[Disaggregation Documentation]:docs/disaggregation.md
[EPP Container Sizing Guide]:docs/operations.md
[InferencePool]:https://github.com/kubernetes-sigs/gateway-api-inference-extension
[Gateway API Inference Extension (GIE)]:https://github.com/kubernetes-sigs/gateway-api-inference-extension
[Kubernetes Gateway API Inference Extensions]:https://github.com/kubernetes-sigs/gateway-api-inference-extension
28 changes: 11 additions & 17 deletions config/charts/README.md
@@ -32,15 +32,14 @@ helm install my-standalone-router ./config/charts/llm-d-router-standalone \
--set router.modelServers.matchLabels.app=my-vllm-service
```

#### Standalone with Agentgateway Proxy (Service-Backed)
Deploys EPP with an Agentgateway proxy. This mode requires disabling the `InferencePool` resource creation (`create=false`) and routes traffic to an existing Kubernetes Service:
#### Standalone with Agentgateway Proxy
Deploys EPP with an Agentgateway proxy. This mode requires disabling the `InferencePool` resource creation (`create=false`) and routes traffic directly to model servers:

```bash
helm install my-standalone-router ./config/charts/llm-d-router-standalone \
--set router.inferencePool.create=false \
--set router.proxy.proxyType=agentgateway \
--set router.proxy.agentgateway.service.name=my-model-service \
--set router.proxy.agentgateway.service.ports="8000"
--set router.modelServers.matchLabels.app=my-model-service
```

#### Standalone with a Separate Proxy Service
@@ -538,20 +537,21 @@ Configures EPP to run with a proxy (Envoy proxy or Agentgateway proxy) that inte
| `router.proxy.volumeMounts` | Sidecar container volume mounts. | `[]` |
| `router.proxy.volumes` | Sidecar container volumes. | `[]` |
| `router.proxy.configMapData` | Key-value pairs to include in a ConfigMap created for the sidecar. | `{}` |
| `router.proxy.agentgateway.service.create` | **Agentgateway only**. Create a dedicated model Service for the Agentgateway proxy. | `true` |
| `router.proxy.agentgateway.service.name` | **Agentgateway only**. Name of the model Service to route to. | `""` |
| `router.proxy.agentgateway.service.namespace` | **Agentgateway only**. Namespace of the model Service. Defaults to release namespace. | `""` |
| `router.proxy.agentgateway.service.ports` | **Agentgateway only**. Port list for the model Service (must match `modelServers.targetPorts`). | `[]` |
#### Complete Standalone Example with Agentgateway Proxy

#### Complete Proxy Sidecar Example (Agentgateway Service-Backed)

To deploy EPP in standalone mode with an Agentgateway sidecar routing traffic directly to an existing model Service `my-model-service` (bypassing `InferencePool` creation):
To deploy EPP in standalone mode with an Agentgateway sidecar routing traffic directly to model servers matching the label `app=my-model-service` (bypassing `InferencePool` creation):

```yaml
router:
inferencePool:
create: false # Disable InferencePool creation

modelServers:
matchLabels:
app: "my-model-service"
targetPorts:
- number: 8000

proxy:
enabled: true
proxyType: agentgateway
@@ -561,10 +561,4 @@ router:
memory: 4Gi
limits:
memory: 8Gi
agentgateway:
service:
create: true # Create a Service to route client traffic to EPP
name: "my-model-service"
ports:
- 8000 # Intercept traffic on port 8000
```
28 changes: 10 additions & 18 deletions config/charts/llm-d-router-standalone/templates/_validations.tpl
@@ -18,6 +18,13 @@ standalone validations
{{- if not (or (eq $proxyMode "sidecar") (eq $proxyMode "service")) -}}
{{- fail (printf ".Values.router.proxy.mode must be one of [sidecar, service], got %q" $proxyMode) -}}
{{- end -}}
{{- /* Without an InferencePool the EPP --endpoint-selector is rendered from modelServers.matchLabels; an empty selector is rejected by EPP at startup, so require it here. */ -}}
{{- $useInferencePool := ne .Values.router.inferencePool.create false -}}
{{- if not $useInferencePool -}}
{{- if or (empty .Values.router.modelServers) (not .Values.router.modelServers.matchLabels) -}}
{{- fail ".Values.router.modelServers.matchLabels is required when .Values.router.inferencePool.create=false: standalone mode renders the EPP --endpoint-selector from matchLabels and cannot start with an empty selector" -}}
{{- end -}}
{{- end -}}
Comment on lines +21 to +27

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Require and validate modelServers.matchLabels.app for agentgateway backend naming (CWE-20).

Lines 21-27 only enforce that matchLabels exists. In this PR, the agentgateway backend identity is derived from matchLabels.app; if app is missing or invalid, routing can silently target the wrong backend name at runtime. Add a hard fail for a missing or invalid matchLabels.app when router.proxy.proxyType=agentgateway and router.inferencePool.create=false.

Proposed validation hardening
 {{- if not $useInferencePool -}}
   {{- if or (empty .Values.router.modelServers) (not .Values.router.modelServers.matchLabels) -}}
     {{- fail ".Values.router.modelServers.matchLabels is required when .Values.router.inferencePool.create=false: standalone mode renders the EPP --endpoint-selector from matchLabels and cannot start with an empty selector" -}}
   {{- end -}}
+  {{- if eq ($proxy.proxyType | default "envoy") "agentgateway" -}}
+    {{- $appLabel := index (.Values.router.modelServers.matchLabels | default dict) "app" | default "" -}}
+    {{- if empty $appLabel -}}
+      {{- fail ".Values.router.modelServers.matchLabels.app is required when proxyType=agentgateway and inferencePool.create=false" -}}
+    {{- end -}}
+    {{- if or (gt (len $appLabel) 63) (not (regexMatch "^[a-z0-9]([-a-z0-9]*[a-z0-9])?$" $appLabel)) -}}
+      {{- fail ".Values.router.modelServers.matchLabels.app must be a DNS-1123 label for agentgateway backend naming" -}}
+    {{- end -}}
+  {{- end -}}
 {{- end -}}
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@config/charts/llm-d-router-standalone/templates/_validations.tpl` around
lines 21 - 27, The current validation in the _validations.tpl template only
checks if matchLabels exists, but does not validate the specific matchLabels.app
field which is required for agentgateway backend naming. When
router.proxy.proxyType is set to agentgateway and router.inferencePool.create is
false (standalone mode), add an additional validation check after the existing
matchLabels check to ensure that matchLabels.app is present and valid. This
validation should fail with a clear error message if matchLabels.app is missing
or empty when using agentgateway in standalone mode, preventing silent routing
errors at runtime.
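
The proposed check can also be sketched outside of Helm, which makes it easier to try candidate label values before rendering the chart. The Python below is a minimal illustration, not the chart's implementation: `validate_app_label` and its parameters are hypothetical names, and the error strings are shortened. It reuses the regular expression and the 63-character limit from the proposed template, and uses `re.fullmatch` in place of the `^...$` anchors so that a trailing newline is rejected.

```python
import re

# Same character rules as the proposed regexMatch pattern. The ^...$ anchors
# are dropped because re.fullmatch already requires the whole string to match.
DNS1123_LABEL = re.compile(r"[a-z0-9]([-a-z0-9]*[a-z0-9])?")
MAX_LABEL_LEN = 63


def validate_app_label(match_labels, proxy_type="envoy", inference_pool_create=True):
    """Return an error message, or None when the values pass.

    Hypothetical helper mirroring the proposed template checks.
    """
    # Mirrors `ne .Values.router.inferencePool.create false`: only an explicit
    # False disables the InferencePool and triggers the standalone checks.
    if inference_pool_create is not False:
        return None
    if not match_labels:
        return "modelServers.matchLabels is required when inferencePool.create=false"
    if proxy_type != "agentgateway":
        return None
    app = str(match_labels.get("app") or "")
    if not app:
        return "modelServers.matchLabels.app is required when proxyType=agentgateway"
    if len(app) > MAX_LABEL_LEN or not DNS1123_LABEL.fullmatch(app):
        return "modelServers.matchLabels.app must be a DNS-1123 label"
    return None
```

With these assumptions, `validate_app_label({"app": "my-model-service"}, "agentgateway", False)` returns `None`, while a value such as `My_Service` or a 64-character name returns the DNS-1123 message.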

{{- $failOpen := index $proxy "failOpen" -}}
{{- if and (not (kindIs "invalid" $failOpen)) (not (kindIs "bool" $failOpen)) -}}
{{- fail (printf ".Values.router.proxy.failOpen must be a boolean, got %q" (toString $failOpen)) -}}
@@ -46,32 +53,17 @@ standalone validations
{{- fail (printf ".Values.router.proxy.proxyType must be one of [envoy, agentgateway], got %q" $proxyType) -}}
{{- end -}}
{{- if eq $proxyType "agentgateway" -}}
{{- if hasKey $proxy "agentgateway" -}}
{{- fail ".Values.router.proxy.agentgateway is no longer supported; standalone agentgateway uses EPP endpoint discovery with a logical service backend" -}}
{{- end -}}
{{- if ne .Values.router.inferencePool.create false -}}
{{- fail ".Values.router.inferencePool.create=false is required when proxyType=agentgateway; standalone agentgateway currently supports only service-backed routing" -}}
{{- end -}}
{{- $agentgateway := index $proxy "agentgateway" | default dict -}}
{{- $service := index $agentgateway "service" | default dict -}}
{{- $serviceName := index $service "name" | default "" -}}
{{- $serviceCreate := index $service "create" | default true -}}
{{- if hasKey $service "port" -}}
{{- fail ".Values.router.proxy.agentgateway.service.port has been replaced by .Values.router.proxy.agentgateway.service.ports" -}}
{{- end -}}
{{- if empty $serviceName -}}
{{- fail ".Values.router.proxy.agentgateway.service.name is required when proxyType=agentgateway" -}}
{{- end -}}
{{- $targetPorts := include "llm-d-router.standaloneEndpointTargetPorts" . -}}
{{- $servicePorts := include "llm-d-router.agentgateway.modelServicePorts" . -}}
{{- if ne $targetPorts $servicePorts -}}
{{- fail (printf ".Values.router.proxy.agentgateway.service.ports must match .Values.router.modelServers.targetPorts when proxyType=agentgateway, got service ports %q and target ports %q" $servicePorts $targetPorts) -}}
{{- end -}}
{{- $listenerPort := include "llm-d-router.standaloneProxyListenerPort" . -}}
{{- $flags := .Values.router.epp.flags | default dict -}}
{{- if and (hasKey $flags "secure-serving") (ne (toString (index $flags "secure-serving")) "false") -}}
{{- fail ".Values.router.epp.flags.secure-serving must be false when proxyType=agentgateway; standalone agentgateway uses plaintext gRPC to EPP over localhost" -}}
{{- end -}}
{{- if $serviceCreate -}}
{{- $selectorLabels := include "llm-d-router.agentgateway.modelServiceSelectorLabels" . -}}
{{- end -}}
{{- end -}}
{{- end -}}
{{- end -}}

This file was deleted.

11 changes: 1 addition & 10 deletions config/charts/llm-d-router-standalone/values.yaml
@@ -33,16 +33,7 @@ router:
limits:
memory: 16Gi

# Agentgateway-specific settings used by the built-in preset when
# proxyType=agentgateway. service.name is required.
agentgateway:
service:
create: true
name: ""
namespace: ""
# Must match inferencePool.targetPorts.
ports:
- 8000


# Built-in standalone proxy presets. The selected preset is merged with the
# top-level proxy.* fields below, so explicit user overrides still win.
42 changes: 17 additions & 25 deletions config/charts/routerlib/templates/_helpers.tpl
@@ -269,13 +269,20 @@ Return the standalone EPP model-server target ports.
{{- end -}}

{{/*
Return the agentgateway model Service ports.
Return the agentgateway standalone logical backend service name.
Derives the name from .Values.router.modelServers.matchLabels.app,
falling back to .Release.Name if not set.
*/}}
{{- define "llm-d-router.agentgateway.modelServicePorts" -}}
{{- $proxyValues := .Values.router.proxy | default dict -}}
{{- $agentgateway := index $proxyValues "agentgateway" | default dict -}}
{{- $service := index $agentgateway "service" | default dict -}}
{{- include "llm-d-router.normalizedPortList" (dict "path" ".Values.router.proxy.agentgateway.service.ports" "value" (index $service "ports")) -}}
{{- define "llm-d-router.agentgateway.logicalBackendName" -}}
{{- $appLabel := "" -}}
{{- if and .Values.router.modelServers .Values.router.modelServers.matchLabels -}}
{{- $appLabel = index .Values.router.modelServers.matchLabels "app" | default "" -}}
{{- end -}}
{{- if not (empty $appLabel) -}}
{{- $appLabel -}}
{{- else -}}
{{- .Release.Name -}}
{{- end -}}
{{- end -}}
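
The fallback in `llm-d-router.agentgateway.logicalBackendName` can be read as the following Python sketch. It is an illustration only: `logical_backend_name` is a hypothetical function, `values` stands in for `.Values` as a nested dict, and `release_name` stands in for `.Release.Name`.

```python
def logical_backend_name(values, release_name):
    """Return the backend name the helper would render.

    Hypothetical stand-in for the template: prefer
    router.modelServers.matchLabels.app, else fall back to the release name.
    """
    router = values.get("router") or {}
    model_servers = router.get("modelServers") or {}
    match_labels = model_servers.get("matchLabels") or {}
    # A missing or empty "app" label is treated the same way, like
    # `index ... "app" | default ""` followed by `empty` in the template.
    app_label = match_labels.get("app") or ""
    return app_label if app_label else release_name
```

For example, values with `matchLabels: {app: my-model-service}` yield `my-model-service`, while values that set only other labels yield the release name. The review comment elsewhere in this PR concerns that second case.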

{{/*
@@ -329,30 +336,15 @@ Return the rendered proxy ConfigMap data.
{{- toYaml $data -}}
{{- end -}}

{{/*
Render labels from the standalone endpoint selector for the generated model Service.
Only equality-based selectors are supported because Service selectors are a map.
*/}}
{{- define "llm-d-router.agentgateway.modelServiceSelectorLabels" -}}
{{- if and .Values.router.modelServers .Values.router.modelServers.matchLabels -}}
{{- range $key, $value := .Values.router.modelServers.matchLabels -}}
{{- printf "%s: %s\n" ($key | quote) ($value | quote) -}}
{{- end -}}
{{- else -}}
{{- fail ".Values.modelServers.matchLabels is required when creating an agentgateway model Service" -}}
{{- end -}}
{{- end -}}


{{/*
Render the default standalone agentgateway proxy config template.
*/}}
{{- define "llm-d-router.proxy.agentgatewayConfig" -}}
{{- $proxyValues := .Values.router.proxy | default dict -}}
{{- $agentgateway := index $proxyValues "agentgateway" | default dict -}}
{{- $service := index $agentgateway "service" | default dict -}}
{{- $serviceName := index $service "name" | default "" -}}
{{- $serviceNamespace := index $service "namespace" | default .Release.Namespace -}}
{{- $servicePorts := splitList "," (include "llm-d-router.agentgateway.modelServicePorts" .) -}}
{{- $serviceName := include "llm-d-router.agentgateway.logicalBackendName" . -}}
{{- $serviceNamespace := .Release.Namespace -}}
{{- $servicePorts := splitList "," (include "llm-d-router.standaloneEndpointTargetPorts" .) -}}
{{- $backendPort := index $servicePorts 0 -}}
{{- $listenerPort := include "llm-d-router.standaloneProxyListenerPort" . | int -}}
config:
1 change: 1 addition & 0 deletions docs/architecture.md
@@ -306,3 +306,4 @@ Enable chunked decode via the pd-sidecar flag:

- [GIE Spec](../README.md#relation-to-gie-igw)
- [Envoy External Processing](https://www.envoyproxy.io/docs/envoy/latest/configuration/http/http_filters/ext_proc_filter)
- [EPP Container Sizing Guide](./operations.md)