Open
Labels
- Area: Inference (Activities related to Gateway API Inference Extension support.)
- Area: agentgateway
- Priority: High (Required in next 3 months to make progress, bugs that affect multiple users, or very bad UX)
Description
Goal:
Support kgateway + agentgateway in the existing llm-d kgateway instructions, which currently target Envoy extproc (see the existing docs: https://llm-d.ai/docs/architecture/Components/infra#prerequisites).
At a high level, this will require:
- Ensuring agentgateway has up-to-date Gateway API Inference Extension support
- Verifying and/or adding the telemetry required by the llm-d inference scheduler
Context / Background:
- kgateway with the Envoy data plane is already supported as an Inference Gateway in llm-d-infra
- The inference extension itself currently does not emit metrics. EPP and vLLM emit metrics, which are documented upstream.
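One way to approach the telemetry verification is to scrape a metrics endpoint and check that the expected series are present. The sketch below is illustrative only: the two vLLM metric names are an assumption about what the scheduler consumes (this issue does not list the required metrics), and `missing_metrics` and `REQUIRED_METRICS` are hypothetical names introduced here.

```shell
# Assumed metric names; replace with whatever the llm-d inference scheduler actually requires.
REQUIRED_METRICS="${REQUIRED_METRICS:-vllm:num_requests_waiting vllm:num_requests_running}"

# Reads Prometheus text-format metrics on stdin and prints each required metric
# that has no sample line. Returns non-zero if any metric is missing.
missing_metrics() {
  scraped="$(cat)"
  status=0
  for name in $REQUIRED_METRICS; do
    # A sample line starts with the metric name, followed by a label set or a value.
    if ! printf '%s\n' "$scraped" | grep -Eq "^${name}(\{| )"; then
      echo "missing: $name"
      status=1
    fi
  done
  return $status
}
```

It could be fed from any scrape, for example `curl -s "$METRICS_URL" | missing_metrics`, where `METRICS_URL` is a placeholder for the vLLM or EPP metrics endpoint in your deployment.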
Open Questions / Next Steps:
- Test llm-d example with `--set agentgateway.enabled=true --set inferenceExtension.enabled=true`
- In theory agentgateway should work out of the box based on the requirements from the getting started guide: https://github.com/llm-d/llm-d/blob/dev/guides/prereq/gateway-provider/README.md#before-you-begin. Confirm that this is the case. If not, open follow-up issues in agentgateway for data plane issues, or in kgateway for control plane issues.
- Update llm-d docs with agentgateway example
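As a rough sketch of the test step, the two `--set` flags from this issue could be passed to a Helm install of kgateway as shown below. The chart reference, release name, and namespace are placeholders (assumptions, not values taken from this issue), and `kgateway_helm_args` is a hypothetical helper. The script only prints the command unless `APPLY=1` is set.

```shell
# Placeholders: set these to the chart, release, and namespace you actually use.
CHART="${CHART:-CHART_REF_PLACEHOLDER}"
RELEASE="${RELEASE:-kgateway}"
NAMESPACE="${NAMESPACE:-kgateway-system}"

# Prints the helm arguments, including the two flags named in this issue.
kgateway_helm_args() {
  printf '%s ' upgrade --install "$RELEASE" "$CHART" --namespace "$NAMESPACE" \
    --set agentgateway.enabled=true --set inferenceExtension.enabled=true
}

if [ "${APPLY:-0}" = "1" ]; then
  # Word splitting is intentional; the values above are assumed to contain no spaces.
  helm $(kgateway_helm_args)
else
  # Dry run: show the command that would be executed.
  echo "helm $(kgateway_helm_args)"
fi
```

Keeping the default as a dry run makes it easy to paste the exact command into a follow-up issue if the agentgateway data plane does not behave as expected.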
zhengkezhou1