Skip to content

Conversation

atchernych
Copy link
Contributor

@atchernych atchernych commented Oct 13, 2025

Overview:

DEP-505

Details:

Where should the reviewer start?

Related Issues: (use one of the action keywords Closes / Fixes / Resolves / Relates to)

  • closes GitHub issue: #xxx

Summary by CodeRabbit

  • New Features

    • Enable KV-aware Dynamo routing via a new, default-on Dynamo EPP setup.
    • Support configurable EPP config file path and routing options (namespace, component, KV block size).
  • Documentation

    • Overhauled Inference Gateway guide with dual Dynamo integration paths, step-by-step installation, updated Helm commands, and expanded KV routing/feature toggle guidance.
  • Refactor

    • Renamed and consolidated Helm values: introduce epp.useDynamo and epp.dynamo.*, deprecating previous eppAware fields; added a dedicated values file for Dynamo EPP.

@atchernych atchernych requested a review from a team as a code owner October 13, 2025 20:47
@github-actions github-actions bot added the feat label Oct 13, 2025
Copy link
Contributor

coderabbitai bot commented Oct 13, 2025

Walkthrough

Documentation and Helm chart updates introduce a configurable Dynamo EPP integration. The README is rewritten to detail two routing modes and updated deployment steps. Helm templates and values switch from eppAware to epp.useDynamo with epp.dynamo settings, configurable configFile, and conditional env/volumes based on Dynamo usage.

Changes

Cohort / File(s) Summary
Docs: Inference Gateway guide
deploy/inference-gateway/README.md
Rewrites installation/usage to cover two Dynamo integration modes, new default-on Dynamo EPP setup, updated helm flows, env vars, and detailed KV routing configuration; adds values-dynamo-epp.yaml references and reorganized sections.
Helm template: Dynamo EPP
deploy/inference-gateway/helm/dynamo-gaie/templates/dynamo-epp.yaml
Replaces eppAware gating with epp.useDynamo; always uses .Values.extension.image; configFile now configurable; wraps env/volumes under useDynamo; updates env vars (adds DYN_NAMESPACE, DYNAMO_COMPONENT, DYNAMO_KV_BLOCK_SIZE).
Helm values: Dynamo routing config
deploy/inference-gateway/helm/dynamo-gaie/values.yaml, deploy/inference-gateway/values-dynamo-epp.yaml
Removes eppAware fields; adds epp.useDynamo: true, epp.configFile, and epp.dynamo.{namespace,component,kvBlockSize}; standardizes image under extension.image.

Sequence Diagram(s)

sequenceDiagram
  autonumber
  participant U as User
  participant H as Helm
  participant K as Kubernetes
  participant G as Inference Gateway
  participant E as EPP Container

  U->>H: helm upgrade --install (values incl. epp.useDynamo)
  H->>K: Apply manifests (CRDs, Deployments, ConfigMaps)
  K-->>G: Start Gateway Pod
  K-->>E: Start EPP Pod (image: extension.image)

  alt epp.useDynamo = true
    Note over E,G: Dynamo KV-aware routing enabled
    K-->>E: Mount configFile, set DYN_NAMESPACE, DYNAMO_COMPONENT, DYNAMO_KV_BLOCK_SIZE
    G->>E: Route requests via Dynamo-aware path
    E->>E: KV routing per config
  else epp.useDynamo = false
    Note over E,G: Black-box routing
    G->>E: Route requests without KV-aware features
  end
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Poem

I twitch my whiskers at charts made new,
Two paths to hop—KV or plain will do.
Helm drums roll, values align,
Configs mount neat, deployments fine.
In Dynamo burrows, keys now gleam—
I thump approval for this routing dream. 🥕🐇

Pre-merge checks

❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Description Check ⚠️ Warning The pull request description follows the template structure but contains only placeholders and lacks substantive content in the Details and Where should the reviewer start sections, so it does not inform reviewers about the actual changes or guidance on where to focus their review. Please complete the Details section with a summary of the actual documentation and helm chart changes, and fill in the Where should the reviewer start section by listing key files or areas of interest impacted by this PR.
✅ Passed checks (2 passed)
Check name Status Explanation
Title Check ✅ Passed The title succinctly summarizes the primary change by indicating that the GAIE blackbox deployment is now treated as a backup option, which aligns with the core update of defaulting to Dynamo KV routing while preserving the blackbox path. It is concise, clear, and focused on the main functional shift introduced in the changeset.
Docstring Coverage ✅ Passed No functions found in the changes. Docstring coverage check skipped.

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
deploy/inference-gateway/README.md (1)

108-116: Minor typos

  • “the the HuggingFace token” → “the HuggingFace token”.
  • “for you model” → “for your model”.
🧹 Nitpick comments (8)
deploy/inference-gateway/helm/dynamo-gaie/values.yaml (2)

51-54: Potential footgun: useDynamo defaults to true but image is standard EPP

If users don’t override extension.image, useDynamo=true will run the standard EPP image (no Dynamo plugin). Consider documenting prominently that extension.image must be set to the Dynamo EPP image when useDynamo=true, or default useDynamo to false here.


65-73: Clarify mandatory fields when useDynamo is true

Add a comment that epp.dynamo.namespace and epp.dynamo.kvBlockSize are required when useDynamo=true, mirroring README guidance.

deploy/inference-gateway/helm/dynamo-gaie/templates/dynamo-epp.yaml (4)

68-75: Make kvBlockSize required when useDynamo=true

Per README, DYNAMO_KV_BLOCK_SIZE is mandatory. Use Helm required() to avoid a silent default.

Apply this diff:

-{{- $kv := default "16" .Values.epp.dynamo.kvBlockSize -}}
+{{- $kv := required "set epp.dynamo.kvBlockSize via values" .Values.epp.dynamo.kvBlockSize -}}

28-33: Roll pods on ConfigMap changes

Add a checksum/annotation for the EPP config to trigger rollout when the ConfigMap content changes.

Example insertion under spec.template.metadata:

       labels:
         app: {{ .Values.model.shortName }}-epp
+      annotations:
+        checksum/epp-config: {{ include (print $.Template.BasePath "/epp-config-dynamo.yaml") . | sha256sum }}

Adjust the include path to your actual ConfigMap template name.


42-66: Validate image when useDynamo is enabled

Chart always uses .Values.extension.image. If useDynamo=true but image isn’t the Dynamo-enabled EPP, startup may fail. At minimum, add a values/README note; optionally gate a warning/fail in template.


101-118: Add resource requests/limits and securityContext

Define requests/limits and a minimal securityContext (runAsNonRoot, drop ALL) for stability and security.

If desired, I can propose concrete values.

deploy/inference-gateway/README.md (2)

221-227: Standard EPP image registry inconsistency

README uses us-central1-docker.pkg.dev/k8s-staging-images/... while values.yaml defaults to us-central1-docker.pkg.dev/k8s-artifacts-prod/images/....

Pick one (prefer “artifacts-prod”) to avoid confusion.


174-176: Fix heading level/style (markdownlint: MD001, MD003)

Decrement this heading to H4 (####) and use standard atx style (no closing hashes).

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between d4ff6f0 and 9c33b4b.

📒 Files selected for processing (4)
  • deploy/inference-gateway/README.md (11 hunks)
  • deploy/inference-gateway/helm/dynamo-gaie/templates/dynamo-epp.yaml (3 hunks)
  • deploy/inference-gateway/helm/dynamo-gaie/values.yaml (2 hunks)
  • deploy/inference-gateway/values-dynamo-epp.yaml (2 hunks)
🧰 Additional context used
🪛 markdownlint-cli2 (0.18.1)
deploy/inference-gateway/README.md

174-174: Heading levels should only increment by one level at a time
Expected: h4; Actual: h5

(MD001, heading-increment)


174-174: Heading style
Expected: atx; Actual: atx_closed

(MD003, heading-style)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: trtllm (arm64)
  • GitHub Check: Build and Test - dynamo
🔇 Additional comments (6)
deploy/inference-gateway/values-dynamo-epp.yaml (2)

16-18: Image name consistency with README

README references nvcr.io/.../epp-inference-extension-dynamo:v0.6.0-1, but this file uses .../gaie-epp-dynamo:v0.6.0-1. Align them to avoid pull errors.

Would you like me to update either the README or this values file to a single, verified image name?


25-31: Ensure kvBlockSize matches model block size

"DYNAMO_KV_BLOCK_SIZE" must match the model’s block size; "16" is just an example. Set this to the actual value to prevent routing issues.

deploy/inference-gateway/README.md (3)

130-134: Dynamo EPP image name mismatch

Here: nvcr.io/.../epp-inference-extension-dynamo:v0.6.0-1; values-dynamo-epp.yaml uses .../gaie-epp-dynamo:v0.6.0-1. Standardize to one verified image.

Would you like me to update both spots to the correct, published image?


206-209: build-epp-dynamo.sh is present and executable


36-41: Referenced script exists: install_gaie_crd_kgateway.sh is present in deploy/inference-gateway.

deploy/inference-gateway/helm/dynamo-gaie/templates/dynamo-epp.yaml (1)

119-127: ConfigMap and key presence confirmed
A ConfigMap named {{ include "dynamo-gaie.fullname" . }}-epp-config is defined in templates/epp-configmap.yaml with the epp-config-dynamo.yaml key.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant