Skip to content

Conversation

atchernych
Copy link
Contributor

@atchernych atchernych commented Oct 10, 2025

Overview:

DEP-504

Details:

Where should the reviewer start?

Related Issues: (use one of the action keywords Closes / Fixes / Resolves / Relates to)

  • closes GitHub issue: #xxx

Summary by CodeRabbit

  • New Features

    • Added Dynamo-powered plugins: a pre-request worker ID injector and a KV-aware scorer to enhance routing and scoring capabilities.
  • Chores

    • Introduced a multi-stage Docker build with a minimal, non-root runtime including required libraries and certificates.
    • Updated build scripts and messaging to align with GAIE terminology, with validation for required Docker artifacts.
    • Bumped version to v0.5.1.

Signed-off-by: Anna Tchernych <[email protected]>
@atchernych atchernych requested review from a team as code owners October 10, 2025 19:05
@github-actions github-actions bot added the feat label Oct 10, 2025
Signed-off-by: Anna Tchernych <[email protected]>
Copy link
Contributor

coderabbitai bot commented Oct 10, 2025

Walkthrough

Introduces a multi-stage Dockerfile for building/running the GAIE EPP binary, updates the GAIE build script to use GAIE paths and copy a Dockerfile artifact, and applies a patch adding Dynamo-based pre-request and KV scorer plugins plus a version bump to v0.5.1.

Changes

Cohort / File(s) Summary of changes
Container build artifact
container/Dockerfile.epp
Adds multi-stage Dockerfile: Go-based builder compiles EPP with ldflags; runtime on minimal Ubuntu installs deps, sets nonroot user, copies /epp, sets entrypoint.
GAIE build script adjustments
deploy/inference-gateway/build-epp-dynamo.sh
Replaces EPP_DIR with GAIE_DIR; updates paths/messages; prepares dirs; copies headers, libs, and Dockerfile.epp; validates copied Dockerfile; applies patches in GAIE_DIR; updates build messages.
Dynamo plugins + version bump (patch)
deploy/inference-gateway/epp-patches/epp-v0.5.1-2/epp-v0.5.1-dyn2.patch
Adds public Dynamo pre-request plugin (inject worker ID) and KV-aware scorer plugin with factories and a TypedName method; updates BundleVersion to v0.5.1.

Sequence Diagram(s)

sequenceDiagram
  autonumber
  participant CLI as Build Script (GAIE)
  participant Git as Patch Set
  participant Docker as Docker Builder
  participant Img as GAIE EPP Image

  CLI->>CLI: Prepare GAIE dirs (lib/include)
  CLI->>CLI: Copy headers, libs, Dockerfile.epp
  CLI->>Git: Apply epp-v0.5.1-dyn2.patch
  alt Patch applied
    CLI-->>CLI: Proceed
  else Patch already applied / skipped
    CLI-->>CLI: Continue without error
  end
  CLI->>Docker: docker build -f container/Dockerfile.epp
  Docker->>Docker: Build stage (golang 1.24) compile epp
  Docker->>Docker: Runtime stage (ubuntu) add deps, user
  Docker-->>Img: Produce GAIE EPP image
Loading
sequenceDiagram
  autonumber
  participant Client
  participant Gateway as Inference Gateway
  participant PreReq as Pre-Request Plugin (dynamo_inject_workerid)
  participant Router as Router
  participant Scorer as KV Scorer (dynamo_kv_scorer)
  participant Backend

  Client->>Gateway: Request
  Gateway->>PreReq: Execute pre-request
  note right of PreReq: Inject worker ID from Dynamo
  PreReq-->>Gateway: Mutated request
  Gateway->>Router: Route candidates
  Router->>Scorer: Score candidates (KV-aware)
  note right of Scorer: Uses KV signals for scoring
  Scorer-->>Router: Ranked candidates
  Router->>Backend: Dispatch
  Backend-->>Client: Response
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Poem

I thump my paws—new plugs in tow,
A Docker burrow built to go.
KV whispers guide the way,
Worker IDs join the fray.
Version hops to five-point-one,
GAIE’s gears now neatly run—
Carrots compiled, ship it, done! 🥕🚀

Pre-merge checks

❌ Failed checks (2 warnings)
Check name Status Explanation Resolution
Title Check ⚠️ Warning The provided title “feat: move the EPP build docker file” is too narrow and doesn’t capture the primary changes in this pull request, which include adding a multi-stage build Dockerfile, updating GAIE integration scripts, and introducing new Dynamo plugins. The wording suggests only a file move rather than a significant feature addition and refactoring. As a result it misrepresents the scope and intent of the changeset. Please update the title to concisely reflect the main feature and scope of the changes, for example “feat: introduce multi-stage GAIE Dockerfile build and Dynamo-based EPP plugins” so that it clearly summarizes the core updates.
Description Check ⚠️ Warning The pull request description contains only a placeholder overview and empty template sections, without any actual details about the changes made or guidance on where reviewers should start. It fails to describe the modifications to the Dockerfile, build scripts, patch files, and plugin code, leaving the reader without context. The related issues section also references a placeholder rather than a valid issue number. Please complete the template sections by providing a meaningful overview of the work, summarizing the specific changes in the “Details” section, indicating key files or components in “Where should the reviewer start?”, and verifying that the “Related Issues” entry references the actual issue number.
✅ Passed checks (1 passed)
Check name Status Explanation
Docstring Coverage ✅ Passed No functions found in the changes. Docstring coverage check skipped.

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 7

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (3)
deploy/inference-gateway/epp-patches/epp-v0.5.1-2/epp-v0.5.1-dyn2.patch (2)

27-38: Dockerfile name mismatch: Makefile expects Dockerfile.dynamo but repo/script provide Dockerfile.epp

Target uses “-f Dockerfile.dynamo” while the artifact is Dockerfile.epp. Builds will fail.

Options:

  • Rename/copy the file as Dockerfile.dynamo, or
  • Change the Makefile target to use container/Dockerfile.epp, e.g.:
- $(IMAGE_BUILD_CMD) -f Dockerfile.dynamo -t $(IMAGE_TAG) \
+ $(IMAGE_BUILD_CMD) -f container/Dockerfile.epp -t $(IMAGE_TAG) \

Keep build script and Makefile consistent.


803-806: Per-request state stored on Server struct causes data races and cross-request leakage

workerIDHint/tokenDataHint live on Server and are mutated per request. Process can handle multiple streams concurrently; this invites races and wrong association across requests.

Refactor to store per-request state:

  • Keep hints in a per-stream/request context struct and pass to handler phases, or
  • Use gRPC stream context to carry values between header/body phases, or
  • Echo via ext_proc dynamic metadata between phases.
    Avoid shared mutable fields on Server for request-local data.

Also applies to: 765-792

deploy/inference-gateway/build-epp-dynamo.sh (1)

86-96: Patch path likely incorrect

Repo path shows deploy/inference-gateway/epp-patches/epp-v0.5.1-2/epp-v0.5.1-dyn2.patch but script uses epp-patches/v0.5.1-2/...

-PATCH_FILE="${DYNAMO_DIR}/deploy/inference-gateway/epp-patches/v0.5.1-2/epp-v0.5.1-dyn2.patch"
+PATCH_FILE="${DYNAMO_DIR}/deploy/inference-gateway/epp-patches/epp-v0.5.1-2/epp-v0.5.1-dyn2.patch"
🧹 Nitpick comments (3)
deploy/inference-gateway/epp-patches/epp-v0.5.1-2/epp-v0.5.1-dyn2.patch (2)

615-645: Base64 token_data decode silently ignores malformed inputs

Non-blocking, but consider emitting a debug log when decode/unmarshal fails to aid diagnosis.


172-174: Nit: flag name in error message

Message uses “model-server-metrics-scheme”; actual flag is “modelServerMetricsScheme”. Align for clarity.

- return fmt.Errorf("unexpected %q value for %q flag, it can only be set to 'http' or 'https'", *modelServerMetricsScheme, "model-server-metrics-scheme")
+ return fmt.Errorf("unexpected %q value for %q flag, it can only be set to 'http' or 'https'", *modelServerMetricsScheme, "modelServerMetricsScheme")
deploy/inference-gateway/build-epp-dynamo.sh (1)

17-17: Harden script error handling

Add -u and pipefail to catch unset vars and pipeline failures.

-set -e  # Exit on any error
+set -euo pipefail  # Exit on error, unset vars, and pipeline errors
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between a90ada1 and 865c8ef.

📒 Files selected for processing (3)
  • container/Dockerfile.epp (1 hunks)
  • deploy/inference-gateway/build-epp-dynamo.sh (3 hunks)
  • deploy/inference-gateway/epp-patches/epp-v0.5.1-2/epp-v0.5.1-dyn2.patch (5 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
  • GitHub Check: trtllm (amd64)
  • GitHub Check: sglang
  • GitHub Check: vllm (amd64)
  • GitHub Check: Build and Test - dynamo
🔇 Additional comments (4)
container/Dockerfile.epp (1)

51-52: Verify ldflags package path for CommitSHA/BuildRef

The path points to pkg/epp/metrics. The PR changes show metrics code under pkg/epp/backend/metrics. Ensure the target variables actually exist to avoid “symbol not found” or unused ldflags.

If the variables live elsewhere, update the path accordingly.

deploy/inference-gateway/epp-patches/epp-v0.5.1-2/epp-v0.5.1-dyn2.patch (2)

1306-1324: KV router pipeline global: confirm thread-safety of C API

The pipeline is a shared global used under RLock; multiple concurrent Score calls can hit dynamo_query_worker_selection_and_annotate. Verify the C pipeline is thread-safe; otherwise, serialize calls or create per-goroutine instances.

If not thread-safe, protect with a mutex around the C call or implement a worker pool.

Also applies to: 1476-1527


949-970: Prefix indexer Get: LGTM

Deep-copying the pod set under RLock avoids races. Good fix.

deploy/inference-gateway/build-epp-dynamo.sh (1)

37-48: Validate the produced static lib name

Script copies libdynamo_llm_capi.a after building “-p libdynamo_llm”. Ensure the crate produces libdynamo_llm_capi.a (staticlib). If not, build the correct crate/package.

If needed:

-cargo build --release -p libdynamo_llm
+cargo build --release -p libdynamo_llm_capi

Or adjust the copied filename to match Cargo output.

Also applies to: 58-59

Signed-off-by: Anna Tchernych <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant