[pull] main from llm-d:main #129

Open
pull[bot] wants to merge 70 commits into opendatahub-io:main from llm-d:main
@pull pull bot commented Feb 18, 2026

See Commits and Changes for more details.


Created by pull[bot] (v2.0.0-alpha.4)

* feat: update to use new kv-cache UDS tokenizer

- change preprocessing to types from kv-cache
- add new unit test case: same tests from the old ones
- keep the old test case but mark it as unused (for now; it can be
  removed later)
- add new make target to build UDS image
- update image to use the one from llm-d in deploy
- remove parts in Dockerfile to only build go code

Signed-off-by: Wen Zhou <wenzhou@redhat.com>

* update: more changes for UDS in makefile and docs

Signed-off-by: Wen Zhou <wenzhou@redhat.com>

* update: add comments for download-tokenizer and remove it as a dependency
of build

Signed-off-by: Wen Zhou <wenzhou@redhat.com>

* GHAction: remove lint-and-test, which still uses Python

Signed-off-by: Wen Zhou <wenzhou@redhat.com>

* update: fix rebase

Signed-off-by: Wen Zhou <wenzhou@redhat.com>

* fix: lint with 2.8.0

Signed-off-by: Wen Zhou <wenzhou@redhat.com>

* update: code review

- remove env variable
	LDFLAGS
	PYTHON_CONFIG
	CGO_CFLAGS
	TOKENIZER_ARCH
	PYTHON_VERSION
	epp_* and sidecar_* for CGO
- update documentation
- remove make targets related to tokenizer, python

Signed-off-by: Wen Zhou <wenzhou@redhat.com>

---------

Signed-off-by: Wen Zhou <wenzhou@redhat.com>
@pull pull bot locked and limited conversation to collaborators Feb 18, 2026
@pull pull bot added ⤵️ pull merge-conflict Resolve conflicts manually labels Feb 18, 2026
Signed-off-by: Alberto Perdomo <aperdomo@redhat.com>
* Fix panic in SGLang proxy handling of concurrent requests

Signed-off-by: YANG LI <yangligt@google.com>

* Add concurrency unit test for SGLang context logic

Signed-off-by: YANG LI <yangligt@google.com>

---------

Signed-off-by: YANG LI <yangligt@google.com>
* Add opentelemetry tracing

    Add centralized telemetry package and custom spans
    following the llm-d distributed tracing proposal.

Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: sallyom <somalley@redhat.com>

* update Dockerfile.sidecar

Signed-off-by: sallyom <somalley@redhat.com>

* tracing: remove extra success results & startup spans and cleanup

Signed-off-by: sallyom <somalley@redhat.com>

* fix: avoid os.Exit bypassing defer in main

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: sallyom <somalley@redhat.com>
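
A common Go pattern behind a fix of this kind is to move the program body into a helper that returns an error, so that deferred cleanup runs before `os.Exit` is reached. The sketch below is illustrative only: `run` and the cleanup callback are hypothetical names, not code from this PR.

```go
package main

import (
	"errors"
	"fmt"
	"os"
)

// run holds the program body and returns an error instead of exiting,
// so its deferred calls execute on every return path.
// (Hypothetical helper, not the project's actual function.)
func run(fail bool, cleanup func()) error {
	defer cleanup() // e.g. flushing a tracer provider
	if fail {
		return errors.New("startup failed")
	}
	return nil
}

func main() {
	cleaned := false
	err := run(false, func() { cleaned = true })
	fmt.Println("cleanup ran:", cleaned)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1) // safe here: all defers inside run have already executed
	}
}
```

Calling `os.Exit` directly inside a function that also has `defer` statements would terminate the process without running them, which is why the exit call is kept in `main` after `run` has returned.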

* fix: address review nits for tracing PR

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: sallyom <somalley@redhat.com>

* test: add edge case tests for StripScheme

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: sallyom <somalley@redhat.com>

* remove extra comments from sidecar spans

Signed-off-by: sallyom <somalley@redhat.com>

* fix lint error

Signed-off-by: sallyom <somalley@redhat.com>

* protect against segfault on tests

Signed-off-by: greg pereira <grpereir@redhat.com>

---------

Signed-off-by: sallyom <somalley@redhat.com>
Signed-off-by: greg pereira <grpereir@redhat.com>
Co-authored-by: Claude <noreply@anthropic.com>
Co-authored-by: greg pereira <grpereir@redhat.com>
vMaroon and others added 12 commits February 27, 2026 06:59
Signed-off-by: Maroon Ayoub <maroon.ayoub@ibm.com>
Signed-off-by: greg pereira <grpereir@redhat.com>
Bumps [go.opentelemetry.io/otel/sdk](https://github.com/open-telemetry/opentelemetry-go) from 1.39.0 to 1.40.0.
- [Release notes](https://github.com/open-telemetry/opentelemetry-go/releases)
- [Changelog](https://github.com/open-telemetry/opentelemetry-go/blob/main/CHANGELOG.md)
- [Commits](open-telemetry/opentelemetry-go@v1.39.0...v1.40.0)

---
updated-dependencies:
- dependency-name: go.opentelemetry.io/otel/sdk
  dependency-version: 1.40.0
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
…dates (#662)

Bumps the go-dependencies group with 2 updates in the / directory: [go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp](https://github.com/open-telemetry/opentelemetry-go-contrib) and [go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracegrpc](https://github.com/open-telemetry/opentelemetry-go).


Updates `go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp` from 0.64.0 to 0.65.0
- [Release notes](https://github.com/open-telemetry/opentelemetry-go-contrib/releases)
- [Changelog](https://github.com/open-telemetry/opentelemetry-go-contrib/blob/main/CHANGELOG.md)
- [Commits](open-telemetry/opentelemetry-go-contrib@zpages/v0.64.0...zpages/v0.65.0)

Updates `go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracegrpc` from 1.39.0 to 1.40.0
- [Release notes](https://github.com/open-telemetry/opentelemetry-go/releases)
- [Changelog](https://github.com/open-telemetry/opentelemetry-go/blob/main/CHANGELOG.md)
- [Commits](open-telemetry/opentelemetry-go@v1.39.0...v1.40.0)

---
updated-dependencies:
- dependency-name: go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp
  dependency-version: 0.65.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: go-dependencies
- dependency-name: go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracegrpc
  dependency-version: 1.40.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: go-dependencies
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Signed-off-by: Guangya Liu <gyliu513@gmail.com>
Signed-off-by: learner0810 <zhongjun.li@daocloud.io>
…build (#664)

Signed-off-by: Guangya Liu <gyliu513@gmail.com>
Signed-off-by: Guangya Liu <gyliu513@gmail.com>
Bumps the kubernetes group with 5 updates:

| Package | From | To |
| --- | --- | --- |
| [k8s.io/api](https://github.com/kubernetes/api) | `0.34.4` | `0.34.5` |
| [k8s.io/apiextensions-apiserver](https://github.com/kubernetes/apiextensions-apiserver) | `0.34.4` | `0.34.5` |
| [k8s.io/apimachinery](https://github.com/kubernetes/apimachinery) | `0.34.4` | `0.34.5` |
| [k8s.io/client-go](https://github.com/kubernetes/client-go) | `0.34.4` | `0.34.5` |
| [k8s.io/component-base](https://github.com/kubernetes/component-base) | `0.34.4` | `0.34.5` |


Updates `k8s.io/api` from 0.34.4 to 0.34.5
- [Commits](kubernetes/api@v0.34.4...v0.34.5)

Updates `k8s.io/apiextensions-apiserver` from 0.34.4 to 0.34.5
- [Release notes](https://github.com/kubernetes/apiextensions-apiserver/releases)
- [Commits](kubernetes/apiextensions-apiserver@v0.34.4...v0.34.5)

Updates `k8s.io/apimachinery` from 0.34.4 to 0.34.5
- [Commits](kubernetes/apimachinery@v0.34.4...v0.34.5)

Updates `k8s.io/client-go` from 0.34.4 to 0.34.5
- [Changelog](https://github.com/kubernetes/client-go/blob/master/CHANGELOG.md)
- [Commits](kubernetes/client-go@v0.34.4...v0.34.5)

Updates `k8s.io/component-base` from 0.34.4 to 0.34.5
- [Commits](kubernetes/component-base@v0.34.4...v0.34.5)

---
updated-dependencies:
- dependency-name: k8s.io/api
  dependency-version: 0.34.5
  dependency-type: direct:production
  update-type: version-update:semver-patch
  dependency-group: kubernetes
- dependency-name: k8s.io/apiextensions-apiserver
  dependency-version: 0.34.5
  dependency-type: direct:production
  update-type: version-update:semver-patch
  dependency-group: kubernetes
- dependency-name: k8s.io/apimachinery
  dependency-version: 0.34.5
  dependency-type: direct:production
  update-type: version-update:semver-patch
  dependency-group: kubernetes
- dependency-name: k8s.io/client-go
  dependency-version: 0.34.5
  dependency-type: direct:production
  update-type: version-update:semver-patch
  dependency-group: kubernetes
- dependency-name: k8s.io/component-base
  dependency-version: 0.34.5
  dependency-type: direct:production
  update-type: version-update:semver-patch
  dependency-group: kubernetes
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
…dates (#674)

Bumps the go-dependencies group with 2 updates in the / directory: [go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp](https://github.com/open-telemetry/opentelemetry-go-contrib) and [go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracegrpc](https://github.com/open-telemetry/opentelemetry-go).


Updates `go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp` from 0.65.0 to 0.66.0
- [Release notes](https://github.com/open-telemetry/opentelemetry-go-contrib/releases)
- [Changelog](https://github.com/open-telemetry/opentelemetry-go-contrib/blob/main/CHANGELOG.md)
- [Commits](open-telemetry/opentelemetry-go-contrib@zpages/v0.65.0...zpages/v0.66.0)

Updates `go.opentelemetry.io/otel` from 1.40.0 to 1.41.0
- [Release notes](https://github.com/open-telemetry/opentelemetry-go/releases)
- [Changelog](https://github.com/open-telemetry/opentelemetry-go/blob/main/CHANGELOG.md)
- [Commits](open-telemetry/opentelemetry-go@v1.40.0...v1.41.0)

Updates `go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracegrpc` from 1.40.0 to 1.41.0
- [Release notes](https://github.com/open-telemetry/opentelemetry-go/releases)
- [Changelog](https://github.com/open-telemetry/opentelemetry-go/blob/main/CHANGELOG.md)
- [Commits](open-telemetry/opentelemetry-go@v1.40.0...v1.41.0)

Updates `go.opentelemetry.io/otel/sdk` from 1.40.0 to 1.41.0
- [Release notes](https://github.com/open-telemetry/opentelemetry-go/releases)
- [Changelog](https://github.com/open-telemetry/opentelemetry-go/blob/main/CHANGELOG.md)
- [Commits](open-telemetry/opentelemetry-go@v1.40.0...v1.41.0)

Updates `go.opentelemetry.io/otel/trace` from 1.40.0 to 1.41.0
- [Release notes](https://github.com/open-telemetry/opentelemetry-go/releases)
- [Changelog](https://github.com/open-telemetry/opentelemetry-go/blob/main/CHANGELOG.md)
- [Commits](open-telemetry/opentelemetry-go@v1.40.0...v1.41.0)

---
updated-dependencies:
- dependency-name: go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp
  dependency-version: 0.66.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: go-dependencies
- dependency-name: go.opentelemetry.io/otel
  dependency-version: 1.41.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: go-dependencies
- dependency-name: go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracegrpc
  dependency-version: 1.41.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: go-dependencies
- dependency-name: go.opentelemetry.io/otel/sdk
  dependency-version: 1.41.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: go-dependencies
- dependency-name: go.opentelemetry.io/otel/trace
  dependency-version: 1.41.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: go-dependencies
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Bumps [lycheeverse/lychee-action](https://github.com/lycheeverse/lychee-action) from 2.7.0 to 2.8.0.
- [Release notes](https://github.com/lycheeverse/lychee-action/releases)
- [Commits](lycheeverse/lychee-action@v2.7.0...v2.8.0)

---
updated-dependencies:
- dependency-name: lycheeverse/lychee-action
  dependency-version: 2.8.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
* ci: add dev image workflow for main and release branches

Build and push -dev variants of EPP and sidecar container images
on pushes to main and release-* branches, tagged with commit SHA.

Signed-off-by: Pierangelo Di Pilato <pierdipi@redhat.com>

* ci: extract reusable build workflow and tag dev images by branch

Refactor ci-release and ci-dev to call a shared ci-build-images
reusable workflow, reducing duplication. Tag dev images with the
branch name instead of commit SHA so each branch has exactly one
image that gets overwritten on push, avoiding image accumulation.

Signed-off-by: Pierangelo Di Pilato <pierdipi@redhat.com>

* Newlines at EOF

Signed-off-by: Pierangelo Di Pilato <pierdipi@redhat.com>

---------

Signed-off-by: Pierangelo Di Pilato <pierdipi@redhat.com>
@openshift-merge-robot
PR needs rebase.

shmuelk and others added 29 commits March 18, 2026 11:45
…1.28 (#727)

* Corrected configuration file

Signed-off-by: Shmuel Kallner <kallner@il.ibm.com>

* No longer set primaryPort plugin parameter

Signed-off-by: Shmuel Kallner <kallner@il.ibm.com>

* Deprecate use of primaryPort plugin parameter

Signed-off-by: Shmuel Kallner <kallner@il.ibm.com>

* Deprecate use of the x-data-parallel-host-port header

Signed-off-by: Shmuel Kallner <kallner@il.ibm.com>

* Add a real plugin Handle to the DP tests

Signed-off-by: Shmuel Kallner <kallner@il.ibm.com>

* Make sure user understands that Istio >= 1.28.1 is needed

Signed-off-by: Shmuel Kallner <kallner@il.ibm.com>

---------

Signed-off-by: Shmuel Kallner <kallner@il.ibm.com>
* build: remove CGO dependency by migrating to pure-Go ZMQ

Update llm-d-kv-cache to v0.6.1-0.20260317063900-80aba2cb5a99 (main snapshot),
pending merge of llm-d/llm-d-kv-cache#431 which switches from pebbe/zmq4 (CGO)
to a pure-Go ZMQ implementation.

- go.mod/go.sum: bump kv-cache to current main pseudo-version (placeholder;
  will be updated to a real tag once llm-d/llm-d-kv-cache#431 is merged)
- Makefile: set CGO_ENABLED=0; drop check-dependencies prereq from test targets
- Makefile.tools.mk: remove ##@ Dependencies section (check/install-dependencies)
- Dockerfile.epp: remove EPEL + zeromq install steps; set CGO_ENABLED=0 in build
- DEVELOPMENT.md: remove ZeroMQ from prerequisites list
- .github/workflows/ci-pr-checks.yaml: remove CGO configuration and
  install-dependencies steps; remove CGO env vars from lint step

NOTE: This commit is intentionally draft — go.mod must be updated to the
tagged kv-cache release that includes the pure-Go ZMQ changes before merging.

Signed-off-by: Etai Lev Ran <elevran@gmail.com>

* update kv-cache (pure Go zmq, tip of main)

Signed-off-by: Etai Lev Ran <elevran@gmail.com>

* revert to micro image once CGO is disabled

Signed-off-by: Etai Lev Ran <elevran@gmail.com>

---------

Signed-off-by: Etai Lev Ran <elevran@gmail.com>
Bumps [google.golang.org/grpc](https://github.com/grpc/grpc-go) from 1.79.2 to 1.79.3.
- [Release notes](https://github.com/grpc/grpc-go/releases)
- [Commits](grpc/grpc-go@v1.79.2...v1.79.3)

---
updated-dependencies:
- dependency-name: google.golang.org/grpc
  dependency-version: 1.79.3
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Signed-off-by: Etai Lev Ran <elevran@gmail.com>
Add idleThreshold and maxBusyScore parameters to create a scoring gap
between idle and busy pods, helping distribute prefix cache warmup.

- idleThreshold: max requests to be considered idle (default: 0)
- maxBusyScore: max score for busy pods (default: 1.0 for current behavior)

Examples:
- Binary mode: idleThreshold=0, maxBusyScore=0 (idle=1.0, busy=0.0)
- Hybrid mode: idleThreshold=0, maxBusyScore=0.5 (idle=1.0, busy=0-0.5)
- Flexible: idleThreshold=2, maxBusyScore=0.5 (≤2 req = idle)
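
One way to read these parameters, as a sketch only: a pod with at most `idleThreshold` in-flight requests scores 1.0, and busier pods are scaled into the range from 0 to `maxBusyScore`. The function name `scorePod`, the `maxRequests` normalizer, and the linear interpolation are assumptions made for illustration; the commit message does not say how busy pods are ranked within their band.

```go
package main

import "fmt"

// scorePod is a hypothetical illustration of the idle/busy scoring gap.
//   requests:      in-flight requests on this pod
//   idleThreshold: max requests for a pod to count as idle
//   maxRequests:   highest in-flight count among candidates (assumed normalizer)
//   maxBusyScore:  upper bound on the score of a busy pod
func scorePod(requests, idleThreshold, maxRequests int, maxBusyScore float64) float64 {
	if requests <= idleThreshold {
		return 1.0 // idle pods always receive the top score
	}
	if maxRequests < requests {
		maxRequests = requests // keep the normalizer consistent with the input
	}
	// Position of this pod within the busy range, in (0, 1].
	load := float64(requests-idleThreshold) / float64(maxRequests-idleThreshold)
	return maxBusyScore * (1 - load)
}

func main() {
	fmt.Println("binary  :", scorePod(0, 0, 10, 0), scorePod(5, 0, 10, 0))
	fmt.Println("hybrid  :", scorePod(0, 0, 10, 0.5), scorePod(5, 0, 10, 0.5))
	fmt.Println("flexible:", scorePod(2, 2, 10, 0.5), scorePod(6, 2, 10, 0.5))
}
```

With `maxBusyScore` below 1.0, no busy pod can tie with an idle one, which is the scoring gap the commit describes.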

Signed-off-by: David Whyte-Gray <40244437+dagrayvid@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
* Docker build enhancements

- reorder steps so go.mod/go.sum are in a cacheable layer (ie when not modified)
- strip debug info by default (stacktraces are NOT affected). Can override with LD_FLAGS
- cache CICD on GH action
- other nits and clean ups (e.g., unused Python arg, comments)

Signed-off-by: Etai Lev Ran <elevran@gmail.com>

* pass GO build vars inline, not as ENV settings

Signed-off-by: Etai Lev Ran <elevran@gmail.com>

* allow EPP and sidecar images to run concurrently

Signed-off-by: Etai Lev Ran <elevran@gmail.com>

---------

Signed-off-by: Etai Lev Ran <elevran@gmail.com>
* feat: speculative indexing for PrecisePrefixCacheScorer

Signed-off-by: bongwoobak <bongwoobak@gmail.com>

* fix: use ip:port format for PodIdentifier to match KV event topics

Signed-off-by: bongwoobak <bongwoobak@gmail.com>

* fix: split Address/Port in test endpoints to match production ip:port format

Signed-off-by: bongwoobak <bongwoobak@gmail.com>

* make SpeculativeIndexing optional

Signed-off-by: bongwoobak <bongwoobak@gmail.com>

* feat: use confirmed-only scores for PrefixCacheServers cycle state

Signed-off-by: bongwoobak <bongwoobak@gmail.com>

* refactor: update PodEntry usage for Annotations struct

Signed-off-by: bongwoobak <bongwoobak@gmail.com>

* refactor: replace speculativeCache.Start() with cleanCachePeriodically()

Signed-off-by: bongwoobak <bongwoobak@gmail.com>

* fix: add nil metadata guard in PrepareRequestData

Signed-off-by: bongwoobak <bongwoobak@gmail.com>

* refactor: remove confirmedScores and simplify Annotations usage

Signed-off-by: bongwoobak <bongwoobak@gmail.com>

* fix: adapt to NewChunkedTokenDatabase signature change from PR 415

Signed-off-by: bongwoobak <bongwoobak@gmail.com>

* refactor: use KVBlockScorer for scoring and human-readable speculativeTTL

Replace computeScoresFromKeyToPods with kvcache.KVBlockScorer.Score()
to properly integrate device-backend weight configuration.

Change SpeculativeTTL from time.Duration (nanoseconds) to string,
parsed via time.ParseDuration, for human-readable config values
like "2s" or "500ms".
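
A minimal sketch of that parsing step, using the standard library's `time.ParseDuration`. The wrapper name `parseSpeculativeTTL` and the positivity check are assumptions for illustration, not the plugin's actual code.

```go
package main

import (
	"fmt"
	"time"
)

// parseSpeculativeTTL converts a config string such as "2s" into a duration.
// The name and the positivity check are illustrative assumptions.
func parseSpeculativeTTL(s string) (time.Duration, error) {
	d, err := time.ParseDuration(s)
	if err != nil {
		return 0, fmt.Errorf("invalid speculativeTTL %q: %w", s, err)
	}
	if d <= 0 {
		return 0, fmt.Errorf("speculativeTTL must be positive, got %s", d)
	}
	return d, nil
}

func main() {
	// A bare number without a unit is rejected by time.ParseDuration,
	// which avoids silently treating it as nanoseconds.
	for _, s := range []string{"2s", "500ms", "2000000000"} {
		d, err := parseSpeculativeTTL(s)
		fmt.Printf("%-12q duration=%v err=%v\n", s, d, err)
	}
}
```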

Compact docs/configuration.md per review feedback.

Signed-off-by: bongwoobak <bongwoobak@gmail.com>

* fix: gofmt import ordering in precise_prefix_cache.go

Signed-off-by: bongwoobak <bongwoobak@gmail.com>

* docs: move speculative indexing config into architecture.md

Signed-off-by: bongwoobak <bongwoobak@gmail.com>

---------

Signed-off-by: bongwoobak <bongwoobak@gmail.com>
Signed-off-by: roytman <roytman@il.ibm.com>
Signed-off-by: Sam Batschelet <sbatsche@redhat.com>
Signed-off-by: Sam Batschelet <sbatsche@redhat.com>
* The UDS tokenizer sidecar image is now treated as an external dependency built and
published by [llm-d-kv-cache](https://github.com/llm-d/llm-d-kv-cache).
A companion PR (llm-d/llm-d-kv-cache#436) adds a workflow
to publish `ghcr.io/llm-d/llm-d-uds-tokenizer:dev` on every push to `main`,
matching the `:dev` default used here.

Changes:
- Removed the `image-build-uds-tokenizer` Makefile target.
- `image-build` and `test-e2e` no longer depend on building the tokenizer image;
  `image-pull` already handles pulling it.
- UDS tokenizer image variables now follow the same pattern as other images.
  Variables and overrides work consistently across all scripts and Makefile targets.
- Removed `jlumbroso/free-disk-space@main` as it is no longer needed after the CGO/ZMQ and Python removal.

Depends on llm-d/llm-d-kv-cache#436
The companion kv-cache PR must merge (or the `:dev` tag must exist in GHCR) before
 `make image-pull` or `make env-dev-kind` will succeed without an explicit
 `UDS_TOKENIZER_IMAGE` override.

Signed-off-by: Etai Lev Ran <elevran@gmail.com>

* standardize image loading across tools and architectures

Signed-off-by: Etai Lev Ran <elevran@gmail.com>

---------

Signed-off-by: Etai Lev Ran <elevran@gmail.com>
Bumps the kubernetes group with 5 updates:

| Package | From | To |
| --- | --- | --- |
| [k8s.io/api](https://github.com/kubernetes/api) | `0.35.2` | `0.35.3` |
| [k8s.io/apiextensions-apiserver](https://github.com/kubernetes/apiextensions-apiserver) | `0.35.2` | `0.35.3` |
| [k8s.io/apimachinery](https://github.com/kubernetes/apimachinery) | `0.35.2` | `0.35.3` |
| [k8s.io/client-go](https://github.com/kubernetes/client-go) | `0.35.2` | `0.35.3` |
| [k8s.io/component-base](https://github.com/kubernetes/component-base) | `0.35.2` | `0.35.3` |


Updates `k8s.io/api` from 0.35.2 to 0.35.3
- [Commits](kubernetes/api@v0.35.2...v0.35.3)

Updates `k8s.io/apiextensions-apiserver` from 0.35.2 to 0.35.3
- [Release notes](https://github.com/kubernetes/apiextensions-apiserver/releases)
- [Commits](kubernetes/apiextensions-apiserver@v0.35.2...v0.35.3)

Updates `k8s.io/apimachinery` from 0.35.2 to 0.35.3
- [Commits](kubernetes/apimachinery@v0.35.2...v0.35.3)

Updates `k8s.io/client-go` from 0.35.2 to 0.35.3
- [Changelog](https://github.com/kubernetes/client-go/blob/master/CHANGELOG.md)
- [Commits](kubernetes/client-go@v0.35.2...v0.35.3)

Updates `k8s.io/component-base` from 0.35.2 to 0.35.3
- [Commits](kubernetes/component-base@v0.35.2...v0.35.3)

---
updated-dependencies:
- dependency-name: k8s.io/api
  dependency-version: 0.35.3
  dependency-type: direct:production
  update-type: version-update:semver-patch
  dependency-group: kubernetes
- dependency-name: k8s.io/apiextensions-apiserver
  dependency-version: 0.35.3
  dependency-type: direct:production
  update-type: version-update:semver-patch
  dependency-group: kubernetes
- dependency-name: k8s.io/apimachinery
  dependency-version: 0.35.3
  dependency-type: direct:production
  update-type: version-update:semver-patch
  dependency-group: kubernetes
- dependency-name: k8s.io/client-go
  dependency-version: 0.35.3
  dependency-type: direct:production
  update-type: version-update:semver-patch
  dependency-group: kubernetes
- dependency-name: k8s.io/component-base
  dependency-version: 0.35.3
  dependency-type: direct:production
  update-type: version-update:semver-patch
  dependency-group: kubernetes
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
* test: add disruption e2e tests for scheduler failure scenarios

Signed-off-by: Sam Batschelet <sbatsche@redhat.com>

* test: address disruption test flakes

Signed-off-by: Sam Batschelet <sbatsche@redhat.com>

---------

Signed-off-by: Sam Batschelet <sbatsche@redhat.com>
* Initial E/PD extension of the plugins.

Signed-off-by: Revital Sur <eres@il.ibm.com>

* Changes to Pick function.

Signed-off-by: Revital Sur <eres@il.ibm.com>

* Minor fix.

Signed-off-by: Revital Sur <eres@il.ibm.com>

* Rename files and structs.

Signed-off-by: Revital Sur <eres@il.ibm.com>

* Rename ed_prerequest.go

Signed-off-by: Revital Sur <eres@il.ibm.com>

* Fix lint errors.

Signed-off-by: Revital Sur <eres@il.ibm.com>

* initial implementation of unified handler

Signed-off-by: roytman <roytman@il.ibm.com>

* clean up code, renaming

Signed-off-by: roytman <roytman@il.ibm.com>

* fix renaming in common

Signed-off-by: roytman <roytman@il.ibm.com>

* remove prefixPluginTypedName

Signed-off-by: roytman <roytman@il.ibm.com>

* update according to review comments

Signed-off-by: roytman <roytman@il.ibm.com>

* fix lint, add deciderPlugin.go and prefix_based_pd_decider_test.go

Signed-off-by: roytman <roytman@il.ibm.com>

* mark pd_profile_handler as deprecated and remove PrefixPluginName, PrefixPluginType from disagg_profile_handler

Signed-off-by: roytman <roytman@il.ibm.com>

* fix lint

Signed-off-by: roytman <roytman@il.ibm.com>

* Initialize all profile defaults in DisaggProfileHandlerFactory struct.

Signed-off-by: Revital Sur <eres@il.ibm.com>

* Add nil guards to profile handlers

Signed-off-by: roytman <roytman@il.ibm.com>

* Add e2e test for disagg-profile-handler PD config.

Signed-off-by: Revital Sur <eres@il.ibm.com>

* Add e2e tests.

Signed-off-by: Revital Sur <eres@il.ibm.com>

* Fix e2e tests.

Signed-off-by: Revital Sur <eres@il.ibm.com>

* Address review comments.

Signed-off-by: Revital Sur <eres@il.ibm.com>

* Fix lint error.

Signed-off-by: Revital Sur <eres@il.ibm.com>

* add comment about linter

Signed-off-by: roytman <roytman@il.ibm.com>

* add disagg_metrics.go, deprecate metrics.go.

Signed-off-by: roytman <roytman@il.ibm.com>

* update comment

Signed-off-by: roytman <roytman@il.ibm.com>

* Update pkg/plugins/pre-request/encode_prerequest.go

Co-authored-by: Shmuel Kallner <kallner@il.ibm.com>
Signed-off-by: Alexey Roytman <roytman@il.ibm.com>

* fix e2e tests and comments

Signed-off-by: roytman <roytman@il.ibm.com>

* fix comments

Signed-off-by: roytman <roytman@il.ibm.com>

* update documentation

Signed-off-by: roytman <roytman@il.ibm.com>

* update metrics

Signed-off-by: roytman <roytman@il.ibm.com>

* update profile tests

Signed-off-by: roytman <roytman@il.ibm.com>

* update architecture.md

Signed-off-by: roytman <roytman@il.ibm.com>

* fix NewPrefillRole update architecture.md

Signed-off-by: roytman <roytman@il.ibm.com>

* remove hashBlockSize and replace blockSize by blockSizeTokens

Signed-off-by: roytman <roytman@il.ibm.com>

* Update epp yamls in deploy/config.

Signed-off-by: Revital Sur <eres@il.ibm.com>

* fix comments

Signed-off-by: roytman <roytman@il.ibm.com>

* fix test errors (Consume)

Signed-off-by: roytman <roytman@il.ibm.com>

* replace em dashes with en dashes

Signed-off-by: roytman <roytman@il.ibm.com>

* add log messages in DisaggProfileHandlerFactory when there are no deciders

Signed-off-by: roytman <roytman@il.ibm.com>

* combine roles, add map based parameters to disagg_profile_handler

Signed-off-by: roytman <roytman@il.ibm.com>

* separate the new (map-based) and the old string flat APIs

Signed-off-by: roytman <roytman@il.ibm.com>

---------

Signed-off-by: Revital Sur <eres@il.ibm.com>
Signed-off-by: roytman <roytman@il.ibm.com>
Signed-off-by: Alexey Roytman <roytman@il.ibm.com>
Co-authored-by: Revital Sur <eres@il.ibm.com>
Co-authored-by: Shmuel Kallner <kallner@il.ibm.com>
* feat: add test coverage reporting with baseline comparison

Adds Makefile targets and CI workflow steps to collect Go unit test
coverage and compare it against the main-branch baseline on PRs.

- `make test` runs unit tests with -coverprofile for epp and sidecar components; `test-integration` covers integration tests
- `make coverage-compare` diffs current coverage against main (builds a baseline in a temporary git worktree locally; uses a restored cache in CI)
- `make coverage-report` generates HTML reports for browser viewing
- scripts/compare-coverage.sh produces a markdown table and writes it to
  the GitHub Actions Job Summary (visible to all users on public repos)
- CI: on PR, restores coverage-main cache as baseline and runs the compare;
  on push to main, saves the new coverage to the coverage-main cache key
  (single entry, auto-expires after 7 days, negligible storage cost)

Signed-off-by: Etai Lev Ran <elevran@gmail.com>

* Scope `cd` to a subshell `( ... )` so the parent shell stays in the repo root after the worktree is removed.

Needed since cd in the parent could switch to a REF that does not include the coverage script...

Signed-off-by: Etai Lev Ran <elevran@gmail.com>

* use the same path for save/restore of coverage files

Signed-off-by: Etai Lev Ran <elevran@gmail.com>

* Run tests with coverage always enabled

Makefile
  - Removed COVERAGE ?= opt-in variable
  - test-*: always run with -race -coverprofile -covermode=atomic; always prints coverage summary

  DEVELOPMENT.md:
  - Consolidated testing subsections into one, noting coverage and race detection are always on

  CI workflow (ci-pr-checks.yaml):
  - Changed to use make test since that includes coverage

Signed-off-by: Etai Lev Ran <elevran@gmail.com>

* Data races fixed in pkg/sidecar/proxy

1. proxy_helpers.go / proxy.go / test files: added readyCh chan struct{} to Server, closed after net.Listen sets s.addr; all 6 test sites replace
  time.Sleep(1s) + Expect(addr) with <-proxy.readyCh
2. data_parallel.go: captured s.decoderURL.Scheme before goroutine launch to avoid concurrent read
3. proxy.go (Clone()): removed decoderURL and decoderProxy from the copy — startDataParallel always sets them explicitly after cloning
4. connector_sglang_test.go: changed var prefillFinished bool to atomic.Bool
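
The first and fourth fixes follow two standard Go synchronization idioms: closing a channel to publish a value written before the close, and using `atomic.Bool` for a flag shared across goroutines. The sketch below illustrates both under stated assumptions: the `server` type, its fields other than `readyCh`, and the `stopCh` shutdown channel are hypothetical stand-ins, not the sidecar's actual code.

```go
package main

import (
	"fmt"
	"net"
	"sync/atomic"
)

// server is a hypothetical stand-in for the proxy's Server type.
type server struct {
	addr     net.Addr
	readyCh  chan struct{} // closed once addr is safe to read
	stopCh   chan struct{} // closed by the caller to shut down (assumed)
	finished atomic.Bool   // written by the serving goroutine, read elsewhere
}

func newServer() *server {
	return &server{readyCh: make(chan struct{}), stopCh: make(chan struct{})}
}

// start listens on an ephemeral port, publishes the address, and signals
// readiness by closing readyCh instead of making callers sleep and poll.
func (s *server) start() error {
	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return err
	}
	defer ln.Close()
	s.addr = ln.Addr()
	close(s.readyCh) // the write to s.addr happens-before any receive on readyCh
	<-s.stopCh
	s.finished.Store(true)
	return nil
}

func main() {
	s := newServer()
	errCh := make(chan error, 1)
	go func() { errCh <- s.start() }()

	select {
	case <-s.readyCh:
		fmt.Println("listening on", s.addr)
	case err := <-errCh:
		fmt.Println("listen failed:", err)
		return
	}
	close(s.stopCh)
	fmt.Println("stopped:", <-errCh, "finished:", s.finished.Load())
}
```

Waiting on `readyCh` replaces a fixed `time.Sleep` with an explicit signal, so the reader neither races with the writer nor waits longer than necessary.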

Signed-off-by: Etai Lev Ran <elevran@gmail.com>

---------

Signed-off-by: Etai Lev Ran <elevran@gmail.com>
* sidecar: embed Config in Options, move port and target URL into Config

Config now holds the complete runtime configuration including port,
target URL, SSRF fields, and renamed TLS fields. Options embeds Config
so Complete() populates it directly. NewProxy takes a single Config.
Start constructs AllowlistValidator from config internally. main.go no
longer builds Config manually or calls NewAllowlistValidator directly.
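
The embedding described here can be sketched as follows. Everything in the block is a simplified assumption for illustration: the field names, the JSON tags, the unexported flag fields, and the JSON-backed `String` method (which a later commit in this series mentions) are not copied from the sidecar package.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Config is a hypothetical, trimmed-down runtime configuration.
type Config struct {
	Port      string `json:"port"`
	TargetURL string `json:"targetURL"`
}

// String renders the config as JSON, which is convenient for logging.
func (c Config) String() string {
	b, err := json.Marshal(c)
	if err != nil {
		return fmt.Sprintf("<unprintable config: %v>", err)
	}
	return string(b)
}

// Options holds raw flag values and embeds Config, so Complete can fill
// the final configuration in place. (Field names are assumptions.)
type Options struct {
	Config
	portFlag   string
	targetFlag string
}

// Complete copies validated flag values into the embedded Config.
func (o *Options) Complete() {
	o.Port = o.portFlag // promoted field: same as o.Config.Port
	o.TargetURL = o.targetFlag
}

func main() {
	o := Options{portFlag: "8000", targetFlag: "http://localhost:8200"}
	o.Complete()
	fmt.Println(o.Config) // uses Config.String
}
```

Because `Config` is embedded, its fields and its `String` method are promoted to `Options`, so a constructor can accept `o.Config` as a single argument without the caller rebuilding it by hand.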

Signed-off-by: Etai Lev Ran <elevran@gmail.com>

* sidecar: update tests for new NewProxy and Start signatures

Signed-off-by: Etai Lev Ran <elevran@gmail.com>

* config: add String via MarshalJSON

Signed-off-by: Etai Lev Ran <elevran@gmail.com>

* simplify code

Signed-off-by: Etai Lev Ran <elevran@gmail.com>

* Port and URL are already in Options, removed from Server

  - Removed port and decoderURL fields from Server struct (they duplicated config.Port and config.TargetURL)
  - Removed the two lines in NewProxy (L186-187) that initialized them
  - Removed them from Clone() as well — the config copy already carries the values
  - Replaced all s.port / s.decoderURL / clone.port / clone.decoderURL references with s.config.Port / s.config.TargetURL across proxy.go,
  proxy_helpers.go, data_parallel.go, connector_sglang.go, connector_nixlv2.go, and the test file

Signed-off-by: Etai Lev Ran <elevran@gmail.com>

* change TargetURL to DecodeURL; make Options fields private

Signed-off-by: Etai Lev Ran <elevran@gmail.com>

* fix race in sidecar, reorder imports in test file

Signed-off-by: Etai Lev Ran <elevran@gmail.com>

---------

Signed-off-by: Etai Lev Ran <elevran@gmail.com>
* make coverage comparison optional

Signed-off-by: Etai Lev Ran <elevran@gmail.com>
Co-authored-by: Alexey Roytman <roytman@il.ibm.com>

* Handle failure cases in extract_total: missing, empty and corrupt files.

In all these cases the function returns "", and the caller handles that as missing data and not a script abort.

Signed-off-by: Etai Lev Ran <elevran@gmail.com>

---------

Signed-off-by: Etai Lev Ran <elevran@gmail.com>
Co-authored-by: Alexey Roytman <roytman@il.ibm.com>
Signed-off-by: roytman <roytman@il.ibm.com>
…saggHeadersHandler (#758)

* combine EncodeHeaderHandler and PrefillHeaderHandler into DisaggHeadersHandler

Signed-off-by: roytman <roytman@il.ibm.com>

* rename disagg_prerequest.go to disagg_headers_handler.go

Signed-off-by: roytman <roytman@il.ibm.com>

---------

Signed-off-by: roytman <roytman@il.ibm.com>
* Prevent mismatch between new and deprecated APIs

Signed-off-by: roytman <roytman@il.ibm.com>

* simplify DisaggProfileHandlerFactory

Signed-off-by: roytman <roytman@il.ibm.com>

---------

Signed-off-by: roytman <roytman@il.ibm.com>
)

* - import IGW 1.4.0
- turn tokenizer PrepareData plugin into a scorer and wrap the latter in no-build flag

Signed-off-by: Maroon Ayoub <maroon.ayoub@ibm.com>

* address review

Signed-off-by: Maroon Ayoub <maroon.ayoub@ibm.com>

---------

Signed-off-by: Maroon Ayoub <maroon.ayoub@ibm.com>
Signed-off-by: Guangya Liu <gyliu513@gmail.com>
* Containerize build system with builder image

Introduce Dockerfile.builder based on the same Go image as production,
with all build tools (golangci-lint, typos, kind, kubectl, podman)
at pinned versions.

All build, test, lint, and format targets now run inside the builder
container. Go module and build caches use named volumes for persistence
across runs. Host requirements are reduced to a container runtime and
git. kubectl and envsubst are only needed for cluster deployment targets.

Signed-off-by: Antonio Cardace <acardace@redhat.com>

* CI: remove host Go/lint setup, use containerized targets

All CI steps now delegate to make targets that run inside the builder
container. Remove Go setup, go mod tidy, and golangci-lint-action.
Expand paths filter to cover Dockerfile.builder, go.sum, and
.golangci.yml.

Signed-off-by: Antonio Cardace <acardace@redhat.com>

* Fix issues detected by linter

Signed-off-by: Antonio Cardace <acardace@redhat.com>

* Add builder shell targets for debugging

Signed-off-by: Antonio Cardace <acardace@redhat.com>

* e2e-tests: load images into kind directly to save space

Signed-off-by: Antonio Cardace <acardace@redhat.com>

---------

Signed-off-by: Antonio Cardace <acardace@redhat.com>
Signed-off-by: roytman <roytman@il.ibm.com>
* implement context-length-aware plugin (scorer/filter)

Signed-off-by: Maroon Ayoub <maroon.ayoub@ibm.com>

* fix newline

Signed-off-by: Maroon Ayoub <maroon.ayoub@ibm.com>

* apply review suggestion

Signed-off-by: Maroon Ayoub <maroon.ayoub@ibm.com>

* code improvements

Signed-off-by: Maroon Ayoub <maroon.ayoub@ibm.com>

* Update pkg/plugins/multi/context_length_aware_bench_test.go

Co-authored-by: Shmuel Kallner <kallner@il.ibm.com>
Signed-off-by: Maroon Ayoub <Maroonay@gmail.com>

* refactor UX

Signed-off-by: Maroon Ayoub <maroon.ayoub@ibm.com>

* add tokenization and fix tests

Signed-off-by: Maroon Ayoub <maroon.ayoub@ibm.com>

* use tokenizer plugin

Signed-off-by: Maroon Ayoub <maroon.ayoub@ibm.com>

* fix lint

Signed-off-by: Maroon Ayoub <maroon.ayoub@ibm.com>

* implement proximity for out-of-range requests

Signed-off-by: Maroon Ayoub <maroon.ayoub@ibm.com>

* addressed comments

Signed-off-by: Maroon Ayoub <maroon.ayoub@ibm.com>

* remove support for multiple ranges per pod

Signed-off-by: Maroon Ayoub <maroon.ayoub@ibm.com>

* rebase on tokenizer PrepareData -> Scorer change

Signed-off-by: Maroon Ayoub <maroon.ayoub@ibm.com>

* addressed comments

Signed-off-by: Maroon Ayoub <maroon.ayoub@ibm.com>

---------

Signed-off-by: Maroon Ayoub <maroon.ayoub@ibm.com>
Signed-off-by: Maroon Ayoub <Maroonay@gmail.com>
Co-authored-by: Shmuel Kallner <kallner@il.ibm.com>
Signed-off-by: Shmuel Kallner <kallner@il.ibm.com>
* change simulator version to v0.8.1

Signed-off-by: Maya Barnea <mayab@il.ibm.com>

* update simulator version

Signed-off-by: Maya Barnea <mayab@il.ibm.com>

* make e2e test timeout 20min

Signed-off-by: Maya Barnea <mayab@il.ibm.com>

---------

Signed-off-by: Maya Barnea <mayab@il.ibm.com>
Signed-off-by: Etai Lev Ran <elevran@gmail.com>