* feat: update to use new kv-cache UDS tokenizer
- change preprocessing to types from kv-cache
- add new unit test case: same tests as the old ones
- keep old test case but mark it as unused (for now; can be removed later)
- add new make target to build UDS image
- update image to use the one from llm-d in deploy
- remove parts in Dockerfile to only build go code
Signed-off-by: Wen Zhou <wenzhou@redhat.com>
* update: more changes for UDS in makefile and docs
Signed-off-by: Wen Zhou <wenzhou@redhat.com>
* update: add comments for download-tokenizer and remove it as a dependency of build
Signed-off-by: Wen Zhou <wenzhou@redhat.com>
* GHAction: remove lint-and-test, which is still using python
Signed-off-by: Wen Zhou <wenzhou@redhat.com>
* update: fix rebase
Signed-off-by: Wen Zhou <wenzhou@redhat.com>
* fix: lint with 2.8.0
Signed-off-by: Wen Zhou <wenzhou@redhat.com>
* update: code review
- remove env variables LDFLAGS, PYTHON_CONFIG, CGO_CFLAGS, TOKENIZER_ARCH, PYTHON_VERSION, epp_* and sidecar_* for CGO
- update documentation
- remove make targets related to tokenizer, python
Signed-off-by: Wen Zhou <wenzhou@redhat.com>
---------
Signed-off-by: Wen Zhou <wenzhou@redhat.com>
Signed-off-by: Alberto Perdomo <aperdomo@redhat.com>
* Fix panic in SGLang proxy handling of concurrent requests Signed-off-by: YANG LI <yangligt@google.com> * Add concurrency unit test for SGLang context logic Signed-off-by: YANG LI <yangligt@google.com> --------- Signed-off-by: YANG LI <yangligt@google.com>
* Add opentelemetry tracing
Add centralized telemetry package and custom spans
following the llm-d distributed tracing proposal.
Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: sallyom <somalley@redhat.com>
* update Dockerfile.sidecar
Signed-off-by: sallyom <somalley@redhat.com>
* tracing: remove extra success results & startup spans and cleanup
Signed-off-by: sallyom <somalley@redhat.com>
* fix: avoid os.Exit bypassing defer in main
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: sallyom <somalley@redhat.com>
* fix: address review nits for tracing PR
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: sallyom <somalley@redhat.com>
* test: add edge case tests for StripScheme
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: sallyom <somalley@redhat.com>
* remove extra comments from sidecar spans
Signed-off-by: sallyom <somalley@redhat.com>
* fix lint error
Signed-off-by: sallyom <somalley@redhat.com>
* protect against segfault on tests
Signed-off-by: greg pereira <grpereir@redhat.com>
---------
Signed-off-by: sallyom <somalley@redhat.com>
Signed-off-by: greg pereira <grpereir@redhat.com>
Co-authored-by: Claude <noreply@anthropic.com>
Co-authored-by: greg pereira <grpereir@redhat.com>
Signed-off-by: Maroon Ayoub <maroon.ayoub@ibm.com>
Signed-off-by: greg pereira <grpereir@redhat.com>
Bumps [go.opentelemetry.io/otel/sdk](https://github.com/open-telemetry/opentelemetry-go) from 1.39.0 to 1.40.0. - [Release notes](https://github.com/open-telemetry/opentelemetry-go/releases) - [Changelog](https://github.com/open-telemetry/opentelemetry-go/blob/main/CHANGELOG.md) - [Commits](open-telemetry/opentelemetry-go@v1.39.0...v1.40.0) --- updated-dependencies: - dependency-name: go.opentelemetry.io/otel/sdk dependency-version: 1.40.0 dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
…dates (#662) Bumps the go-dependencies group with 2 updates in the / directory: [go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp](https://github.com/open-telemetry/opentelemetry-go-contrib) and [go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracegrpc](https://github.com/open-telemetry/opentelemetry-go). Updates `go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp` from 0.64.0 to 0.65.0 - [Release notes](https://github.com/open-telemetry/opentelemetry-go-contrib/releases) - [Changelog](https://github.com/open-telemetry/opentelemetry-go-contrib/blob/main/CHANGELOG.md) - [Commits](open-telemetry/opentelemetry-go-contrib@zpages/v0.64.0...zpages/v0.65.0) Updates `go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracegrpc` from 1.39.0 to 1.40.0 - [Release notes](https://github.com/open-telemetry/opentelemetry-go/releases) - [Changelog](https://github.com/open-telemetry/opentelemetry-go/blob/main/CHANGELOG.md) - [Commits](open-telemetry/opentelemetry-go@v1.39.0...v1.40.0) --- updated-dependencies: - dependency-name: go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp dependency-version: 0.65.0 dependency-type: direct:production update-type: version-update:semver-minor dependency-group: go-dependencies - dependency-name: go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracegrpc dependency-version: 1.40.0 dependency-type: direct:production update-type: version-update:semver-minor dependency-group: go-dependencies ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Signed-off-by: Guangya Liu <gyliu513@gmail.com>
Signed-off-by: learner0810 <zhongjun.li@daocloud.io>
…build (#664) Signed-off-by: Guangya Liu <gyliu513@gmail.com>
Signed-off-by: Guangya Liu <gyliu513@gmail.com>
Bumps the kubernetes group with 5 updates: | Package | From | To | | --- | --- | --- | | [k8s.io/api](https://github.com/kubernetes/api) | `0.34.4` | `0.34.5` | | [k8s.io/apiextensions-apiserver](https://github.com/kubernetes/apiextensions-apiserver) | `0.34.4` | `0.34.5` | | [k8s.io/apimachinery](https://github.com/kubernetes/apimachinery) | `0.34.4` | `0.34.5` | | [k8s.io/client-go](https://github.com/kubernetes/client-go) | `0.34.4` | `0.34.5` | | [k8s.io/component-base](https://github.com/kubernetes/component-base) | `0.34.4` | `0.34.5` | Updates `k8s.io/api` from 0.34.4 to 0.34.5 - [Commits](kubernetes/api@v0.34.4...v0.34.5) Updates `k8s.io/apiextensions-apiserver` from 0.34.4 to 0.34.5 - [Release notes](https://github.com/kubernetes/apiextensions-apiserver/releases) - [Commits](kubernetes/apiextensions-apiserver@v0.34.4...v0.34.5) Updates `k8s.io/apimachinery` from 0.34.4 to 0.34.5 - [Commits](kubernetes/apimachinery@v0.34.4...v0.34.5) Updates `k8s.io/client-go` from 0.34.4 to 0.34.5 - [Changelog](https://github.com/kubernetes/client-go/blob/master/CHANGELOG.md) - [Commits](kubernetes/client-go@v0.34.4...v0.34.5) Updates `k8s.io/component-base` from 0.34.4 to 0.34.5 - [Commits](kubernetes/component-base@v0.34.4...v0.34.5) --- updated-dependencies: - dependency-name: k8s.io/api dependency-version: 0.34.5 dependency-type: direct:production update-type: version-update:semver-patch dependency-group: kubernetes - dependency-name: k8s.io/apiextensions-apiserver dependency-version: 0.34.5 dependency-type: direct:production update-type: version-update:semver-patch dependency-group: kubernetes - dependency-name: k8s.io/apimachinery dependency-version: 0.34.5 dependency-type: direct:production update-type: version-update:semver-patch dependency-group: kubernetes - dependency-name: k8s.io/client-go dependency-version: 0.34.5 dependency-type: direct:production update-type: version-update:semver-patch dependency-group: kubernetes - dependency-name: k8s.io/component-base 
dependency-version: 0.34.5 dependency-type: direct:production update-type: version-update:semver-patch dependency-group: kubernetes ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
…dates (#674) Bumps the go-dependencies group with 2 updates in the / directory: [go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp](https://github.com/open-telemetry/opentelemetry-go-contrib) and [go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracegrpc](https://github.com/open-telemetry/opentelemetry-go). Updates `go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp` from 0.65.0 to 0.66.0 - [Release notes](https://github.com/open-telemetry/opentelemetry-go-contrib/releases) - [Changelog](https://github.com/open-telemetry/opentelemetry-go-contrib/blob/main/CHANGELOG.md) - [Commits](open-telemetry/opentelemetry-go-contrib@zpages/v0.65.0...zpages/v0.66.0) Updates `go.opentelemetry.io/otel` from 1.40.0 to 1.41.0 - [Release notes](https://github.com/open-telemetry/opentelemetry-go/releases) - [Changelog](https://github.com/open-telemetry/opentelemetry-go/blob/main/CHANGELOG.md) - [Commits](open-telemetry/opentelemetry-go@v1.40.0...v1.41.0) Updates `go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracegrpc` from 1.40.0 to 1.41.0 - [Release notes](https://github.com/open-telemetry/opentelemetry-go/releases) - [Changelog](https://github.com/open-telemetry/opentelemetry-go/blob/main/CHANGELOG.md) - [Commits](open-telemetry/opentelemetry-go@v1.40.0...v1.41.0) Updates `go.opentelemetry.io/otel/sdk` from 1.40.0 to 1.41.0 - [Release notes](https://github.com/open-telemetry/opentelemetry-go/releases) - [Changelog](https://github.com/open-telemetry/opentelemetry-go/blob/main/CHANGELOG.md) - [Commits](open-telemetry/opentelemetry-go@v1.40.0...v1.41.0) Updates `go.opentelemetry.io/otel/trace` from 1.40.0 to 1.41.0 - [Release notes](https://github.com/open-telemetry/opentelemetry-go/releases) - [Changelog](https://github.com/open-telemetry/opentelemetry-go/blob/main/CHANGELOG.md) - [Commits](open-telemetry/opentelemetry-go@v1.40.0...v1.41.0) --- updated-dependencies: - dependency-name: go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp 
dependency-version: 0.66.0 dependency-type: direct:production update-type: version-update:semver-minor dependency-group: go-dependencies - dependency-name: go.opentelemetry.io/otel dependency-version: 1.41.0 dependency-type: direct:production update-type: version-update:semver-minor dependency-group: go-dependencies - dependency-name: go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracegrpc dependency-version: 1.41.0 dependency-type: direct:production update-type: version-update:semver-minor dependency-group: go-dependencies - dependency-name: go.opentelemetry.io/otel/sdk dependency-version: 1.41.0 dependency-type: direct:production update-type: version-update:semver-minor dependency-group: go-dependencies - dependency-name: go.opentelemetry.io/otel/trace dependency-version: 1.41.0 dependency-type: direct:production update-type: version-update:semver-minor dependency-group: go-dependencies ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Bumps [lycheeverse/lychee-action](https://github.com/lycheeverse/lychee-action) from 2.7.0 to 2.8.0. - [Release notes](https://github.com/lycheeverse/lychee-action/releases) - [Commits](lycheeverse/lychee-action@v2.7.0...v2.8.0) --- updated-dependencies: - dependency-name: lycheeverse/lychee-action dependency-version: 2.8.0 dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
* ci: add dev image workflow for main and release branches Build and push -dev variants of EPP and sidecar container images on pushes to main and release-* branches, tagged with commit SHA. Signed-off-by: Pierangelo Di Pilato <pierdipi@redhat.com> * ci: extract reusable build workflow and tag dev images by branch Refactor ci-release and ci-dev to call a shared ci-build-images reusable workflow, reducing duplication. Tag dev images with the branch name instead of commit SHA so each branch has exactly one image that gets overwritten on push, avoiding image accumulation. Signed-off-by: Pierangelo Di Pilato <pierdipi@redhat.com> * Newlines at EOF Signed-off-by: Pierangelo Di Pilato <pierdipi@redhat.com> --------- Signed-off-by: Pierangelo Di Pilato <pierdipi@redhat.com>
…1.28 (#727) * Corrected configuration file Signed-off-by: Shmuel Kallner <kallner@il.ibm.com> * No longer set primaryPort plugin parameter Signed-off-by: Shmuel Kallner <kallner@il.ibm.com> * Deprecate use of primaryPort plugin parameter Signed-off-by: Shmuel Kallner <kallner@il.ibm.com> * Deprecate use of the x-data-parallel-host-port header Signed-off-by: Shmuel Kallner <kallner@il.ibm.com> * Add a real plugin Handle to the DP tests Signed-off-by: Shmuel Kallner <kallner@il.ibm.com> * Make sure user understands that Istio >= 1.28.1 is needed Signed-off-by: Shmuel Kallner <kallner@il.ibm.com> --------- Signed-off-by: Shmuel Kallner <kallner@il.ibm.com>
* build: remove CGO dependency by migrating to pure-Go ZMQ Update llm-d-kv-cache to v0.6.1-0.20260317063900-80aba2cb5a99 (main snapshot), pending merge of llm-d/llm-d-kv-cache#431 which switches from pebbe/zmq4 (CGO) to a pure-Go ZMQ implementation. - go.mod/go.sum: bump kv-cache to current main pseudo-version (placeholder; will be updated to a real tag once llm-d/llm-d-kv-cache#431 is merged) - Makefile: set CGO_ENABLED=0; drop check-dependencies prereq from test targets - Makefile.tools.mk: remove ##@ Dependencies section (check/install-dependencies) - Dockerfile.epp: remove EPEL + zeromq install steps; set CGO_ENABLED=0 in build - DEVELOPMENT.md: remove ZeroMQ from prerequisites list - .github/workflows/ci-pr-checks.yaml: remove CGO configuration and install-dependencies steps; remove CGO env vars from lint step NOTE: This commit is intentionally draft — go.mod must be updated to the tagged kv-cache release that includes the pure-Go ZMQ changes before merging. Signed-off-by: Etai Lev Ran <elevran@gmail.com> * update kv-cache (pure Go zmq, tip of main) Signed-off-by: Etai Lev Ran <elevran@gmail.com> * revert to micro image once CGO is disabled Signed-off-by: Etai Lev Ran <elevran@gmail.com> --------- Signed-off-by: Etai Lev Ran <elevran@gmail.com>
Bumps [google.golang.org/grpc](https://github.com/grpc/grpc-go) from 1.79.2 to 1.79.3. - [Release notes](https://github.com/grpc/grpc-go/releases) - [Commits](grpc/grpc-go@v1.79.2...v1.79.3) --- updated-dependencies: - dependency-name: google.golang.org/grpc dependency-version: 1.79.3 dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Signed-off-by: Etai Lev Ran <elevran@gmail.com>
Add idleThreshold and maxBusyScore parameters to create a scoring gap between idle and busy pods, helping distribute prefix cache warmup. - idleThreshold: max requests to be considered idle (default: 0) - maxBusyScore: max score for busy pods (default: 1.0 for current behavior) Examples: - Binary mode: idleThreshold=0, maxBusyScore=0 (idle=1.0, busy=0.0) - Hybrid mode: idleThreshold=0, maxBusyScore=0.5 (idle=1.0, busy=0-0.5) - Flexible: idleThreshold=2, maxBusyScore=0.5 (≤2 req = idle) Signed-off-by: David Whyte-Gray <40244437+dagrayvid@users.noreply.github.com> Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
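The idle/busy scoring gap described above can be sketched roughly as follows. This is an illustration only: the `scoreParams` type, the `score` function, and the linear scaling of busy pods against the busiest pod (`maxRequests`) are assumptions, not the plugin's actual code. The commit message states only that idle pods score 1.0 and busy pods score between 0 and `maxBusyScore`.

```go
package main

import "fmt"

// scoreParams mirrors the two parameters named in the commit message.
// The struct itself is hypothetical.
type scoreParams struct {
	IdleThreshold int     // max in-flight requests for a pod to count as idle
	MaxBusyScore  float64 // upper bound on the score of a busy pod
}

// score returns 1.0 for idle pods. Busy pods are scaled linearly into
// [0, MaxBusyScore] by their load relative to the busiest pod
// (the linear formula is an assumption made for this sketch).
func score(p scoreParams, requests, maxRequests int) float64 {
	if requests <= p.IdleThreshold {
		return 1.0
	}
	span := maxRequests - p.IdleThreshold
	if span <= 0 || requests >= maxRequests {
		return 0
	}
	frac := float64(maxRequests-requests) / float64(span)
	return p.MaxBusyScore * frac
}

func main() {
	// Hybrid mode from the commit message: idleThreshold=0, maxBusyScore=0.5.
	hybrid := scoreParams{IdleThreshold: 0, MaxBusyScore: 0.5}
	for _, n := range []int{0, 2, 4} {
		fmt.Printf("requests=%d score=%.2f\n", n, score(hybrid, n, 4))
	}
}
```

With `MaxBusyScore` set to 0 the same function degenerates to the binary mode (idle=1.0, busy=0.0), and with 1.0 there is no gap between idle and busy pods.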
* Docker build enhancements
- reorder steps so go.mod/go.sum are in a cacheable layer (i.e., when not modified)
- strip debug info by default (stack traces are NOT affected); can be overridden with LD_FLAGS
- cache CI/CD on GH action
- other nits and cleanups (e.g., unused Python arg, comments)
Signed-off-by: Etai Lev Ran <elevran@gmail.com>
* pass GO build vars inline, not as ENV settings
Signed-off-by: Etai Lev Ran <elevran@gmail.com>
* allow EPP and sidecar images to run concurrently
Signed-off-by: Etai Lev Ran <elevran@gmail.com>
---------
Signed-off-by: Etai Lev Ran <elevran@gmail.com>
* feat: speculative indexing for PrecisePrefixCacheScorer Signed-off-by: bongwoobak <bongwoobak@gmail.com> * fix: use ip:port format for PodIdentifier to match KV event topics Signed-off-by: bongwoobak <bongwoobak@gmail.com> * fix: split Address/Port in test endpoints to match production ip:port format Signed-off-by: bongwoobak <bongwoobak@gmail.com> * make SpeculativeIndexing optional Signed-off-by: bongwoobak <bongwoobak@gmail.com> * feat: use confirmed-only scores for PrefixCacheServers cycle state Signed-off-by: bongwoobak <bongwoobak@gmail.com> * refactor: update PodEntry usage for Annotations struct Signed-off-by: bongwoobak <bongwoobak@gmail.com> * refactor: replace speculativeCache.Start() with cleanCachePeriodically() Signed-off-by: bongwoobak <bongwoobak@gmail.com> * fix: add nil metadata guard in PrepareRequestData Signed-off-by: bongwoobak <bongwoobak@gmail.com> * refactor: remove confirmedScores and simplify Annotations usage Signed-off-by: bongwoobak <bongwoobak@gmail.com> * fix: adapt to NewChunkedTokenDatabase signature change from PR 415 Signed-off-by: bongwoobak <bongwoobak@gmail.com> * refactor: use KVBlockScorer for scoring and human-readable speculativeTTL Replace computeScoresFromKeyToPods with kvcache.KVBlockScorer.Score() to properly integrate device-backend weight configuration. Change SpeculativeTTL from time.Duration (nanoseconds) to string, parsed via time.ParseDuration, for human-readable config values like "2s" or "500ms". Compact docs/configuration.md per review feedback. Signed-off-by: bongwoobak <bongwoobak@gmail.com> * fix: gofmt import ordering in precise_prefix_cache.go Signed-off-by: bongwoobak <bongwoobak@gmail.com> * docs: move speculative indexing config into architecture.md Signed-off-by: bongwoobak <bongwoobak@gmail.com> --------- Signed-off-by: bongwoobak <bongwoobak@gmail.com>
Signed-off-by: roytman <roytman@il.ibm.com>
Signed-off-by: Sam Batschelet <sbatsche@redhat.com>
Signed-off-by: Sam Batschelet <sbatsche@redhat.com>
* The UDS tokenizer sidecar image is now treated as an external dependency built and published by [llm-d-kv-cache](https://github.com/llm-d/llm-d-kv-cache). A companion PR (llm-d/llm-d-kv-cache#436) adds a workflow to publish `ghcr.io/llm-d/llm-d-uds-tokenizer:dev` on every push to `main`, matching the `:dev` default used here.
Changes:
- Removed the `image-build-uds-tokenizer` Makefile target.
- `image-build` and `test-e2e` no longer depend on building the tokenizer image; `image-pull` already handles pulling it.
- UDS tokenizer image variables now follow the same pattern as other images. Variables and overrides work consistently across all scripts and Makefile targets.
- Removed `jlumbroso/free-disk-space@main` as it is no longer needed after the CGO/ZMQ and Python removal.
Depends on llm-d/llm-d-kv-cache#436
The companion kv-cache PR must merge (or the `:dev` tag must exist in GHCR) before `make image-pull` or `make env-dev-kind` will succeed without an explicit `UDS_TOKENIZER_IMAGE` override.
Signed-off-by: Etai Lev Ran <elevran@gmail.com>
* standardize image loading across tools and architectures
Signed-off-by: Etai Lev Ran <elevran@gmail.com>
---------
Signed-off-by: Etai Lev Ran <elevran@gmail.com>
Bumps the kubernetes group with 5 updates: | Package | From | To | | --- | --- | --- | | [k8s.io/api](https://github.com/kubernetes/api) | `0.35.2` | `0.35.3` | | [k8s.io/apiextensions-apiserver](https://github.com/kubernetes/apiextensions-apiserver) | `0.35.2` | `0.35.3` | | [k8s.io/apimachinery](https://github.com/kubernetes/apimachinery) | `0.35.2` | `0.35.3` | | [k8s.io/client-go](https://github.com/kubernetes/client-go) | `0.35.2` | `0.35.3` | | [k8s.io/component-base](https://github.com/kubernetes/component-base) | `0.35.2` | `0.35.3` | Updates `k8s.io/api` from 0.35.2 to 0.35.3 - [Commits](kubernetes/api@v0.35.2...v0.35.3) Updates `k8s.io/apiextensions-apiserver` from 0.35.2 to 0.35.3 - [Release notes](https://github.com/kubernetes/apiextensions-apiserver/releases) - [Commits](kubernetes/apiextensions-apiserver@v0.35.2...v0.35.3) Updates `k8s.io/apimachinery` from 0.35.2 to 0.35.3 - [Commits](kubernetes/apimachinery@v0.35.2...v0.35.3) Updates `k8s.io/client-go` from 0.35.2 to 0.35.3 - [Changelog](https://github.com/kubernetes/client-go/blob/master/CHANGELOG.md) - [Commits](kubernetes/client-go@v0.35.2...v0.35.3) Updates `k8s.io/component-base` from 0.35.2 to 0.35.3 - [Commits](kubernetes/component-base@v0.35.2...v0.35.3) --- updated-dependencies: - dependency-name: k8s.io/api dependency-version: 0.35.3 dependency-type: direct:production update-type: version-update:semver-patch dependency-group: kubernetes - dependency-name: k8s.io/apiextensions-apiserver dependency-version: 0.35.3 dependency-type: direct:production update-type: version-update:semver-patch dependency-group: kubernetes - dependency-name: k8s.io/apimachinery dependency-version: 0.35.3 dependency-type: direct:production update-type: version-update:semver-patch dependency-group: kubernetes - dependency-name: k8s.io/client-go dependency-version: 0.35.3 dependency-type: direct:production update-type: version-update:semver-patch dependency-group: kubernetes - dependency-name: k8s.io/component-base 
dependency-version: 0.35.3 dependency-type: direct:production update-type: version-update:semver-patch dependency-group: kubernetes ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
* test: add disruption e2e tests for scheduler failure scenarios Signed-off-by: Sam Batschelet <sbatsche@redhat.com> * test: address disruption test flakes Signed-off-by: Sam Batschelet <sbatsche@redhat.com> --------- Signed-off-by: Sam Batschelet <sbatsche@redhat.com>
* Initial E/PD extension of the plugins. Signed-off-by: Revital Sur <eres@il.ibm.com> * Changes to Pick function. Signed-off-by: Revital Sur <eres@il.ibm.com> * Minor fix. Signed-off-by: Revital Sur <eres@il.ibm.com> * Rename files and structs. Signed-off-by: Revital Sur <eres@il.ibm.com> * Rename ed_prerequest.go Signed-off-by: Revital Sur <eres@il.ibm.com> * Fix lint errors. Signed-off-by: Revital Sur <eres@il.ibm.com> * initial implementation of unified handler Signed-off-by: roytman <roytman@il.ibm.com> * clean up code, renaming Signed-off-by: roytman <roytman@il.ibm.com> * fix renaming in common Signed-off-by: roytman <roytman@il.ibm.com> * remove prefixPluginTypedName Signed-off-by: roytman <roytman@il.ibm.com> * update according to review comments Signed-off-by: roytman <roytman@il.ibm.com> * fix lint, add deciderPlugin.go and prefix_based_pd_decider_test.go Signed-off-by: roytman <roytman@il.ibm.com> * add deprecated to pd_profile_handler and remove PrefixPluginName, PrefixPluginType from disagg_profile_handler Signed-off-by: roytman <roytman@il.ibm.com> * fix lint Signed-off-by: roytman <roytman@il.ibm.com> * Initialize all profile defaults in DisaggProfileHandlerFactory struct. Signed-off-by: Revital Sur <eres@il.ibm.com> * Add nil guards to profile handlers Signed-off-by: roytman <roytman@il.ibm.com> * Add e2e test for disagg-profile-handler PD config. Signed-off-by: Revital Sur <eres@il.ibm.com> * Add e2e tests. Signed-off-by: Revital Sur <eres@il.ibm.com> * Fix e2e tests. Signed-off-by: Revital Sur <eres@il.ibm.com> * Address review comments. Signed-off-by: Revital Sur <eres@il.ibm.com> * Fix lint error. Signed-off-by: Revital Sur <eres@il.ibm.com> * add comment about linter Signed-off-by: roytman <roytman@il.ibm.com> * add disagg_metrics.go, deprecate metrics.go. 
Signed-off-by: roytman <roytman@il.ibm.com> * update comment Signed-off-by: roytman <roytman@il.ibm.com> * Update pkg/plugins/pre-request/encode_prerequest.go Co-authored-by: Shmuel Kallner <kallner@il.ibm.com> Signed-off-by: Alexey Roytman <roytman@il.ibm.com> * fix e2e tests and comments Signed-off-by: roytman <roytman@il.ibm.com> * fix comments Signed-off-by: roytman <roytman@il.ibm.com> * update documentation Signed-off-by: roytman <roytman@il.ibm.com> * update metrics Signed-off-by: roytman <roytman@il.ibm.com> * update profile tests Signed-off-by: roytman <roytman@il.ibm.com> * update architecture.md Signed-off-by: roytman <roytman@il.ibm.com> * fix NewPrefillRole update architecture.md Signed-off-by: roytman <roytman@il.ibm.com> * remove hashBlockSize and replace blockSize by blockSizeTokens Signed-off-by: roytman <roytman@il.ibm.com> * Update epp yamls in deploy/config. Signed-off-by: Revital Sur <eres@il.ibm.com> * fix comments Signed-off-by: roytman <roytman@il.ibm.com> * fix test errors (Consume) Signed-off-by: roytman <roytman@il.ibm.com> * replace em dashes with en dashes Signed-off-by: roytman <roytman@il.ibm.com> * add log messages in DisaggProfileHandlerFactory when there are no deciders Signed-off-by: roytman <roytman@il.ibm.com> * combine roles, add map based parameters to disagg_profile_handler Signed-off-by: roytman <roytman@il.ibm.com> * separate the new (map-based) and the old string flat APIs Signed-off-by: roytman <roytman@il.ibm.com> --------- Signed-off-by: Revital Sur <eres@il.ibm.com> Signed-off-by: roytman <roytman@il.ibm.com> Signed-off-by: Alexey Roytman <roytman@il.ibm.com> Co-authored-by: Revital Sur <eres@il.ibm.com> Co-authored-by: Shmuel Kallner <kallner@il.ibm.com>
* feat: add test coverage reporting with baseline comparison
Adds Makefile targets and CI workflow steps to collect Go unit test
coverage and compare it against the main-branch baseline on PRs.
- `make test` runs unit tests with -coverprofile for epp and sidecar components; `test-integration` covers integration tests
- `make coverage-compare` diffs current coverage against main (builds a baseline in a temporary git worktree locally; uses a restored cache in CI)
- `make coverage-report` generates HTML reports for browser viewing
- scripts/compare-coverage.sh produces a markdown table and writes it to
the GitHub Actions Job Summary (visible to all users on public repos)
- CI: on PR, restores coverage-main cache as baseline and runs the compare;
on push to main, saves the new coverage to the coverage-main cache key
(single entry, auto-expires after 7 days, negligible storage cost)
Signed-off-by: Etai Lev Ran <elevran@gmail.com>
* Scope `cd` to a subshell `( ... )` so the parent shell stays in the repo root after the worktree is removed.
Needed since cd in the parent could switch to a REF that does not include the coverage script...
Signed-off-by: Etai Lev Ran <elevran@gmail.com>
* use the same path for save/restore of coverage files
Signed-off-by: Etai Lev Ran <elevran@gmail.com>
* Run tests with coverage always enabled
Makefile
- Removed COVERAGE ?= opt-in variable
- test-*: always run with -race -coverprofile -covermode=atomic; always prints coverage summary
DEVELOPMENT.md:
- Consolidated testing subsections into one, noting coverage and race detection are always on
CI workflow (ci-pr-checks.yaml):
- Changed to use make test since that includes coverage
Signed-off-by: Etai Lev Ran <elevran@gmail.com>
* Data races fixed in pkg/sidecar/proxy
1. proxy_helpers.go / proxy.go / test files: added readyCh chan struct{} to Server, closed after net.Listen sets s.addr; all 6 test sites replace
time.Sleep(1s) + Expect(addr) with <-proxy.readyCh
2. data_parallel.go: captured s.decoderURL.Scheme before goroutine launch to avoid concurrent read
3. proxy.go (Clone()): removed decoderURL and decoderProxy from the copy — startDataParallel always sets them explicitly after cloning
4. connector_sglang_test.go: changed var prefillFinished bool to atomic.Bool
Signed-off-by: Etai Lev Ran <elevran@gmail.com>
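The `readyCh` and `atomic.Bool` fixes listed in this commit follow a common Go pattern, sketched below under stated assumptions. The `server` type, `newServer`, `start`, and the `done` channel are hypothetical stand-ins and not the sidecar's real `Server`. The commit message confirms only that a `readyCh chan struct{}` is closed after `net.Listen` sets the address, and that a shared `bool` in a test became an `atomic.Bool`.

```go
package main

import (
	"fmt"
	"net"
	"sync/atomic"
)

// server is a hypothetical stand-in for the proxy's Server type.
type server struct {
	addr    net.Addr
	readyCh chan struct{} // closed once addr has been set
}

func newServer() *server { return &server{readyCh: make(chan struct{})} }

// start listens on an ephemeral port, publishes the address, then blocks
// until done is closed (a placeholder for the real accept loop).
func (s *server) start(done <-chan struct{}) error {
	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return err
	}
	defer ln.Close()
	s.addr = ln.Addr()
	close(s.readyCh) // the write to addr happens before any receive on readyCh returns
	<-done
	return nil
}

func main() {
	s := newServer()
	done := make(chan struct{})
	errCh := make(chan error, 1)
	go func() { errCh <- s.start(done) }()

	// Wait for readiness instead of sleeping for a fixed interval.
	select {
	case <-s.readyCh:
		fmt.Println("listening on", s.addr)
	case err := <-errCh:
		fmt.Println("listen failed:", err)
		return
	}

	var finished atomic.Bool // replaces a plain bool shared across goroutines
	go func() {
		finished.Store(true)
		close(done)
	}()
	<-errCh
	fmt.Println("finished:", finished.Load())
}
```

Waiting on the channel removes both the data race on `addr` and the fixed one-second delay per test site.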
---------
Signed-off-by: Etai Lev Ran <elevran@gmail.com>
* sidecar: embed Config in Options, move port and target URL into Config Config now holds the complete runtime configuration including port, target URL, SSRF fields, and renamed TLS fields. Options embeds Config so Complete() populates it directly. NewProxy takes a single Config. Start constructs AllowlistValidator from config internally. main.go no longer builds Config manually or calls NewAllowlistValidator directly. Signed-off-by: Etai Lev Ran <elevran@gmail.com> * sidecar: update tests for new NewProxy and Start signatures Signed-off-by: Etai Lev Ran <elevran@gmail.com> * config: add String via MarshalJSON Signed-off-by: Etai Lev Ran <elevran@gmail.com> * simplify code Signed-off-by: Etai Lev Ran <elevran@gmail.com> * Port and URL are already in Options, removed from Server - Removed port and decoderURL fields from Server struct (they duplicated config.Port and config.TargetURL) - Removed the two lines in NewProxy (L186-187) that initialized them - Removed them from Clone() as well — the config copy already carries the values - Replaced all s.port / s.decoderURL / clone.port / clone.decoderURL references with s.config.Port / s.config.TargetURL across proxy.go, proxy_helpers.go, data_parallel.go, connector_sglang.go, connector_nixlv2.go, and the test file Signed-off-by: Etai Lev Ran <elevran@gmail.com> * change TargetURL to DecodeURL; make Options fields private Signed-off-by: Etai Lev Ran <elevran@gmail.com> * fix race in sidecar, reorder imports in test file Signed-off-by: Etai Lev Ran <elevran@gmail.com> --------- Signed-off-by: Etai Lev Ran <elevran@gmail.com>
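One way to read "add String via MarshalJSON" is a `String()` method that renders the config as JSON, as in the sketch below. The field types, JSON tags, and error fallback are assumptions for illustration. The commit message names only `Config`, its port and `DecodeURL` (formerly `TargetURL`) fields, and the fact that `Options` embeds `Config`.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Config is a trimmed, hypothetical version of the sidecar runtime config.
type Config struct {
	Port      string `json:"port"`
	DecodeURL string `json:"decodeURL"`
}

// String renders the config as compact JSON so it can be logged in one line.
func (c Config) String() string {
	b, err := json.Marshal(c)
	if err != nil {
		return fmt.Sprintf("<config marshal error: %v>", err)
	}
	return string(b)
}

// Options embeds Config, so Config's fields and String method are promoted.
type Options struct {
	Config
}

func main() {
	opts := Options{Config: Config{Port: "8000", DecodeURL: "http://localhost:8001"}}
	fmt.Println(opts) // uses the promoted String method
}
```

Because `Options` embeds `Config`, the promoted `String` method also makes `Options` satisfy `fmt.Stringer`, which keeps log output consistent across both types.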
* make coverage comparison optional Signed-off-by: Etai Lev Ran <elevran@gmail.com> Co-authored-by: Alexey Roytman <roytman@il.ibm.com> * Handle failure cases in extract_total: missing, empty and corrupt files. In all these cases the function returns "", and the caller handles that as missing data and not a script abort. Signed-off-by: Etai Lev Ran <elevran@gmail.com> --------- Signed-off-by: Etai Lev Ran <elevran@gmail.com> Co-authored-by: Alexey Roytman <roytman@il.ibm.com>
Signed-off-by: roytman <roytman@il.ibm.com>
…saggHeadersHandler (#758) * combine EncodeHeaderHandler PrefillHeaderHandler to DisaggHeadersHandler Signed-off-by: roytman <roytman@il.ibm.com> * rename disagg_prerequest.go to disagg_headers_handler.go Signed-off-by: roytman <roytman@il.ibm.com> --------- Signed-off-by: roytman <roytman@il.ibm.com>
* Prevent mismatch between new and deprecated APIs Signed-off-by: roytman <roytman@il.ibm.com> * simplify DisaggProfileHandlerFactory Signed-off-by: roytman <roytman@il.ibm.com> --------- Signed-off-by: roytman <roytman@il.ibm.com>
Signed-off-by: Guangya Liu <gyliu513@gmail.com>
* Containerize build system with builder image Introduce Dockerfile.builder based on the same Go image as production, with all build tools (golangci-lint, typos, kind, kubectl, podman) at pinned versions. All build, test, lint, and format targets now run inside the builder container. Go module and build caches use named volumes for persistence across runs. Host requirements are reduced to a container runtime and git. kubectl and envsubst are only needed for cluster deployment targets. Signed-off-by: Antonio Cardace <acardace@redhat.com> * CI: remove host Go/lint setup, use containerized targets All CI steps now delegate to make targets that run inside the builder container. Remove Go setup, go mod tidy, and golangci-lint-action. Expand paths filter to cover Dockerfile.builder, go.sum, and .golangci.yml. Signed-off-by: Antonio Cardace <acardace@redhat.com> * Fix issues detected by linter Signed-off-by: Antonio Cardace <acardace@redhat.com> * Add builder shell targets for debugging Signed-off-by: Antonio Cardace <acardace@redhat.com> * e2e-tests: load images into kind directly to save space Signed-off-by: Antonio Cardace <acardace@redhat.com> --------- Signed-off-by: Antonio Cardace <acardace@redhat.com>
Signed-off-by: roytman <roytman@il.ibm.com>
* implement context-length-aware plugin (scorer/filter) Signed-off-by: Maroon Ayoub <maroon.ayoub@ibm.com> * fix newline Signed-off-by: Maroon Ayoub <maroon.ayoub@ibm.com> * apply review suggestion Signed-off-by: Maroon Ayoub <maroon.ayoub@ibm.com> * code improvements Signed-off-by: Maroon Ayoub <maroon.ayoub@ibm.com> * Update pkg/plugins/multi/context_length_aware_bench_test.go Co-authored-by: Shmuel Kallner <kallner@il.ibm.com> Signed-off-by: Maroon Ayoub <Maroonay@gmail.com> * refactor UX Signed-off-by: Maroon Ayoub <maroon.ayoub@ibm.com> * add tokenization and fix tests Signed-off-by: Maroon Ayoub <maroon.ayoub@ibm.com> * use tokenizer plugin Signed-off-by: Maroon Ayoub <maroon.ayoub@ibm.com> * fix lint Signed-off-by: Maroon Ayoub <maroon.ayoub@ibm.com> * implement proximity for out-of-range requests Signed-off-by: Maroon Ayoub <maroon.ayoub@ibm.com> * addressed comments Signed-off-by: Maroon Ayoub <maroon.ayoub@ibm.com> * remove support for multiple ranges per pod Signed-off-by: Maroon Ayoub <maroon.ayoub@ibm.com> * rebase on tokenizer PrepareData -> Scorer change Signed-off-by: Maroon Ayoub <maroon.ayoub@ibm.com> * addressed comments Signed-off-by: Maroon Ayoub <maroon.ayoub@ibm.com> --------- Signed-off-by: Maroon Ayoub <maroon.ayoub@ibm.com> Signed-off-by: Maroon Ayoub <Maroonay@gmail.com> Co-authored-by: Shmuel Kallner <kallner@il.ibm.com>
Signed-off-by: Shmuel Kallner <kallner@il.ibm.com>
* change simulator version to v0.8.1 Signed-off-by: Maya Barnea <mayab@il.ibm.com> * update simulator version Signed-off-by: Maya Barnea <mayab@il.ibm.com> * make e2e test timeout 20min Signed-off-by: Maya Barnea <mayab@il.ibm.com> --------- Signed-off-by: Maya Barnea <mayab@il.ibm.com>
Signed-off-by: Etai Lev Ran <elevran@gmail.com>
See Commits and Changes for more details.
Created by
pull[bot] (v2.0.0-alpha.4)