Conversation
There was a problem hiding this comment.
Pull request overview
This PR removes the legacy embedded (Python/cgo) tokenizer implementation and build path, standardizing the codebase on the UDS tokenizer service. It also updates examples and CI to build/run against the tokenizer container instead of local Python dependencies.
Changes:
- Deleted embedded tokenizer and chat-templating (Python/cgo) implementation plus associated tests and e2e suite.
- Standardized tokenization pool to UDS-only and removed
embedded_tokenizersbuild tag gating across code/tests. - Updated Makefile/CI/examples to build the UDS tokenizer image and run examples via the tokenizer container.
Reviewed changes
Copilot reviewed 35 out of 36 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
| tests/e2e/uds_tokenizer/uds_e2e_test.go | Removes build-tag gating so UDS e2e test always compiles. |
| tests/e2e/uds_tokenizer/uds_e2e_suite_test.go | Removes build-tag gating for suite setup. |
| tests/e2e/uds_tokenizer/uds_e2e_mm_test.go | Removes build-tag gating for multimodal UDS e2e test. |
| tests/e2e/redis_mock/e2e_test.go | Deletes embedded-tokenizers-only e2e tests. |
| tests/e2e/redis_mock/e2e_suite_test.go | Deletes embedded-tokenizers-only e2e suite harness. |
| pkg/tokenization/pool_uds.go | Makes UDS tokenizer the only supported pool path; simplifies error message. |
| pkg/tokenization/pool_embedded_test.go | Deletes embedded pool integration/benchmark tests. |
| pkg/tokenization/pool_embedded.go | Deletes embedded-tokenizers pool implementation/config. |
| pkg/tokenization/embedded_tokenizer_test.go | Deletes embedded tokenizer tests (HF/local/composite + discovery). |
| pkg/tokenization/embedded_tokenizer.go | Deletes embedded tokenizer implementations (HF/local/composite + templating). |
| pkg/preprocessing/chat_completions/tokenizer_wrapper.py | Deletes Python wrapper for embedded templating/tokenization. |
| pkg/preprocessing/chat_completions/setup.sh | Deletes vLLM install script for embedded mode. |
| pkg/preprocessing/chat_completions/cgo_functions_test.go | Deletes CGO/Python templating tests/benchmarks. |
| pkg/preprocessing/chat_completions/cgo_functions.h | Deletes CGO header for Python bridge. |
| pkg/preprocessing/chat_completions/cgo_functions.go | Deletes Go CGO bridge for Python templating/tokenization. |
| pkg/preprocessing/chat_completions/cgo_functions.c | Deletes C implementation for Python bridge + caching layer. |
| pkg/preprocessing/chat_completions/README.md | Removes documentation for embedded chat-templating integration. |
| hack/verify-examples.sh | Updates example verification to run examples that manage a tokenizer container; adds cleanup trap. |
| examples/valkey_example/main.go | Switches example to UDS tokenizer endpoint configuration via env var. |
| examples/valkey_example/README.md | Updates example output/docs to reflect new model naming. |
| examples/testdata/data.go | Changes example model name used across examples/testdata. |
| examples/kv_events/online/main.go | Removes embedded templating setup and wires tokenizer endpoint env var; improves shutdown path. |
| examples/kv_events/offline/main.go | Wires tokenizer endpoint env var; adjusts shutdown flow for UDS-only setup. |
| examples/kv_cache_index_service/server/server.go | Removes unused sample-data helper tied to old embedded flow. |
| examples/kv_cache_index_service/server/main.go | Adds early returns on fatal setup errors. |
| examples/kv_cache_index/main.go | Switches to computing block keys via Indexer.ComputeBlockKeys instead of precomputed hashes. |
| examples/kv_cache_index/README.md | Updates output to reflect new default model naming. |
| examples/kv_cache_aware_scorer/kvcache_aware_scorer.go | Drops embedded-tokenizers build tag constraint (still excluded). |
| examples/helper/indexer.go | Wires tokenizer endpoint env var into example indexer setup. |
| examples/helper/events.go | Extends simulated event payload to include multimodal extra keys field. |
| Makefile | Removes embedded/Python build pipeline; updates CI/test targets; runs examples by starting tokenizer container. |
| .github/workflows/ci-test.yaml | Drops embedded build job; builds UDS tokenizer image and runs e2e tests. |
| .github/workflows/ci-examples.yaml | Removes Python setup; builds UDS tokenizer image and runs example verification. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
d8cb319 to
cb01d0c
Compare
|
Unsigned commits detected! Please sign your commits. For instructions on how to set up GPG/SSH signing and verify your commits, please see GitHub Documentation. |
cb01d0c to
4473e2e
Compare
49688f0 to
5e415e3
Compare
Signed-off-by: Sage Ahrac <sagiahrak@gmail.com>
Signed-off-by: Sage Ahrac <sagiahrak@gmail.com>
Signed-off-by: Sage Ahrac <sagiahrak@gmail.com>
5e415e3 to
4d1dc9a
Compare
Signed-off-by: Sage Ahrac <sagiahrak@gmail.com>
Signed-off-by: Sage Ahrac <sagiahrak@gmail.com>
Signed-off-by: Sage Ahrac <sagiahrak@gmail.com>
Signed-off-by: Sage Ahrac <sagiahrak@gmail.com>
Signed-off-by: Sage Ahrac <sagiahrak@gmail.com>
Signed-off-by: Sage Ahrac <sagiahrak@gmail.com>
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 22 out of 24 changed files in this pull request and generated 4 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> Signed-off-by: Sage <80211083+sagearc@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> Signed-off-by: Sage <80211083+sagearc@users.noreply.github.com>
Signed-off-by: Sage Ahrac <sagiahrak@gmail.com>
Signed-off-by: Sage Ahrac <sagiahrak@gmail.com>
Thanks! I think it is already removed? |
|
@sagearc Yes, seems my local file is not up-to-date. LGTM |
There was a problem hiding this comment.
Thanks @sagearc — nice cleanup!
One issue: make build (build-uds) now only runs go build ./pkg/... which doesn't produce a binary. This breaks:
make run—./bin/llm-d-kv-cachedoesn't exist- Dockerfile —
COPY --from=builder /workspace/bin/llm-d-kv-cachefails
The old build-embedded had -o bin/$(PROJECT_NAME) but that was removed without a replacement. Suggest adding -o to build-uds or updating the Dockerfile accordingly.
Summary
Removes the embedded tokenizer code (python/cgo) that has been behind build tags since #303. The scheduler dropped cgo entirely in scheduler#609, and the embedded path was never updated for multimodal support — so there's no realistic way to use it and no plan to maintain it.
Changes
pkg/preprocessing/chat_completionsandpkg/tokenization/pool_embedded*/embedded_tokenizer*build-embeddedCI job and its apt/pip/ZMQ setup!embedded_tokenizersbuild tag frompool_uds.go— UDS is now the only poolRef