Skip to content

fix: deprecate embedded tokenizers#473

Open
sagearc wants to merge 14 commits intollm-d:mainfrom
sagearc:deprecate-embedded-tokenizers
Open

fix: deprecate embedded tokenizers#473
sagearc wants to merge 14 commits intollm-d:mainfrom
sagearc:deprecate-embedded-tokenizers

Conversation

@sagearc
Copy link
Copy Markdown
Collaborator

@sagearc sagearc commented Mar 29, 2026

Summary

Removes the embedded tokenizer code (python/cgo) that has been behind build tags since #303. The scheduler dropped cgo entirely in scheduler#609, and the embedded path was never updated for multimodal support — so there's no realistic way to use it and no plan to maintain it.

Changes

  • Delete pkg/preprocessing/chat_completions and pkg/tokenization/pool_embedded* / embedded_tokenizer*
  • Drop the build-embedded CI job and its apt/pip/ZMQ setup
  • Remove !embedded_tokenizers build tag from pool_uds.go — UDS is now the only pool

Ref

Copilot AI review requested due to automatic review settings March 29, 2026 11:25
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR removes the legacy embedded (Python/cgo) tokenizer implementation and build path, standardizing the codebase on the UDS tokenizer service. It also updates examples and CI to build/run against the tokenizer container instead of local Python dependencies.

Changes:

  • Deleted embedded tokenizer and chat-templating (Python/cgo) implementation plus associated tests and e2e suite.
  • Standardized tokenization pool to UDS-only and removed embedded_tokenizers build tag gating across code/tests.
  • Updated Makefile/CI/examples to build the UDS tokenizer image and run examples via the tokenizer container.

Reviewed changes

Copilot reviewed 35 out of 36 changed files in this pull request and generated 4 comments.

Show a summary per file
File Description
tests/e2e/uds_tokenizer/uds_e2e_test.go Removes build-tag gating so UDS e2e test always compiles.
tests/e2e/uds_tokenizer/uds_e2e_suite_test.go Removes build-tag gating for suite setup.
tests/e2e/uds_tokenizer/uds_e2e_mm_test.go Removes build-tag gating for multimodal UDS e2e test.
tests/e2e/redis_mock/e2e_test.go Deletes embedded-tokenizers-only e2e tests.
tests/e2e/redis_mock/e2e_suite_test.go Deletes embedded-tokenizers-only e2e suite harness.
pkg/tokenization/pool_uds.go Makes UDS tokenizer the only supported pool path; simplifies error message.
pkg/tokenization/pool_embedded_test.go Deletes embedded pool integration/benchmark tests.
pkg/tokenization/pool_embedded.go Deletes embedded-tokenizers pool implementation/config.
pkg/tokenization/embedded_tokenizer_test.go Deletes embedded tokenizer tests (HF/local/composite + discovery).
pkg/tokenization/embedded_tokenizer.go Deletes embedded tokenizer implementations (HF/local/composite + templating).
pkg/preprocessing/chat_completions/tokenizer_wrapper.py Deletes Python wrapper for embedded templating/tokenization.
pkg/preprocessing/chat_completions/setup.sh Deletes vLLM install script for embedded mode.
pkg/preprocessing/chat_completions/cgo_functions_test.go Deletes CGO/Python templating tests/benchmarks.
pkg/preprocessing/chat_completions/cgo_functions.h Deletes CGO header for Python bridge.
pkg/preprocessing/chat_completions/cgo_functions.go Deletes Go CGO bridge for Python templating/tokenization.
pkg/preprocessing/chat_completions/cgo_functions.c Deletes C implementation for Python bridge + caching layer.
pkg/preprocessing/chat_completions/README.md Removes documentation for embedded chat-templating integration.
hack/verify-examples.sh Updates example verification to run examples that manage a tokenizer container; adds cleanup trap.
examples/valkey_example/main.go Switches example to UDS tokenizer endpoint configuration via env var.
examples/valkey_example/README.md Updates example output/docs to reflect new model naming.
examples/testdata/data.go Changes example model name used across examples/testdata.
examples/kv_events/online/main.go Removes embedded templating setup and wires tokenizer endpoint env var; improves shutdown path.
examples/kv_events/offline/main.go Wires tokenizer endpoint env var; adjusts shutdown flow for UDS-only setup.
examples/kv_cache_index_service/server/server.go Removes unused sample-data helper tied to old embedded flow.
examples/kv_cache_index_service/server/main.go Adds early returns on fatal setup errors.
examples/kv_cache_index/main.go Switches to computing block keys via Indexer.ComputeBlockKeys instead of precomputed hashes.
examples/kv_cache_index/README.md Updates output to reflect new default model naming.
examples/kv_cache_aware_scorer/kvcache_aware_scorer.go Drops embedded-tokenizers build tag constraint (still excluded).
examples/helper/indexer.go Wires tokenizer endpoint env var into example indexer setup.
examples/helper/events.go Extends simulated event payload to include multimodal extra keys field.
Makefile Removes embedded/Python build pipeline; updates CI/test targets; runs examples by starting tokenizer container.
.github/workflows/ci-test.yaml Drops embedded build job; builds UDS tokenizer image and runs e2e tests.
.github/workflows/ci-examples.yaml Removes Python setup; builds UDS tokenizer image and runs example verification.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@sagearc sagearc marked this pull request as draft March 29, 2026 11:34
@sagearc sagearc added the hold PRs that are blocked on design, other features, release cycle, etc. label Mar 29, 2026
@sagearc sagearc force-pushed the deprecate-embedded-tokenizers branch from d8cb319 to cb01d0c Compare March 31, 2026 18:12
@github-actions
Copy link
Copy Markdown

Unsigned commits detected! Please sign your commits.

For instructions on how to set up GPG/SSH signing and verify your commits, please see GitHub Documentation.

@github-actions github-actions bot added the size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. label Mar 31, 2026
@sagearc sagearc force-pushed the deprecate-embedded-tokenizers branch from cb01d0c to 4473e2e Compare March 31, 2026 18:15
@sagearc sagearc marked this pull request as ready for review March 31, 2026 18:18
@sagearc sagearc force-pushed the deprecate-embedded-tokenizers branch 3 times, most recently from 49688f0 to 5e415e3 Compare March 31, 2026 18:52
sagearc added 3 commits March 31, 2026 21:56
Signed-off-by: Sage Ahrac <sagiahrak@gmail.com>
Signed-off-by: Sage Ahrac <sagiahrak@gmail.com>
Signed-off-by: Sage Ahrac <sagiahrak@gmail.com>
@sagearc sagearc force-pushed the deprecate-embedded-tokenizers branch from 5e415e3 to 4d1dc9a Compare March 31, 2026 19:00
sagearc added 7 commits March 31, 2026 22:06
Signed-off-by: Sage Ahrac <sagiahrak@gmail.com>
Signed-off-by: Sage Ahrac <sagiahrak@gmail.com>
Signed-off-by: Sage Ahrac <sagiahrak@gmail.com>
Signed-off-by: Sage Ahrac <sagiahrak@gmail.com>
Signed-off-by: Sage Ahrac <sagiahrak@gmail.com>
Signed-off-by: Sage Ahrac <sagiahrak@gmail.com>
Signed-off-by: Sage Ahrac <sagiahrak@gmail.com>
@sagearc sagearc removed hold PRs that are blocked on design, other features, release cycle, etc. labels Mar 31, 2026
@sagearc sagearc requested a review from Copilot March 31, 2026 19:33
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 22 out of 24 changed files in this pull request and generated 4 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

sagearc and others added 4 commits March 31, 2026 22:54
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Signed-off-by: Sage <80211083+sagearc@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Signed-off-by: Sage <80211083+sagearc@users.noreply.github.com>
Signed-off-by: Sage Ahrac <sagiahrak@gmail.com>
Signed-off-by: Sage Ahrac <sagiahrak@gmail.com>
Copy link
Copy Markdown
Contributor

@gyliu513 gyliu513 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice cleanup, just one comment, please remove build-embedded from ci-test.yaml. It is weird that CI did not fail...

@sagearc
Copy link
Copy Markdown
Collaborator Author

sagearc commented Mar 31, 2026

Nice cleanup, just one comment, please remove build-embedded from ci-test.yaml. It is weird that CI did not fail...

Thanks! I think it is already removed?

@gyliu513
Copy link
Copy Markdown
Contributor

@sagearc Yes, seems my local file is not up-to-date.

LGTM

@sagearc sagearc added the lgtm Looks good to me, indicates that a PR is ready to be merged. label Mar 31, 2026
Copy link
Copy Markdown
Collaborator

@yankay yankay left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @sagearc — nice cleanup!

One issue: make build (build-uds) now only runs go build ./pkg/... which doesn't produce a binary. This breaks:

  1. make run./bin/llm-d-kv-cache doesn't exist
  2. DockerfileCOPY --from=builder /workspace/bin/llm-d-kv-cache fails

The old build-embedded had -o bin/$(PROJECT_NAME) but that was removed without a replacement. Suggest adding -o to build-uds or updating the Dockerfile accordingly.

@sagearc sagearc removed the lgtm Looks good to me, indicates that a PR is ready to be merged. label Apr 1, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants