[TITO] model-support: add DeepSeek V4 TITO support #1065
Open
zyzshishui wants to merge 3 commits intoradixark:mainfrom
Open
[TITO] model-support: add DeepSeek V4 TITO support #1065zyzshishui wants to merge 3 commits intoradixark:mainfrom
zyzshishui wants to merge 3 commits intoradixark:mainfrom
Conversation
Contributor
There was a problem hiding this comment.
Code Review
This pull request introduces support for DeepSeek V4 within the TITO tokenization framework and shifts the responsibility for prompt tokenization from SGLang to the Miles session server. Key implementation details include the addition of a specialized DeepSeekV4TITOTokenizer, custom rendering logic for DSv4 in apply_chat_template, and middleware updates to handle input_ids injection. The changes also include compatibility workarounds for SGLang v4 and significant enhancements to the chat template verification tools. Feedback highlights an incomplete suffix removal logic for DSv4 and the need for more robust tool formatting in fallback scenarios.
17f10a8 to
6fc716c
Compare
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Adds TITO support for DeepSeek V4 (tested on Flash due to lack of gpu, but should also work for V4 pro).
Registers a new
deepseekv4TITO family and wires it to SGLang’s DeepSeek V4 encoder path instead of the regular HF/Jinja chat template path.Note: A temporary compatibility guard is included in
miles/utils/dumper_utils.pybecause the current SGLang V4 support branch is not upstreamed and its dumper API is too old. We would remove it after #1045 get merged.Test Plan
1. DeepSeek V4 TITO tokenizer verifier
Verifies that the registered
deepseekv4TITO tokenizer can incrementally merge appended tool messages and decode back to the same text as a full prompt render.2. Fast tokenizer regression suite
Verifies the shared chat-template / TITO tokenizer utilities, including DeepSeek V4 prompt-id alignment against SGLang’s DSv4 encoder path.
3. DeepSeek V4 session-server e2e
Runs the real Miles session-server TITO path with SGLang inference. This verifies that Miles-owned input_ids, DeepSeek V4 prompt encoding, tool-call parsing, rollback, and accumulated token mismatch checks work together.