Fix Qwen3Renderer stripping out last <think>, and Qwen3DisableThinkingRenderer not adding <think>\n\n</think> #178
base: main
Conversation
… format

The previous implementation only added `<think>\n` to assistant messages via the parent class, but the official Qwen3-8B tokenizer format requires the complete empty thinking block: `<think>\n\n</think>\n\n`.

This commit fixes the issue by:
1. Overriding render_message() to prepend the complete empty thinking block to assistant messages that don't already have one
2. Delegating to the parent class for all rendering logic
3. Adding test cases to verify the fix matches official tokenizer behavior

Fixes: thinking-machines-lab#176
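For reference, the official format can be inspected directly (a minimal sketch, assuming the transformers library and access to the Qwen/Qwen3-8B tokenizer; `enable_thinking` is forwarded to the Qwen3 chat template):

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen3-8B")
messages = [{"role": "user", "content": "What is 2+2?"}]
# With thinking disabled, the prompt ends with the complete empty
# thinking block "<think>\n\n</think>\n\n" after the assistant header.
prompt = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
    enable_thinking=False,
)
print(prompt)
```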
```python
self.strip_thinking_from_history
and message["role"] == "assistant"
and "</think>" in ac_content
and not is_last
```
Fix for Qwen3Renderer: don't strip the content of the last think block.

Issue:

```
2-TURN CONVERSATION - THINKING SHOULD BE PRESERVED:
================================================================================
TINKER:
<|im_start|>user
What is 2+2?<|im_end|>
<|im_start|>assistant
The answer is 4.<|im_end|>

HUGGINGFACE:
<|im_start|>user
What is 2+2?<|im_end|>
<|im_start|>assistant
<think>    <---- HuggingFace tokenizer template preserves <think>
Let me calculate this.
</think>
The answer is 4.<|im_end|>
```
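The fix in the diff above adds `and not is_last` to the stripping condition. A minimal sketch of the resulting logic (hypothetical helper; `ac_content` and `is_last` follow the diff's names):

```python
def maybe_strip_thinking(ac_content: str, role: str, is_last: bool,
                         strip_thinking_from_history: bool) -> str:
    if (
        strip_thinking_from_history
        and role == "assistant"
        and "</think>" in ac_content
        and not is_last  # the fix: never strip the final assistant turn
    ):
        # Drop everything up to and including the closing tag, matching
        # the HuggingFace template, which only strips earlier turns.
        ac_content = ac_content.split("</think>", 1)[1].lstrip("\n")
    return ac_content
```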
| if "<think>" not in content: | ||
| message = message.copy() | ||
| message["content"] = "<think>\n\n</think>\n\n" + content | ||
| return super().render_message(idx, message, is_last=is_last) |
For Qwen3DisableThinkingRenderer: it did not add `<think>\n\n</think>\n\n` during SFT.

```
================================================================================
BUG: Official tinker-cookbook Qwen3DisableThinkingRenderer
================================================================================
Actual output from renderer:
<|im_start|>user
What is 2+2?<|im_end|>
<|im_start|>assistant
<think>    <---- Missing the \n\n</think> tokens
The answer is 4.<|im_end|>
================================================================================
Expected output from Qwen3-8B tokenizer:
================================================================================
<|im_start|>user
What is 2+2?<|im_end|>
<|im_start|>assistant
<think>

</think>

The answer is 4.<|im_end|>
```
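A self-contained sketch of the override described in the commit message (a sketch only; it assumes the parent Qwen3Renderer and the render_message(idx, message, is_last) signature shown in the diff):

```python
class Qwen3DisableThinkingRenderer(Qwen3Renderer):
    def render_message(self, idx: int, message: dict, is_last: bool = False):
        content = message["content"]
        # Prepend the complete empty thinking block so SFT targets match
        # the official Qwen3-8B tokenizer output.
        if message["role"] == "assistant" and "<think>" not in content:
            message = message.copy()  # avoid mutating the caller's message
            message["content"] = "<think>\n\n</think>\n\n" + content
        # Delegate all actual rendering to the parent class.
        return super().render_message(idx, message, is_last=is_last)
```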
Thanks for looking into this -- I agree that there's a bug. Could you check (and add a test) that …
```python
elif message["role"] == "assistant" and "<think>" not in ac_content and is_last:
    # Matching the paper, we force the assistant to start with <think>. Some SFT datasets
    # include <think> in the assistant messages, so we don't need to re-add it in those cases.
    ob_str += "<think>\n"
```
The renderer test caught another bug: in multiturn conversations, `<think>` was added to the intermediate assistant turns as well.

```
Cookbook string: <|im_start|>user
Hello, how are you?<|im_end|>
<|im_start|>assistant
<think>    <--- THIS SHOULDN'T GET ADDED
I'm fine, thank you!<|im_end|>
<|im_start|>user
What is the capital of France?<|im_end|>
<|im_start|>assistant
<think>
```
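A toy loop showing the effect of the `is_last` guard (hypothetical standalone script; `ob_str`/`ac_content` follow the diff's names): without `and is_last`, the intermediate assistant turn would also get the `<think>` prefix.

```python
messages = [
    {"role": "user", "content": "Hello, how are you?"},
    {"role": "assistant", "content": "I'm fine, thank you!"},
    {"role": "user", "content": "What is the capital of France?"},
]

rendered = []
for idx, message in enumerate(messages):
    is_last = idx == len(messages) - 1
    ac_content = message["content"]
    ob_str = f"<|im_start|>{message['role']}\n"
    # Only the final message may be forced to start with <think>.
    if message["role"] == "assistant" and "<think>" not in ac_content and is_last:
        ob_str += "<think>\n"
    rendered.append(ob_str + ac_content + "<|im_end|>\n")
print("".join(rendered))
```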
```python
# XXX this causes inefficiency in RL, because the observations don't grow by appending to the end.
# Maybe we should just insert this empty thinking block in every message?
prefill = "<think>\n\n</think>\n\n" + (prefill or "")
return super().build_generation_prompt(messages, role, prefill)
```
Oops, the new generation test caught a bug: we still need to prefill like this.
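A hedged sanity check for the restored prefill (renderer and tokenizer are assumed to be constructed elsewhere, e.g. a Qwen3DisableThinkingRenderer and the Qwen/Qwen3-8B tokenizer; `to_ints()` is assumed to flatten the prompt to token ids):

```python
def check_generation_prefill(renderer, tokenizer) -> None:
    prompt = renderer.build_generation_prompt(
        [{"role": "user", "content": "What is 2+2?"}]
    )
    decoded = tokenizer.decode(prompt.to_ints())
    # With thinking disabled, the generation prompt must end with the
    # complete empty thinking block.
    assert decoded.endswith("<|im_start|>assistant\n<think>\n\n</think>\n\n")
```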
| f"HF tokens: {hf_tokens}\n" | ||
| f"HF string: {tokenizer.decode(hf_tokens)}" | ||
| ) | ||
|
|
Added a test for generation; a sketch of what it could look like is below.
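A hedged sketch of such a generation parity test (helper names are hypothetical; it assumes the Qwen/Qwen3-8B tokenizer and that `build_generation_prompt` returns an object with `to_ints()`):

```python
from transformers import AutoTokenizer

def test_generation_prompt_matches_hf(renderer) -> None:
    tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen3-8B")
    messages = [{"role": "user", "content": "What is 2+2?"}]
    # The HF template with thinking disabled is the reference behavior.
    hf_tokens = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True, enable_thinking=False
    )
    our_tokens = renderer.build_generation_prompt(messages).to_ints()
    assert our_tokens == hf_tokens, (
        f"HF tokens: {hf_tokens}\n"
        f"HF string: {tokenizer.decode(hf_tokens)}"
    )
```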
Suggested fix for #177 and #176