Conversation

@thejaminator

Suggested fix for #177 and #176

… format

The previous implementation only added `<think>\n` to assistant messages
via the parent class, but the official Qwen3-8B tokenizer format requires
the complete empty thinking block: `<think>\n\n</think>\n\n`.

This commit fixes the issue by:
1. Overriding render_message() to prepend the complete empty thinking block
   to assistant messages that don't already have one
2. Delegating to the parent class for all rendering logic
3. Adding test cases to verify the fix matches official tokenizer behavior

Fixes: thinking-machines-lab#176
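In rough outline, the fix described above amounts to the following sketch (the parent class name, import path, and exact render_message signature are assumptions inferred from the snippets quoted below):

from tinker_cookbook.renderers import Qwen3Renderer  # import path is an assumption

class Qwen3DisableThinkingRenderer(Qwen3Renderer):
    def render_message(self, idx, message, is_last=False):
        content = message["content"]
        if message["role"] == "assistant" and "<think>" not in content:
            # Prepend the complete empty thinking block expected by the
            # official Qwen3-8B chat template when thinking is disabled.
            message = message.copy()
            message["content"] = "<think>\n\n</think>\n\n" + content
        # Delegate all actual rendering to the parent class.
        return super().render_message(idx, message, is_last=is_last)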
self.strip_thinking_from_history
and message["role"] == "assistant"
and "</think>" in ac_content
and not is_last
@thejaminator (Author), Dec 15, 2025

Fix for Qwen3Renderer: don't remove the content of the last think block.

Issue:

2-TURN CONVERSATION - THINKING SHOULD BE PRESERVED:
================================================================================
TINKER:
<|im_start|>user
What is 2+2?<|im_end|>
<|im_start|>assistant
The answer is 4.<|im_end|>

HUGGINGFACE:
<|im_start|>user
What is 2+2?<|im_end|>
<|im_start|>assistant
<think> <---- Huggingface tokenizer template preserves <think>
Let me calculate this.
</think>

The answer is 4.<|im_end|>
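Restated as a standalone helper, the guarded condition above amounts to the sketch below (the helper name and the split-based stripping are illustrative, not the cookbook's actual code):

def strip_history_thinking(content, is_assistant, is_last, strip_thinking_from_history):
    # Drop the think block only from earlier assistant turns; the final
    # assistant turn keeps its reasoning, matching the Hugging Face output above.
    if strip_thinking_from_history and is_assistant and "</think>" in content and not is_last:
        return content.split("</think>")[-1].lstrip("\n")
    return content

final_turn = "<think>\nLet me calculate this.\n</think>\n\nThe answer is 4."
assert strip_history_thinking(final_turn, True, is_last=True, strip_thinking_from_history=True) == final_turn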

if "<think>" not in content:
message = message.copy()
message["content"] = "<think>\n\n</think>\n\n" + content
return super().render_message(idx, message, is_last=is_last)
@thejaminator (Author)

For Qwen3DisableThinkingRenderer, the renderer did not add `<think>\n\n</think>\n\n` during SFT.

================================================================================
BUG: Official tinker-cookbook Qwen3DisableThinkingRenderer
================================================================================

Actual output from renderer:
<|im_start|>user
What is 2+2?<|im_end|>
<|im_start|>assistant
<think> <---- Missing the \n\n</think> tokens
The answer is 4.<|im_end|>

================================================================================
Expected output from Qwen3-8B tokenizer:
================================================================================
<|im_start|>user
What is 2+2?<|im_end|>
<|im_start|>assistant
<think>

</think>

The answer is 4.<|im_end|>
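A parity test along these lines catches the mismatch; the assertion is grounded in the expected Qwen3-8B output shown above, and only the Hugging Face reference side is spelled out because the cookbook renderer's construction isn't shown in this thread:

from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen3-8B")
messages = [
    {"role": "user", "content": "What is 2+2?"},
    {"role": "assistant", "content": "The answer is 4."},
]
# Reference string from the official chat template (thinking disabled).
hf_str = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=False, enable_thinking=False
)
# Per the expected output above, the assistant turn must contain the empty
# thinking block; the renderer's output should match hf_str token-for-token.
assert "<think>\n\n</think>\n\n" in hf_str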

@joschu (Collaborator) commented Dec 15, 2025

Thanks for looking into this -- I agree that there's a bug. Could you check (and add a test) that build_generation_prompt does the right thing for Qwen3DisableThinkingRenderer?

elif message["role"] == "assistant" and "<think>" not in ac_content and is_last:
# Matching the paper, we force the assistant to start with <think>. Some SFT datasets include
# <think> in the assistant messages, we so don't need to re-add it in those cases.
ob_str += "<think>\n"
@thejaminator (Author)

The renderer test caught another bug: in multi-turn conversations, <think> was added to the intermediate assistant turns.

Cookbook string: <|im_start|>user
Hello, how are you?<|im_end|>
<|im_start|>assistant
<think> <--- THIS SHOULDN'T GET ADDED
I'm fine, thank you!<|im_end|>
<|im_start|>user
What is the capital of France?<|im_end|>
<|im_start|>assistant
<think>

# XXX this causes inefficiency in RL, because the observations don't grow by appending to the end.
# Maybe we should just insert this empty thinking block in every message?
prefill = "<think>\n\n</think>\n\n" + (prefill or "")
return super().build_generation_prompt(messages, role, prefill)
@thejaminator (Author), Dec 15, 2025

Oops, the new generation test caught a bug: we still need to prefill like this.

f"HF tokens: {hf_tokens}\n"
f"HF string: {tokenizer.decode(hf_tokens)}"
)

@thejaminator (Author)

Added a test for generation.
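A sketch of such a generation test, checking the Hugging Face reference prompt (the renderer side is omitted because its exact construction API isn't shown in this thread; the expected suffix assumes Qwen3's template behavior with thinking disabled):

from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen3-8B")
messages = [{"role": "user", "content": "What is 2+2?"}]

# Reference tokens for the generation prompt with thinking disabled.
hf_tokens = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, enable_thinking=False
)
hf_str = tokenizer.decode(hf_tokens)

# With thinking disabled, the prompt should end with the empty thinking block
# right after the assistant header; build_generation_prompt should produce
# the same tokens as hf_tokens.
assert hf_str.endswith("<|im_start|>assistant\n<think>\n\n</think>\n\n")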
