Fix token_type_ids requirement for gemma-3 models in GRPOTrainer by robrui · Pull Request #5644 · huggingface/trl

robrui · 2026-04-26T02:54:40Z

What does this PR do?

Fixes #5032: when no processor (e.g. a VLM processor) is available, gemma-3 models crash during GRPO/RLOO training with "ValueError: token_type_ids is required".

Root cause: in the text-only forward path, forward_kwargs is empty. Models like gemma-3 that include token_type_ids in their forward signature receive no token_type_ids, causing a crash.
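As a minimal sketch of the root cause, the snippet below checks whether a forward method declares token_type_ids. The two classes are hypothetical stand-ins, not real model classes, and using inspect.signature here is an assumption: this description does not show how self.model_kwarg_keys is populated in the trainers.

```python
import inspect


class PlainTextModel:
    """Hypothetical stand-in for a model whose forward has no token_type_ids."""

    def forward(self, input_ids, attention_mask=None):
        return input_ids


class TokenTypeModel:
    """Hypothetical stand-in for a model that declares token_type_ids."""

    def forward(self, input_ids, attention_mask=None, token_type_ids=None):
        # Mimics the reported failure when the argument is not supplied.
        if token_type_ids is None:
            raise ValueError("token_type_ids is required")
        return input_ids


def forward_arg_names(model):
    """Return the parameter names declared by model.forward (self excluded)."""
    return set(inspect.signature(model.forward).parameters)


forward_kwargs = {}  # empty, as in the text-only path
model = TokenTypeModel()
print("token_type_ids" in forward_arg_names(model))  # True

try:
    model.forward([[1, 2, 3]], **forward_kwargs)
except ValueError as err:
    print(f"crash reproduced: {err}")
```

The point of the sketch is only that an empty forward_kwargs is harmless for the first stand-in and fatal for the second.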

Fix: before the text-only forward pass, if "token_type_ids" is in self.model_kwarg_keys, create a zero tensor matching the prompt length. The existing extension block then pads it with zeros for the completion tokens automatically.
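The guarded creation and the later extension can be sketched roughly as follows. This is not the PR's diff: the helper name is hypothetical, and plain Python lists stand in for the tensors the trainers use so the sketch runs without dependencies.

```python
def build_text_only_forward_kwargs(prompt_ids, completion_ids, model_kwarg_keys):
    """Hypothetical helper: build forward kwargs for a text-only batch."""
    forward_kwargs = {}  # empty in the text-only path

    # Guarded creation: only models whose forward declares token_type_ids
    # get a zero-filled entry, one zero per prompt token.
    if "token_type_ids" in model_kwarg_keys:
        forward_kwargs["token_type_ids"] = [[0] * len(row) for row in prompt_ids]

    # Stand-in for the existing extension step: append one zero per
    # completion token so the length matches prompt + completion.
    if "token_type_ids" in forward_kwargs:
        forward_kwargs["token_type_ids"] = [
            types + [0] * len(completion)
            for types, completion in zip(forward_kwargs["token_type_ids"], completion_ids)
        ]
    return forward_kwargs


prompt_ids = [[5, 6, 7], [8, 9, 10]]
completion_ids = [[11, 12], [13, 14]]
kwargs = build_text_only_forward_kwargs(
    prompt_ids, completion_ids, {"input_ids", "token_type_ids"}
)
print(kwargs["token_type_ids"])  # [[0, 0, 0, 0, 0], [0, 0, 0, 0, 0]]
```

For a model whose forward does not declare token_type_ids, the same helper returns an empty dict, so the guard leaves other models unaffected.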

Applied across 5 trainers: GRPOTrainer, RLOOTrainer, DPPOTrainer, GFPOTrainer, GRPOWithReplayBufferTrainer.

Before submitting

This PR fixes a typo or improves the docs (you can leave this unchecked)
Did you read the contributor guideline?
Was this discussed/approved via a GitHub issue? (Fine-Tuning gemma-3-4b-pt raises "ValueError: token_type_ids is required as a model input when training" #5032)
Did you make sure to update the documentation? No — the fix has no user-facing API change.
Did you write any new necessary tests? No — the change is a guarded zero-tensor creation. Existing training CI exercises the text-only path for gemma models.

AI writing disclosure

AI-assisted (AI tools assisted with code generation; all changes reviewed and verified by a human before submission)

When fine-tuning gemma-3-4b-pt with TRL, the model requires token_type_ids in the forward pass. This fix ensures token_type_ids are properly set during training data preparation for text-only (non-VLM) models whose forward method expects them. Closes huggingface#5032

robrui added 3 commits April 26, 2026 02:10

Apply same token_type_ids fix to RLOOTrainer and replay buffer GRPOTr…

0049bb0

…ainer These trainers have the same code path: text-only (non-VLM) models receive an empty forward_kwargs dict, causing token_type_ids to be missing when the model requires them (e.g., gemma-3).

Apply same token_type_ids fix to GFPOTrainer and DPPOTrainer

507b0e4

These experimental trainers share the same pattern: text-only models receive an empty forward_kwargs dict, missing token_type_ids when the model requires them (e.g., gemma-3).

qgallouedec added the 🫠 AI slop label May 4, 2026

qgallouedec closed this May 4, 2026

robrui wants to merge 3 commits into huggingface:main from robrui:fix/gemma-token-type-ids
