Open
Description
- Did you update? pip install --upgrade unsloth unsloth_zoo - yes
- Colab or Kaggle or local / cloud - local
- Number GPUs used, use nvidia-smi - 1 L40s
- Which notebook? Please link!
- Which Unsloth version, TRL version, transformers version, PyTorch version?
Name: unsloth
Version: 2026.2.1
Name: trl
Version: 0.24.0
Name: transformers
Version: 4.57.6
Name: torch
Version: 2.9.
- Which trainer? SFTTrainer, GRPOTrainer etc - GRPOTrainer
Unsloth: Will smartly offload gradients to save VRAM!
Traceback (most recent call last):
File "/umbc/ada/ferraro/users/sroydip1/DecomposeRL/decomposer/unsloth/grpo_lora.py", line 202, in <module>
main(args)
File "/umbc/ada/ferraro/users/sroydip1/DecomposeRL/decomposer/unsloth/grpo_lora.py", line 170, in main
trainer.train(resume_from_checkpoint=resume_from_checkpoint)
File "/umbc/ada/ferraro/users/sroydip1/DecomposeRL/unsloth_compiled_cache/UnslothGRPOTrainer.py", line 66, in wrapper
output = f(self, *args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^
File "/umbc/ada/ferraro/users/sroydip1/DecomposeRL/.venv/lib/python3.12/site-packages/transformers/trainer.py", line 2325, in train
return inner_training_loop(
^^^^^^^^^^^^^^^^^^^^
File "<string>", line 330, in _fast_inner_training_loop
File "<string>", line 40, in _unsloth_training_step
File "/umbc/ada/ferraro/users/sroydip1/DecomposeRL/unsloth_compiled_cache/UnslothGRPOTrainer.py", line 3698, in compute_loss
low_clip = masked_batch_mean(is_low_clipped.float())
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/umbc/ada/ferraro/users/sroydip1/DecomposeRL/unsloth_compiled_cache/UnslothGRPOTrainer.py", line 3687, in masked_batch_mean
return (x * completion_mask).sum() / completion_token_count
~~^~~~~~~~~~~~~~~~~
RuntimeError: The size of tensor a (828) must match the size of tensor b (824) at non-singleton dimension 1
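For triage, the RuntimeError says that in masked_batch_mean the tensor x (here is_low_clipped.float()) has length 828 along dimension 1 while completion_mask has length 824, so the elementwise product cannot broadcast. The sketch below is not Unsloth or TRL code. It is a standalone, stdlib-only illustration of the broadcasting rule that produces this message. The helper name first_broadcast_mismatch and the batch size of 8 are hypothetical; only the 828 and 824 lengths come from the traceback.

```python
from typing import Optional, Sequence


def first_broadcast_mismatch(
    shape_a: Sequence[int], shape_b: Sequence[int]
) -> Optional[int]:
    """Return the index of the first dimension (scanning from the trailing
    dimension) where two shapes cannot broadcast, or None if they can."""
    ndim = max(len(shape_a), len(shape_b))
    # Left-pad the shorter shape with 1s, as broadcasting does.
    a = (1,) * (ndim - len(shape_a)) + tuple(shape_a)
    b = (1,) * (ndim - len(shape_b)) + tuple(shape_b)
    for dim in range(ndim - 1, -1, -1):
        # Sizes are compatible when they are equal or when either one is 1.
        if a[dim] != b[dim] and a[dim] != 1 and b[dim] != 1:
            return dim
    return None


# Hypothetical batch size of 8; the sequence lengths are from the traceback.
x_shape = (8, 828)     # shape of is_low_clipped.float()
mask_shape = (8, 824)  # shape of completion_mask
print(first_broadcast_mismatch(x_shape, mask_shape))
```

The reported dimension matches the "non-singleton dimension 1" in the error. That points at the sequence (token) axis and suggests the per-token tensors and the completion mask were built for different completion lengths. Where that 4-token difference is introduced is not visible from this traceback.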