Conversation
@djsaunde djsaunde commented Nov 7, 2025

This PR patches the SFTTrainer class's constructor to auto-enable sample packing when applicable (i.e., when the user is not training a multi-modal model and is not passing in a custom data collator). Sample packing is a good default in SFT since it reduces the amount of zero-padding in each batch and thereby increases training throughput (tokens/s).
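A minimal sketch of the gating logic described above. The function and parameter names here are hypothetical, not the actual patch; the real condition lives inside the SFTTrainer constructor.

```python
# Hypothetical sketch of the auto-enable condition described in this PR.
# Names (model_is_multimodal, custom_collator) are assumptions for illustration.
def should_enable_packing(model_is_multimodal: bool, custom_collator) -> bool:
    """Decide whether to default sample packing on for an SFT run.

    Packing concatenates multiple short examples into a single sequence up to
    the max sequence length, which cuts zero-padding and raises tokens/s.
    It is skipped when it could break batching assumptions:
      - multi-modal models interleave image and text tokens, and
      - a user-supplied collator may not understand packed sequences.
    """
    if model_is_multimodal or custom_collator is not None:
        return False
    return True
```

A trainer constructor would call this before building its default data collator, leaving any user-set `packing` flag untouched.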

Follow-up to #3525; this can merge after that PR does. Update: I closed that PR; I think we should just merge this one.

Needs to be extensively tested so we don't break any existing notebooks / scripts.

@djsaunde djsaunde self-assigned this Nov 7, 2025
@djsaunde djsaunde removed the request for review from danielhanchen November 7, 2025 19:03
@djsaunde djsaunde changed the title auto-enable sample packing auto-enable SFT sample packing Nov 7, 2025
@djsaunde djsaunde force-pushed the auto-packing branch 2 times, most recently from 072de80 to efe4424 Compare November 10, 2025 20:23
@djsaunde djsaunde force-pushed the auto-packing branch 3 times, most recently from f352814 to 9cd0b26 Compare November 19, 2025 19:14
@djsaunde djsaunde mentioned this pull request Nov 20, 2025
1 task
@djsaunde djsaunde marked this pull request as ready for review November 20, 2025 13:47
@djsaunde
I tested all the SFT notebooks on the main page README. I'd like to test a few more notebooks, including ones that we don't expect to use packing (multimodal training, RL training, ...).

@djsaunde
Okay, I ran through all the notebooks displayed in the README. As expected, none of the non-text-only SFT notebooks utilized sample packing, and none of them hit any errors when switching to this branch.

@djsaunde djsaunde changed the title auto-enable SFT sample packing SFT sample packing Nov 21, 2025
