Skip to content

Add a regression test for the non-FSDP ORPO path#1

Open
kashif wants to merge 3 commits into
Ola-Vish:bugfix/orpo_trainerfrom
kashif:orpo-nonfsdp-test
Open

Add a regression test for the non-FSDP ORPO path#1
kashif wants to merge 3 commits into
Ola-Vish:bugfix/orpo_trainerfrom
kashif:orpo-nonfsdp-test

Conversation

@kashif

@kashif kashif commented May 31, 2026

Copy link
Copy Markdown

Nice fix! The only thing missing from linkedin#1229 / linkedin#1230 was a test that locks it in, so here's one to go on top of your branch.

It builds a tiny Llama and calls concatenated_forward on a plain (non-FSDP) model. Before your change this path went unconditionally through _FSDPForwardRedirection, which asserts the module is FSDP and blew up on a single GPU — exactly the crash from linkedin#1229. The test just checks it runs and returns a finite loss.

I kept it narrow on purpose: it stubs concatenated_inputs and builds the trainer with __new__, so it only exercises the forward path you touched and doesn't drag in the whole dataset/tokenizer pipeline. Guarded with importorskip("trl") since the trainer subclasses trl's ORPOTrainer, and it uses infer_device() so CI runs it on GPU.

Feel free to squash it into your branch.

@kashif

kashif commented May 31, 2026

Copy link
Copy Markdown
Author

Folded the trl-compat work into this branch so everything ORPO lands together (closed linkedin#1244 in favor of this):

  • orpo_trainer.py: import ORPOTrainer from trl.experimental.orpo with a fallback to the old trl.trainer path — recent trl moved ORPO into experimental, so the old import fails outright otherwise.
  • run_orpo.py + docs/Examples.md: same fallback for ORPOConfig (the docs snippet was flagged by Copilot on Import ORPO from trl.experimental linkedin/Liger-Kernel#1244).
  • Added the skipped regression test for the non-FSDP path (importorskip("trl"), so it's a no-op in CI where trl isn't installed, and runs where it is — verified passing locally on trl 1.5 + GPU).

No new deps: trl stays optional, nothing added to setup.py. ruff clean.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant