Upload testing suite for DistillationTrainer #5615
Conversation
qgallouedec
left a comment
Thanks for adding tests to `DistillationTrainer`; it had none, so this is directionally welcome. A few things worth addressing before merge:

- Heavy use of `DistillationTrainer.__new__(...)` + manual attribute assignment, plus five `Dummy*`/custom mock classes. Every test assembles a fake trainer by bypassing `__init__` and setting ~10 private attributes inline. This couples every test to internal attribute names: any rename or new attribute used in `compute_loss` silently breaks the suite. It also cuts against two principles: consistency (the rest of `tests/experimental/`, e.g. `test_gkd_trainer.py`, loads a tiny real model and exercises the real `__init__`) and simplicity (the mock scaffolding and duplicated attribute setup in `test_server_teacher_path_handles_variable_prompt_lengths`/`..._padded_completions` is exactly that). A tiny-model fixture would remove most of it.
- No end-to-end test. One short `trainer.train()` on a tiny student+teacher would catch far more than the current mocked suite combined, and would make most of the attribute-juggling tests redundant. It would also align better with the principle of testing behavior over implementation.
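The brittleness described above can be illustrated with a toy class (a hypothetical stand-in, not the real trainer): constructing via `__new__` and poking private attributes ties the test to internal names, while going through the real `__init__` does not.

```python
# Hedged illustration (toy class, not the real DistillationTrainer) of why
# `__new__` + manual attribute pokes is brittle compared to the real `__init__`.
class Trainer:
    def __init__(self, beta):
        self._beta = beta  # private name the class is free to rename

    def compute_loss(self, x):
        return self._beta * x

# Brittle: bypasses __init__ and hardcodes the private attribute name;
# this setup breaks the moment `_beta` is renamed internally.
t = Trainer.__new__(Trainer)
t._beta = 0.5
assert t.compute_loss(2.0) == 1.0

# Robust: exercise the real constructor; internals stay encapsulated.
t2 = Trainer(beta=0.5)
print(t2.compute_loss(2.0))
```

The second form survives any internal refactor that keeps the constructor signature stable, which is exactly the property the review is asking the suite to have.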
> torch.testing.assert_close(local_loss, server_loss)

> def test_sampled_mode_keeps_teacher_argmax_for_forward_support():
This is tautological. The expected value is computed by calling the same private helper `_jsd_divergence`, with the same support/mask construction that `compute_loss` uses internally, so it tests wiring, not semantics. Any bug inside `_jsd_divergence` is invisible. Derive the expected value from first principles (the plain JSD formula) or drop it.
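A first-principles expected value can be computed without touching any trainer helper. The sketch below assumes one common convention for the generalized JSD (the GKD-style formulation, with mixture weight `beta`); `kl` and `generalized_jsd` are hypothetical names for this illustration:

```python
import math

# Generalized JSD from the plain formula, independent of the code under test:
# JSD_beta(p, q) = beta * KL(p || m) + (1 - beta) * KL(q || m),
# with mixture m = beta * p + (1 - beta) * q (assumed convention).
def kl(p, q):
    # Discrete KL divergence; assumes p and q are probability vectors.
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def generalized_jsd(p, q, beta=0.5):
    m = [beta * pi + (1 - beta) * qi for pi, qi in zip(p, q)]
    return beta * kl(p, m) + (1 - beta) * kl(q, m)

p = [0.7, 0.2, 0.1]
q = [0.1, 0.2, 0.7]
print(generalized_jsd(p, p))                            # identity case: 0.0
print(generalized_jsd(p, q) == generalized_jsd(q, p))   # symmetric at beta=0.5
```

Comparing the trainer's output against a hand-rolled reference like this catches bugs inside the helper itself, which the wiring-only comparison cannot.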
> report_to="none",
> )

> assert caught == []
Fragile: any unrelated `DeprecationWarning` from a dependency will break it. The `pytest.raises` is already the real assertion.
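If a warning assertion is kept at all, filtering the recorded warnings to the category under test avoids the fragility. A minimal sketch (the `dependency_call` function is a hypothetical stand-in for a third-party library):

```python
import warnings

def dependency_call():
    # Stand-in for an unrelated dependency emitting its own warning.
    warnings.warn("old dependency API", DeprecationWarning)

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    dependency_call()  # the code under test happens to touch the dependency

# Brittle form (would fail here): assert caught == []
# Robust form: count only the warning category the test is about.
relevant = [w for w in caught if issubclass(w.category, FutureWarning)]
assert relevant == []
print(len(caught), len(relevant))  # → 1 0
```

The brittle form fails on the dependency's noise; the filtered form only trips when the specific warning category the test cares about actually fires.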
Thank you for the feedback! I addressed your comments with the following changes:
- Followed GKD's tests and used a small model and dataset to test the trainer.
- Improved the Liger tests by following the testing approach from the SFTTrainer.
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
Cursor Bugbot has reviewed your changes and found 1 potential issue.
Reviewed by Cursor Bugbot for commit 1033b7c.
What does this PR do?
Uploads tests for the `DistillationTrainer`.

Before submitting
AI writing disclosure
We welcome the use of AI tools to help with contributions. For transparency and to help us improve our review process, please indicate the level of AI involvement in this PR.
Who can review?
Anyone in the community is free to review the PR once the tests have passed. Feel free to tag members/contributors who may be interested in your PR.
Note
Low Risk
Test-only changes; primary risk is increased CI runtime/flakiness due to added integration-style training tests and dataset/model downloads.
Overview
Adds a much broader test suite for `DistillationTrainer`, covering config validation, teacher-server request shaping via `build_teacher_request_inputs`, and correct handling of ragged/padded teacher logprob responses (including `-inf` padding behavior and finite gradients through reverse-KL/JSD paths).

Introduces unit tests for `DistillationTrainer.generalized_jsd_loss` (beta/temperature/reduction/symmetry/identity cases), a lightweight end-to-end `train()` smoke test (including checkpoint save), an optional Liger-kernel training test, parity checks between local-teacher and vLLM-teacher-server loss computation (mocked `VLLMClient`), and a regression test that `_RepeatBatchDataLoader` forwards `set_epoch` correctly.

Reviewed by Cursor Bugbot for commit b3340a2.
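The `set_epoch`-forwarding regression mentioned above follows a simple pattern. `_RepeatBatchDataLoader` is internal to TRL, so the classes below are minimal hypothetical stand-ins that only illustrate why the wrapper must delegate `set_epoch` to the inner loader (distributed samplers rely on `set_epoch` to reshuffle between epochs):

```python
# Hedged stand-ins, not TRL's real classes: the point is only the delegation.
class InnerLoader:
    def __init__(self):
        self.epochs_seen = []

    def set_epoch(self, epoch):
        self.epochs_seen.append(epoch)

class RepeatBatchDataLoader:
    def __init__(self, loader, repeats=2):
        self.loader = loader
        self.repeats = repeats

    def set_epoch(self, epoch):
        # The forwarding under test: without it, the epoch never reaches
        # the sampler and shuffling is identical every epoch.
        self.loader.set_epoch(epoch)

inner = InnerLoader()
wrapper = RepeatBatchDataLoader(inner)
for epoch in range(3):
    wrapper.set_epoch(epoch)
print(inner.epochs_seen)  # → [0, 1, 2]
```

A regression test for this pattern just asserts that every epoch passed to the wrapper shows up on the inner loader, which is cheap and does not require any model or dataset.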
DistillationTrainer.generalized_jsd_loss(beta/temperature/reduction/symmetry/identity cases), a lightweight end-to-endtrain()smoke test (including checkpoint save), an optional Liger-kernel training test, parity checks between local-teacher vs vLLM-teacher-server loss computation (mockedVLLMClient), and a regression test that_RepeatBatchDataLoaderforwardsset_epochcorrectly.Reviewed by Cursor Bugbot for commit b3340a2. Bugbot is set up for automated code reviews on this repo. Configure here.