huggingface / trl Public

Pull requests: huggingface/trl

120 Open 2,152 Closed

Pull requests list

Reduce inconsistency across trainer test files

#5678 opened Apr 29, 2026 by qgallouedec Member

Fix discarded assertion message in trainer parameter checks

#5677 opened Apr 29, 2026 by qgallouedec Member

8 tasks

Enable chunked NLL loss with PEFT in SFT

#5676 opened Apr 28, 2026 by qgallouedec Member

Add {% generation %} markers for Cohere2 chat template

#5675 opened Apr 28, 2026 by qgallouedec Member

Simplify peft_config handling in experimental trainers

#5674 opened Apr 28, 2026 by albertvillanova Member

Simplify peft_config handling in core trainers

#5673 opened Apr 28, 2026 by albertvillanova Member

feat(vllm-serve): add --reasoning-parser and --reasoning-config flags

#5672 opened Apr 28, 2026 by kfirah-create

2 of 8 tasks

Fix token_type_ids requirement for gemma-3 models in GRPOTrainer

#5644 opened Apr 26, 2026 by robrui

3 of 6 tasks

DeepSeek v4

#5641 opened Apr 25, 2026 by qgallouedec Member • Draft

8 tasks

Fix spurious KL gradients for zero-std reward groups in GRPOTrainer

#5640 opened Apr 24, 2026 by robrui

3 of 6 tasks

Align tiny-Glm4MoeForCausalLM with GLM-4.5 reference config

#5638 opened Apr 24, 2026 by qgallouedec Member

8 tasks

Refactor tiny-model generation scripts

#5637 opened Apr 24, 2026 by qgallouedec Member

feat: Add generation_kwargs support to LogCompletionsCallback and Wea…

#5625 opened Apr 22, 2026 by LhaseParth2610

4 of 8 tasks

Upload testing suite for DistillationTrainer

#5615 opened Apr 21, 2026 by cmpatino Collaborator

3 of 8 tasks

Add LoRA support for AsyncGRPO

#5610 opened Apr 21, 2026 by jonahsamost

2 of 8 tasks

experimental: Self-Distillation Zero

#5609 opened Apr 20, 2026 by LeonEricsson Collaborator

1 of 8 tasks

support prefetch/prefetch_depth for async GRPO for ~5% speedups

#5602 opened Apr 20, 2026 by winglian Contributor

1 of 8 tasks

fix(distillation): reverse-KL server path NaN on variable completion length

#5594 opened Apr 19, 2026 by k1064190

3 of 8 tasks

Fix nested vocab_size for DistillationTrainer and GOLDTrainer

#5592 opened Apr 19, 2026 by Beichen-Ma

2 of 8 tasks

feat: add TargetPO trainer

#5591 opened Apr 18, 2026 by JeanKaddour • Draft

4 of 8 tasks

Add training chat template for Qwen3-2507

#5574 opened Apr 16, 2026 by SwayamInSync Contributor

refactor: self distillation trainers (sdpo/sdft/...)

#5573 opened Apr 16, 2026 by LeonEricsson Collaborator

2 of 8 tasks

Fix empty-target self-distillation loss to stay connected to model graph

#5572 opened Apr 16, 2026 by walawalagoose

3 of 8 tasks

Improve BrowserGym examples for latest OpenEnv version

#5568 opened Apr 16, 2026 by sergiopaniego Member

8 tasks

Revert VLM support in parse_response

#5561 opened Apr 15, 2026 by qgallouedec Member • Draft
