batch params together in weight sync and async update the weights#5249

Open
winglian wants to merge 1 commit into huggingface:main from winglian:faster-sync-weights

Conversation

@winglian
Contributor

@winglian winglian commented Mar 9, 2026

What does this PR do?

Rather than firing off hundreds of HTTP calls to vLLM to update the weights, we can batch these updates into a single call. We can also make the RPC collective call asynchronous so the update doesn't block the HTTP handler.

For Qwen2 0.5B, this cuts the weight sync from 0.6 s to 0.15 s.
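
The idea behind the speedup can be sketched as follows. This is an illustrative sketch, not the actual TRL code: `post_update`, `post_batch_update`, and `broadcast` are hypothetical stand-ins for the HTTP endpoint calls and the collective broadcast.

```python
# Hypothetical sketch: one HTTP round-trip per parameter vs. a single
# batched metadata POST followed by sequential tensor broadcasts.

def sync_weights_naive(params, post_update, broadcast):
    # Before: one HTTP call per parameter -> hundreds of round-trips.
    for name, tensor in params:
        post_update({"name": name, "dtype": str(tensor.dtype), "shape": tuple(tensor.shape)})
        broadcast(tensor)

def sync_weights_batched(params, post_batch_update, broadcast):
    # After: one HTTP call carrying all parameter metadata, then the
    # tensors are broadcast in the same order as the metadata list.
    metadata = [
        {"name": name, "dtype": str(tensor.dtype), "shape": tuple(tensor.shape)}
        for name, tensor in params
    ]
    post_batch_update(metadata)  # single round-trip
    for _, tensor in params:
        broadcast(tensor)  # order must match the metadata list
```

The HTTP latency is paid once per sync instead of once per parameter, which dominates for models with hundreds of small tensors.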

Fixes # (issue)

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you read the contributor guideline,
    Pull Request section?
  • Was this discussed/approved via a GitHub issue? Please add a link
    to it if that's the case.
  • Did you make sure to update the documentation with your changes?
  • Did you write any new necessary tests?

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.


Note

Medium Risk
Touches distributed weight-sync plumbing (HTTP + NCCL/XCCL ordering, new batching/chunking, and async worker dispatch), so subtle ordering or shape/dtype mismatches could cause sync hangs or incorrect weights despite the change being localized.

Overview
Speeds up vLLM server-mode weight synchronization by introducing a batched update path: the client can now POST parameter metadata once (optionally chunked by total tensor elements), then broadcast the tensors sequentially, and the server loads the received weights in a single load_weights() call.
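
The chunking described here (bounding each batch by total element count) might look roughly like this. The helper name and signature are assumptions for illustration, not the PR's actual implementation:

```python
# Hedged sketch: group (name, tensor) pairs so each chunk holds at most
# chunk_size total elements, mirroring the weight_sync_chunk_size idea.

def chunk_params_by_numel(named_params, chunk_size):
    """Yield lists of (name, tensor) pairs whose combined numel stays
    within chunk_size; a single oversized tensor still gets its own chunk."""
    chunk, numel = [], 0
    for name, tensor in named_params:
        n = tensor.numel()
        # close the current chunk if adding this tensor would overflow it
        if chunk and numel + n > chunk_size:
            yield chunk
            chunk, numel = [], 0
        chunk.append((name, tensor))
        numel += n
    if chunk:
        yield chunk
```

Each chunk then maps to one metadata POST plus its broadcasts, bounding peak memory and request size.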

Updates VLLMGeneration.sync_weights() to use this batched sync when not using ZeRO-3/FSDP gather paths, and threads a new config/arg (weight_sync_chunk_size / vllm_weight_sync_chunk_size) from GRPOConfig through GRPOTrainer into VLLMGeneration.

On the trl vllm-serve side, adds a /batch_update_named_params/ endpoint and switches the existing init_communicator/update_named_param fan-out to workers to send over pipes concurrently via asyncio to avoid blocking the HTTP handler.
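
The concurrent pipe fan-out can be sketched as below. This assumes blocking pipe `send` calls on worker connections; the function and connection names are illustrative, not the actual trl vllm-serve code:

```python
import asyncio

# Hypothetical sketch: dispatch a message to all worker pipes
# concurrently so the HTTP handler doesn't block on each worker in turn.
async def fan_out(connections, message):
    loop = asyncio.get_running_loop()
    # Run each blocking pipe send in the default thread-pool executor;
    # the sends proceed concurrently instead of serially.
    await asyncio.gather(
        *(loop.run_in_executor(None, conn.send, message) for conn in connections)
    )
```

With N workers, handler latency approaches the slowest single send rather than the sum of all sends.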

Written by Cursor Bugbot for commit b777252. This will update automatically on new commits.


@cursor cursor bot left a comment


Cursor Bugbot has reviewed your changes and found 3 potential issues.


commit the batch_update_named_params method
fix peft handling and also reorder elif so we don't have unhandled conditions
@winglian winglian force-pushed the faster-sync-weights branch from fe437e1 to b777252 Compare March 9, 2026 13:55
@winglian winglian mentioned this pull request Mar 9, 2026
5 tasks
@qgallouedec
Member

Thanks for the PR. I'd prefer having a good heuristic (hard-coded value) and avoiding yet another new parameter. What would be a good value?

3 participants