Introduce backend rollout-completions interface and decouple OpenEnv helper from vLLM internals #5256
rycerzes wants to merge 8 commits into huggingface:main
…neration and sync weights
- prevent recursive generation
Cursor Bugbot has reviewed your changes and found 1 potential issue.
# Use the base Trainer input preparation path, not trainer-specific overrides
# like GRPO/RLOO _prepare_inputs, to avoid recursive generation.
base_prepare_inputs = super(type(trainer), trainer)._prepare_inputs
super(type(trainer)) breaks for subclassed trainers
Medium Severity
super(type(trainer), trainer)._prepare_inputs resolves relative to the runtime class rather than a fixed class. If a user subclasses GRPOTrainer (or RLOOTrainer), type(trainer) is the subclass, so super() starts its lookup just after the subclass in the MRO and lands on the trainer-specific _prepare_inputs override instead of the base Trainer._prepare_inputs. The old inline code used Python 3's zero-argument super() inside the trainer method, which always resolves relative to the defining class (GRPOTrainer) and therefore skips that class's own override. The new standalone factory function cannot use __class__-based super(), and the type()-based workaround does not skip enough MRO levels for subclassed trainers.
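
A minimal reproduction of the lookup behaviour described above. The classes are simplified stand-ins, not the real TRL or transformers trainers, and `prepare_via_fixed_base` is only one possible workaround (calling the base method explicitly), not necessarily the fix this PR should adopt.

```python
# Stand-in classes (hypothetical, for illustration only).
class Trainer:
    def _prepare_inputs(self, inputs):
        return "base"


class GRPOTrainer(Trainer):
    def _prepare_inputs(self, inputs):
        # In the real trainer this override triggers generation.
        return "grpo-override"


class CustomGRPOTrainer(GRPOTrainer):
    """A user subclass that does not override _prepare_inputs."""


def prepare_via_runtime_type(trainer, inputs):
    # Pattern flagged in the review: lookup starts after type(trainer) in the MRO.
    return super(type(trainer), trainer)._prepare_inputs(inputs)


def prepare_via_fixed_base(trainer, inputs):
    # Assumed workaround: name the base class explicitly so the result
    # does not depend on how deeply the trainer has been subclassed.
    return Trainer._prepare_inputs(trainer, inputs)


print(prepare_via_runtime_type(GRPOTrainer(), {}))        # base
print(prepare_via_runtime_type(CustomGRPOTrainer(), {}))  # grpo-override
print(prepare_via_fixed_base(CustomGRPOTrainer(), {}))    # base
```

With a direct GRPOTrainer instance the type()-based call skips the override as intended; with the subclass it resolves to GRPOTrainer._prepare_inputs, which is the recursive-generation path the comment in the diff is trying to avoid.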


Summary
Closes #5194 (previous step: #5244; part of #5119). Adds an internal rollout-completions capability to backend.py and refactors utils.py to dispatch through trainer.generation_backend, removing direct trainer/backend introspection in the helper flow.

Changes
- RolloutCompletion dataclass
- RolloutCompletionsBackend protocol
- generate_rollout_completions(...) in VLLMBackendAdapter (server + colocate)
- trainer.use_vllm / trainer.vllm_mode
- trainer.vllm_generation.* calls
- trainer.generation_backend.generate_rollout_completions(...)
- prompt_ids, completion_ids, logprobs, text

Preserved
CC: @albertvillanova
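
For readers unfamiliar with the pattern, the interface named in the Summary could look roughly like the sketch below. Only the names RolloutCompletion, RolloutCompletionsBackend, generate_rollout_completions, trainer.generation_backend and the four field names come from the PR description; the field types, the `prompts` parameter, and the `EchoBackend`, `DummyTrainer` and `rollout_helper` names are assumptions made for illustration and are not taken from the actual diff.

```python
from dataclasses import dataclass
from typing import Protocol, runtime_checkable


@dataclass
class RolloutCompletion:
    # Field names are from the PR description; the types are assumed.
    prompt_ids: list[int]
    completion_ids: list[int]
    logprobs: list[float]
    text: str


@runtime_checkable
class RolloutCompletionsBackend(Protocol):
    # The real signature is elided in the PR ("..."); `prompts` is hypothetical.
    def generate_rollout_completions(self, prompts: list[str]) -> list[RolloutCompletion]: ...


class EchoBackend:
    """Hypothetical stand-in backend: 'completes' each prompt by reversing it."""

    def generate_rollout_completions(self, prompts):
        results = []
        for prompt in prompts:
            text = prompt[::-1]
            results.append(
                RolloutCompletion(
                    prompt_ids=[ord(c) for c in prompt],
                    completion_ids=[ord(c) for c in text],
                    logprobs=[0.0] * len(text),
                    text=text,
                )
            )
        return results


class DummyTrainer:
    """Hypothetical trainer exposing only the attribute the helper needs."""

    def __init__(self, backend):
        self.generation_backend = backend


def rollout_helper(trainer, prompts):
    # The helper dispatches through the backend and never inspects
    # vLLM-specific attributes on the trainer.
    return trainer.generation_backend.generate_rollout_completions(prompts)
```

Because the helper depends only on the protocol, any backend that provides a compatible generate_rollout_completions method can be substituted without changing the helper itself.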