Description
System Info
vllm 0.11.0
verl 0.6.1
torch 2.8.0
transformers 4.57.3
flash-attn 2.8.3
GPU: H100
Information
- The official example scripts
- My own modified scripts
Tasks
- An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
- My own task or dataset (give details below)
Reproduction
- When I deploy the model via vllm serve ${MODEL} and run inference through the openai-python API, the output is fine (see the baseline sketch below).
- But when I use verl with the vLLM backend, the output often contains weird text (i.e., a mix of various languages).
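For reference, the direct-serving path that works looks roughly like this. It is a minimal sketch, assuming the model is served with vllm serve ${MODEL} on the default port 8000; the model name, prompt, and sampling settings are placeholders rather than my exact setup.

from openai import OpenAI

# Point the client at the local vLLM OpenAI-compatible server.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

response = client.chat.completions.create(
    model="${MODEL}",  # same name/path passed to vllm serve
    messages=[{"role": "user", "content": "Hello!"}],
    temperature=0.7,
    max_tokens=256,
)
print(response.choices[0].message.content)  # output here is clean

The verl-side wiring that produces the garbled output is sketched below: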
# Wiring inside a custom agent loop built on verl's RayPPOTrainer:
rollout_manager = RayPPOTrainer.async_rollout_manager
self.rollout_manager: AgentLoopManager = rollout_manager
self.server_manager = AsyncLLMServerManager(config, server_handles=rollout_manager.server_handles)

# Generation call that produces the garbled output:
token_output: TokenOutput = await self.server_manager.generate(
    request_id=application_id, prompt_ids=request_prompt_ids,
    image_data=image_data, sampling_params=sampling_params)

async def wake_up(self):
    """Wake up all rollout replica instances asynchronously."""
    await asyncio.gather(*[replica.wake_up() for replica in self.rollout_manager.rollout_replicas])

async def sleep(self):
    """Sleep all rollout replica instances asynchronously."""
    await asyncio.gather(*[replica.sleep() for replica in self.rollout_manager.rollout_replicas])
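For completeness, here is roughly how the manager above is driven for a single rollout. This is a hedged sketch of the call site: the helper name, request id, prompt ids, and sampling_params values are placeholders, and the wake_up / generate / sleep ordering is illustrative rather than copied verbatim from my trainer code.

import uuid

async def run_single_rollout(self, request_prompt_ids, image_data=None):
    # Placeholder sampling settings; the real run takes these from the verl config.
    sampling_params = {"temperature": 1.0, "top_p": 1.0, "max_tokens": 1024}
    await self.wake_up()  # resume the vLLM rollout replicas before generating
    token_output = await self.server_manager.generate(
        request_id=str(uuid.uuid4()),
        prompt_ids=request_prompt_ids,
        image_data=image_data,
        sampling_params=sampling_params,
    )
    await self.sleep()  # offload the replicas again after generation
    return token_output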
Expected behavior
- The output generated through verl's vLLM backend should match what the same model produces when served directly via vllm serve ${MY_SFT_MODEL} and queried through the openai-python API.
- In particular, generation should not contain mixed-language or otherwise garbled text.