
AsyncLLMServerManager.generate() outputs weird text (i.e., a mix of various languages) #4717

@LianShuQuan

Description

System Info

vllm 0.11.0
verl 0.6.1
torch 2.8.0
transformers 4.57.3
flash-attn 2.8.3
GPU: H100

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
  • My own task or dataset (give details below)

Reproduction

  • When I deploy the model via vllm serve ${MODEL} and run inference through the openai-python API, the output is fine.
  • But when I use verl with the vllm backend, the output often contains weird text (i.e., a mix of various languages). The relevant verl code is below, followed by a sketch of the working openai-python baseline.
# Reuse the rollout manager created by the trainer and talk to its vLLM servers directly.
rollout_manager = RayPPOTrainer.async_rollout_manager
self.rollout_manager: AgentLoopManager = rollout_manager
self.server_manager = AsyncLLMServerManager(config, server_handles=rollout_manager.server_handles)

# Generate token ids for one request through the async server manager.
token_output: TokenOutput = await self.server_manager.generate(
    request_id=application_id,
    prompt_ids=request_prompt_ids,
    image_data=image_data,
    sampling_params=sampling_params,
)

    async def wake_up(self):
        """Wake up all rollout replica instances asynchronously."""
        await asyncio.gather(*[replica.wake_up() for replica in self.rollout_manager.rollout_replicas])

    async def sleep(self):
        """Sleep all rollout replica instances asynchronously."""
        await asyncio.gather(*[replica.sleep() for replica in self.rollout_manager.rollout_replicas])
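
For comparison, a minimal sketch of the working baseline (assuming the server was started with vllm serve ${MODEL} on the default port 8000; the prompt and sampling values below are placeholders, not taken from the issue):

# Working baseline: query the vLLM OpenAI-compatible server with the openai-python client.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

response = client.chat.completions.create(
    model="${MODEL}",  # same checkpoint that verl rolls out
    messages=[{"role": "user", "content": "Hello, who are you?"}],
    temperature=0.7,  # placeholder sampling value
)
print(response.choices[0].message.content)  # output here is coherent, single-language text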

Expected behavior

  • When I deploy the model via vllm serve ${MY_SFT_MODEL} and run inference through the openai-python API, the output is fine.
  • The output generated through verl with the vllm backend should be equally coherent, instead of often containing weird text (i.e., a mix of various languages).
