[Bug]: Async chunk requests can stay in WAITING_FOR_CHUNK indefinitely if upstream chunk never arrives #3833

### Your current environment

<details>
<summary>The output of <code>python collect_env.py</code></summary>

OS: Ubuntu 22.04.5 LTS (x86_64)
Python: 3.12.11
PyTorch: 2.11.0+cu130
CUDA available: True
GPU: NVIDIA GeForce RTX 4090

</details>

### Your code version

<details>
vLLM Version: 0.20.0
</details>


### Describe the bug

<img width="415" height="75" alt="Image" src="https://github.com/user-attachments/assets/42f609ed-cab9-4ac7-bc23-dd1692d94f5a" />

In async chunk mode, downstream requests can remain in `RequestStatus.WAITING_FOR_CHUNK` indefinitely if the expected upstream chunk is never produced or never becomes visible through the connector.

The relevant path is `OmniChunkTransferAdapter.process_pending_chunks()`:

- the downstream stage calls `load_async(request)`;
- the request status is changed to `WAITING_FOR_CHUNK`;
- the request is temporarily removed from the scheduler queue;
- `restore_queues()` puts it back after scheduling;
- if `_poll_single_request()` never observes the expected chunk, the request keeps cycling in `WAITING_FOR_CHUNK`.
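The cycle above can be illustrated with a toy model. This is only a sketch of the behavior as I understand it, not the vllm-omni code: `ToyAdapter`, `ToyRequest`, `Status`, and `run_steps` are hypothetical stand-ins, and the connector is modeled as a plain dict.

```python
from dataclasses import dataclass, field
from enum import Enum, auto


class Status(Enum):
    WAITING = auto()
    WAITING_FOR_CHUNK = auto()
    RUNNING = auto()


@dataclass
class ToyRequest:
    request_id: str
    status: Status = Status.WAITING


@dataclass
class ToyAdapter:
    """Hypothetical stand-in for the chunk transfer adapter (not the real class)."""

    connector: dict = field(default_factory=dict)  # request_id -> chunk payload
    parked: list = field(default_factory=list)     # requests hidden from the scheduler

    def load_async(self, request):
        # Mark the request as waiting for its upstream chunk.
        request.status = Status.WAITING_FOR_CHUNK

    def poll_single_request(self, request):
        # Returns None while the expected chunk is not visible.
        return self.connector.get(request.request_id)

    def process_pending_chunks(self, queue):
        for request in list(queue):
            if request.status is Status.WAITING:
                self.load_async(request)
            if request.status is Status.WAITING_FOR_CHUNK:
                if self.poll_single_request(request) is not None:
                    request.status = Status.RUNNING
                else:
                    # Temporarily remove the request from the scheduler queue.
                    queue.remove(request)
                    self.parked.append(request)

    def restore_queues(self, queue):
        # Put parked requests back after the scheduling step.
        queue.extend(self.parked)
        self.parked.clear()


def run_steps(adapter, queue, steps):
    for _ in range(steps):
        adapter.process_pending_chunks(queue)
        # (the scheduler would run here, without the parked requests)
        adapter.restore_queues(queue)
```

In this model, a request whose chunk never appears in `connector` is still in `WAITING_FOR_CHUNK` after any number of calls to `run_steps`, because nothing in the loop can move it to a terminal state.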

I could not find a timeout or cleanup path for this state. As a result, an upstream stage failure, a connector write failure, a dropped terminal chunk, or a custom payload extraction path that returns no payload can each turn into a hanging streaming request instead of a visible request failure.

Expected behavior:

If a request waits too long for an async chunk, vllm-omni should eventually fail the request and clean it up instead of leaving it in `WAITING_FOR_CHUNK` forever.
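One possible shape for such a timeout, as a minimal sketch only: `ChunkWaitTracker`, `timeout_s`, and its methods are hypothetical names I made up for illustration, not existing vllm-omni APIs. The idea is to record when a request starts waiting and to report the requests whose wait has exceeded a configurable limit so the caller can fail them.

```python
import time
from typing import Callable, Dict, List


class ChunkWaitTracker:
    """Hypothetical helper: tracks how long each request has waited for a chunk."""

    def __init__(self, timeout_s: float,
                 clock: Callable[[], float] = time.monotonic):
        self.timeout_s = timeout_s
        self._clock = clock  # injectable so the logic can be tested without sleeping
        self._wait_started: Dict[str, float] = {}

    def start_waiting(self, request_id: str) -> None:
        # Record the first time the request entered the waiting state;
        # repeated calls while it keeps waiting do not reset the timer.
        self._wait_started.setdefault(request_id, self._clock())

    def chunk_arrived(self, request_id: str) -> None:
        # The expected chunk was observed, so stop tracking this wait.
        self._wait_started.pop(request_id, None)

    def expired(self) -> List[str]:
        # Return and forget the requests that have waited at least timeout_s.
        now = self._clock()
        timed_out = [rid for rid, started in self._wait_started.items()
                     if now - started >= self.timeout_s]
        for rid in timed_out:
            del self._wait_started[rid]
        return timed_out
```

A polling loop could call `start_waiting()` when a request enters `WAITING_FOR_CHUNK`, `chunk_arrived()` when the chunk is observed, and `expired()` once per scheduling step to collect requests that should be failed and cleaned up. Using a monotonic clock avoids spurious timeouts when the wall clock is adjusted.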

### Before submitting a new issue...

- [x] Make sure you already searched for relevant issues, and asked the chatbot living at the bottom right corner of the [documentation page](https://vllm-omni.readthedocs.io), which can answer lots of frequently asked questions.
