[Feature]: Use fixed polling interval for vLLM instance readiness #455

### Feature Area

Other

### Problem Statement

While a vLLM instance starts up, the dual-pods controller polls `/is_sleeping` through the work queue's exponential backoff (5ms growing to 20s). Early retries are wastefully frequent; later ones add unnecessary latency, since a newly ready instance may go unnoticed for up to 20s. A fixed interval of about 5s would be more appropriate.

From https://github.com/llm-d-incubation/llm-d-fast-model-actuation/pull/443#issuecomment-4310725850 (point 4).

### Proposed Solution

TBD

### Alternatives Considered

_No response_

### Willingness to Contribute

Yes, I can submit a PR

### Additional Context

_No response_
