Add support for priority in vllm backend #88

TheCodeWrangler · 2025-04-24T02:29:41Z

Add Priority Request Support for vLLM Async Engine

Description

This PR adds support for priority-based request scheduling in the vLLM async engine. When the engine is configured with its scheduler policy set to priority, the .generate() method accepts a priority parameter, and requests with lower priority values are handled first. To expose this, the PR adds an optional priority input tensor (defaults to 0) whose value is passed through to .generate().
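For reviewers who want to see the ordering rule in isolation, here is a toy sketch of "lower value first". It is not vLLM's scheduler. The class name ToyPriorityQueue and the first-come-first-served tie-break on arrival order are assumptions made for illustration only.

```python
import heapq
import itertools
from typing import List, Tuple


class ToyPriorityQueue:
    """Toy stand-in for a priority scheduler (hypothetical, not vLLM code)."""

    def __init__(self) -> None:
        # Heap entries are (priority, arrival_index, request_id).
        self._heap: List[Tuple[int, int, str]] = []
        # Monotonic counter so equal priorities keep arrival order (assumed tie-break).
        self._arrival = itertools.count()

    def submit(self, request_id: str, priority: int = 0) -> None:
        # Priority defaults to 0, mirroring the default described in this PR.
        heapq.heappush(self._heap, (priority, next(self._arrival), request_id))

    def drain(self) -> List[str]:
        # Pop requests in scheduling order: smallest priority value first.
        order = []
        while self._heap:
            _, _, request_id = heapq.heappop(self._heap)
            order.append(request_id)
        return order


queue = ToyPriorityQueue()
queue.submit("background", priority=10)
queue.submit("default")
queue.submit("urgent", priority=-5)
queue.submit("default-2")
scheduled = queue.drain()
```

In this sketch the "urgent" request runs first even though it was submitted third, and the two requests that omit a priority keep their submission order.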

Motivation

In applications where multiple sources submit work to the vLLM backend with different priorities, it is desirable to have the most time-sensitive work performed first. This feature allows users to:

Prioritize critical requests over background tasks
Implement different service level agreements (SLAs) for different types of requests
Better manage system resources by processing high-priority requests first

Changes

Added an optional priority input tensor to the model configuration:

{
    "name": "priority",
    "data_type": "TYPE_INT32",
    "dims": [1],
    "optional": True
}

Modified the _generate method to handle the priority parameter:

if not priority:
    priority = 0
response_iterator = self._llm_engine.generate(
    prompt, sampling_params, request_id, lora_request=lora_request, priority=priority
)
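The defaulting step above can also be written as a small standalone helper, sketched here under assumptions. The name resolve_priority is hypothetical and not part of this PR, and the sketch assumes the optional input arrives either as None (tensor absent), a plain integer, or a single-element list such as [3] matching dims [1].

```python
from typing import Any, Optional


def resolve_priority(raw: Optional[Any]) -> int:
    """Return an int priority, falling back to 0 when the input is absent.

    Hypothetical helper for illustration; not the code in this PR.
    """
    if raw is None:
        # Optional input was not supplied: use the default priority.
        return 0
    # Unwrap single-element containers such as [3] or [[3]].
    while isinstance(raw, (list, tuple)):
        if len(raw) != 1:
            raise ValueError("priority must contain exactly one value")
        raw = raw[0]
    return int(raw)


default_priority = resolve_priority(None)
explicit_priority = resolve_priority([3])
```

Checking for None explicitly, rather than relying on truthiness, keeps "input absent" distinct from an explicit priority of 0, although both resolve to 0 here.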

Testing

Added unit tests for priority handling
Verified that requests with different priorities are processed in the correct order
Confirmed that default priority (0) works when priority is not specified

Documentation

Updated model configuration documentation
Added examples of priority usage

AlwaysLearningMore and others added 5 commits April 23, 2025 21:28

Add support for priority

1ef3eeb

Updated to set default value on priority

e38a818

Updated README.md

800a502

Ran pre-commit and black formatter

920c693

Updated to handle optional input

9324920
