Support for Data Parallelism in serving #340

@elevran

Description

Add support for data parallelism in serving, based on the assumption that vLLM can run multiple rank processes in the same Pod, each listening on a different port (see this doc for details).
This should support P/D (prefill/decode) disaggregation as well.
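
A minimal sketch of what this assumption could mean for endpoint selection: one Pod expands into one endpoint per data-parallel rank. It assumes, purely for illustration, that rank `i` listens on `base_port + i`; the actual port layout should be taken from the vLLM documentation. The names `rank_endpoints`, `base_port`, and `dp_size` are hypothetical and do not come from vLLM or this project.

```python
from typing import List, Tuple


def rank_endpoints(pod_ip: str, base_port: int, dp_size: int) -> List[Tuple[str, int]]:
    """Expand one Pod into one (ip, port) endpoint per data-parallel rank.

    Assumption (not confirmed by the issue): rank i listens on base_port + i.
    """
    if dp_size < 1:
        raise ValueError("dp_size must be at least 1")
    return [(pod_ip, base_port + rank) for rank in range(dp_size)]


# Hypothetical Pod with four data-parallel ranks starting at port 8000.
endpoints = rank_endpoints("10.0.0.5", 8000, 4)
```

With this shape, a scheduler could treat each `(ip, port)` pair as a separately schedulable target instead of treating the Pod as a single endpoint.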

Labels

triage/accepted: Indicates an issue or PR is ready to be actively worked on.

Status

Done
