[Performance]: vLLM with DP performing worse #30655

@akasshdeep

Name of failing test

examples/offline_inference/data_parallel.py

Basic information

  • Flaky test
  • Can reproduce locally
  • Caused by external libraries (e.g. bug in transformers)

🧪 Describe the failing test

I tested the data parallel (DP) feature on 4 x A100 GPUs. I observed that a single vLLM deployment with DP 4 and api-server-count = 4 performs worse than 4 separate vLLM instances with 1 GPU each.
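
The exact commands are not given above, so the following is only a sketch of the two setups being compared. It assumes the server was started with `vllm serve` and the `--data-parallel-size` and `--api-server-count` flags; the model name, the ports, and the helper names `dp_command` and `independent_commands` are placeholders for illustration, not taken from this report.

```python
import shlex

# Placeholder: the report does not say which model was used.
MODEL = "MODEL_NAME"


def dp_command(model: str, dp_size: int = 4, api_servers: int = 4,
               port: int = 8000) -> str:
    """Setup A: one vLLM deployment with dp_size data-parallel ranks."""
    args = [
        "vllm", "serve", model,
        "--data-parallel-size", str(dp_size),
        "--api-server-count", str(api_servers),
        "--port", str(port),
    ]
    return shlex.join(args)


def independent_commands(model: str, n_gpus: int = 4,
                         base_port: int = 8000) -> list[str]:
    """Setup B: n_gpus separate vLLM servers, each pinned to one GPU."""
    commands = []
    for gpu in range(n_gpus):
        args = ["vllm", "serve", model, "--port", str(base_port + gpu)]
        # Each instance sees exactly one GPU and listens on its own port.
        commands.append(f"CUDA_VISIBLE_DEVICES={gpu} " + shlex.join(args))
    return commands


if __name__ == "__main__":
    print("# Setup A: single deployment, DP=4")
    print(dp_command(MODEL))
    print("# Setup B: four independent instances")
    for command in independent_commands(MODEL):
        print(command)
```

In Setup B the client or an external load balancer has to spread requests across the four ports, so a fair comparison would presumably drive both setups with the same total request rate and concurrency.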

📝 History of failing test

NA

CC List.

No response

    Labels

    performance (Performance-related issues)
