Impact of flow controller algorithm on saturation based scaling #813

@asm582

When the EPP flow controller is enabled in simulated-mode experiments, WVA scale-up is delayed because WVA does not see the traffic queued in EPP. The number of dropped requests, as seen by the GuideLLM client, also increases. If this is a current limitation, it should be documented, backed by experiments on a real GPU cluster.
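To illustrate the suspected mechanism, here is a minimal toy sketch. It is not WVA or EPP code: the `Snapshot` fields, the `saturation` and `should_scale_up` helpers, the 0.8 threshold, and all numbers are hypothetical. It assumes the flow controller holds requests upstream, so a saturation signal computed only from the server-side queue stays low while requests pile up in EPP.

```python
from dataclasses import dataclass


@dataclass
class Snapshot:
    """Hypothetical per-interval load snapshot; field names are illustrative."""
    server_queue: int  # requests waiting inside the model servers
    epp_queue: int     # requests held back by the EPP flow controller
    capacity: int      # queued requests the current replicas can absorb


def saturation(snapshot: Snapshot, include_epp_queue: bool) -> float:
    """Return queued load as a fraction of capacity."""
    queued = snapshot.server_queue
    if include_epp_queue:
        # Count requests the flow controller is holding upstream as well.
        queued += snapshot.epp_queue
    return queued / snapshot.capacity


def should_scale_up(snapshot: Snapshot, include_epp_queue: bool,
                    threshold: float = 0.8) -> bool:
    """Scale up once saturation reaches the (assumed) threshold."""
    return saturation(snapshot, include_epp_queue) >= threshold


# Made-up numbers: the servers look lightly loaded, EPP holds a backlog.
snap = Snapshot(server_queue=4, epp_queue=20, capacity=10)
print(should_scale_up(snap, include_epp_queue=False))  # False
print(should_scale_up(snap, include_epp_queue=True))   # True
```

In this toy model the scale-up decision flips only when the upstream backlog is counted. Whether that matches what WVA actually does would need to be checked against the real metrics.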

Labels: needs-triage
