Define a standard approach for plugins to identify prefill and decode endpoints #2080

@ahg-g

What would you like to be added:

Define a standard approach for plugins to identify prefill and decode endpoints

Why is this needed:

Some plugins need to be prefill/decode (P/D) aware; one example is #2021.

One suggestion is to have SchedulingResult define optional PrefillProfileName and DecodeProfileName fields in addition to the PrimaryProfileName field we currently have. The argument for this is that P/D is a core feature of LLM serving, so naming these profiles explicitly is reasonable, especially when the fields are optional.
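To make the suggestion concrete, here is a minimal sketch of what it could look like. Only SchedulingResult, PrimaryProfileName, PrefillProfileName, and DecodeProfileName come from this issue; the ProfileResults map, the shape of ProfileRunResult, and the PrefillResult/DecodeResult helpers are assumptions made for illustration and are not the current API.

```go
package main

import "fmt"

// ProfileRunResult is a simplified, hypothetical stand-in for the output of
// running one scheduler profile.
type ProfileRunResult struct {
	TargetEndpoints []string
}

// SchedulingResult sketches the proposal: two optional profile names next to
// PrimaryProfileName. The ProfileResults map is an assumption of this sketch.
type SchedulingResult struct {
	ProfileResults     map[string]*ProfileRunResult
	PrimaryProfileName string
	PrefillProfileName string // optional; empty when no prefill profile ran
	DecodeProfileName  string // optional; empty when no decode profile ran
}

// lookup returns the result for the named profile. It reports false when the
// name is empty or the profile produced no result.
func (r *SchedulingResult) lookup(name string) (*ProfileRunResult, bool) {
	if name == "" {
		return nil, false
	}
	res, ok := r.ProfileResults[name]
	if !ok || res == nil {
		return nil, false
	}
	return res, true
}

// PrefillResult is a hypothetical helper a P/D-aware plugin could call.
func (r *SchedulingResult) PrefillResult() (*ProfileRunResult, bool) {
	return r.lookup(r.PrefillProfileName)
}

// DecodeResult is the decode counterpart of PrefillResult.
func (r *SchedulingResult) DecodeResult() (*ProfileRunResult, bool) {
	return r.lookup(r.DecodeProfileName)
}

func main() {
	// Profile and pod names below are made up for the example.
	result := &SchedulingResult{
		ProfileResults: map[string]*ProfileRunResult{
			"prefill": {TargetEndpoints: []string{"pod-a"}},
			"decode":  {TargetEndpoints: []string{"pod-b"}},
		},
		PrimaryProfileName: "decode",
		PrefillProfileName: "prefill",
		DecodeProfileName:  "decode",
	}

	if p, ok := result.PrefillResult(); ok {
		fmt.Println("prefill endpoints:", p.TargetEndpoints)
	}
	if d, ok := result.DecodeResult(); ok {
		fmt.Println("decode endpoints:", d.TargetEndpoints)
	}
}
```

With this shape, a plugin that is not P/D aware keeps reading PrimaryProfileName unchanged, while a P/D-aware plugin checks the boolean and falls back to the primary profile when the optional names are unset.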

Please suggest other options to address this requirement.
