Add a generic metrics scorer #2201

@liu-cong

Description

What would you like to be added:

A scorer that's configurable to score based on an arbitrary metric from the model server. To do this we need to configure: the metric spec (name, labels, type, etc.) and the scoring algo. To begin with, we will only support the "gauge" metric type and a "linear_normalization" algo, which produces a normalized [0-1] score based on either a lower_is_better or higher_is_better direction.

An example:

plugins:
- type: metric-scorer
  parameters:
    metric:
      name: "num_requests_running"
      type: "gauge"
      labels: # labels are a list of key-value pairs to match
        k1: v1
    algo:
      type: "linear_normalization"
      direction: "lower_is_better" # higher raw metric = lower final score
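To illustrate, the "linear_normalization" algo could look like the following Go sketch. The function name, signature, and the assumed [min, max] range used for normalization are hypothetical, not taken from the actual implementation; the issue only specifies the observable behavior (a [0-1] score whose direction flips with lower_is_better):

```go
package main

import "fmt"

// linearNormalize maps a raw gauge value into [0, 1] given an assumed
// [min, max] range, clamping out-of-range values. With lowerIsBetter,
// a higher raw metric yields a lower final score.
func linearNormalize(value, min, max float64, lowerIsBetter bool) float64 {
	if max <= min {
		return 0 // degenerate range; no meaningful score
	}
	norm := (value - min) / (max - min)
	// clamp to [0, 1]
	if norm < 0 {
		norm = 0
	} else if norm > 1 {
		norm = 1
	}
	if lowerIsBetter {
		return 1 - norm
	}
	return norm
}

func main() {
	// e.g. num_requests_running observed as 2, with an assumed max of 8
	fmt.Println(linearNormalize(2, 0, 8, true))  // lower_is_better -> 0.75
	fmt.Println(linearNormalize(2, 0, 8, false)) // higher_is_better -> 0.25
}
```

Note that a gauge has no inherent bounds, so some configured or observed [min, max] range would be needed for any linear normalization; how that range is obtained is left open here.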

Why is this needed:

While we implemented several in-tree scorers based on well-known metrics such as kv cache utilization, some use cases may desire a different metric. Having a generic metric scorer avoids the friction of writing a new scorer each time.

One potential use case: for latency-sensitive applications, the number of running requests may be a more useful signal than queue depth.

TODO: This feature would require the ability to configure the data layer to scrape arbitrary metrics. I will need to look into this further.

Labels: triage/accepted (ready to be actively worked on)