Skip to content

feature: Model onboarding procedure #1153

@rootfs

Description

@rootfs

Describe the feature

Right now, the router config uses model endpoint, and assigns signals, without a transparent and consistent way of evaluating the model or routing policy.

Why do you need this feature?

We need an model onboard procedure that does the following:

  • Evaluate the model through curated benchmark to infer the latency, accuracy, reasoning, and token consumption.
  • Update the config if necessary especially if the model's eval results are not covered by the (e.g. accuracy, latency) thresholds used by signals
  • Update routing models, e.g. the embedding based clustering.
  • Update the config and continuously monitor the new model's performance through online learning feature: enhance online learning with bidirectional signal extraction #1140

Additional context

No response

Metadata

Metadata

Assignees

Labels

Type

No type

Projects

Status

In progress

Relationships

None yet

Development

No branches or pull requests

Issue actions