updating criticality comment to link to discussion issue
kfswain committed Jan 23, 2025
1 parent 1c9786f commit bb516eb
Showing 1 changed file with 1 addition and 4 deletions.
5 changes: 1 addition & 4 deletions api/v1alpha1/inferencemodel_types.go
@@ -74,10 +74,7 @@ type InferenceModelSpec struct {
 ModelName string `json:"modelName"`

 // Criticality defines how important it is to serve the model compared to other models referencing the same pool.
-// Criticality impacts how traffic is handled in resource constrained situations. It handles this by
-// queuing or rejecting requests of lower criticality. InferenceModels of an equivalent Criticality will
-// fairly share resources over throughput of tokens. In the future, the metric used to calculate fairness,
-// and the proportionality of fairness will be configurable.
+// TODO: Update field upon resolution of: https://github.com/kubernetes-sigs/gateway-api-inference-extension/issues/213
 //
 // Default values for this field will not be set, to allow for future additions of new field that may 'one of' with this field.
 // Any implementations that may consume this field may treat an unset value as the 'Standard' range.
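
The last two comment lines say the field gets no default value and that a consumer may treat an unset value as the 'Standard' range. A minimal sketch of what that could look like on the consumer side follows. It assumes the field is an optional pointer; the `Criticality` type, its three constants, the trimmed-down struct, and the `effectiveCriticality` helper are all hypothetical stand-ins for illustration, not the definitions in `inferencemodel_types.go`.

```go
package main

import "fmt"

// Criticality is a hypothetical stand-in for the API's criticality type.
type Criticality string

// Hypothetical values; the real API may name or define these differently.
const (
	Critical  Criticality = "Critical"
	Standard  Criticality = "Standard"
	Sheddable Criticality = "Sheddable"
)

// InferenceModelSpec is a trimmed-down, hypothetical copy of the spec.
// Criticality is a pointer so that "unset" (nil) is distinguishable
// from any explicit value, since no default is applied.
type InferenceModelSpec struct {
	ModelName   string       `json:"modelName"`
	Criticality *Criticality `json:"criticality,omitempty"`
}

// effectiveCriticality is a hypothetical consumer-side helper: it
// returns the explicit value if one is set and falls back to Standard
// when the field was left unset.
func effectiveCriticality(spec InferenceModelSpec) Criticality {
	if spec.Criticality == nil {
		return Standard
	}
	return *spec.Criticality
}

func main() {
	unset := InferenceModelSpec{ModelName: "example-model"}

	c := Critical
	set := InferenceModelSpec{ModelName: "example-model", Criticality: &c}

	fmt.Println(effectiveCriticality(unset)) // prints "Standard"
	fmt.Println(effectiveCriticality(set))   // prints "Critical"
}
```

Keeping the fallback in the consumer rather than in an API-level default leaves the stored object untouched, which fits the stated goal of allowing a future field to 'one of' with this one.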