Improve derivation of the model tag #104

@adubovik

Currently the model tag is derived from the request.model field, which is not reliable, since the user can put anything in this field.

request_body = json.loads(request_body_str)
stream = request_body.get("stream", False)
model = request_body.get("model", deployment)

We should instead look at response.model for chat completion requests, because this field is populated by the model (or adapter) itself.
The model/adapter knows best which model was actually called.


Note also that the model field doesn't exist in the chat completion request according to the Azure OpenAI API:

https://github.com/Azure/azure-rest-api-specs/blob/main/specification/cognitiveservices/data-plane/AzureOpenAI/inference/preview/2024-12-01-preview/inference.yaml#L2188

At the same time, the model field is a required field in the chat completion response:

https://github.com/Azure/azure-rest-api-specs/blob/main/specification/cognitiveservices/data-plane/AzureOpenAI/inference/preview/2024-12-01-preview/inference.yaml#L4152


We should probably keep request.model as a fallback for a missing response.model, to cover the special case of the assistant deployment:

the assistant service receives the deployment id of the model it is going to call via the request.model field.
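
A rough sketch of the lookup order proposed here: response.model first, then request.model, then the deployment id. It assumes the parsed response body is available at the point where the tag is computed. The function name derive_model_tag, its parameters, and the sample values are hypothetical and not taken from the existing code:

```python
import json
from typing import Any, Dict, Optional


def derive_model_tag(
    request_body: Dict[str, Any],
    response_body: Optional[Dict[str, Any]],
    deployment: str,
) -> str:
    """Pick the model tag: response.model, then request.model, then deployment."""
    # Prefer the value reported by the model/adapter in the response.
    if response_body is not None:
        response_model = response_body.get("model")
        if isinstance(response_model, str) and response_model:
            return response_model

    # Fallback, e.g. for the assistant deployment, which receives the
    # deployment id of the model to call via request.model.
    request_model = request_body.get("model")
    if isinstance(request_model, str) and request_model:
        return request_model

    # Last resort: the deployment the request was routed to.
    return deployment


# Illustrative payloads only; the values are made up.
request_body = json.loads('{"model": "anything-from-user", "stream": false}')
response_body = json.loads('{"model": "model-from-response", "choices": []}')
print(derive_model_tag(request_body, response_body, "my-deployment"))
```

Empty or non-string values are treated as missing, so a blank response.model falls through to the next candidate instead of producing an empty tag.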

Whether it makes any sense to report the model field for non-model deployments (applications and assistant) in the first place remains unclear.
