Avoid using usage_per_model field #102

@adubovik

There are two reasons to avoid using the usage_per_model field to compute request prices, as the following loop currently does:

for usage in usage_per_model:
    point = await make_point(
        logger,
        deployment,
        usage["model"],
        project_id,
        None,
        None,
        user_hash,
        user_title,
        timestamp,
        request,
        response,
        type,
        usage,
        topic_model,
        rates_calculator,
        parent_deployment,
        trace,
        execution_path,
    )

Unreliability of usage_per_model

chat_completion_response.statistics.usage_per_model is expected to contain a list of all model usages that were initiated directly or indirectly by the given request.

Currently, populating this field is the responsibility of the application developer. This is not great: ideally DIAL Core would populate it, but that is how it works right now.

Therefore, it may not be provided at all, or it may give false information.

Double counting issue

The value of usage_per_model provides information about all transitive calls.
Therefore, nested calls may report the same usage in their usage_per_model fields. If these values are summed naively, without accounting for the overlap, the same tokens are counted more than once.

E.g., the tokens used by the following chain of calls:

app1 -> app2 -> app3 -> ... -> app(N-1) -> gpt-4

will be computed as app1.usage_per_model + app2.usage_per_model + ... + gpt-4.usage = N * gpt-4.usage

Whereas it should simply be gpt-4.usage.
We therefore end up overestimating both the tokens and the price by a factor of N.
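
The overcounting can be reproduced with a small simulation. This is only a sketch under assumptions: the `Usage` and `Call` classes, `build_chain`, and `naive_total` are hypothetical names made up for illustration and are not part of DIAL or the analytics code. It also assumes every application in the chain faithfully reports the transitive usage, which is the best case for usage_per_model.

```python
from dataclasses import dataclass, field


@dataclass
class Usage:
    # Hypothetical stand-in for one usage_per_model entry.
    model: str
    total_tokens: int


@dataclass
class Call:
    # Hypothetical stand-in for one logged call in the chain.
    deployment: str
    usage_per_model: list[Usage] = field(default_factory=list)


def build_chain(n_apps: int, model_tokens: int) -> list[Call]:
    """Build app1 -> ... -> app(n_apps) -> gpt-4.

    Every application reports the same transitive gpt-4 usage,
    and the model call itself reports its own usage once.
    """
    usage = Usage("gpt-4", model_tokens)
    calls = [Call(f"app{i}", [usage]) for i in range(1, n_apps + 1)]
    calls.append(Call("gpt-4", [usage]))
    return calls


def naive_total(calls: list[Call]) -> int:
    # Sums every usage entry of every call, as a per-call loop would.
    return sum(u.total_tokens for c in calls for u in c.usage_per_model)


# Three apps plus the model call gives N = 4 records in the chain.
chain = build_chain(n_apps=3, model_tokens=100)
print(naive_total(chain))  # 400, although only 100 tokens were consumed
```

With N records in the chain the naive sum is N times the real usage, which matches the formula above.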


The solution is to simply avoid using usage_per_model in analytics.

DIAL Core already supplies the correct price of the request in the price and deployment_price fields.
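
A minimal sketch of reading these fields instead of iterating over usage_per_model. Only the field names price and deployment_price come from this issue. The flat dictionary shape of `record`, the string encoding of the values, and the `read_price` helper are assumptions made for illustration; where the fields actually live depends on the log format.

```python
from decimal import Decimal
from typing import Any, Mapping, Optional


def read_price(record: Mapping[str, Any], key: str) -> Optional[Decimal]:
    """Return the price stored under `key`, or None if it is absent.

    Hypothetical helper: assumes the value is a number or a numeric string.
    """
    raw = record.get(key)
    if raw is None:
        return None
    # Go through str() so float inputs do not carry binary rounding noise.
    return Decimal(str(raw))


# Hypothetical record; usage_per_model is present but deliberately ignored.
record = {
    "price": "0.0123",
    "deployment_price": "0.0100",
    "usage_per_model": [{"model": "gpt-4", "total_tokens": 100}],
}

price = read_price(record, "price")
deployment_price = read_price(record, "deployment_price")
print(price, deployment_price)
```

Using Decimal rather than float keeps monetary values exact when they are later aggregated.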
