
How to hash-route requests across multiple AI service providers for the same model #3473

@WellMaxWang

If you are reporting any crash or any potential security issue, do not open an issue in this repo. Please report the issue via ASRC (Alibaba Security Response Center) where the issue will be triaged appropriately.

  • I have searched the issues of this repository and believe that this is not a duplicate.

Ⅰ. Issue Description

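For models that use the Responses protocol, each inference request carries the ID of the previous conversation turn, so requests are not fully stateless. If a Responses model is configured with multiple AI service providers or accounts, weighted random distribution can send the requests of a single session to different AI service providers. Does Higress support hash routing by consumer, so that all requests from one user are pinned to the same AI service provider?

In addition, the same problem arises in scenarios such as video generation.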

Ⅱ. Describe what happened

If there is an exception, please attach the exception trace:

Just paste your stack trace here!

Currently, AI route management can only distribute traffic across AI service providers by weight; hash routing is not possible.

Ⅲ. Describe what you expected to happen

I would like guidance on whether a consumer can be pinned to a single AI service provider.
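
To make the request concrete, here is a minimal sketch of the behavior being asked for. It does not use any Higress API or configuration. The function name `pick_provider`, the provider names, and the weights are all hypothetical, and the consumer is assumed to be identified by a plain string.

```python
import hashlib
from bisect import bisect_right
from itertools import accumulate


def pick_provider(consumer: str, providers: dict[str, int]) -> str:
    """Deterministically map a consumer to one provider, honoring weights.

    `providers` maps a (hypothetical) provider name to a non-negative integer
    weight; at least one weight must be positive.
    """
    names = sorted(providers)  # stable order, independent of dict insertion order
    bounds = list(accumulate(providers[n] for n in names))  # cumulative weights
    # Hash the consumer ID to a point in [0, total_weight).
    digest = hashlib.sha256(consumer.encode("utf-8")).digest()
    point = int.from_bytes(digest[:8], "big") % bounds[-1]
    # The first cumulative bound strictly greater than the point owns it.
    return names[bisect_right(bounds, point)]


# Hypothetical weights, mirroring a 50/30/20 proportional split.
providers = {"provider-a": 50, "provider-b": 30, "provider-c": 20}
# Every call for the same consumer returns the same provider.
print(pick_provider("consumer-1", providers))
```

Across many consumers the split still approximates the configured weights, while each consumer stays on one provider. One caveat of this simple modulo scheme: changing the provider list or the weights remaps some consumers, which a consistent-hashing scheme would reduce.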

Ⅳ. How to reproduce it (as minimally and precisely as possible)

  1. xxx
  2. xxx
  3. xxx

Ⅴ. Anything else we need to know?

It is recommended to provide Higress runtime logs and configurations for us to investigate your issue, especially for the controller and gateway components.

Please check out the following documents on how to obtain this data.

Ⅵ. Environment:

  • Higress version: 2.1.9
  • OS: Alibaba Cloud ACK
  • Others:

