feat(ai-proxy): support Amazon Bedrock #2039

Merged: 5 commits, Apr 22, 2025

@HecarimV HecarimV commented Apr 10, 2025

Ⅰ. Describe what this PR did

Add an Amazon Bedrock provider to the ai-proxy plugin.

Ⅱ. Does this pull request fix one issue?

fix #1695

Ⅳ. Describe how to verify it

docker-compose.yaml

version: '3.7'
services:
  envoy:
    image: higress-registry.cn-hangzhou.cr.aliyuncs.com/higress/gateway:v1.4.0-rc.1
    entrypoint: /usr/local/bin/envoy
    # Enable debug-level logging to make debugging easier
    command: -c /etc/envoy/envoy.yaml --component-log-level wasm:debug
    networks:
      - higress-net
    ports:
      - "10000:10000"
    volumes:
      - ./envoy.yaml:/etc/envoy/envoy.yaml
      - ./plugin.wasm:/etc/envoy/plugin.wasm
networks:
  higress-net: {}

envoy.yaml

# File generated by hgctl. Modify as required.

admin:
  address:
    socket_address:
      protocol: TCP
      address: 0.0.0.0
      port_value: 9901
static_resources:
  listeners:
    - name: listener_0
      address:
        socket_address:
          protocol: TCP
          address: 0.0.0.0
          port_value: 10000
      filter_chains:
        - filters:
            - name: envoy.filters.network.http_connection_manager
              typed_config:
                "@type": type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager
                scheme_header_transformation:
                  scheme_to_overwrite: https
                stat_prefix: ingress_http
                # Output envoy logs to stdout
                access_log:
                  - name: envoy.access_loggers.stdout
                    typed_config:
                      "@type": type.googleapis.com/envoy.extensions.access_loggers.stream.v3.StdoutAccessLog
                # Modify as required
                route_config:
                  name: local_route
                  virtual_hosts:
                    - name: local_service
                      domains: [ "*" ]
                      routes:
                        - match:
                            prefix: "/"
                          route:
                            cluster: aws
                            timeout: 300s
                http_filters:
                  - name: wasmtest
                    typed_config:
                      "@type": type.googleapis.com/udpa.type.v1.TypedStruct
                      type_url: type.googleapis.com/envoy.extensions.filters.http.wasm.v3.Wasm
                      value:
                        config:
                          name: wasmtest
                          vm_config:
                            runtime: envoy.wasm.runtime.v8
                            code:
                              local:
                                filename: /etc/envoy/plugin.wasm
                          configuration:
                            "@type": "type.googleapis.com/google.protobuf.StringValue"
                            value: |
                              {
                                "provider": {
                                  "type": "bedrock",
                                  "apiTokens": [
                                    "your-api-token"
                                  ],
                                  "awsAccessKey": "your-access-key",
                                  "awsRegion": "your-aws-region, e.g. us-west-2",
                                  "awsSecretKey": "your-secret-key"
                                }
                              }
                  - name: envoy.filters.http.router
  clusters:
    - name: aws
      connect_timeout: 30s
      type: LOGICAL_DNS
      dns_lookup_family: V4_ONLY
      lb_policy: ROUND_ROBIN
      load_assignment:
        cluster_name: aws
        endpoints:
          - lb_endpoints:
              - endpoint:
                  address:
                    socket_address:
                      address: bedrock-runtime.us-west-2.amazonaws.com
                      port_value: 443
      transport_socket:
        name: envoy.transport_sockets.tls
        typed_config:
          "@type": type.googleapis.com/envoy.extensions.transport_sockets.tls.v3.UpstreamTlsContext
          "sni": "bedrock-runtime.us-west-2.amazonaws.com"
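
The plugin configuration and the `aws` cluster both name a region, and the two need to agree. A minimal Go sketch of that check, assuming the field names from the JSON above and assuming the runtime host follows the `bedrock-runtime.<region>.amazonaws.com` pattern used in this cluster definition. `providerConfig`, `parseProvider`, and `bedrockHost` are hypothetical names for illustration, not the plugin's actual types, and treating the three AWS fields as required is this sketch's own choice:

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// providerConfig mirrors the "provider" object in the plugin configuration.
// Hypothetical type for illustration; the plugin's own struct may differ.
type providerConfig struct {
	Type         string   `json:"type"`
	APITokens    []string `json:"apiTokens"`
	AWSAccessKey string   `json:"awsAccessKey"`
	AWSSecretKey string   `json:"awsSecretKey"`
	AWSRegion    string   `json:"awsRegion"`
}

// parseProvider decodes the configuration and rejects missing AWS fields
// (an assumption of this sketch, not a documented plugin rule).
func parseProvider(raw []byte) (providerConfig, error) {
	var wrapper struct {
		Provider providerConfig `json:"provider"`
	}
	if err := json.Unmarshal(raw, &wrapper); err != nil {
		return providerConfig{}, err
	}
	p := wrapper.Provider
	if p.AWSAccessKey == "" || p.AWSSecretKey == "" || p.AWSRegion == "" {
		return providerConfig{}, errors.New("awsAccessKey, awsSecretKey and awsRegion are required")
	}
	return p, nil
}

// bedrockHost builds the runtime endpoint host for a region (assumed pattern).
func bedrockHost(region string) string {
	return "bedrock-runtime." + region + ".amazonaws.com"
}

func main() {
	raw := []byte(`{"provider":{"type":"bedrock","awsAccessKey":"ak","awsSecretKey":"sk","awsRegion":"us-west-2"}}`)
	p, err := parseProvider(raw)
	if err != nil {
		panic(err)
	}
	// The printed host should match the cluster address and SNI in envoy.yaml.
	fmt.Println(bedrockHost(p.AWSRegion))
}
```

Note that a missing comma between the `aws*` fields makes the whole configuration invalid JSON, so decoding it up front like this surfaces the problem before any request is sent.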

Test a non-streaming request:

curl -X POST 'http://localhost:10000/v1/chat/completions' \
  -H 'Content-Type: application/json' \
  -d '{
    "model": "meta.llama3-1-70b-instruct-v1:0",
    "messages": [
        {
            "role": "user",
            "content": "你好,你是谁?"
        }
    ],
    "stream": false
}'

Non-streaming response:

{
    "choices": [
        {
            "finish_reason": "stop",
            "index": 0,
            "message": {
                "content": "\n\n😊 Ni Hao! I'm LLaMA, an AI assistant developed by Meta AI that can understand and respond to human input in a conversational manner. I'm not a human, but a computer program designed to simulate conversation and answer questions to the best of my knowledge. I'm here to help you with any questions or topics you'd like to discuss! 💬",
                "role": "assistant"
            }
        }
    ],
    "created": 1744605371,
    "id": "465b3c10-a210-4d39-b7e1-9acc0b990604",
    "model": "meta.llama3-8b-instruct-v1:0",
    "object": "chat.completion",
    "usage": {
        "completion_tokens": 77,
        "prompt_tokens": 20,
        "total_tokens": 97
    }
}
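
To verify a response like this programmatically rather than by eye, a minimal Go sketch can decode it and check the parts that matter. The struct covers only the fields visible in the response above; `chatCompletion` and `checkCompletion` are hypothetical names introduced here, not part of the plugin:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// chatCompletion covers only the fields shown in the response above.
type chatCompletion struct {
	Choices []struct {
		FinishReason string `json:"finish_reason"`
		Message      struct {
			Role    string `json:"role"`
			Content string `json:"content"`
		} `json:"message"`
	} `json:"choices"`
	Model string `json:"model"`
	Usage struct {
		PromptTokens     int `json:"prompt_tokens"`
		CompletionTokens int `json:"completion_tokens"`
		TotalTokens      int `json:"total_tokens"`
	} `json:"usage"`
}

// checkCompletion decodes a response body and verifies that it carries an
// assistant message and that the token counts add up.
func checkCompletion(body []byte) error {
	var c chatCompletion
	if err := json.Unmarshal(body, &c); err != nil {
		return err
	}
	if len(c.Choices) == 0 || c.Choices[0].Message.Role != "assistant" {
		return fmt.Errorf("no assistant message in response")
	}
	if c.Usage.PromptTokens+c.Usage.CompletionTokens != c.Usage.TotalTokens {
		return fmt.Errorf("usage does not add up: %+v", c.Usage)
	}
	return nil
}

func main() {
	// Trimmed copy of the response above (20 + 77 = 97 tokens).
	body := []byte(`{"choices":[{"finish_reason":"stop","index":0,"message":{"content":"hi","role":"assistant"}}],"model":"meta.llama3-8b-instruct-v1:0","usage":{"completion_tokens":77,"prompt_tokens":20,"total_tokens":97}}`)
	fmt.Println(checkCompletion(body))
}
```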

Test a streaming request:

curl -X POST 'http://localhost:10000/v1/chat/completions' \
  -H 'Content-Type: application/json' \
  -d '{
    "model": "meta.llama3-1-70b-instruct-v1:0",
    "messages": [
        {
            "role": "user",
            "content": "你好,你是谁?"
        }
    ],
    "stream": true
}'

Streaming response:

data: {"id":"cc212aec-5682-4bc5-a1f3-c0e164e2d8b4","choices":[{"index":0,"delta":{"role":"assistant"}}],"created":1744605647,"model":"meta.llama3-8b-instruct-v1:0","object":"chat.completion","usage":{}}

data: {"id":"cc212aec-5682-4bc5-a1f3-c0e164e2d8b4","choices":[{"index":0,"delta":{"content":"\n\n"}}],"created":1744605647,"model":"meta.llama3-8b-instruct-v1:0","object":"chat.completion","usage":{}}

data: {"id":"cc212aec-5682-4bc5-a1f3-c0e164e2d8b4","choices":[{"index":0,"delta":{"content":""}}],"created":1744605647,"model":"meta.llama3-8b-instruct-v1:0","object":"chat.completion","usage":{}}

data: {"id":"cc212aec-5682-4bc5-a1f3-c0e164e2d8b4","choices":[{"index":0,"delta":{"content":""}}],"created":1744605647,"model":"meta.llama3-8b-instruct-v1:0","object":"chat.completion","usage":{}}

data: {"id":"cc212aec-5682-4bc5-a1f3-c0e164e2d8b4","choices":[{"index":0,"delta":{"content":""}}],"created":1744605647,"model":"meta.llama3-8b-instruct-v1:0","object":"chat.completion","usage":{}}

data: {"id":"cc212aec-5682-4bc5-a1f3-c0e164e2d8b4","choices":[{"index":0,"delta":{"content":""}}],"created":1744605647,"model":"meta.llama3-8b-instruct-v1:0","object":"chat.completion","usage":{}}

data: {"id":"cc212aec-5682-4bc5-a1f3-c0e164e2d8b4","choices":[{"index":0,"delta":{"content":""}}],"created":1744605647,"model":"meta.llama3-8b-instruct-v1:0","object":"chat.completion","usage":{}}

data: {"id":"cc212aec-5682-4bc5-a1f3-c0e164e2d8b4","choices":[{"index":0,"delta":{"content":" L"}}],"created":1744605647,"model":"meta.llama3-8b-instruct-v1:0","object":"chat.completion","usage":{}}

data: {"id":"cc212aec-5682-4bc5-a1f3-c0e164e2d8b4","choices":[{"index":0,"delta":{"content":"La"}}],"created":1744605647,"model":"meta.llama3-8b-instruct-v1:0","object":"chat.completion","usage":{}}

data: {"id":"cc212aec-5682-4bc5-a1f3-c0e164e2d8b4","choices":[{"index":0,"delta":{"content":"MA"}}],"created":1744605647,"model":"meta.llama3-8b-instruct-v1:0","object":"chat.completion","usage":{}}

data: {"id":"cc212aec-5682-4bc5-a1f3-c0e164e2d8b4","choices":[{"index":0,"delta":{"content":""}}],"created":1744605647,"model":"meta.llama3-8b-instruct-v1:0","object":"chat.completion","usage":{}}

data: {"id":"cc212aec-5682-4bc5-a1f3-c0e164e2d8b4","choices":[{"index":0,"delta":{"content":"一个"}}],"created":1744605647,"model":"meta.llama3-8b-instruct-v1:0","object":"chat.completion","usage":{}}

data: {"id":"cc212aec-5682-4bc5-a1f3-c0e164e2d8b4","choices":[{"index":0,"delta":{"content":""}}],"created":1744605647,"model":"meta.llama3-8b-instruct-v1:0","object":"chat.completion","usage":{}}

data: {"id":"cc212aec-5682-4bc5-a1f3-c0e164e2d8b4","choices":[{"index":0,"delta":{"content":""}}],"created":1744605647,"model":"meta.llama3-8b-instruct-v1:0","object":"chat.completion","usage":{}}

data: {"id":"cc212aec-5682-4bc5-a1f3-c0e164e2d8b4","choices":[{"index":0,"delta":{"content":"智能"}}],"created":1744605647,"model":"meta.llama3-8b-instruct-v1:0","object":"chat.completion","usage":{}}

data: {"id":"cc212aec-5682-4bc5-a1f3-c0e164e2d8b4","choices":[{"index":0,"delta":{"content":"语言"}}],"created":1744605647,"model":"meta.llama3-8b-instruct-v1:0","object":"chat.completion","usage":{}}

data: {"id":"cc212aec-5682-4bc5-a1f3-c0e164e2d8b4","choices":[{"index":0,"delta":{"content":"模型"}}],"created":1744605647,"model":"meta.llama3-8b-instruct-v1:0","object":"chat.completion","usage":{}}

data: {"id":"cc212aec-5682-4bc5-a1f3-c0e164e2d8b4","choices":[{"index":0,"delta":{"content":""}}],"created":1744605647,"model":"meta.llama3-8b-instruct-v1:0","object":"chat.completion","usage":{}}

data: {"id":"cc212aec-5682-4bc5-a1f3-c0e164e2d8b4","choices":[{"index":0,"delta":{"content":""}}],"created":1744605647,"model":"meta.llama3-8b-instruct-v1:0","object":"chat.completion","usage":{}}

data: {"id":"cc212aec-5682-4bc5-a1f3-c0e164e2d8b4","choices":[{"index":0,"delta":{}}],"created":1744605647,"model":"meta.llama3-8b-instruct-v1:0","object":"chat.completion","usage":{}}

data: {"id":"cc212aec-5682-4bc5-a1f3-c0e164e2d8b4","choices":[{"index":0,"delta":{},"finish_reason":"stop"}],"created":1744605647,"model":"meta.llama3-8b-instruct-v1:0","object":"chat.completion","usage":{}}

data: {"id":"cc212aec-5682-4bc5-a1f3-c0e164e2d8b4","choices":[],"created":1744605647,"model":"meta.llama3-8b-instruct-v1:0","object":"chat.completion","usage":{"prompt_tokens":28,"completion_tokens":18,"total_tokens":46}}
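
Many of the chunks above carry an empty `content`, so the full reply is easier to check after joining the deltas. A minimal Go sketch of that, assuming the `data:` line layout shown above; `streamChunk` and `collectStream` are hypothetical names, and the `[DONE]` check is an assumption for OpenAI-style terminators (no such line appears in this output):

```go
package main

import (
	"bufio"
	"encoding/json"
	"fmt"
	"strings"
)

// streamChunk covers only the fields used from each "data:" line.
type streamChunk struct {
	Choices []struct {
		Delta struct {
			Content string `json:"content"`
		} `json:"delta"`
		FinishReason string `json:"finish_reason"`
	} `json:"choices"`
	Usage struct {
		TotalTokens int `json:"total_tokens"`
	} `json:"usage"`
}

// collectStream joins the delta contents of an SSE body and returns the text,
// the finish reason, and the total token count from the final usage chunk.
func collectStream(body string) (text, finish string, total int, err error) {
	var sb strings.Builder
	sc := bufio.NewScanner(strings.NewReader(body))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		if !strings.HasPrefix(line, "data:") {
			continue // skip the blank separator lines between events
		}
		payload := strings.TrimSpace(strings.TrimPrefix(line, "data:"))
		if payload == "[DONE]" {
			break // assumed terminator; not present in the output above
		}
		var c streamChunk
		if err = json.Unmarshal([]byte(payload), &c); err != nil {
			return "", "", 0, err
		}
		for _, ch := range c.Choices {
			sb.WriteString(ch.Delta.Content)
			if ch.FinishReason != "" {
				finish = ch.FinishReason
			}
		}
		if c.Usage.TotalTokens > 0 {
			total = c.Usage.TotalTokens
		}
	}
	return sb.String(), finish, total, sc.Err()
}

func main() {
	body := `data: {"choices":[{"index":0,"delta":{"content":" L"}}],"usage":{}}

data: {"choices":[{"index":0,"delta":{"content":"La"}}],"usage":{}}

data: {"choices":[{"index":0,"delta":{},"finish_reason":"stop"}],"usage":{}}

data: {"choices":[],"usage":{"prompt_tokens":28,"completion_tokens":18,"total_tokens":46}}
`
	fmt.Println(collectStream(body))
}
```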

@johnlanni
Collaborator

cc @cr7258 @CH3CHO

@HecarimV HecarimV force-pushed the support-bedrock-04101137 branch from ca13e9c to f89fa7a Compare April 14, 2025 05:59
contextCache *contextCache
}

func (b *bedrockProvider) OnStreamingResponseBody(ctx wrapper.HttpContext, name ApiName, chunk []byte, isLastChunk bool) ([]byte, error) {
Collaborator

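Could you look into implementing the StreamingEventHandler interface here instead? That way the extractAmazonEventStreamEvents method would no longer be needed, and the SSE message logic would be handled by the shared ExtractStreamingEvents method.

Reference PR: #1800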

Contributor Author

Bedrock's response is not text/event-stream. It is
content-type:application/vnd.amazon.eventstream
so it does not seem possible to convert it into

type StreamEvent struct {
	Id         string `json:"id"`
	Event      string `json:"event"`
	Data       string `json:"data"`
	HttpStatus string `json:"http_status"`
}

Contributor Author

Could you look into changing this to implement StreamingEventHandler

The Bedrock event stream has roughly this structure:
image
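
For context, `application/vnd.amazon.eventstream` is a length-prefixed binary framing rather than line-oriented text, which is why an SSE line splitter cannot be reused directly. A minimal Go sketch of splitting such frames, assuming the publicly documented layout: 4-byte total length, 4-byte headers length, 4-byte prelude CRC, headers, payload, 4-byte message CRC, all big-endian with CRC-32 (IEEE). `splitFrames` and `encodeFrame` are hypothetical helpers written for this illustration; they are not the PR's `extractAmazonEventStreamEvents`, and header fields are left undecoded:

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
	"hash/crc32"
)

// frame is one decoded message: raw header bytes plus the payload.
type frame struct {
	Headers []byte
	Payload []byte
}

const preludeLen, crcLen = 12, 4

// splitFrames decodes every complete message in buf and returns the
// unconsumed tail so a caller can prepend it to the next chunk.
func splitFrames(buf []byte) (frames []frame, rest []byte, err error) {
	for len(buf) >= preludeLen {
		total := int(binary.BigEndian.Uint32(buf[0:4]))
		hdrLen := int(binary.BigEndian.Uint32(buf[4:8]))
		if crc32.ChecksumIEEE(buf[0:8]) != binary.BigEndian.Uint32(buf[8:12]) {
			return nil, nil, errors.New("prelude CRC mismatch")
		}
		if total < preludeLen+hdrLen+crcLen {
			return nil, nil, errors.New("invalid lengths")
		}
		if len(buf) < total {
			break // the message continues in the next chunk
		}
		body := buf[:total-crcLen]
		if crc32.ChecksumIEEE(body) != binary.BigEndian.Uint32(buf[total-crcLen:total]) {
			return nil, nil, errors.New("message CRC mismatch")
		}
		frames = append(frames, frame{
			Headers: body[preludeLen : preludeLen+hdrLen],
			Payload: body[preludeLen+hdrLen:],
		})
		buf = buf[total:]
	}
	return frames, buf, nil
}

// encodeFrame builds a message in the same layout. It exists only to
// produce input for the example.
func encodeFrame(headers, payload []byte) []byte {
	total := preludeLen + len(headers) + len(payload) + crcLen
	out := make([]byte, 0, total)
	out = binary.BigEndian.AppendUint32(out, uint32(total))
	out = binary.BigEndian.AppendUint32(out, uint32(len(headers)))
	out = binary.BigEndian.AppendUint32(out, crc32.ChecksumIEEE(out))
	out = append(out, headers...)
	out = append(out, payload...)
	return binary.BigEndian.AppendUint32(out, crc32.ChecksumIEEE(out))
}

func main() {
	// One full message followed by the first 5 bytes of another.
	msg := encodeFrame(nil, []byte(`{"hello":"world"}`))
	frames, rest, err := splitFrames(append(msg, msg[:5]...))
	fmt.Println(len(frames), len(rest), err)
}
```

Returning the unconsumed tail matters in a streaming callback, because a chunk boundary can fall in the middle of a message.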

Collaborator

@CH3CHO CH3CHO left a comment

LGTM

@codecov-commenter

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 43.66%. Comparing base (ef31e09) to head (bf4301b).
Report is 461 commits behind head on main.

@@            Coverage Diff             @@
##             main    #2039      +/-   ##
==========================================
+ Coverage   35.91%   43.66%   +7.75%     
==========================================
  Files          69       79      +10     
  Lines       11576    12753    +1177     
==========================================
+ Hits         4157     5569    +1412     
+ Misses       7104     6836     -268     
- Partials      315      348      +33     

see 75 files with indirect coverage changes

@CH3CHO CH3CHO merged commit a57173c into alibaba:main Apr 22, 2025
12 checks passed
VinceCui pushed a commit to VinceCui/higress that referenced this pull request May 21, 2025
Successfully merging this pull request may close these issues.

AI Proxy's provider support AWS Bedrock
5 participants