15 commits
3 changes: 3 additions & 0 deletions README.md
@@ -155,6 +155,9 @@ Now, you can try out the proxy APIs. Let's say you want to test Claude 3 Sonnet
```bash
export OPENAI_API_KEY=<API key>
export OPENAI_BASE_URL=<API base url>
# Optional: use VPC interface endpoints / custom Bedrock endpoints
# export BEDROCK_URL=https://vpce-xxxxxxxx.bedrock.<region>.vpce.amazonaws.com
# export BEDROCK_RUNTIME_URL=https://vpce-xxxxxxxx.bedrock-runtime.<region>.vpce.amazonaws.com
# For older versions
# https://github.com/openai/openai-python/issues/624
export OPENAI_API_BASE=<API base url>
10 changes: 10 additions & 0 deletions deployment/BedrockProxy.template
@@ -18,6 +18,14 @@ Parameters:
- "true"
- "false"
Description: Enable prompt caching for supported models (Claude, Nova). When enabled, adds cachePoint to system prompts and messages for cost savings.
BedrockUrl:
Type: String
Default: ""
Description: Optional custom endpoint URL for Bedrock control plane (e.g., VPC endpoint URL)
BedrockRuntimeUrl:
Type: String
Default: ""
Description: Optional custom endpoint URL for Bedrock runtime plane (e.g., VPC endpoint URL)
Resources:
# IAM Role for Lambda
ProxyApiHandlerServiceRole:
@@ -85,6 +93,8 @@ Resources:
ENABLE_CROSS_REGION_INFERENCE: "true"
ENABLE_APPLICATION_INFERENCE_PROFILES: "true"
ENABLE_PROMPT_CACHING: !Ref EnablePromptCaching
BEDROCK_URL: !Ref BedrockUrl
BEDROCK_RUNTIME_URL: !Ref BedrockRuntimeUrl
API_ROUTE_PREFIX: /v1
MemorySize: 1024
PackageType: Image
15 changes: 14 additions & 1 deletion deployment/BedrockProxyFargate.template
@@ -18,6 +18,14 @@ Parameters:
- "true"
- "false"
Description: Enable prompt caching for supported models (Claude, Nova). When enabled, adds cachePoint to system prompts and messages for cost savings.
BedrockUrl:
Type: String
Default: ""
Description: Optional custom endpoint URL for Bedrock control plane (e.g., VPC endpoint URL)
BedrockRuntimeUrl:
Type: String
Default: ""
Description: Optional custom endpoint URL for Bedrock runtime plane (e.g., VPC endpoint URL)
Resources:
VPCB9E5F0B4:
Type: AWS::EC2::VPC
@@ -261,6 +269,12 @@ Resources:
- Name: ENABLE_PROMPT_CACHING
Value:
Ref: EnablePromptCaching
- Name: BEDROCK_URL
Value:
Ref: BedrockUrl
- Name: BEDROCK_RUNTIME_URL
Value:
Ref: BedrockRuntimeUrl
Essential: true
Image:
Ref: ContainerImageUri
@@ -450,4 +464,3 @@ Outputs:
- ProxyALB87756780
- DNSName
- /api/v1
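
Both templates default the new parameters to an empty string and pass them through as `BEDROCK_URL` and `BEDROCK_RUNTIME_URL`, so the application has to treat a blank value as "use the default endpoint". A minimal sketch of that handling follows. The helper `client_kwargs` is hypothetical and not taken from this PR; `region_name` and `endpoint_url` are standard keyword arguments of `boto3.client`.

```python
from typing import Any, Dict, Optional


def client_kwargs(region: str, endpoint_url: Optional[str]) -> Dict[str, Any]:
    """Build keyword arguments for a boto3 client (hypothetical helper).

    An empty or missing endpoint_url is omitted, so boto3 falls back to
    the default regional endpoint.
    """
    kwargs: Dict[str, Any] = {"region_name": region}
    if endpoint_url:  # skips both None and ""
        kwargs["endpoint_url"] = endpoint_url
    return kwargs


# Intended use (requires boto3, so it is left as a comment here):
#   boto3.client("bedrock", **client_kwargs(region, bedrock_url))
#   boto3.client("bedrock-runtime", **client_kwargs(region, bedrock_runtime_url))
```

Omitting the key, rather than passing `endpoint_url=""`, keeps the blank template default from overriding the regional endpoint.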

2 changes: 2 additions & 0 deletions docker-compose.yml
@@ -9,6 +9,8 @@ services:
- "127.0.0.1:8000:8080"
environment:
- ENABLE_PROMPT_CACHING=true
- BEDROCK_URL=${BEDROCK_URL:-}
- BEDROCK_RUNTIME_URL=${BEDROCK_RUNTIME_URL:-}
Comment on lines +12 to +13
Member

Minor: `${BEDROCK_URL:-}` expands to an empty string when the host variable is unset, so the variable is set to `""` inside the container rather than left unset.

This works today because `_env_or_none()` in `setting.py` converts `""` to `None`. But it's a subtle coupling: if `_env_or_none()` were ever changed, the docker-compose default would break.

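Consider declaring the variables by key only (`- BEDROCK_URL`, the same form already used for `AWS_PROFILE` below) so the env var remains unset inside the container when it is not defined on the host. Switching to `${BEDROCK_URL-}` (without `:`) would not be enough: when the host variable is unset it still interpolates to an empty string, and the resulting `BEDROCK_URL=` entry sets an empty value. The key-only form removes the dependency on `_env_or_none()` for correctness.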
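
For readers without `setting.py` at hand, the coupling described in this comment can be illustrated with a minimal sketch. The name `env_or_none_sketch` is hypothetical and the real `_env_or_none()` may differ; the sketch assumes only the behaviour stated above, namely that an empty string is treated the same as an unset variable.

```python
import os
from typing import Optional


def env_or_none_sketch(name: str) -> Optional[str]:
    """Return the variable's value, or None when it is unset or empty."""
    value = os.environ.get(name)
    # `value or None` maps both None (unset) and "" (set but empty) to None.
    return value or None


# Simulate what the container sees when compose emits `DEMO_URL=`.
os.environ["DEMO_URL"] = ""
print(env_or_none_sketch("DEMO_URL"))
```

If the helper instead returned `os.environ.get(name)` directly, the empty string would reach the caller, which is the breakage the comment warns about.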

- API_KEY=${OPENAI_API_KEY}
- AWS_PROFILE
- AWS_ACCESS_KEY_ID
22 changes: 22 additions & 0 deletions docs/Usage.md
@@ -7,6 +7,9 @@ Assuming you have set up below environment variables after deployed:
```bash
export OPENAI_API_KEY=<API key>
export OPENAI_BASE_URL=<API base url>
# Optional: use VPC interface endpoints / custom Bedrock endpoints
# export BEDROCK_URL=https://vpce-xxxxxxxx.bedrock.<region>.vpce.amazonaws.com
# export BEDROCK_RUNTIME_URL=https://vpce-xxxxxxxx.bedrock-runtime.<region>.vpce.amazonaws.com
```

**API Example:**
@@ -23,6 +26,25 @@ You can use this API to get a list of supported model IDs.

Also, you can use this API to refresh the model list if new models are added to Amazon Bedrock.

You can optionally restrict which models are exposed by `/models` and accepted by chat requests using a whitelist JSON config:

```bash
export MODEL_WHITELIST_FILE=/app/config/model-whitelist.json
# or inline JSON
# export MODEL_WHITELIST_JSON='{"families":["anthropic.claude","amazon.nova"],"profile_regions":["us","global"],"model_ids":["arn:aws:bedrock:us-west-2:123456789012:application-inference-profile/my-profile"]}'
```

Example `model-whitelist.json`:

```json
{
"families": ["anthropic.claude", "amazon.nova"],
"profile_regions": ["us", "global"],
"model_ids": [
"arn:aws:bedrock:us-west-2:123456789012:application-inference-profile/my-profile"
]
}
```
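
The matching rules behind the three keys are not spelled out in this diff. As a rough illustration only, the sketch below assumes that `families` are prefixes of the model ID, that `profile_regions` are the leading segment of a cross-region inference profile ID (for example `us.` in front of the model ID), and that `model_ids` are exact matches. The function names `load_whitelist` and `is_allowed` are hypothetical, and the proxy's actual logic may differ.

```python
import json
from typing import Any, Dict


def load_whitelist(raw: str) -> Dict[str, Any]:
    """Parse whitelist JSON into lookup-friendly containers (assumed schema)."""
    cfg = json.loads(raw)
    return {
        "families": tuple(cfg.get("families", [])),
        "profile_regions": set(cfg.get("profile_regions", [])),
        "model_ids": set(cfg.get("model_ids", [])),
    }


def is_allowed(model_id: str, wl: Dict[str, Any]) -> bool:
    """Apply the assumed rules: exact ID, then region prefix plus family."""
    if model_id in wl["model_ids"]:
        return True
    # Treat the text before the first "." as a possible profile region.
    region, _, rest = model_id.partition(".")
    if region in wl["profile_regions"]:
        return rest.startswith(wl["families"])
    # Otherwise match the plain model ID against the family prefixes.
    return model_id.startswith(wl["families"])


wl = load_whitelist(
    '{"families": ["anthropic.claude", "amazon.nova"],'
    ' "profile_regions": ["us", "global"], "model_ids": []}'
)
print(is_allowed("us.anthropic.claude-3-sonnet-20240229-v1:0", wl))
```

Under these assumptions, a profile ID with a region outside `profile_regions` (such as an `eu.` prefix) is rejected even when its family is listed.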

**Example Request**

3 changes: 3 additions & 0 deletions docs/Usage_CN.md
@@ -7,6 +7,9 @@
```bash
export OPENAI_API_KEY=<API key>
export OPENAI_BASE_URL=<API base url>
# 可选:使用 VPC Interface Endpoint 或自定义 Bedrock Endpoint
# export BEDROCK_URL=https://vpce-xxxxxxxx.bedrock.<region>.vpce.amazonaws.com
# export BEDROCK_RUNTIME_URL=https://vpce-xxxxxxxx.bedrock-runtime.<region>.vpce.amazonaws.com
```

**API 示例:**