3 changes: 3 additions & 0 deletions README.md
@@ -155,6 +155,9 @@ Now, you can try out the proxy APIs. Let's say you want to test Claude 3 Sonnet
```bash
export OPENAI_API_KEY=<API key>
export OPENAI_BASE_URL=<API base url>
# Optional: use VPC interface endpoints / custom Bedrock endpoints
# export BEDROCK_URL=https://vpce-xxxxxxxx.bedrock.<region>.vpce.amazonaws.com
# export BEDROCK_RUNTIME_URL=https://vpce-xxxxxxxx.bedrock-runtime.<region>.vpce.amazonaws.com
# For older versions
# https://github.com/openai/openai-python/issues/624
export OPENAI_API_BASE=<API base url>
2 changes: 2 additions & 0 deletions deployment/BedrockProxy.template
@@ -85,6 +85,8 @@ Resources:
ENABLE_CROSS_REGION_INFERENCE: "true"
ENABLE_APPLICATION_INFERENCE_PROFILES: "true"
ENABLE_PROMPT_CACHING: !Ref EnablePromptCaching
BEDROCK_URL: ""
BEDROCK_RUNTIME_URL: ""
API_ROUTE_PREFIX: /v1
MemorySize: 1024
PackageType: Image
4 changes: 4 additions & 0 deletions deployment/BedrockProxyFargate.template
@@ -261,6 +261,10 @@ Resources:
- Name: ENABLE_PROMPT_CACHING
Value:
Ref: EnablePromptCaching
- Name: BEDROCK_URL
Value: ""
- Name: BEDROCK_RUNTIME_URL
Value: ""
Essential: true
Image:
Ref: ContainerImageUri
2 changes: 2 additions & 0 deletions docker-compose.yml
@@ -9,6 +9,8 @@ services:
- "127.0.0.1:8000:8080"
environment:
- ENABLE_PROMPT_CACHING=true
- BEDROCK_URL=${BEDROCK_URL:-}
- BEDROCK_RUNTIME_URL=${BEDROCK_RUNTIME_URL:-}
Comment on lines +12 to +13
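Minor: ${BEDROCK_URL:-} sets the variable to an empty string inside the container when it is unset on the host; it does not leave it unset.

This works today because _env_or_none() in setting.py converts "" to None. But it is a subtle coupling: if _env_or_none() were ever changed, the docker-compose default would break.

Note that ${BEDROCK_URL-} (without the :) has the same effect here, because with an empty default both forms interpolate to an empty string when the host variable is unset. To keep the variable unset inside the container, consider listing only the name (- BEDROCK_URL, as is already done for AWS_PROFILE), which passes the host value through when it is defined and otherwise leaves it unset. This removes the dependency on _env_or_none() for correctness.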
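
The two interpolation forms discussed in this comment can be compared with a small emulation. This is a sketch, not Compose itself: `interpolate` is a hypothetical helper that mirrors the documented rules, where `:-` substitutes the default when the variable is unset or empty and `-` substitutes it only when the variable is unset.

```python
def interpolate(name, host_env, colon, default=""):
    """Emulate ${NAME:-default} (colon=True) or ${NAME-default} (colon=False)."""
    if name not in host_env:
        return default  # unset: both forms fall back to the default
    value = host_env[name]
    if colon and value == "":
        return default  # set but empty: only the ':-' form falls back
    return value


# Compare both forms for an unset, an empty, and a populated host variable.
for host_env in ({}, {"BEDROCK_URL": ""}, {"BEDROCK_URL": "https://example.invalid"}):
    with_colon = interpolate("BEDROCK_URL", host_env, colon=True)
    without_colon = interpolate("BEDROCK_URL", host_env, colon=False)
    print(host_env, repr(with_colon), repr(without_colon))
```

With an empty default, both forms yield an empty string for an unset host variable, so a `BEDROCK_URL=${...}` entry still defines the variable in the container. The forms only differ when the default is non-empty and the host variable is set but empty.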

- API_KEY=${OPENAI_API_KEY}
- AWS_PROFILE
- AWS_ACCESS_KEY_ID
22 changes: 22 additions & 0 deletions docs/Usage.md
@@ -7,6 +7,9 @@ Assuming you have set up below environment variables after deployed:
```bash
export OPENAI_API_KEY=<API key>
export OPENAI_BASE_URL=<API base url>
# Optional: use VPC interface endpoints / custom Bedrock endpoints
# export BEDROCK_URL=https://vpce-xxxxxxxx.bedrock.<region>.vpce.amazonaws.com
# export BEDROCK_RUNTIME_URL=https://vpce-xxxxxxxx.bedrock-runtime.<region>.vpce.amazonaws.com
```

**API Example:**
@@ -23,6 +26,25 @@ You can use this API to get a list of supported model IDs.

Also, you can use this API to refresh the model list if new models are added to Amazon Bedrock.

You can optionally restrict which models are exposed by `/models` and accepted by chat requests using a whitelist JSON config:

```bash
export MODEL_WHITELIST_FILE=/app/config/model-whitelist.json
# or inline JSON
# export MODEL_WHITELIST_JSON='{"families":["anthropic.claude","amazon.nova"],"profile_regions":["us","global"],"model_ids":["arn:aws:bedrock:us-west-2:123456789012:application-inference-profile/my-profile"]}'
```

Example `model-whitelist.json`:

```json
{
"families": ["anthropic.claude", "amazon.nova"],
"profile_regions": ["us", "global"],
"model_ids": [
"arn:aws:bedrock:us-west-2:123456789012:application-inference-profile/my-profile"
]
}
```

**Example Request**

3 changes: 3 additions & 0 deletions docs/Usage_CN.md
@@ -7,6 +7,9 @@
```bash
export OPENAI_API_KEY=<API key>
export OPENAI_BASE_URL=<API base url>
# Optional: use a VPC Interface Endpoint or a custom Bedrock endpoint
# export BEDROCK_URL=https://vpce-xxxxxxxx.bedrock.<region>.vpce.amazonaws.com
# export BEDROCK_RUNTIME_URL=https://vpce-xxxxxxxx.bedrock-runtime.<region>.vpce.amazonaws.com
```

**API Example:**
65 changes: 65 additions & 0 deletions src/api/models/bedrock.py
@@ -44,8 +44,12 @@
)
from api.setting import (
    AWS_REGION,
    BEDROCK_RUNTIME_URL,
    BEDROCK_URL,
    DEBUG,
    DEFAULT_MODEL,
    MODEL_WHITELIST_FILE,
    MODEL_WHITELIST_JSON,
    ENABLE_CROSS_REGION_INFERENCE,
    ENABLE_APPLICATION_INFERENCE_PROFILES,
    ENABLE_PROMPT_CACHING,
@@ -66,11 +70,13 @@
bedrock_runtime = boto3.client(
    service_name="bedrock-runtime",
    region_name=AWS_REGION,
    endpoint_url=BEDROCK_RUNTIME_URL,
    config=config,
)
bedrock_client = boto3.client(
    service_name="bedrock",
    region_name=AWS_REGION,
    endpoint_url=BEDROCK_URL,
    config=config,
)
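
Both clients above receive the override as-is. As a sketch of one way to reject an obviously malformed value (for example a bare hostname with no scheme) before it reaches `boto3.client`, the check below uses only the standard library. `validate_endpoint_url` is a hypothetical helper and is not part of this PR; `None` is passed through unchanged because `endpoint_url=None` leaves boto3 on its default endpoint.

```python
from urllib.parse import urlsplit


def validate_endpoint_url(url):
    """Return url unchanged if it looks like an HTTP(S) endpoint, else raise."""
    if url is None:
        return None  # no override configured: keep the default endpoint
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https") or not parts.netloc:
        raise ValueError(f"Invalid Bedrock endpoint URL: {url!r}")
    return url


# Illustrative value only; the hostname is made up for this example.
print(validate_endpoint_url("https://vpce-0123.bedrock.us-west-2.vpce.amazonaws.com"))
```

The result could then be passed as `endpoint_url=validate_endpoint_url(BEDROCK_URL)`, assuming the setting has already been normalized to either a string or `None`.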

@@ -107,6 +113,56 @@
}


def _load_model_whitelist() -> dict:
    """Load model whitelist config from env JSON string or JSON file."""
    if MODEL_WHITELIST_JSON:
        try:
            return json.loads(MODEL_WHITELIST_JSON)
        except json.JSONDecodeError as e:
            logger.warning("Invalid MODEL_WHITELIST_JSON. Ignoring whitelist. error=%s", e)
            return {}

    if MODEL_WHITELIST_FILE:
        try:
            with open(MODEL_WHITELIST_FILE, encoding="utf-8") as f:
                return json.load(f)
        except Exception as e:
            logger.warning("Unable to load MODEL_WHITELIST_FILE=%s. Ignoring whitelist. error=%s", MODEL_WHITELIST_FILE, e)
            return {}

    return {}
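
`json.loads` and `json.load` can return any JSON value, not only an object. A top-level list would make the later `whitelist.get(...)` calls fail, and a string where a list is expected (for example `"families": "anthropic.claude"`) would be iterated character by character. The following is a minimal sketch of a shape check that could run on the parsed value; `parse_whitelist` and `KNOWN_KEYS` are hypothetical names, not part of this PR, and raising `ValueError` is just one possible policy.

```python
import json

KNOWN_KEYS = ("model_ids", "families", "profile_regions")


def parse_whitelist(text):
    """Parse whitelist JSON and ensure every known key is a list of strings."""
    raw = json.loads(text)
    if not isinstance(raw, dict):
        raise ValueError("whitelist must be a JSON object")
    parsed = {}
    for key in KNOWN_KEYS:
        value = raw.get(key, [])  # a missing key behaves like an empty selector
        if not isinstance(value, list) or not all(isinstance(v, str) for v in value):
            raise ValueError(f"{key} must be a list of strings")
        parsed[key] = value
    return parsed


print(parse_whitelist('{"families": ["anthropic.claude", "amazon.nova"]}'))
```

Whether an invalid config should be ignored (as the loader above does for unparseable JSON) or should stop startup is a design decision this sketch leaves to the caller.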


def _is_allowed_by_whitelist(model_id: str, whitelist: dict) -> bool:
    """Check if model id is allowed by whitelist rules.

    Supported keys:
    - model_ids: exact model ids/profile ids
    - families: prefix match for foundation model ids (e.g. anthropic.claude, amazon.nova)
    - profile_regions: prefix before first '.' for cross-region profiles (e.g. us, eu, apac, global)
    """
    if not whitelist:
        return True

    model_ids = set(whitelist.get("model_ids", []))
    families = whitelist.get("families", [])
    profile_regions = set(whitelist.get("profile_regions", []))

    if model_ids and model_id in model_ids:
        return True

    if families and any(model_id.startswith(f"{family}.") or model_id.startswith(family) for family in families):
        return True

    if profile_regions and "." in model_id:
        prefix = model_id.split(".", 1)[0]
        if prefix in profile_regions:
            return True

    # If any selector is configured, default deny.
    return not any((model_ids, families, profile_regions))
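
To see how the three selectors combine, the rules above can be restated in a standalone function and tried against a few IDs. This is a sketch for experimentation: `is_allowed` is a hypothetical stand-in, and the `us.`/`eu.` prefixed IDs are illustrative values rather than IDs taken from this PR. The family test is written as a single `startswith(family)`, which accepts exactly the same IDs as the two-part check above, since any ID starting with `family + "."` also starts with `family`.

```python
def is_allowed(model_id, whitelist):
    """Standalone restatement of the whitelist rules, for experimentation."""
    if not whitelist:
        return True
    model_ids = set(whitelist.get("model_ids", []))
    families = whitelist.get("families", [])
    profile_regions = set(whitelist.get("profile_regions", []))
    if model_id in model_ids:
        return True  # exact match
    if any(model_id.startswith(family) for family in families):
        return True  # family prefix match
    if "." in model_id and model_id.split(".", 1)[0] in profile_regions:
        return True  # region prefix match, regardless of family
    # Default deny once any selector is configured.
    return not any((model_ids, families, profile_regions))


whitelist = {"families": ["anthropic.claude"], "profile_regions": ["us"]}
for model_id in (
    "anthropic.claude-3-sonnet-20240229-v1:0",
    "cohere.embed-multilingual-v3",
    "us.anthropic.claude-3-sonnet-20240229-v1:0",
    "us.meta.llama3-1-8b-instruct-v1:0",
    "eu.anthropic.claude-3-sonnet-20240229-v1:0",
):
    print(model_id, is_allowed(model_id, whitelist))
```

The selectors are OR-ed, so a `profile_regions` entry admits every profile with that prefix even when its underlying model is outside the listed families, while a family entry does not match region-prefixed profile IDs.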


def list_bedrock_models() -> dict:
    """Automatically getting a list of supported models.

@@ -116,6 +172,7 @@ def list_bedrock_models() -> dict:
    - Application Inference Profiles (if enabled via Env)
    """
    model_list = {}
    whitelist = _load_model_whitelist()
    try:
        if ENABLE_CROSS_REGION_INFERENCE:
            # List system defined inference profile IDs and store underlying model mapping
@@ -203,6 +260,14 @@ def list_bedrock_models() -> dict:
        # In case stack not updated.
        model_list[DEFAULT_MODEL] = {"modalities": ["TEXT", "IMAGE"]}

    if whitelist:
        model_list = {
            model_id: metadata
            for model_id, metadata in model_list.items()
            if _is_allowed_by_whitelist(model_id, whitelist)
        }
        logger.info("Applied model whitelist, allowed_models=%d", len(model_list))

    return model_list


4 changes: 4 additions & 0 deletions src/api/setting.py
@@ -11,8 +11,12 @@

DEBUG = os.environ.get("DEBUG", "false").lower() != "false"
AWS_REGION = os.environ.get("AWS_REGION", "us-west-2")
BEDROCK_URL = os.environ.get("BEDROCK_URL")
BEDROCK_RUNTIME_URL = os.environ.get("BEDROCK_RUNTIME_URL")
DEFAULT_MODEL = os.environ.get("DEFAULT_MODEL", "anthropic.claude-3-sonnet-20240229-v1:0")
DEFAULT_EMBEDDING_MODEL = os.environ.get("DEFAULT_EMBEDDING_MODEL", "cohere.embed-multilingual-v3")
ENABLE_CROSS_REGION_INFERENCE = os.environ.get("ENABLE_CROSS_REGION_INFERENCE", "true").lower() != "false"
ENABLE_APPLICATION_INFERENCE_PROFILES = os.environ.get("ENABLE_APPLICATION_INFERENCE_PROFILES", "true").lower() != "false"
ENABLE_PROMPT_CACHING = os.environ.get("ENABLE_PROMPT_CACHING", "false").lower() != "false"
MODEL_WHITELIST_FILE = os.environ.get("MODEL_WHITELIST_FILE")
MODEL_WHITELIST_JSON = os.environ.get("MODEL_WHITELIST_JSON")
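
The review comment on docker-compose.yml refers to an `_env_or_none()` helper in setting.py whose body is not shown in this diff. As an assumption about what such a helper might look like, the sketch below maps an unset, empty, or whitespace-only variable to `None`. The name `env_or_none`, the `environ` parameter, and the whitespace stripping are illustrative choices, not the PR's implementation.

```python
import os


def env_or_none(name, environ=os.environ):
    """Return the stripped value of an environment variable, or None if unset/blank."""
    value = environ.get(name)
    if value is None:
        return None  # variable is not defined at all
    value = value.strip()
    return value or None  # treat "" and whitespace-only values as unset


# Hypothetical usage mirroring the settings above.
BEDROCK_URL = env_or_none("BEDROCK_URL")
BEDROCK_RUNTIME_URL = env_or_none("BEDROCK_RUNTIME_URL")
```

With this kind of normalization, templates that define the variables as empty strings behave the same as deployments that leave them undefined.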