ADD VPC Endpoint Support & Model Whitelisting #231

Open
weisser-dev wants to merge 15 commits into aws-samples:main from HUK-COBURG:main

Conversation

@weisser-dev weisser-dev commented Mar 4, 2026

Motivation

  • Allow deploying the proxy to environments that require custom Bedrock endpoints (e.g., VPC interface endpoints / PrivateLink) by making Bedrock service endpoints configurable at runtime.
  • Provide a way to restrict which Bedrock models and cross-region profiles the proxy exposes and accepts so operators can limit available models by family and region.
  • Allow operators to supply the whitelist via an environment variable (inline JSON) or a JSON file for flexible deployment configurations.
  • Ensure the /models discovery and request validation respect the configured whitelist so non-allowed models are effectively blocked.
Description

  • Added BEDROCK_URL and BEDROCK_RUNTIME_URL environment variables in src/api/setting.py to expose endpoint overrides.
  • Passed the new settings into boto3 client initialization for bedrock and bedrock-runtime using the endpoint_url parameter in src/api/models/bedrock.py.
  • Updated deployment artifacts to propagate the new env vars: deployment/BedrockProxy.template, deployment/BedrockProxyFargate.template, and docker-compose.yml now include BEDROCK_URL and BEDROCK_RUNTIME_URL entries.
  • Documented optional env vars in README.md, docs/Usage.md, and docs/Usage_CN.md with examples and comments for VPC interface endpoints.
  • Added environment settings MODEL_WHITELIST_FILE and MODEL_WHITELIST_JSON in src/api/setting.py to accept whitelist configuration.
  • Implemented _load_model_whitelist() and _is_allowed_by_whitelist() in src/api/models/bedrock.py to read and evaluate the whitelist.
  • Applied whitelist filtering in list_bedrock_models() so the discovered model_list is reduced to allowed models (defaults to no filtering when no whitelist is provided).
  • Updated docs/Usage.md with examples showing how to configure MODEL_WHITELIST_FILE/MODEL_WHITELIST_JSON and a sample model-whitelist.json schema.
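
A minimal sketch of how an optional endpoint override like BEDROCK_URL or BEDROCK_RUNTIME_URL could be turned into boto3 client arguments. The helper name client_kwargs is hypothetical and not taken from the PR, and the https check is an assumption added for illustration. The returned dict would be passed along as boto3.client("bedrock-runtime", **kwargs), where endpoint_url is a real boto3 parameter.

```python
import os
from urllib.parse import urlparse


def client_kwargs(env_var: str, environ=os.environ) -> dict:
    """Build extra keyword arguments for a boto3 client (hypothetical helper)."""
    url = (environ.get(env_var) or "").strip()
    if not url:
        # Unset or empty: let boto3 resolve its default regional endpoint.
        return {}
    parsed = urlparse(url)
    if parsed.scheme != "https" or not parsed.netloc:
        # Assumed policy: only accept absolute https:// endpoints.
        raise ValueError(f"{env_var} must be an https:// URL, got {url!r}")
    return {"endpoint_url": url}
```

With this shape, deployments that never set the variables behave exactly as before, while a VPC interface endpoint URL is passed through unchanged.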
Testing

  • Ran Python bytecode compilation over the API package with python -m compileall src/api, which completed successfully.
  • Tried it in our K8s cluster with 4 replicas and in production; it works fine.

weisser-dev changed the title from "ADD VPC Endpoint Support" to "ADD VPC Endpoint Support & Model Whitelisting" Mar 5, 2026
Member

@zxkane zxkane left a comment

Thanks for the contribution — VPC endpoint support and model whitelisting are both genuinely useful features. The design is clean and the docs are well updated.

However, there is one bug that will cause a service crash in the default deployment configuration, so requesting changes before this can be merged.

Comment thread src/api/setting.py Outdated
Comment thread src/api/models/bedrock.py Outdated
Comment thread src/api/models/bedrock.py Outdated
@weisser-dev
Author

Thanks @zxkane for the feedback; it should be fixed now. Since we only use VPC endpoints, I had not tried what happens without setting them. Good points from your side!

Member

@zxkane zxkane left a comment

Thanks for the contribution! The VPC endpoint support and model whitelisting features are valuable additions. However, there are some important issues to address before merging:

Critical:

  • The whitelist fails open on misconfiguration — invalid JSON or missing file silently falls back to exposing all models. A security control must fail closed (raise at startup).

Important:

  • No schema validation on whitelist JSON — typos in key names (e.g., famlies instead of families) silently allow all models through.
  • CloudFormation templates hardcode empty strings instead of using !Ref parameters, inconsistent with the existing pattern.
  • Inconsistent or None handling between URL settings and whitelist settings.

Suggestions:

  • Add a warning/error log when the whitelist filters out ALL models (likely misconfiguration).
  • Consider basic URL validation for endpoint URLs (e.g., must start with https://).

See inline comments for details and suggested fixes.
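
To make the fail-closed and schema-validation points concrete, here is a minimal sketch of a whitelist loader that raises at startup instead of falling back to "allow everything". The function name load_whitelist, the ALLOWED_KEYS set, and the "list of strings" value shape are assumptions for illustration; the PR's actual model-whitelist.json schema is not shown in this thread. The sketch also assumes the file takes precedence when both sources are given.

```python
import json
from typing import Optional

# Assumed schema keys; the real whitelist schema may differ.
ALLOWED_KEYS = {"families", "regions"}


def load_whitelist(raw_json: Optional[str] = None,
                   path: Optional[str] = None) -> Optional[dict]:
    """Return the parsed whitelist, None if nothing is configured, or raise."""
    if not raw_json and not path:
        return None  # no whitelist configured: no filtering
    if path:
        # A missing or unreadable file raises OSError and aborts startup.
        with open(path, encoding="utf-8") as fh:
            raw_json = fh.read()
    try:
        data = json.loads(raw_json)
    except json.JSONDecodeError as exc:
        raise ValueError(f"model whitelist is not valid JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise ValueError("model whitelist must be a JSON object")
    unknown = set(data) - ALLOWED_KEYS
    if unknown:
        # Catches typos such as "famlies" instead of silently ignoring them.
        raise ValueError(f"unknown whitelist keys: {sorted(unknown)}")
    for key, value in data.items():
        if not isinstance(value, list) or not all(isinstance(v, str) for v in value):
            raise ValueError(f"whitelist key {key!r} must be a list of strings")
    return data
```

The only path that returns None is "nothing configured"; every misconfiguration surfaces as an exception rather than an empty filter.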

Comment thread src/api/models/bedrock.py
Comment thread src/api/models/bedrock.py
Comment thread deployment/BedrockProxy.template Outdated
Comment thread src/api/setting.py Outdated
Comment thread src/api/models/bedrock.py Outdated
@zxkane
Member

zxkane commented Mar 10, 2026

Hi @weisser-dev, thanks for addressing the first round of feedback — the or None fix, module-level whitelist caching, and simplified startswith check all look good now.

Just a heads-up: there are 5 open review threads from the second review that still need to be addressed before this can be merged:

  1. Critical: _load_model_whitelist() fails open on misconfiguration — invalid JSON or missing file silently allows all models through. A security control should fail closed (raise at startup).
  2. Important: No schema validation on whitelist JSON — typos in key names (e.g., famlies) or wrong value types silently allow all models.
  3. Important: CloudFormation templates hardcode empty strings for BEDROCK_URL/BEDROCK_RUNTIME_URL instead of using !Ref parameters, inconsistent with the existing pattern.
  4. Important: MODEL_WHITELIST_FILE and MODEL_WHITELIST_JSON don't use the or None pattern, inconsistent with the endpoint URL settings.
  5. Suggestion: Log at error/warning level when the whitelist filters out ALL models (likely misconfiguration).

Please check the inline comments for details and suggested fixes. Let me know if you have any questions!

@weisser-dev
Author

weisser-dev commented Mar 17, 2026

@zxkane I also fixed some bugs, such as: [ERROR] Bedrock validation error for model qwen.qwen3-235b-a22b-2507-v1:0: An error occurred (ValidationException) when calling the ConverseStream operation: This model doesn't support the stopSequences field. Remove stopSequences and try again.
I also fixed Sonnet bugs, because the newest models seem to be broken when used with continue, so these fixes are included here. I think this is a big advantage for your users, so maybe you could approve now? It works fine on our side, whether or not I provide a whitelisted model list.

For context: we run it with 4 pods on K8s to access Bedrock models via VPC endpoints for AI-assisted coding. Many people use it, and it has been very stable.

Member

@zxkane zxkane left a comment

Thanks for addressing all the previous review feedback — the whitelist validation, fail-closed behavior, and _env_or_none pattern all look solid now.

A few new items from the latest push:

Important (2):

  • _safe_text silently rewrites empty/whitespace messages with proxy-injected text — this changes model behavior without the caller knowing
  • Whitelist only filters /models discovery but not request-time validation — a user who knows a model ID can bypass the whitelist

Minor (3):

  • Substring matching in NO_STOP_SEQUENCES_MODELS (low risk but worth noting)
  • Unrelated bug fixes (prefill models, stop sequences, safe text) bundled with VPC/whitelist feature — consider separating commits
  • docker-compose.yml uses ${VAR:-} (empty string) instead of ${VAR-} (unset) — subtle coupling with _env_or_none

The VPC endpoint and whitelist core implementation look good and production-ready.

Comment thread src/api/models/bedrock.py
Comment on lines +1300 to +1304
def _safe_text(text: str | None) -> str:
    if text is None:
        return ""
    return text if text.strip() else "[empty message omitted by proxy]"

Member

Important: _safe_text silently mutates user input — this can change model behavior.

If a user intentionally sends " " or "", the proxy rewrites it to "[empty message omitted by proxy]" — a visible string the model will interpret as actual content. This changes the semantics of the request without the caller knowing.

A few concerns:

  1. The replacement text "[empty message omitted by proxy]" will be treated as a real instruction/content by the model.
  2. The signature accepts text: str | None, but TextContent.text should never be None per the OpenAI chat completions schema. If there's a real case causing this, it would be good to document what client/scenario produces it.

Suggestion: either pass the empty string through and let Bedrock return a validation error naturally (so the caller can fix their request), or skip the empty content part entirely instead of injecting replacement text.

def _safe_text(text: str | None) -> str:
    return text if text else ""
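
The other option mentioned above, skipping empty content parts instead of injecting replacement text, could look roughly like the sketch below. It assumes content parts are dicts in the Converse style, where a text part carries a "text" key; the helper name drop_empty_text_parts is hypothetical and does not appear in the PR.

```python
def drop_empty_text_parts(parts: list) -> list:
    """Keep non-text parts and text parts that contain non-whitespace text."""
    kept = []
    for part in parts:
        if "text" in part and not (part["text"] or "").strip():
            # Omit the part entirely rather than substituting placeholder text.
            continue
        kept.append(part)
    return kept
```

If every part of a message is dropped this way, the caller still has to decide whether to drop the whole message or let Bedrock reject the request; this sketch leaves that choice open.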

Comment thread src/api/models/bedrock.py
Comment on lines +913 to +918
# Some models reject stopSequences entirely (ValidationException)
if any(no_stop_model in model_lower for no_stop_model in NO_STOP_SEQUENCES_MODELS):
    if DEBUG:
        logger.info(f"Skipped stopSequences for {chat_request.model} (not supported by model)")
else:
    inference_config["stopSequences"] = stop
Member

Nit: Substring matching via in is fragile.

any(no_stop_model in model_lower ...) does a substring match. If a future model ID happens to contain qwen3-235b-a22b-2507-v1:0 as a substring, it would incorrectly match.

The existing patterns elsewhere in the codebase (e.g., TEMPERATURE_TOPP_CONFLICT_MODELS) use the same substring approach, and the full versioned ID with :0 makes accidental collision unlikely — but worth being aware of.
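
For comparison, an exact-match variant might look like the sketch below. It is not what the codebase does today; the prefix list for cross-region profiles and the helper names base_model_id and supports_stop_sequences are assumptions for illustration.

```python
NO_STOP_SEQUENCES_MODELS = {"qwen.qwen3-235b-a22b-2507-v1:0"}

# Assumed cross-region profile prefixes; adjust to the profiles actually in use.
REGION_PREFIXES = {"us", "eu", "apac", "global"}


def base_model_id(model_id: str) -> str:
    """Strip a leading region prefix such as 'us.' if one is present."""
    prefix, _, rest = model_id.partition(".")
    return rest if prefix in REGION_PREFIXES and rest else model_id


def supports_stop_sequences(model_id: str) -> bool:
    """Exact match on the normalized ID instead of a substring test."""
    return base_model_id(model_id.lower()) not in NO_STOP_SEQUENCES_MODELS
```

The trade-off is that exact matching needs an explicit normalization step for profile IDs, which the substring approach gets for free.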

Comment thread src/api/models/bedrock.py
Comment on lines 109 to 111
# For these models, if conversation ends with assistant message (e.g., "continue response"),
# a user message will be added to ask the model to continue
NO_ASSISTANT_PREFILL_MODELS = {
Member

Note: These additions are unrelated to VPC endpoint / whitelist features.

Adding claude-sonnet-4-5 and claude-sonnet-4-6 to NO_ASSISTANT_PREFILL_MODELS is a legitimate fix, but bundling unrelated bug fixes with the VPC/whitelist feature makes the PR harder to review and git bisect. Consider splitting into separate commits at minimum.

Comment thread src/api/models/bedrock.py
Comment on lines 301 to +312
# In case stack not updated.
model_list[DEFAULT_MODEL] = {"modalities": ["TEXT", "IMAGE"]}

if whitelist:
    model_list = {
        model_id: metadata
        for model_id, metadata in model_list.items()
        if _is_allowed_by_whitelist(model_id, whitelist)
    }
    if not model_list:
        logger.error("Model whitelist filtered out ALL models. Check whitelist configuration.")
    else:
Member

Suggestion: Whitelist only filters /models listing, not request-time validation.

The PR description states "request validation respect the configured whitelist so non-allowed models are effectively blocked", but _parse_request() / chat() in BedrockModel does not check _MODEL_WHITELIST. A user who knows a model ID can still send a chat request to a non-whitelisted model — the whitelist is only enforced on discovery.

If the whitelist is intended as a security control (restricting which models can be invoked), consider also adding a check in _parse_request():

if _MODEL_WHITELIST and not _is_allowed_by_whitelist(chat_request.model, _MODEL_WHITELIST):
    raise HTTPException(status_code=403, detail=f"Model {chat_request.model} is not allowed by whitelist")

Comment thread docker-compose.yml
Comment on lines +12 to +13
- BEDROCK_URL=${BEDROCK_URL:-}
- BEDROCK_RUNTIME_URL=${BEDROCK_RUNTIME_URL:-}
Member

Minor: ${BEDROCK_URL:-} sets an empty string when unset, not unset.

This works today because _env_or_none() in setting.py converts "" to None. But it's a subtle coupling — if _env_or_none were ever changed, the docker-compose default would break.

Consider using ${BEDROCK_URL-} (without :) so the env var remains unset inside the container when not defined on the host, rather than being set to an empty string. This removes the dependency on _env_or_none() for correctness.
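
The coupling described above can be illustrated with a sketch of what a helper like _env_or_none() presumably does. The actual implementation in setting.py is not shown in this thread, so the body below (including the whitespace stripping) is an assumption.

```python
import os
from typing import Mapping, Optional


def env_or_none(name: str, environ: Mapping[str, str] = os.environ) -> Optional[str]:
    """Treat an unset, empty, or whitespace-only variable as not configured."""
    value = environ.get(name, "").strip()
    return value or None
```

With a helper of this shape, an empty string produced by Compose interpolation and a genuinely unset variable are handled identically, which is why the current docker-compose.yml works.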

2 participants