ADD VPC Endpoint Support & Model Whitelisting#231
Conversation
Add configurable Bedrock endpoint URLs via environment variables
…nfiguration Add configurable Bedrock model whitelist filtering
zxkane
left a comment
There was a problem hiding this comment.
Thanks for the contribution — VPC endpoint support and model whitelisting are both genuinely useful features. The design is clean and the docs are well updated.
However, there is one bug that will cause a service crash in the default deployment configuration, so requesting changes before this can be merged.
|
Thx @zxkane for the feedback, should be fixed now, since we only use VPC Endpoints I not tried it out what happens without setting them, good points from your side! |
zxkane
left a comment
There was a problem hiding this comment.
Thanks for the contribution! The VPC endpoint support and model whitelisting features are valuable additions. However, there are some important issues to address before merging:
Critical:
- The whitelist fails open on misconfiguration — invalid JSON or missing file silently falls back to exposing all models. A security control must fail closed (raise at startup).
Important:
- No schema validation on whitelist JSON — typos in key names (e.g.,
famliesinstead offamilies) silently allow all models through. - CloudFormation templates hardcode empty strings instead of using
!Refparameters, inconsistent with the existing pattern. - Inconsistent
or Nonehandling between URL settings and whitelist settings.
Suggestions:
- Add a warning/error log when the whitelist filters out ALL models (likely misconfiguration).
- Consider basic URL validation for endpoint URLs (e.g., must start with
https://).
See inline comments for details and suggested fixes.
|
Hi @weisser-dev, thanks for addressing the first round of feedback — the Just a heads-up: there are 5 open review threads from the second review that still need to be addressed before this can be merged:
Please check the inline comments for details and suggested fixes. Let me know if you have any questions! |
…claude-sonnet-4.6 Fix Claude Sonnet 4.5/4.6 ConverseStream validation failures
…rtant-review-feedback Harden model whitelist handling and parameterize Bedrock endpoint URLs
…ror-for-qwen-model Codex-generated pull request
merge main into our branch
|
@zxkane i also fixed some bugs like: [ERROR] Bedrock validation error for model qwen.qwen3-235b-a22b-2507-v1:0: An error occurred (ValidationException) when calling the ConverseStream operation: This model doesn't support the stopSequences field. Remove stopSequences and try again. maybe as an info - we used it with 4 pods, on k8s to have access via vpc endpoints to bedrock models for ai assisted coding... and this is used by many people very very stable... |
zxkane
left a comment
There was a problem hiding this comment.
Thanks for addressing all the previous review feedback — the whitelist validation, fail-closed behavior, and _env_or_none pattern all look solid now.
A few new items from the latest push:
Important (2):
_safe_textsilently rewrites empty/whitespace messages with proxy-injected text — this changes model behavior without the caller knowing- Whitelist only filters
/modelsdiscovery but not request-time validation — a user who knows a model ID can bypass the whitelist
Minor (3):
- Substring matching in
NO_STOP_SEQUENCES_MODELS(low risk but worth noting) - Unrelated bug fixes (prefill models, stop sequences, safe text) bundled with VPC/whitelist feature — consider separating commits
docker-compose.ymluses${VAR:-}(empty string) instead of${VAR-}(unset) — subtle coupling with_env_or_none
The VPC endpoint and whitelist core implementation look good and production-ready.
| def _safe_text(text: str | None) -> str: | ||
| if text is None: | ||
| return "" | ||
| return text if text.strip() else "[empty message omitted by proxy]" | ||
|
|
There was a problem hiding this comment.
Important: _safe_text silently mutates user input — this can change model behavior.
If a user intentionally sends " " or "", the proxy rewrites it to "[empty message omitted by proxy]" — a visible string the model will interpret as actual content. This changes the semantics of the request without the caller knowing.
A few concerns:
- The replacement text
"[empty message omitted by proxy]"will be treated as a real instruction/content by the model. text: str | None—TextContent.textshould never beNoneper the OpenAI chat completions schema. If there's a real case causing this, it would be good to document what client/scenario produces it.
Suggestion: either pass the empty string through and let Bedrock return a validation error naturally (so the caller can fix their request), or skip the empty content part entirely instead of injecting replacement text.
def _safe_text(text: str | None) -> str:
return text if text else ""| # Some models reject stopSequences entirely (ValidationException) | ||
| if any(no_stop_model in model_lower for no_stop_model in NO_STOP_SEQUENCES_MODELS): | ||
| if DEBUG: | ||
| logger.info(f"Skipped stopSequences for {chat_request.model} (not supported by model)") | ||
| else: | ||
| inference_config["stopSequences"] = stop |
There was a problem hiding this comment.
Nit: Substring matching via in is fragile.
any(no_stop_model in model_lower ...) does a substring match. If a future model ID happens to contain qwen3-235b-a22b-2507-v1:0 as a substring, it would incorrectly match.
The existing patterns elsewhere in the codebase (e.g., TEMPERATURE_TOPP_CONFLICT_MODELS) use the same substring approach, and the full versioned ID with :0 makes accidental collision unlikely — but worth being aware of.
| # For these models, if conversation ends with assistant message (e.g., "continue response"), | ||
| # a user message will be added to ask the model to continue | ||
| NO_ASSISTANT_PREFILL_MODELS = { |
There was a problem hiding this comment.
Note: These additions are unrelated to VPC endpoint / whitelist features.
Adding claude-sonnet-4-5 and claude-sonnet-4-6 to NO_ASSISTANT_PREFILL_MODELS is a legitimate fix, but bundling unrelated bug fixes with the VPC/whitelist feature makes the PR harder to review and git bisect. Consider splitting into separate commits at minimum.
| # In case stack not updated. | ||
| model_list[DEFAULT_MODEL] = {"modalities": ["TEXT", "IMAGE"]} | ||
|
|
||
| if whitelist: | ||
| model_list = { | ||
| model_id: metadata | ||
| for model_id, metadata in model_list.items() | ||
| if _is_allowed_by_whitelist(model_id, whitelist) | ||
| } | ||
| if not model_list: | ||
| logger.error("Model whitelist filtered out ALL models. Check whitelist configuration.") | ||
| else: |
There was a problem hiding this comment.
Suggestion: Whitelist only filters /models listing, not request-time validation.
The PR description states "request validation respect the configured whitelist so non-allowed models are effectively blocked", but _parse_request() / chat() in BedrockModel does not check _MODEL_WHITELIST. A user who knows a model ID can still send a chat request to a non-whitelisted model — the whitelist is only enforced on discovery.
If the whitelist is intended as a security control (restricting which models can be invoked), consider also adding a check in _parse_request():
if _MODEL_WHITELIST and not _is_allowed_by_whitelist(chat_request.model, _MODEL_WHITELIST):
raise HTTPException(status_code=403, detail=f"Model {chat_request.model} is not allowed by whitelist")| - BEDROCK_URL=${BEDROCK_URL:-} | ||
| - BEDROCK_RUNTIME_URL=${BEDROCK_RUNTIME_URL:-} |
There was a problem hiding this comment.
Minor: ${BEDROCK_URL:-} sets an empty string when unset, not unset.
This works today because _env_or_none() in setting.py converts "" to None. But it's a subtle coupling — if _env_or_none were ever changed, the docker-compose default would break.
Consider using ${BEDROCK_URL-} (without :) so the env var remains unset inside the container when not defined on the host, rather than being set to an empty string. This removes the dependency on _env_or_none() for correctness.
…rors-in-models Codex-generated pull request
Motivation
Allow operators to supply the whitelist via environment (inline JSON) or a JSON file for flexible deployment configurations.
Description
Description
BEDROCK_URLandBEDROCK_RUNTIME_URLenvironment variables insrc/api/setting.pyto expose endpoint overrides.bedrockandbedrock-runtimeusing theendpoint_urlparameter insrc/api/models/bedrock.py.deployment/BedrockProxy.template,deployment/BedrockProxyFargate.template, anddocker-compose.ymlnow includeBEDROCK_URLandBEDROCK_RUNTIME_URLentries.README.md,docs/Usage.md, anddocs/Usage_CN.mdwith examples and comments for VPC interface endpoints.Testing
python -m compileall src/api, which completed successfully.