feat: Add Bedrock provider support to inference configuration (#24)
Walkthrough
Adds boto3 to the container image dependencies, introduces a new remote::bedrock inference provider in the build specification, and configures a bedrock-inference provider in the runtime with AWS credential and retry/timeout settings via environment variables.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    autonumber
    participant Client
    participant Orchestrator as Inference Orchestrator
    participant Provider as Bedrock Provider (remote::bedrock)
    participant AWS as AWS Bedrock
    Client->>Orchestrator: Submit inference request
    Orchestrator->>Provider: Route request (model, payload)
    Note over Provider: Load AWS config from env<br/>(keys, region, retries/timeouts)
    Provider->>AWS: Invoke model inference
    AWS-->>Provider: Response / Error
    alt Success
        Provider-->>Orchestrator: Inference result
        Orchestrator-->>Client: Result
    else Error/Retry
        Provider->>AWS: Retry per retry_mode/attempts
        AWS-->>Provider: Final response
        Provider-->>Orchestrator: Result or failure
        Orchestrator-->>Client: Result or error
    end
```
Estimated code review effort: 🎯 2 (Simple) | ⏱️ ~10 minutes
Pre-merge checks: 3 passed ✅
Force-pushed from 5c0df6b to 97c7e6d (compare)
cc: @leseb, @nathan-weinberg
There is no need to cc us; the CODEOWNERS mechanism already auto-adds repo owners.
nathan-weinberg left a comment:
This looks good to me; I would like to wait for final approval from @leseb before merging.
Actionable comments posted: 1
🧹 Nitpick comments (3)
distribution/Containerfile (1)
11-11: Pin boto3 and ensure this change is generator-backed.
- Consider pinning to avoid surprise major upgrades:

```diff
-    boto3 \
+    boto3>=1.34,<2 \
```
- This file is auto-generated; please confirm build.py (or the generator inputs) now include boto3, otherwise a regen may drop it.
distribution/run.yaml (2)
22-34: Prefer the AWS default credential chain; avoid passing empty strings.
If the provider forwards empty strings to the boto3 Session, they can short-circuit the default credential chain. Either:
- ensure the provider treats empty values as None/unset, or
- drop the explicit credential keys and rely on env/IMDS/profile resolution by default.
Optionally, simplify the config (keeping region/timeouts) and only set credentials when needed.
30-34: Set safer defaults for retries/timeouts for inference.
Suggested tweaks:

```diff
-      total_max_attempts: ${env.AWS_MAX_ATTEMPTS:=}
-      retry_mode: ${env.AWS_RETRY_MODE:=}
-      connect_timeout: ${env.AWS_CONNECT_TIMEOUT:=60}
-      read_timeout: ${env.AWS_READ_TIMEOUT:=60}
+      total_max_attempts: ${env.AWS_MAX_ATTEMPTS:=8}
+      retry_mode: ${env.AWS_RETRY_MODE:=standard}
+      connect_timeout: ${env.AWS_CONNECT_TIMEOUT:=10}
+      read_timeout: ${env.AWS_READ_TIMEOUT:=300}
```

This balances faster failure on connect with enough read headroom for streaming responses.
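To make the `${env.VAR:=default}` fallback semantics concrete, here is a small Python sketch (the `aws_setting` helper is hypothetical, for illustration only) that resolves a setting from the environment with a typed default, treating unset and empty the same way:

```python
import os


def aws_setting(name, default, cast=str):
    """Resolve an AWS client setting from the environment.

    Falls back to `default` when the variable is unset or empty,
    mirroring the ${env.NAME:=default} syntax used in run.yaml.
    """
    raw = os.environ.get(name, "")
    return cast(raw) if raw else default


# With no env overrides set, the suggested defaults apply:
retry_config = {
    "total_max_attempts": aws_setting("AWS_MAX_ATTEMPTS", 8, int),
    "retry_mode": aws_setting("AWS_RETRY_MODE", "standard"),
    "connect_timeout": aws_setting("AWS_CONNECT_TIMEOUT", 10, int),
    "read_timeout": aws_setting("AWS_READ_TIMEOUT", 300, int),
}
print(retry_config)
```

Values shaped like this map directly onto `botocore.config.Config(retries={"total_max_attempts": ..., "mode": ...}, connect_timeout=..., read_timeout=...)`.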
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
- distribution/Containerfile (1 hunk)
- distribution/build.yaml (1 hunk)
- distribution/run.yaml (1 hunk)
🔇 Additional comments (3)
distribution/build.yaml (1)
7-7: Bedrock provider registration looks good.
The addition is syntactically correct and aligns with how other providers are listed.
distribution/Containerfile (1)
11-11: Sanity-check the Python baseline mismatch.
build.yaml references a Python 3.11 UBI image, while this Containerfile uses Python 3.12. Please confirm llama-stack==0.2.18 and the Bedrock provider are 3.12-compatible, or align the baselines.
distribution/run.yaml (1)
22-34: No change required: remote::bedrock is present in llama-stack v0.2.18.
Verified: remote::bedrock is included in v0.2.18 (support present since at least v0.2.10).
Added the remote::bedrock provider to both build.yaml and run.yaml, with AWS configuration options including credentials, region, retry settings, and connection timeouts.
Force-pushed from 97c7e6d to e9d51fd (compare)
What does this PR do?
Adds AWS Bedrock provider support to the llama-stack distribution by configuring the remote::bedrock provider in both build.yaml and run.yaml files. This enables users to leverage AWS Bedrock models for inference through the llama-stack framework.
The changes include:
- the remote::bedrock provider type added to the build.yaml inference providers
- a bedrock-inference provider configured in run.yaml

Summary by CodeRabbit
New Features
Chores