feat: add bedrock inference provider to redhat distribution #15
Configure AWS Bedrock as an inference provider with environment-based credentials and connection settings including timeouts and session TTL.
Walkthrough: Added a new inference provider entry.
Estimated code review effort: 🎯 2 (Simple) | ⏱️ ~10 minutes
@leseb @nathan-weinberg fyi, I created this PR as per our Slack discussion.
Actionable comments posted: 2
🧹 Nitpick comments (3)
redhat-distribution/run.yaml (3)
22-31: Expose retry controls to handle Bedrock throttling/transient failures.
Add `max_attempts`/`retry_mode` to avoid brittle behavior under load. Apply:

```diff
  read_timeout: ${env.AWS_READ_TIMEOUT:=60}
  session_ttl: ${env.AWS_SESSION_TTL:=3600}
+ max_attempts: ${env.AWS_MAX_ATTEMPTS:=5}
+ retry_mode: ${env.AWS_RETRY_MODE:=standard}
```
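Taken together with the existing timeout settings, the resulting provider entry might look like the sketch below. Note that `max_attempts`/`retry_mode` come from the suggestion above, and `provider_type: remote::bedrock` plus the exact config nesting are assumptions about llama-stack's schema, not verified against the actual run.yaml:

```yaml
# Sketch only: a Bedrock provider entry combining the existing timeout knobs
# with the suggested retry controls. provider_type and field nesting are assumed.
- provider_id: bedrock-inference
  provider_type: remote::bedrock
  config:
    region: ${env.AWS_REGION:=us-east-1}
    connect_timeout: ${env.AWS_CONNECT_TIMEOUT:=60}
    read_timeout: ${env.AWS_READ_TIMEOUT:=60}
    session_ttl: ${env.AWS_SESSION_TTL:=3600}
    max_attempts: ${env.AWS_MAX_ATTEMPTS:=5}      # retries under throttling/transient errors
    retry_mode: ${env.AWS_RETRY_MODE:=standard}   # "standard" or "adaptive" in AWS SDKs
```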
22-31: Time-out defaults are conservative; consider faster connect fail.
`connect_timeout=60s` is high for containerized deployments; 5–10s typically improves failure recovery without harming success rates. Apply:

```diff
- connect_timeout: ${env.AWS_CONNECT_TIMEOUT:=60}
+ connect_timeout: ${env.AWS_CONNECT_TIMEOUT:=10}
```
22-31: Add an endpoint override knob (air-gapped/alt partitions).
Optional but handy for Gov/ISO partitions, private VPC endpoints, or proxies. Apply:

```diff
  session_ttl: ${env.AWS_SESSION_TTL:=3600}
+ endpoint_url: ${env.AWS_BEDROCK_ENDPOINT_URL:=}
```
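As an illustration of how the override would be used, a deployment behind a private VPC interface endpoint could set the variable at deploy time. The variable name follows the suggestion above; the value shown is a placeholder, not a real endpoint:

```yaml
# Illustrative sketch: route Bedrock traffic through a custom endpoint.
# When AWS_BEDROCK_ENDPOINT_URL is unset, the SDK's regional default applies.
config:
  region: ${env.AWS_REGION:=us-east-1}
  endpoint_url: ${env.AWS_BEDROCK_ENDPOINT_URL:=}
  # e.g. set AWS_BEDROCK_ENDPOINT_URL to a vpce-* bedrock-runtime URL
  # for the target region or partition.
```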
📜 Review details
📒 Files selected for processing (2)
- redhat-distribution/build.yaml (1 hunks)
- redhat-distribution/run.yaml (1 hunks)
🔇 Additional comments (3)
redhat-distribution/build.yaml (1)
7-7: Bedrock provider added to distribution spec — LGTM.
Placement right after vLLM keeps the provider list orderly and predictable.

redhat-distribution/run.yaml (2)
22-31: Models section lacks a Bedrock-backed LLM entry.
As-is, only vLLM is selectable for LLM. Add a Bedrock model mapped to `provider_id: bedrock-inference` to truly "enable use alongside vLLM."

Suggested addition under `models` (alongside the existing first LLM entry):

```yaml
- metadata: {}
  model_id: ${env.BEDROCK_MODEL_ID:=}
  provider_id: bedrock-inference
  provider_model_id: ${env.BEDROCK_PROVIDER_MODEL_ID:=}
  model_type: llm
```

If your model routing expects only `model_id`, set `INFERENCE_MODEL` to the Bedrock `model_id` at deploy time.
Omit empty-string AWS credential fields to enable default fallback

The `${env.AWS_ACCESS_KEY_ID:=}`, `${env.AWS_SECRET_ACCESS_KEY:=}`, and `${env.AWS_SESSION_TOKEN:=}` assignments will emit empty-string values when unset, which most AWS SDKs treat as explicit (invalid) credentials rather than falling back to the default chain. Conditionally render or remove lines 25–27 in `redhat-distribution/run.yaml` when their env vars aren't provided.
```yaml
aws_access_key_id: ${env.AWS_ACCESS_KEY_ID:=}
aws_secret_access_key: ${env.AWS_SECRET_ACCESS_KEY:=}
aws_session_token: ${env.AWS_SESSION_TOKEN:=}
region: ${env.AWS_REGION:=us-east-1}
```
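A minimal sketch of the same block with the credential lines dropped, so the AWS SDK resolves credentials through its default chain (environment variables, shared credentials file, container/instance role) instead of receiving empty strings. The surrounding fields are taken from the config under review; only the omission is new:

```yaml
# Credentials intentionally omitted: with no explicit (empty) values set,
# the SDK falls back to its default credential provider chain.
config:
  region: ${env.AWS_REGION:=us-east-1}
  connect_timeout: ${env.AWS_CONNECT_TIMEOUT:=60}
  read_timeout: ${env.AWS_READ_TIMEOUT:=60}
  session_ttl: ${env.AWS_SESSION_TTL:=3600}
```

This sidesteps the empty-string problem entirely for deployments that rely on instance profiles or mounted credentials.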
This is not present downstream; please remove the default var. The default is `None` in llama-stack.
@skamenan7 please rebase!
There have been two duplicate PRs opened after this one - I'm assuming this one is irrelevant and can be closed |
…e-llama Add dockerfile and config files for llama-stack-core
What does this PR do?
Adds AWS Bedrock inference provider to the RedHat distribution. Users can now use Bedrock models alongside the existing vLLM option.
Test Plan
Summary by CodeRabbit

Configure AWS Bedrock as an inference provider with environment-based credentials and connection settings including timeouts and session TTL.