
feat: add bedrock inference provider to redhat distribution#15

Closed
skamenan7 wants to merge 1 commit into main from feat/aws-bedrock-inference-provider

Conversation

@skamenan7 (Collaborator) commented Sep 2, 2025

Configure AWS Bedrock as an inference provider with environment-based credentials and connection settings including timeouts and session TTL.

What does this PR do?

Adds the AWS Bedrock inference provider to the Red Hat distribution. Users can now run Bedrock models alongside the existing vLLM option.

Test Plan

  • The build script ran without errors and generated the Containerfile successfully.
  • The YAML config loads properly and recognizes all three inference providers (vllm, bedrock, sentence-transformers), even without AWS credentials set.

Summary by CodeRabbit

  • New Features
    • Added support for an AWS Bedrock remote inference provider.
    • Configurable via distribution and runtime settings, alongside existing providers.
    • Supports region, connection/read timeouts, session TTL, and optional AWS credentials.
    • Sensible defaults included (e.g., region us-east-1, 60s timeouts, 3600s TTL).

@coderabbitai bot (Contributor) commented Sep 2, 2025

Walkthrough

Added a new inference provider entry remote::bedrock to Red Hat distribution configs. It’s inserted in build.yaml within distribution_spec.providers.inference and appended in run.yaml with AWS credential fields, region, timeouts, and session TTL defaults. No other provider lists or control flow changed.

Changes

Cohort / File(s) Summary of modifications
Red Hat distribution config
redhat-distribution/build.yaml, redhat-distribution/run.yaml
Introduced remote::bedrock inference provider. In build.yaml, added to distribution_spec.providers.inference. In run.yaml, appended detailed bedrock-inference entry with AWS creds (empty defaults), region=us-east-1, connect/read timeouts=60, session_ttl=3600. Existing providers unchanged.
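Based on the walkthrough above, the appended bedrock-inference entry in run.yaml presumably looks something like this (a sketch reconstructed from the summary and the review diffs below; the exact key layout in the PR may differ):

```yaml
- provider_id: bedrock-inference
  provider_type: remote::bedrock
  config:
    aws_access_key_id: ${env.AWS_ACCESS_KEY_ID:=}
    aws_secret_access_key: ${env.AWS_SECRET_ACCESS_KEY:=}
    aws_session_token: ${env.AWS_SESSION_TOKEN:=}
    region: ${env.AWS_REGION:=us-east-1}
    connect_timeout: ${env.AWS_CONNECT_TIMEOUT:=60}
    read_timeout: ${env.AWS_READ_TIMEOUT:=60}
    session_ttl: ${env.AWS_SESSION_TTL:=3600}
```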


Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Poem

A bunny taps the YAML stone,
Adds Bedrock to the inference zone.
Credentials blank, the region set,
Timeouts ticking—steady bet.
In fields of config, carrots grow;
New hops unlocked—away we go! 🥕🚀


@skamenan7 (Collaborator, Author) commented
@leseb @nathan-weinberg FYI, I created this PR per our Slack discussion.

@coderabbitai bot (Contributor) left a review comment

Actionable comments posted: 2

🧹 Nitpick comments (3)
redhat-distribution/run.yaml (3)

22-31: Expose retry controls to handle Bedrock throttling/transient failures.
Add max_attempts/retry_mode to avoid brittle behavior under load.

Apply:

```diff
       read_timeout: ${env.AWS_READ_TIMEOUT:=60}
       session_ttl: ${env.AWS_SESSION_TTL:=3600}
+      max_attempts: ${env.AWS_MAX_ATTEMPTS:=5}
+      retry_mode: ${env.AWS_RETRY_MODE:=standard}
```

22-31: Timeout defaults are high; consider failing the connect faster.
connect_timeout=60s is long for containerized deployments; 5–10s typically improves failure recovery without hurting success rates.

Apply:

```diff
-      connect_timeout: ${env.AWS_CONNECT_TIMEOUT:=60}
+      connect_timeout: ${env.AWS_CONNECT_TIMEOUT:=10}
```

22-31: Add an endpoint override knob (air-gapped/alt partitions).
Optional but handy for Gov/ISO partitions, private VPC endpoints, or proxies.

Apply:

```diff
       session_ttl: ${env.AWS_SESSION_TTL:=3600}
+      endpoint_url: ${env.AWS_BEDROCK_ENDPOINT_URL:=}
```
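The `${env.NAME:=default}` placeholders used throughout these suggestions resolve to the environment variable when it is set, and to the literal default otherwise. A minimal Python sketch of that substitution (my own illustration; llama-stack's actual resolver may behave differently, e.g. for nesting or escaping):

```python
import os
import re

# Illustration only: a minimal re-implementation of the ${env.NAME:=default}
# placeholder syntax seen in run.yaml. Not code from this PR.
_PLACEHOLDER = re.compile(r"\$\{env\.(\w+):=([^}]*)\}")

def resolve(value: str) -> str:
    """Expand each ${env.NAME:=default} to os.environ[NAME], or the default."""
    return _PLACEHOLDER.sub(
        lambda m: os.environ.get(m.group(1), m.group(2)), value
    )

# With AWS_REGION unset, the default survives:
os.environ.pop("AWS_REGION", None)
print(resolve("region: ${env.AWS_REGION:=us-east-1}"))  # region: us-east-1
```

Note that an unset variable and `:=` with an empty default both yield an empty string, which is exactly the credential-fallback problem flagged in the actionable comment below.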
📥 Commits

Reviewing files that changed from the base of the PR and between 5061e56 and 8cdccbc.

📒 Files selected for processing (2)
  • redhat-distribution/build.yaml (1 hunks)
  • redhat-distribution/run.yaml (1 hunks)
🔇 Additional comments (3)
redhat-distribution/build.yaml (1)

7-7: Bedrock provider added to distribution spec — LGTM.
Placement right after vLLM keeps the provider list orderly and predictable.

redhat-distribution/run.yaml (2)

22-31: Models section lacks a Bedrock-backed LLM entry.
As-is, only vLLM is selectable for LLM. Add a Bedrock model mapped to provider_id: bedrock-inference to truly “enable use alongside vLLM.”

Suggested addition under models (alongside the existing first LLM entry):

```yaml
- metadata: {}
  model_id: ${env.BEDROCK_MODEL_ID:=}
  provider_id: bedrock-inference
  provider_model_id: ${env.BEDROCK_PROVIDER_MODEL_ID:=}
  model_type: llm
```

If your model routing expects only model_id, set INFERENCE_MODEL to the Bedrock model_id at deploy time.


22-31: Omit empty-string AWS credential fields to enable default fallback.
The ${env.AWS_ACCESS_KEY_ID:=}, ${env.AWS_SECRET_ACCESS_KEY:=}, and ${env.AWS_SESSION_TOKEN:=} assignments emit empty-string values when the variables are unset, and most AWS SDKs treat empty strings as explicit (invalid) credentials rather than falling back to the default credential chain. Conditionally render or remove lines 25–27 in redhat-distribution/run.yaml when their env vars aren't provided.

```yaml
aws_access_key_id: ${env.AWS_ACCESS_KEY_ID:=}
aws_secret_access_key: ${env.AWS_SECRET_ACCESS_KEY:=}
aws_session_token: ${env.AWS_SESSION_TOKEN:=}
region: ${env.AWS_REGION:=us-east-1}
```
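The suggested fallback behavior can be sketched in Python (a hypothetical helper, not code from this PR; field and env-var names follow the reviewed run.yaml entry):

```python
import os

# Sketch of the reviewer's fix: include credential keys only when their
# environment variables are set and non-empty, so an AWS SDK can fall back
# to its default credential chain instead of seeing explicit empty strings.
_CRED_VARS = {
    "aws_access_key_id": "AWS_ACCESS_KEY_ID",
    "aws_secret_access_key": "AWS_SECRET_ACCESS_KEY",
    "aws_session_token": "AWS_SESSION_TOKEN",
}

def bedrock_config() -> dict:
    """Build the provider config dict, omitting unset/empty credentials."""
    config = {"region": os.environ.get("AWS_REGION") or "us-east-1"}
    for field, var in _CRED_VARS.items():
        value = os.environ.get(var)
        if value:  # skips both unset and empty-string values
            config[field] = value
    return config
```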
A Collaborator commented:

This is not present downstream; please remove the default var. The default is None in llama-stack.

@leseb (Collaborator) commented Sep 3, 2025

@skamenan7 please rebase!

@nathan-weinberg (Collaborator) commented

There have been two duplicate PRs opened after this one; I'm assuming this one is irrelevant and can be closed.

@nathan-weinberg nathan-weinberg deleted the feat/aws-bedrock-inference-provider branch September 11, 2025 17:36
leseb pushed a commit to leseb/llama-stack-distribution that referenced this pull request Jan 9, 2026
…e-llama

Add dockerfile and config files for llama-stack-core