feat: allow for more customization around embedding model by nathan-weinberg · Pull Request #157 · opendatahub-io/llama-stack-distribution

nathan-weinberg · 2025-12-09T18:51:10Z

What does this PR do?

Make the embedding dimension, model_id, and provider_model_id configurable fields. The previously hardcoded values are now the defaults.

Update the TrustyAI config to use TRUSTYAI_EMBEDDING_MODEL instead of EMBEDDING_MODEL.

Summary by CodeRabbit

Chores
- Renamed the environment variable selecting the TrustyAI embedding model for clarity (now TRUSTYAI_EMBEDDING_MODEL).
- Made embedding configuration environment-driven: embedding dimension, default model ID, and provider model ID now configurable via environment variables.
- Minor configuration formatting adjustments to align related settings.
Documentation
- Updated distribution README to reflect the new environment variable name and environment-driven defaults.

nathan-weinberg · 2025-12-09T18:51:23Z

holding for signoff from RAG and TrustyAI teams

coderabbitai · 2025-12-09T18:51:29Z

Walkthrough

Environment variable references updated: EMBEDDING_MODEL → TRUSTYAI_EMBEDDING_MODEL in TrustyAI eval entries; several embedding-related fields in distribution/run.yaml were parameterized to use environment variables with defaults (embedding dimension, model_id, provider_model_id).

Changes

| Cohort / File(s) | Summary |
|---|---|
| Docs & README update: `distribution/README.md` | Replaced `EMBEDDING_MODEL` with `TRUSTYAI_EMBEDDING_MODEL` for the inline::trustyai_ragas entry. |
| Runtime configuration updates: `distribution/run.yaml` | Switched eval/provider embedding model references from `${env.EMBEDDING_MODEL}` to `${env.TRUSTYAI_EMBEDDING_MODEL}` (affecting `trustyai_lmeval` and `trustyai_ragas` inline & remote). Parameterized previously hard-coded model metadata to env vars with defaults: `metadata.embedding_dimension` → `${env.EMBEDDING_DIMENSION:=768}`, `model_id` → `${env.EMBEDDING_MODEL:=granite-embedding-125m-english}`, `provider_model_id` → `${env.EMBEDDING_PROVIDER_MODEL_ID:=ibm-granite/granite-embedding-125m-english}`. Minor indentation alignment in `trustyai_lmeval` block. |
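To make the `${env.NAME:=default}` behavior described above concrete, here is a minimal Python sketch of how the three parameterized fields would resolve with and without overrides. It is an illustration only, not Llama Stack's actual loader: `resolve_defaults`, `resolve_fields`, and `FIELDS` are hypothetical names, and treating a set-but-empty variable the same as an unset one is an assumption.

```python
import re

# Matches ${env.NAME:=default}; the default part may be empty.
_ENV_DEFAULT = re.compile(r"\$\{env\.([A-Za-z_][A-Za-z0-9_]*):=([^}]*)\}")


def resolve_defaults(text, env):
    """Replace each ${env.NAME:=default} with env[NAME] when it is set and
    non-empty, otherwise with the default (assumed semantics)."""
    def _sub(match):
        name, default = match.group(1), match.group(2)
        value = env.get(name)
        return value if value else default

    return _ENV_DEFAULT.sub(_sub, text)


# Templates copied from the run.yaml changes summarized in this PR.
FIELDS = {
    "embedding_dimension": "${env.EMBEDDING_DIMENSION:=768}",
    "model_id": "${env.EMBEDDING_MODEL:=granite-embedding-125m-english}",
    "provider_model_id": (
        "${env.EMBEDDING_PROVIDER_MODEL_ID:="
        "ibm-granite/granite-embedding-125m-english}"
    ),
}


def resolve_fields(env):
    """Resolve every template against a mapping of environment variables.

    Values stay strings here; a real loader may coerce types."""
    return {key: resolve_defaults(tpl, env) for key, tpl in FIELDS.items()}
```

Calling `resolve_fields({})` yields the previously hardcoded values, while passing a mapping such as `{"EMBEDDING_MODEL": "my-model"}` overrides only `model_id` and leaves the other two fields at their defaults.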

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

- Verify all TrustyAI eval blocks (trustyai_lmeval, trustyai_ragas inline & remote) consistently reference TRUSTYAI_EMBEDDING_MODEL.
- Confirm env var defaults (EMBEDDING_DIMENSION:=768, EMBEDDING_MODEL:=granite-embedding-125m-english, EMBEDDING_PROVIDER_MODEL_ID:=ibm-granite/...) match deployment requirements.
- Check README and run.yaml remain synchronized and indentation changes preserve YAML semantics.
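The first check in the list above could be partly automated. The sketch below scans YAML text line by line and flags TrustyAI-related lines that still reference the old `EMBEDDING_MODEL` variable. It is a rough heuristic under stated assumptions: `find_stale_trustyai_refs` is a hypothetical helper, and "TrustyAI-related" is approximated as any line that mentions `trustyai` or sets an `embedding_model:` key, which may not cover every layout of run.yaml.

```python
import re

# The old variable, followed by ':' or '}' so that
# ${env.TRUSTYAI_EMBEDDING_MODEL...} never matches.
_OLD_REF = re.compile(r"\$\{env\.EMBEDDING_MODEL[:}]")


def find_stale_trustyai_refs(yaml_text):
    """Return (line_number, stripped_line) pairs for TrustyAI-related lines
    that still use ${env.EMBEDDING_MODEL...}.

    The generic model_id entry legitimately keeps EMBEDDING_MODEL, so only
    lines that mention 'trustyai' or set 'embedding_model:' are inspected."""
    stale = []
    for number, line in enumerate(yaml_text.splitlines(), start=1):
        stripped = line.strip()
        is_trustyai_line = (
            "trustyai" in stripped or stripped.startswith("embedding_model:")
        )
        if is_trustyai_line and _OLD_REF.search(stripped):
            stale.append((number, stripped))
    return stale
```

An empty result for the contents of `distribution/run.yaml` would suggest the rename is complete, although a text scan like this is no substitute for reading the provider blocks.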

Poem

🐰 I nibbled strings in config light,
Swapped a name and set defaults right,
TRUSTYAI now finds its tune,
Embeddings hum beneath the moon. ✨

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The PR title 'feat: allow for more customization around embedding model' accurately summarizes the main change: making embedding-related configurations (dimension, model_id, provider_model_id) configurable via environment variables instead of hardcoded values. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |

📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between d2ac45e and 7b5e788.

📒 Files selected for processing (2)

distribution/README.md (1 hunks)
distribution/run.yaml (2 hunks)

🚧 Files skipped from review as they are similar to previous changes (1)

distribution/README.md

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)

GitHub Check: build-test-push (linux/amd64)
GitHub Check: Summary

🔇 Additional comments (3)

distribution/run.yaml (3)

125-126: ✓ Indentation formatting corrected.

The trustyai_lmeval configuration fields are now properly indented under the config: block, improving YAML structure consistency.

127-127: Verify embedding model environment variable renaming doesn't break external documentation or tooling.

The PR updates TrustyAI embedding references from EMBEDDING_MODEL to TRUSTYAI_EMBEDDING_MODEL. Lines 127 (provider activation), 131, and 136 (embedding_model references) now use the new variable name. Confirm that:

All documentation (README, deployment guides, examples) has been updated to reflect TRUSTYAI_EMBEDDING_MODEL

Any automation, tests, or CI/CD pipelines using EMBEDDING_MODEL in TrustyAI contexts have been updated accordingly

The conditional activation syntax on line 127 (using ${env.TRUSTYAI_EMBEDDING_MODEL:+trustyai_ragas_inline}) correctly enables the provider only when the variable is set.

Also applies to: 131-131, 136-136
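For readers unfamiliar with the `:+` form mentioned in this comment, the following Python sketch mimics the conditional expansion: the placeholder becomes the given value only when the variable is set and non-empty, and an empty string otherwise. This is a simplified stand-in, not the Llama Stack implementation. `resolve_conditional` and `PROVIDER_ID` are hypothetical names, and the idea that an empty provider ID leads to the provider being skipped is an assumption about the consumer of this value.

```python
import re

# Matches ${env.NAME:+value_if_set}.
_ENV_CONDITIONAL = re.compile(r"\$\{env\.([A-Za-z_][A-Za-z0-9_]*):\+([^}]*)\}")


def resolve_conditional(text, env):
    """Expand ${env.NAME:+value} to 'value' when NAME is set and non-empty
    in env, and to an empty string otherwise (assumed semantics)."""
    def _sub(match):
        name, value_if_set = match.group(1), match.group(2)
        return value_if_set if env.get(name) else ""

    return _ENV_CONDITIONAL.sub(_sub, text)


# Template quoted in the review comment for line 127 of run.yaml.
PROVIDER_ID = "${env.TRUSTYAI_EMBEDDING_MODEL:+trustyai_ragas_inline}"
```

With `TRUSTYAI_EMBEDDING_MODEL` present in the mapping, `resolve_conditional(PROVIDER_ID, env)` produces `trustyai_ragas_inline`. With it absent, the result is an empty string, which is what the review expects to leave the provider disabled.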

273-276: ✓ Embedding model parameters successfully parameterized.

The embedding configuration now supports environment variable overrides for:

embedding_dimension (default: 768)

model_id (default: granite-embedding-125m-english)

provider_model_id (default: ibm-granite/granite-embedding-125m-english)

All defaults preserve previous hardcoded behavior, and the full model name in the defaults correctly reflects the complete HuggingFace model identifier per previous review discussion. Environment variable syntax with the env. prefix is consistent with the rest of the file.

coderabbitai

Actionable comments posted: 1

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 9d5e171 and 8dbcb31.

📒 Files selected for processing (2)

distribution/README.md (1 hunks)
distribution/run.yaml (2 hunks)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)

GitHub Check: build-test-push (linux/amd64)
GitHub Check: Summary

🔇 Additional comments (3)

distribution/README.md (1)

17-17: Documentation update correctly reflects the environment variable rename.

The change aligns with the run.yaml updates that use TRUSTYAI_EMBEDDING_MODEL for TrustyAI eval configurations.

distribution/run.yaml (2)

133-145: Verify eval provider configuration changes.

Lines 133-134 appear to be new fields added to the trustyai_lmeval provider config. Please confirm:

Are use_k8s and base_url new fields required by the updated provider version?

Are the environment variable defaults correct?

Additionally, the changes to lines 135 and 144 correctly switch from EMBEDDING_MODEL to TRUSTYAI_EMBEDDING_MODEL, aligning with the PR objective and README update.

281-285: Verify embedding dimension and model_id defaults are appropriate.

The externalization of embedding model metadata (lines 281-284) looks good and aligns with the PR objective. The defaults appear sensible:

EMBEDDING_DIMENSION:=768 (standard dimension)

EMBEDDING_MODEL:=granite-embedding-125m (matches previous hardcoded value)

EMBEDDING_PROVIDER_MODEL_ID:=ibm-granite/granite-embedding-125m-english (matches previous hardcoded value)

However, please confirm that these defaults match the previously hardcoded values and are appropriate for the distribution's use case.

skamenan7

LGTM, except for what CodeRabbit pointed out in "Comment on lines R281 to R284".

jgarciao

LGTM. Let's wait until Francisco and/or Bill answer the question about the embedding model name.

ruivieira · 2025-12-10T10:39:26Z

/lgtm

Elbehery · 2025-12-10T12:41:13Z

LGTM

I don't wanna approve till the question above has been resolved, otherwise the bot will merge it 👍🏽

nathan-weinberg · 2025-12-10T12:43:52Z

> LGTM
>
> I don't wanna approve till the question above has been resolved, otherwise the bot will merge it 👍🏽

Bot will not merge since I have the do-not-merge label set, so you can safely approve 😄

make embedding dimension, model_id, and provider_model_id configurable fields. prev hardcoded values are now defaults.

update TrustyAI config to use TRUSTYAI_EMBEDDING_MODEL instead of EMBEDDING_MODEL

Signed-off-by: Nathan Weinberg <nweinber@redhat.com>

leseb · 2025-12-10T16:02:12Z

distribution/run.yaml

    module: llama_stack_provider_ragas.remote
    config:
-      embedding_model: ${env.EMBEDDING_MODEL:=}
+      embedding_model: ${env.TRUSTYAI_EMBEDDING_MODEL:=}


How is this model going to be served? I don't see it being optionally registered down below. Is that expected?

This was not happening before, so AFAIK yes, this is expected.

cdoern · 2025-12-10T16:03:41Z

holding for the comments.

nathan-weinberg requested a review from a team December 9, 2025 18:51

nathan-weinberg requested a review from kelbrown20 as a code owner December 9, 2025 18:51

nathan-weinberg added the do-not-merge label ("Apply to PRs that should not be merged (yet)") Dec 9, 2025

coderabbitai bot reviewed Dec 9, 2025

skamenan7 approved these changes Dec 9, 2025

nathan-weinberg force-pushed the custom-embed branch from 8dbcb31 to 2c1eb85 Compare December 9, 2025 20:06

jgarciao approved these changes Dec 10, 2025

nathan-weinberg force-pushed the custom-embed branch from 2c1eb85 to d2ac45e Compare December 10, 2025 14:58

nathan-weinberg force-pushed the custom-embed branch from d2ac45e to 7b5e788 Compare December 10, 2025 15:56

nathan-weinberg removed the do-not-merge label ("Apply to PRs that should not be merged (yet)") Dec 10, 2025

leseb reviewed Dec 10, 2025

cdoern approved these changes Dec 10, 2025

derekhiggins reviewed Dec 10, 2025

cdoern added the do-not-merge label ("Apply to PRs that should not be merged (yet)") Dec 10, 2025

derekhiggins approved these changes Dec 10, 2025

alinaryan approved these changes Dec 10, 2025

nathan-weinberg removed the do-not-merge label ("Apply to PRs that should not be merged (yet)") Dec 11, 2025

mergify bot merged commit 26f30f9 into opendatahub-io:main Dec 11, 2025
6 checks passed

nathan-weinberg deleted the custom-embed branch December 11, 2025 15:57
