[LEADS-443] OPENAI_API_KEY environmental variable failure by bsatapat-jpg · Pull Request #260 · lightspeed-core/lightspeed-evaluation

bsatapat-jpg · 2026-06-24T08:43:45Z

Description

Type of change

Tools used to create PR

Identify any AI code assistants used in this PR (for transparency and review context)

Assisted-by: Claude-4.6-high
Generated by: Cursor

Related Tickets & Documents

Related Issue # https://redhat.atlassian.net/browse/LEADS-443
Closes #

Checklist before requesting a review

I have performed a self-review of my code.
PR has passed all pre-merge test jobs.
If it is a core feature, I have added thorough tests.

Testing

Please provide detailed steps to perform tests related to this code change.
How were the fix/results from this change verified? Please provide relevant screenshots or results.

Summary by CodeRabbit

New Features
- Added an embedding provider readiness check via ensure_ready() and improved initialization readiness behavior.
- Made Ragas embedding/metric dependencies load lazily and cache on first use.
Bug Fixes
- Validation is deferred until readiness is actually required, preventing premature startup failures.
- Expanded metric evaluation error handling to gracefully handle embedding/evaluation failures.
Improvements
- Switched readiness/runtime messaging to structured logging.
Tests
- Added/extended unit tests for idempotent readiness, lazy instantiation, concurrent access safety, and evaluation failure/unsupported metric responses.

coderabbitai · 2026-06-24T08:43:57Z

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 40e17488-960e-49b2-8d7e-1878664c065a

📥 Commits

Reviewing files that changed from the base of the PR and between f589513 and 69215ed.

📒 Files selected for processing (6)

src/lightspeed_evaluation/core/embedding/manager.py
src/lightspeed_evaluation/core/embedding/ragas.py
src/lightspeed_evaluation/core/metrics/ragas.py
tests/unit/core/embedding/test_manager.py
tests/unit/core/metrics/conftest.py
tests/unit/core/metrics/test_ragas.py

🚧 Files skipped from review as they are similar to previous changes (3)

src/lightspeed_evaluation/core/embedding/ragas.py
src/lightspeed_evaluation/core/metrics/ragas.py
src/lightspeed_evaluation/core/embedding/manager.py

Walkthrough

EmbeddingManager now defers provider validation to ensure_ready(). RagasEmbeddingManager calls that check during initialization. RagasMetrics lazily constructs and caches its embedding manager and expands evaluate() exception handling.
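The deferred-validation pattern described in the walkthrough can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the `EmbeddingConfig` fields, the `ConfigurationError` class, and the rule that the `openai` provider requires `OPENAI_API_KEY` are assumptions made for the example. Only the names `EmbeddingManager`, `ensure_ready()`, `_validated`, and `_validate_config` come from the review text.

```python
import logging
import os
from dataclasses import dataclass

logger = logging.getLogger(__name__)


class ConfigurationError(Exception):
    """Hypothetical stand-in for the project's configuration error."""


@dataclass
class EmbeddingConfig:
    """Hypothetical, simplified config; the real class likely has more fields."""

    provider: str
    model: str


class EmbeddingManager:
    def __init__(self, config: EmbeddingConfig) -> None:
        # Construction is cheap and never inspects the environment.
        self.config = config
        self._validated = False

    def _validate_config(self) -> None:
        # Assumed rule, for illustration only: OpenAI needs OPENAI_API_KEY.
        if self.config.provider == "openai" and not os.environ.get("OPENAI_API_KEY"):
            raise ConfigurationError("OPENAI_API_KEY is not set")

    def ensure_ready(self) -> None:
        """Validate on first call; later calls return immediately."""
        if self._validated:
            return
        self._validate_config()
        # The flag is set only after validation succeeds, so a failed
        # check is retried on the next call.
        self._validated = True
        logger.info(
            "Embedding manager ready: provider=%s model=%s",
            self.config.provider,
            self.config.model,
        )
```

With this shape, constructing the manager without credentials succeeds, and the error surfaces only when a caller that actually needs embeddings invokes `ensure_ready()`.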

Changes

Lazy Embedding Validation

| Layer / File(s) | Summary |
| --- | --- |
| EmbeddingManager readiness check `src/lightspeed_evaluation/core/embedding/manager.py` | Adds module-level logging and changes `EmbeddingManager` to track `_validated` and validate only through `ensure_ready()`, which logs readiness instead of printing. |
| Ragas initialization and metrics caching `src/lightspeed_evaluation/core/embedding/ragas.py`, `src/lightspeed_evaluation/core/metrics/ragas.py` | `RagasEmbeddingManager.__init__` calls `embedding_manager.ensure_ready()` and documents raised errors. `RagasMetrics` stores the embedding manager privately, lazily creates and caches `RagasEmbeddingManager`, and catches additional evaluation exceptions. |
| Readiness and evaluation tests `tests/unit/core/embedding/test_manager.py`, `tests/unit/core/metrics/conftest.py`, `tests/unit/core/metrics/test_ragas.py` | Adds unit tests for `ensure_ready()` idempotency and failure cases, plus lazy property caching, fixture setup, and supported exception handling in `RagasMetrics.evaluate()`. |

Sequence Diagram(s)

sequenceDiagram
  participant Caller
  participant RagasMetrics
  participant RagasEmbeddingManager
  participant EmbeddingManager
  Caller->>RagasMetrics: __init__(embedding_manager)
  RagasMetrics->>RagasMetrics: store _embedding_manager, set _ragas_embedding_manager=None
  Caller->>RagasMetrics: access embedding_manager
  RagasMetrics->>RagasEmbeddingManager: construct from _embedding_manager
  RagasEmbeddingManager->>EmbeddingManager: ensure_ready()
  RagasMetrics->>RagasMetrics: cache RagasEmbeddingManager
  Caller->>RagasMetrics: evaluate(scope)
  RagasMetrics->>RagasMetrics: catch EvaluationError / EmbeddingError and return failure tuple
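The flow in the diagram can be approximated in a few lines of Python. The sketch below is an assumption-laden illustration: `StubEmbeddingManager`, the local `EmbeddingError` class, the placeholder score, and the exact failure message are invented for the example, and locking is left out to keep the lazy-caching step visible.

```python
from typing import Optional


class EmbeddingError(Exception):
    """Hypothetical stand-in for the project's embedding error."""


class StubEmbeddingManager:
    """Hypothetical base manager whose readiness check can be made to fail."""

    def __init__(self, ok: bool = True) -> None:
        self.ok = ok
        self.checks = 0

    def ensure_ready(self) -> None:
        self.checks += 1
        if not self.ok:
            raise EmbeddingError("provider credentials missing")


class RagasEmbeddingManager:
    def __init__(self, embedding_manager: StubEmbeddingManager) -> None:
        # Validation runs here, i.e. on first use rather than at startup.
        embedding_manager.ensure_ready()
        self.base = embedding_manager


class RagasMetrics:
    def __init__(self, embedding_manager: StubEmbeddingManager) -> None:
        self._embedding_manager = embedding_manager
        self._ragas_embedding_manager: Optional[RagasEmbeddingManager] = None

    @property
    def embedding_manager(self) -> RagasEmbeddingManager:
        # Build the wrapper on first access, then reuse the cached instance.
        if self._ragas_embedding_manager is None:
            self._ragas_embedding_manager = RagasEmbeddingManager(
                self._embedding_manager
            )
        return self._ragas_embedding_manager

    def evaluate(self, metric_name: str) -> tuple[Optional[float], str]:
        try:
            _ = self.embedding_manager  # may raise on first access
            return 1.0, f"{metric_name} computed (placeholder score)"
        except EmbeddingError as exc:
            # Report a failure tuple instead of aborting the whole run.
            return None, f"Ragas {metric_name} evaluation failed: {exc}"
```

A metrics object built on a misconfigured manager then yields `(None, "...failed...")` from `evaluate()` rather than raising at construction time.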

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

lightspeed-core/lightspeed-evaluation#239: Also changes src/lightspeed_evaluation/core/embedding/ragas.py and the RagasEmbeddingManager initialization path.

Suggested reviewers

VladimirKadlec
asamal4

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title points to the env-var validation failure that is part of the embedding readiness changes, though it is narrower than the full PR. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

coderabbitai

Actionable comments posted: 2

🧹 Nitpick comments (1)

src/lightspeed_evaluation/core/embedding/manager.py (1)
46-47: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick win

Add the constructor return type and Google-style Args section.

__init__ is part of the public API surface here, so it should include -> None and document config in Google style.
Proposed refactor
-    def __init__(self, config: EmbeddingConfig):
-        """Initialize with config. Validation is deferred until ensure_ready() is called."""
+    def __init__(self, config: EmbeddingConfig) -> None:
+        """Initialize with config.
+
+        Args:
+            config: Embedding configuration. Validation is deferred until
+                ensure_ready() is called.
+        """
As per coding guidelines, src/**/*.py public functions and methods need type hints and Google-style docstrings.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/lightspeed_evaluation/core/embedding/manager.py` around lines 46 - 47,
The public constructor in EmbeddingManager.__init__ is missing the explicit
return annotation and Google-style parameter docs. Update __init__ to declare ->
None and expand its docstring with a Google-style Args section documenting
config, keeping the existing note about deferred validation.
Source: Coding guidelines

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/lightspeed_evaluation/core/embedding/manager.py`:
- Around line 64-68: The Embedding Manager startup log is exposing raw
provider_kwargs, which may contain secrets. Update the logging in the Embedding
Manager initialization/ready message to avoid printing the full provider_kwargs
dict; keep the provider and model, and either omit provider_kwargs entirely or
replace it with a safe summary (for example, only non-sensitive keys or a
count). Use the logger.info call in the Embedding Manager class to make this
change.

In `@src/lightspeed_evaluation/core/metrics/ragas.py`:
- Around line 97-100: The lazy initialization in evaluate via
RagasEmbeddingManager(self._embedding_manager) can now raise provider validation
errors that bypass the existing failure handling. Update evaluate in
RagasMetrics to catch the project’s embedding/configuration exceptions during
this manager creation path, using EmbeddingError and ConfigurationError (or the
shared exception base if available), so missing credentials return the existing
(None, "...failed...") result instead of aborting evaluation.
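One way to address the first inline comment above is to log only the key names of `provider_kwargs` and never the values. The helper below is a hypothetical sketch: `summarize_kwargs` and `log_ready` are not names from the PR, and whether key names alone are acceptable to log is a project decision.

```python
import logging
from typing import Any, Mapping, Optional

logger = logging.getLogger(__name__)


def summarize_kwargs(provider_kwargs: Optional[Mapping[str, Any]]) -> str:
    """Return sorted key names only, never values (hypothetical helper)."""
    if not provider_kwargs:
        return "none"
    return ", ".join(sorted(provider_kwargs))


def log_ready(
    provider: str,
    model: str,
    provider_kwargs: Optional[Mapping[str, Any]] = None,
) -> None:
    # Lazy %-style formatting: the message is built only if INFO is enabled.
    logger.info(
        "Embedding manager ready: provider=%s model=%s provider_kwargs keys=[%s]",
        provider,
        model,
        summarize_kwargs(provider_kwargs),
    )
```

A stricter variant could log just `len(provider_kwargs)`, which matches the "count" option the review mentions.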

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 2708eea4-352b-4fc3-9188-f453377c938b

📥 Commits

Reviewing files that changed from the base of the PR and between a625cab and 62df1f7.

📒 Files selected for processing (3)

src/lightspeed_evaluation/core/embedding/manager.py
src/lightspeed_evaluation/core/embedding/ragas.py
src/lightspeed_evaluation/core/metrics/ragas.py

bsatapat-jpg · 2026-06-24T09:00:33Z

@coderabbitai review

xmican10

Thanks! I have two things:

RagasMetrics is shared across threads, so the lazy embedding_manager property and ensure_ready() can race on first access. The codebase already uses litellm_state_lock for the same pattern — should we add a lock here too?
The new lazy validation and exception handling don't have test coverage yet. I think we should add tests for ensure_ready() idempotency/error cases, the lazy property behavior, and the new exception types in evaluate(), wdyt?

bsatapat-jpg · 2026-06-24T10:06:58Z

Thanks for reviewing it Eva,

RagasMetrics is shared across threads via the ThreadPoolExecutor. However, the race here is harmless. At worst, two threads create a RagasEmbeddingManager and one gets discarded. The underlying ensure_ready() is idempotent, and CPython's GIL makes the assignment atomic, so no corruption can happen.
The litellm_state_lock solves a different problem (process-global litellm.cache mutation), whereas this is a simple lazy init with no side effects. A lock here would add contention without a real benefit.
Agreed. I will be adding the test cases.

xmican10 · 2026-06-24T10:18:06Z

Thanks for clearing that up. Does this also apply to HuggingFace embedding models, which are downloaded and run on the local machine?

coderabbitai

Actionable comments posted: 4

🧹 Nitpick comments (1)

tests/unit/core/metrics/test_ragas.py (1)
98-130: 🎯 Functional Correctness | 🔵 Trivial | ⚡ Quick win

Add tests for BrokenPipeError and OSError evaluate branches.

evaluate() has dedicated messaging for these branches, but the parametrized matrix here doesn’t cover them, so those paths can regress silently.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/unit/core/metrics/test_ragas.py` around lines 98 - 130, The exception
matrix in test_catches_exception_gracefully is missing coverage for the
dedicated BrokenPipeError and OSError branches in RagasMetrics.evaluate. Extend
the parametrized cases in test_catches_exception_gracefully to include
BrokenPipeError and OSError with representative messages, and keep asserting
that evaluate() returns None plus an error reason containing both the specific
message and the generic “evaluation failed” text.
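The extra cases requested above might look like the following sketch. The `evaluate` function is a hypothetical stand-in for `RagasMetrics.evaluate()`, the error messages are invented, and a plain case list replaces the real parametrized matrix so the example runs without pytest.

```python
from typing import Callable, Optional


def evaluate(run_metric: Callable[[], float]) -> tuple[Optional[float], str]:
    """Hypothetical stand-in for RagasMetrics.evaluate()."""
    try:
        return run_metric(), "ok"
    except BrokenPipeError as exc:
        # BrokenPipeError subclasses OSError, so it must be handled first.
        return None, f"Ragas evaluation failed due to broken pipe: {exc}"
    except OSError as exc:
        return None, f"Ragas evaluation failed due to OS error: {exc}"


# Each case pairs an exception with the message expected in the reason.
CASES = [
    (BrokenPipeError("pipe closed"), "pipe closed"),
    (OSError("disk unavailable"), "disk unavailable"),
]


def check_case(exc: Exception, expected: str) -> None:
    def boom() -> float:
        raise exc

    score, reason = evaluate(boom)
    assert score is None
    assert expected in reason
    assert "evaluation failed" in reason
```

In a pytest suite, `CASES` would typically be passed to `@pytest.mark.parametrize("exc, expected", CASES)` on a test function with the body of `check_case`.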

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/lightspeed_evaluation/core/metrics/ragas.py`:
- Around line 77-79: Update the RagasMetrics.__init__ Google-style docstring to
reflect that embedding validation is deferred until lazy initialization rather
than completed up front. In the argument/contract text for embedding_manager,
remove language implying it is already validated and instead describe that it is
stored for later use by RagasEmbeddingManager and checked when the embedding
manager is first accessed. Use the RagasMetrics.__init__ and
_ragas_embedding_manager symbols to locate the docs and keep the wording aligned
with the new lazy behavior.

In `@tests/unit/core/embedding/test_manager.py`:
- Around line 19-31: The idempotency test for EmbeddingManager.ensure_ready() is
only patching _validate_config after the first two calls, so it can miss a
validation happening on the second call. Update test_second_call_is_noop in
test_manager.py to patch manager._validate_config before the first
ensure_ready() call, then invoke ensure_ready() twice and assert the validation
mock was called exactly once. Use the existing EmbeddingManager and
_validate_config symbols to keep the test focused on the idempotency behavior.
- Line 1: Remove the pylint suppression in test_manager and update the test to
validate the public behavior of the embedding manager instead of reading
protected state directly. In the test cases that currently access the protected
member, assert the observable retry outcome through the relevant manager methods
and retry-related side effects so the same contract is covered without
protected-access usage.
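For the idempotency item above, the key point is to install the mock before the first `ensure_ready()` call so both calls are observed. The sketch below uses a hypothetical, stripped-down `EmbeddingManager` rather than the project's class.

```python
from unittest.mock import patch


class EmbeddingManager:
    """Hypothetical minimal stand-in for the real manager."""

    def __init__(self) -> None:
        self._validated = False

    def _validate_config(self) -> None:
        pass  # real validation omitted

    def ensure_ready(self) -> None:
        if self._validated:
            return
        self._validate_config()
        self._validated = True


def test_second_call_is_noop() -> None:
    manager = EmbeddingManager()
    # Patch before the first call so every validation attempt is counted.
    with patch.object(manager, "_validate_config") as mock_validate:
        manager.ensure_ready()
        manager.ensure_ready()
    # Exactly one validation proves the second call was a no-op.
    mock_validate.assert_called_once()
```

If the mock were installed only after the first call, a buggy second validation on an already-validated manager could still be caught, but a test that patches after both calls would see nothing at all.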

In `@tests/unit/core/metrics/test_ragas.py`:
- Line 1: The test module currently uses a file-level pylint suppression to mask
redefined-outer-name, too-many-arguments, and too-many-positional-arguments
warnings. Remove the global disable from the test file and refactor the affected
fixtures/test functions in test_ragas.py so their signatures no longer trigger
those lint rules, using the specific pytest fixture and test names in that file
to locate and update the problematic patterns.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 2ac44954-98f5-4d45-9e3e-2d21601792a3

📥 Commits

Reviewing files that changed from the base of the PR and between 62df1f7 and f589513.

📒 Files selected for processing (5)

src/lightspeed_evaluation/core/embedding/manager.py
src/lightspeed_evaluation/core/embedding/ragas.py
src/lightspeed_evaluation/core/metrics/ragas.py
tests/unit/core/embedding/test_manager.py
tests/unit/core/metrics/test_ragas.py

🚧 Files skipped from review as they are similar to previous changes (2)

src/lightspeed_evaluation/core/embedding/ragas.py
src/lightspeed_evaluation/core/embedding/manager.py

bsatapat-jpg · 2026-06-24T10:55:20Z

Yes, the same argument applies. There are no issues for any provider, including HuggingFace local models.

asamal4 · 2026-06-24T23:48:46Z

@bsatapat-jpg The impact is not data corruption but memory usage. This is not a major concern for cloud providers, but for HuggingFace it is going to be a problem because we load the embedding model. If we use more threads, memory usage will spike.
In the short term it is harmless (this is not a recommended metric, and we mostly use cloud providers), but the problem is real.
Also, to fix the problem quickly we are turning this into a runtime check, which violates our early-validation use case. If the embedding model is not set up correctly, the run may fail only after a few metrics have been calculated, depending on how the conversation setup is done.

I see two options as a follow-up:

- Gather all metrics info during load, which gives us control over which config/framework imports are done.
- Add a lock.

cc: @xmican10

bsatapat-jpg · 2026-06-25T06:31:44Z

I will add a lock to the lazy embedding_manager property (quick fix for thread-safety)

bsatapat-jpg · 2026-06-25T06:41:38Z

Added a lock to address the thread-safety concern for HuggingFace local models. The "gather metrics info at load time" approach is tracked as a follow-up.

VladimirKadlec · 2026-06-25T06:41:56Z

@bsatapat-jpg wrote:

I will add a lock to the lazy embedding_manager property (quick fix for thread-safety)

Yes, IMO the best solution for now. You really don't want to have any race(s) in the code.

@asamal4 wrote:

If we use more threads, then the memory usage will spike.

Not true for this particular scenario, at least on Linux. Linux uses copy-on-write for memory allocation, and the embedding models are typically read-only. However, I definitely agree that the problem is real and has to be solved.

bsatapat-jpg · 2026-06-25T06:47:54Z

Thanks Kada... Added a threading.Lock with double-checked locking on the embedding_manager property.
PTAL
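Double-checked locking on a lazy property generally takes the shape below. This is a sketch under assumptions, not the merged code: `ExpensiveManager` stands in for `RagasEmbeddingManager`, and `_init_lock` and `build_concurrently` are hypothetical names.

```python
import threading
from concurrent.futures import ThreadPoolExecutor
from typing import Optional


class ExpensiveManager:
    """Hypothetical stand-in for RagasEmbeddingManager; counts constructions."""

    instances = 0
    _count_lock = threading.Lock()

    def __init__(self) -> None:
        with ExpensiveManager._count_lock:
            ExpensiveManager.instances += 1


class RagasMetrics:
    def __init__(self) -> None:
        self._ragas_embedding_manager: Optional[ExpensiveManager] = None
        self._init_lock = threading.Lock()

    @property
    def embedding_manager(self) -> ExpensiveManager:
        # First check without the lock: the common path after initialization.
        if self._ragas_embedding_manager is None:
            with self._init_lock:
                # Second check: another thread may have built it while
                # this one was waiting for the lock.
                if self._ragas_embedding_manager is None:
                    self._ragas_embedding_manager = ExpensiveManager()
        return self._ragas_embedding_manager


def build_concurrently(workers: int = 8) -> int:
    """Access the property from many threads; return the construction count."""
    ExpensiveManager.instances = 0
    metrics = RagasMetrics()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        managers = list(
            pool.map(lambda _: metrics.embedding_manager, range(workers * 4))
        )
    assert all(m is managers[0] for m in managers)
    return ExpensiveManager.instances
```

The lock is contended only during first initialization. Once the cached instance exists, every access takes the unlocked fast path, which addresses the earlier concern about adding contention.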

VladimirKadlec

LGTM, thank you.

coderabbitai Bot reviewed Jun 24, 2026

Comment thread src/lightspeed_evaluation/core/embedding/manager.py Outdated

Comment thread src/lightspeed_evaluation/core/metrics/ragas.py Outdated

bsatapat-jpg force-pushed the dev branch from 62df1f7 to 33ade07 Compare June 24, 2026 08:55

xmican10 reviewed Jun 24, 2026

bsatapat-jpg force-pushed the dev branch from 33ade07 to f589513 Compare June 24, 2026 10:20

coderabbitai Bot reviewed Jun 24, 2026

Comment thread src/lightspeed_evaluation/core/metrics/ragas.py

Comment thread tests/unit/core/embedding/test_manager.py Outdated

Comment thread tests/unit/core/embedding/test_manager.py Outdated

Comment thread tests/unit/core/metrics/test_ragas.py Outdated

bsatapat-jpg force-pushed the dev branch from f589513 to d941a5a Compare June 24, 2026 10:42

[LEADS-443] OPENAI_API_KEY environmental variable failure

69215ed

bsatapat-jpg force-pushed the dev branch from d941a5a to 69215ed Compare June 25, 2026 06:35

bsatapat-jpg requested review from VladimirKadlec and asamal4 June 25, 2026 06:46

VladimirKadlec approved these changes Jun 25, 2026

asamal4 merged commit 30f5c59 into lightspeed-core:main Jun 25, 2026
17 checks passed

bsatapat-jpg deleted the dev branch June 26, 2026 12:38
