[LEADS-443] OPENAI_API_KEY environmental variable failure #260

Merged
asamal4 merged 1 commit into lightspeed-core:main from bsatapat-jpg:dev
Jun 25, 2026
Conversation

@bsatapat-jpg

@bsatapat-jpg bsatapat-jpg commented Jun 24, 2026

Collaborator

Description

Type of change

  • Refactor
  • New feature
  • Bug fix
  • CVE fix
  • Optimization
  • Documentation Update
  • Configuration Update
  • Bump-up service version
  • Bump-up dependent library
  • Bump-up library or tool used for development (does not change the final image)
  • CI configuration change
  • Unit tests improvement

Tools used to create PR

Identify any AI code assistants used in this PR (for transparency and review context)

  • Assisted-by: Claude-4.6-high
  • Generated by: Cursor

Related Tickets & Documents

Checklist before requesting a review

  • I have performed a self-review of my code.
  • PR has passed all pre-merge test jobs.
  • If it is a core feature, I have added thorough tests.

Testing

  • Please provide detailed steps to perform tests related to this code change.
  • How were the fix/results from this change verified? Please provide relevant screenshots or results.

Summary by CodeRabbit

  • New Features
    • Added an embedding provider readiness check via ensure_ready() and improved initialization readiness behavior.
    • Made Ragas embedding/metric dependencies load lazily and cache on first use.
  • Bug Fixes
    • Validation is deferred until readiness is actually required, preventing premature startup failures.
    • Expanded metric evaluation error handling to gracefully handle embedding/evaluation failures.
  • Improvements
    • Switched readiness/runtime messaging to structured logging.
  • Tests
    • Added/extended unit tests for idempotent readiness, lazy instantiation, concurrent access safety, and evaluation failure/unsupported metric responses.

@coderabbitai

coderabbitai Bot commented Jun 24, 2026

Contributor

Review Change Stack

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 40e17488-960e-49b2-8d7e-1878664c065a

📥 Commits

Reviewing files that changed from the base of the PR and between f589513 and 69215ed.

📒 Files selected for processing (6)
  • src/lightspeed_evaluation/core/embedding/manager.py
  • src/lightspeed_evaluation/core/embedding/ragas.py
  • src/lightspeed_evaluation/core/metrics/ragas.py
  • tests/unit/core/embedding/test_manager.py
  • tests/unit/core/metrics/conftest.py
  • tests/unit/core/metrics/test_ragas.py
🚧 Files skipped from review as they are similar to previous changes (3)
  • src/lightspeed_evaluation/core/embedding/ragas.py
  • src/lightspeed_evaluation/core/metrics/ragas.py
  • src/lightspeed_evaluation/core/embedding/manager.py

Walkthrough

EmbeddingManager now defers provider validation to ensure_ready(). RagasEmbeddingManager calls that check during initialization. RagasMetrics lazily constructs and caches its embedding manager and expands evaluate() exception handling.
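The deferred-validation pattern described in the walkthrough can be sketched roughly as follows. This is a minimal illustration, not the project's code: `EmbeddingConfig` and `EmbeddingError` are trimmed-down stand-ins, and the rule that the `openai` provider requires `OPENAI_API_KEY` is an assumption inferred from the PR title.

```python
import logging
import os
from dataclasses import dataclass

logger = logging.getLogger(__name__)


class EmbeddingError(Exception):
    """Hypothetical error type for a misconfigured embedding provider."""


@dataclass
class EmbeddingConfig:
    """Hypothetical, trimmed-down stand-in for the real config model."""

    provider: str
    model: str


class EmbeddingManager:
    def __init__(self, config: EmbeddingConfig) -> None:
        # Only store the config; nothing is validated at construction time.
        self.config = config
        self._validated = False

    def _validate_config(self) -> None:
        # Assumed rule for illustration: the OpenAI provider needs an API key.
        if self.config.provider == "openai" and not os.environ.get("OPENAI_API_KEY"):
            raise EmbeddingError("OPENAI_API_KEY environment variable is not set")

    def ensure_ready(self) -> None:
        # Idempotent: validation runs at most once after it has succeeded.
        if self._validated:
            return
        self._validate_config()
        self._validated = True
        logger.info(
            "Embedding manager ready: provider=%s model=%s",
            self.config.provider,
            self.config.model,
        )
```

With this shape, constructing the manager without the key set does not fail; the error surfaces only when something that actually needs embeddings calls `ensure_ready()`.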

Changes

Lazy Embedding Validation

| Layer / File(s) | Summary |
| --- | --- |
| EmbeddingManager readiness check: src/lightspeed_evaluation/core/embedding/manager.py | Adds module-level logging and changes EmbeddingManager to track _validated and validate only through ensure_ready(), which logs readiness instead of printing. |
| Ragas initialization and metrics caching: src/lightspeed_evaluation/core/embedding/ragas.py, src/lightspeed_evaluation/core/metrics/ragas.py | RagasEmbeddingManager.__init__ calls embedding_manager.ensure_ready() and documents raised errors. RagasMetrics stores the embedding manager privately, lazily creates and caches RagasEmbeddingManager, and catches additional evaluation exceptions. |
| Readiness and evaluation tests: tests/unit/core/embedding/test_manager.py, tests/unit/core/metrics/conftest.py, tests/unit/core/metrics/test_ragas.py | Adds unit tests for ensure_ready() idempotency and failure cases, plus lazy property caching, fixture setup, and supported exception handling in RagasMetrics.evaluate(). |

Sequence Diagram(s)

sequenceDiagram
  participant Caller
  participant RagasMetrics
  participant RagasEmbeddingManager
  participant EmbeddingManager
  Caller->>RagasMetrics: __init__(embedding_manager)
  RagasMetrics->>RagasMetrics: store _embedding_manager, set _ragas_embedding_manager=None
  Caller->>RagasMetrics: access embedding_manager
  RagasMetrics->>RagasEmbeddingManager: construct from _embedding_manager
  RagasEmbeddingManager->>EmbeddingManager: ensure_ready()
  RagasMetrics->>RagasMetrics: cache RagasEmbeddingManager
  Caller->>RagasMetrics: evaluate(scope)
  RagasMetrics->>RagasMetrics: catch EvaluationError / EmbeddingError and return failure tuple

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

Suggested reviewers

  • VladimirKadlec
  • asamal4
🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title points to the env-var validation failure that is part of the embedding readiness changes, though it is narrower than the full PR. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

@coderabbitai coderabbitai Bot left a comment

Contributor

Actionable comments posted: 2

🧹 Nitpick comments (1)
src/lightspeed_evaluation/core/embedding/manager.py (1)

46-47: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick win

Add the constructor return type and Google-style Args section.

__init__ is part of the public API surface here, so it should include -> None and document config in Google style.

Proposed refactor
-    def __init__(self, config: EmbeddingConfig):
-        """Initialize with config. Validation is deferred until ensure_ready() is called."""
+    def __init__(self, config: EmbeddingConfig) -> None:
+        """Initialize with config.
+
+        Args:
+            config: Embedding configuration. Validation is deferred until
+                ensure_ready() is called.
+        """

As per coding guidelines, src/**/*.py public functions and methods need type hints and Google-style docstrings.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/lightspeed_evaluation/core/embedding/manager.py` around lines 46 - 47,
The public constructor in EmbeddingManager.__init__ is missing the explicit
return annotation and Google-style parameter docs. Update __init__ to declare ->
None and expand its docstring with a Google-style Args section documenting
config, keeping the existing note about deferred validation.

Source: Coding guidelines

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/lightspeed_evaluation/core/embedding/manager.py`:
- Around line 64-68: The Embedding Manager startup log is exposing raw
provider_kwargs, which may contain secrets. Update the logging in the Embedding
Manager initialization/ready message to avoid printing the full provider_kwargs
dict; keep the provider and model, and either omit provider_kwargs entirely or
replace it with a safe summary (for example, only non-sensitive keys or a
count). Use the logger.info call in the Embedding Manager class to make this
change.

In `@src/lightspeed_evaluation/core/metrics/ragas.py`:
- Around line 97-100: The lazy initialization in evaluate via
RagasEmbeddingManager(self._embedding_manager) can now raise provider validation
errors that bypass the existing failure handling. Update evaluate in
RagasMetrics to catch the project’s embedding/configuration exceptions during
this manager creation path, using EmbeddingError and ConfigurationError (or the
shared exception base if available), so missing credentials return the existing
(None, "...failed...") result instead of aborting evaluation.

---

Nitpick comments:
In `@src/lightspeed_evaluation/core/embedding/manager.py`:
- Around line 46-47: The public constructor in EmbeddingManager.__init__ is
missing the explicit return annotation and Google-style parameter docs. Update
__init__ to declare -> None and expand its docstring with a Google-style Args
section documenting config, keeping the existing note about deferred validation.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 2708eea4-352b-4fc3-9188-f453377c938b

📥 Commits

Reviewing files that changed from the base of the PR and between a625cab and 62df1f7.

📒 Files selected for processing (3)
  • src/lightspeed_evaluation/core/embedding/manager.py
  • src/lightspeed_evaluation/core/embedding/ragas.py
  • src/lightspeed_evaluation/core/metrics/ragas.py

Comment thread src/lightspeed_evaluation/core/embedding/manager.py Outdated
Comment thread src/lightspeed_evaluation/core/metrics/ragas.py Outdated
@bsatapat-jpg

Collaborator Author

@coderabbitai review

@xmican10 xmican10 left a comment

Collaborator

Thanks! I have two things:

  • RagasMetrics is shared across threads, so the lazy embedding_manager property and ensure_ready() can race on first access. The codebase already uses litellm_state_lock for the same pattern — should we add a lock here too?
  • The new lazy validation and exception handling don't have test coverage yet. I think we should add tests for ensure_ready() idempotency/error cases, the lazy property behavior, and the new exception types in evaluate(), wdyt?
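One way the requested ensure_ready() tests could look is sketched below. The `EmbeddingManager` here is a minimal stand-in written for this example, not the project's class, and the assumption that a failed validation leaves the manager unvalidated (so a later call retries) is mine. The key detail is that `_validate_config` is patched before the first call, so a validation on the second call cannot go unnoticed.

```python
from unittest.mock import patch


class EmbeddingManager:
    """Minimal stand-in with the readiness behaviour discussed in this PR."""

    def __init__(self) -> None:
        self._validated = False

    def _validate_config(self) -> None:
        """Placeholder for the real provider validation."""

    def ensure_ready(self) -> None:
        if self._validated:
            return
        self._validate_config()
        self._validated = True


def test_second_call_is_noop() -> None:
    manager = EmbeddingManager()
    # Patch before the first call so every validation attempt is counted.
    with patch.object(manager, "_validate_config") as validate:
        manager.ensure_ready()
        manager.ensure_ready()
    validate.assert_called_once()


def test_failed_validation_is_retried() -> None:
    manager = EmbeddingManager()
    # First attempt raises, second attempt succeeds.
    with patch.object(
        manager, "_validate_config", side_effect=[ValueError("bad config"), None]
    ) as validate:
        try:
            manager.ensure_ready()
        except ValueError:
            pass
        manager.ensure_ready()
    assert validate.call_count == 2
```

Both tests only observe calls made through `ensure_ready()`, so they do not need to read the protected `_validated` flag.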

@bsatapat-jpg

Collaborator Author

Thanks for reviewing it Eva,

  1. RagasMetrics is shared across threads via the ThreadPoolExecutor. However, the race here is harmless. At worst, two threads create a RagasEmbeddingManager and one gets discarded. The underlying ensure_ready() is idempotent, and CPython's GIL makes the assignment atomic, which means no corruption can happen.
    The litellm_state_lock solves a different problem (process-global litellm.cache mutation), whereas this is a simple lazy init with no side effects. A lock here would add contention without a real benefit.

  2. Agreed. I will be adding the test cases.

@xmican10

Collaborator

Thanks for clearing that up. Does this also apply to Hugging Face embedding models that are downloaded and run on the local machine?

@coderabbitai coderabbitai Bot left a comment

Contributor

Actionable comments posted: 4

🧹 Nitpick comments (1)
tests/unit/core/metrics/test_ragas.py (1)

98-130: 🎯 Functional Correctness | 🔵 Trivial | ⚡ Quick win

Add tests for BrokenPipeError and OSError evaluate branches.

evaluate() has dedicated messaging for these branches, but the parametrized matrix here doesn’t cover them, so those paths can regress silently.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/unit/core/metrics/test_ragas.py` around lines 98 - 130, The exception
matrix in test_catches_exception_gracefully is missing coverage for the
dedicated BrokenPipeError and OSError branches in RagasMetrics.evaluate. Extend
the parametrized cases in test_catches_exception_gracefully to include
BrokenPipeError and OSError with representative messages, and keep asserting
that evaluate() returns None plus an error reason containing both the specific
message and the generic “evaluation failed” text.
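A rough sketch of the exception matrix this nitpick asks for is shown below. Everything in it is illustrative: `evaluate` is a simplified stand-in for RagasMetrics.evaluate(), the exception classes are local placeholders, and the message wording is assumed. One detail that is certain is that `BrokenPipeError` is a subclass of `OSError`, so its branch has to come first to be reachable.

```python
from typing import Callable, Optional, Tuple


class EmbeddingError(Exception):
    """Placeholder for the project's embedding error type."""


class EvaluationError(Exception):
    """Placeholder for the project's evaluation error type."""


def evaluate(run_metric: Callable[[], float]) -> Tuple[Optional[float], str]:
    """Simplified stand-in: return (score, reason) instead of raising."""
    try:
        return run_metric(), "Ragas evaluation succeeded"
    except BrokenPipeError as exc:
        # Must precede OSError, because BrokenPipeError subclasses it.
        return None, f"Ragas evaluation failed: broken pipe ({exc})"
    except OSError as exc:
        return None, f"Ragas evaluation failed: OS error ({exc})"
    except (EmbeddingError, EvaluationError) as exc:
        return None, f"Ragas evaluation failed: {exc}"


def raiser(exc: Exception) -> Callable[[], float]:
    """Build a metric callable that raises the given exception."""

    def run() -> float:
        raise exc

    return run


# Table-driven cases, including the two branches the review flagged.
CASES = [
    BrokenPipeError("pipe closed"),
    OSError("disk unavailable"),
    EmbeddingError("missing API key"),
    EvaluationError("metric crashed"),
]


def check_all_cases() -> None:
    for exc in CASES:
        score, reason = evaluate(raiser(exc))
        assert score is None
        assert str(exc) in reason
        assert "evaluation failed" in reason
```

In the real test module the same cases would presumably be extra entries in the existing parametrized matrix rather than a hand-written loop.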
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/lightspeed_evaluation/core/metrics/ragas.py`:
- Around line 77-79: Update the RagasMetrics.__init__ Google-style docstring to
reflect that embedding validation is deferred until lazy initialization rather
than completed up front. In the argument/contract text for embedding_manager,
remove language implying it is already validated and instead describe that it is
stored for later use by RagasEmbeddingManager and checked when the embedding
manager is first accessed. Use the RagasMetrics.__init__ and
_ragas_embedding_manager symbols to locate the docs and keep the wording aligned
with the new lazy behavior.

In `@tests/unit/core/embedding/test_manager.py`:
- Around line 19-31: The idempotency test for EmbeddingManager.ensure_ready() is
only patching _validate_config after the first two calls, so it can miss a
validation happening on the second call. Update test_second_call_is_noop in
test_manager.py to patch manager._validate_config before the first
ensure_ready() call, then invoke ensure_ready() twice and assert the validation
mock was called exactly once. Use the existing EmbeddingManager and
_validate_config symbols to keep the test focused on the idempotency behavior.
- Line 1: Remove the pylint suppression in test_manager and update the test to
validate the public behavior of the embedding manager instead of reading
protected state directly. In the test cases that currently access the protected
member, assert the observable retry outcome through the relevant manager methods
and retry-related side effects so the same contract is covered without
protected-access usage.

In `@tests/unit/core/metrics/test_ragas.py`:
- Line 1: The test module currently uses a file-level pylint suppression to mask
redefined-outer-name, too-many-arguments, and too-many-positional-arguments
warnings. Remove the global disable from the test file and refactor the affected
fixtures/test functions in test_ragas.py so their signatures no longer trigger
those lint rules, using the specific pytest fixture and test names in that file
to locate and update the problematic patterns.

---

Nitpick comments:
In `@tests/unit/core/metrics/test_ragas.py`:
- Around line 98-130: The exception matrix in test_catches_exception_gracefully
is missing coverage for the dedicated BrokenPipeError and OSError branches in
RagasMetrics.evaluate. Extend the parametrized cases in
test_catches_exception_gracefully to include BrokenPipeError and OSError with
representative messages, and keep asserting that evaluate() returns None plus an
error reason containing both the specific message and the generic “evaluation
failed” text.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 2ac44954-98f5-4d45-9e3e-2d21601792a3

📥 Commits

Reviewing files that changed from the base of the PR and between 62df1f7 and f589513.

📒 Files selected for processing (5)
  • src/lightspeed_evaluation/core/embedding/manager.py
  • src/lightspeed_evaluation/core/embedding/ragas.py
  • src/lightspeed_evaluation/core/metrics/ragas.py
  • tests/unit/core/embedding/test_manager.py
  • tests/unit/core/metrics/test_ragas.py
🚧 Files skipped from review as they are similar to previous changes (2)
  • src/lightspeed_evaluation/core/embedding/ragas.py
  • src/lightspeed_evaluation/core/embedding/manager.py

Comment thread src/lightspeed_evaluation/core/metrics/ragas.py
Comment thread tests/unit/core/embedding/test_manager.py Outdated
Comment thread tests/unit/core/embedding/test_manager.py Outdated
Comment thread tests/unit/core/metrics/test_ragas.py Outdated
@bsatapat-jpg

Collaborator Author

@xmican10 wrote:

Does this also apply to the Huggingface's embedding models which are downloaded and running on the local machine?

Yes...Same argument applies. No issues for any provider including HuggingFace local models.

@asamal4

asamal4 commented Jun 24, 2026

Collaborator

@bsatapat-jpg The impact is not about data corruption but about memory usage. This is not a major concern for cloud providers, but for Hugging Face it is going to be a problem because we load the embedding model. If we use more threads, memory usage will spike.
In the short term it is harmless (this is not a recommended metric, and we mostly use cloud providers), but the problem is real.
To fix the problem quickly, we are also turning this into a runtime check, which violates our early-validation use case. If the embedding model is not set up correctly, evaluation may fail only after a few metric calculations, depending on how the conversation setup is done.

I see two options as a follow-up:

  1. Gather all metrics info during load. This gives us control over which config/framework imports need to be done.
  2. Add a lock.

cc: @xmican10

@bsatapat-jpg

Collaborator Author

I will add a lock to the lazy embedding_manager property (quick fix for thread-safety)

@bsatapat-jpg

Collaborator Author

Added a lock to address the thread-safety concern for HuggingFace local models. The "gather metrics info at load time" approach is tracked as a follow-up.

@VladimirKadlec

Member

@bsatapat-jpg wrote:

I will add a lock to the lazy embedding_manager property (quick fix for thread-safety)

Yes, IMO the best solution for now. You really don't want to have any race(s) in the code.

@asamal4 wrote:

If we use more threads, then the memory usage will spike.

Not true for this particular scenario, at least on Linux. Linux uses copy-on-write for memory allocation, and embedding models are typically read-only. However, I definitely agree that the problem is real and has to be solved.

@bsatapat-jpg

Collaborator Author

Thanks Kada... Added a threading.Lock with double-checked locking on the embedding_manager property.
PTAL
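For readers following the thread, double-checked locking on a lazy property can be sketched like this. It is an illustration under assumptions, not the merged code: `ExpensiveEmbeddingManager` and `Metrics` are hypothetical stand-ins for RagasEmbeddingManager and RagasMetrics, and the `sleep` only simulates a slow model load.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


class ExpensiveEmbeddingManager:
    """Hypothetical stand-in for RagasEmbeddingManager (e.g. a local model load)."""

    created = 0
    _count_lock = threading.Lock()

    def __init__(self) -> None:
        time.sleep(0.01)  # simulate a slow model load
        with ExpensiveEmbeddingManager._count_lock:
            ExpensiveEmbeddingManager.created += 1


class Metrics:
    """Hypothetical stand-in for RagasMetrics."""

    def __init__(self) -> None:
        self._manager = None
        self._manager_lock = threading.Lock()

    @property
    def embedding_manager(self) -> ExpensiveEmbeddingManager:
        # Fast path: no lock is taken once the manager exists.
        if self._manager is None:
            with self._manager_lock:
                # Re-check: another thread may have built it while we waited.
                if self._manager is None:
                    self._manager = ExpensiveEmbeddingManager()
        return self._manager


def access_from_threads(metrics: Metrics, workers: int = 8) -> list:
    """Hit the lazy property from several threads at once."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda _: metrics.embedding_manager, range(workers)))
```

The outer check keeps the common path lock-free after initialization, while the inner check ensures that only one thread ever constructs the expensive object, which is the concern raised for locally loaded models.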

@VladimirKadlec VladimirKadlec left a comment

Member

LGTM, thank you.

@asamal4 asamal4 merged commit 30f5c59 into lightspeed-core:main Jun 25, 2026
17 checks passed
@bsatapat-jpg bsatapat-jpg deleted the dev branch June 26, 2026 12:38
4 participants