
Conversation


ogabrielluiz (Contributor) commented Dec 16, 2025

Summary

  • Implements lazy loading for individual model classes instead of loading all providers at once
  • Reduces startup time and memory usage when only specific providers are needed
  • Adds get_model_class(class_name) function for loading specific providers on demand

Changes

  • Added get_model_class(class_name) function that imports only the requested provider (see the sketch after this list)
  • Updated get_llm() to use the new lazy loading function
  • get_model_classes() now uses the individual loader internally
  • Added pragma comments for false positive secret detection in variable names
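
Roughly, the new loader looks like this. This is a simplified sketch: the real implementation in unified_models.py covers all five providers, and the behavior for unknown names (returning None here) is illustrative rather than copied from the diff.

```python
# Minimal sketch of the per-provider lazy loader; the real function in
# unified_models.py covers all five providers and may differ in details.
def get_model_class(class_name: str):
    """Import and return only the requested provider's chat model class."""
    if class_name == "ChatOpenAI":
        from langchain_openai import ChatOpenAI

        return ChatOpenAI
    if class_name == "ChatAnthropic":
        from langchain_anthropic import ChatAnthropic

        return ChatAnthropic
    # ... Google, Ollama, and WatsonX branches follow the same pattern
    return None
```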

Benefits

  • Faster import times when using a single provider
  • Reduced memory footprint
  • Avoids ImportError for uninstalled optional dependencies when their providers are not requested

Test plan

  • Verify model loading works with each provider (OpenAI, Anthropic, Google, Ollama, WatsonX)
  • Verify startup time improvement

Summary by CodeRabbit

  • Chores
    • Optimized model initialization to reduce startup overhead and improve handling of optional dependencies.


@github-actions github-actions bot added the community Pull Request from an external contributor label Dec 16, 2025

coderabbitai bot commented Dec 16, 2025

Important

Review skipped

Auto incremental reviews are disabled on this repository.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Walkthrough

Introduced a per-provider lazy loading mechanism for model classes via a new get_model_class() function that imports specific model classes on demand, replacing the previous all-at-once loader. The provider-class registry is reworked with an internal _MODEL_CLASS_NAMES list, and all call sites are updated to use the new targeted loader.
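
A rough sketch of how these pieces fit together, inferred from the walkthrough. The registry entries for Ollama and WatsonX are assumed class names for illustration; only ChatOpenAI, ChatAnthropic, and ChatGoogleGenerativeAIFixed are confirmed elsewhere in this review.

```python
# Sketch of the registry plus the refactored bulk loader described in the
# walkthrough; ChatOllama and ChatWatsonx are assumed names for illustration.
_MODEL_CLASS_NAMES = [
    "ChatOpenAI",
    "ChatAnthropic",
    "ChatGoogleGenerativeAIFixed",
    "ChatOllama",
    "ChatWatsonx",
]


def get_model_classes() -> dict:
    """Build the full name-to-class mapping by delegating to get_model_class()."""
    return {name: get_model_class(name) for name in _MODEL_CLASS_NAMES}
```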

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Lazy loading refactor: src/lfx/src/lfx/base/models/unified_models.py | Added get_model_class(class_name: str) function for per-provider lazy importing; introduced _MODEL_CLASS_NAMES registry; refactored get_model_classes() to use the targeted loader; updated get_llm() and normalize_model_names_to_dicts() to call the new loader; added pragma comments for API key variable mappings. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20–30 minutes

  • Verify get_model_class() correctly handles all five providers (OpenAI, Anthropic, Google Generative AI, Ollama, Watsonx) and error cases for unknown classes
  • Confirm all call sites that retrieve model classes have been updated and no references to the old pattern remain
  • Ensure backward compatibility of get_model_classes() return type and behavior
  • Review error handling and import failure scenarios in the lazy loader
  • Validate that pragma comments on API key mappings do not mask legitimate security concerns

Pre-merge checks and finishing touches

Important

Pre-merge checks failed

Please resolve all errors before merging. Addressing warnings is optional.

❌ Failed checks (2 warnings, 1 inconclusive)
| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Test Quality And Coverage | ⚠️ Warning | Test file exists but lacks comprehensive coverage for the new lazy loading implementation; missing tests for get_model_class(), provider imports, and integration. | Add dedicated unit tests validating lazy loading functionality, provider module import deferral, error handling, and integration with get_llm() and normalize_model_names_to_dicts(). |
| Test File Naming And Structure | ⚠️ Warning | Test file exists but does not cover the new lazy-loading functions (get_model_class() and get_model_classes()) introduced in this PR. Test plan items remain incomplete, and verification of model loading and startup time improvements has not been done. | Create comprehensive tests for the lazy-loading functions covering all five providers, error handling, edge cases, and performance benchmarks. Execute and verify all test plan items before merging. |
| Test Coverage For New Implementations | ❓ Inconclusive | No verification output was provided to convert. | Please provide the verification output that needs to be converted to the required JSON format. |
✅ Passed checks (4 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped: CodeRabbit's high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately describes the main change: implementing lazy loading for individual model classes, which is the core objective of the PR. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 83.33%, which is sufficient; the required threshold is 80.00%. |
| Excessive Mock Usage Warning | ✅ Passed | Test file uses a minimal mocking approach, validating actual behavior through real function calls and genuine returned data across 8 focused test functions. |
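
The two test-coverage warnings above ask for dedicated unit tests around the lazy loader. A minimal pytest-style sketch of what those could look like follows; the module path mirrors the file touched in this PR, and the None-for-unknown-name behavior is an assumption, not taken from the diff.

```python
# Hypothetical tests for the lazy loader; the module path and the behavior
# for unknown class names are assumptions, not taken from the PR itself.
from lfx.base.models.unified_models import get_model_class, get_model_classes


def test_known_provider_loads_lazily():
    cls = get_model_class("ChatOpenAI")
    assert cls is not None
    assert cls.__name__ == "ChatOpenAI"


def test_unknown_provider_returns_none():
    assert get_model_class("NotARealProvider") is None


def test_bulk_loader_still_returns_all_providers():
    classes = get_model_classes()
    assert "ChatOpenAI" in classes
```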


@github-actions github-actions bot added the enhancement New feature or request label Dec 16, 2025
@ogabrielluiz ogabrielluiz removed the community Pull Request from an external contributor label Dec 16, 2025

github-actions bot commented Dec 16, 2025

Frontend Unit Test Coverage Report

Coverage Summary

| Lines | Statements | Branches | Functions |
| --- | --- | --- | --- |
| 17% | 16.64% (4686/28150) | 9.99% (2179/21792) | 10.93% (676/6181) |

Unit Test Results

| Tests | Skipped | Failures | Errors | Time |
| --- | --- | --- | --- | --- |
| 1829 | 0 💤 | 0 ❌ | 0 🔥 | 24.018s ⏱️ |


codecov bot commented Dec 16, 2025

Codecov Report

❌ Patch coverage is 17.39130% with 19 lines in your changes missing coverage. Please review.
✅ Project coverage is 33.10%. Comparing base (b86351f) to head (603f32c).

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| src/lfx/src/lfx/base/models/unified_models.py | 17.39% | 19 Missing ⚠️ |

❌ Your project check has failed because the head coverage (39.25%) is below the target coverage (60.00%). You can increase the head coverage or adjust the target coverage.

Additional details and impacted files


@@            Coverage Diff             @@
##             main   #11051      +/-   ##
==========================================
- Coverage   33.11%   33.10%   -0.01%     
==========================================
  Files        1389     1389              
  Lines       65714    65727      +13     
  Branches     9730     9735       +5     
==========================================
+ Hits        21760    21761       +1     
- Misses      42836    42848      +12     
  Partials     1118     1118              
| Flag | Coverage Δ |
| --- | --- |
| frontend | 15.34% <ø> (ø) |
| lfx | 39.25% <17.39%> (-0.02%) ⬇️ |

Flags with carried forward coverage won't be shown.

| Files with missing lines | Coverage Δ |
| --- | --- |
| src/lfx/src/lfx/base/models/unified_models.py | 11.72% <17.39%> (-0.13%) ⬇️ |


coderabbitai bot (Contributor) left a comment


Actionable comments posted: 0

🧹 Nitpick comments (2)
src/lfx/src/lfx/base/models/unified_models.py (2)

22-48: Good lazy loading implementation, but consider adding type hints and error handling.

The per-provider lazy loading is well-implemented and achieves the stated goals. However, consider these improvements:

  1. Missing return type annotation: Add a return type hint for better type safety.
  2. Import error handling: When an optional dependency is missing, the bare import will raise ImportError. Consider wrapping imports in try-except blocks to provide clearer error messages about missing packages.

Apply this diff to add type hints:

-def get_model_class(class_name: str):
+def get_model_class(class_name: str) -> type | None:
     """Lazy load a specific model class to avoid importing unused dependencies.

Consider adding error handling for optional dependencies:

 def get_model_class(class_name: str) -> type | None:
     """Lazy load a specific model class to avoid importing unused dependencies.

     This imports only the requested provider, not all providers at once.
     """
     if class_name == "ChatOpenAI":
-        from langchain_openai import ChatOpenAI
-
-        return ChatOpenAI
+        try:
+            from langchain_openai import ChatOpenAI
+            return ChatOpenAI
+        except ImportError as e:
+            msg = f"Cannot import {class_name}. Install langchain-openai: pip install langchain-openai"
+            raise ImportError(msg) from e
     if class_name == "ChatAnthropic":
-        from langchain_anthropic import ChatAnthropic
-
-        return ChatAnthropic
+        try:
+            from langchain_anthropic import ChatAnthropic
+            return ChatAnthropic
+        except ImportError as e:
+            msg = f"Cannot import {class_name}. Install langchain-anthropic: pip install langchain-anthropic"
+            raise ImportError(msg) from e
     # ... similar pattern for other providers

60-66: Consider restoring @lru_cache for consistency with similar utility functions.

While get_model_classes() does import all provider packages (already documented in the docstring), restoring the @lru_cache(maxsize=1) decorator would align with the pattern used by get_embedding_classes() and get_model_provider_metadata() in the same module. This is a minor optimization since the function has no parameters and is called infrequently (once per batch operation in batch_run.py), but caching the resulting dict avoids recreating it on repeated invocations.
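
For reference, the cached variant suggested here would look roughly like the following, assuming the function stays parameterless:

```python
from functools import lru_cache


@lru_cache(maxsize=1)
def get_model_classes() -> dict:
    """Build the name-to-class mapping once and reuse it on later calls."""
    return {name: get_model_class(name) for name in _MODEL_CLASS_NAMES}
```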

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between b86351f and 1fe94d5.

📒 Files selected for processing (1)
  • src/lfx/src/lfx/base/models/unified_models.py (8 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
src/lfx/src/lfx/base/models/unified_models.py (1)
src/lfx/src/lfx/base/models/google_generative_ai_model.py (1)
  • ChatGoogleGenerativeAIFixed (4-38)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: Update Starter Projects
  • GitHub Check: Update Component Index
🔇 Additional comments (4)
src/lfx/src/lfx/base/models/unified_models.py (4)

50-57: Clean internal registry for model class names.

This list provides a clear registry of available model classes and keeps them in sync with get_model_class().


902-907: Excellent lazy loading implementation in get_llm().

This change is the key improvement that enables lazy loading. By calling get_model_class(model_class_name) instead of get_model_classes().get(...), the function now imports only the specific provider needed rather than all providers upfront.

The error handling is appropriate, with a clear ValueError raised when the model class cannot be loaded.
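
In outline, the change being described amounts to something like the sketch below. This is a paraphrase rather than the reviewed lines; _build_llm and model_kwargs are placeholder names introduced here for illustration.

```python
def _build_llm(model_class_name: str, model_kwargs: dict):
    # Paraphrase of the get_llm() change: resolve only the requested provider
    # and raise a clear ValueError when it cannot be loaded.
    model_class = get_model_class(model_class_name)
    if model_class is None:
        msg = f"Could not load model class {model_class_name!r}"
        raise ValueError(msg)
    return model_class(**model_kwargs)
```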


297-300: Appropriate use of pragma comments for false-positive suppression.

The # pragma: allowlist secret comments correctly suppress false-positive secret detection on variable and parameter names that contain "api_key" or similar strings. These are configuration keys, not actual secrets.

Also applies to: 654-654, 665-665, 679-679, 855-855
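
For context, the suppressed entries are plain inline comments on mapping values that merely name environment variables, for example (an illustrative mapping, not copied from the diff):

```python
# Illustrative example: these values are environment-variable names, not
# secrets, so the secret-detection warning is allowlisted inline.
PROVIDER_API_KEY_VARS = {
    "OpenAI": "OPENAI_API_KEY",  # pragma: allowlist secret
    "Anthropic": "ANTHROPIC_API_KEY",  # pragma: allowlist secret
}
```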


755-860: Note: Inconsistency with AI summary.

The AI summary states that normalize_model_names_to_dicts() was updated to use get_model_class instead of get_model_classes(), but no changes are marked in this function. Upon inspection, this function doesn't appear to directly call either function—it uses get_unified_models_detailed() and static mappings instead.

This doesn't indicate a code issue, but there may be a discrepancy in the summary.
