
feat: nvidia_nim encoder #582


Merged
merged 9 commits into main on Jun 20, 2025

Conversation

@Joshua-Briggs (Member) commented Apr 9, 2025

PR Type

  • Enhancement
  • Tests
  • Documentation

Description

  • Added Jina, NVIDIA NIM, Voyage encoder classes.

  • Integrated new encoder types in AutoEncoder.

  • Updated defaults, schema and tests accordingly.

  • Provided new notebooks for encoder usage.


Changes walkthrough 📝

Relevant files

Enhancement (6 files)
  • __init__.py: Integrate new Jina, Nim, Voyage encoder imports (+12/-0)
  • jina.py: Add new JinaEncoder implementation class (+34/-0)
  • nvidia_nim.py: Add new NimEncoder for the Nvidia NIM service (+85/-0)
  • voyage.py: Add new VoyageEncoder class for Voyage models (+85/-0)
  • schema.py: Extend EncoderType enum with new encoder types (+10/-0)
  • defaults.py: Add default configurations for new encoders (+12/-0)

Bug fix (1 file)
  • litellm.py: Refine name splitting for encoder instantiation (+1/-1)

Tests (1 file)
  • test_lite_encoders.py: Add test matrix entries for new encoder types (+36/-2)

Documentation (4 files)
  • jina-encoder.ipynb: Document Jina encoder usage with a notebook (+417/-0)
  • mistral-encoder.ipynb: Update mistral version from 1.1.6 to 0.1.8 (+2/-2)
  • nvidia_nim-encoder.ipynb: Include a notebook demonstrating the Nvidia NIM encoder (+407/-0)
  • voyage-encoder.ipynb: Add a notebook documenting Voyage encoder usage (+393/-0)


    github-actions bot commented Apr 9, 2025

    PR Reviewer Guide 🔍

    Here are some key observations to aid the review process:

    ⏱️ Estimated effort to review: 3 🔵🔵🔵⚪⚪
    🧪 PR contains tests
    🔒 No security concerns identified
    ⚡ Recommended focus areas for review

    Docstring Inconsistency

    The docstring for the initializer refers to a parameter named "jina_api_key" while the actual parameter is "api_key". Consider aligning the documentation with the code for clarity.

    type: str = "jina"
    
    def __init__(
        self,
        name: str | None = None,
        api_key: str | None = None,
        score_threshold: float = 0.4,
    ):
        """Initialize the JinaEncoder.
    
        :param name: The name of the embedding model to use such as "jina-embeddings-v3".
        :param jina_api_key: The Jina API key.
        :type jina_api_key: str
        """
    
        if name is None:
            name = f"jina_ai/{EncoderDefault.JINA.value['embedding_model']}"
        elif not name.startswith("jina_ai/"):
            name = f"jina_ai/{name}"
        super().__init__(
            name=name,
            score_threshold=score_threshold,
            api_key=api_key,
        )


    github-actions bot commented Apr 9, 2025

    PR Code Suggestions ✨

    Explore these optional code suggestions:

    Possible issue
    Correct dimension description

    Update the dimension description to match the actual output from
    rl.index.dimensions.

    docs/encoders/nvidia_nim-encoder.ipynb [259-265]

    -"We do have 256-dimensional vectors. Now let's test them:"
    +"We do have 1024-dimensional vectors. Now let's test them:"
    Suggestion importance [1-10]: 7 (impact: Medium)

    Why: The suggestion corrects a factual mismatch in the notebook's narrative by updating the described vector dimensionality from 256 to 1024, which aligns with the actual output. This improves clarity and accuracy in documentation.
    Align return value description

    Revise the explanation to accurately reflect that an unmatched query returns a
    RouteChoice with a None name.

    docs/encoders/voyage-encoder.ipynb [358-363]

    -"In this case, we return `None` because no matches were identified. We always recommend optimizing your `RouteLayer` for optimal performance, you can see how in [this notebook](https://github.com/aurelio-labs/semantic-router/blob/main/docs/06-threshold-optimization.ipynb)."
    +"In this case, we return a RouteChoice with no match (with `name` set to None) because no valid route was identified. We always recommend optimizing your `RouteLayer` for optimal performance, as shown in [this notebook](https://github.com/aurelio-labs/semantic-router/blob/main/docs/06-threshold-optimization.ipynb)."
    Suggestion importance [1-10]: 4 (impact: Low)

    Why: The suggestion refines the explanatory text to indicate that an unmatched query yields a RouteChoice with a None name rather than a raw None value. While this improves precision in the narrative, its impact is limited to documentation clarity.
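The distinction this suggestion draws matters when consuming results programmatically: callers should check the `name` attribute rather than testing the return value against `None`. A minimal stand-in for the real `RouteChoice` class (which lives in `semantic_router.schema`; the fields here are assumed for illustration):

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RouteChoice:
    """Stand-in for semantic_router.schema.RouteChoice (fields assumed)."""
    name: Optional[str] = None


def is_match(choice: RouteChoice) -> bool:
    # An unmatched query still returns a RouteChoice object,
    # so test the name attribute, not the object itself.
    return choice.name is not None
```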
    General
    Fix docstring parameter naming

    Update the docstring parameter name to match the actual constructor parameter
    (api_key).

    semantic_router/encoders/jina.py [13-24]

     def __init__(
         self,
         name: str | None = None,
         api_key: str | None = None,
         score_threshold: float = 0.4,
     ):
         """Initialize the JinaEncoder.
     
         :param name: The name of the embedding model to use such as "jina-embeddings-v3".
    -    :param jina_api_key: The Jina API key.
    -    :type jina_api_key: str
    +    :param api_key: The Jina API key.
    +    :type api_key: str
         """
    Suggestion importance [1-10]: 3 (impact: Low)

    Why: This change corrects the incorrect parameter name in the docstring, aligning it with the constructor signature. Although it only affects documentation, it improves clarity.
    Update NimEncoder docstring

    Change the docstring parameter (nim_api_key) to use the actual parameter name
    (api_key).

    semantic_router/encoders/nvidia_nim.py [15-26]

     def __init__(
         self,
         name: str | None = None,
         api_key: str | None = None,
         score_threshold: float = 0.4,
     ):
         """Initialize the NimEncoder.
     
         :param name: The name of the embedding model to use such as "nv-embedqa-e5-v5".
    -    :param nim_api_key: The Nim API key.
    -    :type nim_api_key: str
    +    :param api_key: The Nim API key.
    +    :type api_key: str
         """
    Suggestion importance [1-10]: 3 (impact: Low)

    Why: The suggested update replaces the misleading parameter name in the NimEncoder docstring with the correct one, enhancing the documentation's accuracy with minimal impact.
    Correct VoyageEncoder docstring

    Update the docstring parameter name (voyage_api_key) to match the actual parameter
    (api_key).

    semantic_router/encoders/voyage.py [15-26]

     def __init__(
         self,
         name: str | None = None,
         api_key: str | None = None,
         score_threshold: float = 0.4,
     ):
         """Initialize the VoyageEncoder.
     
         :param name: The name of the embedding model to use such as "voyage-embed".
    -    :param voyage_api_key: The Voyage API key.
    -    :type voyage_api_key: str
    +    :param api_key: The Voyage API key.
    +    :type api_key: str
         """
    Suggestion importance [1-10]: 3 (impact: Low)

    Why: By updating the docstring parameter from "voyage_api_key" to "api_key," the change improves consistency with the function signature, representing a minor yet useful documentation fix.


    codecov bot commented Apr 9, 2025

    Codecov Report

    Attention: Patch coverage is 58.53659% with 17 lines in your changes missing coverage. Please review.

    Project coverage is 74.25%. Comparing base (900a0c9) to head (47797df).
    Report is 7 commits behind head on main.

    Files with missing lines | Patch % | Missing lines
    semantic_router/encoders/nvidia_nim.py | 57.14% | 15 Missing ⚠️
    semantic_router/encoders/__init__.py | 33.33% | 2 Missing ⚠️

    Additional details and impacted files:
    @@           Coverage Diff           @@
    ##             main     #582   +/-   ##
    =======================================
      Coverage   74.25%   74.25%           
    =======================================
      Files          48       48           
      Lines        4373     4374    +1     
    =======================================
    + Hits         3247     3248    +1     
      Misses       1126     1126           

    ☔ View full report in Codecov by Sentry.


    @cursor cursor bot left a comment

    Bug: Incorrect Default Model for Embeddings

    The default embedding_model for NVIDIA_NIM is incorrectly set to "meta/llama3-70b-instruct". This model is a language model, not an embedding model, which causes failures when generating embeddings with default settings. The default should be an actual embedding model, such as "nvidia/nv-embedqa-e5-v5", as indicated in documentation and usage examples.

    semantic_router/utils/defaults.py#L31-L39

    }
    NVIDIA_NIM = {
        "embedding_model": os.getenv(
            "NVIDIA_NIM_MODEL_NAME", "meta/llama3-70b-instruct"
        ),
        "language_model": os.getenv(
            "NVIDIA_NIM_CHAT_MODEL_NAME", "meta/llama3-70b-instruct"
        ),
    }
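A corrected default along the lines the bot report suggests would point the embedding entry at an actual embedding model while keeping the environment-variable override. A sketch, using the model strings named in the report above (this is an illustration, not the merged fix):

```python
import os

# Hypothetical corrected defaults: the embedding entry falls back to an
# embedding model ("nvidia/nv-embedqa-e5-v5") instead of the llama3 chat
# model, while the language entry keeps its chat-model default.
NVIDIA_NIM = {
    "embedding_model": os.getenv(
        "NVIDIA_NIM_MODEL_NAME", "nvidia/nv-embedqa-e5-v5"
    ),
    "language_model": os.getenv(
        "NVIDIA_NIM_CHAT_MODEL_NAME", "meta/llama3-70b-instruct"
    ),
}
```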


    @jamescalam jamescalam merged commit cbdf422 into main Jun 20, 2025
    8 of 10 checks passed
    @jamescalam jamescalam deleted the josh/nvidia_nim-encoder branch June 20, 2025 11:42