feat(embedderconfig): add embedding_openai_endpoint argument to EmbedderConfig for custom OpenAI deployments #406

Open
wants to merge 1 commit into
base: main

Conversation

dutchfarao

Summary

This pull request adds an embedding_openai_endpoint argument to the EmbedderConfig class, allowing users to set a custom base_url when using an OpenAI client. This is essential for custom LLM deployments that follow the OpenAI API convention.

Details

  • Added embedding_openai_endpoint argument to EmbedderConfig class.
  • Ensured backward compatibility by providing a default value for base_url.
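
To illustrate the backward-compatibility point, here is a minimal, self-contained sketch of one way an optional endpoint can be threaded through to client kwargs. SketchEmbedderConfig and openai_config_kwargs are hypothetical stand-ins written for this example, not the real EmbedderConfig API, and the None default is an assumption; the default actually chosen is in the diff.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass
class SketchEmbedderConfig:
    """Hypothetical stand-in for EmbedderConfig; not the real class."""

    embedding_api_key: str
    embedding_model_name: Optional[str] = None
    # Assumed default: None means "use the client's standard endpoint".
    embedding_openai_endpoint: Optional[str] = None

    def openai_config_kwargs(self) -> Dict[str, Any]:
        """Build client kwargs, adding optional values only when they are set."""
        kwargs: Dict[str, Any] = {"api_key": self.embedding_api_key}
        if self.embedding_model_name:
            kwargs["model_name"] = self.embedding_model_name
        if self.embedding_openai_endpoint:
            kwargs["base_url"] = self.embedding_openai_endpoint
        return kwargs


# Existing callers that never pass an endpoint get the same kwargs as before.
default_kwargs = SketchEmbedderConfig(
    embedding_api_key="your_api_key",
).openai_config_kwargs()

# Callers with a custom deployment get base_url forwarded to the client config.
custom_kwargs = SketchEmbedderConfig(
    embedding_api_key="your_api_key",
    embedding_openai_endpoint="https://custom-openai-instance.com",
).openai_config_kwargs()
```

Leaving base_url out of the kwargs when no endpoint is given is one possible design; passing an explicit default through, as the Details list describes, should be equivalent as long as the downstream config accepts that default.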

Testing

I have verified that the changes work for our custom deployment. I have also verified that the change does not break Pipeline if no embedding_openai_endpoint is provided.

from unstructured_ingest.v2.processes.embedder import EmbedderConfig
from unstructured_ingest.embed.openai import OpenAIEmbeddingConfig, OpenAIEmbeddingEncoder

embedder_config = EmbedderConfig(
    embedding_provider="openai",
    embedding_api_key="your_api_key",
    embedding_openai_endpoint="https://custom-openai-instance.com",
)

# Initialize the OpenAI client with the custom config, as done in EmbedderConfig.get_openai_embedder()
config_kwargs = {
    "api_key": embedder_config.embedding_api_key,
    "base_url": embedder_config.embedding_openai_endpoint,
}

# Only pass a model name when one was configured
if model_name := embedder_config.embedding_model_name:
    config_kwargs["model_name"] = model_name

client = OpenAIEmbeddingEncoder(config=OpenAIEmbeddingConfig.model_validate(config_kwargs))

# Verify that the base URL is correctly set
assert client.config.base_url == "https://custom-openai-instance.com"

@dutchfarao dutchfarao force-pushed the feat-openai-custom-base-url branch from 19ab181 to 9fb48c1 Compare March 3, 2025 11:12