feat: Add Venice AI provider integration #3079
Open
Venice AI Integration
Overview
This PR adds Venice AI as a new LLM provider in Letta, enabling chat completions, streaming, tool calling, and embedding generation with Venice AI models.
Changes
Core Implementation
- `letta/llm_api/venice_client.py` (NEW): Venice AI LLM client implementation; extends `LLMClientBase` with all required abstract methods
- `letta/llm_api/venice.py` (NEW): Helper for Venice API model listing; `venice_get_model_list_async()` queries the Venice API for available models
- `letta/schemas/providers/venice.py` (NEW): Venice provider implementation; `VeniceProvider` dynamically lists models from the Venice API and is registered when `venice_api_key` is configured

Configuration
- `letta/schemas/enums.py`: Added `venice` to the `ProviderType` enum
- `letta/llm_api/llm_client.py`: Registered Venice in the LLM client factory
- `letta/settings.py`: Added `venice_api_key` to `ModelSettings`
- `letta/server/server.py`: Auto-registers `VeniceProvider` when the API key is set
- `letta/schemas/providers/__init__.py`: Exports `VeniceProvider`
- `letta/schemas/llm_config.py`: Added `"venice"` to the `model_endpoint_type` Literal
- `letta/schemas/model.py`: Added `"venice"` to the `model_endpoint_type` Literal (deprecated schema)
- `letta/schemas/embedding_config.py`: Added `"venice"` to the `embedding_endpoint_type` Literal

Testing
- `tests/test_venice_client.py`: Comprehensive unit tests for `VeniceClient` (100% coverage)
- `tests/test_venice_provider.py`: Unit tests for `VeniceProvider` (100% coverage)
- `tests/test_venice_helper.py`: Unit tests for helper functions (100% coverage)
- `tests/test_venice_coverage_comprehensive.py`: Additional coverage tests for edge cases
- `tests/test_venice_live_api.py`: Live API integration tests (6/6 passing with a real API key)
- `tests/integration_test_venice.py`: E2E tests (requires database setup)

Features
✅ Chat Completions: Synchronous and asynchronous requests
✅ Streaming: Server-Sent Events (SSE) streaming support
✅ Tool Calling: OpenAI-compatible function/tool calling
✅ Embeddings: Batch embeddings generation
✅ Error Handling: Comprehensive error mapping with retry logic
✅ Dynamic Model Listing: Models discovered from Venice API
✅ BYOK Support: Bring Your Own Key provider support
✅ 100% Test Coverage: Unit, integration, and E2E tests
API Compatibility
Venice AI exposes an OpenAI-compatible API, so the integration reuses the existing OpenAI-style request and response handling rather than introducing a new wire format.
Configuration
Users can configure Venice AI by setting an API key, either via environment variable or directly in settings.
Models are automatically discovered and available as `venice/{model_id}`.

Testing
Documentation
User documentation has been prepared (see `VENICE_USER_DOCS.md` and `VENICE_SETUP_GUIDE.md` in the workspace root; not committed to the repo).

Breaking Changes
None. This is a purely additive change.
Checklist
Related Issues
Notes
- `is_reasoning_model()` queries the Venice API to check model traits. Models with reasoning capabilities (indicated by traits such as "reasoning", "reasoner", "thinking", "o1", "o3", "o4") are correctly identified as reasoning models.
- The `/models` endpoint doesn't return embedding models, but embeddings work via the `/embeddings` endpoint. We hardcode common embedding models (`text-embedding-ada-002`, `text-embedding-3-small`, `text-embedding-3-large`, `text-embedding-bge-m3`), similar to OpenAI's approach.
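The trait check described in the first note can be sketched like this. The response shape is an assumption: one entry from Venice's `/models` listing with a list of trait strings under a `traits` key (the real field name may differ), and the model ids are hypothetical.

```python
# Trait keywords named in this PR's description of is_reasoning_model().
REASONING_TRAITS = {"reasoning", "reasoner", "thinking", "o1", "o3", "o4"}


def is_reasoning_model(model_entry: dict) -> bool:
    """Return True if any of the model's traits indicates reasoning support.

    model_entry is assumed to be one item from Venice's /models response,
    carrying a list of trait strings; the actual field name may differ.
    """
    traits = model_entry.get("traits") or []
    return any(t.lower() in REASONING_TRAITS for t in traits)


print(is_reasoning_model({"id": "qwq-32b", "traits": ["reasoning"]}))  # True
print(is_reasoning_model({"id": "llama-3.3-70b", "traits": []}))       # False
```

Matching on a fixed keyword set keeps the check cheap and avoids per-model special cases, at the cost of missing any new trait names Venice introduces later.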