@actuallyrizzn commented Nov 25, 2025

Venice AI Integration

Overview

This PR adds Venice AI as a new LLM provider in Letta, enabling chat completions, streaming, tool calling, and embeddings generation with Venice AI models.

Changes

Core Implementation

  • letta/llm_api/venice_client.py (NEW): Venice AI LLM client implementation

    • Extends LLMClientBase with all required abstract methods
    • Supports synchronous and asynchronous requests
    • Implements streaming with Server-Sent Events (SSE) parsing
    • Handles embeddings generation
    • Comprehensive error handling with retry logic
    • Tool/function calling support
  • letta/llm_api/venice.py (NEW): Helper function for Venice API model listing

    • venice_get_model_list_async(): Queries Venice API for available models
  • letta/schemas/providers/venice.py (NEW): Venice provider implementation

    • VeniceProvider: Dynamically lists models from Venice API
    • Auto-registers when venice_api_key is configured
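As a rough sketch of the client shape described above (the real implementation extends `LLMClientBase` in Letta; the class name, base URL, and payload-building helper here are illustrative, not the PR's actual code):

```python
import json
import urllib.request

VENICE_BASE_URL = "https://api.venice.ai/api/v1"  # assumed base URL; verify against Venice docs


class MiniVeniceClient:
    """Minimal illustration of an OpenAI-compatible chat-completions client."""

    def __init__(self, api_key: str, base_url: str = VENICE_BASE_URL):
        self.api_key = api_key
        self.base_url = base_url

    def build_request(self, model: str, messages: list, tools=None, stream: bool = False) -> dict:
        """Build an OpenAI-style chat completion payload."""
        payload = {"model": model, "messages": messages, "stream": stream}
        if tools:
            payload["tools"] = tools  # OpenAI-compatible tool/function specs
        return payload

    def chat(self, model: str, messages: list) -> dict:
        """Send a synchronous chat completion request and return the parsed JSON."""
        req = urllib.request.Request(
            f"{self.base_url}/chat/completions",
            data=json.dumps(self.build_request(model, messages)).encode(),
            headers={
                "Authorization": f"Bearer {self.api_key}",
                "Content-Type": "application/json",
            },
        )
        with urllib.request.urlopen(req) as resp:
            return json.load(resp)
```

The real client adds async variants, SSE streaming, embeddings, and retry logic on top of this request shape.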

Configuration

  • letta/schemas/enums.py: Added venice to ProviderType enum
  • letta/llm_api/llm_client.py: Registered Venice in LLM client factory
  • letta/settings.py: Added venice_api_key to ModelSettings
  • letta/server/server.py: Auto-registers VeniceProvider when API key is set
  • letta/schemas/providers/__init__.py: Exports VeniceProvider
  • letta/schemas/llm_config.py: Added "venice" to model_endpoint_type Literal
  • letta/schemas/model.py: Added "venice" to model_endpoint_type Literal (deprecated schema)
  • letta/schemas/embedding_config.py: Added "venice" to embedding_endpoint_type Literal
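The enum and factory changes above amount to a registration pattern along these lines (a self-contained sketch; the real enum and dispatch live in `letta/schemas/enums.py` and `letta/llm_api/llm_client.py`, and `VeniceClient` here is a stand-in):

```python
from enum import Enum


class ProviderType(str, Enum):
    # Subset of Letta's enum, for illustration only.
    openai = "openai"
    anthropic = "anthropic"
    venice = "venice"  # added by this PR


class VeniceClient:
    """Stand-in for letta/llm_api/venice_client.py."""


# Illustrative factory dispatch: provider type -> client class.
_CLIENTS = {ProviderType.venice: VeniceClient}


def create_client(provider: ProviderType):
    """Instantiate the client registered for a provider type."""
    try:
        return _CLIENTS[provider]()
    except KeyError:
        raise ValueError(f"no client registered for {provider.value}")
```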

Testing

  • tests/test_venice_client.py: Comprehensive unit tests for VeniceClient (100% coverage)
  • tests/test_venice_provider.py: Unit tests for VeniceProvider (100% coverage)
  • tests/test_venice_helper.py: Unit tests for helper functions (100% coverage)
  • tests/test_venice_coverage_comprehensive.py: Additional coverage tests for edge cases
  • tests/test_venice_live_api.py: Live API integration tests (6/6 passing with real API key)
  • tests/integration_test_venice.py: E2E tests (requires database setup)

Features

  • Chat Completions: synchronous and asynchronous requests
  • Streaming: Server-Sent Events (SSE) streaming support
  • Tool Calling: OpenAI-compatible function/tool calling
  • Embeddings: batch embeddings generation
  • Error Handling: comprehensive error mapping with retry logic
  • Dynamic Model Listing: models discovered from the Venice API
  • BYOK Support: Bring Your Own Key provider support
  • 100% Test Coverage: unit, integration, and E2E tests
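The PR body doesn't spell out the retry logic, but "error mapping with retry" typically means bounded exponential backoff on transient HTTP errors; a generic sketch under that assumption (the exception class, status set, and delays are illustrative):

```python
import time


class HTTPStatusError(Exception):
    """Minimal stand-in for a client error carrying an HTTP status code."""

    def __init__(self, status: int):
        super().__init__(f"HTTP {status}")
        self.status = status


RETRYABLE_STATUS = {429, 500, 502, 503, 504}  # assumed transient statuses


def with_retries(call, max_attempts: int = 3, base_delay: float = 0.5):
    """Run `call`, retrying retryable HTTP errors with exponential backoff."""
    for attempt in range(1, max_attempts + 1):
        try:
            return call()
        except HTTPStatusError as err:
            # Re-raise immediately on non-retryable errors or the final attempt.
            if err.status not in RETRYABLE_STATUS or attempt == max_attempts:
                raise
            time.sleep(base_delay * 2 ** (attempt - 1))
```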

API Compatibility

Venice AI exposes an OpenAI-compatible API, so the integration maps cleanly onto Letta's existing OpenAI-style client code:

  • Request format: OpenAI-compatible messages, tools, parameters
  • Response format: OpenAI-compatible chat completion format
  • Streaming: Server-Sent Events (SSE) format
  • Embeddings: OpenAI-compatible embeddings endpoint
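In the OpenAI-compatible SSE format referenced above, each event arrives as a `data: {json}` line and the stream ends with `data: [DONE]`; a minimal parser for that framing might look like (a sketch of the general format, not the PR's actual parsing code):

```python
import json


def parse_sse(lines):
    """Yield parsed JSON payloads from OpenAI-style SSE lines until 'data: [DONE]'."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank keep-alive lines and comments
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            return  # end-of-stream sentinel
        yield json.loads(data)
```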

Configuration

Users can configure Venice AI by setting:

export VENICE_API_KEY="your-api-key"

Or in settings:

venice_api_key = "your-api-key"

Models are automatically discovered and available as venice/{model_id}.
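Handles of the form `venice/{model_id}` follow the usual provider-qualified convention; a trivial sketch of how such a handle splits (the model id in the test is hypothetical):

```python
def split_model_handle(handle: str) -> tuple:
    """Split a provider-qualified handle like 'venice/<model_id>' into its parts."""
    provider, sep, model_id = handle.partition("/")
    if not sep or not model_id:
        raise ValueError(f"expected provider/model_id, got {handle!r}")
    return provider, model_id
```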

Testing

  • 100% code coverage for all Venice implementation files
  • 97 unit tests passing
  • 6 live API tests passing with real Venice API key
  • Performance: Request/response times < 2s (tested)

Documentation

User documentation has been prepared (see VENICE_USER_DOCS.md and VENICE_SETUP_GUIDE.md in workspace root - not committed to repo).

Breaking Changes

None. This is a purely additive change.

Checklist

  • All tests passing (97 unit tests, 6 live API tests)
  • 100% test coverage achieved
  • Code follows Letta patterns (matches OpenAI/Anthropic client style)
  • Documentation complete (docstrings, type hints, inline comments)
  • No external dependencies added (extracted SDK code directly)
  • Backward compatible (no breaking changes)
  • Error handling comprehensive
  • Performance acceptable (< 2s response times)
  • Live API tests verified with real API key

Related Issues

Notes

  • Reasoning model detection: is_reasoning_model() queries the Venice API and checks model traits; models whose traits include "reasoning", "reasoner", "thinking", "o1", "o3", or "o4" are identified as reasoning models.
  • Venice does not support inner thoughts in kwargs or the developer role.
  • LLM model list is dynamically fetched from Venice API (not hardcoded)
  • Embedding models: Venice API's /models endpoint doesn't return embedding models, but embeddings work via the /embeddings endpoint. We hardcode common embedding models (text-embedding-ada-002, text-embedding-3-small, text-embedding-3-large, text-embedding-bge-m3) similar to OpenAI's approach.
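The trait check described in the first note reduces to a set intersection once the model spec has been fetched; a sketch of just that check, using the trait names listed above (the real is_reasoning_model() also performs the API query, which is omitted here):

```python
REASONING_TRAITS = {"reasoning", "reasoner", "thinking", "o1", "o3", "o4"}


def has_reasoning_traits(model_spec: dict) -> bool:
    """Return True if any trait in a Venice model spec signals reasoning capability."""
    traits = {str(t).lower() for t in model_spec.get("traits", [])}
    return not traits.isdisjoint(REASONING_TRAITS)
```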

- Introduced the Venice API key in ModelSettings.
- Updated LLMClient to handle Venice as a provider.
- Added Venice to the ProviderType enum.
- Registered VeniceProvider in the providers module.
- Integrated VeniceProvider into the SyncServer for API key management.
- Added support for the Venice endpoint in embedding and LLM configurations.
- Updated VeniceClient to include a default context window for embeddings.
- Improved test coverage for VeniceClient, including error handling and API interactions.
- Refactored mock responses in tests for better clarity and reliability.
- Added comprehensive docstrings to all methods, matching Letta codebase style.
- Added inline comments for complex logic (SSE parsing, error mapping, retry).
- Documented Venice-specific parameters and the error handling approach.
- Cleaned up unused imports (ErrorCode, Dict, Tuple).
- Improved error message clarity and consistency.
- Updated is_reasoning_model() to query the Venice API and check model_spec.traits.
- Models with reasoning traits (reasoning, reasoner, thinking, o1, o3, o4) are correctly identified.
- Added comprehensive tests for reasoning model detection (5 new tests).
- Updated the PR description to reflect the correct behavior.
… doesn't list them

- The Venice API /models endpoint only returns text models, not embedding models.
- Embeddings work via the /embeddings endpoint, but the models aren't discoverable.
- Hardcoded common embedding models (ada-002, embedding-3-small/large, bge-m3).
- All return 1024 dimensions (verified with the live API).
- Updated the test to match the new hardcoded implementation.
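The hardcoded embedding-model list described above might look something like this (model names and the uniform 1024-dimension figure come from the PR notes; the dict name and lookup helper are illustrative):

```python
# Hardcoded because Venice's /models endpoint doesn't list embedding models.
VENICE_EMBEDDING_MODELS = {
    "text-embedding-ada-002": 1024,
    "text-embedding-3-small": 1024,
    "text-embedding-3-large": 1024,
    "text-embedding-bge-m3": 1024,
}


def embedding_dim(model: str) -> int:
    """Look up the output dimensionality of a known Venice embedding model."""
    try:
        return VENICE_EMBEDDING_MODELS[model]
    except KeyError:
        raise ValueError(f"unknown Venice embedding model: {model}")
```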
