Feature Request: Qwen Provider Integration #3101

@xiziqijoke

Summary
Add Qwen (Alibaba Cloud's large language model family) as a new LLM provider option in Letta, so users can run Qwen-series models for chat completions, streaming, tool calling, and embeddings generation.

Motivation

Qwen provides access to a series of high-performance LLMs (including Qwen-2, Qwen-1.5, etc.) through a unified, standardized API. Integrating Qwen as a provider would give Letta users:

  • Access to a diverse set of high-quality, Chinese-optimized models through a single provider

  • A standardized API format that follows common industry conventions, which keeps integration simple

  • Dynamic model discovery (models are listed from the Qwen API rather than hardcoded), so new model releases show up without code changes

  • Comprehensive support for core LLM capabilities: chat completions, streaming, tool calling, and embeddings generation

Proposed Solution

Implement a Qwen provider that:

  • Extends Letta's existing LLMClientBase following the established provider patterns

  • Supports all core LLM operations: chat completions, streaming responses, embeddings generation, and tool calling

  • Dynamically fetches and lists available Qwen models from the official Qwen API

  • Implements robust error handling and retry mechanisms for API stability

  • Maintains 100% test coverage (unit, integration, and E2E tests)
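
As an illustration of the retry bullet above, here is a minimal sketch of exponential backoff with jitter. It is not Letta's existing retry logic: `TransientAPIError`, `call_with_retry`, and the default values are hypothetical names chosen for this example, and which Qwen API errors count as retryable is an assumption that would need checking against the real API.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class TransientAPIError(Exception):
    """Hypothetical error type for retryable failures (e.g. rate limits, 5xx)."""


def call_with_retry(
    fn: Callable[[], T],
    max_attempts: int = 3,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn, retrying on TransientAPIError with exponential backoff and jitter."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except TransientAPIError:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error to the caller
            # Delay doubles each attempt (base, 2*base, 4*base, ...), plus up to 10% jitter.
            delay = base_delay * (2 ** (attempt - 1))
            sleep(delay + random.uniform(0, delay / 10))
    raise AssertionError("unreachable")
```

Passing `sleep` as a parameter keeps unit tests fast, since a test can record the delays instead of actually waiting.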

Benefits

  • Rich Model Ecosystem: Users gain access to Qwen's full model catalog, including general-purpose models, specialized models, and different parameter scales (e.g., Qwen-2-7B, Qwen-2-72B, Qwen-1.5-Chat)

  • Seamless Integration: Aligns with Letta's existing agent and LLM infrastructure, requiring no major adjustments to user workflows

  • Chinese-Optimized Support: Qwen models perform strongly in Chinese language understanding and generation, which extends Letta's usefulness to Chinese-language use cases

  • Dynamic Model Discovery: Automatically synchronizes new Qwen models via API, eliminating the need for manual code updates to add new models

  • Full Feature Parity: Supports all core capabilities required by Letta, including streaming and tool calling, ensuring a consistent user experience across providers

Implementation Details

The implementation would include:

  • QwenClient: An LLM client class extending LLMClientBase, responsible for interacting with Qwen's API

  • QwenProvider: A provider class for dynamic model listing and provider registration

  • Configuration support via environment variables or Letta's settings file (for API key management)

  • Comprehensive test coverage: unit tests for client logic, integration tests for API interaction, and E2E tests for end-to-end user workflows

  • Detailed documentation: configuration guide, supported models list, and usage examples

Configuration

Users would configure Qwen by setting the API key via environment variable or Letta's settings:

Via environment variable:

export QWEN_API_KEY="your-api-key"

Or in Letta's settings:

qwen_api_key = "your-api-key"
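
To make the two configuration paths concrete, here is a rough sketch of how the key could be resolved. The function name `resolve_qwen_api_key` is hypothetical, the settings object is modeled as a plain mapping rather than Letta's actual settings class, and giving the environment variable precedence over the settings value is an assumption, not something this proposal specifies.

```python
import os
from typing import Mapping, Optional


def resolve_qwen_api_key(
    settings: Mapping[str, str],
    environ: Optional[Mapping[str, str]] = None,
) -> str:
    """Return the Qwen API key, preferring QWEN_API_KEY over settings (assumed order)."""
    env = os.environ if environ is None else environ
    # Empty strings are treated as "not set" so a blank export does not mask settings.
    key = env.get("QWEN_API_KEY") or settings.get("qwen_api_key")
    if not key:
        raise ValueError("Qwen API key not set: define QWEN_API_KEY or qwen_api_key")
    return key
```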

Models would be automatically discovered and available in the format qwen/{model_id} (e.g., qwen/qwen-2-7b-chat, qwen/qwen-1.5-14b-chat).
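
The handle format could be produced from a model-listing response roughly as follows. This is a sketch under assumptions: the payload shape (`{"data": [{"id": ...}]}`) is a guess at a common listing format, not a confirmed Qwen API response, and `to_handle`, `parse_handle`, and `handles_from_listing` are hypothetical helper names.

```python
from typing import Any, Dict, List, Tuple

PROVIDER_PREFIX = "qwen"


def to_handle(model_id: str) -> str:
    """Build a handle of the form qwen/{model_id}."""
    return f"{PROVIDER_PREFIX}/{model_id}"


def parse_handle(handle: str) -> Tuple[str, str]:
    """Split a handle into (provider, model_id) at the first slash."""
    provider, sep, model_id = handle.partition("/")
    if not sep or not provider or not model_id:
        raise ValueError(f"invalid model handle: {handle!r}")
    return provider, model_id


def handles_from_listing(payload: Dict[str, Any]) -> List[str]:
    """Convert an assumed listing payload into handles, skipping entries without an id."""
    return [to_handle(item["id"]) for item in payload.get("data", []) if "id" in item]
```

Splitting at the first slash only means a model ID that itself contains a slash would still round-trip through `parse_handle`.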

Additional Context

Qwen provides a standardized RESTful API that uses common industry request/response formats, which means:

  • Chat completion and tool calling request/response formats are compatible with Letta's existing adapter patterns

  • Streaming responses use Server-Sent Events (SSE) format, consistent with Letta's existing streaming handling logic

  • Embeddings generation uses a dedicated, easy-to-integrate endpoint with clear input/output specifications

These characteristics make the integration of Qwen straightforward, low-cost, and maintainable for future iterations.
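
For reference, consuming an SSE stream could look roughly like the sketch below. It assumes each event arrives as a `data: {json}` line, that text deltas sit under `choices[].delta.content`, and that the stream ends with `data: [DONE]`. Those details follow a widely used chat-completions streaming convention and are assumptions here, not confirmed Qwen behavior; `iter_sse_text` is a hypothetical helper name.

```python
import json
from typing import Iterable, Iterator


def iter_sse_text(lines: Iterable[str]) -> Iterator[str]:
    """Yield text fragments from SSE lines in the assumed chunk format."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators, comments, and other SSE fields
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break  # assumed end-of-stream sentinel
        chunk = json.loads(data)
        for choice in chunk.get("choices", []):
            text = choice.get("delta", {}).get("content")
            if text:
                yield text
```

Because the function takes any iterable of lines, it can be fed from an HTTP client's line iterator in production and from a plain list in tests.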

Thank you!
