
feat: add LiteLLM as chat model provider for 100+ LLM backends #456

Open
RheagalFire wants to merge 4 commits into xorbitsai:main from RheagalFire:feat/add-litellm-provider

Conversation

@RheagalFire

Summary

Adds LiteLLM as a new chat model provider alongside OpenAI, Claude, DeepSeek, Gemini, and Zhipu, enabling access to 100+ LLM backends through LiteLLM's unified interface.

Changes

| File | What |
| --- | --- |
| src/xagent/core/model/chat/basic/litellm.py | New LiteLLM(BaseLLM) with async chat() and stream_chat(). Uses litellm.acompletion() with drop_params=True. Supports tool calling, response format, token usage tracking. Maps litellm exceptions to LLMRetryableError/LLMTimeoutError. |
| src/xagent/core/model/chat/basic/__init__.py | Import and export LiteLLM. |
| src/xagent/core/model/chat/basic/adapter.py | Added "litellm" provider branch in create_base_llm() factory. |
| tests/core/model/chat/basic/test_litellm.py | 20 unit tests covering init, chat, tool calling, error handling, factory. |
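
For readers unfamiliar with the exception-mapping pattern mentioned for litellm.py, here is a minimal sketch of the idea. It is not the code in this PR: LLMRetryableError and LLMTimeoutError are redefined as plain stand-ins, and ProviderTimeout, ProviderRateLimit and call_with_mapped_errors are hypothetical names used in place of the actual litellm exception classes and the real call site.

```python
import asyncio


# Stand-ins for the project's error types named in the PR description;
# the real classes live in xagent and may differ.
class LLMRetryableError(Exception):
    pass


class LLMTimeoutError(Exception):
    pass


# Hypothetical provider-side exceptions, standing in for whatever
# the backend library raises on timeouts and transient failures.
class ProviderTimeout(Exception):
    pass


class ProviderRateLimit(Exception):
    pass


async def call_with_mapped_errors(make_request):
    """Await make_request() and translate provider errors into project errors."""
    try:
        return await make_request()
    except ProviderTimeout as e:
        raise LLMTimeoutError(str(e)) from e
    except ProviderRateLimit as e:
        raise LLMRetryableError(str(e)) from e


async def _slow_backend():
    # Simulates a backend call that times out.
    raise ProviderTimeout("request timed out")


def demo():
    """Run the failing call and report which project error surfaced."""
    try:
        asyncio.run(call_with_mapped_errors(_slow_backend))
    except LLMTimeoutError as e:
        return type(e).__name__, str(e)
```

Chaining with `from e` keeps the original provider exception available on `__cause__`, which helps when debugging a specific backend.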

Tests

Unit tests -- 20 passed:

tests/core/model/chat/basic/test_litellm.py::TestLiteLLMInit          7 passed
tests/core/model/chat/basic/test_litellm.py::TestLiteLLMChat          7 passed
tests/core/model/chat/basic/test_litellm.py::TestLiteLLMToolCalling   1 passed
tests/core/model/chat/basic/test_litellm.py::TestLiteLLMErrors        4 passed
tests/core/model/chat/basic/test_litellm.py::TestLiteLLMFactory       1 passed
20 passed in 1.55s

Live e2e via Anthropic:

from xagent.core.model.chat.basic.litellm import LiteLLM
llm = LiteLLM(model_name='anthropic/claude-sonnet-4-6')
result = await llm.chat([{'role': 'user', 'content': 'What is 2+2?'}])
# Result: 4

Lint:

$ ruff check src/xagent/core/model/chat/basic/litellm.py
All checks passed!

Example usage

from xagent.core.model.chat.basic import LiteLLM

llm = LiteLLM(
    model_name="anthropic/claude-sonnet-4-6",
    # api_key omitted -- reads from ANTHROPIC_API_KEY env var
)

response = await llm.chat(
    messages=[{"role": "user", "content": "What is 2+2?"}],
    temperature=0.7,
)

Any LiteLLM model string works. See the LiteLLM docs for the full list.

@RheagalFire
Author

cc @yiboyasss

Contributor

@gemini-code-assist gemini-code-assist Bot left a comment


Code Review

This pull request introduces a new LiteLLM provider to the xagent core model, enabling access to numerous LLM providers through a unified interface. The implementation includes the LiteLLM class, updates to the model adapter factory, and a suite of unit tests. Feedback highlights several critical issues in the new implementation: the chat method fails to include the mandatory raw field in tool call responses, the stream_chat method uses incorrect chunk types and lacks support for tool calls, and there is a risk of an IndexError when accessing response choices without prior validation.

},
}
)
return {"type": "tool_call", "tool_calls": tool_calls}
Contributor


high

The tool call response dictionary is missing the raw field. According to the BaseLLM.chat interface documentation (see base.py line 163), the returned dictionary for tool calls must include the full response JSON in a raw field to allow for debugging and advanced processing.

            return {
                "type": "tool_call",
                "tool_calls": tool_calls,
                "raw": response.model_dump() if hasattr(response, "model_dump") else dict(response),
            }
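
As a rough illustration of the suggestion above, the raw payload can be factored into a small helper so the fallback logic lives in one place. This is a sketch only: dump_response and build_tool_call_result are hypothetical names, and fake_response is a stand-in object rather than a real LiteLLM response.

```python
from types import SimpleNamespace


def dump_response(response):
    """Return a plain dict for the response, preferring model_dump() when available."""
    if hasattr(response, "model_dump"):
        return response.model_dump()
    return dict(response)


def build_tool_call_result(response, tool_calls):
    """Assemble the tool-call return value, including the raw payload."""
    return {
        "type": "tool_call",
        "tool_calls": tool_calls,
        "raw": dump_response(response),
    }


# Hypothetical stand-in for a response object exposing model_dump().
fake_response = SimpleNamespace(model_dump=lambda: {"id": "resp-1", "choices": []})
result = build_tool_call_result(fake_response, [{"id": "call-1"}])
```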

Comment on lines +194 to +200
async for chunk in response:
    delta = chunk.choices[0].delta if chunk.choices else None
    if delta is None:
        continue
    content = getattr(delta, "content", None)
    if content:
        yield StreamChunk(type=ChunkType.TEXT, content=content)
Contributor


high

The stream_chat implementation has several issues that break consistency and functionality:

  1. It uses ChunkType.TEXT instead of the standard ChunkType.TOKEN used in the rest of the codebase (e.g., in BaseLLM.stream_chat).
  2. It omits the delta field in StreamChunk, which is expected by consumers.
  3. It completely ignores tool calls in the stream, which will cause them to be dropped silently during streaming.

Updating the loop to handle these cases ensures compatibility with the expected streaming behavior and supports tool calling in streaming mode.

        async for chunk in response:
            if not chunk.choices:
                continue
            delta = chunk.choices[0].delta

            # Handle text content
            content = getattr(delta, "content", None)
            if content:
                yield StreamChunk(type=ChunkType.TOKEN, content=content, delta=content)

            # Handle tool calls
            if hasattr(delta, "tool_calls") and delta.tool_calls:
                tool_calls = []
                for tc in delta.tool_calls:
                    tool_calls.append({
                        "index": getattr(tc, "index", 0),
                        "id": getattr(tc, "id", None),
                        "type": "function",
                        "function": {
                            "name": getattr(tc.function, "name", None),
                            "arguments": getattr(tc.function, "arguments", ""),
                        },
                    })
                yield StreamChunk(
                    type=ChunkType.TOOL_CALL,
                    tool_calls=tool_calls,
                    raw=chunk.model_dump() if hasattr(chunk, "model_dump") else dict(chunk)
                )
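
One consequence of emitting tool calls per chunk is that a consumer may receive each call in pieces. The sketch below shows how such fragments could be merged by index, assuming OpenAI-style deltas in which the id and function name arrive once and the arguments string arrives in parts. merge_tool_call_fragments is a hypothetical helper, not part of xagent or LiteLLM, and the fragment dicts mirror the shape built in the suggestion above.

```python
def merge_tool_call_fragments(fragments):
    """Merge streamed tool-call fragments into complete calls, keyed by index."""
    merged = {}
    for frag in fragments:
        call = merged.setdefault(
            frag.get("index", 0),
            {"id": None, "type": "function", "function": {"name": None, "arguments": ""}},
        )
        # The id and name are assumed to appear once; keep the first non-empty value seen.
        if frag.get("id"):
            call["id"] = frag["id"]
        fn = frag.get("function") or {}
        if fn.get("name"):
            call["function"]["name"] = fn["name"]
        # Argument text is assumed to arrive in order, so plain concatenation suffices.
        call["function"]["arguments"] += fn.get("arguments") or ""
    return [merged[i] for i in sorted(merged)]


# Hypothetical fragments for a single tool call split across three chunks.
fragments = [
    {"index": 0, "id": "call-1", "function": {"name": "add", "arguments": ""}},
    {"index": 0, "id": None, "function": {"name": None, "arguments": '{"a": 2,'}},
    {"index": 0, "id": None, "function": {"name": None, "arguments": ' "b": 2}'}},
]
calls = merge_tool_call_fragments(fragments)
```

Whether this merging belongs in the provider or in the consumer is a design choice for the maintainers; the sketch only shows that the index field is what makes it possible.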

) as e:
raise LLMRetryableError(str(e)) from e

choice = response.choices[0]
Contributor


medium

The code accesses response.choices[0] without verifying that choices is not empty. While LiteLLM typically returns at least one choice on success, certain edge cases (like content filtering or provider-specific errors) could result in an empty list, leading to an IndexError.

Suggested change
- choice = response.choices[0]
+ if not response.choices:
+     raise LLMRetryableError("LiteLLM returned an empty response (no choices).")
+ choice = response.choices[0]
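
The guard can also be exercised in isolation. In this minimal sketch, LLMRetryableError is a local stand-in for the project's class, first_choice is a hypothetical helper name, and SimpleNamespace objects stand in for LiteLLM responses.

```python
from types import SimpleNamespace


class LLMRetryableError(Exception):
    """Stand-in for the project's retryable error type."""


def first_choice(response):
    """Return the first choice, or raise a retryable error if there is none."""
    choices = getattr(response, "choices", None)
    if not choices:
        raise LLMRetryableError("LiteLLM returned an empty response (no choices).")
    return choices[0]


def is_retryable(response):
    """Report whether reading the first choice would raise the retryable error."""
    try:
        first_choice(response)
    except LLMRetryableError:
        return True
    return False
```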

@qinxuye
Contributor

qinxuye commented May 20, 2026

Could you fix the lint errors and resolve the Gemini review comments?
