
feat(llm): add LiteLLM as AI gateway provider #1593

Merged
itomek merged 5 commits into amd:main from RheagalFire:feat/add-litellm-provider on Jun 12, 2026

Conversation

@RheagalFire

Contributor

Summary

Adds LiteLLM as a fourth LLM provider alongside Lemonade, OpenAI, and Claude, giving GAIA users access to 100+ cloud providers (Bedrock, Vertex AI, Groq, DeepSeek, Azure OpenAI, etc.) through a single create_client("litellm") call.

Why

GAIA's LLMClient abstraction covers local inference (Lemonade) and two cloud providers (OpenAI, Claude). Adding providers individually doesn't scale; LiteLLM is one dependency that covers 100+ providers with drop_params=True for cross-provider kwarg compatibility.

Changes

  • src/gaia/llm/providers/litellm.py -- new LiteLLMProvider(LLMClient) with generate(), chat(), embed(), streaming, and drop_params=True default
  • src/gaia/llm/factory.py -- registered "litellm" in _PROVIDERS
  • src/gaia/llm/providers/__init__.py -- export LiteLLMProvider
  • setup.py -- added [litellm] optional extra (litellm>=1.35.0,<2.0)
  • tests/unit/test_litellm_provider.py -- 10 unit tests
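
The factory registration listed above can be pictured with a minimal sketch. Everything below is a hypothetical stand-in rather than GAIA's actual code: the stub classes, the shape of the `_PROVIDERS` mapping, and the body of `create_client` are assumptions made for illustration. Only the `_PROVIDERS` name, the `"litellm"` key, and the case-insensitive lookup are taken from this PR.

```python
# Hypothetical stand-ins; GAIA's real classes live under src/gaia/llm/.
class LLMClient:
    def __init__(self, **kwargs):
        self.options = kwargs


class LemonadeProvider(LLMClient):
    pass


class LiteLLMProvider(LLMClient):
    pass


# Assumed shape: provider name -> class. The real _PROVIDERS may differ.
_PROVIDERS = {
    "lemonade": LemonadeProvider,
    "litellm": LiteLLMProvider,
}


def create_client(provider: str, **kwargs) -> LLMClient:
    """Look up a provider class by name (case-insensitively) and instantiate it."""
    key = provider.lower()
    if key not in _PROVIDERS:
        known = ", ".join(sorted(_PROVIDERS))
        raise ValueError(f"Unknown provider {provider!r}; expected one of: {known}")
    return _PROVIDERS[key](**kwargs)


# Mixed-case names resolve to the same registry entry.
client = create_client("LiteLLM", model="anthropic/claude-sonnet-4-6")
```

With a registry like this, adding a provider is a one-line change to the mapping, which is presumably why the factory diff in this PR is so small.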

Test plan

  • python -m pytest tests/unit/test_litellm_provider.py -v -- 10/10 pass
  • python -m pytest tests/unit/test_llm_client_factory.py tests/unit/test_openai_provider.py -v -- existing LLM tests still pass (76 passed)
  • python util/lint.py --all --fix -- clean
  • Live E2E: create_client("litellm") -> LiteLLM proxy -> Azure Foundry (Claude Sonnet 4.6):
Provider: LiteLLM
Generate response: '4'
Chat response: 'OK'
Stream chunks: 2 chunks, text: 'Hello! ...'
=== E2E PASSED ===

Checklist

  • I have linked a GitHub issue above (Closes #N / Fixes #N / Refs #N).
  • I have described why this change is being made, not just what changed.
  • I have run linting and tests locally (python util/lint.py --all, pytest tests/unit/).
  • I have updated documentation if user-visible behavior changed (see CONTRIBUTING.md).

@RheagalFire

Contributor Author

cc @kovtcharov-amd

github-actions bot added the dependencies (Dependency updates), llm (LLM backend changes), tests (Test changes), and performance (Performance-critical changes) labels on Jun 11, 2026
@github-actions

Contributor

Summary

Clean, well-scoped addition of a LiteLLM provider that closely mirrors the existing OpenAIProvider pattern — wired into the factory, package exports, and an optional [litellm] extra, with solid test coverage on the chat/generate paths. The headline chat()/generate() flow is correct and the lazy import litellm keeps the dependency truly optional. The one thing to fix before merge: embed() crashes with a TypeError whenever a caller overrides the model, and that path has zero test coverage.
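
For readers unfamiliar with the lazy-import pattern mentioned above, here is a minimal sketch of the idea. It is an illustration under assumptions, not the PR's code: `_load_optional`, `LazyProvider`, and the error message are hypothetical names, and the standard-library `json` module stands in for an installed optional backend.

```python
import importlib


def _load_optional(module_name: str, extra: str):
    """Import a module only when it is needed (hypothetical helper)."""
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        raise ImportError(
            f"{module_name!r} is not installed; install the [{extra}] optional extra to use it."
        ) from exc


class LazyProvider:
    """Stand-in provider: constructing it never imports the optional dependency."""

    def __init__(self, module_name: str, extra: str):
        self._module_name = module_name
        self._extra = extra

    def backend(self):
        # The import happens on first use, not when the package is imported.
        return _load_optional(self._module_name, self._extra)


# "json" stands in for a backend that happens to be installed.
provider = LazyProvider("json", extra="example")
# Constructing a provider for a missing backend succeeds; only use fails.
missing = LazyProvider("surely_not_installed_backend_xyz", extra="example")
```

Calling `missing.backend()` raises `ImportError` with the install hint, while merely constructing `missing` does not, which is what keeps core installs free of the optional dependency.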

Issues

🟡 embed() raises TypeError on model override (src/gaia/llm/providers/litellm.py, embed)

call_kwargs is built from **kwargs before model is popped, so when a caller does embed(texts, model="...") the model lands in both the explicit model= arg and **call_kwargs → litellm.embedding() got multiple values for keyword argument 'model'. kwargs.pop() mutates kwargs, not the already-copied call_kwargs. Pop first, then build call_kwargs:

    def embed(self, texts: list[str], **kwargs) -> list[list[float]]:
        import litellm

        model = kwargs.pop("model", self._model)
        call_kwargs = {**self._extra_kwargs, **kwargs}
        if self._api_key:
            call_kwargs["api_key"] = self._api_key

        response = litellm.embedding(
            model=model,
            input=texts,
            drop_params=True,
            **call_kwargs,
        )
        return [item["embedding"] for item in response.data]

There's no test exercising embed() at all — please add one (mirroring test_chat_*) covering both the default-model and override-model cases so this doesn't regress.
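
The failure mode can be reproduced without LiteLLM installed. The sketch below uses a hypothetical `fake_embedding` function as a stand-in for the real call; the function names and model strings are assumptions chosen for illustration, and only the ordering of the copy and the pop mirrors the issue described above.

```python
def fake_embedding(model, input, **kwargs):
    """Stand-in for an embedding call that takes `model` as a named parameter."""
    return {"model": model, "count": len(input), "extra": kwargs}


def embed_buggy(texts, default_model="default-model", **kwargs):
    call_kwargs = {**kwargs}                    # the copy still contains "model"
    model = kwargs.pop("model", default_model)  # pops from kwargs only
    return fake_embedding(model=model, input=texts, **call_kwargs)


def embed_fixed(texts, default_model="default-model", **kwargs):
    model = kwargs.pop("model", default_model)  # pop first...
    call_kwargs = {**kwargs}                    # ...then copy
    return fake_embedding(model=model, input=texts, **call_kwargs)


# The buggy version only fails when the caller overrides the model.
try:
    embed_buggy(["hello"], model="override-model")
    buggy_error = None
except TypeError as exc:
    buggy_error = str(exc)

fixed_result = embed_fixed(["hello"], model="override-model")
```

Note that `embed_buggy(["hello"])` without an override succeeds, which is why the bug stays hidden until a test exercises the override path.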

🟡 No documentation update (docs/sdk/sdks/llm.mdx)

CLAUDE.md requires every new feature to be documented, and the PR checklist leaves the docs box unchecked. The "Cloud Providers" section lists Claude and OpenAI but not LiteLLM. Add a short snippet there:

# LiteLLM gateway (100+ providers)
llm = create_client("litellm", model="anthropic/claude-sonnet-4-6")

🟢 Factory docstring not updated (src/gaia/llm/factory.py:27)

provider: is still documented as ("lemonade", "openai", or "claude") — add litellm so the listed values match _PROVIDERS.

🟢 Redundant drop_params (src/gaia/llm/providers/litellm.py)

drop_params=True is set globally in __init__ (litellm.drop_params = True) and passed per-call in chat/embed. Harmless, but it also means a caller who passes drop_params via **kwargs would hit a duplicate-keyword error. The per-call arg is the safer one to keep; the global mutation can be dropped.
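
One way to keep a per-call default that callers can still override is `dict.setdefault`. The sketch below is an assumption about how that could look, not the code that was merged: `fake_completion`, `chat_duplicate`, and `chat_overridable` are hypothetical stand-ins.

```python
def fake_completion(model, messages, **kwargs):
    """Stand-in for a completion call; echoes the extra kwargs it received."""
    return kwargs


def chat_duplicate(messages, **kwargs):
    # An explicit drop_params=True collides with a caller-supplied drop_params.
    return fake_completion(model="m", messages=messages, drop_params=True, **kwargs)


def chat_overridable(messages, **kwargs):
    call_kwargs = dict(kwargs)
    # Apply the default only if the caller did not set the flag.
    call_kwargs.setdefault("drop_params", True)
    return fake_completion(model="m", messages=messages, **call_kwargs)
```

Here `chat_overridable([], drop_params=False)` passes `False` through, whereas `chat_duplicate([], drop_params=False)` raises the duplicate-keyword `TypeError` described above.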

Strengths

  • Faithful reuse of the OpenAIProvider shape — generate() delegating to chat(), identical _handle_stream, and vision()/load_model() correctly left to inherit NotSupportedError rather than reimplemented.
  • Properly registered in all three places (_PROVIDERS, __init__ export, __all__) and gated behind an optional extra, so core installs stay lean.
  • Good test coverage on the primary paths: factory wiring (incl. case-insensitivity), system-prompt prepend, api_key omission, and model override are all asserted.

Verdict

Approve with suggestions — the chat path is correct and well-tested. Fix the embed() model-override bug (with a test) and add the brief docs mention before merge; the rest are minor nits.

@github-actions

Contributor

🟡 src/gaia/llm/providers/litellm.py:129: embed() will raise TypeError if a caller passes model in kwargs

call_kwargs = {**self._extra_kwargs, **kwargs}   # model lands in call_kwargs
...
response = litellm.embedding(
    model=kwargs.pop("model", self._model),       # also passed explicitly
    **call_kwargs,                                # duplicate keyword argument → TypeError
)

call_kwargs is built from kwargs before the pop, so model ends up in both places. Fix: pop from call_kwargs, not kwargs:

        call_kwargs = {**self._extra_kwargs, **kwargs}
        if self._api_key:
            call_kwargs["api_key"] = self._api_key
        model = call_kwargs.pop("model", self._model)
        response = litellm.embedding(
            model=model,
            input=texts,
            drop_params=True,
            **call_kwargs,
        )

There's also no test covering embed() at all — worth adding one since this path would have caught the bug.

@kovtcharov-amd

Collaborator

Thanks for the contribution, @RheagalFire! Could you please address the comments from the Claude bot and fix the failing lint tests? Then we'll get this into the next release!

itomek enabled auto-merge on June 12, 2026 13:25
@github-actions

Contributor

🟡 src/gaia/llm/providers/litellm.py:79-88: embed() will raise TypeError when a model override is passed.

call_kwargs is built from **kwargs on line 79, so if the caller passes model="text-embedding-ada-002", call_kwargs already contains it. Then kwargs.pop("model", ...) on line 84 removes it from kwargs but not from call_kwargs, so litellm.embedding() receives model= both as an explicit argument and inside **call_kwargs → TypeError: got multiple values for keyword argument 'model'.

        call_kwargs = {**self._extra_kwargs, **kwargs}
        if self._api_key:
            call_kwargs["api_key"] = self._api_key
        model_name = call_kwargs.pop("model", self._model)

        response = litellm.embedding(
            model=model_name,
            input=texts,
            drop_params=True,
            **call_kwargs,
        )

Also: the test file has no test for embed() at all — CLAUDE.md requires tests for every new feature.

itomek added this pull request to the merge queue on Jun 12, 2026
Merged via the queue into amd:main with commit 881ebcf Jun 12, 2026
40 checks passed
alexey-tyurin pushed a commit to alexey-tyurin/gaia that referenced this pull request Jun 12, 2026
…view items (amd#1626)

Closes amd#1625

## Why this matters
After amd#1593 added the LiteLLM provider, `LiteLLMProvider.embed()`
crashed with `TypeError: embedding() got multiple values for keyword
argument 'model'` the moment a caller overrode the model — `model` was
passed both explicitly and inside the spread `**call_kwargs`, and that
path had zero test coverage. Now `embed()` works with and without a
model override, and the rest of the amd#1593 review is closed out: the
provider is documented, the `create_client` docstring lists it, and the
redundant `drop_params` handling is collapsed to a single per-call
default callers can override.

## Test plan
- [x] `pytest tests/unit/test_litellm_provider.py` → 12 passed (incl. 2
new `embed()` tests; the override test reproduced the `TypeError` before
the fix)
- [x] `util/lint.py --black --isort --flake8` → all PASS
- [ ] CI green after maintainer approves the workflow run

---------

Co-authored-by: Tomasz Iniewicz <heaters-nays0p@icloud.com>

Labels

dependencies (Dependency updates), llm (LLM backend changes), performance (Performance-critical changes), tests (Test changes)

3 participants