
fix(tests): use class-level AsyncHTTPHandler mock in vertex GPT-OSS tests#21428

Merged
jquinter merged 2 commits into main from fix/vertex-gpt-oss-test-isolation on Feb 18, 2026

Conversation

@jquinter
Contributor

Summary

  • Fixes test_vertex_ai_gpt_oss_simple_request and test_vertex_ai_gpt_oss_reasoning_effort CI failures with 401 ACCESS_TOKEN_TYPE_UNSUPPORTED
  • Replaces instance-level patch.object(client, "post", side_effect=...) with class-level patch("litellm.llms.custom_httpx.http_handler.AsyncHTTPHandler") and AsyncMock
  • Removes client=client argument from litellm.acompletion() calls so the internally-created handler is intercepted by the class-level mock

Root Cause

The original tests patched the post method on a specific AsyncHTTPHandler instance and passed client=client through to acompletion(). In CI (where real Google credentials exist), the mock wasn't reliably intercepting HTTP calls, resulting in real requests to aiplatform.googleapis.com and 401 ACCESS_TOKEN_TYPE_UNSUPPORTED errors.
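
The failure mode can be reproduced in miniature. The sketch below is illustrative only (a stand-in Handler class and library_call function, not litellm's actual AsyncHTTPHandler or code path): an instance-level patch covers only calls routed through that exact object, so a handler the library constructs internally slips past the mock.

```python
import asyncio
from unittest.mock import AsyncMock, patch


class Handler:
    """Illustrative stand-in for AsyncHTTPHandler (not litellm's class)."""

    async def post(self, url):
        return "REAL-CALL"  # in the real tests, this would hit the network


async def library_call(client=None):
    # Mirrors the litellm code path: if no client is passed in,
    # the library constructs its own handler internally.
    handler = client if client is not None else Handler()
    return await handler.post("https://example.invalid")


async def main():
    client = Handler()
    # Instance-level patch: only this specific object's post is mocked.
    with patch.object(client, "post", AsyncMock(return_value="MOCKED")):
        via_client = await library_call(client=client)  # intercepted
        internal = await library_call()                 # NOT intercepted
    return via_client, internal


print(asyncio.run(main()))  # ('MOCKED', 'REAL-CALL')
```

Passing client=client keeps the test green only while every request really flows through that one object; any internally created handler makes a real call, which is what surfaced in CI.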

Fix

Follow the same pattern used in test_vertex_gemma_transformation.py:

  • Patch AsyncHTTPHandler at the class level
  • Set mock_http_handler.return_value.post = AsyncMock(return_value=mock_response)
  • Do not pass client=client to acompletion() — let the code create its own instance, which is intercepted by the class-level mock
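
The same miniature setup suggests why the class-level variant holds: patching the class name replaces the constructor itself, so even instances created inside the library are mocks. Everything here is an illustrative stand-in; the http_handler namespace plays the role of litellm.llms.custom_httpx.http_handler, and patch.object on it is the analogue of the string-path patch used in the tests.

```python
import asyncio
import types
from unittest.mock import AsyncMock, patch


class Handler:
    """Illustrative stand-in for AsyncHTTPHandler (not litellm's class)."""

    async def post(self, url):
        return "REAL-CALL"


# Stand-in for the litellm.llms.custom_httpx.http_handler module.
http_handler = types.SimpleNamespace(Handler=Handler)


async def library_call():
    # Like litellm's internal path: the handler is constructed inside the
    # library by name, so the test never holds a reference to it.
    return await http_handler.Handler().post("https://example.invalid")


async def main():
    # Class-level patch: replace the name on the module, the analogue of
    # patch("litellm.llms.custom_httpx.http_handler.AsyncHTTPHandler").
    with patch.object(http_handler, "Handler") as mock_cls:
        mock_cls.return_value.post = AsyncMock(return_value="MOCKED")
        return await library_call()


print(asyncio.run(main()))  # MOCKED
```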

Test plan

  • All 4 tests in test_vertex_ai_gpt_oss_transformation.py pass locally
  • No production code changed — test-only fix

🤖 Generated with Claude Code

…ests

Replace instance-level patch.object(client, "post", side_effect=...) with
class-level patch of AsyncHTTPHandler and AsyncMock to reliably intercept
HTTP calls in CI where real Google credentials are available.

The old approach patched a specific instance's post method and passed
client=client to acompletion(). In CI, the mock wasn't intercepting actual
HTTP calls, causing 401 ACCESS_TOKEN_TYPE_UNSUPPORTED errors. The new
approach patches AsyncHTTPHandler at the class level so any instance
created internally by get_async_httpx_client() is also mocked.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@vercel

vercel Bot commented Feb 17, 2026

The latest updates on your projects:

Project: litellm
Deployment: Ready
Actions: Preview, Comment
Updated (UTC): Feb 18, 2026 0:04am


@greptile-apps
Contributor

greptile-apps Bot commented Feb 17, 2026

Greptile Summary

This PR fixes flaky CI failures in the Vertex AI GPT-OSS tests by switching from instance-level patch.object(client, "post") to class-level patch("...AsyncHTTPHandler"), matching the pattern used in test_vertex_gemma_transformation.py. The client=client argument is removed from acompletion() calls so the internally-created handler is intercepted by the mock. No production code is changed.

  • Mocking approach improved: Replaces fragile instance-level mock with class-level AsyncHTTPHandler mock and AsyncMock, preventing real HTTP calls to aiplatform.googleapis.com in CI environments with Google credentials
  • Missing cache flush fixture: The reference test (test_vertex_gemma_transformation.py) includes a _reset_litellm_http_client_cache autouse fixture that flushes litellm.in_memory_llm_clients_cache before each test. This file omits that fixture, which could cause the same class of flaky failures when a cached real HTTP client from an earlier test bypasses the AsyncHTTPHandler class mock
  • Unused import cleanup: Removes the httpx import that was only needed for the old AsyncHTTPHandler() instantiation

Confidence Score: 3/5

  • Test-only change that improves mock reliability but has a gap in test isolation that could cause intermittent failures
  • The core approach is correct and follows established patterns in the codebase. However, the missing _reset_litellm_http_client_cache fixture — which the reference test explicitly includes — creates a risk of intermittent CI failures when a cached real HTTP client bypasses the class-level mock. This is the exact same category of issue (real HTTP calls leaking through mocks) that this PR aims to fix.
  • test_vertex_ai_gpt_oss_transformation.py needs the _reset_litellm_http_client_cache fixture added to ensure reliable test isolation

Important Files Changed

Filename: tests/test_litellm/llms/vertex_ai/vertex_ai_partner_models/gpt_oss/test_vertex_ai_gpt_oss_transformation.py
Overview: Switches to class-level AsyncHTTPHandler mock (matching the reference pattern), but is missing the _reset_litellm_http_client_cache fixture that the reference test uses, which could cause test failures when cached HTTP clients bypass the mock.

Flowchart

flowchart TD
    A["test calls litellm.acompletion()"] --> B["VertexAIPartnerModels.completion()"]
    B --> C["BaseLLMHTTPHandler.completion()"]
    C --> D{client is None?}
    D -->|Yes| E["get_async_httpx_client()"]
    E --> F{Cached client exists?}
    F -->|Yes - cache hit| G["Returns cached real client ⚠️"]
    F -->|No - cache miss| H["AsyncHTTPHandler()"]
    H --> I["Class-level mock intercepts ✅"]
    I --> J["Mock returns fake response"]
    G --> K["Real HTTP call - test may fail ❌"]
    D -->|No - old approach| L["Uses passed client instance"]

Last reviewed commit: f0fc44c


@greptile-apps greptile-apps Bot left a comment


1 file reviewed, 1 comment


@greptile-apps
Contributor

greptile-apps Bot commented Feb 17, 2026

Additional Comments (1)

tests/test_litellm/llms/vertex_ai/vertex_ai_partner_models/gpt_oss/test_vertex_ai_gpt_oss_transformation.py
Missing HTTP client cache flush fixture

The reference test file (test_vertex_gemma_transformation.py) includes a _reset_litellm_http_client_cache fixture that flushes litellm.in_memory_llm_clients_cache before each test. This file is missing this fixture, which can cause intermittent CI failures.

Here's why: the GPT-OSS code path calls get_async_httpx_client() (in llm_http_handler.py:299), which checks the in_memory_llm_clients_cache before calling AsyncHTTPHandler(). With asyncio_default_fixture_loop_scope = "session" in pyproject.toml, the event loop is shared across the entire test session. If any earlier test in the session causes a real AsyncHTTPHandler to be cached for the vertex_ai provider key, the class-level AsyncHTTPHandler mock will be bypassed because the cached (real) client is returned directly.

Consider adding the same fixtures used in test_vertex_gemma_transformation.py:

import os

import pytest


@pytest.fixture(autouse=True)
def clean_vertex_env():
    """Clear Google/Vertex AI environment variables before each test to prevent test isolation issues."""
    saved_env = {}
    env_vars_to_clear = [
        "GOOGLE_APPLICATION_CREDENTIALS",
        "GOOGLE_CLOUD_PROJECT",
        "VERTEXAI_PROJECT",
        "VERTEXAI_LOCATION",
        "VERTEXAI_CREDENTIALS",
        "VERTEX_PROJECT",
        "VERTEX_LOCATION",
        "VERTEX_AI_PROJECT",
    ]
    for var in env_vars_to_clear:
        if var in os.environ:
            saved_env[var] = os.environ[var]
            del os.environ[var]

    yield

    # Restore saved environment variables
    for var, value in saved_env.items():
        os.environ[var] = value


@pytest.fixture(autouse=True)
def _reset_litellm_http_client_cache():
    """Ensure each test gets a fresh async HTTP client mock."""
    from litellm import in_memory_llm_clients_cache

    in_memory_llm_clients_cache.flush_cache()
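
The bypass-via-cache mechanism can also be sketched in isolation. Everything below is an illustrative stand-in (a toy _cache, Handler, and get_client() playing the roles of litellm.in_memory_llm_clients_cache, AsyncHTTPHandler, and get_async_httpx_client()), not litellm's real implementation: a previously cached real client is returned without constructing anything, so the class-level patch never fires until the cache is flushed.

```python
import asyncio
import types
from unittest.mock import AsyncMock, patch

_cache = {}  # toy stand-in for litellm.in_memory_llm_clients_cache


class Handler:
    """Illustrative stand-in for AsyncHTTPHandler (not litellm's class)."""

    async def post(self, url):
        return "REAL-CALL"


# Stand-in for the litellm.llms.custom_httpx.http_handler module.
http_handler = types.SimpleNamespace(Handler=Handler)


def get_client():
    # Mirrors get_async_httpx_client(): return a cached handler if one
    # exists; otherwise construct (and cache) a new one.
    if "vertex_ai" not in _cache:
        _cache["vertex_ai"] = http_handler.Handler()
    return _cache["vertex_ai"]


async def main():
    _cache.clear()
    get_client()  # an earlier test leaves a real client in the cache
    with patch.object(http_handler, "Handler") as mock_cls:
        mock_cls.return_value.post = AsyncMock(return_value="MOCKED")
        stale = await get_client().post("u")  # cache hit: mock bypassed
        _cache.clear()                        # what the flush fixture does
        fresh = await get_client().post("u")  # fresh instance is the mock
    return stale, fresh


print(asyncio.run(main()))  # ('REAL-CALL', 'MOCKED')
```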

Add _reset_litellm_http_client_cache autouse fixture (matching
test_vertex_gemma_transformation.py) to flush in_memory_llm_clients_cache
before each test. Without this, a cached real AsyncHTTPHandler from an
earlier test could bypass the class-level mock and cause real HTTP calls.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@jquinter
Contributor Author

@greptile-apps re-review this PR

@greptile-apps
Contributor

greptile-apps Bot commented Feb 18, 2026

Greptile Summary

This PR fixes flaky CI failures (401 ACCESS_TOKEN_TYPE_UNSUPPORTED) in two Vertex AI GPT-OSS tests by switching from instance-level HTTP mock patching to the class-level AsyncHTTPHandler mock pattern already established in test_vertex_gemma_transformation.py.

  • Replaces patch.object(client, "post", side_effect=...) + client=client passthrough with patch("litellm.llms.custom_httpx.http_handler.AsyncHTTPHandler") class-level mock
  • Adds _reset_litellm_http_client_cache autouse fixture to flush in_memory_llm_clients_cache between tests, preventing stale cached HTTP clients from bypassing the mock
  • Removes unused httpx import and unnecessary VertexAIError patch
  • All changes are test-only — no production code modified

Confidence Score: 5/5

  • This PR is safe to merge — it is a test-only change that aligns with an established mocking pattern in the codebase.
  • Score of 5 reflects: (1) no production code changes, (2) the new pattern exactly matches the proven approach in test_vertex_gemma_transformation.py, (3) the cache flush fixture prevents test isolation issues, and (4) all four tests in the file are designed to be mock-only with no real network calls.
  • No files require special attention.

Important Files Changed

Filename: tests/test_litellm/llms/vertex_ai/vertex_ai_partner_models/gpt_oss/test_vertex_ai_gpt_oss_transformation.py
Overview: Replaces fragile instance-level HTTP mock with class-level AsyncHTTPHandler patch, adds cache flush fixture, and removes client passthrough — matching the established pattern in test_vertex_gemma_transformation.py. No issues found.

Flowchart

flowchart TD
    A["Test calls litellm.acompletion()"] --> B["Class-level patch intercepts\nAsyncHTTPHandler construction"]
    B --> C["_reset_litellm_http_client_cache\nflushes cached clients"]
    C --> D["New AsyncHTTPHandler instance\ncreated internally"]
    D --> E["mock_http_handler.return_value.post\nreturns AsyncMock response"]
    E --> F["Test verifies URL, request body,\nand response structure"]
    
    style B fill:#d4edda,stroke:#155724
    style C fill:#d4edda,stroke:#155724
    style E fill:#d4edda,stroke:#155724

Last reviewed commit: ea0cfac


@greptile-apps greptile-apps Bot left a comment


1 file reviewed, no comments


@jquinter jquinter merged commit faa16ef into main Feb 18, 2026
17 of 25 checks passed
@ishaan-berri ishaan-berri deleted the fix/vertex-gpt-oss-test-isolation branch March 26, 2026 22:29