
fix(tests): use class-level AsyncHTTPHandler mock in vertex GPT-OSS tests#21428

Merged
jquinter merged 2 commits into main from fix/vertex-gpt-oss-test-isolation on Feb 18, 2026

Conversation

@jquinter
Contributor

Summary

  • Fixes test_vertex_ai_gpt_oss_simple_request and test_vertex_ai_gpt_oss_reasoning_effort CI failures with 401 ACCESS_TOKEN_TYPE_UNSUPPORTED
  • Replaces instance-level patch.object(client, "post", side_effect=...) with class-level patch("litellm.llms.custom_httpx.http_handler.AsyncHTTPHandler") and AsyncMock
  • Removes client=client argument from litellm.acompletion() calls so the internally-created handler is intercepted by the class-level mock

Root Cause

The original tests patched the post method on a specific AsyncHTTPHandler instance and passed client=client through to acompletion(). In CI (where real Google credentials exist), the mock wasn't reliably intercepting HTTP calls, resulting in real requests to aiplatform.googleapis.com and 401 ACCESS_TOKEN_TYPE_UNSUPPORTED errors.
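
The failure mode can be reproduced in miniature. The sketch below is illustrative only (a stand-in Handler class and library_call function, not litellm's actual AsyncHTTPHandler or code path): an instance-level patch covers only calls routed through that exact object, so a handler the library constructs internally slips past the mock.

```python
import asyncio
from unittest.mock import AsyncMock, patch


class Handler:
    """Illustrative stand-in for AsyncHTTPHandler (not litellm's class)."""

    async def post(self, url):
        return "REAL-CALL"  # in the real tests, this would hit the network


async def library_call(client=None):
    # Mirrors the litellm code path: if no client is passed in,
    # the library constructs its own handler internally.
    handler = client if client is not None else Handler()
    return await handler.post("https://example.invalid")


async def main():
    client = Handler()
    # Instance-level patch: only this specific object's post is mocked.
    with patch.object(client, "post", AsyncMock(return_value="MOCKED")):
        via_client = await library_call(client=client)  # intercepted
        internal = await library_call()                 # NOT intercepted
    return via_client, internal


print(asyncio.run(main()))  # ('MOCKED', 'REAL-CALL')
```

Passing client=client keeps the test green only while every request really flows through that one object; any internally created handler makes a real call, which is what surfaced in CI.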

Fix

Follow the same pattern used in test_vertex_gemma_transformation.py:

  • Patch AsyncHTTPHandler at the class level
  • Set mock_http_handler.return_value.post = AsyncMock(return_value=mock_response)
  • Do not pass client=client to acompletion() — let the code create its own instance, which is intercepted by the class-level mock
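
The same miniature setup suggests why the class-level variant holds: patching the class name replaces the constructor itself, so even instances created inside the library are mocks. Everything here is an illustrative stand-in; the http_handler namespace plays the role of litellm.llms.custom_httpx.http_handler, and patch.object on it is the analogue of the string-path patch used in the tests.

```python
import asyncio
import types
from unittest.mock import AsyncMock, patch


class Handler:
    """Illustrative stand-in for AsyncHTTPHandler (not litellm's class)."""

    async def post(self, url):
        return "REAL-CALL"


# Stand-in for the litellm.llms.custom_httpx.http_handler module.
http_handler = types.SimpleNamespace(Handler=Handler)


async def library_call():
    # Like litellm's internal path: the handler is constructed inside the
    # library by name, so the test never holds a reference to it.
    return await http_handler.Handler().post("https://example.invalid")


async def main():
    # Class-level patch: replace the name on the module, the analogue of
    # patch("litellm.llms.custom_httpx.http_handler.AsyncHTTPHandler").
    with patch.object(http_handler, "Handler") as mock_cls:
        mock_cls.return_value.post = AsyncMock(return_value="MOCKED")
        return await library_call()


print(asyncio.run(main()))  # MOCKED
```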

Test plan

  • All 4 tests in test_vertex_ai_gpt_oss_transformation.py pass locally
  • No production code changed — test-only fix

🤖 Generated with Claude Code

…ests

Replace instance-level patch.object(client, "post", side_effect=...) with
class-level patch of AsyncHTTPHandler and AsyncMock to reliably intercept
HTTP calls in CI where real Google credentials are available.

The old approach patched a specific instance's post method and passed
client=client to acompletion(). In CI, the mock wasn't intercepting actual
HTTP calls, causing 401 ACCESS_TOKEN_TYPE_UNSUPPORTED errors. The new
approach patches AsyncHTTPHandler at the class level so any instance
created internally by get_async_httpx_client() is also mocked.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@vercel

vercel Bot commented Feb 17, 2026

The latest updates on your projects:

Project: litellm
Deployment: Ready
Actions: Preview, Comment
Updated (UTC): Feb 18, 2026 0:04am


@greptile-apps
Contributor

greptile-apps Bot commented Feb 17, 2026

Greptile Summary

This PR fixes flaky CI failures in the Vertex AI GPT-OSS tests by switching from instance-level patch.object(client, "post") to class-level patch("...AsyncHTTPHandler"), matching the pattern used in test_vertex_gemma_transformation.py. The client=client argument is removed from acompletion() calls so the internally-created handler is intercepted by the mock. No production code is changed.

  • Mocking approach improved: Replaces fragile instance-level mock with class-level AsyncHTTPHandler mock and AsyncMock, preventing real HTTP calls to aiplatform.googleapis.com in CI environments with Google credentials
  • Missing cache flush fixture: The reference test (test_vertex_gemma_transformation.py) includes a _reset_litellm_http_client_cache autouse fixture that flushes litellm.in_memory_llm_clients_cache before each test. This file omits that fixture, which could cause the same class of flaky failures when a cached real HTTP client from an earlier test bypasses the AsyncHTTPHandler class mock
  • Unused import cleanup: Removes the httpx import that was only needed for the old AsyncHTTPHandler() instantiation

Confidence Score: 3/5

  • Test-only change that improves mock reliability but has a gap in test isolation that could cause intermittent failures
  • The core approach is correct and follows established patterns in the codebase. However, the missing _reset_litellm_http_client_cache fixture — which the reference test explicitly includes — creates a risk of intermittent CI failures when a cached real HTTP client bypasses the class-level mock. This is the exact same category of issue (real HTTP calls leaking through mocks) that this PR aims to fix.
  • test_vertex_ai_gpt_oss_transformation.py needs the _reset_litellm_http_client_cache fixture added to ensure reliable test isolation

Important Files Changed

Filename: tests/test_litellm/llms/vertex_ai/vertex_ai_partner_models/gpt_oss/test_vertex_ai_gpt_oss_transformation.py
Overview: Switches to class-level AsyncHTTPHandler mock (matching the reference pattern), but is missing the _reset_litellm_http_client_cache fixture that the reference test uses, which could cause test failures when cached HTTP clients bypass the mock.

Flowchart

flowchart TD
    A["test calls litellm.acompletion()"] --> B["VertexAIPartnerModels.completion()"]
    B --> C["BaseLLMHTTPHandler.completion()"]
    C --> D{client is None?}
    D -->|Yes| E["get_async_httpx_client()"]
    E --> F{Cached client exists?}
    F -->|Yes - cache hit| G["Returns cached real client ⚠️"]
    F -->|No - cache miss| H["AsyncHTTPHandler()"]
    H --> I["Class-level mock intercepts ✅"]
    I --> J["Mock returns fake response"]
    G --> K["Real HTTP call - test may fail ❌"]
    D -->|No - old approach| L["Uses passed client instance"]

Last reviewed commit: f0fc44c


@greptile-apps greptile-apps Bot left a comment


1 file reviewed, 1 comment


@greptile-apps
Contributor

greptile-apps Bot commented Feb 17, 2026

Additional Comments (1)

tests/test_litellm/llms/vertex_ai/vertex_ai_partner_models/gpt_oss/test_vertex_ai_gpt_oss_transformation.py
Missing HTTP client cache flush fixture

The reference test file (test_vertex_gemma_transformation.py) includes a _reset_litellm_http_client_cache fixture that flushes litellm.in_memory_llm_clients_cache before each test. This file is missing this fixture, which can cause intermittent CI failures.

Here's why: the GPT-OSS code path calls get_async_httpx_client() (in llm_http_handler.py:299), which checks the in_memory_llm_clients_cache before calling AsyncHTTPHandler(). With asyncio_default_fixture_loop_scope = "session" in pyproject.toml, the event loop is shared across the entire test session. If any earlier test in the session causes a real AsyncHTTPHandler to be cached for the vertex_ai provider key, the class-level AsyncHTTPHandler mock will be bypassed because the cached (real) client is returned directly.

Consider adding the same fixtures used in test_vertex_gemma_transformation.py:

import os

import pytest


@pytest.fixture(autouse=True)
def clean_vertex_env():
    """Clear Google/Vertex AI environment variables before each test to prevent test isolation issues."""
    saved_env = {}
    env_vars_to_clear = [
        "GOOGLE_APPLICATION_CREDENTIALS",
        "GOOGLE_CLOUD_PROJECT",
        "VERTEXAI_PROJECT",
        "VERTEXAI_LOCATION",
        "VERTEXAI_CREDENTIALS",
        "VERTEX_PROJECT",
        "VERTEX_LOCATION",
        "VERTEX_AI_PROJECT",
    ]
    for var in env_vars_to_clear:
        if var in os.environ:
            saved_env[var] = os.environ[var]
            del os.environ[var]

    yield

    # Restore saved environment variables
    for var, value in saved_env.items():
        os.environ[var] = value


@pytest.fixture(autouse=True)
def _reset_litellm_http_client_cache():
    """Ensure each test gets a fresh async HTTP client mock."""
    from litellm import in_memory_llm_clients_cache

    in_memory_llm_clients_cache.flush_cache()
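
The bypass-via-cache mechanism can also be sketched in isolation. Everything below is an illustrative stand-in (a toy _cache, Handler, and get_client() playing the roles of litellm.in_memory_llm_clients_cache, AsyncHTTPHandler, and get_async_httpx_client()), not litellm's real implementation: a previously cached real client is returned without constructing anything, so the class-level patch never fires until the cache is flushed.

```python
import asyncio
import types
from unittest.mock import AsyncMock, patch

_cache = {}  # toy stand-in for litellm.in_memory_llm_clients_cache


class Handler:
    """Illustrative stand-in for AsyncHTTPHandler (not litellm's class)."""

    async def post(self, url):
        return "REAL-CALL"


# Stand-in for the litellm.llms.custom_httpx.http_handler module.
http_handler = types.SimpleNamespace(Handler=Handler)


def get_client():
    # Mirrors get_async_httpx_client(): return a cached handler if one
    # exists; otherwise construct (and cache) a new one.
    if "vertex_ai" not in _cache:
        _cache["vertex_ai"] = http_handler.Handler()
    return _cache["vertex_ai"]


async def main():
    _cache.clear()
    get_client()  # an earlier test leaves a real client in the cache
    with patch.object(http_handler, "Handler") as mock_cls:
        mock_cls.return_value.post = AsyncMock(return_value="MOCKED")
        stale = await get_client().post("u")  # cache hit: mock bypassed
        _cache.clear()                        # what the flush fixture does
        fresh = await get_client().post("u")  # fresh instance is the mock
    return stale, fresh


print(asyncio.run(main()))  # ('REAL-CALL', 'MOCKED')
```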

Add _reset_litellm_http_client_cache autouse fixture (matching
test_vertex_gemma_transformation.py) to flush in_memory_llm_clients_cache
before each test. Without this, a cached real AsyncHTTPHandler from an
earlier test could bypass the class-level mock and cause real HTTP calls.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@jquinter
Contributor Author

@greptile-apps re-review this PR

@greptile-apps
Contributor

greptile-apps Bot commented Feb 18, 2026

Greptile Summary

This PR fixes flaky CI failures (401 ACCESS_TOKEN_TYPE_UNSUPPORTED) in two Vertex AI GPT-OSS tests by switching from instance-level HTTP mock patching to the class-level AsyncHTTPHandler mock pattern already established in test_vertex_gemma_transformation.py.

  • Replaces patch.object(client, "post", side_effect=...) + client=client passthrough with patch("litellm.llms.custom_httpx.http_handler.AsyncHTTPHandler") class-level mock
  • Adds _reset_litellm_http_client_cache autouse fixture to flush in_memory_llm_clients_cache between tests, preventing stale cached HTTP clients from bypassing the mock
  • Removes unused httpx import and unnecessary VertexAIError patch
  • All changes are test-only — no production code modified

Confidence Score: 5/5

  • This PR is safe to merge — it is a test-only change that aligns with an established mocking pattern in the codebase.
  • Score of 5 reflects: (1) no production code changes, (2) the new pattern exactly matches the proven approach in test_vertex_gemma_transformation.py, (3) the cache flush fixture prevents test isolation issues, and (4) all four tests in the file are designed to be mock-only with no real network calls.
  • No files require special attention.

Important Files Changed

Filename: tests/test_litellm/llms/vertex_ai/vertex_ai_partner_models/gpt_oss/test_vertex_ai_gpt_oss_transformation.py
Overview: Replaces fragile instance-level HTTP mock with class-level AsyncHTTPHandler patch, adds cache flush fixture, and removes client passthrough — matching the established pattern in test_vertex_gemma_transformation.py. No issues found.

Flowchart

flowchart TD
    A["Test calls litellm.acompletion()"] --> B["Class-level patch intercepts\nAsyncHTTPHandler construction"]
    B --> C["_reset_litellm_http_client_cache\nflushes cached clients"]
    C --> D["New AsyncHTTPHandler instance\ncreated internally"]
    D --> E["mock_http_handler.return_value.post\nreturns AsyncMock response"]
    E --> F["Test verifies URL, request body,\nand response structure"]
    
    style B fill:#d4edda,stroke:#155724
    style C fill:#d4edda,stroke:#155724
    style E fill:#d4edda,stroke:#155724

Last reviewed commit: ea0cfac


@greptile-apps greptile-apps Bot left a comment


1 file reviewed, no comments


@jquinter jquinter merged commit faa16ef into main Feb 18, 2026
17 of 25 checks passed
@ishaan-berri ishaan-berri deleted the fix/vertex-gpt-oss-test-isolation branch March 26, 2026 22:29