
fix: handle Pydantic MockValSer bug in streaming responses (#18801) #24298

Open

AudreyKj wants to merge 1 commit into BerriAI:main from AudreyKj:fix/streaming-response-bug

Conversation

@AudreyKj

Problem

TypeError: 'MockValSer' object cannot be converted to 'SchemaSerializer' when handling streaming responses with SAP AI Core and other providers (vLLM, etc).

Related GitHub Issue: #18801
Related Pydantic Issue: pydantic/pydantic#7713

Root Cause

Pydantic 2.11+ has a bug where the internal MockValSer sentinel is not properly converted to a real SchemaSerializer in certain streaming scenarios. When LiteLLM tries to serialize streaming chunks using model_dump(), it hits this corrupted serializer state and crashes.

The bug occurs when:

  1. A chunk is created from a dictionary and properly serialized
  2. LiteLLM modifies the chunk (e.g., stripping usage data)
  3. A new chunk is reconstructed from the modified dictionary
  4. Pydantic fails to fully initialize the serializer on the new object
  5. Subsequent model_dump() calls crash with MockValSer TypeError
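The five steps above can be sketched with a toy Pydantic model (a stand-in for LiteLLM's actual streaming chunk types). On a healthy Pydantic install every step succeeds; under the 2.11+ bug, the final model_dump() in step 5 is where the MockValSer TypeError surfaces:

```python
# Toy reproduction of the reconstruction flow; `Chunk` is illustrative,
# not LiteLLM's real chunk class.
from typing import Optional

from pydantic import BaseModel


class Chunk(BaseModel):
    content: str
    usage: Optional[dict] = None


# 1-2. A chunk is created and serialized to a dict
chunk = Chunk(content="hello", usage={"total_tokens": 7})
data = chunk.model_dump()

# 3. LiteLLM modifies the dict (e.g., stripping usage data)
data.pop("usage", None)

# 4. A new chunk is reconstructed from the modified dict
rebuilt = Chunk(**data)

# 5. Under the Pydantic 2.11+ bug, this call raises
#    TypeError: 'MockValSer' object cannot be converted to 'SchemaSerializer'
print(rebuilt.model_dump())  # {'content': 'hello', 'usage': None}
```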

Solution

Added a try/except fallback that extracts fields via __dict__ when model_dump() fails with TypeError. This bypasses Pydantic's broken serializer entirely while preserving all functionality.

The fix is:

  • Minimal: Only activates when the bug occurs
  • Backward compatible: Normal path still uses Pydantic serialization
  • Robust: Tested with complex nested objects and streaming scenarios
  • Low overhead: Exception handling only triggered when bug occurs
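The pattern can be sketched as follows; the names are illustrative (the real call sites live in streaming_handler.py and core_helpers.py), and BrokenChunk stands in for a chunk whose serializer is stuck in the MockValSer state:

```python
# Minimal sketch of the fallback pattern, with a stub object simulating
# the corrupted serializer state from pydantic/pydantic#7713.
class BrokenChunk:
    def __init__(self, content: str):
        self.content = content
        self.usage = {"total_tokens": 7}

    def model_dump(self) -> dict:
        # Simulate the Pydantic 2.11+ MockValSer failure mode
        raise TypeError(
            "'MockValSer' object cannot be converted to 'SchemaSerializer'"
        )


def dump_chunk(chunk) -> dict:
    try:
        return chunk.model_dump()
    except TypeError:
        # Fallback: extract fields from __dict__, bypassing the serializer
        return dict(chunk.__dict__) if hasattr(chunk, "__dict__") else {}


obj_dict = dump_chunk(BrokenChunk("hello"))
obj_dict.pop("usage", None)  # the usage-stripping step still works
print(obj_dict)  # {'content': 'hello'}
```

Note that __dict__ returns raw field values rather than recursively serialized ones, so the two paths are equivalent only when downstream code can accept nested model instances.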

Changes

Modified Files

  1. litellm/litellm_core_utils/streaming_handler.py

    • Added fallback in 2 locations where model_dump() is called on streaming chunks
    • Location 1 (~line 1859): When stripping usage from response chunks
    • Location 2 (~line 2047): When stripping usage from processed chunks
  2. litellm/litellm_core_utils/core_helpers.py

    • Added fallback in preserve_upstream_non_openai_attributes() function (~line 273)
    • Ensures non-OpenAI attributes are preserved even when serializer is corrupted
  3. tests/test_litellm/litellm_core_utils/test_streaming_handler.py

    • Added regression test test_model_dump_fallback_handles_pydantic_serializer_bug
    • Simulates the MockValSer bug and verifies fallback behavior

Testing

All 49 streaming handler tests pass

python3 -m pytest tests/test_litellm/litellm_core_utils/test_streaming_handler.py -v
# 49 passed, 70 warnings in 3.65s

Regression test verifies fallback behavior

  • Mocks model_dump() to raise MockValSer TypeError
  • Confirms fallback to __dict__ extraction works correctly
  • Validates that usage stripping still functions properly
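The regression test's shape can be sketched as below, under stated assumptions: BrokenChunk and strip_usage are hypothetical stand-ins (the real test lives in test_streaming_handler.py and drives LiteLLM's return_processed_chunk_logic):

```python
# Hedged sketch of the regression test: model_dump() raises the MockValSer
# TypeError, and the __dict__ fallback plus usage stripping are verified.
class BrokenChunk:
    """Chunk whose serializer is in the MockValSer state: model_dump() raises."""

    def __init__(self, content, usage):
        self.content = content
        self.usage = usage

    def model_dump(self):
        raise TypeError(
            "'MockValSer' object cannot be converted to 'SchemaSerializer'"
        )


def strip_usage(chunk):
    # Stand-in for the handler code path that serializes then strips usage
    try:
        obj_dict = chunk.model_dump()
    except TypeError:
        obj_dict = dict(chunk.__dict__) if hasattr(chunk, "__dict__") else {}
    obj_dict.pop("usage", None)
    return obj_dict


def test_model_dump_fallback_handles_pydantic_serializer_bug():
    chunk = BrokenChunk("test content", {"total_tokens": 3})
    result = strip_usage(chunk)
    # No TypeError escapes, and the stripped dict keeps the content field
    assert result == {"content": "test content"}


test_model_dump_fallback_handles_pydantic_serializer_bug()
```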

Why Not Alternative Solutions?

  • Downgrade Pydantic: Creates dependency conflicts with LiteLLM 1.82.4+, which requires Pydantic 2.11+
  • Downgrade LiteLLM: Older versions don't support the SAP AI Core provider
  • model_dump(mode='python'): Still uses the broken serializer internally
  • Wait for Pydantic fix: pydantic/pydantic#7713 has been open since Oct 2023 with no timeline
  • __dict__ fallback: Bypasses serialization entirely, works immediately

Impact

This fix resolves streaming issues for:

  • SAP AI Core provider
  • vLLM provider
  • Any other provider that reconstructs chunks during streaming

Users experiencing the MockValSer error will now have streaming work correctly without any configuration changes.

@vercel

vercel bot commented Mar 21, 2026

The latest updates on your projects.

Project: litellm
Deployment: Ready
Actions: Preview, Comment
Updated (UTC): Mar 21, 2026 3:47pm

@codspeed-hq
Contributor

codspeed-hq bot commented Mar 21, 2026

Merging this PR will not alter performance

✅ 16 untouched benchmarks


Comparing AudreyKj:fix/streaming-response-bug (34fb901) with main (d8e4fc4)


@greptile-apps
Contributor

greptile-apps bot commented Mar 21, 2026

Greptile Summary

This PR adds try/except TypeError fallbacks around three model_dump() call sites to work around a Pydantic 2.11+ bug where an internal MockValSer sentinel is not properly promoted to a real SchemaSerializer during streaming chunk reconstruction. When the bug occurs, the code falls back to dict(obj.__dict__) to extract field data.

Key changes:

  • streaming_handler.py (×2): Fallback in __next__ and __anext__ when stripping usage before returning a chunk to the caller
  • core_helpers.py (×1): Fallback in preserve_upstream_non_openai_attributes when copying non-OpenAI fields to the response
  • test_streaming_handler.py: New regression test that mocks model_dump() to raise TypeError and verifies the fallback path in return_processed_chunk_logic

Issues found:

  • The except TypeError guard is broader than necessary — it catches every TypeError from model_dump(), not only the specific MockValSer message. A real type error (e.g., from a custom serializer or a programming mistake) will silently redirect to __dict__ extraction, potentially returning structurally different data since model_dump() recursively serializes nested Pydantic objects while __dict__ returns them as raw model instances.
  • The exception variable e is captured but never used or logged in both streaming_handler.py locations, discarding diagnostic information.
  • The regression test only exercises the core_helpers.py fallback; the two fallback paths inside __next__ / __anext__ in streaming_handler.py are not covered by any test.
  • The test stores original_model_dump for a teardown that was never implemented, leaving the mock permanently on the chunk object for the duration of the test.

Confidence Score: 3/5

  • The fix addresses a real production bug but the overly broad TypeError catch and incomplete test coverage introduce new risks that should be addressed before merging.
  • The approach is pragmatic and the happy path is unchanged, but catching all TypeErrors without re-raising non-MockValSer ones risks silently masking unrelated bugs; the dict fallback is also not semantically equivalent to model_dump() for nested Pydantic objects. The test doesn't cover the two main code paths changed in streaming_handler.py.
  • litellm/litellm_core_utils/streaming_handler.py and litellm/litellm_core_utils/core_helpers.py — both need the TypeError guard narrowed to the specific MockValSer message.

Important Files Changed

Filename Overview
litellm/litellm_core_utils/streaming_handler.py Added try/except TypeError fallback around two model_dump() calls (lines 1862 and 2054) when stripping usage from streaming chunks; the broad catch may swallow unrelated TypeErrors and the __dict__ output is not structurally equivalent to model_dump() output.
litellm/litellm_core_utils/core_helpers.py Added identical try/except TypeError fallback in preserve_upstream_non_openai_attributes; same broad-catch concern applies — any TypeError from model_dump() silently redirects to __dict__ extraction.
tests/test_litellm/litellm_core_utils/test_streaming_handler.py New regression test validates the core_helpers fallback but does not exercise the two fallback paths inside __next__/__anext__ in streaming_handler.py; also contains an unused original_model_dump variable.

Sequence Diagram

sequenceDiagram
    participant Caller
    participant CustomStreamWrapper
    participant Pydantic

    Caller->>CustomStreamWrapper: __next__() / __anext__()
    CustomStreamWrapper->>Pydantic: response.model_dump()
    alt Normal path (Pydantic ≤ 2.10 or no bug)
        Pydantic-->>CustomStreamWrapper: obj_dict (fully serialized)
    else MockValSer bug (Pydantic 2.11+)
        Pydantic-->>CustomStreamWrapper: raises TypeError
        CustomStreamWrapper->>CustomStreamWrapper: obj_dict = dict(response.__dict__)
    end
    CustomStreamWrapper->>CustomStreamWrapper: del obj_dict["usage"]
    CustomStreamWrapper->>CustomStreamWrapper: model_response_creator(chunk=obj_dict)
    CustomStreamWrapper-->>Caller: processed chunk (no usage)

    Caller->>CustomStreamWrapper: return_processed_chunk_logic(...)
    CustomStreamWrapper->>Pydantic: original_chunk.model_dump() [in preserve_upstream_non_openai_attributes]
    alt Normal path
        Pydantic-->>CustomStreamWrapper: obj_dict
    else MockValSer bug
        Pydantic-->>CustomStreamWrapper: raises TypeError
        CustomStreamWrapper->>CustomStreamWrapper: obj_dict = dict(original_chunk.__dict__)
    end
    CustomStreamWrapper->>CustomStreamWrapper: setattr non-OpenAI fields onto model_response
    CustomStreamWrapper-->>Caller: model_response

Last reviewed commit: "fix: handle Pydantic..."

Comment on lines +1864 to +1866
except TypeError as e:
    # Fallback: manually extract dict from __dict__ to bypass Pydantic serializer
    obj_dict = dict(response.__dict__) if hasattr(response, '__dict__') else {}

P1 Overly broad TypeError catch swallows unrelated errors

The fallback catches every TypeError, not just the MockValSer one. If model_dump() raises a TypeError for a different reason (e.g., a genuine type mismatch in a custom serializer or a programming mistake), the code will silently fall back to __dict__, potentially returning subtly wrong/incomplete data instead of surfacing the real bug.

Additionally, the __dict__ of a Pydantic v2 model and the output of model_dump() are not equivalent: model_dump() recursively serializes nested models to plain dicts/primitives, while __dict__ returns the raw Python objects (nested Pydantic model instances, enums, etc.). Passing this mixed-type dict to model_response_creator could produce unexpected results depending on how the creator handles nested objects.

Consider narrowing the guard to only the known error string:

Suggested change

Before:

    except TypeError as e:
        # Fallback: manually extract dict from __dict__ to bypass Pydantic serializer
        obj_dict = dict(response.__dict__) if hasattr(response, '__dict__') else {}

After:

    try:
        obj_dict = response.model_dump()
    except TypeError as e:
        if "MockValSer" not in str(e):
            raise
        # Fallback: manually extract dict from __dict__ to bypass Pydantic serializer
        obj_dict = dict(response.__dict__) if hasattr(response, '__dict__') else {}

Same pattern applies to the equivalent catch at line 2056 and in core_helpers.py.

Comment on lines +2056 to +2058
except TypeError as e:
    # Fallback: manually extract dict from __dict__ to bypass Pydantic serializer
    obj_dict = dict(processed_chunk.__dict__) if hasattr(processed_chunk, '__dict__') else {}

P2 Unused exception variable e

The exception is captured as e but never used or logged. This silently discards the exception information, making future debugging harder if a different TypeError accidentally gets swallowed here.

The same issue exists at line 1864.

Suggested change

Before:

    except TypeError as e:
        # Fallback: manually extract dict from __dict__ to bypass Pydantic serializer
        obj_dict = dict(processed_chunk.__dict__) if hasattr(processed_chunk, '__dict__') else {}

After:

    except TypeError:
        # Fallback: manually extract dict from __dict__ to bypass Pydantic serializer
        obj_dict = dict(processed_chunk.__dict__) if hasattr(processed_chunk, '__dict__') else {}

)

# Mock model_dump to raise TypeError (simulating MockValSer bug)
original_model_dump = chunk_with_usage.model_dump

P2 Dead variable original_model_dump

original_model_dump is assigned but never referenced again. It was likely intended for restoring the method after the test (to avoid leaking the mock), but the cleanup was omitted.

If this is intentional, remove the assignment. If teardown was intended, add it:

try:
    result = initialized_custom_stream_wrapper.return_processed_chunk_logic(...)
finally:
    chunk_with_usage.model_dump = original_model_dump

Comment on lines +1721 to +1730
# Process the chunk through return_processed_chunk_logic which calls model_dump
result = initialized_custom_stream_wrapper.return_processed_chunk_logic(
    completion_obj={"content": "test content"},
    response_obj={"original_chunk": chunk_with_usage},
    model_response=chunk_with_usage,
)

# Should not raise TypeError and should successfully process the chunk
assert result is not None
assert result.choices[0].delta.content == "test content"

P1 Test doesn't cover the streaming_handler.py fallback paths

The test exercises return_processed_chunk_logic, which invokes the fallback in core_helpers.preserve_upstream_non_openai_attributes. However, the two new except TypeError blocks added in streaming_handler.py (lines 1862–1866 and 2054–2058) live inside the synchronous __next__ and asynchronous __anext__ iterators respectively — neither of which is called by return_processed_chunk_logic.

As a result, the regression test does not actually verify the two most direct code paths changed by this PR. Consider adding a test that iterates the wrapper (e.g., via list(wrapper) or async for chunk in wrapper) with model_dump patched to raise, to confirm those paths also survive the bug.

