fix(vercel-ai): strip retry suffix from UI errorText, preserve LLM cache#4869
tijmenhammer wants to merge 1 commit into pydantic:main
Conversation
… via metadata

Add RetryPromptPart.error_description() to separate UI-facing error text from the LLM-facing "Fix the errors and try again." suffix. The Vercel AI adapter uses error_description() for errorText display and stores the full model_response() in call_provider_metadata for round-trip cache fidelity.
```diff
                 yield ToolOutputDeniedChunk(tool_call_id=tool_call_id)
             elif isinstance(part, RetryPromptPart):
-                yield ToolOutputErrorChunk(tool_call_id=tool_call_id, error_text=part.model_response())
+                yield ToolOutputErrorChunk(tool_call_id=tool_call_id, error_text=part.error_description())
```
🔴 Streaming path silently drops retry suffix, breaking cache fidelity for the primary Vercel AI use case
The PR's stated goal is to preserve LLM cache fidelity by storing the full model_response() (with "Fix the errors and try again." suffix) in metadata. This works for the dump_messages path (_adapter.py:660), but the streaming path at _event_stream.py:281 sends error_description() (no suffix) via ToolOutputErrorChunk, which has no provider_metadata field to carry the full text. When a Vercel AI client reconstructs its message history from stream events and sends them back on the next turn, load_messages at _adapter.py:396 does provider_meta.get('model_response') or part.error_text — since no model_response key exists in the stream-derived metadata, it falls back to part.error_text (the suffix-free version). The resulting ToolReturnPart.content now differs from what the model originally saw during the agent run, invalidating prompt caches (e.g. Anthropic) and changing the model's view of its own history.
Before this PR, _event_stream.py:281 used part.model_response() (with suffix), so the stream→load round-trip was consistent. This PR introduces the regression specifically for the streaming path, which is the primary path for Vercel AI SDK (useChat) users.
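The round-trip failure can be sketched in a few lines. This is a toy model of the code paths described above, not the actual pydantic-ai API; all function and key names here are illustrative stand-ins.

```python
# Sketch of the stream -> load round-trip described above.
# All names are illustrative stand-ins, not the real pydantic-ai API.

RETRY_SUFFIX = '\n\nFix the errors and try again.'


def error_description(error: str) -> str:
    return error  # UI-facing text, no retry suffix


def model_response(error: str) -> str:
    return error + RETRY_SUFFIX  # what the model originally saw


def stream_chunk(error: str) -> dict:
    # Streaming path: the chunk carries only error_text and has no
    # provider_metadata slot to carry the full model_response().
    return {'error_text': error_description(error)}


def load_content(chunk: dict) -> str:
    # load_messages: prefers metadata, falls back to error_text.
    meta = chunk.get('provider_metadata', {})
    return meta.get('model_response') or chunk['error_text']


original = model_response('Invalid JSON')              # seen by the model
reloaded = load_content(stream_chunk('Invalid JSON'))  # after the round-trip
assert reloaded != original  # suffix lost -> prompt cache invalidated
```

The dump_messages path avoids this only because it has somewhere (call_provider_metadata) to stash the full text; the stream chunk does not.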
Prompt for agents
In pydantic_ai_slim/pydantic_ai/ui/vercel_ai/_event_stream.py at line 281, the ToolOutputErrorChunk is created with error_description() (no retry suffix), but there is no way to attach the full model_response() as metadata since ToolOutputErrorChunk lacks a provider_metadata field.
Two possible fixes:
1. (Minimal) Revert line 281 to use part.model_response() instead of part.error_description(). This keeps the streaming path consistent with the pre-PR behavior and ensures cache fidelity, at the cost of showing the LLM-facing suffix in the UI during streaming.
2. (Complete) Add a provider_metadata field to ToolOutputErrorChunk (in response_types.py), populate it with the model_response in handle_function_tool_result, and update load_messages to extract it. This would achieve the PR's goal of clean UI text + cache fidelity for both streaming and dump_messages paths.
The test at tests/test_vercel_ai.py line 2248 (test_run_stream_response_error) would need to be updated accordingly.
Was this helpful? React with 👍 or 👎 to provide feedback.
Summary
RetryPromptPart.model_response() appends "\n\nFix the errors and try again." — an instruction meant for the LLM, not for end users. This suffix was leaking into the Vercel AI errorText in both the streaming and dump paths, breaking frontend logic that checks errorText values (e.g. errorText === "Cancelled" to distinguish cancelled vs failed state).

Core change: Add RetryPromptPart.error_description(), which returns the error text without the retry suffix. model_response() now delegates to it, so all existing callers are unaffected.

Vercel AI adapter:
- Uses error_description() for the UI-facing error_text and stores the full model_response() in call_provider_metadata under pydantic_ai.model_response for round-trip cache fidelity
- load_messages prefers provider_meta['model_response'] for ToolReturnPart.content and falls back to error_text, so the LLM prompt cache is preserved after dump → load
- The streaming path uses error_description() directly

This ensures the LLM sees identical text before and after a dump/load round-trip (no cache break), while the UI gets clean error text without the retry instruction.
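The core change can be sketched as follows. This is a simplified stand-in for the real RetryPromptPart (which also handles structured validation errors, not just a string), showing only the delegation described above.

```python
from dataclasses import dataclass


@dataclass
class RetryPromptPart:
    # Simplified: the real part also accepts structured validation error details.
    content: str

    def error_description(self) -> str:
        # UI-facing text: the error itself, without the retry instruction.
        return self.content

    def model_response(self) -> str:
        # LLM-facing text: delegates to error_description() and appends the
        # instruction, so existing callers see unchanged output.
        return f'{self.error_description()}\n\nFix the errors and try again.'
```

Because model_response() is now a thin wrapper, the UI and the LLM paths can diverge at a single, well-defined point.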
Test plan
- test_vercel_ai.py tests pass
- ruff check / ruff format / pyright clean on changed files