
fix(vercel-ai): strip retry suffix from UI errorText, preserve LLM cache #4869

Open
tijmenhammer wants to merge 1 commit into pydantic:main from tijmenhammer:fix/vercel-ai-retry-error-text

Conversation

@tijmenhammer

Summary

RetryPromptPart.model_response() appends "\n\nFix the errors and try again." — an instruction meant for the LLM, not for end users. This suffix was leaking into Vercel AI errorText in both the streaming and dump paths, breaking frontend logic that checks errorText values (e.g. errorText === "Cancelled" to distinguish cancelled vs failed state).

Core change: Add RetryPromptPart.error_description() — returns the error text without the retry suffix. model_response() now delegates to it, so all existing callers are unaffected.
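A minimal sketch of that delegation, assuming a simplified RetryPromptPart (the real class also handles structured validation errors, so the single string field here is purely illustrative):

```python
from dataclasses import dataclass


@dataclass
class RetryPromptPart:
    # Simplified: the real part also accepts a list of validation errors.
    content: str

    def error_description(self) -> str:
        # UI-facing text: the error itself, without any instruction to the model.
        return self.content

    def model_response(self) -> str:
        # LLM-facing text: delegate to error_description() and append the retry
        # instruction, so existing callers see exactly the same output as before.
        return f'{self.error_description()}\n\nFix the errors and try again.'
```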

Vercel AI adapter:

  • Dump path: Uses error_description() for UI-facing error_text; stores the full model_response() in call_provider_metadata.pydantic_ai.model_response for round-trip cache fidelity
  • Load path: Prefers provider_meta['model_response'] for ToolReturnPart.content, falls back to error_text — LLM prompt cache is preserved after dump → load
  • Streaming path: Uses error_description() directly

This ensures the LLM sees identical text before and after a dump/load round-trip (no cache break), while the UI gets clean error text without the retry instruction.
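Roughly, the dump/load handling described above could look like the following sketch (the real adapter works with the Vercel AI message types in _adapter.py; the helper names and dict shapes here are hypothetical):

```python
def dump_retry_part(part: RetryPromptPart, tool_call_id: str) -> dict:
    # UI-facing error_text is the clean description; the full LLM-facing text
    # rides along in provider metadata for round-trip fidelity.
    return {
        'tool_call_id': tool_call_id,
        'error_text': part.error_description(),
        'call_provider_metadata': {
            'pydantic_ai': {'model_response': part.model_response()},
        },
    }


def load_tool_return_content(dumped: dict) -> str:
    # Prefer the stored model_response so the next LLM turn sees identical text
    # (prompt cache hit); fall back to error_text for dumps without the new field.
    provider_meta = dumped.get('call_provider_metadata', {}).get('pydantic_ai', {})
    return provider_meta.get('model_response') or dumped['error_text']
```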

Test plan

  • All 106 test_vercel_ai.py tests pass
  • Snapshot tests updated for new metadata field and round-trip content
  • ruff check / ruff format / pyright clean on changed files
  • Verify errorText in the Vercel AI frontend no longer contains the retry suffix (see the test sketch below)
  • Verify dump → load → next LLM turn produces identical prompt (cache hit)
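A hypothetical test along these lines (the real checks in tests/test_vercel_ai.py are snapshot-based; this only illustrates the property being asserted, reusing the simplified RetryPromptPart from the sketch above):

```python
def test_error_text_has_no_retry_suffix():
    part = RetryPromptPart(content='Tool raised ValueError: invalid input')
    # UI-facing text is clean...
    assert 'Fix the errors and try again.' not in part.error_description()
    # ...while the LLM-facing text is unchanged, so prompt caches are unaffected.
    assert part.model_response().endswith('Fix the errors and try again.')
```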

… via metadata

Add RetryPromptPart.error_description() to separate UI-facing error text
from the LLM-facing "Fix the errors and try again." suffix. The Vercel AI
adapter uses error_description() for errorText display and stores the full
model_response() in call_provider_metadata for round-trip cache fidelity.
@github-actions bot added the size: S (Small PR, ≤100 weighted lines) and bug (Report that something isn't working, or PR implementing a fix) labels on Mar 26, 2026

@devin-ai-integration bot (Contributor) left a comment


Devin Review found 1 potential issue.

View 4 additional findings in Devin Review.


      yield ToolOutputDeniedChunk(tool_call_id=tool_call_id)
  elif isinstance(part, RetryPromptPart):
-     yield ToolOutputErrorChunk(tool_call_id=tool_call_id, error_text=part.model_response())
+     yield ToolOutputErrorChunk(tool_call_id=tool_call_id, error_text=part.error_description())

🔴 Streaming path silently drops retry suffix, breaking cache fidelity for the primary Vercel AI use case

The PR's stated goal is to preserve LLM cache fidelity by storing the full model_response() (with "Fix the errors and try again." suffix) in metadata. This works for the dump_messages path (_adapter.py:660), but the streaming path at _event_stream.py:281 sends error_description() (no suffix) via ToolOutputErrorChunk, which has no provider_metadata field to carry the full text. When a Vercel AI client reconstructs its message history from stream events and sends them back on the next turn, load_messages at _adapter.py:396 does provider_meta.get('model_response') or part.error_text — since no model_response key exists in the stream-derived metadata, it falls back to part.error_text (the suffix-free version). The resulting ToolReturnPart.content now differs from what the model originally saw during the agent run, invalidating prompt caches (e.g. Anthropic) and changing the model's view of its own history.

Before this PR, _event_stream.py:281 used part.model_response() (with suffix), so the stream→load round-trip was consistent. This PR introduces the regression specifically for the streaming path, which is the primary path for Vercel AI SDK (useChat) users.

Prompt for agents
In pydantic_ai_slim/pydantic_ai/ui/vercel_ai/_event_stream.py at line 281, the ToolOutputErrorChunk is created with error_description() (no retry suffix), but there is no way to attach the full model_response() as metadata since ToolOutputErrorChunk lacks a provider_metadata field.

Two possible fixes:

1. (Minimal) Revert line 281 to use part.model_response() instead of part.error_description(). This keeps the streaming path consistent with the pre-PR behavior and ensures cache fidelity, at the cost of showing the LLM-facing suffix in the UI during streaming.

2. (Complete) Add a provider_metadata field to ToolOutputErrorChunk (in response_types.py), populate it with the model_response in handle_function_tool_result, and update load_messages to extract it. This would achieve the PR's goal of clean UI text + cache fidelity for both streaming and dump_messages paths.
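A rough sketch of what option 2 might look like (hypothetical: the real ToolOutputErrorChunk in response_types.py is defined differently and carries more fields, and the helper name below is made up):

```python
from dataclasses import dataclass


@dataclass
class ToolOutputErrorChunk:
    tool_call_id: str
    error_text: str
    provider_metadata: dict | None = None  # proposed new field


def retry_part_to_chunk(part: RetryPromptPart, tool_call_id: str) -> ToolOutputErrorChunk:
    # Streaming path: clean UI text plus the full model_response() in metadata,
    # so a stream-derived history can be reloaded without breaking the prompt cache.
    return ToolOutputErrorChunk(
        tool_call_id=tool_call_id,
        error_text=part.error_description(),
        provider_metadata={'pydantic_ai': {'model_response': part.model_response()}},
    )
```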

The test at tests/test_vercel_ai.py line 2248 (test_run_stream_response_error) would need to be updated accordingly.


