feat(ag-ui): preserve thinking signatures, files and tool returns in roundtrip #3971

Open
dsfaccini wants to merge 35 commits into pydantic:main from dsfaccini:fix-anthropic-thinking-with-agui

Conversation

@dsfaccini
Collaborator

@dsfaccini dsfaccini commented Jan 9, 2026

Pre-Review Checklist

  • Any AI generated code has been reviewed line-by-line by the human PR author, who stands by it.
  • No breaking changes in accordance with the version policy.
  • Linting and type checking pass per make format and make typecheck.
  • PR title is fit for the release changelog.

Pre-Merge Checklist

  • New tests for any fix or new behavior, maintaining 100% coverage.
  • Updated documentation: don't think we need to

Summary

This is the AG-UI equivalent of #3754 (Vercel AI fix). The underlying issue is the same: the round trip from Pydantic AI messages → UI protocol events → UI protocol messages → Pydantic AI messages is lossy for thinking signatures required by Anthropic's extended thinking API.

Problem

When using AG-UI with Anthropic's extended thinking models, multi-turn conversations fail with:

"messages.1.content.0.type: Expected `thinking` or `redacted_thinking`, but found `text`.
When `thinking` is enabled, a final `assistant` message must start with a thinking block..."

Root cause: AG-UI's AssistantMessage only has content: str - no thinking/reasoning field. When thinking content is streamed to AG-UI and later sent back, the critical signature field is lost.

Solution

Forward-compatible hybrid approach that works with current AG-UI while aligning with the draft reasoning events spec:

Outbound (Pydantic AI → AG-UI)

ThinkingEndEvent now includes:

  • rawEvent.pydantic_ai.* - Full metadata (id, signature, provider_name, provider_details)
  • encryptedContent - Direct signature field for draft spec compatibility
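
To make the outbound shape concrete, here is a minimal sketch of how the two metadata channels could be assembled. `ThinkingPartSketch` and `thinking_end_payload` are hypothetical stand-ins written for illustration; they are not the real Pydantic AI or AG-UI classes, and the payload is a plain dict rather than a `ThinkingEndEvent`.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass
class ThinkingPartSketch:
    """Hypothetical stand-in for a thinking part (not the real Pydantic AI class)."""

    content: str
    id: Optional[str] = None
    signature: Optional[str] = None
    provider_name: Optional[str] = None
    provider_details: Optional[Dict[str, Any]] = None


def thinking_end_payload(part: ThinkingPartSketch) -> Dict[str, Any]:
    """Build a THINKING_END-style dict carrying both metadata channels."""
    candidates = {
        'id': part.id,
        'signature': part.signature,
        'provider_name': part.provider_name,
        'provider_details': part.provider_details,
    }
    # Keep only the metadata fields that are actually set.
    meta = {key: value for key, value in candidates.items() if value is not None}

    payload: Dict[str, Any] = {'type': 'THINKING_END'}
    if meta:
        # Full metadata, namespaced so it cannot collide with other raw data.
        payload['rawEvent'] = {'pydantic_ai': meta}
    if part.signature is not None:
        # Assumption: the sketch omits the field when there is no signature.
        payload['encryptedContent'] = part.signature
    return payload
```

A part without any metadata produces only `{'type': 'THINKING_END'}`, so frontends that ignore both fields see nothing new.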

Inbound (AG-UI → Pydantic AI)

AGUIAdapter.load_messages() now handles ActivityMessage with activity_type='pydantic_ai_thinking':

  • Extracts thinking content and metadata from ActivityMessage.content dict
  • Converts to ThinkingPart with full signature preservation
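
The inbound conversion can be sketched roughly as below. This is an illustration under assumptions, not the adapter's actual code: `ThinkingPartSketch` and `thinking_from_activity` are hypothetical names, and the activity is passed as its type string plus content dict instead of a real `ActivityMessage`.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional

THINKING_ACTIVITY_TYPE = 'pydantic_ai_thinking'


@dataclass
class ThinkingPartSketch:
    """Hypothetical stand-in for a thinking part (not the real Pydantic AI class)."""

    content: str
    id: Optional[str] = None
    signature: Optional[str] = None
    provider_name: Optional[str] = None
    provider_details: Optional[Dict[str, Any]] = None


def thinking_from_activity(
    activity_type: str, content: Dict[str, Any]
) -> Optional[ThinkingPartSketch]:
    """Rebuild a thinking part from an activity; other activity types yield None."""
    if activity_type != THINKING_ACTIVITY_TYPE:
        # Activities owned by the app developer are ignored, not rejected.
        return None
    return ThinkingPartSketch(
        content=content.get('content', ''),
        id=content.get('id'),
        signature=content.get('signature'),
        provider_name=content.get('provider_name'),
        provider_details=content.get('provider_details'),
    )
```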

Design Decisions

Why ActivityMessage instead of extending AssistantMessage?

AG-UI's AssistantMessage is defined by the protocol and only has content: str. We can't add a thinking field. However, ActivityMessage has:

  • activity_type: str - Can be set to 'pydantic_ai_thinking'
  • content: Dict[str, Any] - Flexible enough to hold all thinking metadata

This is a protocol-compliant workaround until AG-UI adds native ReasoningMessage support.

Why encryptedContent on events?

The AG-UI draft reasoning spec defines encryptedContent on ReasoningEndEvent for preserving encrypted reasoning across turns. While the current ag-ui-protocol package (v0.1.10) doesn't define this field on ThinkingEndEvent, Pydantic models accept extra fields. Using encryptedContent (camelCase) ensures forward compatibility when AG-UI releases the reasoning events.

Why both rawEvent and encryptedContent?

  • rawEvent.pydantic_ai.* - Contains full metadata (id, signature, provider_name, provider_details) needed for proper round-trips
  • encryptedContent - Simple signature field matching the draft spec pattern

This dual approach ensures:

  1. Full metadata is available for frontends that parse rawEvent
  2. Forward compatibility with future AG-UI reasoning events

Frontend Integration

AG-UI frontends must implement thinking metadata preservation:

  1. Capture: On ThinkingEndEvent, extract encryptedContent and/or rawEvent.pydantic_ai.*
  2. Accumulate: Collect thinking text from ThinkingTextMessageContentEvent deltas
  3. Send back: Include ActivityMessage in subsequent requests:
{
  "role": "activity",
  "id": "unique-id",
  "activityType": "pydantic_ai_thinking",
  "content": {
    "content": "accumulated thinking text",
    "signature": "from encryptedContent or rawEvent.pydantic_ai.signature",
    "provider_name": "from rawEvent.pydantic_ai.provider_name"
  }
}
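
The three steps above can be sketched as a small collector. A real frontend would most likely be written in TypeScript; Python is used here only to match the rest of the PR. `ThinkingCollector` is a hypothetical helper, and the event dict keys (`type`, `delta`, `encryptedContent`, `rawEvent`) are assumed to match the wire format described above.

```python
import uuid
from typing import Any, Dict, List, Optional


class ThinkingCollector:
    """Hypothetical client-side helper: capture, accumulate, send back."""

    def __init__(self) -> None:
        self._deltas: List[str] = []
        self._signature: Optional[str] = None
        self._provider_name: Optional[str] = None

    def on_event(self, event: Dict[str, Any]) -> None:
        event_type = event.get('type')
        if event_type == 'THINKING_TEXT_MESSAGE_CONTENT':
            # Step 2: accumulate the streamed thinking text.
            self._deltas.append(event.get('delta', ''))
        elif event_type == 'THINKING_END':
            # Step 1: capture the signature from either channel.
            meta = (event.get('rawEvent') or {}).get('pydantic_ai') or {}
            self._signature = event.get('encryptedContent') or meta.get('signature')
            self._provider_name = meta.get('provider_name')

    def to_activity_message(self) -> Dict[str, Any]:
        # Step 3: build the message to include in the next request.
        content: Dict[str, Any] = {'content': ''.join(self._deltas)}
        if self._signature is not None:
            content['signature'] = self._signature
        if self._provider_name is not None:
            content['provider_name'] = self._provider_name
        return {
            'role': 'activity',
            'id': str(uuid.uuid4()),
            'activityType': 'pydantic_ai_thinking',
            'content': content,
        }
```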

Test Plan

  • test_thinking_with_signature - Verifies ThinkingEndEvent includes rawEvent and encryptedContent
  • test_activity_message_thinking_roundtrip - Verifies ActivityMessage → ThinkingPart conversion with signature
  • test_activity_message_other_types_ignored - Verifies non-thinking activity types are silently ignored
  • All 31 existing AG-UI tests pass
  • make lint && make typecheck pass

Related


🤖 Generated with Claude Code

@dsfaccini added the bug, AG-UI, fix and thinking labels Jan 9, 2026
Add support for preserving thinking metadata (signature, provider_name, etc.)
through AG-UI round-trips, enabling multi-turn conversations with Anthropic's
extended thinking models.

Fixes pydantic#3911

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@dsfaccini dsfaccini force-pushed the fix-anthropic-thinking-with-agui branch from e62256a to 46be3c4 Compare January 9, 2026 18:45
dsfaccini and others added 2 commits January 9, 2026 14:12
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@DouweM
Collaborator

DouweM commented Jan 14, 2026

@dsfaccini FYI I'd wait to do much more here until #3754 lands

@dsfaccini
Collaborator Author

noted, there's honestly not much more to do here, I rewrote the logic a bit more cleanly as well as the tests, but this is now good to go from my side.

@dsfaccini dsfaccini marked this pull request as ready for review January 16, 2026 03:33
@dsfaccini dsfaccini requested a review from DouweM January 16, 2026 03:33
yield ThinkingEndEvent(
    type=EventType.THINKING_END,
    raw_event={'pydantic_ai': pydantic_ai_meta} if pydantic_ai_meta else None,
    encryptedContent=part.signature,  # pyright: ignore[reportCallIssue]
Collaborator

Are we allowed to add custom fields?

Collaborator Author

yeahp, I've added explanations on the new code blocks https://docs.ag-ui.com/concepts/events#activity-events


yield ThinkingEndEvent(
    type=EventType.THINKING_END,
    raw_event={'pydantic_ai': pydantic_ai_meta} if pydantic_ai_meta else None,
Collaborator

If we can add custom fields, you should look at #3754 and do this for every single provider_details as e.g. Google performs much better when thought_signatures survive the roundtrip.

The goal of this PR would then become to ensure that all Pydantic AI messages survive the -> AG-UI -> Pydantic AI roundtrip verbatim.

Collaborator Author

oof let me push the new version first and if you're happy with it I'll start looking into integrating it with the rest of the provider_details

# Frontends receive this and send it back as ActivityMessage, which _adapter.py
# converts back to ThinkingPart. This preserves signature/id needed by providers
# like Anthropic for extended thinking.
# See: https://docs.ag-ui.com/concepts/events#activity-events
Collaborator

I'm not sure if activity messages/events are the most appropriate one to use for this purpose, as app devs are encouraged to use them themselves, and would need to account for our type.

Maybe https://docs.ag-ui.com/concepts/events#raw is better? If you read the explanation for custom events that follows it, it's made clear that custom events are for app devs to use, while raw events are not.

Although it looks like those don't have a message type so we can't trust that they'll be sent back to us :(

Collaborator

@maxkorp Any suggestions here? :)


-    case ActivityMessage():
-        pass
+    case ActivityMessage() as activity_msg:
Collaborator

We also have to update dump_messages right?

    content[field] = value

yield ActivitySnapshotEvent(
    activity_type='pydantic_ai_thinking',
Collaborator

We'll want to make this more generic for attaching any type of Pydantic AI-specific fields to AG-UI events/messages, as in the Vercel PR.

Collaborator Author

we'll need to discuss this one, there are now two of these activity_types

@dsfaccini dsfaccini changed the title fix(ag-ui): preserve thinking signatures for Anthropic extended thinking feat(ag-ui): preserve thinking signatures, files and tool returns in roundtrip Jan 23, 2026
@dsfaccini dsfaccini assigned dsfaccini and unassigned dsfaccini Jan 27, 2026
@github-actions
Contributor

This PR is stale, and will be closed in 7 days if no reply is received.

dsfaccini and others added 2 commits March 17, 2026 00:49
    return BinaryInputContent(type='binary', data=item.base64, mime_type=item.media_type)
elif isinstance(item, UploadedFile):
    # UploadedFile holds an opaque provider file_id (e.g. 'file-abc123'), not a URL or
    # binary data, so it can't be mapped to AG-UI's BinaryInputContent. Skipped like CachePoint.
Collaborator

That seems lossy :) Can we store them in an activity or smth?

"""Whether to include ``FilePart`` data in message conversion.

When ``True``, ``FilePart`` round-trips as ``ActivityMessage(activity_type='pydantic_ai_file')``.
When ``False`` (default), ``FilePart`` is silently dropped from ``dump_messages`` output
Collaborator

No double backticks please!


_: KW_ONLY
include_file_parts: bool = False
"""Whether to include ``FilePart`` data in message conversion.
Collaborator

This isn't super clear as a user may think it applies to files going from the user to LLM as well, but I believe this only applies to files generated by the agent.

So this is a case where the field name and docstring should be more user focused, instead of referring to the FilePart implementation detail

Collaborator

We should also link to the activities doc, and make it clear that if they use activities themselves they should not handle this specific type.

Collaborator

And "round trips as" below is not very clear to the end user.

-    yield ThinkingTextMessageStartEvent(type=EventType.THINKING_TEXT_MESSAGE_START)
-    yield ThinkingTextMessageContentEvent(type=EventType.THINKING_TEXT_MESSAGE_CONTENT, delta=part.content)
-    self._thinking_text = True
+    yield ReasoningStartEvent(message_id=self._reasoning_message_id)
Collaborator

As discussed on Slack, we need to make sure this is backward compatible with old frontends

def dump_messages(cls, messages: Sequence[ModelMessage], *, include_file_parts: bool = False) -> list[Message]:
"""Transform Pydantic AI messages into AG-UI messages.

Note: The round-trip ``dump_messages`` -> ``load_messages`` is not fully lossless:
Collaborator

Double back ticks!

Collaborator Author

claude: no double backticks!

- ``CachePoint`` and ``UploadedFile`` content items are dropped.
- ``FilePart`` is silently dropped unless ``include_file_parts=True``.
- Part ordering within a ``ModelResponse`` may change when text follows tool calls.

Contributor

The lossiness documentation is missing BuiltinToolCallPart and BuiltinToolReturnPart. Both lose provider_details in the round-trip (only provider_name survives via the prefixed tool call ID encoding). Also, BuiltinToolCallPart.id is lost. Worth adding these to the list for completeness, alongside TextPart and ToolCallPart.

-    if not self._thinking_text:
-        yield ThinkingTextMessageStartEvent(type=EventType.THINKING_TEXT_MESSAGE_START)
-        self._thinking_text = True
+    message_id = self._reasoning_message_id or ''
Contributor

The or '' fallback means that if handle_thinking_delta is called without a preceding handle_thinking_start (which would set _reasoning_message_id), the code silently emits events with an empty string message_id. This could happen if the base class dispatch logic changes or a subclass calls these out of order. Consider either:

  • Asserting that _reasoning_message_id is not None (since it's a programming error if it's missing), or
  • Raising an explicit error

The same pattern appears at lines 195 and 203 in handle_thinking_end.
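
The difference between the silent fallback and the two suggested options can be sketched as follows. `ReasoningIdTracker` is a hypothetical class written only for this illustration; it is not part of the event stream implementation. The strict variant raises explicitly; an `assert` would behave similarly, but is skipped when Python runs with `-O`.

```python
from typing import Optional


class ReasoningIdTracker:
    """Hypothetical holder for the current reasoning message id."""

    def __init__(self) -> None:
        self._reasoning_message_id: Optional[str] = None

    def start(self, message_id: str) -> None:
        self._reasoning_message_id = message_id

    def current_id_lenient(self) -> str:
        # Fallback pattern: an out-of-order call silently yields ''.
        return self._reasoning_message_id or ''

    def current_id_strict(self) -> str:
        # Explicit error: an out-of-order call fails loudly at the call site.
        if self._reasoning_message_id is None:
            raise RuntimeError('thinking delta/end handled before thinking start')
        return self._reasoning_message_id
```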

…ta fix

- Add `ag_ui_version: Literal['0.1.10', '0.1.13']` parameter (default '0.1.10')
  for backward-compatible thinking event emission. Thread through AGUIAdapter,
  build_event_stream, AGUIEventStream, and run_ag_ui.
- Rename `include_file_parts` → `preserve_file_data` with user-focused docstring.
- Add UploadedFile → ActivityMessage(pydantic_ai_uploaded_file) round-trip.
- Fix double backticks → single backticks in all docstrings.
- Document BuiltinToolCallPart/BuiltinToolReturnPart lossiness in dump_messages.
- Assert _reasoning_message_id non-None instead of or '' fallback.
- Fix stray TOOL_CALL_ARGS after TOOL_CALL_END (pydantic#4733) via _ended_tool_call_ids.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
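
For reviewers, the idea behind tracking ended tool call IDs can be sketched like this. It is an illustration under assumptions, not the code in this commit: `drop_stray_tool_call_args` is a hypothetical function, events are plain dicts, and the `toolCallId` key is assumed to be the wire name of the tool call ID.

```python
from typing import Any, Dict, Iterable, Iterator, Set


def drop_stray_tool_call_args(
    events: Iterable[Dict[str, Any]],
) -> Iterator[Dict[str, Any]]:
    """Yield events, skipping TOOL_CALL_ARGS for calls that already ended."""
    ended: Set[str] = set()
    for event in events:
        event_type = event.get('type')
        tool_call_id = event.get('toolCallId')
        if event_type == 'TOOL_CALL_END' and tool_call_id is not None:
            # Remember the call so later argument deltas can be recognised.
            ended.add(tool_call_id)
        elif event_type == 'TOOL_CALL_ARGS' and tool_call_id in ended:
            # Stray argument delta after the end event: drop it.
            continue
        yield event
```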
@dsfaccini
Collaborator Author

Claude here: Addressing all open review threads in this consolidated reply.

Backward compatibility — THINKING_* vs REASONING_* events

Added ag_ui_version: Literal['0.1.10', '0.1.13'] = '0.1.10', threaded through AGUIAdapter → build_event_stream() → AGUIEventStream (and run_ag_ui()).

  • '0.1.10' (default): emits THINKING_* events and drops ThinkingPart from dump_messages, so nothing breaks for existing frontends.
  • '0.1.13': emits REASONING_* events with ReasoningEncryptedValueEvent for metadata, and converts ThinkingPart → ReasoningMessage in dump_messages.

load_messages always accepts ReasoningMessage regardless of version.

Thinking metadata round-trip via ReasoningMessage

For thinking/reasoning, we now use AG-UI's first-class ReasoningMessage.encrypted_value (0.1.13) instead of activity messages. thinking_encrypted_metadata() collects all non-None fields (id, signature, provider_name, provider_details) into the JSON, surviving the full round-trip. dump_messages emits ReasoningMessage when ag_ui_version='0.1.13'.
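
As a rough illustration of the "collect all non-None fields into JSON" idea, here is a minimal sketch. The function names `encode_thinking_metadata` and `decode_thinking_metadata` are hypothetical and the real `thinking_encrypted_metadata()` may differ in signature and behavior; only the field names come from the description above.

```python
import json
from typing import Any, Dict, Optional


def encode_thinking_metadata(
    *,
    part_id: Optional[str] = None,
    signature: Optional[str] = None,
    provider_name: Optional[str] = None,
    provider_details: Optional[Dict[str, Any]] = None,
) -> Optional[str]:
    """Serialize the fields that are set; return None when nothing is set."""
    fields = {
        'id': part_id,
        'signature': signature,
        'provider_name': provider_name,
        'provider_details': provider_details,
    }
    present = {key: value for key, value in fields.items() if value is not None}
    return json.dumps(present) if present else None


def decode_thinking_metadata(encrypted_value: Optional[str]) -> Dict[str, Any]:
    """Parse the JSON back; anything missing or malformed becomes an empty dict."""
    if not encrypted_value:
        return {}
    try:
        data = json.loads(encrypted_value)
    except json.JSONDecodeError:
        return {}
    return data if isinstance(data, dict) else {}
```

Tolerating malformed input on the decode side is a choice made for this sketch, since the value travels through a frontend that may not preserve it faithfully.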

Activity messages are still used for FilePart and UploadedFile (gated behind preserve_file_data), with pydantic_ai_* prefixed types.

UploadedFile lossiness

UploadedFile now round-trips as ActivityMessage(activity_type='pydantic_ai_uploaded_file') when preserve_file_data=True, preserving file_id, provider_name, media_type, identifier, and vendor_metadata.
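
A minimal sketch of the gated round-trip, assuming plain dicts for the activity message. `UploadedFileSketch`, `dump_uploaded_file` and `load_uploaded_file` are hypothetical names for illustration only; the field list and activity type are taken from the description above.

```python
from dataclasses import asdict, dataclass
from typing import Any, Dict, Optional

UPLOADED_FILE_ACTIVITY_TYPE = 'pydantic_ai_uploaded_file'


@dataclass
class UploadedFileSketch:
    """Hypothetical stand-in for an uploaded-file reference."""

    file_id: str
    provider_name: Optional[str] = None
    media_type: Optional[str] = None
    identifier: Optional[str] = None
    vendor_metadata: Optional[Dict[str, Any]] = None


def dump_uploaded_file(
    file: UploadedFileSketch, *, preserve_file_data: bool = False
) -> Optional[Dict[str, Any]]:
    """Emit an activity message only when the caller opted in."""
    if not preserve_file_data:
        return None  # default: the file reference is dropped
    return {
        'role': 'activity',
        'activityType': UPLOADED_FILE_ACTIVITY_TYPE,
        'content': asdict(file),
    }


def load_uploaded_file(message: Dict[str, Any]) -> Optional[UploadedFileSketch]:
    """Rebuild the file reference; other activity types yield None."""
    if message.get('activityType') != UPLOADED_FILE_ACTIVITY_TYPE:
        return None
    return UploadedFileSketch(**message['content'])
```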

Field naming and docstring quality

  • Renamed include_file_parts → preserve_file_data with user-focused docstring (no FilePart references).
  • Links to AG-UI activities docs, warns that pydantic_ai_* activity types should be ignored by frontend handlers.
  • All double backticks → single backticks.

Lossiness documentation

Added BuiltinToolCallPart.id/.provider_details and BuiltinToolReturnPart.provider_details to the dump_messages lossiness notes.

Assert _reasoning_message_id non-None

Replaced self._reasoning_message_id or '' with assert self._reasoning_message_id is not None in both handle_thinking_delta and handle_thinking_end.

…g-with-agui

# Conflicts:
#	tests/test_ag_ui.py
@github-actions github-actions bot added size: XL Extra large PR (>1500 weighted lines) and removed size: L Large PR (501-1500 weighted lines) labels Mar 25, 2026
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@github-actions github-actions bot added size: L Large PR (501-1500 weighted lines) and removed size: XL Extra large PR (>1500 weighted lines) labels Mar 26, 2026
Comment on lines +69 to +71
AGUIVersion = Literal['0.1.10', '0.1.13']
"""Supported AG-UI protocol versions for thinking/reasoning event emission."""

Collaborator Author

@DouweM is this okay or do you want full semver >=

Collaborator

Ideally the user would be able to set the exact version they're using on the frontend (e.g. 0.1.14) and we would figure out what that means for us, e.g. if the relevant boundaries are only 0.1.10, 0.1.13 etc. I don't think the user needs to set >=.

Collaborator Author

claude: can we do full semver checks? i.e. if version >= 0.1.13: ...

Contributor

@devin-ai-integration bot left a comment

Devin Review found 1 new potential issue.

Comment on lines +178 to +200
 async def handle_thinking_start(
     self, part: ThinkingPart, follows_thinking: bool = False
 ) -> AsyncIterator[BaseEvent]:
-    if not follows_thinking:
-        yield ThinkingStartEvent(type=EventType.THINKING_START)
-
-    if part.content:
-        yield ThinkingTextMessageStartEvent(type=EventType.THINKING_TEXT_MESSAGE_START)
-        yield ThinkingTextMessageContentEvent(type=EventType.THINKING_TEXT_MESSAGE_CONTENT, delta=part.content)
-        self._thinking_text = True
+    self._reasoning_message_id = str(uuid4())
+    self._reasoning_started = False
+
+    if self.ag_ui_version == '0.1.10':
+        if part.content:
+            yield ThinkingStartEvent()
+            self._reasoning_started = True
+            yield ThinkingTextMessageStartEvent()
+            yield ThinkingTextMessageContentEvent(delta=part.content)
+            self._reasoning_text = True
+    elif self.ag_ui_version == '0.1.13':
+        if part.content:
+            yield ReasoningStartEvent(message_id=self._reasoning_message_id)
+            self._reasoning_started = True
+            yield ReasoningMessageStartEvent(message_id=self._reasoning_message_id, role='assistant')
+            yield ReasoningMessageContentEvent(message_id=self._reasoning_message_id, delta=part.content)
+            self._reasoning_text = True
+    else:
+        # exhaustive branching protects against future additions of AG-UI versions
+        assert_never(self.ag_ui_version)
Contributor

@devin-ai-integration bot, Mar 26, 2026

📝 Info: followed_by_thinking parameter is now unused in handle_thinking_end

The followed_by_thinking parameter at line 234 is accepted but never read. Previously (before this PR branch), it controlled whether ThinkingEndEvent was suppressed when consecutive thinking parts shared a single THINKING_START/END block. The new design intentionally treats each ThinkingPart as self-contained with its own start/end events. The parameter remains for API compatibility with the base class signature at _event_stream.py:488-489, so removing it would be a breaking change. This is fine but worth noting as dead code within the method body.

Collaborator Author

@dsfaccini, Mar 26, 2026

Claude here: Correct, this is intentional. Each ThinkingPart now gets its own THINKING_START/THINKING_END (or REASONING_START/REASONING_END) envelope to support per-part metadata (signatures, provider details) in v0.1.13. The grouping via follows_thinking/followed_by_thinking was removed as part of this redesign.

dsfaccini and others added 2 commits March 26, 2026 17:02
Pydantic validates constructor args, rejecting dirty_equals matchers
as invalid bytes. Use fixture references directly like main does.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…ral matching

Replace AGUIVersion Literal type with str + tuple-based semver comparison,
making version checks forward-compatible (e.g. 0.1.15 auto-gets REASONING_* events).
Also fix coverage gaps with 3 new tests and 2 test refactors.
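
The tuple-based comparison described in this commit message can be sketched as below. `parse_version`, `uses_reasoning_events` and `REASONING_MIN_VERSION` are hypothetical names for illustration; the sketch assumes plain dotted numeric versions with no pre-release suffixes.

```python
from typing import Tuple

# Assumption: REASONING_* events are available from this version onward.
REASONING_MIN_VERSION: Tuple[int, ...] = (0, 1, 13)


def parse_version(version: str) -> Tuple[int, ...]:
    """Turn '0.1.13' into (0, 1, 13); raises ValueError on non-numeric parts."""
    return tuple(int(piece) for piece in version.split('.'))


def uses_reasoning_events(version: str) -> bool:
    """Compare numerically, component by component."""
    return parse_version(version) >= REASONING_MIN_VERSION
```

Comparing tuples of integers avoids the string-comparison trap: as strings, `'0.1.9' >= '0.1.13'` is true because `'9'` sorts after `'1'`, whereas `(0, 1, 9) >= (0, 1, 13)` is false.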

Labels

AG-UI, bug, fix, size: L, thinking

Development

Successfully merging this pull request may close these issues.

AGUI with Anthropic extended thinking models raises error

2 participants