Skip to content

feat(anthropic): text stream support #266

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 44 commits into from
Apr 12, 2025

Conversation

sentiasa
Copy link
Contributor

@sentiasa sentiasa commented Mar 15, 2025

Description

  • Basic stream handling
  • Tool call processing and handling
  • Thinking content extraction
  • Citation handling
  • Rate limit processing

Breaking Changes

none

@sentiasa sentiasa mentioned this pull request Mar 16, 2025
Copy link

kinsta bot commented Mar 16, 2025

Preview deployments for prism ⚡️

Status Branch preview Commit preview
❌ Failed to deploy N/A N/A

Commit: 6e6ffb80db4f6ee6cf3dfabd936fceee968a6d63

Deployment ID: f11057d8-4aee-4394-b266-28464a137769

Static site name: prism-97nz9

@sentiasa
Copy link
Contributor Author

Hi @ChrisB-TL

Thanks for the feedback. I’ve made some changes, but I could use some guidance on the remaining parts.

  1. For thinking, how should we handle the record/return of thinking and thinking signature values? Should we send them through stream? One way I could think of is returning it in yield new Chunk(content: !empty($additionalContent) ? json_encode($additionalContent) : null. Not sure if this is the right direction.

  2. For RateLimits, I saw how it('adds rate limit data to the responseMeta') handles the RateLimiting, however I am not sure how to test it with Stream. Do you know how I can test it?

  3. I don't fully grasp the Step concept (AnthropicText) and how it should play with Stream.

This turned out to be more complex than I anticipated, but with some guidance I believe I can crack it

@sentiasa sentiasa requested a review from ChrisB-TL March 22, 2025 00:19
@sentiasa
Copy link
Contributor Author

I applied some fixes and refactoring based on your feedback, and fixed the formatting. The implementation now handles rate limits correctly, and properly processes citations and thinking signatures.

Any feedback would be very helpful.

Copy link
Contributor

@ChrisB-TL ChrisB-TL left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changes made directly.

I think I am done with this. In summary I have:

  • Shared error handling code across the text/structured base handler and streaming.
  • Added the concept of a meta chunk - the first and last - which the dev probably won't want to forward to the frontend. This should lay the groundwork as/when we add usage data also.
  • Refactored the processStream method to use a match statement and defer to separate handler methods, and added comments to explain what is going on given its complexity.
  • Clarified what is going on with state.
  • Added yielding of thinking chunks with a chunkType of Thinking (so dev can decide whether or not to forward to frontend).
  • Fixed citations - this was knarly as it does not work quite the same way as a standard response. I've added a note to the docs to explain.
  • Re-used the text handler's build payload method, as its the same other than the stream property.

It is a pretty complex feature for Anthropic given citations, thinking etc. and I think it could do with further cleaning up for readability / maintenance. But I think good enough for now if @sixlive agrees?

@sentiasa
Copy link
Contributor Author

@ChrisB-TL

Thanks a lot for all the refactoring on the PR! You've really shaped the code to align well with the rest of the library. Sorry if I caused some pain due to my lack of knowledge with the rest of the library (and I did underestimate the complexity of Anthropic Stream) 🤦‍♂️

The ChunkType approach you implemented is quite smart. I'm now thinking about how we could extend this to provide better signalling for tool execution to the frontend - specifically for "tool_started" and "tool_finished" events that would help to display tool execution status in the frontend. Would ChunkType be the foundation bricks to achieve this?

@ChrisB-TL
Copy link
Contributor

Not at all - it's a tricky one as Anthropic is so feature rich!

That's a great idea on using the chunkType for tools. You could add metadata about the tool - name, input data etc. - to additionalData.

I'd suggest getting this PR done first, then we can loop back over the stream handlers to do that if @sixlive likes the idea too!

@sentiasa
Copy link
Contributor Author

sentiasa commented Mar 28, 2025

@ChrisB-TL I finally had chance to test it further.

I realised an issue with the handleMessageStop's Chunk. The stop_reason is not in the message_stop event, but in the last message_delta event instead. I introduced $this->stopReason

Also within FinishReasonMap 'message_stop' was introduced by me and is a mistake, because the actual stop reason should be message_delta's stop_reason. I pushed a fix

@sentiasa
Copy link
Contributor Author

I found some more room for improvements.

I was trying to setup history and I had difficulty with multiple tool-call case. Specifically;

assistant:
"doing X text"
[tool_use]
"doing Y text"
[tool_use]
  1. The Stop Chunk (Stream Line 305) is fired for each subsequent request, so it is difficult to catch the true final event. Is returning the Chunk only if ($depth == 0) a good idea?

  2. Once the Stop Chunk is received on the application side, it doesn't contain any information to craft the history (e.g. text, tool_use, text, tool_use). As I understand, the Step approach is built for this use-case. Would it make sense to initiate ResponseBuilder and create a method addStep() similar to Anthropic/Text?


enum ChunkType
{
case Message;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this make better sense as Text or Message?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm thinking this makes more sense being Text.

Copy link
Contributor

@sixlive sixlive left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One feedback comment. Ready to merge this shortly.

@sentiasa
Copy link
Contributor Author

sentiasa commented Apr 8, 2025

@sixlive

I've been tinkering with this branch for some time.

  1. When multiple tools are called, the FinishReason: Stop is triggered for every depth, making it difficult for developers to catch the final event. For example, with 3 tools called, that's 3 Stop events. I've been testing if ($depth == 0) and it's been working well in my tests. Wouldn't we want the Stop event to be triggered only once?

  2. I wanted to implement feedback for tool call events so users can see progress on the frontend. For example:

image

I tried implementing the Steps approach here: https://github.com/sentiasa/prism-anthropic-stream/pull/2/files. This might be useful for other models' streaming as well. (this branch is half baked and just wanted to show you my approach)


My thoughts:

  • For point 1, I'd appreciate your thoughts on whether you agree this is an issue worth addressing now.
  • For point 2, this could be handled later as it's more of a global feature that would benefit from your supervision and feedback.

@sixlive
Copy link
Contributor

sixlive commented Apr 12, 2025

@sentiasa I think we should probably do something about point 1. Lets punt on point 2 for now. I definitely want to implement chunk types for tools more broadly.

@sixlive sixlive merged commit 7ab7875 into prism-php:main Apr 12, 2025
14 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants