Proposal: Streaming Into the Future with Prism Events #569

sixlive · 2025-08-19T12:54:47Z

sixlive
Aug 19, 2025
Maintainer

Hey Prism community! 👋

I've been thinking a lot about our streaming implementation lately, and I want to share a proposal for a pretty significant refactor that could make building real-time AI applications way better. But first, I need your feedback.

The Current State (And Why I Think We Can Do Better)

Right now, our streaming system works around "Chunks" – objects that contain mixed data types. If you've built streaming UIs with Prism, you've probably written code like this:

foreach ($stream as $chunk) {
    if ($chunk->text) {
        $text .= $chunk->text;
    }
    if ($chunk->toolCalls) {
        // Handle tool calls
    }
    // Usage data... somewhere?
}

This works, but I've been hearing from developers that it creates some challenges:

Usage tracking is inconsistent - Some providers include it, others don't, and it's in different places
Mixed data types are confusing - One chunk might have text, tools, and metadata all mixed together
No proper SSE format - Building web UIs requires custom formatting
State management complexity - Clients have to track what's happening across multiple chunks

Question for you: Does this match your experience? What other pain points have you hit with the current streaming approach?

The Proposal: Event-Driven Streaming

What I'm proposing is a complete refactor to an event-based system. Instead of chunks with mixed data, you'd get specific events for specific things happening in the stream.

Here's what a typical conversation would look like:

event: meta
data: {"id":"msg_123","model":"gpt-4","timestamp":1234567890}

event: text_start
data: {"id":"msg_123","timestamp":1234567891}

event: text_delta
data: {"id":"msg_123","delta":"I am","timestamp":1234567892}

event: text_delta
data: {"id":"msg_123","delta":" Claude","timestamp":1234567893}

event: text_end
data: {"id":"msg_123","timestamp":1234567894}

event: finish
data: {"id":"msg_123","reason":"stop","usage":{"prompt_tokens":10,"completion_tokens":3,"total_tokens":13},"timestamp":1234567895}

Key Changes

Proper SSE format with event: fields and snake_case naming
Dedicated events for text, tools, thinking, and metadata
Consistent usage tracking - Always in the finish event, always there
Clear lifecycle events - start/delta/end patterns for everything

The Magic: asEventStreamResponse()

The part I'm most excited about is this new method:

// Current implementation
Route::get('chat', function () {
    return response()->stream(function() use ($request) {
        $stream = Prism::text()
            ->using('openai', 'gpt-5')
            ->withPrompt(request('message'))
            ->asStream();

        foreach ($stream as $chunk) {
            // Manual SSE formatting
            echo "data: " . json_encode(['text' => $chunk->text]) . "\n\n";
            ob_flush();
            flush();
        }
    }, 200, [
        'Content-Type' => 'text/event-stream',
        'Cache-Control' => 'no-cache',
        // ... more headers
    ]);
});

// Proposed approach (one method call)
Route::get('chat', function () {
  return Prism::text()
      ->using('openai', 'gpt-5')
      ->withPrompt(request('message'))
      ->onStreamEnd(function (Collection $responseMessages) {
          // Persist messages or something
      })
      ->asEventStreamResponse();
});

This automatically handles all the SSE formatting, headers, and streaming mechanics.

Tool Calls Get First-Class Treatment

For those of you building tool-enabled apps, tool calls would get proper event sequences:

event: tool_call_start
data: {"id":"call_001","name":"weather","timestamp":1234567891}

event: tool_call_delta
data: {"id":"call_001","arguments":"{\"city\"","timestamp":1234567892}

event: tool_call_delta
data: {"id":"call_001","arguments":":\"Paris\"}","timestamp":1234567893}

event: tool_call_end
data: {"id":"call_001","timestamp":1234567894}

event: tool_result_start
data: {"id":"result_001","call_id":"call_001","timestamp":1234567895}

event: tool_result_end
data: {"id":"result_001","timestamp":1234567896}

You can watch tool arguments being built incrementally and handle the execution lifecycle properly.

Thinking Events for Reasoning Models

With reasoning models becoming more common, we'd have dedicated events for chain-of-thought:

event: thinking_start
data: {"id":"msg_456","timestamp":1234567891}

event: thinking_delta
data: {"id":"msg_456","delta":"Let me analyze this...","timestamp":1234567892}

event: thinking_end
data: {"id":"msg_456","timestamp":1234567893}

The Developer Experience

On the frontend, you could build UIs like this:

const eventSource = new EventSource('/api/chat-stream?message=Hello');

eventSource.addEventListener('text_delta', (event) => {
    const data = JSON.parse(event.data);
    appendText(data.delta);
});

eventSource.addEventListener('tool_call_start', (event) => {
    const data = JSON.parse(event.data);
    showToolExecution(data.name);
});

eventSource.addEventListener('finish', (event) => {
    const data = JSON.parse(event.data);
    updateUsageStats(data.usage); // Always available!
});

No more weird state management or hunting for usage data across different chunk types.

Backward Compatibility

I want to be transparent: this would be a breaking change. The asStream() method would return StreamEvent objects instead of Chunk objects. However, the method signature stays the same, and the migration path is pretty straightforward:

// Old way
foreach ($stream as $chunk) {
    $text .= $chunk->text;
}

// New way
foreach ($stream as $event) {
    if ($event instanceof TextDeltaEvent) {
        $text .= $event->getData()['delta'];
    }
}

The Vision

My goal is to make Prism the best way to build streaming AI applications. This event system would give you:

Consistent experience across all LLM providers
Production-ready SSE out of the box
Granular control over exactly what you want to handle
Better debugging with clear event sequences

But I don't want to build this in a vacuum. Your feedback will shape how this actually works.

This is a proposal, not a commitment. Based on community feedback, the actual implementation might look different. But I wanted to share the direction I'm thinking and get your input before diving in.

kauffinger · 2025-08-19T14:08:01Z

kauffinger
Aug 19, 2025

I like this. We need an end-to-end way to give to developers so they can build great interfaces.

Currently, when using Livewire, I am basically abstracting a way to pass through chunks to the frontend to then process them in JS.
I pass through the type of the chunk & its content in the field it would be in the chunk.

It makes a lot of sense to give the dev an end-to-end way of consuming the events in the frontend. I've struggled myself moving from just streaming the text to streaming whole events. I tried to keep the frontend processing to a minimum, which was wrong. Today, with thinking and tool calls, you really have to process the full stream in the frontend. There is no good in-between imo.

I'd also like to see a default frontend implementation for people to reference in different stacks, kind of like @pushpak1300 or my starter kits.

0 replies

uchm4n · 2025-08-19T15:02:58Z

uchm4n
Aug 19, 2025

That's an excellent proposal. I completely agree that an event-based system makes a lot more sense and would provide a solid, standardized architecture.
I especially like the asEventStreamResponse() method. It feels very clean and intuitive—a truly "Laravel" way to handle streaming. The dedicated events for text and tools, along with consistent usage tracking, would definitely solve many of the pain points you mentioned. I think it would significantly simplify state management and make building client-side UIs much more straightforward.
This looks like a massive improvement.

0 replies

avengerweb · 2025-08-19T17:27:52Z

avengerweb
Aug 19, 2025

Streaming-focused feedback from production use

I’m squarely focused on production apps in Laravel, not cloning OpenRouter/LiteLLM in PHP. Prism was a great starting point—I built a small lib on top that focuses only on Streaming and Structured Outputs (kept structured outputs unchanged from Prism).

A few lessons from running this at scale in PHP/Laravel:

PHP runtime realities: long-running requests + dropped connections shouldn’t kill a workflow (chat turn), but PHP’s request/response model makes that tricky. Agentic workflows can run minutes, which doesn’t mesh well with FPM.
Scaling: with 1000+ active chats, you’ll exhaust FPM workers quickly. I rely on Reverb + WebSockets to push events instead of holding HTTP workers open.
Laravel-native patterns: I lean heavily on Laravel events for orchestration/observability, queues for processing, and job batching for concurrent tool calls (e.g., Google + GitHub in parallel). Tool calls fail a lot in the wild; HTTP client retry alone isn’t enough—job-level retry/backoff and idempotency matter.

On the proposal itself: it feels closer to TS/Python streaming patterns than to PHP’s operational constraints. That’s not a knock—just a note that production-ready PHP may need adapted patterns vs. ports.

Questions that would help validate the design for PHP/Laravel:

Reasoning streams (multi-track):
Providers (e.g., OpenAI) can emit multiple thinking_* tracks under the same message.
- Is there a stable reasoning_id per track?
- Can thinking_* interleave with text_*/tool_* while preserving ordering guarantees?
- Does finish summarize all tracks or just the primary?
Usage accounting:
- You note “usage at the end,” but providers also send usage mid-turn or on tool-call messages.
- Will finish contain a canonical total for the entire turn (incl. tools), or should clients sum partials?
- Any double-count protection if providers emit partial usage multiple times?
Resumption & idempotency:
- Support for Last-Event-ID so clients can resume an SSE stream after disconnect?
- Stable event_id + seq for idempotent upserts/replays on the backend?
Laravel integration expectations:
- Guidance for persisting text, tool calls/results, and reasoning tracks (at least a recommended shape/fields).
- Recommended flow for queues + batching (tool orchestration) and scheduling (delayed/recurring tasks).
- How do you see asEventStreamResponse() coexisting with Reverb/WebSockets patterns that free FPM workers—any adapter or best practices?

If the design can lock down the above (especially canonical usage and resumability) and provide Laravel-first guidance (queues/events/persistence), it’ll land much closer to what PHP teams need in production.

1 reply

sixlive Aug 19, 2025
Maintainer Author

Thank you so much! I'll definitely take all of this into consideration

bdweix · 2025-08-20T18:06:01Z

bdweix
Aug 20, 2025

Great work @sixlive! The proposed asEventStreamResponse looks much cleaner. A few thoughts:

Ability to pass streaming templates?
It would be great for users to pass a streaming template that will format the stream. That way people can easily do custom stream formats if they want, or if not provided it will use the default format (like what you have outlined above). That might be more involved than it sounds though.
Vercel AI SDK
I've been impressed by the AI SDK (https://vercel.com/blog/ai-sdk-5), and they have a section on their stream protocols (https://ai-sdk.dev/docs/ai-sdk-ui/stream-protocol) which is quite close to many of the structures you laid out above. We are building our components and systems to make use of their UI components as you get a lot of out of the box functionality, and at a minimum, an opinionated way to do things (like frontend tool handling). I wonder if it would make sense to more formally try to support/mirror their output format? I suspect we'll see many companies look to use their packages more often, and they support React, Vue, Angular, and Svelte already.

For example, just this week this was released: https://streamdown.ai/, which is built on those streaming standards and AI Elements (https://ai-sdk.dev/elements/overview/usage). If Prisms default worked out of the box with all of those packages I think that would be a boon to all.

2 replies

kauffinger Aug 22, 2025

About 2: If the protocol is good & has many users, this might be a good move.

sixlive Aug 22, 2025
Maintainer Author

I plan on supporting the Vercel data protocol!

sixlive · 2025-08-28T20:43:48Z

sixlive
Aug 28, 2025
Maintainer Author

Streaming Output

Want to show AI responses to your users in real-time? Prism provides multiple ways to handle streaming AI responses, from simple Server-Sent Events to WebSocket broadcasting for real-time applications.

Warning

When using Laravel Telescope or other packages that intercept Laravel's HTTP client events, they may consume the stream before Prism can emit the stream events. This can cause streaming to appear broken or incomplete. Consider disabling such interceptors when using streaming functionality, or configure them to ignore Prism's HTTP requests.

Quick Start

Server-Sent Events (SSE)

The simplest way to stream AI responses to a web interface:

Route::get('/chat', function () {
    return Prism::text()
        ->using('anthropic', 'claude-3-7-sonnet')
        ->withPrompt(request('message'))
        ->asEventStreamResponse();
});

const eventSource = new EventSource('/chat');

eventSource.addEventListener('text_delta', (event) => {
    const data = JSON.parse(event.data);
    document.getElementById('output').textContent += data.delta;
});

eventSource.addEventListener('stream_end', (event) => {
    const data = JSON.parse(event.data);
    console.log('Stream ended:', data.finish_reason);
    eventSource.close();
});

Vercel AI SDK Integration

For apps using Vercel's AI SDK, use the Data Protocol adapter which provides compatibility with the Vercel AI SDK UI:

Route::post('/api/chat', function () {
    return Prism::text()
        ->using('openai', 'gpt-4')
        ->withPrompt(request('message'))
        ->asDataStreamResponse();
});

Client-side with the useChat hook:

import { useChat } from 'ai/react';

export default function Chat() {
    const { messages, input, handleInputChange, handleSubmit, isLoading } = useChat({
        api: '/api/chat',
    });

    return (
        <div>
            <div>
                {messages.map(m => (
                    <div key={m.id}>
                        {m.role}: {m.content}
                    </div>
                ))}
            </div>
            
            <form onSubmit={handleSubmit}>
                <input
                    value={input}
                    placeholder="Say something..."
                    onChange={handleInputChange}
                    disabled={isLoading}
                />
                <button type="submit" disabled={isLoading}>
                    Send
                </button>
            </form>
        </div>
    );
}

For more advanced usage, including tool support and custom options, see the Vercel AI SDK UI documentation.

WebSocket Broadcasting with Background Jobs

For real-time multi-user applications that need to process AI requests in the background:

// Job Class
<?php

namespace App\Jobs;

use Illuminate\Broadcasting\Channel;
use Illuminate\Bus\Queueable;
use Illuminate\Contracts\Queue\ShouldQueue;
use Illuminate\Foundation\Bus\Dispatchable;
use Illuminate\Queue\InteractsWithQueue;
use Illuminate\Queue\SerializesModels;
use Prism\Prism\Prism;

class ProcessAiStreamJob implements ShouldQueue
{
    use Dispatchable, InteractsWithQueue, Queueable, SerializesModels;

    public function __construct(
        public string $message,
        public string $channel,
        public string $model = 'claude-3-7-sonnet'
    ) {}

    public function handle(): void
    {
        Prism::text()
            ->using('anthropic', $this->model)
            ->withPrompt($this->message)
            ->asBroadcast(new Channel($this->channel));
    }
}

// Controller
Route::post('/chat-broadcast', function () {
    $sessionId = request('session_id') ?? 'session_' . uniqid();
    
    ProcessAiStreamJob::dispatch(
        request('message'),
        "chat.{$sessionId}",
        request('model', 'claude-3-7-sonnet')
    );
    
    return response()->json(['status' => 'processing', 'session_id' => $sessionId]);
});

Client-side with React and useEcho:

import { useEcho } from '@/hooks/useEcho';
import { useState } from 'react';

function ChatComponent() {
    const [currentMessage, setCurrentMessage] = useState('');
    const [currentMessageId, setCurrentMessageId] = useState('');
    const [isComplete, setIsComplete] = useState(false);

    const sessionId = 'session_' + Date.now();

    // Listen for streaming events
    useEcho(`chat.${sessionId}`, {
        '.stream_start': (data) => {
            console.log('Stream started:', data);
            setCurrentMessage('');
            setIsComplete(false);
        },
        
        '.text_start': (data) => {
            console.log('Text start event received:', data);
            setCurrentMessage('');
            setCurrentMessageId(data.message_id || Date.now().toString());
        },
        
        '.text_delta': (data) => {
            console.log('Text delta received:', data);
            setCurrentMessage(prev => prev + data.delta);
        },
        
        '.text_complete': (data) => {
            console.log('Text complete:', data);
        },
        
        '.tool_call': (data) => {
            console.log('Tool called:', data.tool_name, data.arguments);
        },
        
        '.tool_result': (data) => {
            console.log('Tool result:', data.result);
        },
        
        '.stream_end': (data) => {
            console.log('Stream ended:', data.finish_reason);
            setIsComplete(true);
        },
        
        '.error': (data) => {
            console.error('Stream error:', data.message);
        }
    });

    const sendMessage = async (message) => {
        await fetch('/chat-broadcast', {
            method: 'POST',
            headers: { 'Content-Type': 'application/json' },
            body: JSON.stringify({ 
                message, 
                session_id: sessionId,
                model: 'claude-3-7-sonnet' 
            })
        });
    };

    return (
        <div>
            <div className="message-display">
                {currentMessage}
                {!isComplete && <span className="cursor">|</span>}
            </div>
            
            <button onClick={() => sendMessage("What's the weather in Detroit?")}>
                Send Message
            </button>
        </div>
    );
}

Event Types

All streaming approaches emit the same core events with consistent data structures:

Available Events

stream_start - Stream initialization with model and provider info
text_start - Beginning of a text message
text_delta - Incremental text chunks as they're generated
text_complete - End of a complete text message
thinking_start - Beginning of AI reasoning/thinking session
thinking_delta - Reasoning content as it's generated
thinking_complete - End of reasoning session
tool_call - Tool invocation with arguments
tool_result - Tool execution results
error - Error handling with recovery information
stream_end - Stream completion with usage statistics

Event Data Examples

Based on actual streaming output:

// stream_start event
{
    "id": "anthropic_evt_SSrB7trNIXsLkbUB",
    "timestamp": 1756412888,
    "model": "claude-3-7-sonnet-20250219",
    "provider": "anthropic",
    "metadata": {
        "request_id": "msg_01BS7MKgXvUESY8yAEugphV2",
        "rate_limits": []
    }
}

// text_start event  
{
    "id": "anthropic_evt_8YI9ULcftpFtHzh3",
    "timestamp": 1756412888,
    "message_id": "msg_01BS7MKgXvUESY8yAEugphV2",
    "turn_id": null
}

// text_delta event
{
    "id": "anthropic_evt_NbS3LIP0QDl5whYu",
    "timestamp": 1756412888,
    "delta": "💠🌐 Well hello there! You want to know",
    "message_id": "msg_01BS7MKgXvUESY8yAEugphV2",
    "turn_id": null
}

// tool_call event
{
    "id": "anthropic_evt_qXvozT6OqtmFPgkG",
    "timestamp": 1756412889,
    "tool_id": "toolu_01NAbzpjGxv2mJ8gJRX5Bb8m",
    "tool_name": "search",
    "arguments": {"query": "current date and time in Detroit Michigan"},
    "message_id": "msg_01BS7MKgXvUESY8yAEugphV2",
    "reasoning_id": null
}

// stream_end event
{
    "id": "anthropic_evt_BZ3rqDYyprnywNyL",
    "timestamp": 1756412898,
    "finish_reason": "Stop",
    "usage": {
        "prompt_tokens": 3448,
        "completion_tokens": 192,
        "cache_write_input_tokens": 0,
        "cache_read_input_tokens": 0,
        "thought_tokens": 0
    }
}

Advanced Usage

Custom Event Processing

Access raw events for complete control over handling:

$events = Prism::text()
    ->using('openai', 'gpt-4')
    ->withPrompt('Explain quantum physics')
    ->asStream();

foreach ($events as $event) {
    match ($event->type()) {
        StreamEventType::TextDelta => handleTextChunk($event),
        StreamEventType::ToolCall => handleToolCall($event),
        StreamEventType::StreamEnd => handleCompletion($event),
        default => null,
    };
}

Streaming with Tools

Stream responses that include tool interactions:

use Prism\Prism\Facades\Tool;

$searchTool = Tool::as('search')
    ->for('Search for information')
    ->withStringParameter('query', 'Search query')
    ->using(function (string $query) {
        return "Search results for: {$query}";
    });

return Prism::text()
    ->using('anthropic', 'claude-3-7-sonnet')
    ->withTools([$searchTool])
    ->withPrompt("What's the weather in Detroit?")
    ->asEventStreamResponse();

Data Protocol Output

The Vercel AI SDK format provides structured streaming data:

data: {"type":"start","messageId":"anthropic_evt_NPbGJs7D0oQhvz2K"}

data: {"type":"text-start","id":"msg_013P3F8KkVG3Qasjeay3NUmY"}

data: {"type":"text-delta","id":"msg_013P3F8KkVG3Qasjeay3NUmY","delta":"Hello"}

data: {"type":"text-end","id":"msg_013P3F8KkVG3Qasjeay3NUmY"}

data: {"type":"finish","messageMetadata":{"finishReason":"stop","usage":{"promptTokens":1998,"completionTokens":288}}}

data: [DONE]

Configuration Options

Streaming supports all the same configuration options as regular text generation, including temperature, max tokens, and provider-specific settings.

8 replies

TonsiTT Oct 1, 2025

@sixlive can the Available Events be used for tracing?

avengerweb Oct 11, 2025

@avengerweb would be interested in your feedback on using asBroadcast

To reiterate, my primary concern is the library's alignment with the broader Laravel ecosystem, a standard I've come to expect.

While streaming responses are central to mainstream applications like agents and chats, they are often flawed by design. A traditional request-response model, for instance, offers a much cleaner workflow with straightforward error handling and retry mechanisms. In contrast, a streaming approach can involve numerous yields for a single message, followed by tool calls, and so on.

Given that a single streaming request has countless potential points of failure, let's consider the practical challenges of implementing a production-ready application, such as an enterprise agent, using this design proposal:

Data Persistence: To maintain a message history, I need to persist messages to Laravel models. A streaming-first approach creates a dual responsibility: assembling the final message content for database storage while also broadcasting granular text deltas to the UI. This process feels inefficient and redundant.

State Management on Refresh: When rendering chat history, I'd rely on standard Laravel features. However, these tools are not designed to handle in-progress streams. If a user reloads the page mid-stream, the transient state is lost, leading to a fragmented and confusing user experience. This issue is directly tied to persistence, as it necessitates creating two distinct rendering pathways: one for persisted, historical messages and another for the transient, real-time stream, which is an important architectural consideration.

Stream Resumption: The most significant challenge is resuming a stream after a disruption like a page reload. This would require a sophisticated state-tracking mechanism, where the client stores message and chunk identifiers and the backend can re-initiate the stream from a specific event. This introduces substantial complexity. A far simpler and more robust solution would be to send the complete, cumulative message content in each event. This stateless approach ({"hi"}, {"hi, I am"}) eliminates the need for complex state management, allowing the frontend to simply render the latest payload and freeing up development resources. The current proposal does not appear to offer a viable solution to this fundamental challenge.

Workflow Orchestration and Tool Calls: As I've previously mentioned, robust agent systems rely heavily on background queues. This is not just for offloading the entire task, but for granularly managing each step—send, tool_call, return—as an independent, retryable job. This is critical for multi-step tool use (send -> tool, tool -> send). The proposed design appears to favor a monolithic, long-running process, which lacks the resilience required for production environments.

Error Handling and Recovery: A key reason for returning complete message resources is the ability to embed critical metadata, such as a workflow_id. This identifier is essential for transactionally managing a sequence of interactions. In the event of a failure, this allows for predictable recovery patterns, such as allowing the user to either discard the failed workflow entirely or resume from the last known-good state. The current proposal seems to overlook this crucial aspect of application resilience.

kauffinger Oct 12, 2025

Data Persistence: To maintain a message history, I need to persist messages to Laravel models. A streaming-first approach creates a dual responsibility: assembling the final message content for database storage while also broadcasting granular text deltas to the UI. This process feels inefficient and redundant.

State Management on Refresh: When rendering chat history, I'd rely on standard Laravel features. However, these tools are not designed to handle in-progress streams. If a user reloads the page mid-stream, the transient state is lost, leading to a fragmented and confusing user experience. This issue is directly tied to persistence, as it necessitates creating two distinct rendering pathways: one for persisted, historical messages and another for the transient, real-time stream, which is an important architectural consideration.

Stream Resumption: The most significant challenge is resuming a stream after a disruption like a page reload. This would require a sophisticated state-tracking mechanism, where the client stores message and chunk identifiers and the backend can re-initiate the stream from a specific event. This introduces substantial complexity. A far simpler and more robust solution would be to send the complete, cumulative message content in each event. This stateless approach ({"hi"}, {"hi, I am"}) eliminates the need for complex state management, allowing the frontend to simply render the latest payload and freeing up development resources. The current proposal does not appear to offer a viable solution to this fundamental challenge.

I'm currently operating two agentic systems that use streaming responses with livewire (see my template repo for how it works in detail) and the main problems with it are exactly these.
I always send the full 'stream history' to the frontend to not have to handle state in it. Whenever the user refreshes or something crashes, the chat is left in a weird state. Not unrecoverable, but unrecoverable from my side. Definitely no resuming possible.

So, to confirm, I currently have kind of two rendering pathways. One, the stream, and second, the normal blade rerender of the livewire component which will replace the streamed component when it ends. Both rendered through alpinejs, but we do have to do some unintuitive "magic".

I am currently thinking about refactoring this to queues (with laravel workflow) & broadcasting. I'd like to think that when I broadcast the main loop using that, I could get pretty close to resumable streams when I broadcast the loop's full output at all times. I'd still use livewire & only stream the latest response.

I also think there might be a way to handle this by eagerly persisting the main loop's state to the database - currently I only persist on finished streams for simplicity. In my livewire component I could just reinitiate the stream request at the last persisted state. Will be another request, but I don't think my users would care about the lost generation. Probably won't try that, as I'll probably have to opt for broadcasting anyways.

I'll have to admit that both approaches don't feel very laravelly.

I know that the streaming refactor aims to have compatibility with the vercel protocol. I'm sure it handles a lot of the state stuff.

avengerweb Oct 12, 2025

Data Persistence: To maintain a message history, I need to persist messages to Laravel models. A streaming-first approach creates a dual responsibility: assembling the final message content for database storage while also broadcasting granular text deltas to the UI. This process feels inefficient and redundant.
State Management on Refresh: When rendering chat history, I'd rely on standard Laravel features. However, these tools are not designed to handle in-progress streams. If a user reloads the page mid-stream, the transient state is lost, leading to a fragmented and confusing user experience. This issue is directly tied to persistence, as it necessitates creating two distinct rendering pathways: one for persisted, historical messages and another for the transient, real-time stream, which is an important architectural consideration.
Stream Resumption: The most significant challenge is resuming a stream after a disruption like a page reload. This would require a sophisticated state-tracking mechanism, where the client stores message and chunk identifiers and the backend can re-initiate the stream from a specific event. This introduces substantial complexity. A far simpler and more robust solution would be to send the complete, cumulative message content in each event. This stateless approach ({"hi"}, {"hi, I am"}) eliminates the need for complex state management, allowing the frontend to simply render the latest payload and freeing up development resources. The current proposal does not appear to offer a viable solution to this fundamental challenge.

I'm currently operating two agentic systems that use streaming responses with livewire (see my template repo for how it works in detail) and the main problems with it are exactly these. I always send the full 'stream history' to the frontend to not have to handle state in it. Whenever the user refreshes or something crashes, the chat is left in a weird state. Not unrecoverable, but unrecoverable from my side. Definitely no resuming possible.

So, to confirm, I currently have kind of two rendering pathways. One, the stream, and second, the normal blade rerender of the livewire component which will replace the streamed component when it ends. Both rendered through alpinejs, but we do have to do some unintuitive "magic".

I am currently thinking about refactoring this to queues (with laravel workflow) & broadcasting. I'd like to think that when I broadcast the main loop using that, I could get pretty close to resumable streams when I broadcast the loop's full output at all times. I'd still use livewire & only stream the latest response.

I also think there might be a way to handle this by eagerly persisting the main loop's state to the database - currently I only persist on finished streams for simplicity. In my livewire component I could just reinitiate the stream request at the last persisted state. Will be another request, but I don't think my users would care about the lost generation. Probably won't try that, as I'll probably have to opt for broadcasting anyways.

I'll have to admit that both approaches don't feel very laravelly.

I know that the streaming refactor aims to have compatibility with the vercel protocol. I'm sure it handles a lot of the state stuff.

I've successfully used laravel-workflow in production for over a year with more than 1,000 daily users. The library provides Workflow and Action classes, which are very effective. I had to modify the default prism-php stream implementation to allow tools to execute in a separate Action. My setup includes several actions, such as ProcessMessage, ProcessToolCall (asynchronous), CompressHistory, and ValidateResponse (e.g., for profanity, harmful content, etc.).

There are a couple of pitfalls. The out-of-the-box error reporting is lacking; if you don't catch exceptions yourself, you get a generic "workflow failed" message without a clear stack trace for why the specific action failed. This is fixable, but it's annoying. Another issue is that if a worker is restarted or killed mid-workflow, the workflow can sometimes enter a weird state, which also requires a custom fix.

On the frontend, I use some of Vercel's UI components, which are great. I don't use their streaming React hooks like useChat. It's easy to use their components without the hooks, and they recently added a transport abstraction layer. This new layer will likely make it possible to use their useChat hook with a custom "broadcast" transport, which could be a great solution.

kauffinger Oct 13, 2025

@avengerweb Very interesting, thanks for the info!

Uh oh!

Proposal: Streaming Into the Future with Prism Events #569

Uh oh!

Uh oh!

sixlive Aug 19, 2025 Maintainer

The Current State (And Why I Think We Can Do Better)

The Proposal: Event-Driven Streaming

Key Changes

The Magic: asEventStreamResponse()

Tool Calls Get First-Class Treatment

Thinking Events for Reasoning Models

The Developer Experience

Backward Compatibility

The Vision

Replies: 5 comments · 11 replies

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

sixlive Aug 19, 2025 Maintainer Author

Uh oh!

Uh oh!

Uh oh!

sixlive Aug 22, 2025 Maintainer Author

Uh oh!

sixlive Aug 28, 2025 Maintainer Author

Streaming Output

Quick Start

Server-Sent Events (SSE)

Vercel AI SDK Integration

WebSocket Broadcasting with Background Jobs

Event Types

Available Events

Event Data Examples

Advanced Usage

Custom Event Processing

Streaming with Tools

Data Protocol Output

Configuration Options

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

sixlive
Aug 19, 2025
Maintainer

Replies: 5 comments 11 replies

sixlive Aug 19, 2025
Maintainer Author

sixlive Aug 22, 2025
Maintainer Author

sixlive
Aug 28, 2025
Maintainer Author