Refactor workshop documentation by removing outdated examples
- Eliminated outdated example sections from chapter3-2.md and chapter3-3.md to enhance clarity and focus on relevant content.
- Streamlined the documentation to improve user engagement and understanding of streaming and interactive features.
On the frontend, you'll need to handle Server-Sent Events (SSE) or WebSockets to receive and display the streamed content:
!!! example "Frontend Streaming Implementation"

    *This code shows how to handle streaming responses on the frontend, creating a reader for the response stream, decoding chunks as they arrive, and updating the UI in real-time to display incremental results.*
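
    A minimal sketch using `fetch` with a response-body reader (one way to consume an SSE-style stream); the `/api/stream` endpoint and `#answer` element are illustrative assumptions, not part of the workshop code:

    ```typescript
    // Read the streamed response body and append decoded chunks to the page.
    async function streamAnswer(question: string): Promise<void> {
      const output = document.querySelector<HTMLElement>("#answer")!;
      output.textContent = "";

      const response = await fetch("/api/stream", {
        method: "POST",
        headers: { "Content-Type": "application/json" },
        body: JSON.stringify({ question }),
      });

      // Create a reader over the response stream and decode chunks as they arrive.
      const reader = response.body!.getReader();
      const decoder = new TextDecoder();

      while (true) {
        const { done, value } = await reader.read();
        if (done) break;
        // Update the UI incrementally with each decoded chunk.
        output.textContent += decoder.decode(value, { stream: true });
      }
    }
    ```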
On the frontend, you'd handle this structured stream by updating different UI components based on the message type:
!!! example "Structured Data Streaming Handler"

    *This code processes a structured data stream, separating different components (answer chunks, citations, follow-up questions) and rendering each in their appropriate UI sections.*
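
    A minimal sketch under the assumption that the stream carries newline-delimited JSON messages with a `type` field; the message shape and element ids are illustrative:

    ```typescript
    interface StreamMessage {
      type: "answer" | "citation" | "followup";
      content: string;
    }

    // Route each streamed message to the UI section that matches its type.
    async function handleStructuredStream(response: Response): Promise<void> {
      const answerEl = document.querySelector<HTMLElement>("#answer")!;
      const citationsEl = document.querySelector<HTMLElement>("#citations")!;
      const followupsEl = document.querySelector<HTMLElement>("#followups")!;

      const reader = response.body!.getReader();
      const decoder = new TextDecoder();
      let buffer = "";

      while (true) {
        const { done, value } = await reader.read();
        if (done) break;
        buffer += decoder.decode(value, { stream: true });

        // Messages are newline-delimited; keep any partial line in the buffer.
        const lines = buffer.split("\n");
        buffer = lines.pop() ?? "";

        for (const line of lines) {
          if (!line.trim()) continue;
          const message: StreamMessage = JSON.parse(line);
          if (message.type === "answer") {
            answerEl.textContent += message.content; // grow the answer text
          } else if (message.type === "citation") {
            citationsEl.insertAdjacentHTML("beforeend", `<li>${message.content}</li>`);
          } else {
            followupsEl.insertAdjacentHTML("beforeend", `<li>${message.content}</li>`);
          }
        }
      }
    }
    ```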
This approach creates a dynamic, engaging experience where different parts of the response appear progressively, keeping users engaged throughout the generation process.
On the frontend, you'd display these interstitials in sequence during the waiting period:
!!! example "Meaningful Interstitials Implementation"

    *This code shows how to fetch and display domain-specific interstitial messages that rotate every few seconds. The animation and context-specific messages engage users during waiting time, making the system feel more responsive.*
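
    A minimal sketch, assuming a hypothetical `/api/interstitials` endpoint that returns topic-specific messages and a `#status` element with a CSS `fade` class (all illustrative):

    ```typescript
    // Rotate context-specific waiting messages until the real response arrives.
    async function showInterstitials(topic: string): Promise<() => void> {
      const statusEl = document.querySelector<HTMLElement>("#status")!;

      // Fetch a set of domain-specific messages for this query.
      const res = await fetch(`/api/interstitials?topic=${encodeURIComponent(topic)}`);
      const messages: string[] = await res.json();

      let index = 0;
      statusEl.textContent = messages[0] ?? "Working on it…";

      // Advance to the next message every few seconds with a small fade animation.
      const timer = setInterval(() => {
        index = (index + 1) % messages.length;
        statusEl.classList.add("fade");
        statusEl.textContent = messages[index];
        setTimeout(() => statusEl.classList.remove("fade"), 300);
      }, 3000);

      // Call the returned function to stop the rotation once the answer is ready.
      return () => clearInterval(timer);
    }
    ```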
## Optimizing Actual Performance
Here's a simple but effective approach for Slack bots:
1. **Feedback Collection**: Pre-fill emoji reactions (👍 👎 ⭐) to prompt users for feedback on the response quality.
!!! example "Slack Bot Pseudo-Streaming Implementation"

    *This code shows how to implement pseudo-streaming in a Slack bot environment, using message updates, emoji reactions, and staged processing to create the illusion of progress and maintain user engagement.*
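
    A minimal sketch using the `@slack/web-api` client; the channel handling, stage labels, and `generateAnswer()` helper are illustrative placeholders:

    ```typescript
    import { WebClient } from "@slack/web-api";

    const slack = new WebClient(process.env.SLACK_BOT_TOKEN);

    async function answerInSlack(channel: string, question: string): Promise<void> {
      // Post an immediate placeholder so the user sees progress right away.
      const posted = await slack.chat.postMessage({
        channel,
        text: ":hourglass: Looking into that…",
      });
      const ts = posted.ts as string;

      // Update the same message as each stage completes (pseudo-streaming).
      await slack.chat.update({ channel, ts, text: ":mag: Searching relevant documents…" });
      const answer = await generateAnswer(question); // your model / RAG call goes here
      await slack.chat.update({ channel, ts, text: answer });

      // Pre-fill feedback reactions so rating the answer is a single click.
      for (const name of ["thumbsup", "thumbsdown", "star"]) {
        await slack.reactions.add({ channel, timestamp: ts, name });
      }
    }

    // Placeholder for the actual answer generation.
    async function generateAnswer(question: string): Promise<string> {
      return `Here's what I found about "${question}"…`;
    }
    ```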
!!! tip "Slack Feedback Collection"
    By pre-filling emoji reactions (👍 👎 ⭐), you increase the likelihood of receiving user feedback. This approach places feedback options directly in the user's view, rather than requiring them to take additional steps. In testing, this approach increased feedback collection rates by up to 5x compared to text-based feedback prompts.
On the frontend, you can turn these citations into interactive elements:
!!! example "Interactive Citations Rendering"

    *This code turns a response with citation markers into an interactive UI where citations are clickable elements, and sources can be rated for relevance.*
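
    A minimal sketch, assuming the response contains `[1]`-style markers and a parallel list of sources; the `Source` shape and the `/api/feedback/source` endpoint are illustrative:

    ```typescript
    interface Source {
      id: number;
      title: string;
      url: string;
    }

    function renderWithCitations(answer: string, sources: Source[], container: HTMLElement): void {
      // Replace [n] markers with clickable citation links.
      container.innerHTML = answer.replace(/\[(\d+)\]/g, (match, id) => {
        const source = sources.find((s) => s.id === Number(id));
        return source
          ? `<a class="citation" href="${source.url}" title="${source.title}" target="_blank">[${id}]</a>`
          : match;
      });

      // List the sources with simple relevance-rating buttons.
      const list = document.createElement("ul");
      for (const source of sources) {
        const item = document.createElement("li");
        item.innerHTML = `${source.title}
          <button data-id="${source.id}" data-vote="up">👍</button>
          <button data-id="${source.id}" data-vote="down">👎</button>`;
        list.appendChild(item);
      }

      // Forward ratings to a feedback endpoint for later analysis.
      list.addEventListener("click", (event) => {
        const button = (event.target as HTMLElement).closest("button");
        if (!button) return;
        fetch("/api/feedback/source", {
          method: "POST",
          headers: { "Content-Type": "application/json" },
          body: JSON.stringify({ sourceId: button.dataset.id, vote: button.dataset.vote }),
        });
      });
      container.appendChild(list);
    }
    ```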
This creates an interactive experience where citations are visually distinct, clickable elements. When users engage with these elements, you can collect valuable feedback while enhancing their understanding of the response.
Taking this a step further, you can stream the thinking process as a separate UI component or interstitial. This serves two purposes: it makes the waiting time more engaging by showing users that complex reasoning is happening, and it allows users to intervene if they notice the reasoning going astray.
!!! example "Chain of Thought Streaming Implementation"

    *This code processes streamed tokens containing XML-tagged thinking and answer sections, rendering them in separate UI components. This makes the reasoning process transparent and engaging for users.*
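
    A minimal sketch, assuming the model wraps its output in `<thinking>` and `<answer>` tags (the tag names and element handling are illustrative):

    ```typescript
    // Returns a function that routes each streamed chunk into the thinking panel
    // or the answer panel, depending on which tagged section is currently open.
    function createThinkingRouter(thinkingEl: HTMLElement, answerEl: HTMLElement) {
      let section: "thinking" | "answer" | null = null;
      let buffer = "";

      const emit = (text: string): void => {
        if (section === "thinking") thinkingEl.textContent += text;
        if (section === "answer") answerEl.textContent += text;
      };

      return (chunk: string): void => {
        buffer += chunk;

        while (buffer.length > 0) {
          const tag = buffer.match(/<\/?(thinking|answer)>/);
          if (!tag) {
            // No complete tag yet: emit what we can, but hold back a possible
            // partial tag (anything after the last '<') until more text arrives.
            const partial = buffer.lastIndexOf("<");
            emit(partial === -1 ? buffer : buffer.slice(0, partial));
            buffer = partial === -1 ? "" : buffer.slice(partial);
            return;
          }

          // Emit the text before the tag, then open or close the tagged section.
          const idx = tag.index ?? 0;
          emit(buffer.slice(0, idx));
          section = tag[0].startsWith("</") ? null : (tag[1] as "thinking" | "answer");
          buffer = buffer.slice(idx + tag[0].length);
        }
      };
    }
    ```

    Each chunk from the token stream would then be passed to the function returned by `createThinkingRouter(thinkingPanel, answerPanel)` as it arrives.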
A financial advisory firm implemented this approach for their investment recommendation system. As the model reasoned through market conditions, client preferences, and portfolio considerations, this thinking was streamed to the advisor in a separate panel. If the advisor noticed a misunderstanding, they could pause generation and refine their query before the final recommendation.