perf(chats): make chat streaming animations cheaper #1046
Draft
Reduce re-renders, framer-motion overhead, and token churn in ChatMessages:

- Custom memo comparator on ChatMessageItem so hover/copy/highlight on one message no longer re-renders the whole list.
- Local per-item hover state (was a global hoveredMessageId, which triggered a cascade re-render on every mouse move across messages).
- Replace hover-revealed motion.buttons with plain <button>s using CSS opacity transitions driven by local state.
- Drop the per-token motion.span stagger animation and onComplete playNote hooks. The chat synth note now fires once per assistant streaming delta from a centralised effect (throttled by useChatSynth).
- Replace the message-enter motion.div with a plain <div> for pre-existing messages (only new streaming messages fade in).
- Swap the motion.div urgent background animation for a CSS keyframe (animate-urgent-bg).
- Memoise displayTokens, imageUrls, and link-preview URL collection per item to skip work unless the message or its content changed.
- Pre-filter highlightSegment / localHighlightSegment at the parent so only the active target item sees a non-null ref (all others skip the render).
- Stable useCallback identities for copy/delete/play/stop handlers.
- Memoise currentMessagesToDisplay in ChatsAppComponent and currentRoomMessagesLimited in useChatRoom so slice() doesn't return a fresh array reference on every parent render.

Co-authored-by: Ryo Lu <me@ryo.lu>
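A custom memo comparator like the first bullet describes might be sketched as below; the prop names and shapes are illustrative guesses, not the actual ChatMessageItem props.

```typescript
// Hypothetical prop shape -- the real ChatMessageItem props differ.
interface ChatMessageItemProps {
  message: { id: string; content: string };
  isStreaming: boolean;
  // Non-null only for the item currently being spoken/highlighted;
  // the parent pre-filters so every other item receives null.
  highlightSegment: { start: number; end: number } | null;
}

// Custom comparator for React.memo: return true to SKIP re-rendering.
// Hover/copy state is local to each item, so it never appears here,
// and mouse movement over one message cannot re-render the rest.
function areItemPropsEqual(
  prev: ChatMessageItemProps,
  next: ChatMessageItemProps
): boolean {
  return (
    prev.message.id === next.message.id &&
    prev.message.content === next.message.content &&
    prev.isStreaming === next.isStreaming &&
    prev.highlightSegment === next.highlightSegment
  );
}

// Wired up roughly as: memo(ChatMessageItem, areItemPropsEqual)
```

With this shape, a hover or copy on one message changes none of these props for its siblings, so the rest of the list skips rendering entirely.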
The previous revision used React local state plus mouseenter/mouseleave handlers to toggle the visibility of the per-message copy/speak/delete toolbar. That triggered a state update and re-render on every mouse move across messages, and it was also gated by an `isTouchDevice()` check that returns true on hybrid devices (navigator.maxTouchPoints > 0), silently breaking the hover toolbar there.

Switch to pure CSS hover via Tailwind's `group` / `group-hover:` on the message wrapper. Desktop hover no longer re-renders the item at all; touch devices still get a state-driven toggle via onTouchStart.

Co-authored-by: Ryo Lu <me@ryo.lu>
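The split between the CSS-only desktop path and the state-driven touch path can be sketched as a small helper that computes the toolbar's className; the exact Tailwind classes are illustrative, not the ones in the diff.

```typescript
// Sketch of the toolbar's visibility classes (Tailwind). The message
// wrapper carries the `group` class; on desktop, `group-hover:` reveals
// the buttons with pure CSS -- no React state, no re-render. Touch
// devices set `isToolbarOpen` from onTouchStart instead. Class names
// here are illustrative.
function toolbarClassName(isToolbarOpen: boolean): string {
  const base = "flex gap-1 transition-opacity duration-150";
  return isToolbarOpen
    ? `${base} opacity-100` // touch: state-driven toggle
    : `${base} opacity-0 group-hover:opacity-100`; // desktop: CSS only
}
```

Because the non-open branch is a constant string, desktop hover never passes through React at all; only a touch toggle changes state.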
Further reduce per-delta work during assistant message streaming (useChat throttles to ~50ms, so each full chat render cycle runs ~20×/sec for long replies):

- Add an isStreaming flag: only true for the last assistant message while useChat's isLoading === true. In that case, render text parts as a single plain-text span inside .whitespace-pre-wrap — no per-token markdown segmentation, no dozens of motion-less <span> nodes per bubble. Once streaming completes, the message re-renders with full markdown tokenization (bold/italic/links/citations/highlight).
- Skip allUrls extraction while streaming. The URL regex otherwise scans the whole message on every delta; link previews render after the stream.
- decodeHtmlEntities fast-paths text with no '&' marker, avoiding a DOMParser allocation for the ~99% of chat text with no entities.

Net effect: during streaming, the streaming bubble's per-delta work drops from O(N tokens) React element creation plus N regex scans to an O(1) text node update, which is what the browser should do anyway.

Co-authored-by: Ryo Lu <me@ryo.lu>
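The two fast paths above can be sketched as plain functions. The function and parameter names are illustrative, and the real decodeHtmlEntities falls back to DOMParser; a small entity map stands in here so the sketch runs outside a browser.

```typescript
// Streaming flag: true only for the last assistant message while the
// chat hook reports an in-flight response.
function isMessageStreaming(
  index: number,
  messageCount: number,
  role: "user" | "assistant",
  isLoading: boolean
): boolean {
  return isLoading && role === "assistant" && index === messageCount - 1;
}
// While this returns true, the bubble renders its text as one plain
// span; markdown tokenization only runs once it flips back to false.

// Minimal entity table standing in for the DOMParser fallback.
const ENTITIES: Record<string, string> = {
  "&amp;": "&",
  "&lt;": "<",
  "&gt;": ">",
  "&quot;": '"',
  "&#39;": "'",
};

function decodeHtmlEntities(text: string): string {
  // Fast path: a string with no '&' cannot contain an entity, so
  // return it untouched and skip the decode (in the real code, a
  // DOMParser allocation) entirely.
  if (!text.includes("&")) return text;
  return text.replace(/&(?:amp|lt|gt|quot|#39);/g, (m) => ENTITIES[m]);
}
```

The `includes("&")` check is a single linear scan with no allocation, which is why it pays off on the ~99% of deltas that carry no entities.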
Reduce the CPU/render cost of the streaming chat bubble so Chats stays responsive even while Ryo is mid-reply.
Problem
During streaming, `useChat` fires a text delta every ~50ms. On each delta, the currently-streaming assistant message was rebuilt as a full `motion.div`/`motion.span` tree — every token became a Framer Motion node with a per-token `onComplete` callback. Every other `ChatMessageItem` was forced to re-render too, because hover/copy/highlight IDs were held globally and passed down as string IDs.

For a multi-paragraph reply, that's hundreds of React elements rebuilt ~20×/second, plus a DOMParser allocation per delta in `decodeHtmlEntities`. With a long history or a running stream, this made hovers laggy and input feel unresponsive.

Changes
Cherry-picks two previously-scoped perf commits from `cursor/optimize-chat-messages-rendering-1c1a` (which had never been merged), then layers on a streaming fast-path and a hot-path micro-optimization:

- `perf(chats): optimize message list rendering and animations` (5af95c87a) — custom `memo` comparator on `ChatMessageItem`, local hover state per item, CSS urgent-bg keyframe instead of framer keyframes, memoised tokens/URLs/image URLs, stable callback identities, pre-filtered highlight segments, memoised display-message array at the `ChatsAppComponent`/`useChatRoom` level.
- `fix(chats): use CSS group-hover for toolbar icons` (46ea94a35) — pure-CSS `group-hover:` for the copy/speak/delete toolbar; no re-render per mouse enter/leave. Touch devices still toggle via React state.
- `perf(chats): streaming fast-path + entity decode short-circuit` (new) — an `isStreaming` flag flows from `ChatMessagesContent` to `ChatMessageItem`; it is only true for the last assistant message while `isLoading`. While streaming, text parts render as a single plain-text span inside `whitespace-pre-wrap`. Once the stream completes, the item re-renders with full markdown tokenization (bold/italic/links/citations/speech highlight). `decodeHtmlEntities` fast-paths when the input has no `&`, avoiding a `DOMParser` allocation for the ~99% of chat text with no entities.

Per-delta work on the streaming bubble drops from O(N tokens) React element creation plus multiple regex scans to an O(1) text-content update, which is what the browser should do anyway.

Testing
- `bun run build` — builds cleanly.
- `bun run test:unit` — 130 pass, 0 fail.
- `bun test tests/test-chat-markdown.test.ts` — 4 pass, 0 fail.

Per `AGENTS.md` / Cursor Cloud instructions, GUI-driven testing is skipped for non-visual perf changes. The visual output while streaming is a plain-text span (no markdown styling mid-stream); once the final delta arrives, the bubble re-renders with the usual bold/italic/link rendering. For short replies that fit in a single delta, this transition isn't observable.