fix(notebook): stop streaming jumpiness — disable smooth markdown + stable adapter #695

Open
Movm wants to merge 6 commits into master from fix/notebook-chat-jumpiness

Conversation

@Movm Movm (Collaborator) commented Apr 26, 2026

Summary

Notebook chat answers visibly jump up and down during streaming on long, citation-dense responses. Two unrelated causes were compounding:

  1. MarkdownTextPrimitive smooth-streaming. assistant-ui's `smooth=true` default re-buffers SSE chunks and reveals them character-by-character via `requestAnimationFrame`. With notebook's citation-dense answers (placeholder `<sup>` badges via `processChildren(text, citationMap, true)`), this caused continuous line-wrap recalculation — observable as +400/-500 pixel oscillations mid-stream. Diagnosed via a temporary `ResizeObserver`-based viewport jump monitor: identical `+1256 / −1252` cancellation pairs landed on every `useSmooth` animation frame.
  2. `NotebookChatProvider.getConfig` recreated the adapter on every prop identity change. `useCallback` had seven deps; any upstream churn (e.g. `config.collections` rebuilt by `getNotebookConfig` per render) recreated the adapter, which forced `useLocalRuntime` to reinitialize and reset scroll/streaming state. Logged in console as the existing `[Notebook] ⚠ Adapter RECREATED` warning, firing before the first request.

General chat is unaffected — its answers are shorter, less citation-dense, and its provider doesn't churn the adapter the same way.

Fix

  • New `MarkdownStreamingContext`: per-thread switch for `smooth`. Default `true` (matches assistant-ui, preserves typewriter feel for general chat). `NotebookChatProvider` overrides to `false`. `CitationMarkdownText` reads from context.
  • Stabilized `getConfig`: same ref-pattern already in use for `threadId`/`getFilters`/`onComplete`. `getConfig` is now a `useCallback([])` that snapshots all inputs from refs at request time — adapter is created exactly once per provider mount.
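
The ref-snapshot idea behind the stabilized `getConfig` can be sketched in plain TypeScript (a minimal illustration — `makeStableGetConfig`, the `Ref` shape, and the `collections` field are hypothetical stand-ins, not the actual provider code):

```typescript
// A mutable ref carries the latest inputs; the callback is created once,
// so its identity is stable, yet it reads fresh values at call time.
// In React this would be a useCallback with an empty dep array reading
// from useRef; here a plain closure stands in for both.
type Ref<T> = { current: T };

interface AdapterConfig {
  collections: string[];
}

function makeStableGetConfig(inputsRef: Ref<AdapterConfig>): () => AdapterConfig {
  // Created exactly once; never recreated when inputs change.
  return () => ({ ...inputsRef.current });
}

const inputsRef: Ref<AdapterConfig> = { current: { collections: ["docs"] } };
const getConfig = makeStableGetConfig(inputsRef);

const before = getConfig();                              // snapshots ["docs"]
inputsRef.current = { collections: ["docs", "notes"] };  // upstream churn
const after = getConfig();                               // same identity, fresh data
```

Because `getConfig`'s identity never changes, anything keyed on it (here, the adapter passed to `useLocalRuntime`) is created once per mount instead of on every prop churn.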

Files

  • `packages/chat/src/context/MarkdownStreamingContext.tsx` (new)
  • `packages/chat/src/runtime/NotebookChatProvider.tsx` (refs + provider wrap)
  • `packages/chat/src/components/message-parts/CitationMarkdownText.tsx` (read context, pass `smooth`)
  • `packages/chat/src/index.ts` (export new context APIs)

Test plan

  • Notebook chat: send a long answer (3k+ chars, many citations). No mid-stream up/down jumps. End-of-stream snap is reduced to a single small delta.
  • Notebook chat: `[Notebook] ⚠ Adapter RECREATED` warning no longer fires before the first request.
  • General chat: typewriter animation still active for incoming chunks.
  • No regression in citation badges (placeholders during stream → real popovers after).

Movm added 6 commits April 25, 2026 19:50
Fixes the admin gate ("Kein Zugriff" page) caused by the legacy /status
handler stripping fields from the user response. The hand-picked literal
omitted is_admin and ~30 other UserProfile fields, so the frontend's
useAuthStore could never see is_admin=true even when the DB row had it.

Replaces the manual pick with toBetterAuthUser() — the same null-strip +
Zod-parse boundary already used by authMiddleware. The new ts-rest
contract enforces the response shape end-to-end, making this regression
class impossible to reintroduce.

- packages/contracts: authStatusContract + authStatusResponseSchema
  reusing userProfileSchema.nullable()
- apps/api: authStatusContractRouter mounted in routes.ts before the
  legacy authRouter; legacy /status removed from authCore.ts
- packages/shared: authStatus added to the typed ContractsClient
- middleware: toBetterAuthUser exported for reuse
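
The regression class being closed off can be sketched like this (illustrative only — `pickedUser` and `stripNulls` are hypothetical stand-ins, not the real `toBetterAuthUser`, and the real boundary additionally Zod-parses the result):

```typescript
// The buggy pattern: a hand-picked literal that silently forgets fields.
function pickedUser(row: Record<string, unknown>) {
  return { id: row.id, email: row.email }; // is_admin and ~30 others dropped
}

// The fixed pattern: pass the whole row through a null-strip boundary
// (the real code then validates the shape with a Zod schema).
function stripNulls(row: Record<string, unknown>) {
  return Object.fromEntries(
    Object.entries(row).filter(([, value]) => value !== null),
  );
}

const row = { id: "u1", email: "a@b.c", is_admin: true, banned_at: null };
const picked = pickedUser(row);   // loses is_admin
const stripped = stripNulls(row); // keeps is_admin, drops only null fields
```

Whole-object pass-through plus schema validation means a newly added profile field reaches the frontend by default, instead of requiring someone to remember to extend a literal.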
…page

Remove the two cards from the StatisticsSection grid so the startpage
focuses on document count and date range. Collapse the grid from
four columns to two so the remaining cards fill the row at all
breakpoints, and shrink the matching skeleton placeholders.
Regolo's gemma4-31b endpoint hangs upstream — every notebook chat with
the default Gemma model just spun forever because the AI SDK has no
built-in first-token deadline. This change:

- Adds litellmFetchWithThinkingDisabled (sibling to regoloThinkingFetch)
  injecting Ollama's `think: false` so LiteLLM-served gemma streams
  content instead of burning its entire token budget on `reasoning`.
- Re-routes the user-facing "Gemma 4" model from Regolo → LiteLLM. Old
  `gemma-regolo` ID is aliased server-side and migrated client-side
  (chatStore v6) to the new `gemma-litellm` ID.
- Adds Qwen 3.6 27B as a selectable model (already in the existing
  Regolo reasoning-stream allowlist, so no extra wiring).
- Introduces a 20s first-token deadline + single-step cross-provider
  fallback (gemma-litellm ↔ gpt-oss-regolo) in responseStreamingService.
  Qwen entries intentionally have no `fallback` field — preserving the
  Chinese-only-when-selected firewall (an informed-consent boundary,
  documented in ModelConfig).
- Fixes pre-existing bug: getModel('litellm', modelId) ignored the
  modelId arg and always used LITELLM_DEFAULT_MODEL.

Fallback is silent end-user-side: server emits a `fallback` SSE event,
both runtime adapters log it to the browser console, no UI banner.

Implementation notes:
- streamAndAccumulate / streamAndAccumulateWithReasoning now have a
  shared `wrapWithCompatCatch` factory and an `*OrThrow` internal layer
  used by streamWithFallback. Existing chat router callers see the same
  null-on-failure shape, plus the new deadline + empty-completion
  safety nets for free.
- Single shared deadline across initial-probe iterations (was
  accidentally giving 40s grace via per-call setTimeout).
- Reasoning streamer split into Phase-1 (race vs deadline until first
  text) + Phase-2 (drain without race) — eliminates wasted Promise.race
  microtask hops on every reasoning chunk after first content.
- Uses native AbortSignal.any() (Node 20.3+) instead of a hand-rolled
  composeAbortSignals helper.
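
The two-phase deadline pattern described above can be sketched roughly as follows (a simplified illustration under stated assumptions — `streamWithFirstTokenDeadline` is a hypothetical name, and the real `responseStreamingService` additionally wires in abort signals and the cross-provider fallback):

```typescript
// Phase 1 races only the first chunk against a deadline; Phase 2 drains
// the rest with no Promise.race microtask overhead per chunk.
async function streamWithFirstTokenDeadline(
  stream: AsyncIterable<string>,
  deadlineMs: number,
): Promise<string> {
  const it = stream[Symbol.asyncIterator]();
  let timer: ReturnType<typeof setTimeout> | undefined;
  const deadline = new Promise<never>((_, reject) => {
    timer = setTimeout(
      () => reject(new Error("no first token before deadline")),
      deadlineMs,
    );
  });

  let text = "";
  try {
    // Phase 1: only the wait for the first chunk is raced.
    const first = await Promise.race([it.next(), deadline]);
    if (first.done) throw new Error("empty completion");
    text += first.value;
  } finally {
    clearTimeout(timer); // single shared deadline, cancelled on first token
  }

  // Phase 2: plain drain — no race on subsequent chunks.
  for (let r = await it.next(); !r.done; r = await it.next()) {
    text += r.value;
  }
  return text;
}
```

On a deadline rejection, the real service would retry once on the paired model (gemma-litellm ↔ gpt-oss-regolo) and emit the `fallback` SSE event.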
…+ stable adapter

Two unrelated causes worked together to make notebook chat answers visibly
jump up and down during streaming:

1. assistant-ui's MarkdownTextPrimitive defaults to smooth=true, which
   re-buffers SSE chunks and reveals them character-by-character via
   requestAnimationFrame. With notebook's citation-dense answers
   (placeholder <sup> badges via processChildren(..., true)), this caused
   continuous line-wrap recalculation — observable as +400/-500 pixel
   oscillations in mid-stream. General chat is fine because its answers
   are shorter and less citation-dense.

2. NotebookChatProvider's getConfig was wrapped in useCallback with seven
   deps. Any upstream prop identity churn (e.g. config.collections rebuilt
   by getNotebookConfig on every render) recreated the adapter, which
   forced useLocalRuntime to reinitialize and reset scroll/streaming
   state. Visible as the "[Notebook] ⚠ Adapter RECREATED" warning firing
   before the first request.

Fix:
- New per-thread MarkdownStreamingContext. Default smooth=true preserves
  the typewriter feel for general chat. NotebookChatProvider sets
  smooth=false. CitationMarkdownText reads from context.
- All getConfig inputs moved behind refs (same pattern already used for
  threadId/getFilters/onComplete). getConfig itself is now a stable
  useCallback([]) that snapshots refs at request time, so the adapter is
  created exactly once per provider mount.
