fix: integrate ChatSummaryHistory into heartbeat protocol (#334)#473
fix: integrate ChatSummaryHistory into heartbeat protocol (#334)#473lml2468 wants to merge 1 commit into
Conversation
…ol (#334) ChatSummaryHistory (sidebar) was polling /summaries/batch-status on its own 5s interval without coordinating with SummaryListPage (2s interval) or SummaryDetailPage (15s fallback). Both already participated in the summary-batch-heartbeat protocol, but the sidebar neither dispatched nor listened, causing duplicate batch-status requests for the same taskIds. - Dispatch summary-batch-heartbeat after each successful batchStatus call so SummaryDetailPage can suppress its fallback poller. - Listen for summary-batch-heartbeat and skip sidebar polls when all active taskIds are already covered by another poller (freshness window: 10s). Adds 4 tests: heartbeat dispatch, suppression, expiry/resume, partial-cover.
OctoBoooot
left a comment
There was a problem hiding this comment.
Review: fix: integrate ChatSummaryHistory into heartbeat protocol (#334) (#473)
Verdict: Request changes
This fixes #472's two problems but reverts the design change that made them fixable, re-introducing #471's self-suppression 🔴. The fix is oscillating between two designs, each of which trades one bug for another — it needs both properties at once.
#472 issues — resolved:
- ✅ Partial-coverage: now uses
ids.every(id => heartbeatCoveredTaskIds.has(id))(was.some(...)), so the sidebar only suppresses when all its active tasks are covered — an uncovered task no longer gets stranded. - ✅ No-expiry over-suppression: the 10s staleness check (
Date.now() - this.lastHeartbeatTime < 10000) is back, so a silent-but-mounted coordinator no longer wedges the sidebar permanently.
But #471's self-suppression 🔴 is back:
Blockers
ChatSummaryHistory.tsx:112 + :145— the sidebar both dispatchessummary-batch-heartbeatafter each poll (line ~112) and listens for it (handleBatchHeartbeat, :145), with no source id indetail(onlytaskIds).window.dispatchEventis synchronous, so the sidebar's own heartbeat immediately re-enters its own listener, settingheartbeatCoveredTaskIds/lastHeartbeatTimeto its own poll. Trace as sole poller (no SummaryListPage mounted — the common sidebar-only case), tasks[42]: poll at t=0 → self-heartbeat sets covered={42}, lastHeartbeatTime=0 → t=5s tick:now-last < 10000AND[42].every(∈{42})→ suppressed → next real poll only at t≈10s. The intended 5s cadence degrades to ~10s whenever the sidebar polls alone. This is exactly the #471 self-suppression bug (there it was 5s→15s; the design that fixed it in #472 — pure consumer, never dispatch — has been reverted here). Fix: either keep #472's pure-consumer design (don't dispatch from the sidebar) plus #473'severy()+expiry, or tag the heartbeat with a source/owner id and havehandleBatchHeartbeatignore self-originated events.- Tests miss it again. The 4 new cases inject external heartbeats (
window.dispatchEvent(...)then advance) — none simulates "sidebar polls alone across ≥2 intervals with no external heartbeat" and asserts it keeps the 5s cadence. Same gap that let #471 through. Add that sole-poller test and it would fail on this diff.
Praise
- The
every()+ 10s-expiry combination is the correct coordination logic — it's strictly better than both #471 (some(), self-suppressing) and #472 (binary flag, no expiry). The two properties #473 gets right (full-coverage-only suppression + staleness recovery) are exactly the two that #471 and #472 each got half of. The remaining task is to keep these while not re-arming suppression from the sidebar's own heartbeat — i.e. combine #472's pure-consumer dispatch discipline with #473's coverage/expiry logic.
Out of scope (informational)
- Draft PR. This is the 3rd iteration on #334 (471→472→473), each fixing the prior's regression while re-introducing an earlier one. The end state needs: (a) full-coverage-only suppression [#473 ✅], (b) staleness expiry [#473 ✅], (c) no self-suppression [#472 ✅, #473 ✗]. Landing all three together closes it.
There was a problem hiding this comment.
This iterates on #472 (same issue #334, same 2 files). The two #472 residuals are now fixed, but the new self-dispatch reintroduces the #471-class self-suppression bug.
🔴 Blocking
🔴 Critical — sidebar self-suppresses its own polling (5s → 10s when sole poller). ChatSummaryHistory now both dispatches and listens to summary-batch-heartbeat with no source/owner discriminator in the event detail:
doPoll()dispatches the heartbeat with its owntaskIds—packages/dmworksummary/src/components/ChatSummaryHistory.tsx:114- the same component's
handleBatchHeartbeatconsumes that very event and recordslastHeartbeatTime = Date.now()+heartbeatCoveredTaskIds = taskIds—ChatSummaryHistory.tsx:147-151 - next 5s tick, the freshness check skips the poll because all ids are "covered" —
ChatSummaryHistory.tsx:89-93
Trace as the sole poller (no SummaryListPage/DetailPage mounted — the common case, since the sidebar lives in the chat panel and the user usually doesn't have the list page open): poll@t=5s → self-heartbeat → skip@t=10s → stale@t=15s polls again → skip@t=20s ... → effective interval degrades to 10s instead of the intended 5s. This is the exact #471 self-suppression class that #472 deliberately fixed by making the sidebar a pure consumer; #473 re-adds self-dispatch without guarding against self-consumption.
Fix: add a source/owner marker to the event detail (e.g. an instance id) and have handleBatchHeartbeat ignore heartbeats emitted by the same poller instance, while still honoring SummaryListPage heartbeats.
💬 Non-blocking
🔵 Missing regression test for the self-suppression path. The new tests cover external full-coverage suppression, stale-heartbeat recovery, and partial-coverage non-suppression — but none assert the standalone sidebar keeps its 5s cadence. Add: after one successful sidebar poll (which self-dispatches), advance another 5s with no external heartbeat and assert batchStatus is called again.
✅ Highlights / verified-fixed residuals
- Partial-coverage (#472 residual) — FIXED. Suppression predicate is now
ids.every(id => this.heartbeatCoveredTaskIds.has(id))(ChatSummaryHistory.tsx:92), not the oldsome(). On partial overlap (sidebar [10,20], heartbeat [10]) the whole set [10,20] is still polled. New test "does not skip poll when heartbeat only partially covers active taskIds" pins this. - No-expiry / permanent-suppression (#472 residual) — FIXED. Suppression is now time-bounded:
Date.now() - this.lastHeartbeatTime < 10000(ChatSummaryHistory.tsx:91). When SummaryListPage stops polling while mounted (activeIds→0,SummaryListPage.tsx:124-127stopBatchPollwithout unmount) no fresh heartbeat arrives, so within 10s the sidebar self-heals and resumes — no longer dependent onsummary-list-unmount. SummaryListPage is not in this PR's changed files and does not need to be: the expiry lives on the consumer side. New test "resumes poll when no new heartbeat arrives after a skipped cycle" pins this. - Event shape reuses the existing
summary-batch-heartbeatcontract used bySummaryListPage(SummaryListPage.tsx:152).
Note: PR is a Draft. CI is green (Build skipped, code-review pending). Could not run the focused vitest suite — this mirror checkout has no installed node_modules.
mochashanyao
left a comment
There was a problem hiding this comment.
[Octo-Q · automated review]
Verdict: Request changes — blocking findings below (data-flow traced).
octo-web PR#473 Review Report — fix: integrate ChatSummaryHistory into heartbeat protocol (#334)
Reviewer: Octo-Q (automated review)
Head SHA: 075c0d71b509b2112c0f8c34a754680170405d7f
Base: main (a3228c5)
1. Summary
This PR integrates the ChatSummaryHistory sidebar component into the existing summary-batch-heartbeat coordination protocol (already used by SummaryListPage → SummaryDetailPage). The intent is to eliminate redundant batchStatus API calls from the sidebar when SummaryListPage (2s interval) already covers the same taskIds.
Files changed (2 files, +178 −0):
ChatSummaryHistory.tsx— heartbeat dispatch + listen + suppression logicChatSummaryHistory.test.tsx— 4 new tests for heartbeat behavior
2. Findings
🔴 P1 — Self-heartbeat causes unintended poll self-suppression
File: ChatSummaryHistory.tsx
Diff-scope: New (introduced by this PR)
ChatSummaryHistory both dispatches and listens to the same summary-batch-heartbeat CustomEvent on window. Since window.dispatchEvent is synchronous and delivers to all registered listeners, the component receives its own heartbeat immediately after dispatching it.
Data-flow trace:
-
doPoll(taskIds)(line ~114) callsbatchStatus(taskIds), then:window.dispatchEvent(new CustomEvent('summary-batch-heartbeat', { detail: { taskIds } }));
-
handleBatchHeartbeat(line ~147) fires synchronously on the same component instance:this.heartbeatCoveredTaskIds = new Set(taskIds); // own taskIds this.lastHeartbeatTime = Date.now(); // now
-
Next
setIntervaltick (T+5000ms) inmaybeStartPoll()(line ~90):Date.now() - this.lastHeartbeatTime < 10000 // 5000 < 10000 → true ids.every(id => this.heartbeatCoveredTaskIds.has(id)) // own ids → true
→ Poll skipped.
-
T+10000ms:
10000 < 10000 → false→ poll fires, dispatches heartbeat again → cycle repeats.
Effect: The sidebar polls every 10 seconds instead of every 5 seconds, and only because of a strict-< edge case (10000 < 10000 = false), not by design. Status updates in the sidebar lag by up to 10s when no other poller is present (e.g., user has the list page closed but the sidebar open).
This is not the PR's intent — the listen path should only suppress in response to other pollers' heartbeats (e.g., SummaryListPage), not the component's own.
R1 check: This makes a previously working path (5s sidebar polling) produce stale data (10s effective interval). The degradation is user-visible: status badges in the sidebar update at half the intended frequency. → P1 blocking.
Fix direction: Either (a) set a _dispatchingHeartbeat flag before dispatch and skip self-events in the handler, (b) add a source field to the heartbeat detail (e.g., { source: 'sidebar', taskIds }) and ignore own-source events, or (c) move the dispatch to after the finally block or use a separate event name for the dispatch path.
⚠️ P2 — Test suite passes despite P1; tests don't isolate self-heartbeat from external heartbeat
File: ChatSummaryHistory.test.tsx
Diff-scope: New tests introduced by this PR
The four new tests all pass, but two of them conflate self-heartbeat with manually dispatched heartbeat:
-
"skips poll when heartbeat covers all active taskIds" (line ~468): The first poll at T+5s dispatches a self-heartbeat (sets
lastHeartbeatTime = T+5s). The manual dispatch afterward overwriteslastHeartbeatTimeto approximately the same value. The 8s advance then testslastHeartbeatTime + 8s < 10s— this passes regardless of the manual dispatch, because the self-heartbeat alone produces the same result. The test would still pass if the manual dispatch were removed. -
"resumes poll when no new heartbeat arrives after a skipped cycle" (line ~508): The suppression at T+5s is caused by the component's own self-heartbeat from
loadHistory→maybeStartPoll→firstdoPoll, not by the manual dispatch. The resume at T+10s happens because10000 < 10000 = false— a coincidental edge of the strict-<comparison, not a tested design property.
Missing test: A test that calls batchStatus once, then advances time without any manual heartbeat dispatch, and asserts that batchStatus IS called on the next cycle — this would currently fail (proving the P1 bug) and should be added as a regression guard.
P2 — heartbeatCoveredTaskIds wholesale replacement on each heartbeat
File: ChatSummaryHistory.tsx, handleBatchHeartbeat (line ~147)
Diff-scope: New
this.heartbeatCoveredTaskIds = new Set(taskIds);Each incoming heartbeat replaces the entire covered set rather than merging. If SummaryListPage polls {10, 20} at T and another poller dispatches {30} at T+1s, the {10, 20} coverage is lost. In practice this is unlikely to cause issues (SummaryListPage is the primary heartbeat source and polls all active IDs together), but the design is fragile for a multi-poller protocol.
Nit — Magic number 10000
File: ChatSummaryHistory.tsx, line ~91
The 10-second freshness window is hardcoded. SummaryListPage polls at 2s, so 10s = 5 missed polls before the sidebar considers the heartbeat stale. This should be a named constant (e.g., HEARTBEAT_FRESHNESS_MS) shared with SummaryDetailPage (which uses a 15s threshold at line ~468), to prevent silent drift if one is changed without the other.
3. Verification
| Item | Status | Evidence |
|---|---|---|
| Heartbeat dispatch matches SummaryListPage format | ✅ | Both use new CustomEvent('summary-batch-heartbeat', { detail: { taskIds } }) — SummaryListPage.tsx:152, ChatSummaryHistory.tsx:~114 |
| Listener registration/cleanup symmetric (mount/unmount) | ✅ | addEventListener in componentDidMount (line 44), removeEventListener in componentWillUnmount (line 52) |
| SummaryDetailPage handleBatchHeartbeat unaffected | ✅ | SummaryDetailPage.tsx:399 — only reads, never dispatches. Will now also receive sidebar heartbeats, which is a positive side effect (more fresh heartbeats). |
isPolling guard preserved |
✅ | doPoll still has if (this.isPolling) return at the top |
| Test additions correct in isolation | Tests verify the suppression path works but don't verify the non-suppression path still works (see P2 above) |
4. Data-Flow Trace
| Consumer | Data consumed | Upstream source | Does data actually arrive? |
|---|---|---|---|
maybeStartPoll suppression check |
this.heartbeatCoveredTaskIds |
handleBatchHeartbeat ← self-dispatch OR external dispatch |
✅ Yes — but self-dispatch always populates it with own IDs, causing unintended suppression |
maybeStartPoll suppression check |
this.lastHeartbeatTime |
handleBatchHeartbeat ← self-dispatch OR external dispatch |
✅ Yes — but self-dispatch sets it to Date.now() after every poll, making freshness check pass on next tick |
doPoll heartbeat dispatch |
taskIds parameter |
setInterval callback → getActiveTaskIds() → state.items filter |
✅ Yes — active task IDs flow correctly |
handleBatchHeartbeat |
event.detail.taskIds |
CustomEvent from self or other pollers |
✅ Yes — properly validated for empty/undefined |
5. Blind-spot Checklist (R5)
- C1 — Double-path parity: Dispatch + listen are symmetric within this component, which is the root cause of the P1 self-suppression bug. The other participant (
SummaryListPage) dispatches but does NOT listen, avoiding this problem. The protocol is asymmetric by design; this PR mistakenly added both roles to the same component. → HIT (P1). - C2 — Control-flow ordering / nesting: The heartbeat is dispatched before state update in
doPoll, meaninghandleBatchHeartbeatfires synchronously whileisPollingis still true. No double-action risk sincehandleBatchHeartbeatonly writes instance fields. Theevery()check in the suppression path is correct for emptyids(vacuously true) but is guarded by theids.length === 0early return above it. → Clear. - C3 — Authorization boundary: N/A (frontend UI, no auth-relevant changes).
- C4 — Authorization lifecycle: N/A.
- C5 — Build ≠ runtime: N/A (no build/packaging changes).
- C6 — Governance docs: N/A.
6. Cross-round Blocker Review (R6)
N/A — first review of this PR.
7. Recommendations
-
Fix P1: Prevent self-heartbeat suppression. Simplest fix: set a flag before dispatching and check it in the handler:
private _isDispatchingHeartbeat = false; // In doPoll(): this._isDispatchingHeartbeat = true; window.dispatchEvent(new CustomEvent('summary-batch-heartbeat', { detail: { taskIds } })); this._isDispatchingHeartbeat = false; // In handleBatchHeartbeat(): if (this._isDispatchingHeartbeat) return;
-
Add regression test: A test that verifies the sidebar still polls on consecutive 5s cycles when NO external heartbeat is dispatched:
// advance 5s → batchStatus called // advance 5s more → batchStatus called again (NOT suppressed)
-
Extract heartbeat freshness threshold as a shared constant with
SummaryDetailPage.
[Octo-Q] verdict: REQUEST_CHANGES — P1 self-heartbeat suppression effectively doubles the sidebar polling interval from 5s to 10s, degrading status freshness below the pre-PR baseline. Fix is straightforward (filter self-dispatched events in the handler), and tests need a regression guard for the non-suppression path.
yujiawei
left a comment
There was a problem hiding this comment.
Code Review — PR #473 (octo-web)
Reviewed at head SHA 075c0d71b509b2112c0f8c34a754680170405d7f. Diff is 2 files: ChatSummaryHistory.tsx (+29 source) and its test (+149). I read the sibling pollers (SummaryListPage.tsx, SummaryDetailPage.tsx) to understand the existing summary-batch-heartbeat protocol, and ran the unit tests locally (15/15 pass).
1. Spec compliance
The linked issue (#334) proposes "fix direction #2 (minimal): bring ChatSummaryHistory into the summary-batch-heartbeat protocol (send + receive)." The diff does implement send + receive. However, it implements only the suppressing half of the protocol and omits the two compensating signals the existing implementation always pairs with the heartbeat — see the P1 below. So as written it does not fully satisfy the intent ("close gap B" without opening a new one).
- Missing: the heartbeat is emitted, but
summary-status-changeandsummary-list-unmountare not (see P1). - Over-build: none.
- Deviation: none in approach; the gap is incompleteness, not a wrong direction.
2. Code quality
Verdict: Changes Requested
P1 — Asymmetric protocol participation can permanently starve a coexisting detail page
ChatSummaryHistory now emits summary-batch-heartbeat after every successful poll (ChatSummaryHistory.tsx:114). On the receiving side, SummaryDetailPage treats any heartbeat that includes its taskId as authoritative coverage and stops its own fallback poll:
// SummaryDetailPage.tsx:399-407
private handleBatchHeartbeat = (event: Event) => {
if (this.taskId == null) return;
const taskIds = (event as CustomEvent).detail?.taskIds;
if (!taskIds || !taskIds.includes(this.taskId)) return;
this.listPageActive = true;
this.lastEventTime = Date.now();
this.stopFallbackPoll();
};
In the working implementation, SummaryListPage pairs the heartbeat with two compensating signals so suppression stays safe:
summary-status-change(withchangedIds) on every detected status change (SummaryListPage.tsx:170) → makes the suppressed detail page reload its content.summary-list-unmounton unmount (SummaryListPage.tsx:84) → re-arms the suppressed detail page's fallback (SummaryDetailPage.tsx:454-461).
ChatSummaryHistory emits neither (confirmed: no summary-status-change / summary-list-unmount dispatch anywhere in the new code). The concrete failure path when the sidebar and an embedded/route SummaryDetailPage coexist over the same taskId (exactly the multi-instance scenario #334 documents):
- Sidebar
doPollsucceeds → dispatches heartbeat. - Detail page receives it →
listPageActive = true,stopFallbackPoll(). - Task transitions PROCESSING → COMPLETED. The sidebar detects this and updates its own list state (
ChatSummaryHistory.tsx:125-126) but emits nosummary-status-change, so the detail page never reloads. - The detail page's only self-heal (the
Date.now() - lastEventTime > 15000staleness reset) lives insidestartFallbackPoll(SummaryDetailPage.tsx:468-469), which is never re-entered because nothing re-arms it. The sidebar also never emitssummary-list-unmount.
Result: the detail view can sit on a stale PROCESSING state indefinitely while the task has actually completed. This is a user-visible regression introduced by the change, so it should be addressed before merge. Fix: either emit the companion summary-status-change (on detected change) and summary-list-unmount (on unmount) from ChatSummaryHistory to match the existing contract, or scope the heartbeat so it does not suppress pollers the sidebar can't compensate for.
P2 — Self-suppression degrades the sidebar's own 5s cadence
ChatSummaryHistory both dispatches (:114) and listens to (:44) the same event, and handleBatchHeartbeat records its own emission (:150-151). Because the freshness window (< 10000, :91) is larger than the poll interval (5000, line 95), each successful poll suppresses the next tick: every poll that covers all active ids will skip the following cycle, dropping the effective sidebar cadence from ~5s to ~10s (often more with request latency). The fix should ignore self-emitted heartbeats — e.g. tag the event detail with a per-instance sender id and skip it in handleBatchHeartbeat when it matches this.
P2 — All-or-nothing, replacement-based coverage leaves redundancy
handleBatchHeartbeat replaces coverage wholesale: this.heartbeatCoveredTaskIds = new Set(taskIds) (:150), and the suppression check requires ids.every(...) (:92). Two consequences:
- Overlapping subsets clobber each other: heartbeats for
[1]then[2]leave coverage{2}, so active[1,2]still polls both. - Partial coverage gives no benefit: a heartbeat for
[1]while active is[1,2]still polls[1,2], re-duplicating the covered id rather than polling only the uncovered[2].
So the duplicate-polling the PR targets can still persist for overlapping task sets. A taskId -> lastSeenTimestamp map (and polling only the uncovered subset) would address this. Advisory — not blocking on its own.
Positives
- Listener registration/teardown is correctly paired in
componentDidMount/componentWillUnmount— no leak. - Tests are well-structured and the freshness/expiry/partial-coverage cases are covered; they pass (15/15). Note the tests verify the sidebar's local behavior only and would not catch the P1 cross-component regression.
3. Overall verdict
CHANGES_REQUESTED — the P1 cross-component starvation is a real, user-visible regression. The two P2s are quality improvements that would make the fix actually eliminate the redundancy it targets.
根因 / 实现说明
ChatSummaryHistory(侧边栏)的 5sbatchStatus轮询完全游离在summary-batch-heartbeat协调协议之外:既不发送心跳,也不监听心跳。当SummaryListPage(2s)和/或SummaryDetailPage(15s fallback)已经覆盖相同 taskId 时,sidebar 仍在独立发请求,造成重复的 batch-status 调用。修复:把
ChatSummaryHistory接入心跳协议——batchStatus成功后广播summary-batch-heartbeat(与SummaryListPage行为一致),让SummaryDetailPage的 fallback poller 能感知到 sidebar 也在轮询。summary-batch-heartbeat,当所有活跃 taskId 都已被其他 poller 覆盖且心跳在 10s 内,跳过本周期轮询。心跳过期(≥10s 无新心跳)后自动恢复。改动摘要
packages/dmworksummary/src/components/ChatSummaryHistory.tsxheartbeatCoveredTaskIds、lastHeartbeatTime实例字段componentDidMount/componentWillUnmount注册/注销summary-batch-heartbeat监听doPoll()成功后 dispatchsummary-batch-heartbeatmaybeStartPoll()定时器回调中增加心跳抑制判断packages/dmworksummary/src/components/__tests__/ChatSummaryHistory.test.tsx测试
ChatSummaryHistory测试通过验证 (baseline diff, charter §6)
Fixes #334