fix(transcript): dedupe earlier terminal rows on the activity timeline #65
Merged
chicoxyzzy merged 2 commits into master on May 9, 2026
A Hecate Chat run that completes successfully emits two
terminal-shaped activity rows: a synced `task_run` mirror that
shows up as `run_result` with title "Run finished", and an
explicit `Activity{Type: status, Title: finalAgentChatActivityTitle(status)}`
appended by the agent-chat handler at turn end. The existing
`isTerminalRunSummary` filter only drops rows whose title
literally matches `/^run\s+(completed|failed|cancelled)$/i`,
which leaves type-only collisions (type=completed title="Done",
type=run_result title="Run finished", etc.) on the timeline. The
operator sees two side-by-side endings for one run.
`terminalAgentActivity` already picks the latest terminal-shaped
row; everything else with a terminal shape is redundant. Drop
earlier terminal rows by index when one was selected.
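The rule described above can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: `ActivityRow`, `TERMINAL_TYPES`, `isTerminalShaped`, and `dropEarlierTerminalRows` are hypothetical names, and the set of terminal types is assumed from the types mentioned in this message.

```typescript
// Hypothetical minimal row shape; the real activity record has more fields.
type ActivityRow = { type: string; title: string; terminal?: boolean };

// Assumed list of terminal-shaped types, taken from the types named in this PR.
const TERMINAL_TYPES = ["completed", "failed", "cancelled", "run_result"];

function isTerminalShaped(row: ActivityRow): boolean {
  return row.terminal === true || TERMINAL_TYPES.indexOf(row.type) !== -1;
}

// Keep every non-terminal row plus only the latest terminal-shaped row.
function dropEarlierTerminalRows(rows: ActivityRow[]): ActivityRow[] {
  let lastTerminal = -1;
  rows.forEach((row, index) => {
    if (isTerminalShaped(row)) lastTerminal = index;
  });
  if (lastTerminal === -1) return rows.slice(); // nothing to dedupe
  return rows.filter(
    (row, index) => !isTerminalShaped(row) || index === lastTerminal,
  );
}

// Mirrors the regression fixture: tool_call, run_result, completed.
const compacted = dropEarlierTerminalRows([
  { type: "tool_call", title: "Read file" },
  { type: "run_result", title: "Run finished" },
  { type: "completed", title: "Done" },
]);
```

Here the earlier `run_result` row is dropped and the `tool_call` and `completed` rows remain, which matches the behaviour the regression test is described as pinning.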
The line-250 special case (type=completed + title="final answer")
is technically subsumed by the new rule, but I'm leaving it
alone — it predates this change and dropping it would broaden the
dedupe surface beyond what the bug report needed.
Test pins the regression: three activities (tool_call,
run_result, completed) → only the last terminal row survives,
the earlier `run_result` is dropped. UI suite clean: 648 passes.
Pull request overview
This PR updates the UI transcript activity timeline compaction to prevent “double terminal” endings (e.g., a run_result row plus a later completed/failed/cancelled status row) by dropping earlier terminal-shaped activity rows when a later terminal row exists.
Changes:
- Add terminal-row dedupe logic in `compactAgentActivities` to keep only the latest terminal-shaped row.
- Introduce `isTerminalActivity` helper to identify terminal-shaped activities.
- Add a regression test covering `tool_call` + `run_result` + `completed` where only the latest terminal row should survive.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| ui/src/features/transcript/TranscriptActivityTimeline.tsx | Adds terminal-row dedupe in activity compaction and a helper for terminal detection. |
| ui/src/features/transcript/TranscriptActivityTimeline.test.tsx | Adds a test to pin the duplicate-terminal-row regression. |
Three follow-ups Copilot raised on the dedupe rule. All real:
1. **Predicate mismatch.** `terminalAgentActivity` walked back-to-front
over {completed, failed, cancelled} + activity.terminal, but the
new dedupe filter used `isTerminalActivity` which also includes
`run_result`. When a run_result-typed terminal arrived after a
`completed` row, terminalAgentActivity picked the EARLIER row
and the dedupe (using lastIndexOf on that row) dropped the
later run_result — backwards. Unify the predicate: both the
chooser and the dedupe filter now run through `isTerminalActivity`,
so they cannot disagree about what counts as terminal.
2. **Prefer diagnostic rows.** The synced `task_run` mirror
carries `terminal: true` AND a useful detail like "LLM call
failed on turn 3". The agent-chat handler at turn end appends
a generic `Activity{Type: "failed", Title: "Failed"}` with no
diagnostic detail. The previous dedupe naïvely kept "the latest
terminal-shaped row," which dropped the informative diagnostic
in favor of the bare-bones generic. Replace
`terminalAgentActivity`'s ad-hoc walk with
`pickTerminalActivityIndex`, a small chooser that prefers the
latest `terminal: true` row when one exists, falling back to
the latest type-only-terminal row otherwise.
3. **Stale test comment.** The fixture title was "Run finished"
but the comment said "Run completed". The choice of "finished"
was deliberate (it bypasses the `/^run (completed|failed|cancelled)$/`
regex in isTerminalRunSummary so the row reaches the new
dedupe filter); comment now reflects the actual fixture and
spells out why the title matters.
New test pins the prefer-diagnostic rule: a `run_result` row with
`terminal: true` and detail "LLM call failed on turn 3" wins over
a generic `failed` row from the agent-chat handler. UI suite clean:
649 passes.
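A rough sketch of the chooser and shared predicate described in points 1 and 2. The function names `isTerminalActivity` and `pickTerminalActivityIndex` come from this message, but their bodies here are assumptions, as are the `ActivityRow` shape, the `TERMINAL_TYPES` list, and the fixture title "Run ended".

```typescript
// Hypothetical minimal row shape; field names are assumptions for illustration.
type ActivityRow = {
  type: string;
  title: string;
  detail?: string;
  terminal?: boolean;
};

// Assumed terminal types, based on the ones named in this PR.
const TERMINAL_TYPES = ["completed", "failed", "cancelled", "run_result"];

// One predicate used by both the chooser and the dedupe filter,
// so the two cannot disagree about what counts as terminal.
function isTerminalActivity(row: ActivityRow): boolean {
  return row.terminal === true || TERMINAL_TYPES.indexOf(row.type) !== -1;
}

// Prefer the latest row flagged terminal: true; otherwise fall back to the
// latest type-only terminal row. Returns -1 when no terminal row exists.
function pickTerminalActivityIndex(rows: ActivityRow[]): number {
  let latestFlagged = -1;
  let latestTypeOnly = -1;
  rows.forEach((row, index) => {
    if (row.terminal === true) latestFlagged = index;
    else if (isTerminalActivity(row)) latestTypeOnly = index;
  });
  return latestFlagged !== -1 ? latestFlagged : latestTypeOnly;
}

const rows: ActivityRow[] = [
  { type: "tool_call", title: "Read file" },
  // Diagnostic mirror row; the title is a made-up placeholder.
  { type: "run_result", title: "Run ended", detail: "LLM call failed on turn 3", terminal: true },
  // Generic row appended at turn end, with no diagnostic detail.
  { type: "failed", title: "Failed" },
];

const chosenIndex = pickTerminalActivityIndex(rows);
const survivors = rows.filter(
  (row, index) => !isTerminalActivity(row) || index === chosenIndex,
);
```

With this fixture the diagnostic `run_result` row is chosen even though the generic `failed` row arrives later, so the detail stays on the timeline.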
Pull request overview
Copilot reviewed 2 out of 2 changed files in this pull request and generated no new comments.
Comments suppressed due to low confidence (1)
ui/src/features/transcript/TranscriptActivityTimeline.tsx:248
`terminalIndex` is picked from the full `activities` list, but the loop later unconditionally drops the generic agent-chat terminal row when it is `type === "completed" && title === "Final answer"` (line 250). In the common case `[run_result("Run finished"), completed("Final answer")]`, `pickTerminalActivityIndex` will choose the "Final answer" row, causing the dedupe rule to drop the `run_result` row and then the special case to drop the chosen row as well, leaving no terminal row in the expanded timeline. Consider excluding this special-cased "Final answer" row from `pickTerminalActivityIndex` / `isTerminalActivity`, or computing `terminalIndex` after applying the same visibility filters so the chosen terminal row is guaranteed to survive compaction.
const terminalIndex = pickTerminalActivityIndex(activities);
const lastTaskRunIndex = lastIndexOfTaskRunActivity(activities);
const lastApprovalIndexByID = lastIndexByApprovalID(activities);
const out: AgentChatActivityRecord[] = [];
for (const [index, activity] of activities.entries()) {
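One way the reviewer's second suggestion might look, as a sketch only: the terminal index is chosen among rows that pass the same visibility filter, so the chosen row cannot be dropped afterwards. `ActivityRow`, `isHiddenFinalAnswer`, `pickVisibleTerminalIndex`, and `compact` are hypothetical names, the terminal type list is assumed, and the prefer-diagnostic rule is left out for brevity.

```typescript
// Hypothetical minimal row shape for illustration.
type ActivityRow = { type: string; title: string; terminal?: boolean };

// Assumed terminal types, based on the ones named in this PR.
const TERMINAL_TYPES = ["completed", "failed", "cancelled", "run_result"];

function isTerminalActivity(row: ActivityRow): boolean {
  return row.terminal === true || TERMINAL_TYPES.indexOf(row.type) !== -1;
}

// Stand-in for the special case that always hides the "Final answer" row.
function isHiddenFinalAnswer(row: ActivityRow): boolean {
  return row.type === "completed" && row.title === "Final answer";
}

// Choose the latest terminal row that will actually be visible.
function pickVisibleTerminalIndex(rows: ActivityRow[]): number {
  let chosen = -1;
  rows.forEach((row, index) => {
    if (isTerminalActivity(row) && !isHiddenFinalAnswer(row)) chosen = index;
  });
  return chosen;
}

function compact(rows: ActivityRow[]): ActivityRow[] {
  const terminalIndex = pickVisibleTerminalIndex(rows);
  return rows.filter((row, index) => {
    if (isHiddenFinalAnswer(row)) return false; // special case still applies
    if (isTerminalActivity(row) && terminalIndex !== -1 && index !== terminalIndex) {
      return false; // earlier terminal row, superseded by the chosen one
    }
    return true;
  });
}

// The common case from the comment above.
const timeline = compact([
  { type: "run_result", title: "Run finished" },
  { type: "completed", title: "Final answer" },
]);
```

In this sketch the `run_result` row survives for the common case, so the expanded timeline keeps exactly one ending.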
Summary
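A Hecate Chat run that completes successfully emits two terminal-shaped activity rows:
- a synced `task_run` mirror that shows up as `run_result` with title "Run finished"
- an explicit `Activity{Type: status, Title: finalAgentChatActivityTitle(status)}` appended by the agent-chat handler at turn end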
The existing `isTerminalRunSummary` filter only drops rows whose title literally matches `/^run\s+(completed|failed|cancelled)$/i`, so type-only collisions (`type="completed"` + `title="Done"`, `type="run_result"` + `title="Run finished"`, …) survive. Operator sees two side-by-side endings for one run.
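To make the gap concrete, here is a small check of the quoted pattern on its own. It only exercises the regex; how `isTerminalRunSummary` prepares the title before matching is not shown in this PR, and `TERMINAL_RUN_SUMMARY`, `titles`, and `matches` are illustrative names.

```typescript
// The pattern quoted above, tested directly against sample titles.
const TERMINAL_RUN_SUMMARY = /^run\s+(completed|failed|cancelled)$/i;

const titles = ["Run completed", "run   FAILED", "Run finished", "Done"];

// true for titles the pattern would catch, false for ones that slip through.
const matches = titles.map((title) => TERMINAL_RUN_SUMMARY.test(title));
```

The first two titles match and the last two do not, which is why rows titled "Run finished" or "Done" reach the timeline unless another rule removes them.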
`terminalAgentActivity` already picks the latest terminal-shaped row; everything else with a terminal shape is redundant. New rule in `compactAgentActivities`: drop earlier terminal rows by index when a survivor exists.
The line-250 special case (`type="completed"` + `title="final answer"`) is technically subsumed by the new rule, but it predates this change so I left it untouched — narrower scope.
Closes the "duplicate completed / run completed rows" papercut from the Hecate Chat polish list.
Test plan