
Commit cc90935

itomek, kovtcharov, claude, and github-advanced-security[bot] authored
Fix Agent UI Round 5: hide post-tool thinking, FileListView, text spacing (#566)
## Summary

Instrumentation-first fixes for three persistent Agent UI bugs (Rounds 3 and 4 had failed due to incorrect assumptions about data flow; this round added console.log diagnostics, observed actual runtime behavior, then fixed based on confirmed evidence).

- **Bug 8**: Post-tool italic thought text no longer shown after tool result cards
- **Bug 10**: File search results now render as an interactive `FileListView` with individual file rows and an expandable "+N more" button (survives page reload)
- **Bug 3**: Pre-tool and post-tool text chunks are now separated by a paragraph break

## Changes

| File | Change |
|------|--------|
| `AgentActivity.tsx` | Filter thinking steps after first tool step in `displaySteps`; add `FileListView` component; fix `hasDetail` to include `fileList` |
| `ChatView.tsx` | Extract `result_data.files` from `tool_result` events; `toolOccurredRef` flag for paragraph break between text chunks |
| `types/index.ts` | Add `fileList` field to `AgentStep` interface |
| `AgentActivity.css` | Styles for `.file-list-view`, `.file-list-item`, `.file-list-more` |
| `_chat_helpers.py` | Persist `fileList` in `captured_steps` when `result_data.type == "file_list"` |
| `models.py` | Add `FileListResponse` model and `fileList` field to `AgentStepResponse` |

## Test plan

- [ ] Send "Find a file on my computer" in a fresh session
- [ ] Verify no italic thought text appears below the tool result card (Bug 8)
- [ ] Verify file list renders with individual rows and clickable "+N more" button (Bug 10)
- [ ] Click "+N more": list should expand
- [ ] Reload page and re-open session: file list should still render (DB persistence)
- [ ] Verify response text has proper spacing between sentences (Bug 3)
- [ ] Unit tests: `pytest tests/unit/ --ignore=tests/unit/test_packaging.py` (1044 passed)
- [ ] Lint: all checks pass

## Notes

The `test_packaging.py` failure (`node_modules/@electron/node-gyp` Python files not in `setup.py`) is pre-existing and unrelated to these changes, as confirmed by running on the unmodified branch.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---------

Co-authored-by: Kalin Ovtcharov <kalin@extropolis.ai>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>
1 parent 8a6452f commit cc90935

51 files changed

Lines changed: 1851 additions & 4297 deletions


.claude/commands/finalize.md

Lines changed: 143 additions & 0 deletions
@@ -0,0 +1,143 @@
---
description: Finalizes implementation by rebasing onto main, running the GAIA claude.yml code review (Opus), fixing issues, linting, and looping until tests pass.
---

You are finalizing the current branch's implementation. Work through the following steps in order. Be thorough and methodical: fix every issue you find before moving on.

---

## Step 1: Rebase onto Latest Main

1. Record the current branch: `git rev-parse --abbrev-ref HEAD`
2. Verify the working tree is clean: `git status --porcelain`
   - If dirty: stop and tell the user to commit or stash their changes first.
3. Fetch latest: `git fetch origin main`
4. Rebase onto main: `git rebase origin/main`
   - If conflicts arise, resolve them per-commit. Prefer the feature branch intent unless main has clearly superseded it. After resolving each conflict: `git add <files>` then `git rebase --continue`.
   - If the rebase becomes intractable, run `git rebase --abort` and fall back to `git merge origin/main` with a warning to the user.
5. Push the rebased branch:
   - If no remote tracking branch exists yet: `git push -u origin <branch>`
   - If remote branch exists: `git push --force-with-lease`
6. Confirm success: `git log --oneline origin/main..HEAD` should show only feature commits, no merge commits.

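
The clean-tree check in step 2 and the push choice in step 5 each reduce to a small decision, sketched here in Python. This is an illustration only, not part of the command file: the helper names `is_clean` and `push_command` are hypothetical, and the caller is assumed to have already captured the output of `git status --porcelain` and to know whether a remote tracking branch exists.

```python
from typing import List


def is_clean(porcelain_output: str) -> bool:
    """Return True when `git status --porcelain` printed nothing.

    The porcelain format prints one line per changed or untracked path,
    so empty (or whitespace-only) output means a clean working tree.
    """
    return porcelain_output.strip() == ""


def push_command(branch: str, has_upstream: bool) -> List[str]:
    """Pick the push command from step 5 (hypothetical helper).

    has_upstream is supplied by the caller; how it is detected is
    outside the scope of this sketch.
    """
    if has_upstream:
        # The branch was rewritten by the rebase, so a plain push would be rejected.
        return ["git", "push", "--force-with-lease"]
    # First push: create the remote branch and set up tracking.
    return ["git", "push", "-u", "origin", branch]
```
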
---

## Step 2: Code Review (claude.yml Equivalent; Use Opus Agent)

This step replicates what the project's `.github/workflows/claude.yml` GitHub Action does when a PR is opened.

### 2a. Generate the diff

Run these commands to produce the review inputs:
```
git diff origin/main...HEAD > pr-diff.txt
git diff --name-status origin/main...HEAD > pr-files.txt
```

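
If the file list in `pr-files.txt` needs to be consumed programmatically, the `--name-status` output can be parsed as sketched below. This is a minimal Python sketch, assuming the default tab-separated output (no `-z`); the function name `parse_name_status` is hypothetical and not something the command file defines.

```python
from typing import List, Tuple


def parse_name_status(text: str) -> List[Tuple[str, str]]:
    """Parse `git diff --name-status` output into (status, path) pairs.

    Each line is tab-separated: a status code followed by one path, or by
    two paths (old, new) for renames and copies. The last field is kept,
    so renamed files are reported under their new path.
    """
    entries = []
    for line in text.splitlines():
        if not line.strip():
            continue  # ignore blank lines
        fields = line.split("\t")
        entries.append((fields[0], fields[-1]))
    return entries
```

Keeping the last field means a rename such as `R100<TAB>old.py<TAB>new.py` is listed as `new.py`, which is the path that exists on the feature branch.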
### 2b. Check for an existing PR and its review comments

Check if there is an open pull request for this branch:
```
gh pr list --head <branch> --json number,title,url
```

If a PR exists:
- Fetch existing Claude bot review comments: `gh pr view <number> --comments`
- Note any 🔴 Critical or 🟡 Important issues already flagged by the claude.yml action

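
With `--json`, `gh pr list` prints a JSON array of objects containing the requested fields, so the "does a PR exist" check can be scripted. The following is a minimal Python sketch; `first_open_pr` is a hypothetical helper, and it assumes the command's output has already been captured as a string.

```python
import json
from typing import Optional


def first_open_pr(gh_json: str) -> Optional[dict]:
    """Return the first PR object from `gh pr list --json number,title,url`.

    Returns None when the array is empty, meaning no open PR was listed
    for the branch. Empty input is treated the same as an empty array.
    """
    prs = json.loads(gh_json) if gh_json.strip() else []
    return prs[0] if prs else None
```

The returned dictionary carries the `number` needed for `gh pr view <number> --comments` and the `url` to report at the end of the run.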
### 2c. Launch the code-reviewer Opus agent

Use the **code-reviewer** sub-agent (which uses Claude Opus) to perform a full code review of `pr-diff.txt` and `pr-files.txt`. Instruct it to follow the same checklist as the `claude.yml` review:

**Review checklist (from claude.yml):**
- Code Quality & Patterns: architecture consistency, error handling, code style
- Security: SQL injection, command injection, XSS, secrets exposure, path traversal, unsafe deserialization, resource cleanup
- Testing: tests exist for new functionality, edge cases covered
- Documentation: docs/ updated for new features, CLI reference updated if needed
- Breaking Changes: public API compatibility
- Performance: N+1 queries, inefficient algorithms, unnecessary dependencies

**Severity classification:**
- 🔴 Critical: security issues, breaking changes, data loss risks
- 🟡 Important: bugs, architectural concerns, missing tests
- 🟢 Minor: style, optimizations (fix these too if easy)

**Do NOT flag:** Copyright headers, SPDX license identifiers.

### 2d. Fix all Critical and Important issues

Address every 🔴 Critical and 🟡 Important item found by the code-reviewer agent. Also fix 🟢 Minor issues when straightforward. After fixing, do a quick re-read of changed files to verify correctness.

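
The triage rule in 2d (always fix Critical and Important, fix Minor only when straightforward) can be written down as a small filter. This Python sketch is illustrative only: the `Severity` enum, the `Finding` dataclass, and its `easy_fix` flag are hypothetical names, since the source does not define a data format for review findings.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class Severity(Enum):
    CRITICAL = "critical"    # security issues, breaking changes, data loss risks
    IMPORTANT = "important"  # bugs, architectural concerns, missing tests
    MINOR = "minor"          # style, optimizations


@dataclass
class Finding:
    severity: Severity
    summary: str
    easy_fix: bool = False  # hypothetical flag: is the fix straightforward?


def blocking(findings: List[Finding]) -> List[Finding]:
    """Findings that must be resolved before the review counts as passing."""
    return [f for f in findings if f.severity in (Severity.CRITICAL, Severity.IMPORTANT)]


def to_fix(findings: List[Finding]) -> List[Finding]:
    """Everything to fix in 2d: all blocking findings plus easy Minor ones."""
    return [f for f in findings if f.severity is not Severity.MINOR or f.easy_fix]
```
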
---

## Step 3: The Ralph Wiggum Loop

Repeat this loop until **all three conditions pass**:
- ✅ Lint passes with no errors
- ✅ Code review finds no Critical or Important issues
- ✅ Unit tests pass

### 3a. Lint

Run the linter with auto-fix:
```
python util/lint.py --all --fix
```

Check the output. If the linter reports issues it could not auto-fix, fix them manually. Common issues:
- Import ordering (isort violations): reorder imports
- Formatting (black violations): reformat the affected code
- Trailing whitespace, missing newlines at EOF

Re-run lint to confirm it passes cleanly before continuing.

### 3b. Re-run Code Review (Opus agent)

Launch the **code-reviewer** agent again on the current diff (`git diff origin/main...HEAD`) to check if your fixes introduced any new issues or if any Critical/Important items remain unresolved.

Fix any newly found Critical or Important issues.

### 3c. Run Unit Tests

```
python -m pytest tests/unit/ -x --tb=short
```

The `-x` flag stops at the first failure. Analyze failures:
- Read the full traceback
- Identify the root cause (changed interface, broken import, logic error, etc.)
- Fix the underlying issue; do NOT skip or mock away real failures
- Re-run tests to confirm the fix

If all unit tests pass, optionally run the full test suite:
```
python -m pytest tests/ -x --tb=short
```
(Skip integration tests that require external services like Lemonade if they are not running)

### 3d. Loop Control

After completing 3a–3c:
- If **any step failed**, return to 3a and repeat
- If **all steps passed**, exit the loop

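
The loop control in 3d can be sketched as a plain retry loop over three checks. This is a minimal Python sketch under stated assumptions: `finalize_loop` and the three callables are hypothetical stand-ins for running lint, the code review, and the unit tests (each returning True on success), and the `max_rounds` cap is an added safety limit that the source does not specify.

```python
from typing import Callable

Check = Callable[[], bool]


def finalize_loop(lint: Check, review: Check, tests: Check, max_rounds: int = 10) -> int:
    """Repeat lint -> review -> tests until all three pass in the same round.

    Returns the 1-based round number on which everything passed.
    Raises RuntimeError if max_rounds is exhausted (an assumption of this
    sketch; the source loop has no explicit upper bound).
    """
    for round_number in range(1, max_rounds + 1):
        # Run all three steps every round, mirroring "after completing 3a-3c".
        results = [lint(), review(), tests()]
        if all(results):
            return round_number
        # Any failure sends the loop back to 3a for another full round.
    raise RuntimeError(f"checks still failing after {max_rounds} rounds")
```
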
---

## Completion

When the loop exits with everything passing, report:

```
✅ FINALIZE COMPLETE

Branch: <branch-name>
Rebased onto: main

Code Review: No Critical or Important issues remaining
Lint: Passing
Tests: All unit tests passing

Ready for PR review / merge.
```

If a PR already exists, note its URL so the user can submit it for final merge.
