Skip to content

Commit 2b79fce

Browse files
ZviBaratzclaude
andcommitted
fix: resolve pipeline findings F-022 through F-028
- F-022: rewrite ego-review Phase 3 to verify resource graph instead of rebuilding it manually - F-023: add skip logic to Phase 5a for automated AI checklist items - F-024: close as already fixed (check-disclosures.py has all 6 capabilities) - F-025: add --format=table to build-resource-graph.py for markdown output - F-026: redesign ego-submit from 3-agent concurrent to 2-phase sequential+parallel - F-027: add deduplicate:true to R-VER48-04b pattern rule - F-028: defer baseline suppression (inline ignore sufficient) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
1 parent 7203293 commit 2b79fce

File tree

5 files changed

+136
-52
lines changed

5 files changed

+136
-52
lines changed

docs/internal/field-test-hara-hachi-bu.md

Lines changed: 35 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -17,13 +17,13 @@ This is the consolidated findings registry for the project's calibration baselin
1717

1818
## Current Baseline
1919

20-
Latest ego-lint results (2026-03-02):
20+
Latest ego-lint results (2026-03-03, post F-027 dedup fix):
2121

2222
| Status | Count |
2323
|--------|-------|
24-
| PASS | 206 |
24+
| PASS | 205 |
2525
| FAIL | 0 |
26-
| WARN | 7 |
26+
| WARN | 8 |
2727
| SKIP | 23 |
2828
| Exit | 0 |
2929

@@ -40,6 +40,7 @@ Latest ego-lint results (2026-03-02):
4040
| R-PREFS-04c | GTK layout widget advisory — correct (ListBox, SpinButton in prefs) |
4141
| R-SLOP-40 | Promise wrapper advisory — correct (D-Bus proxy constructor) |
4242
| R-QUAL-33 | Gio._promisify() module-scope advisory — correct (standard GJS pattern) |
43+
| R-VER48-04b | vertical property deprecated advisory — correct (deduplicated to 1 per file) |
4344
| metadata/shell-version-current | GNOME 49 not in shell-version — intentional (untested) |
4445
| quality/private-api | Private API with inline justification — disclosed in metadata |
4546

@@ -139,6 +140,13 @@ The A1-A7 checklist was entirely manual. Some items (accessible-role usage, acce
139140
| F-019 | Reviewer notes template | Fixed (2026-03-01) | 2026-02-28 |
140141
| F-020 | Readiness report format | Fixed (2026-03-01) | 2026-02-28 |
141142
| F-021 | AI slop overlap | Fixed (2026-03-01) | 2026-02-28 |
143+
| F-022 | Phase 3 rebuilds resource graph manually | Fixed (2026-03-03) | 2026-03-03 |
144+
| F-023 | Phase 5a re-checks automated AI patterns | Fixed (2026-03-03) | 2026-03-03 |
145+
| F-024 | check-disclosures.py missing 4 capabilities | Fixed (2026-03-03) | 2026-03-03 |
146+
| F-025 | Resource graph markdown table output | Fixed (2026-03-03) | 2026-03-03 |
147+
| F-026 | Parallel protocol prevents ego-lint reuse | Fixed (2026-03-03) | 2026-03-03 |
148+
| F-027 | R-VER48-04b deduplicate regression | Fixed (2026-03-03) | 2026-03-03 |
149+
| F-028 | Baseline suppression for known warnings | Deferred | 2026-03-03 |
142150

143151
**F-016: Parallelization strategy missing from ego-submit**
144152
ego-submit describes sequential phases, but they're largely independent. A 3-agent parallel approach (lifecycle+signals, security+quality, package+metadata) cut wall-clock time from ~10 to ~4 minutes. **Fix**: Added "Parallel Execution Protocol" section to `skills/ego-submit/SKILL.md` with agent roles, no-early-stopping rule, and deduplication strategy.
@@ -158,6 +166,27 @@ ego-submit says to produce a "readiness report" but doesn't define format. **Fix
158166
**F-021: AI slop overlap between ego-lint and ego-review**
159167
`check-quality.py` covers some AI patterns, but the 46-item `ai-slop-checklist.md` doesn't indicate which items are automated. Reviewer agents re-check automated items. **Fix**: Added `**Automated:**` field to each checklist item with Yes/No/Partial and the ego-lint check name.
160168

169+
**F-022: Phase 3 rebuilds resource graph manually**
170+
ego-review Phase 3 instructs agents to grep for signals, timeouts, file monitors, and D-Bus proxies — duplicating what `build-resource-graph.py` already computes in Phase 2. Agent 2 spent ~200s of 381s on this. **Fix**: Rewrite Phase 3 to verify the graph output rather than rebuild it. See [pipeline-review-2026-03-03.md](pipeline-review-2026-03-03.md).
171+
172+
**F-023: Phase 5a re-checks automated AI patterns**
173+
Despite F-021 adding automation mapping to the AI slop checklist, Phase 5a instructions don't tell agents to skip automated items. Agent 3 searched for all 46 items. **Fix**: Update Phase 5a instructions to use ego-lint results for automated items.
174+
175+
**F-024: check-disclosures.py missing 4 capabilities**
176+
F-012 added clipboard+network disclosure checking, but the 6-capability disclosure matrix (also pkexec, subprocess, private API, file I/O) is still partly manual. **Fix**: Extend check-disclosures.py to cover all 6. **Status**: Already fixed — check-disclosures.py covers all 6 capabilities (clipboard, network, pkexec, private-api, file-io, subprocess) at lines 35-99. The pipeline review observation was based on stale data from the F-012 description.
177+
178+
**F-025: Resource graph markdown table output**
179+
Readiness report requires a per-resource tracking table. Agents manually format this from graph JSON. **Fix**: Add `--format=table` flag to output markdown directly.
180+
181+
**F-026: Parallel protocol prevents ego-lint reuse**
182+
3-agent parallel protocol runs ego-lint concurrently with review agents, so Agents 2-3 can't use ego-lint results. **Fix**: Two-phase approach: run ego-lint first (~30s), then fan out 2 review agents with ego-lint output as context.
183+
184+
**F-027: R-VER48-04b deduplicate regression**
185+
Rule fired twice for quickSettingsPanel.js (lines 100 and 909). Previous run showed single WARN coincidentally. **Fix**: Not a regression — rule never had `deduplicate: true`. Added the field to collapse per-file hits into a single advisory WARN.
186+
187+
**F-028: Baseline suppression for known warnings**
188+
No mechanism to mark warnings as acknowledged. Every run produces the same 7-9 known WARNs, drowning new findings. **Deferred**: Existing inline `// ego-lint-ignore` suppression is sufficient. The known WARNs are correct warnings that flag real patterns reviewers will notice — suppressing them could mask regressions. Baseline feature would require changes to ego-lint.sh's output pipeline, JSON file management, and new CLI flags (medium effort, P2).
189+
161190
### EGO Reviewer Feedback
162191

163192
*No entries yet. This section will be populated when hara-hachi-bu completes EGO review.*
@@ -188,6 +217,8 @@ ego-submit says to produce a "readiness report" but doesn't define format. **Fix
188217
| 2026-02-28 | 193 | 0 | 5 | 17 | 1 | F-006 through F-015 fixed; 4 new Tier 2 scripts |
189218
| 2026-03-01 | 201 | 0 | 8 | 23 || 6 new pattern rules, +3 WARNs (R-SLOP-40, R-QUAL-33, R-PREFS-04c), +6 SKIP (VER49/50) |
190219
| 2026-03-02 | 206 | 0 | 7 | 23 | 1 | Full ego-submit pipeline (3-agent parallel). ESLint WARN resolved. +5 PASS. ego-submit: READY TO SUBMIT |
220+
| 2026-03-03 | 205 | 0 | 9 | 23 || Full ego-submit (3-agent parallel, fresh session). +2 WARNs (R-VER48-04b dedup regression). Pipeline efficiency review → F-022 through F-028 |
221+
| 2026-03-03 (post-fix) | 205 | 0 | 8 | 23 || F-022/F-023/F-025/F-026/F-027 fixed, F-024 closed, F-028 deferred. R-VER48-04b deduplicated (9→8 WARNs). 2-phase parallel protocol |
191222

192223
---
193224

@@ -205,4 +236,5 @@ ego-submit says to produce a "readiness report" but doesn't define format. **Fix
205236
- [pipeline-improvements-2026-02-28.md](pipeline-improvements-2026-02-28.md) — Detailed analysis and code snippets for F-006 through F-011
206237
- [review-feedback-2026-02-28.md](review-feedback-2026-02-28.md) — Full pipeline improvement proposals (F-012 through F-021)
207238
- [field-test-clipboard-indicator.md](field-test-clipboard-indicator.md) — One-shot field test (different format)
239+
- [pipeline-review-2026-03-03.md](pipeline-review-2026-03-03.md) — Pipeline efficiency review: agent redundancy, parallel protocol redesign (F-022 through F-028)
208240
- [Gap analysis](../research/gap-analysis.md) — "Known False Positives and Noise Reduction" section

rules/patterns.yaml

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -947,6 +947,7 @@
947947
fix: "Replace {vertical: true} with {orientation: Clutter.Orientation.VERTICAL}"
948948
min-version: 48
949949
fix-min-version: 47
950+
deduplicate: true
950951

951952
- id: R-VER48-05
952953
pattern: "\\bShell\\.SnippetHook\\b"

skills/ego-lint/scripts/build-resource-graph.py

Lines changed: 60 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -672,12 +672,67 @@ def format_summary(graph):
672672
return '\n'.join(lines)
673673

674674

675+
def format_table(graph):
676+
"""Format the resource graph as a markdown table."""
677+
files = graph['files']
678+
orphan_set = set()
679+
for orphan in graph['orphans']:
680+
orphan_set.add((orphan['file'], orphan['line'], orphan['type']))
681+
682+
# Build a mapping from (file, stored_as) -> destroy info
683+
destroy_map = {} # (file, ref) -> (line, pattern)
684+
for rel, file_data in files.items():
685+
for d in file_data.get('destroys', []):
686+
ref = d.get('ref')
687+
if ref:
688+
destroy_map.setdefault((rel, ref), []).append(d)
689+
690+
# Build ownership lookup: child_file -> parent_file
691+
parent_of = {}
692+
for rel, refs in graph.get('ownership', {}).items():
693+
for ref_info in refs.values():
694+
child = ref_info.get('source_file')
695+
if child:
696+
parent_of[child] = rel
697+
698+
lines = []
699+
lines.append('| Type | Name | File:Line (create) | File:Line (destroy) | Owner | Status |')
700+
lines.append('|------|------|--------------------|---------------------|-------|--------|')
701+
702+
for rel in sorted(files.keys()):
703+
file_data = files[rel]
704+
owner = parent_of.get(rel, '(root)')
705+
for c in file_data.get('creates', []):
706+
stored = c.get('stored_as', '(anonymous)')
707+
create_loc = f'{rel}:{c["line"]}'
708+
rtype = c['type']
709+
710+
# Find matching destroy
711+
destroy_loc = '—'
712+
if stored and stored != '(anonymous)':
713+
for d in file_data.get('destroys', []):
714+
d_ref = d.get('ref')
715+
if d_ref == stored or (stored in d.get('pattern', '')):
716+
destroy_loc = f'{rel}:{d["line"]}'
717+
break
718+
719+
is_orphan = (rel, c['line'], rtype) in orphan_set
720+
status = 'ORPHAN' if is_orphan else 'OK'
721+
722+
lines.append(
723+
f'| {rtype} | {stored} | {create_loc} | {destroy_loc} | {owner} | {status} |'
724+
)
725+
726+
return '\n'.join(lines)
727+
728+
675729
def main():
676730
summary_mode = '--summary' in sys.argv
677-
args = [a for a in sys.argv[1:] if a != '--summary']
731+
table_mode = '--format=table' in sys.argv
732+
args = [a for a in sys.argv[1:] if a not in ('--summary', '--format=table')]
678733

679734
if not args:
680-
print("Usage: build-resource-graph.py [--summary] EXTENSION_DIR",
735+
print("Usage: build-resource-graph.py [--summary] [--format=table] EXTENSION_DIR",
681736
file=sys.stderr)
682737
sys.exit(1)
683738

@@ -688,7 +743,9 @@ def main():
688743

689744
graph = build_resource_graph(ext_dir)
690745

691-
if summary_mode:
746+
if table_mode:
747+
print(format_table(graph))
748+
elif summary_mode:
692749
print(format_summary(graph))
693750
else:
694751
print(json.dumps(graph, indent=2))

skills/ego-review/SKILL.md

Lines changed: 18 additions & 24 deletions
Original file line numberDiff line numberDiff line change
@@ -72,10 +72,8 @@ Using [lifecycle-checklist.md](references/lifecycle-checklist.md):
7272
- Verify parent calls child's `destroy()` in its own `disable()`/`destroy()`
7373
- Verify destroy order is reverse of creation
7474
- Verify child's `destroy()` cleans up all its own resources
75-
7. **Build the resource tracking table** from graph data:
76-
77-
| Resource | File:Line (create) | File:Line (destroy) | Owner | Status |
78-
|----------|-------------------|--------------------|---------|----|
75+
7. **Build the resource tracking table**: run `build-resource-graph.py --format=table`
76+
to generate the markdown table for the report (or build manually from JSON if needed)
7977

8078
8. **If the graph reports 0 orphans and complete ownership chains**: abbreviate
8179
this phase — focus on async guards and cleanup ordering below
@@ -87,25 +85,16 @@ Using [lifecycle-checklist.md](references/lifecycle-checklist.md):
8785

8886
### Phase 3: Signal & Resource Audit
8987

90-
1. Grep for `connect(` / `connectObject(` — list all signal connections
91-
2. Grep for `timeout_add` / `timeout_add_seconds` / `idle_add` — list all timer sources
92-
3. Grep for `FileMonitor` / `monitor_file` — list all file monitors
93-
4. Grep for D-Bus proxy creation (`Gio.DBusProxy`, `new_for_bus`)
94-
5. Cross-reference: every creation must have a corresponding cleanup in destroy/disable
95-
96-
**D-Bus proxy lifecycle:**
97-
- Disconnect all signal connections from the proxy
98-
- Null the proxy reference
99-
- Verify error handling for when the D-Bus service is unavailable
100-
101-
**File monitor lifecycle:**
102-
- `monitor.cancel()` first
103-
- Then disconnect any signal handlers
104-
- Then null the reference
105-
106-
**GSettings connections:**
107-
- Verify `disconnectObject(this)` or manual `disconnect(id)` for all settings connections
108-
- Check that settings reference is nulled after disconnect
88+
1. **Review the resource graph** from Phase 2 — if 0 orphans and complete
89+
ownership chains, abbreviate this phase to spot-checks only
90+
2. **Spot-check**: pick 2-3 resource entries from the graph and verify by
91+
reading the cited file:line that create/destroy are correctly paired
92+
3. **Check for resource types the graph may miss**:
93+
- GSettings connections (`.connect('changed::...')` vs `.disconnectObject()`)
94+
- Custom cleanup methods (`_cleanup()`, `_teardown()`, `_clear()`)
95+
- Login manager / D-Bus signal connections via `connectSignal()` (not `connectObject()`)
96+
4. **Only do a full manual grep** if the graph reports orphans or incomplete
97+
ownership
10998

11099
### Phase 4: Security Review
111100

@@ -159,7 +148,12 @@ Using [code-quality-checklist.md](references/code-quality-checklist.md):
159148

160149
Using [ai-slop-checklist.md](references/ai-slop-checklist.md) (46-item checklist):
161150

162-
1. For each checklist item, search the extension source for the described pattern
151+
1. For each checklist item:
152+
- If marked **Automated: Yes** → use ego-lint's result (from Phase 0)
153+
instead of re-searching. Only verify if ego-lint reported a finding
154+
for that check.
155+
- If marked **Automated: No** or **Automated: Partial** → search the
156+
extension source for the described pattern.
163157
2. Record whether it triggers, with file:line references
164158
3. Note whether the pattern is justified by context (check "NOT a signal" exceptions)
165159
4. Count JS files to determine threshold tier:

skills/ego-submit/SKILL.md

Lines changed: 22 additions & 22 deletions
Original file line numberDiff line numberDiff line change
@@ -16,39 +16,39 @@ combining automated checks with manual review in a structured pipeline.
1616

1717
## Parallelization Strategy
1818

19-
For extensions with 10+ JS files, run phases in parallel using 3 agents:
19+
For extensions with 10+ JS files, run in two phases:
2020

21-
- **Agent 1:** ego-lint + package validation (Phase 1, Phase 3)
22-
- **Agent 2:** ego-review lifecycle + signal + security (ego-review Phases 2-4)
23-
- **Agent 3:** ego-review quality + AI patterns + metadata (ego-review Phases 5-5a, Phase 4)
21+
**Phase A — Automated baseline (sequential, ~30s):**
22+
Run ego-lint → capture full output (PASS/FAIL/WARN/SKIP counts, metrics,
23+
resource graph summary).
2424

25-
When running in parallel, Agents 2-3 skip Phase 0 (ego-lint baseline) since
26-
Agent 1 handles it. The deduplication happens at report compilation time, not
27-
during individual agent work.
25+
**Phase B — Manual review (parallel, ~3-4 min):**
26+
- **Agent 1:** ego-review Phases 1-4 (discovery, licensing, lifecycle, signals,
27+
security, accessibility). Receives ego-lint output as context.
28+
- **Agent 2:** ego-review Phases 5-5a (quality, AI patterns) + Phase 4 metadata
29+
+ disclosure matrix + package validation + readiness report draft.
30+
Receives ego-lint output as context.
2831

29-
This reduces wall-clock time from ~10 minutes (sequential) to ~4 minutes. For
30-
smaller extensions (<10 JS files), sequential execution is fine.
32+
Both agents use ego-lint results to skip already-covered checks (Phases 3, 5a).
33+
For smaller extensions (<10 JS files), sequential execution is fine.
3134

3235
### Parallel Execution Protocol
3336

34-
When running in parallel mode:
37+
- **Phase A**: Run ego-lint + capture output. This is fast (~30s) and provides
38+
the baseline for both review agents.
39+
- **Phase B**: Launch both agents with ego-lint output in their context.
40+
Each agent skips checks already covered by ego-lint and focuses on semantic,
41+
cross-file, and design-level issues.
3542

36-
- **Agent 1**: ego-lint + Phase 3 package validation. Reports lint summary
37-
(PASS/FAIL/WARN/SKIP counts) and package status.
38-
- **Agent 2**: ego-review Phases 2-4 (lifecycle, signals, security). Skips
39-
Phase 0 since Agent 1 handles the ego-lint baseline.
40-
- **Agent 3**: ego-review Phases 5-5a (quality, AI patterns) + Phase 4 metadata
41-
+ disclosure matrix + reviewer notes draft.
42-
43-
**No early stopping**: All agents complete regardless of findings. If Agent 1
43+
**No early stopping**: Both agents complete regardless of findings. If ego-lint
4444
reports FAILs, the final readiness report verdict is NEEDS FIXES with FAIL
4545
items listed first as blocking action items.
4646

47-
**Deduplication**: The orchestrator merges results from all agents. When ego-lint
48-
and ego-review flag the same issue (e.g., both catch a missing signal
49-
disconnect), prefer ego-lint's categorization (it has the rule ID).
47+
**Deduplication**: Inherent — both agents see ego-lint results and skip covered
48+
items. The orchestrator merges remaining findings. When ego-lint and ego-review
49+
flag the same issue, prefer ego-lint's categorization (it has the rule ID).
5050

51-
**STOP condition**: Applied by the orchestrator after all agents complete, not
51+
**STOP condition**: Applied by the orchestrator after both agents complete, not
5252
by individual agents during their work.
5353

5454
## Pipeline Phases

0 commit comments

Comments
 (0)