fix: resolve pipeline findings F-022 through F-028

ZviBaratz · claude · ZviBaratz · commit 2b79fce23c52 · 2026-03-03T12:31:25.000+02:00
- F-022: rewrite ego-review Phase 3 to verify resource graph instead of
  rebuilding it manually
- F-023: add skip logic to Phase 5a for automated AI checklist items
- F-024: close as already fixed (check-disclosures.py has all 6 capabilities)
- F-025: add --format=table to build-resource-graph.py for markdown output
- F-026: redesign ego-submit from 3-agent concurrent to 2-phase sequential+parallel
- F-027: add deduplicate:true to R-VER48-04b pattern rule
- F-028: defer baseline suppression (inline ignore sufficient)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
diff --git a/docs/internal/field-test-hara-hachi-bu.md b/docs/internal/field-test-hara-hachi-bu.md
@@ -17,13 +17,13 @@ This is the consolidated findings registry for the project's calibration baselin
 
 ## Current Baseline
 
-Latest ego-lint results (2026-03-02):
+Latest ego-lint results (2026-03-03, post F-027 dedup fix):
 
 | Status | Count |
 |--------|-------|
-| PASS | 206 |
+| PASS | 205 |
 | FAIL | 0 |
-| WARN | 7 |
+| WARN | 8 |
 | SKIP | 23 |
 | Exit | 0 |
 
@@ -40,6 +40,7 @@ Latest ego-lint results (2026-03-02):
 | R-PREFS-04c | GTK layout widget advisory — correct (ListBox, SpinButton in prefs) |
 | R-SLOP-40 | Promise wrapper advisory — correct (D-Bus proxy constructor) |
 | R-QUAL-33 | Gio._promisify() module-scope advisory — correct (standard GJS pattern) |
+| R-VER48-04b | vertical property deprecated advisory — correct (deduplicated to 1 per file) |
 | metadata/shell-version-current | GNOME 49 not in shell-version — intentional (untested) |
 | quality/private-api | Private API with inline justification — disclosed in metadata |
 
@@ -139,6 +140,13 @@ The A1-A7 checklist was entirely manual. Some items (accessible-role usage, acce
 | F-019 | Reviewer notes template | Fixed (2026-03-01) | 2026-02-28 |
 | F-020 | Readiness report format | Fixed (2026-03-01) | 2026-02-28 |
 | F-021 | AI slop overlap | Fixed (2026-03-01) | 2026-02-28 |
+| F-022 | Phase 3 rebuilds resource graph manually | Fixed (2026-03-03) | 2026-03-03 |
+| F-023 | Phase 5a re-checks automated AI patterns | Fixed (2026-03-03) | 2026-03-03 |
+| F-024 | check-disclosures.py missing 4 capabilities | Fixed (2026-03-03) | 2026-03-03 |
+| F-025 | Resource graph markdown table output | Fixed (2026-03-03) | 2026-03-03 |
+| F-026 | Parallel protocol prevents ego-lint reuse | Fixed (2026-03-03) | 2026-03-03 |
+| F-027 | R-VER48-04b deduplicate regression | Fixed (2026-03-03) | 2026-03-03 |
+| F-028 | Baseline suppression for known warnings | Deferred | 2026-03-03 |
 
 **F-016: Parallelization strategy missing from ego-submit**
 ego-submit describes sequential phases, but they're largely independent. A 3-agent parallel approach (lifecycle+signals, security+quality, package+metadata) cut wall-clock time from ~10 to ~4 minutes. **Fix**: Added "Parallel Execution Protocol" section to `skills/ego-submit/SKILL.md` with agent roles, no-early-stopping rule, and deduplication strategy.
@@ -158,6 +166,27 @@ ego-submit says to produce a "readiness report" but doesn't define format. **Fix
 **F-021: AI slop overlap between ego-lint and ego-review**
 `check-quality.py` covers some AI patterns, but the 46-item `ai-slop-checklist.md` doesn't indicate which items are automated. Reviewer agents re-check automated items. **Fix**: Added `**Automated:**` field to each checklist item with Yes/No/Partial and the ego-lint check name.
 
+**F-022: Phase 3 rebuilds resource graph manually**
+ego-review Phase 3 instructs agents to grep for signals, timeouts, file monitors, and D-Bus proxies — duplicating what `build-resource-graph.py` already computes in Phase 2. Agent 2 spent ~200s of 381s on this. **Fix**: Rewrite Phase 3 to verify the graph output rather than rebuild it. See [pipeline-review-2026-03-03.md](pipeline-review-2026-03-03.md).
+
+**F-023: Phase 5a re-checks automated AI patterns**
+Despite F-021 adding automation mapping to the AI slop checklist, Phase 5a instructions don't tell agents to skip automated items. Agent 3 searched for all 46 items. **Fix**: Update Phase 5a instructions to use ego-lint results for automated items.
+
+**F-024: check-disclosures.py missing 4 capabilities**
+F-012 added clipboard+network disclosure checking, but the 6-capability disclosure matrix (also pkexec, subprocess, private API, file I/O) is still partly manual. **Fix**: Extend check-disclosures.py to cover all 6. **Status**: Already fixed — check-disclosures.py covers all 6 capabilities (clipboard, network, pkexec, private-api, file-io, subprocess) at lines 35-99. The pipeline review observation was based on stale data from the F-012 description.
+
+**F-025: Resource graph markdown table output**
+Readiness report requires a per-resource tracking table. Agents manually format this from graph JSON. **Fix**: Add `--format=table` flag to output markdown directly.
+
+**F-026: Parallel protocol prevents ego-lint reuse**
+3-agent parallel protocol runs ego-lint concurrently with review agents, so Agents 2-3 can't use ego-lint results. **Fix**: Two-phase approach: run ego-lint first (~30s), then fan out 2 review agents with ego-lint output as context.
+
+**F-027: R-VER48-04b deduplicate regression**
+Rule fired twice for quickSettingsPanel.js (lines 100 and 909). Previous run showed single WARN coincidentally. **Fix**: Not a regression — rule never had `deduplicate: true`. Added the field to collapse per-file hits into a single advisory WARN.
+
+**F-028: Baseline suppression for known warnings**
+No mechanism to mark warnings as acknowledged. Every run produces the same 7-9 known WARNs, drowning new findings. **Deferred**: Existing inline `// ego-lint-ignore` suppression is sufficient. The known WARNs are correct warnings that flag real patterns reviewers will notice — suppressing them could mask regressions. Baseline feature would require changes to ego-lint.sh's output pipeline, JSON file management, and new CLI flags (medium effort, P2).
+
 ### EGO Reviewer Feedback
 
 *No entries yet. This section will be populated when hara-hachi-bu completes EGO review.*
@@ -188,6 +217,8 @@ ego-submit says to produce a "readiness report" but doesn't define format. **Fix
 | 2026-02-28 | 193 | 0 | 5 | 17 | 1 | F-006 through F-015 fixed; 4 new Tier 2 scripts |
 | 2026-03-01 | 201 | 0 | 8 | 23 | — | 6 new pattern rules, +3 WARNs (R-SLOP-40, R-QUAL-33, R-PREFS-04c), +6 SKIP (VER49/50) |
 | 2026-03-02 | 206 | 0 | 7 | 23 | 1 | Full ego-submit pipeline (3-agent parallel). ESLint WARN resolved. +5 PASS. ego-submit: READY TO SUBMIT |
+| 2026-03-03 | 205 | 0 | 9 | 23 | — | Full ego-submit (3-agent parallel, fresh session). +2 WARNs (R-VER48-04b dedup regression). Pipeline efficiency review → F-022 through F-028 |
+| 2026-03-03 (post-fix) | 205 | 0 | 8 | 23 | — | F-022/F-023/F-025/F-026/F-027 fixed, F-024 closed, F-028 deferred. R-VER48-04b deduplicated (9→8 WARNs). 2-phase parallel protocol |
 
 ---
 
@@ -205,4 +236,5 @@ ego-submit says to produce a "readiness report" but doesn't define format. **Fix
 - [pipeline-improvements-2026-02-28.md](pipeline-improvements-2026-02-28.md) — Detailed analysis and code snippets for F-006 through F-011
 - [review-feedback-2026-02-28.md](review-feedback-2026-02-28.md) — Full pipeline improvement proposals (F-012 through F-021)
 - [field-test-clipboard-indicator.md](field-test-clipboard-indicator.md) — One-shot field test (different format)
+- [pipeline-review-2026-03-03.md](pipeline-review-2026-03-03.md) — Pipeline efficiency review: agent redundancy, parallel protocol redesign (F-022 through F-028)
 - [Gap analysis](../research/gap-analysis.md) — "Known False Positives and Noise Reduction" section
diff --git a/rules/patterns.yaml b/rules/patterns.yaml
@@ -947,6 +947,7 @@
   fix: "Replace {vertical: true} with {orientation: Clutter.Orientation.VERTICAL}"
   min-version: 48
   fix-min-version: 47
+  deduplicate: true
 
 - id: R-VER48-05
   pattern: "\\bShell\\.SnippetHook\\b"
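
Note on the `deduplicate: true` change above: F-027 describes the field as collapsing per-file hits into a single advisory WARN. The snippet below is only a minimal sketch of that behavior, not ego-lint's actual implementation. The function name `collapse_hits` and the `(rule_id, file, line)` tuple shape are assumptions made for illustration; the two sample lines (100 and 909) come from the F-027 description.

```python
def collapse_hits(hits, dedup_rules):
    """Keep the first hit per (rule, file) for rules in dedup_rules.

    Hypothetical helper: hits are (rule_id, file, line) tuples.
    Rules not listed in dedup_rules keep every hit.
    """
    seen = set()
    kept = []
    for rule_id, path, line in hits:
        if rule_id in dedup_rules:
            key = (rule_id, path)
            if key in seen:
                continue  # later hit for the same rule and file: drop it
            seen.add(key)
        kept.append((rule_id, path, line))
    return kept


hits = [
    ("R-VER48-04b", "quickSettingsPanel.js", 100),
    ("R-VER48-04b", "quickSettingsPanel.js", 909),
    ("R-SLOP-40", "extension.js", 12),  # made-up location for a non-deduplicated rule
]
collapsed = collapse_hits(hits, {"R-VER48-04b"})
print(collapsed)
```

With this shape, the two R-VER48-04b hits in one file reduce to one entry, which matches the 9 to 8 WARN change recorded in the baseline table.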
diff --git a/skills/ego-lint/scripts/build-resource-graph.py b/skills/ego-lint/scripts/build-resource-graph.py
@@ -672,12 +672,67 @@ def format_summary(graph):
     return '\n'.join(lines)
 
 
+def format_table(graph):
+    """Format the resource graph as a markdown table."""
+    files = graph['files']
+    orphan_set = set()
+    for orphan in graph['orphans']:
+        orphan_set.add((orphan['file'], orphan['line'], orphan['type']))
+
+    # Build a mapping from (file, ref) -> list of destroy entries
+    destroy_map = {}  # (file, ref) -> [destroy dict, ...]
+    for rel, file_data in files.items():
+        for d in file_data.get('destroys', []):
+            ref = d.get('ref')
+            if ref:
+                destroy_map.setdefault((rel, ref), []).append(d)
+
+    # Build ownership lookup: child_file -> parent_file
+    parent_of = {}
+    for rel, refs in graph.get('ownership', {}).items():
+        for ref_info in refs.values():
+            child = ref_info.get('source_file')
+            if child:
+                parent_of[child] = rel
+
+    lines = []
+    lines.append('| Type | Name | File:Line (create) | File:Line (destroy) | Owner | Status |')
+    lines.append('|------|------|--------------------|---------------------|-------|--------|')
+
+    for rel in sorted(files.keys()):
+        file_data = files[rel]
+        owner = parent_of.get(rel, '(root)')
+        for c in file_data.get('creates', []):
+            stored = c.get('stored_as') or '(anonymous)'  # also covers an explicit None
+            create_loc = f'{rel}:{c["line"]}'
+            rtype = c['type']
+
+            # Find matching destroy
+            destroy_loc = '—'
+            if stored and stored != '(anonymous)':
+                for d in file_data.get('destroys', []):
+                    d_ref = d.get('ref')
+                    if d_ref == stored or (stored in d.get('pattern', '')):
+                        destroy_loc = f'{rel}:{d["line"]}'
+                        break
+
+            is_orphan = (rel, c['line'], rtype) in orphan_set
+            status = 'ORPHAN' if is_orphan else 'OK'
+
+            lines.append(
+                f'| {rtype} | {stored} | {create_loc} | {destroy_loc} | {owner} | {status} |'
+            )
+
+    return '\n'.join(lines)
+
+
 def main():
     summary_mode = '--summary' in sys.argv
-    args = [a for a in sys.argv[1:] if a != '--summary']
+    table_mode = '--format=table' in sys.argv
+    args = [a for a in sys.argv[1:] if a not in ('--summary', '--format=table')]
 
     if not args:
-        print("Usage: build-resource-graph.py [--summary] EXTENSION_DIR",
+        print("Usage: build-resource-graph.py [--summary] [--format=table] EXTENSION_DIR",
               file=sys.stderr)
         sys.exit(1)
 
@@ -688,7 +743,9 @@ def main():
 
     graph = build_resource_graph(ext_dir)
 
-    if summary_mode:
+    if table_mode:
+        print(format_table(graph))
+    elif summary_mode:
         print(format_summary(graph))
     else:
         print(json.dumps(graph, indent=2))
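
Note on `format_table` above: it reads `files`, `orphans`, and `ownership` from the graph, and each file entry carries `creates` and `destroys` lists. The sketch below builds a tiny graph with the shape those lookups imply and resolves a create to its destroy through a `(file, ref)` index of the kind `destroy_map` holds. The sample data and the helper names `build_destroy_index` and `find_destroy` are hypothetical, and the real graph may carry more fields than shown here.

```python
def build_destroy_index(files):
    """Index destroy entries by (file, ref), as destroy_map does in the diff."""
    index = {}
    for rel, file_data in files.items():
        for d in file_data.get('destroys', []):
            ref = d.get('ref')
            if ref:
                index.setdefault((rel, ref), []).append(d)
    return index


def find_destroy(index, rel, stored):
    """Return 'file:line' of the first matching destroy, or None if unmatched."""
    matches = index.get((rel, stored), [])
    return f'{rel}:{matches[0]["line"]}' if matches else None


# Hypothetical sample graph: one paired signal and one unpaired timeout.
graph = {
    'files': {
        'extension.js': {
            'creates': [
                {'type': 'signal', 'line': 10, 'stored_as': '_settingsId'},
                {'type': 'timeout', 'line': 20, 'stored_as': '_timerId'},
            ],
            'destroys': [
                {'ref': '_settingsId', 'line': 42, 'pattern': 'disconnect'},
            ],
        },
    },
    'orphans': [{'file': 'extension.js', 'line': 20, 'type': 'timeout'}],
    'ownership': {},
}

index = build_destroy_index(graph['files'])
paired = find_destroy(index, 'extension.js', '_settingsId')
unpaired = find_destroy(index, 'extension.js', '_timerId')
print(paired, unpaired)
```

An exact-key lookup like this does not reproduce the substring fallback on `pattern` that `format_table` also applies, so it would report fewer matches on graphs where `ref` is missing.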
diff --git a/skills/ego-review/SKILL.md b/skills/ego-review/SKILL.md
@@ -72,10 +72,8 @@ Using [lifecycle-checklist.md](references/lifecycle-checklist.md):
    - Verify parent calls child's `destroy()` in its own `disable()`/`destroy()`
    - Verify destroy order is reverse of creation
    - Verify child's `destroy()` cleans up all its own resources
-7. **Build the resource tracking table** from graph data:
-
-   | Resource | File:Line (create) | File:Line (destroy) | Owner | Status |
-   |----------|-------------------|--------------------|---------|----|
+7. **Build the resource tracking table**: run `build-resource-graph.py --format=table`
+   to generate the markdown table for the report (or build manually from JSON if needed)
 
 8. **If the graph reports 0 orphans and complete ownership chains**: abbreviate
    this phase — focus on async guards and cleanup ordering below
@@ -87,25 +85,16 @@ Using [lifecycle-checklist.md](references/lifecycle-checklist.md):
 
 ### Phase 3: Signal & Resource Audit
 
-1. Grep for `connect(` / `connectObject(` — list all signal connections
-2. Grep for `timeout_add` / `timeout_add_seconds` / `idle_add` — list all timer sources
-3. Grep for `FileMonitor` / `monitor_file` — list all file monitors
-4. Grep for D-Bus proxy creation (`Gio.DBusProxy`, `new_for_bus`)
-5. Cross-reference: every creation must have a corresponding cleanup in destroy/disable
-
-**D-Bus proxy lifecycle:**
-- Disconnect all signal connections from the proxy
-- Null the proxy reference
-- Verify error handling for when the D-Bus service is unavailable
-
-**File monitor lifecycle:**
-- `monitor.cancel()` first
-- Then disconnect any signal handlers
-- Then null the reference
-
-**GSettings connections:**
-- Verify `disconnectObject(this)` or manual `disconnect(id)` for all settings connections
-- Check that settings reference is nulled after disconnect
+1. **Review the resource graph** from Phase 2 — if 0 orphans and complete
+   ownership chains, abbreviate this phase to spot-checks only
+2. **Spot-check**: pick 2-3 resource entries from the graph and verify by
+   reading the cited file:line that create/destroy are correctly paired
+3. **Check for resource types the graph may miss**:
+   - GSettings connections (`.connect('changed::...')` vs `.disconnectObject()`)
+   - Custom cleanup methods (`_cleanup()`, `_teardown()`, `_clear()`)
+   - Login manager / D-Bus signal connections via `connectSignal()` (not `connectObject()`)
+4. **Only do a full manual grep** if the graph reports orphans or incomplete
+   ownership
 
 ### Phase 4: Security Review
 
@@ -159,7 +148,12 @@ Using [code-quality-checklist.md](references/code-quality-checklist.md):
 
 Using [ai-slop-checklist.md](references/ai-slop-checklist.md) (46-item checklist):
 
-1. For each checklist item, search the extension source for the described pattern
+1. For each checklist item:
+   - If marked **Automated: Yes** → use ego-lint's result (from Phase 0)
+     instead of re-searching. Only verify if ego-lint reported a finding
+     for that check.
+   - If marked **Automated: No** or **Automated: Partial** → search the
+     extension source for the described pattern.
 2. Record whether it triggers, with file:line references
 3. Note whether the pattern is justified by context (check "NOT a signal" exceptions)
 4. Count JS files to determine threshold tier:
diff --git a/skills/ego-submit/SKILL.md b/skills/ego-submit/SKILL.md
@@ -16,39 +16,39 @@ combining automated checks with manual review in a structured pipeline.
 
 ## Parallelization Strategy
 
-For extensions with 10+ JS files, run phases in parallel using 3 agents:
+For extensions with 10+ JS files, run in two phases:
 
-- **Agent 1:** ego-lint + package validation (Phase 1, Phase 3)
-- **Agent 2:** ego-review lifecycle + signal + security (ego-review Phases 2-4)
-- **Agent 3:** ego-review quality + AI patterns + metadata (ego-review Phases 5-5a, Phase 4)
+**Phase A — Automated baseline (sequential, ~30s):**
+Run ego-lint → capture full output (PASS/FAIL/WARN/SKIP counts, metrics,
+resource graph summary).
 
-When running in parallel, Agents 2-3 skip Phase 0 (ego-lint baseline) since
-Agent 1 handles it. The deduplication happens at report compilation time, not
-during individual agent work.
+**Phase B — Manual review (parallel, ~3-4 min):**
+- **Agent 1:** ego-review Phases 1-4 (discovery, licensing, lifecycle, signals,
+  security, accessibility). Receives ego-lint output as context.
+- **Agent 2:** ego-review Phases 5-5a (quality, AI patterns) + Phase 4 metadata
+  + disclosure matrix + package validation + readiness report draft.
+  Receives ego-lint output as context.
 
-This reduces wall-clock time from ~10 minutes (sequential) to ~4 minutes. For
-smaller extensions (<10 JS files), sequential execution is fine.
+Both agents use ego-lint results to skip already-covered checks (Phases 3, 5a).
+For smaller extensions (<10 JS files), sequential execution is fine.
 
 ### Parallel Execution Protocol
 
-When running in parallel mode:
+- **Phase A**: Run ego-lint + capture output. This is fast (~30s) and provides
+  the baseline for both review agents.
+- **Phase B**: Launch both agents with ego-lint output in their context.
+  Each agent skips checks already covered by ego-lint and focuses on semantic,
+  cross-file, and design-level issues.
 
-- **Agent 1**: ego-lint + Phase 3 package validation. Reports lint summary
-  (PASS/FAIL/WARN/SKIP counts) and package status.
-- **Agent 2**: ego-review Phases 2-4 (lifecycle, signals, security). Skips
-  Phase 0 since Agent 1 handles the ego-lint baseline.
-- **Agent 3**: ego-review Phases 5-5a (quality, AI patterns) + Phase 4 metadata
-  + disclosure matrix + reviewer notes draft.
-
-**No early stopping**: All agents complete regardless of findings. If Agent 1
+**No early stopping**: Both agents complete regardless of findings. If ego-lint
 reports FAILs, the final readiness report verdict is NEEDS FIXES with FAIL
 items listed first as blocking action items.
 
-**Deduplication**: The orchestrator merges results from all agents. When ego-lint
-and ego-review flag the same issue (e.g., both catch a missing signal
-disconnect), prefer ego-lint's categorization (it has the rule ID).
+**Deduplication**: Inherent — both agents see ego-lint results and skip covered
+items. The orchestrator merges remaining findings. When ego-lint and ego-review
+flag the same issue, prefer ego-lint's categorization (it has the rule ID).
 
-**STOP condition**: Applied by the orchestrator after all agents complete, not
+**STOP condition**: Applied by the orchestrator after both agents complete, not
 by individual agents during their work.
 
 ## Pipeline Phases
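
Note on the two-phase protocol in the ego-submit diff above: Phase A runs ego-lint once, then Phase B fans out two review agents that both receive the lint output. The sketch below shows one way an orchestrator could sequence that, assuming agents can be modeled as plain functions. `run_ego_lint`, `review_agent`, and `run_pipeline` are hypothetical stand-ins, and the lint counts are placeholders copied from the post-fix baseline table.

```python
from concurrent.futures import ThreadPoolExecutor


def run_ego_lint(ext_dir):
    """Phase A stand-in: a real run would invoke ego-lint on ext_dir."""
    return {"PASS": 205, "FAIL": 0, "WARN": 8, "SKIP": 23}


def review_agent(name, phases, lint_output):
    """Phase B stand-in: a real agent would review the extension source."""
    return {"agent": name, "phases": phases,
            "had_lint_context": lint_output is not None, "findings": []}


def run_pipeline(ext_dir):
    # Phase A: sequential, so the lint output exists before any agent starts.
    lint = run_ego_lint(ext_dir)

    # Phase B: both agents run in parallel with the lint output as context.
    agents = [("Agent 1", "1-4"), ("Agent 2", "5-5a")]
    with ThreadPoolExecutor(max_workers=2) as pool:
        futures = [pool.submit(review_agent, n, p, lint) for n, p in agents]
        # No early stopping: wait for every agent before deciding anything.
        reports = [f.result() for f in futures]

    # STOP condition is applied here, by the orchestrator, after both finish.
    needs_fixes = lint["FAIL"] > 0
    return {"lint": lint, "reports": reports, "needs_fixes": needs_fixes}


result = run_pipeline("path/to/extension")
print(result["needs_fixes"], [r["agent"] for r in result["reports"]])
```

Keeping the verdict in `run_pipeline` rather than in `review_agent` mirrors the rule that individual agents never stop early on their own findings.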