Update issue-repro and issue-fix skills for performance investigations by mattleibow · Pull Request #3549 · mono/SkiaSharp

mattleibow · 2026-03-05T15:16:05Z

Update AI skills for performance investigations

Motivated by the #3525 investigation (macOS GL stencil bug, PR #3546) which revealed that both skills had no support for performance bug reproduction or investigation.

Knowledge separation: General → Performance → macOS

Layer	Repro skill	Fix skill
General	—	schema-cheatsheet.md, anti-patterns.md, response-guidelines.md, expanded examples
Performance	Category 10 in bug-categories.md	perf-investigation.md
macOS	platform-macos.md	macos-diagnostics.md

Repro skill changes (3 files)

New platform-macos.md — macOS GUI app reproduction templates (Metal/GL backends, NSApplication gotchas, run binary directly vs open -n, Phase 3C guidance)
Category 10 in bug-categories.md — Performance bugs: native baselines, per-phase Stopwatch instrumentation, multi-complexity testing, VSync control, statistical stability
SKILL.md routing — macOS at priority 3, performance note in Phase 3

Fix skill changes — General parity (4 new + 2 updated files)

New schema-cheatsheet.md — quick-reference for fix JSON fields (triage/repro had this, fix didn't)
New anti-patterns.md — 12 numbered rules including [BUG] SkiaSharp rendering performance is much lower than native Skia C++ #3525 learnings (triage has 9, fix had 2 inline)
New response-guidelines.md — PR description template, issue comment structure, perf results tables
Updated fix-examples.md — added Example 3 (perf fix [BUG] SkiaSharp rendering performance is much lower than native Skia C++ #3525) and Example 4 (needs-info)
Updated SKILL.md — mandatory reads section, expanded Phase 5 (perf + macOS routing), Phase 7 (perf verification), Phase 8 (response guidelines), anti-patterns reference

Fix skill changes — Performance + macOS (2 new files)

New perf-investigation.md — full methodology: phase profiling, isolation experiments (one variable at a time), native baselines, AI model consultation protocol (3 models, 2/3 consensus), debugging tables, statistical requirements
New macos-diagnostics.md — GL stencil trap (the [BUG] SkiaSharp rendering performance is much lower than native Skia C++ #3525 root cause pattern), VSync P/Invoke, NSOpenGLContext setup, Metal vs GL switching, Skia source reading locations

What this enables

Next time someone reports "SkiaSharp is slow" or "macOS rendering has low FPS":

Repro skill will create actual GUI app benchmarks (not console apps), measure per-phase timing, and produce quantitative evidence
Fix skill will follow a systematic profiling → isolation → AI consultation pipeline instead of guessing

…3548) * Make add-comment risk dynamic based on content and confidence Previously, all add-comment actions were hardcoded as 'high' risk regardless of content. This change makes the risk level dynamic: - low: factual content (repro findings, version matrix) with confidence >= 0.85 - medium: workaround/suggestion content with confidence >= 0.85 - high: close/reject content, or confidence < 0.70 Updated in both triage and repro schema cheatsheets and schema definitions. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Make skill action schemas automation-ready with required data fields - Add stateReason, category, milestone fields to both schemas - Remove update-project (not automatable) - Add Action Data Requirements section to repro cheatsheet - Document required fields for every action type in both cheatsheets - Prevents hollow actions (correct type but missing data) Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> --------- Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Add macOS platform support and performance bug handling to issue-repro skill: - New platform-macos.md reference (GUI templates, Metal/GL, macOS gotchas) - New Category 10: Performance in bug-categories.md - Route macOS and perf signals in SKILL.md Bring issue-fix skill to parity with triage/repro: - New schema-cheatsheet.md, anti-patterns.md, response-guidelines.md - New perf-investigation.md (phase profiling, isolation experiments, AI consultation) - New macos-diagnostics.md (GL stencil trap, VSync control, Skia source locations) - Expanded fix-examples.md with perf bug and needs-info examples - Updated SKILL.md: mandatory reads, expanded Phase 5/7/8 Motivated by #3525 investigation that found SKGLView stencil bug. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

- Refactor issue-triage: replace MANDATORY/NEVER/MUST blocks with reasoning-based guidance, consolidate triple pre-flight loading, add error recovery table, rewrite description - Polish issue-fix: soften remaining MUST/CRITICAL spots, rewrite description from keyword list to intent-focused - Remove macos-diagnostics.md reference file, inline key details into SKILL.md and perf-investigation.md Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Genericize examples that referenced specific issue numbers: - issue-fix SKILL.md: generalize #3369 workaround-vs-root-cause example, replace #3369/#3272 in PR template with #NNNN/#MMMM placeholders - issue-fix anti-patterns.md: remove #3525 from VSync rule - issue-triage skia-patterns.md: remove #3386 Issue Lessons table - issue-repro SKILL.md: genericize #3430 simulation example - issue-repro conclusion-guide.md: genericize #3279 breaking-change example Full worked JSON examples in *-examples.md files retained — those use issue numbers as structural anchors for complete schema examples. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Replace macOS-specific §5.2a/§5.2b with a single platform-generic performance debugging section. Key techniques (correct view type, baselines, per-phase profiling, VSync, multi-backend comparison) now apply to any platform. Full methodology still in perf-investigation.md. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

- Remove 'Application activation is critical' — sample app works without it; was a workaround for missing NSApplication.Main() - Remove verbose AppDelegate boilerplate (15 lines) — replaced with brief 'add view to window' snippet - Remove Pitfalls section — GL 'driver quirks' and 'abandoned GL' framing was misleading; GL is deprecated but still works - Remove glGetIntegerv stencil claim — too specific to one issue - Remove 'using SkiaSharp.Views.Mac' from templates (implicit) - Keep: run binary directly (true), dotnet build not run (true), backend templates (useful for rendering bugs), code signing tip - Rename 'Gotchas' to 'Tips', reframe backend templates as optional 195 → 160 lines Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Replace real issue-specific examples (#2997, #3422, #3525, #3540) with fictional but realistic scenarios that teach JSON structure and patterns without being tied to specific past bugs. Changes across all 3 skills: - fix-examples.md: 4 examples (fixed/cannot-fix/perf-fix/needs-info) - Platforms: Windows x64, cross-platform, Android arm64, Linux x64 - Scenarios: text measurement, animated WebP, GL flush, encode crash - triage-examples.md: 3 examples (bug/question/duplicate) - Platforms: iOS, Android - Scenarios: decode crash, font loading, duplicate detection - repro-examples.md: 3 examples (reproduced/cross-platform/confirmed) - Platforms: Windows x64, Linux Docker, Android arm64 - Scenarios: path ops, CJK rendering, touch pressure All 10 JSON examples validated against their respective schemas. Fixed schema violations in fix examples (area/method enum values, pr properties) that existed in originals but were never caught. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

- SKILL.md Rules 7-9: Add explicit 'measure yourself', 'stop and go back to their code', and 'do NOT create simpler test with different mode' - bug-categories.md: Add ⛔ CPU raster prohibition, 🛑 STOP gate after Steps 1-2, strengthen native baseline instructions, require GPU view for GPU perf bugs Eval results: baseline 44% → iter1 69% → iter2 88% overall Perf eval: 14% → 43% → 71% Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

- category-bugs.md: 7 bug types (API, native, rendering, platform, build, memory, parity gap) with Goal sections and standardized headings - category-features.md: Enhancement and documentation categories with confirmed/not-confirmed conclusion guidance - category-performance.md: Performance-specific workflow with mode-matching rule (GPU→GPU, CPU→CPU), timeout/retry guidance, and emphasis on measuring both sides of the comparison yourself Key changes: - Every category now has a Goal section explaining what the minimal repro is - Standardized heading order: Signals → Goal → Strategy → Example → Pitfalls → Only bail - Performance: CPU raster is valid for CPU bugs, invalid for GPU bugs (was previously 'never use CPU raster' which is wrong for CPU perf issues) - Performance: Timeouts are not blockers — retry network ops, wait for builds - SKILL.md classification now routes to the matching category file - Console fallback has GPU performance exception Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

…structions Root cause of standalone GPU repro being skipped: category-performance.md had its own Step 1-6 numbering that competed with SKILL.md Phase 3A/3B/3C. Agent followed SKILL.md's version-based ordering and treated category steps as advisory. Fix: - Renamed steps to Phase 3A/3B/3C/3D matching SKILL.md naming - Phase 3A = measure both sides of reporter's comparison - Phase 3B = test latest version (standard) - Phase 3C = YOUR standalone GPU repro (MANDATORY, with concrete example) - Phase 3D = instrument your repro with per-phase timing - SKILL.md now says perf bugs REPLACE standard Phase 3 flow - Added concrete `dotnet new macos` + SKGLView example for GPU repro - Explicitly references platform-macos.md templates Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Phase 3 was version-centric (3A=reporter version, 3B=latest, 3C=main) which buried the 'create YOUR minimal repro' step. The agent would run reporter's code across versions but never create its own independent reproduction. New Phase 3 (reproduction-centric): - 3A: Run baseline (if reporter provides native C++, working version, etc.) - 3B: Run reporter's broken code (verify their claim) - 3C: Create YOUR minimal repro (strip framework, use platform file) - 3D: Test your repro across versions (reporter's → latest → main) - 3E: Cross-platform verification (conditional) Key insight: baseline isn't perf-specific — any bug can have a baseline (native C++ that renders correctly, older version that works, etc.) And 'create your own repro' is universal — the SVG agent did it naturally but it was never an explicit step. Now it is. category-performance.md slimmed from 152→96 lines: removed duplicate Phase 3 flow, kept only perf-specific additions (instrumentation, VSync, mode-matching). Now says 'follow SKILL.md Phase 3 with these additions' instead of 'replace the standard flow'. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

SKMetalView when the reporter's issue was about SKGLView (OpenGL). OpenGL and Metal are different Skia code paths — switching backends invalidates the comparison. Changes: - Rule 9: 'same rendering mode' → 'same rendering backend (GL→GL, Metal→Metal)' - category-performance.md: explicit backend matching, not just GPU vs CPU - SKILL.md Phase 3C: 'match reporter's backend' not 'SKGLView/SKMetalView' - platform-macos.md: reorder templates with match-reporter warning, GL first Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

1. Created project manually instead of 'dotnet new macos' — missing Info.plist/bundle structure caused blank screen 2. Measured 120fps on empty canvas without validating anything rendered 3. Skipped native C++ baseline (Rule 7 violation, not cleverness) Fixes: - SKILL.md Phase 3C: mandatory rendering validation before measurements - platform-macos.md: 'dotnet new macos' is REQUIRED, not optional - platform-macos.md: Run & Verify now mandates visual validation - category-performance.md: validate rendering before measuring FPS - anti-patterns.md: new #10 — accepting measurements without validation Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

dotnet new macos ensures proper app structure (no hacks to launch). Visual validation is always required regardless of how the project was created — they are independent concerns. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

The agent keeps choosing Metal even when the reporter uses SKGLView (OpenGL). Root cause: the templates were listed by backend name and the agent picks one. Restructured platform-macos.md to: - Lead with a lookup table: reporter uses X → you use X - Templates are reference implementations, not suggestions - Removed 'test both backends' as primary guidance; now it's 'reproduce on reporter's backend, optionally test the other' Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Agents keep using the reporter's claimed '120fps' instead of building and running the native C++ baseline. The existing language ('build and run it yourself') was too soft. Now uses 🛑 stop gate with explicit statement that claimed numbers are a Rule 7 violation. Added concrete anti-pattern example: reporter says 120fps, you skip building and use that number — that's not measuring. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

github-project-automation bot added this to SkiaSharp Backlog Mar 5, 2026

mattleibow force-pushed the dev/skill-perf-investigation branch from da0317b to 4410adb Compare March 5, 2026 19:39

mattleibow and others added 4 commits March 5, 2026 21:44

Skills

6abdc91

mattleibow force-pushed the dev/skill-perf-investigation branch from 4410adb to 1cd1a6c Compare March 5, 2026 19:44

mattleibow and others added 16 commits March 5, 2026 21:52

Rename docker-testing.md to platform-docker.md for consistent naming

85e5682

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Add suggestedReproPlatform to triage bug example

b0db22d

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Testing the new skills now

9eb0556

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Update issue-repro and issue-fix skills for performance investigations#3549

Update issue-repro and issue-fix skills for performance investigations#3549
mattleibow wants to merge 20 commits intomainfrom
dev/skill-perf-investigation

mattleibow commented Mar 5, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

mattleibow commented Mar 5, 2026

Update AI skills for performance investigations

Knowledge separation: General → Performance → macOS

Repro skill changes (3 files)

Fix skill changes — General parity (4 new + 2 updated files)

Fix skill changes — Performance + macOS (2 new files)

What this enables

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant