Skip to content

Add review-skia-update skill and unify persist pattern#3567

Open
mattleibow wants to merge 3 commits intomainfrom
dev/review-skia-update-skill
Open

Add review-skia-update skill and unify persist pattern#3567
mattleibow wants to merge 3 commits intomainfrom
dev/review-skia-update-skill

Conversation

@mattleibow
Copy link
Contributor

@mattleibow mattleibow commented Mar 21, 2026

Summary

Adds the review-skia-update skill for reviewing Skia upstream merge PRs in mono/skia. Also updates the issue pipeline skills (issue-triage, issue-repro, issue-fix) with persist scripts and unified output patterns.

review-skia-update Skill

Analyzes a Skia upstream merge PR and produces a structured, schema-validated JSON report. Reduces 100K–500K line diffs to focused, human-reviewable artifacts.

Architecture

4 phases — mechanical work is scripted (Python), judgment work is AI:

Phase What How
1. Orchestrator Fetch metadata, checkout both PRs, run generator, source integrity, DEPS audit run_review.py
2. Summaries Write factual summaries for each change, build JSON report AI reads raw-results.json
3. C# Review Review companion SkiaSharp PR (hand-written wrappers, ABI, tests) AI reads checked-out files
4. Validate & Persist Schema validation + copy to output/ai/ validate-skia-review.ps1 + persist-skia-review.ps1

Scripts

  • run_review.py — orchestrator (imports check scripts as modules)
  • check_generated_files.py — runs pwsh generate.ps1, verifies output via git diff
  • check_source.py — diff-of-diffs for upstream + interop integrity
  • check_deps.py — DEPS parsing and comparison
  • validate-skia-review.ps1 / .py — JSON schema validation
  • persist-skia-review.ps1 — copy to output/ai/

Key design decisions

  • Orchestrator handles checkout: Both SkiaSharp companion PR and skia submodule checked out in Phase 1
  • Both PRs required: A Skia update always has a companion SkiaSharp PR — no optional fallback
  • No-retry policy: Generator check runs once — FAIL is FAIL (prevents agents from re-running for a different result)
  • Generator output captured: generator-output.log as proof of execution
  • Schema-validated: additionalProperties: false throughout, typed companion PR findings
  • Phase gates: Validation and persist follow the proven issue-triage/repro/fix pattern with stop signs

Evaluation results

Tested with PR #170 (m122, good generator) and PR #171 (m132, bad generator structs) — 3 serial runs each:

PR Gen Status Consistency
170 PASS ✅ 3/3 identical results
171 FAIL ✅ 3/3 identical results

All runs: summaries present on all items, diffs included in JSON, 0 absolute paths, companion PR reviewed.

Other changes

  • utils/generate.ps1: Replace Invoke-Expression with splatted args array for proper exit code propagation (also in Fix generate.ps1 error propagation #3569)
  • scripts/persist-to-cache.ps1: Shared script for pushing output/ai/ to the data-cache branch
  • Issue pipeline skills: Add persist scripts, update SKILL.md with unified output patterns

mattleibow and others added 2 commits March 23, 2026 20:16
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
… review

- Add Run-Review.ps1 orchestrator script for mechanical steps
- Fix generate.ps1: replace Invoke-Expression with direct invocation for proper exit code propagation
- Simplify Check-GeneratedFiles.ps1: use git diff instead of string comparison
- Restructure SKILL.md as thin dispatcher (109 lines)
- Add references/writing-summaries.md for judgment guidance
- Add references/csharp-review.md for companion PR review (now required)
- Add no-retry policy for generation verification
- Capture raw generator output as proof of execution

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@mattleibow mattleibow force-pushed the dev/review-skia-update-skill branch 3 times, most recently from 36c2056 to 83fcbc5 Compare March 23, 2026 21:39
@mattleibow mattleibow requested a review from Copilot March 23, 2026 21:54
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds a new review-skia-update skill intended to mechanically summarize large upstream Skia merge PRs into schema-validated review artifacts, and updates existing skills to persist outputs into output/ai/ (with timestamp-isolated /tmp/skiasharp/... working directories) instead of auto-pushing to the data-cache worktree.

Changes:

  • Introduces the review-skia-update skill (orchestrator + checks + schema/docs + validators + persist script).
  • Adds scripts/persist-to-cache.ps1 to manually copy/push output/ai/ contents into .data-cache/.
  • Updates issue pipeline skills (triage/repro/fix) to persist to output/ai/ and adjusts docs/examples for timestamped /tmp/skiasharp/.../{timestamp}/ paths.

Reviewed changes

Copilot reviewed 27 out of 27 changed files in this pull request and generated 11 comments.

Show a summary per file
File Description
scripts/persist-to-cache.ps1 New helper to copy output/ai/ into .data-cache/ and optionally commit/push.
.github/skills/review-skia-update/SKILL.md New skill definition and end-to-end procedure (run → summarize → validate → persist).
.github/skills/review-skia-update/scripts/run_review.py Orchestrates fetching/checkouts and runs generated/source/deps checks into raw-results.json.
.github/skills/review-skia-update/scripts/check_generated_files.py Regenerates P/Invokes and reports mismatches.
.github/skills/review-skia-update/scripts/check_source.py Produces diff-of-diffs outputs for upstream vs interop integrity.
.github/skills/review-skia-update/scripts/check_deps.py Parses and compares Skia DEPS across refs.
.github/skills/review-skia-update/scripts/validate-skia-review.py Python validator for final review JSON.
.github/skills/review-skia-update/scripts/validate-skia-review.ps1 PowerShell validator for final review JSON.
.github/skills/review-skia-update/scripts/persist-skia-review.ps1 Copies validated review JSON into output/ai/... for collection.
.github/skills/review-skia-update/scripts/.gitignore Ignores __pycache__/ for the new skill scripts directory.
.github/skills/review-skia-update/references/skia-review-schema.json New JSON schema for skia update review reports.
.github/skills/review-skia-update/references/schema-cheatsheet.md Quick reference guide for producing schema-valid reports.
.github/skills/review-skia-update/references/writing-summaries.md Guidance for summarizing raw-results.json into final report JSON.
.github/skills/review-skia-update/references/csharp-review.md Guidance for reviewing the SkiaSharp companion PR and recording findings.
.github/skills/issue-triage/scripts/persist-triage.ps1 Switches persistence to output/ai/ (no auto-push).
.github/skills/issue-triage/SKILL.md Updates examples/paths for timestamped /tmp/skiasharp/... outputs and new persist flow.
.github/skills/issue-repro/scripts/persist-repro.ps1 Switches persistence to output/ai/ (no auto-push).
.github/skills/issue-repro/SKILL.md Updates examples/paths for timestamped outputs and new persist flow.
.github/skills/issue-repro/references/repro-schema.json Updates docs string to reference timestamped repro directory.
.github/skills/issue-repro/references/platform-wasm-blazor.md Updates repro instructions to timestamped directories.
.github/skills/issue-repro/references/platform-docker-linux.md Updates docker repro instructions to timestamped directories.
.github/skills/issue-repro/references/platform-console.md Updates console repro instructions to timestamped directories.
.github/skills/issue-repro/references/bug-categories.md Updates repro instructions to timestamped directories.
.github/skills/issue-repro/references/anti-patterns.md Updates repro constraints to timestamped directories.
.github/skills/issue-fix/scripts/persist-fix.ps1 Switches persistence to output/ai/ (no auto-push).
.github/skills/issue-fix/SKILL.md Updates validation/persist instructions and output paths to include timestamps.
.github/skills/issue-fix/references/docker-testing.md Updates docker test instructions to use a timestamped directory.

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 27 out of 27 changed files in this pull request and generated 8 comments.

Comment on lines +31 to +36
foreach ($file in $files) {
$relativePath = [System.IO.Path]::GetRelativePath((Resolve-Path $source).Path, $file.FullName)
$destPath = Join-Path $dest $relativePath
New-Item -ItemType Directory -Force (Split-Path $destPath) | Out-Null
Copy-Item $file.FullName $destPath -Force
Write-Host " 📄 $relativePath"
Copy link

Copilot AI Mar 23, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[System.IO.Path]::GetRelativePath(...) is not available in Windows PowerShell 5.1/.NET Framework, so this script will fail if run under powershell.exe. If this script requires PowerShell 7+, add an explicit #requires -Version 7.x (or similar) at the top; otherwise replace GetRelativePath with a PS5.1-compatible relative-path calculation.

Copilot uses AI. Check for mistakes.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed — added #requires -Version 7.0 at top of persist-to-cache.ps1.

Comment on lines +54 to +62
git commit -m "ai: persist output from $(Get-Date -Format 'yyyy-MM-dd HH:mm')"

$pushed = $false
for ($i = 0; $i -lt 3; $i++) {
git push 2>&1
if ($LASTEXITCODE -eq 0) { $pushed = $true; break }
Write-Host "⚠️ Push failed (attempt $($i + 1)/3), rebasing..."
git pull --rebase
}
Copy link

Copilot AI Mar 23, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The git commit/git pull --rebase/git push calls aren’t checked for failure (only the git diff --cached --quiet and git push retry loop are). If git commit fails (e.g., missing user.name/email) the script will continue and can produce confusing output. Capture and validate $LASTEXITCODE after each git command that must succeed (especially commit and pull --rebase) and exit non-zero with a clear message on failure.

Copilot uses AI. Check for mistakes.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed — added $LASTEXITCODE checks after git commit and git pull --rebase, exits with clear error on failure.

Comment on lines +154 to +160
base_sha = subprocess.run(
["git", "rev-parse", "origin/skiasharp"],
cwd=skia_root, capture_output=True, text=True,
).stdout.strip()
pr_head_sha = pr["headRefOid"]

eprint(f" Base (skiasharp): {base_sha}")
Copy link

Copilot AI Mar 23, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

run_review.py fetches baseRefName/baseRefOid from gh pr view, but then ignores them and hard-codes the base SHA as origin/skiasharp. If the mono/skia PR targets a different base branch (or if skiasharp is renamed), the comparisons will be against the wrong base. Use the PR metadata (baseRefOid and/or baseRefName) to determine base_sha and fetch the correct ref.

Suggested change
base_sha = subprocess.run(
["git", "rev-parse", "origin/skiasharp"],
cwd=skia_root, capture_output=True, text=True,
).stdout.strip()
pr_head_sha = pr["headRefOid"]
eprint(f" Base (skiasharp): {base_sha}")
# Determine base SHA from PR metadata, with fallbacks for legacy behavior
base_ref_name = pr.get("baseRefName")
base_sha = pr.get("baseRefOid")
if not base_sha:
# Fall back to resolving the base branch locally if the OID is unavailable
ref = f"origin/{base_ref_name}" if base_ref_name else "origin/skiasharp"
base_sha = subprocess.run(
["git", "rev-parse", ref],
cwd=skia_root, capture_output=True, text=True,
).stdout.strip()
if not base_sha or len(base_sha) < 7:
raise RuntimeError(
f"Could not determine valid base SHA for PR "
f"(baseRefName={base_ref_name!r}, baseRefOid={pr.get('baseRefOid')!r})"
)
pr_head_sha = pr["headRefOid"]
eprint(f" Base ({base_ref_name or 'unknown-base'}): {base_sha}")

Copilot uses AI. Check for mistakes.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed — base SHA now derived from PR metadata (baseRefOidorigin/{baseRefName}origin/skiasharp fallback chain).

Comment on lines +24 to +42
def parse_deps(content: str) -> dict:
"""Execute DEPS as Python and extract the deps dict."""
# DEPS files use Var() to reference variables defined in vars = {}
# First pass with stub Var to extract vars dict, then re-exec with real Var
restricted_builtins = {"__builtins__": {"True": True, "False": False, "None": None}}
stub_ns = {**restricted_builtins, "Var": lambda name: f"__VAR_{name}__"}
try:
exec(content, stub_ns) # noqa: S102
except Exception as e:
eprint(f"Warning: DEPS exec failed: {e}")
return {}

variables = stub_ns.get("vars", {})
namespace = {**restricted_builtins, "Var": lambda name: variables.get(name, name)}
try:
exec(content, namespace) # noqa: S102
except Exception as e:
eprint(f"Warning: DEPS re-exec failed: {e}")
return {}
Copy link

Copilot AI Mar 23, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

parse_deps() executes the DEPS file with exec(). Even with restricted __builtins__, DEPS content comes from the PR head/upstream branch and can be attacker-controlled (fork PRs), so this is still a risky code-execution primitive. Prefer a non-executing parser (e.g., ast.parse + a small evaluator that only permits literals/dicts/lists and Var("..."), or a purpose-built DEPS parser) so malicious DEPS content can’t run arbitrary Python.

Copilot uses AI. Check for mistakes.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed — replaced exec() with ast.parse + ast.literal_eval walker. Only extracts literal dicts/strings and resolves Var() calls — no code execution.

Comment on lines +101 to +108
base_content = git_show_deps(base_sha, "DEPS-base")
head_content = git_show_deps(pr_head, "DEPS-head")
upstream_content = git_show_deps(upstream_branch, "DEPS-upstream")

base_deps = parse_deps(base_content)
head_deps = parse_deps(head_content)
upstream_deps = parse_deps(upstream_content)

Copy link

Copilot AI Mar 23, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The script claims to audit “between base, head, and upstream”, and it parses upstream_content into upstream_deps, but upstream_deps is never used in the comparisons (only base vs head is compared). Either incorporate upstream into the audit output (e.g., flag head pins that diverge from upstream, or indicate which base/head changes are carrying fork-only pins) or remove the upstream parsing/claims to avoid misleading results.

Copilot uses AI. Check for mistakes.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed — added upstreamRevision and matchesUpstream fields to added/changed deps so reviewers can see fork-only pins vs upstream-aligned changes.

Comment on lines +75 to +79
subprocess.run(
["git", "checkout", "HEAD", "--", HARFBUZZ_GENERATED_FILE],
cwd=repo_root,
capture_output=True,
)
Copy link

Copilot AI Mar 23, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The git checkout HEAD -- ... used to revert the HarfBuzz generated file ignores the exit code. If the path changes or the checkout fails, the script will continue and may report a false PASS/FAIL. Capture the return code (or use check=True) and surface an explicit error when the revert step fails.

Suggested change
subprocess.run(
["git", "checkout", "HEAD", "--", HARFBUZZ_GENERATED_FILE],
cwd=repo_root,
capture_output=True,
)
revert_result = subprocess.run(
["git", "checkout", "HEAD", "--", HARFBUZZ_GENERATED_FILE],
cwd=repo_root,
capture_output=True,
text=True,
)
if revert_result.returncode != 0:
eprint(f"❌ Failed to revert {HARFBUZZ_GENERATED_FILE} to HEAD (git exit code {revert_result.returncode})")
if revert_result.stderr:
eprint(f" git stderr: {revert_result.stderr.strip()}")
return {
"status": "FAIL",
"checked": [],
"mismatches": [],
"generatorError": f"Failed to revert {HARFBUZZ_GENERATED_FILE} to HEAD (git exit code {revert_result.returncode})",
"generatorLog": generator_log,
}

Copilot uses AI. Check for mistakes.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed — HarfBuzz revert now checks returncode and returns FAIL with error on failure.

Comment on lines +92 to +117
# Risk consistency
gen_fail = data.get("generatedFiles", {}).get("status") == "FAIL"
upstream_review = data.get("upstreamIntegrity", {}).get("status") == "REVIEW_REQUIRED"
expected_high = gen_fail or upstream_review
if expected_high and data.get("riskAssessment") != "HIGH":
errors.append("riskAssessment should be HIGH")

deps_review = data.get("depsAudit", {}).get("status") == "REVIEW_REQUIRED"
interop_review = data.get("interopIntegrity", {}).get("status") == "REVIEW_REQUIRED"
expected_medium = deps_review or interop_review
if not expected_high and expected_medium and data.get("riskAssessment") == "LOW":
errors.append("riskAssessment should be MEDIUM or HIGH, not LOW")

# No absolute paths
json_text = json.dumps(data)
for m in re.findall(r'"[^"]*(?:/Users/|/home/|C:\\Users\\)[^"]*"', json_text):
errors.append(f"Absolute path found: {m}")

if not errors:
pr = data.get("meta", {}).get("skiaPrNumber", "?")
risk = data.get("riskAssessment", "?")
print(f"✅ {path.name} is valid (PR #{pr}, risk: {risk})")
sys.exit(0)

print(f"❌ {len(errors)} validation error(s) in {path.name}:\n")
for e in errors:
Copy link

Copilot AI Mar 23, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Python validator and the PowerShell validator enforce different rules (e.g., the PS script validates meta.shas.* are 40-char hex, but the Python version doesn’t). Since SKILL.md explicitly treats Python as the fallback validator, consider aligning the Python checks with the PS checks (or vice versa) so validation results don’t depend on which runtime is available.

Copilot uses AI. Check for mistakes.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed — Python validator now checks meta.shas.* are 40-char hex strings, aligned with the PowerShell validator.

Comment on lines +115 to +117
- Each dep item must have `name` + `summary`
- No absolute paths (redact `/Users/...` → relative)
- Schema version must be `"1.0"`
Copy link

Copilot AI Mar 23, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This cheatsheet has duplicated validation bullets at the end (e.g., “No absolute paths” and “Schema version must be "1.0"” appear twice). Deduplicate the list to avoid confusion about whether there are additional requirements.

Suggested change
- Each dep item must have `name` + `summary`
- No absolute paths (redact `/Users/...` → relative)
- Schema version must be `"1.0"`

Copilot uses AI. Check for mistakes.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed — removed 3 duplicate validation bullets.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 27 out of 27 changed files in this pull request and generated 4 comments.

Comment on lines +71 to +74
repo_root = subprocess.run(
["git", "rev-parse", "--show-toplevel"],
capture_output=True, text=True,
).stdout.strip()
Copy link

Copilot AI Mar 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

git rev-parse --show-toplevel is invoked without checking returncode. If this script is run outside a git worktree (or git fails), repo_root becomes an empty string and subsequent path construction will fail with confusing errors. Capture the CompletedProcess, validate returncode == 0, and raise a clear fatal error if it fails.

Suggested change
repo_root = subprocess.run(
["git", "rev-parse", "--show-toplevel"],
capture_output=True, text=True,
).stdout.strip()
# Determine repository root; fail fast if not inside a git worktree.
git_root_result = run_git(["rev-parse", "--show-toplevel"], cwd=os.getcwd(), check=True)
repo_root = git_root_result.stdout.strip()
if not repo_root:
raise RuntimeError(
"Failed to determine git repository root using 'git rev-parse --show-toplevel'. "
"Ensure this script is run from within a git worktree."
)

Copilot uses AI. Check for mistakes.
["git", "diff", "--no-index", old_tmp.name, new_tmp.name],
capture_output=True,
text=True,
)
Copy link

Copilot AI Mar 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When computing patchDiff, git diff --no-index is run without checking returncode. That command commonly returns 1 when differences are found (expected), but returns >1 on errors; right now an error would silently produce an empty/partial patchDiff. Consider explicitly handling returncode > 1 as fatal and surfacing stderr, while allowing 0/1 as success.

Suggested change
)
)
# git diff --no-index returns:
# 0 when there were no differences
# 1 when differences were found (expected)
# >1 on errors, which we must treat as fatal
if patch_result.returncode not in (0, 1):
raise RuntimeError(
f"'git diff --no-index' failed with exit code {patch_result.returncode}: {patch_result.stderr}"
)

Copilot uses AI. Check for mistakes.
Comment on lines +66 to +67
eprint(f"Warning: DEPS parse failed: {e}")
return {}
Copy link

Copilot AI Mar 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If ast.parse(DEPS) fails, parse_deps() returns {} and the audit can incorrectly end up with status: PASS (because base/head deps both look empty). Treat a DEPS parse failure as fatal (or at least force REVIEW_REQUIRED and surface a parseError) so the orchestrator can’t silently produce a misleading clean report.

Suggested change
eprint(f"Warning: DEPS parse failed: {e}")
return {}
eprint(f"Error: DEPS parse failed: {e}")
raise

Copilot uses AI. Check for mistakes.
@@ -83,7 +83,7 @@ gh issue view {number} --repo mono/SkiaSharp --json title,body,labels,comments,s
If using cached JSON:

```bash
Copy link

Copilot AI Mar 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The command that redirects to /tmp/skiasharp/triage/{timestamp}/{number}.md will fail on a fresh machine because the {timestamp} directory isn’t created first. Add an explicit mkdir -p /tmp/skiasharp/triage/{timestamp} (or equivalent) before the redirection so the example is copy/paste safe.

Suggested change
```bash
```bash
mkdir -p /tmp/skiasharp/triage/{timestamp}

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

Status: No status

Development

Successfully merging this pull request may close these issues.

2 participants