add non-blocking PR benchmark guard in manifold.yml (base vs head perfTest) by AnshulPatil2005 · Pull Request #1680 · elalish/manifold

AnshulPatil2005 · 2026-04-27T13:11:36Z

Summary

This PR adds an initial PR benchmark guard to CI for performance visibility during review.
It runs a lightweight benchmark comparison between the PR base commit and the PR head commit on the same runner to reduce cross-runner noise.

working

Checkout repository with full history.
Create two git worktrees:
- wt-base from github.event.pull_request.base.sha
- wt-head from github.event.pull_request.head.sha
Build and run extras/perfTest for each variant (base, head) with the same settings.
The benchmark used is perfTest.
Repeat runs (PERF_REPEATS=3) and save outputs under:
- bench/base/run*.txt
- bench/head/run*.txt
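
The run-and-save part of the steps above could look roughly like the following. This is a minimal Python sketch, not the workflow's actual script: the helper name `run_repeats`, its arguments, and the `run<N>.txt` numbering are assumptions for illustration. Only `PERF_REPEATS=3` and the `bench/<variant>/run*.txt` layout come from the description.

```python
import subprocess
from pathlib import Path

PERF_REPEATS = 3  # value taken from the PR description


def run_repeats(cmd, variant, out_root="bench", repeats=PERF_REPEATS):
    """Run `cmd` `repeats` times and save stdout to <out_root>/<variant>/run<N>.txt.

    `cmd` is an argv list, e.g. the path to a built perfTest binary.
    The real binary path depends on the build directory and is not assumed here.
    """
    out_dir = Path(out_root) / variant
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for i in range(1, repeats + 1):
        # check=False: a failing run does not abort the remaining runs,
        # in keeping with the non-blocking intent of the job.
        proc = subprocess.run(cmd, capture_output=True, text=True, check=False)
        path = out_dir / f"run{i}.txt"
        path.write_text(proc.stdout)
        written.append(path)
    return written
```

Calling it once with `variant="base"` and once with `variant="head"`, each time with the binary built from the matching worktree, would produce the two output directories listed above.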

reporting

Parse timing lines (time = ... sec) from both sides.
Compute per-run means and compare using mean of run means (primary); median of run means is also reported for context.
Emit a warning when both thresholds are exceeded:
- REGRESSION_WARN_PCT=20
- REGRESSION_WARN_ABS_MS=10
Append markdown summary to GITHUB_STEP_SUMMARY.
Upload raw outputs and JSON summary as artifact (pr-benchmark-guard).
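
The parsing and threshold logic described above might be sketched as follows. This is an illustrative Python sketch under stated assumptions: the regular expression guesses at the exact shape of the `time = ... sec` lines, "exceeded" is read as strictly greater than, and the function names are hypothetical. The two threshold values are the ones listed in the description.

```python
import re
from statistics import mean, median

REGRESSION_WARN_PCT = 20     # from the PR description
REGRESSION_WARN_ABS_MS = 10  # from the PR description

# Assumed shape of a timing line; the description only says "time = ... sec".
TIME_RE = re.compile(r"time\s*=\s*([0-9]*\.?[0-9]+(?:[eE][-+]?[0-9]+)?)\s*sec")


def parse_times(text):
    """Return all timings (in seconds) found in one run's output."""
    return [float(m.group(1)) for m in TIME_RE.finditer(text)]


def summarize(run_outputs):
    """Mean and median of the per-run means, in seconds.

    Raises statistics.StatisticsError if a run has no samples, so empty
    runs need to be filtered out (and warned about) before calling this.
    """
    run_means = [mean(parse_times(text)) for text in run_outputs]
    return {"mean": mean(run_means), "median": median(run_means)}


def is_regression(base_sec, head_sec):
    """True only when BOTH the relative and the absolute threshold are exceeded.

    Assumes base_sec > 0.
    """
    delta_ms = (head_sec - base_sec) * 1000.0
    delta_pct = (head_sec - base_sec) / base_sec * 100.0
    return delta_pct > REGRESSION_WARN_PCT and delta_ms > REGRESSION_WARN_ABS_MS
```

Requiring both thresholds keeps very short cases from triggering a warning on a large percentage change that amounts to only a millisecond or two, and keeps long cases from triggering on a small relative drift.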

expected follow-up

Calibrate thresholds and repeats using data.
Improve output accordingly if needed (case-level clarity, summary polish).

ref - opencax/GSoC#114 (comment)

codecov · 2026-04-27T13:33:00Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 95.05%. Comparing base (8b173db) to head (44e1489).
⚠️ Report is 34 commits behind head on master.

Additional details and impacted files

@@            Coverage Diff             @@
##           master    #1680      +/-   ##
==========================================
+ Coverage   94.87%   95.05%   +0.18%     
==========================================
  Files          36       37       +1     
  Lines        8305     8337      +32     
==========================================
+ Hits         7879     7925      +46     
+ Misses        426      412      -14


AnshulPatil2005 · 2026-05-02T14:47:07Z

The artifact upload uses if: always() so benchmark outputs are available even if earlier steps fail.
Added a cleanup step, which was missing, to remove the wt-base/wt-head worktrees after the job.

If base/head data is invalid (e.g., different run counts, missing run files, or empty parsed samples), should we fail the benchmark_guard job?

elalish · 2026-05-03T09:55:20Z

> if base/head data is invalid (e.g., different run counts, missing run files, or empty parsed samples), should we fail the benchmark_guard job?

No, if it's only a warning when the perf drops, it should also only be a warning if these other things fail. Out of curiosity, where will these warnings show up?
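
The warning-only handling of invalid data suggested here could be sketched like this. It is a hedged Python illustration, not code from the PR: the function name `collect_data_warnings`, the `run*.txt` glob, and the `"time ="` substring check are assumptions.

```python
from pathlib import Path


def collect_data_warnings(base_dir, head_dir, expected_runs):
    """Return a list of warning strings; never raises for bad benchmark data."""
    warnings = []
    counts = {}
    for label, d in (("base", Path(base_dir)), ("head", Path(head_dir))):
        files = sorted(d.glob("run*.txt")) if d.is_dir() else []
        counts[label] = len(files)
        if len(files) < expected_runs:
            warnings.append(
                f"{label}: expected {expected_runs} run files, found {len(files)}"
            )
        for f in files:
            # Crude stand-in for "empty parsed samples".
            if "time =" not in f.read_text():
                warnings.append(f"{label}: no timing samples parsed from {f.name}")
    if counts["base"] != counts["head"]:
        warnings.append(
            f"run count mismatch: base={counts['base']} head={counts['head']}"
        )
    return warnings
```

The caller would report each returned string as a warning and still exit with status 0, so the job stays non-blocking.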

AnshulPatil2005 · 2026-05-04T14:45:46Z

> if base/head data is invalid (e.g., different run counts, missing run files, or empty parsed samples), should we fail the benchmark_guard job?

> No, if it's only a warning when the perf drops, it should also only be a warning if these other things fail. Out of curiosity, where will these warnings show up?

Got it.
For visibility: warnings currently show up as GitHub Actions annotations in the benchmark_guard job logs (::warning::) and also in the uploaded bench artifacts (summary.md/result.json).
An example of the latest output is attached:

result.json
summary.md
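
For readers unfamiliar with the mechanism, here is a small Python sketch of how a script can surface a warning in both places. The `::warning::` workflow command and the `GITHUB_STEP_SUMMARY` file are standard GitHub Actions features; the function name `emit_warning` and the summary formatting are assumptions, not taken from the PR.

```python
import os


def emit_warning(message, summary_lines=()):
    """Print a GitHub Actions warning annotation and append to the step summary."""
    # Workflow commands are single-line, so fold any newlines in the message.
    print(f"::warning::{' '.join(message.splitlines())}")
    summary_path = os.environ.get("GITHUB_STEP_SUMMARY")
    if summary_path:  # unset when run outside GitHub Actions
        with open(summary_path, "a", encoding="utf-8") as fh:
            fh.write(f"> **Warning:** {message}\n")
            for line in summary_lines:
                fh.write(line + "\n")
```

Because the function only prints and appends, it never changes the exit status of the step.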

AnshulPatil2005 · 2026-05-24T13:30:14Z

I switched the primary comparison metric from median of run means to mean of run means, since it should better reflect overall timing differences across repeated runs; median of run means stays in the summary as a secondary context metric.
Am I going in the right direction here? Should there be any specific change? @elalish @pca006132

elalish · 2026-05-24T20:15:11Z

Sounds reasonable - @pca006132 you've been doing more on the perf testing side of things, do you have time to review this?

pca006132

I don't understand what mean you are talking about.

By run means, are you running each single benchmark multiple times and taking their mean? And mean of run means is taking the mean across benchmarks? If yes, I don't think you need the final mean, just compare the mean of each benchmark.

AnshulPatil2005 · 2026-05-25T15:16:17Z

> I don't understand what mean you are talking about.
>
> By run means, are you running each single benchmark multiple times and taking their mean? And mean of run means is taking the mean across benchmarks? If yes, I don't think you need the final mean, just compare the mean of each benchmark.

Yes, you are understanding it right. Honestly, I was confused about the comparison approach; I will switch to comparing each benchmark's mean.
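
The per-benchmark comparison suggested in the review could be sketched as below. This is an illustrative Python sketch, not the code that landed in the PR: the function name, the dictionary layout, the benchmark labels in the usage example, and the default thresholds (reused from the PR description) are all assumptions.

```python
from statistics import mean


def compare_per_benchmark(base, head, warn_pct=20.0, warn_abs_ms=10.0):
    """Compare mean timings benchmark by benchmark.

    `base` and `head` map a benchmark label to its timings in seconds,
    one entry per repeated run. No mean is taken across benchmarks.
    """
    rows = []
    for name in sorted(base.keys() & head.keys()):
        if not base[name] or not head[name]:
            continue  # empty samples are reported elsewhere as warnings
        b, h = mean(base[name]), mean(head[name])
        if b <= 0:
            continue  # avoid dividing by a zero baseline
        delta_ms = (h - b) * 1000.0
        delta_pct = (h - b) / b * 100.0
        rows.append({
            "name": name,
            "base_ms": b * 1000.0,
            "head_ms": h * 1000.0,
            "delta_pct": delta_pct,
            "warn": delta_pct > warn_pct and delta_ms > warn_abs_ms,
        })
    return rows
```

For example, with hypothetical labels, `compare_per_benchmark({"boolean": [1.0, 1.2, 1.1]}, {"boolean": [1.5, 1.6, 1.7]})` returns one row for `boolean` with `warn` set, since the mean rises from about 1.1 s to about 1.6 s.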

pca006132

Can we use cached build from master? I remember GitHub action has some complicated cache setup, some may have security concerns, so we may need to be a bit careful about that.

AnshulPatil2005 · 2026-05-28T11:22:24Z

> Can we use cached build from master? I remember GitHub action has some complicated cache setup, some may have security concerns, so we may need to be a bit careful about that.

Yes, good point. I am also looking for a safe cache strategy for base/master builds that avoids cross-branch cache poisoning and the other issues you mentioned.

This could also help in the other sanitizer-lane PR.

Edit: this job currently takes around 4 minutes, while the overall CI critical path is much longer, so I don't think we should add a cache to it.

init

2789e8e

upload benchmark artifacts and clean worktrees

217463d

AnshulPatil2005 force-pushed the ci/pr-benchmark-guard branch from b963cd9 to 217463d on May 2, 2026 14:43

AnshulPatil2005 added 2 commits May 18, 2026 13:08

all warning only and run count mismatch

3cfaaa4

completely non-blocking and mean of mean run times compared

3b4ef01

AnshulPatil2005 marked this pull request as ready for review May 24, 2026 13:23

pca006132 requested changes May 24, 2026

Comment thread .github/workflows/manifold.yml Outdated

Comment thread .github/workflows/manifold.yml Outdated

Comment thread .github/workflows/manifold.yml Outdated

AnshulPatil2005 added 4 commits May 25, 2026 21:37

remove redundant if: always()

1be007c

changed to per benchmark mean

439ec0b

standardize perf JSON schema with run index, stdev, n_runs, and metadata

12ee8bb

.

aea1a87

pca006132 reviewed May 28, 2026

Comment thread .github/workflows/manifold.yml Outdated

Comment thread .github/workflows/manifold.yml Outdated

AnshulPatil2005 added 2 commits May 28, 2026 13:29

add raw files to logs and clean yml

a2cc8cb

add EOF

6c857f4

.

44e1489

AnshulPatil2005 wants to merge 11 commits into elalish:master from AnshulPatil2005:ci/pr-benchmark-guard

3 participants
