Add benchmarking suite by jpfairbanks · Pull Request #63 · AlgebraicJulia/CellularSheaves.jl

jpfairbanks · 2026-06-02T01:48:12Z

This PR adds a benchmark suite, including a GitHub Actions workflow and SLURM sharding. The SLURM part is untested at this point.

codecov · 2026-06-02T01:52:14Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 73.60%. Comparing base (c379f01) to head (65a27d7).

Additional details and impacted files

@@           Coverage Diff           @@
##             main      #63   +/-   ##
=======================================
  Coverage   73.60%   73.60%           
=======================================
  Files          36       36           
  Lines        2550     2550           
=======================================
  Hits         1877     1877           
  Misses        673      673

Copilot

Pull request overview

Adds a dedicated benchmarking suite to CellularSheaves.jl, integrating BenchmarkTools.jl/PkgBenchmark.jl with shardable “small” (CI/local) and “large” (SLURM) tiers, plus docs and automation to run and render reports without impacting tests/docs builds.

Changes:

- Introduces a benchmark framework under bench/ (suite definition, shard selection, result capture, and report rendering) and a benchmark/benchmarks.jl entrypoint for PkgBenchmark.jl.
- Adds a GitHub Actions workflow to run *-small shards in parallel and aggregate artifacts into merged reports.
- Adds a SLURM submission/aggregation script and documentation pages/README updates describing usage.

Reviewed changes

Copilot reviewed 17 out of 18 changed files in this pull request and generated 13 comments.

| File | Description |
| --- | --- |
| README.md | Documents benchmark setup, running, comparing, and report rendering. |
| docs/src/benchmarks.md | Adds a docs page describing tiers/shards and how to run in local/CI/SLURM contexts. |
| docs/src/benchmark_report.md | Adds a placeholder docs page for a benchmark report without running benchmarks during docs builds. |
| docs/make.jl | Adds “Benchmarks” section to the Documenter navigation. |
| CLAUDE.md | Updates contributor/agent command snippets to use the new benchmark entrypoints. |
| benchmark/benchmarks.jl | Provides `PkgBenchmark.jl`-compatible `SUITE` definition driven by env-based filtering. |
| bench/src/CellularSheavesBenchmarks.jl | Defines the benchmark module and exports the benchmark/report APIs. |
| bench/src/BenchmarkSuite.jl | Implements benchmark fixtures and the full benchmark group tree. |
| bench/src/BenchmarkShards.jl | Implements shard manifests, profile/shard selection, and CI runner guardrails. |
| bench/src/BenchmarkReports.jl | Implements artifact writing plus Markdown/HTML summary and comparison report rendering. |
| bench/slurm_benchmarks.sh | Adds SLURM array submission and dependent aggregation for “large” shards. |
| bench/run_benchmarks.jl | CLI entrypoint to run benchmarks from environment configuration. |
| bench/render_report.jl | CLI entrypoint to render merged reports from saved artifacts. |
| bench/compare_benchmarks.jl | CLI entrypoint to run `PkgBenchmark` comparisons from environment configuration. |
| bench/Project.toml | Adds benchmark-environment dependencies (`JSON`, `PkgBenchmark`) and compat bounds. |
| bench/benchmarks.jl | Redirects legacy script name to the new runner entrypoint. |
| .gitignore | Ignores benchmark result directories under `bench/results/`. |
| .github/workflows/benchmarks.yml | Adds CI workflow to run shards and upload/aggregate benchmark artifacts. |

+## Generated report
+
+The latest checked-in benchmark report placeholder lives here:
+
+- [Generated benchmark report](generated/benchmark_report.md)
+- [Generated benchmark report](benchmark_report.md)


Copilot · 2026-06-02T02:11:14Z

+const STALK_DIM = 2
+const SMALL_SIZES = [20, 100]
+const LARGE_SIZES = [500]
+const ALL_SIZES = vcat(SMALL_SIZES, LARGE_SIZES)
+const GRAPH_FAMILIES = ["cycle", "path"]
+
+function make_path_sheaf(n, d)
+    g = path_graph(n)
+    sheaf_from_graph(g, d, k -> Matrix{Float64}(I, k, k); symmetric_edges=true)
+end
+
+function make_cycle_sheaf(n, d)
+    g = cycle_graph(n)
+    sheaf_from_graph(g, d, k -> Matrix{Float64}(I, k, k); symmetric_edges=true)
+end
+
+const FACTORIES = Dict(
+    "cycle" => make_cycle_sheaf,
+    "path" => make_path_sheaf,
+)
+
+const SHEAVES = Dict(
+    (family, n) => FACTORIES[family](n, STALK_DIM)
+    for family in GRAPH_FAMILIES
+    for n in ALL_SIZES
+)


Fixed in the latest commit. build_suite now accepts a sizes::Vector{Int} parameter (defaulting to ALL_SIZES) and builds sheaf fixtures only for those sizes. The module-level SHEAVES, HARMONIC_BOUNDARIES, and GLOBAL_SECTION_INPUTS constants (which eagerly constructed all fixtures at load time) have been removed and replaced with local variables computed inside build_suite.

A new sizes_for_shards helper in BenchmarkShards.jl derives the minimal size set from the selected shard names, and filtered_suite now passes that to build_suite — so a small profile run only constructs 20- and 100-node sheaves, skipping the 500-node ones entirely.

Fixed in 7a38f54. build_suite now accepts a sizes::Vector{Int} parameter (defaulting to ALL_SIZES) and constructs sheaves, boundary fixtures, and random inputs only for those sizes. A new sizes_for_shards helper derives the minimal set of sizes from the selected shards, and filtered_suite passes that to build_suite — so small runs never build n=500 fixtures.

+Run all small shards locally:
+
+```julia
+julia --project=bench bench/run_benchmarks.jl
+```
+
+Run one shard explicitly:
+
+```julia
+BENCHMARK_PROFILE=small BENCHMARK_SHARD=assembly-small julia --project=bench bench/run_benchmarks.jl
+```
+
+Render a merged report from prior artifacts:
+
+```julia
+BENCHMARK_INPUT_DIR=bench/results BENCHMARK_OUTPUT_DIR=bench/results julia --project=bench bench/render_report.jl
+```


+```julia
+julia --project=bench bench/run_benchmarks.jl
+BENCHMARK_INPUT_DIR=bench/results BENCHMARK_OUTPUT_DIR=bench/results BENCHMARK_DOCS_MARKDOWN=docs/src/benchmark_report.md julia --project=bench bench/render_report.jl
+```


+
+**Compare against a baseline ref:**
+```
+BENCHMARK_PROFILE=small BENCHMARK_BASELINE_REF=main julia --project=bench bench/compare_benchmarks.jl


jpfairbanks · 2026-06-02T02:10:05Z

+
+**Render benchmark reports:**
+```
+BENCHMARK_INPUT_DIR=bench/results BENCHMARK_OUTPUT_DIR=bench/results julia --project=bench bench/render_report.jl


@copilot fix all problems that stem from the same underlying issue. There is no need to use timestamped directories for the results. Since the results directory is being passed via ENV, users can use either git commits or timestamps in bash scripts to configure this.

Fixed in the latest commit. default_result_dir() now returns bench/results directly (no timestamp subdirectory), so the run and render commands both use the same default path. The slurm script's RESULT_DIR is similarly simplified to bench/results/slurm (no SLURM job ID derivation), and BENCHMARK_RESULT_DIR is now passed explicitly to the aggregate job so it uses the same directory as the array tasks.
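The suggestion above (let the caller choose the results directory through the environment) could look roughly like the following wrapper. This is only a sketch: `result_dir_for_run` is a hypothetical helper, not part of the PR, and it assumes the runner reads `BENCHMARK_RESULT_DIR` the same way the SLURM script passes it.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper (not part of the PR): build a per-run results directory.
# Uses the short git commit hash when available, otherwise a UTC timestamp.
result_dir_for_run() {
  local base="${1:-bench/results}"
  local tag
  if ! tag="$(git rev-parse --short HEAD 2>/dev/null)"; then
    tag="$(date -u +%Y%m%dT%H%M%SZ)"
  fi
  printf '%s/%s\n' "$base" "$tag"
}

# Assumption: the benchmark runner honors BENCHMARK_RESULT_DIR.
BENCHMARK_RESULT_DIR="$(result_dir_for_run bench/results)"
export BENCHMARK_RESULT_DIR
echo "results will be written to ${BENCHMARK_RESULT_DIR}"

# The run and render steps would then point at the same directory, e.g.:
#   julia --project=bench bench/run_benchmarks.jl
#   BENCHMARK_INPUT_DIR="$BENCHMARK_RESULT_DIR" BENCHMARK_OUTPUT_DIR="$BENCHMARK_RESULT_DIR" \
#     julia --project=bench bench/render_report.jl
```

Keeping the naming policy in the calling script means the Julia side only ever sees one directory, so the run and render commands cannot drift apart.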

+      sbatch --parsable \
+        --array="1-${count}" \
+        --export=ALL,BENCHMARK_PROFILE="${PROFILE}",BENCHMARK_RESULT_DIR="${RESULT_DIR}" \
+        "$0" run-array
+    )"
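For readers unfamiliar with chaining SLURM jobs, the dependent aggregation step described in this thread might be submitted along these lines. The sketch only prints the command instead of calling `sbatch`; the `aggregate` subcommand name and the `aggregate_cmd` helper are assumptions for illustration, while `--parsable`, `--export`, and `--dependency=afterok:<jobid>` are standard `sbatch` options.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Print (do not run) an sbatch command for an aggregate job that starts only
# after every task of the array job has finished successfully.
# "aggregate" is a hypothetical subcommand; array_job_id would be the value
# captured from the `sbatch --parsable --array=...` submission.
aggregate_cmd() {
  local array_job_id="$1" profile="$2" result_dir="$3" script="$4"
  printf 'sbatch --parsable --dependency=afterok:%s --export=ALL,BENCHMARK_PROFILE=%s,BENCHMARK_RESULT_DIR=%s %s aggregate\n' \
    "$array_job_id" "$profile" "$result_dir" "$script"
}

# Example with placeholder values.
aggregate_cmd 12345 large bench/results/slurm bench/slurm_benchmarks.sh
```

Passing `BENCHMARK_RESULT_DIR` explicitly to both submissions is what keeps the aggregate job reading from the directory the array tasks wrote to.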


Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>

- Remove module-level SHEAVES, HARMONIC_BOUNDARIES, GLOBAL_SECTION_INPUTS constants (they were built for ALL_SIZES at module load time)
- Add sizes parameter to build_suite; fixtures are now constructed inside the function for only the given sizes
- Add sizes_for_shards helper in BenchmarkShards.jl
- Update filtered_suite to derive needed sizes and pass to build_suite

Copilot

Pull request overview

Copilot reviewed 17 out of 18 changed files in this pull request and generated 1 comment.

add benchmarking suite

6d19270

Copilot AI review requested due to automatic review settings June 2, 2026 01:48

Copilot started reviewing on behalf of jpfairbanks June 2, 2026 01:48

Copilot AI reviewed Jun 2, 2026

Copilot started work on behalf of jpfairbanks June 2, 2026 02:07

Potential fix for pull request finding

5e9e422

Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>

Copilot started work on behalf of jpfairbanks June 2, 2026 02:08

jpfairbanks and others added 3 commits June 1, 2026 22:10

Potential fix for pull request finding

ac1ae60

Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>

Potential fix for pull request finding

99b51ab

Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>

Copilot finished work on behalf of jpfairbanks June 2, 2026 02:11

Copilot finished work on behalf of jpfairbanks June 2, 2026 02:13

Copilot started work on behalf of jpfairbanks June 2, 2026 02:13

Remove timestamped default result dir; use bench/results directly

9a32566

Copilot finished work on behalf of jpfairbanks June 2, 2026 02:17

jpfairbanks requested a review from Copilot June 2, 2026 14:23

Copilot started reviewing on behalf of jpfairbanks June 2, 2026 14:24

Copilot AI reviewed Jun 2, 2026

Comment thread docs/src/benchmarks.md Outdated

jpfairbanks and others added 3 commits June 2, 2026 10:38

remove redundant link

f5dd3ab

Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>

remove constraint on specific JSON.jl point version

3c318bb

relax type constraint in report generation

65a27d7

jpfairbanks wants to merge 9 commits into main from jpf/benchmarking

3 participants
