Skip to content

Add benchmarking suite#63

Open
jpfairbanks wants to merge 9 commits into
mainfrom
jpf/benchmarking
Open

Add benchmarking suite#63
jpfairbanks wants to merge 9 commits into
mainfrom
jpf/benchmarking

Conversation

@jpfairbanks
Copy link
Copy Markdown
Member

This PR adds a benchmark suite including github actions and slurm sharding. The slurm part is untested at this point.

Copilot AI review requested due to automatic review settings June 2, 2026 01:48
@codecov
Copy link
Copy Markdown

codecov Bot commented Jun 2, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 73.60%. Comparing base (c379f01) to head (65a27d7).

Additional details and impacted files
@@           Coverage Diff           @@
##             main      #63   +/-   ##
=======================================
  Coverage   73.60%   73.60%           
=======================================
  Files          36       36           
  Lines        2550     2550           
=======================================
  Hits         1877     1877           
  Misses        673      673           

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds a dedicated benchmarking suite to CellularSheaves.jl, integrating BenchmarkTools.jl/PkgBenchmark.jl with shardable “small” (CI/local) and “large” (SLURM) tiers, plus docs and automation to run and render reports without impacting tests/docs builds.

Changes:

  • Introduces a benchmark framework under bench/ (suite definition, shard selection, result capture, and report rendering) and a benchmark/benchmarks.jl entrypoint for PkgBenchmark.jl.
  • Adds a GitHub Actions workflow to run *-small shards in parallel and aggregate artifacts into merged reports.
  • Adds SLURM submission/aggregation script and documentation pages/README updates describing usage.

Reviewed changes

Copilot reviewed 17 out of 18 changed files in this pull request and generated 13 comments.

Show a summary per file
File Description
README.md Documents benchmark setup, running, comparing, and report rendering.
docs/src/benchmarks.md Adds a docs page describing tiers/shards and how to run in local/CI/SLURM contexts.
docs/src/benchmark_report.md Adds a placeholder docs page for a benchmark report without running benchmarks during docs builds.
docs/make.jl Adds “Benchmarks” section to the Documenter navigation.
CLAUDE.md Updates contributor/agent command snippets to use the new benchmark entrypoints.
benchmark/benchmarks.jl Provides PkgBenchmark.jl-compatible SUITE definition driven by env-based filtering.
bench/src/CellularSheavesBenchmarks.jl Defines the benchmark module and exports the benchmark/report APIs.
bench/src/BenchmarkSuite.jl Implements benchmark fixtures and the full benchmark group tree.
bench/src/BenchmarkShards.jl Implements shard manifests, profile/shard selection, and CI runner guardrails.
bench/src/BenchmarkReports.jl Implements artifact writing plus Markdown/HTML summary and comparison report rendering.
bench/slurm_benchmarks.sh Adds SLURM array submission and dependent aggregation for “large” shards.
bench/run_benchmarks.jl CLI entrypoint to run benchmarks from environment configuration.
bench/render_report.jl CLI entrypoint to render merged reports from saved artifacts.
bench/compare_benchmarks.jl CLI entrypoint to run PkgBenchmark comparisons from environment configuration.
bench/Project.toml Adds benchmark-environment dependencies (JSON, PkgBenchmark) and compat bounds.
bench/benchmarks.jl Redirects legacy script name to the new runner entrypoint.
.gitignore Ignores benchmark result directories under bench/results/.
.github/workflows/benchmarks.yml Adds CI workflow to run shards and upload/aggregate benchmark artifacts.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread docs/src/benchmarks.md Outdated
Comment on lines +56 to +61
## Generated report

The latest checked-in benchmark report placeholder lives here:

- [Generated benchmark report](generated/benchmark_report.md)
- [Generated benchmark report](benchmark_report.md)
Comment thread bench/src/BenchmarkSuite.jl Outdated
Comment on lines +1 to +26
const STALK_DIM = 2
const SMALL_SIZES = [20, 100]
const LARGE_SIZES = [500]
const ALL_SIZES = vcat(SMALL_SIZES, LARGE_SIZES)
const GRAPH_FAMILIES = ["cycle", "path"]

function make_path_sheaf(n, d)
g = path_graph(n)
sheaf_from_graph(g, d, k -> Matrix{Float64}(I, k, k); symmetric_edges=true)
end

function make_cycle_sheaf(n, d)
g = cycle_graph(n)
sheaf_from_graph(g, d, k -> Matrix{Float64}(I, k, k); symmetric_edges=true)
end

const FACTORIES = Dict(
"cycle" => make_cycle_sheaf,
"path" => make_path_sheaf,
)

const SHEAVES = Dict(
(family, n) => FACTORIES[family](n, STALK_DIM)
for family in GRAPH_FAMILIES
for n in ALL_SIZES
)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in the latest commit. build_suite now accepts a sizes::Vector{Int} parameter (defaulting to ALL_SIZES) and builds sheaf fixtures only for those sizes. The module-level SHEAVES, HARMONIC_BOUNDARIES, and GLOBAL_SECTION_INPUTS constants (which eagerly constructed all fixtures at load time) have been removed and replaced with local variables computed inside build_suite.

A new sizes_for_shards helper in BenchmarkShards.jl derives the minimal size set from the selected shard names, and filtered_suite now passes that to build_suite — so a small profile run only constructs 20- and 100-node sheaves, skipping the 500-node ones entirely.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in 7a38f54. build_suite now accepts a sizes::Vector{Int} parameter (defaulting to ALL_SIZES) and constructs sheaves, boundary fixtures, and random inputs only for those sizes. A new sizes_for_shards helper derives the minimal set of sizes from the selected shards, and filtered_suite passes that to build_suite — so small runs never build n=500 fixtures.

Comment thread README.md Outdated
Comment thread docs/src/benchmarks.md
Comment on lines +24 to +40
Run all small shards locally:

```julia
julia --project=bench bench/run_benchmarks.jl
```

Run one shard explicitly:

```julia
BENCHMARK_PROFILE=small BENCHMARK_SHARD=assembly-small julia --project=bench bench/run_benchmarks.jl
```

Render a merged report from prior artifacts:

```julia
BENCHMARK_INPUT_DIR=bench/results BENCHMARK_OUTPUT_DIR=bench/results julia --project=bench bench/render_report.jl
```
Comment on lines +7 to +10
```julia
julia --project=bench bench/run_benchmarks.jl
BENCHMARK_INPUT_DIR=bench/results BENCHMARK_OUTPUT_DIR=bench/results BENCHMARK_DOCS_MARKDOWN=docs/src/benchmark_report.md julia --project=bench bench/render_report.jl
```
Comment thread CLAUDE.md

**Compare against a baseline ref:**
```
BENCHMARK_PROFILE=small BENCHMARK_BASELINE_REF=main julia --project=bench bench/compare_benchmarks.jl
Comment thread CLAUDE.md

**Render benchmark reports:**
```
BENCHMARK_INPUT_DIR=bench/results BENCHMARK_OUTPUT_DIR=bench/results julia --project=bench bench/render_report.jl
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@copilot fix all problems that stem from the same underlying issue. There is no need to use timestamped directories for the results. Since the results directory is being passed via ENV, users can use either git commits or timestamps in bash scripts to configure this.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in the latest commit. default_result_dir() now returns bench/results directly (no timestamp subdirectory), so the run and render commands both use the same default path. The slurm script's RESULT_DIR is similarly simplified to bench/results/slurm (no SLURM job ID derivation), and BENCHMARK_RESULT_DIR is now passed explicitly to the aggregate job so it uses the same directory as the array tasks.

Comment thread bench/slurm_benchmarks.sh
Comment thread bench/slurm_benchmarks.sh
Comment on lines +24 to +28
sbatch --parsable \
--array="1-${count}" \
--export=ALL,BENCHMARK_PROFILE="${PROFILE}",BENCHMARK_RESULT_DIR="${RESULT_DIR}" \
"$0" run-array
)"
Comment thread bench/slurm_benchmarks.sh
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
jpfairbanks and others added 3 commits June 1, 2026 22:10
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
- Remove module-level SHEAVES, HARMONIC_BOUNDARIES, GLOBAL_SECTION_INPUTS
  constants (they were built for ALL_SIZES at module load time)
- Add sizes parameter to build_suite; fixtures are now constructed
  inside the function for only the given sizes
- Add sizes_for_shards helper in BenchmarkShards.jl
- Update filtered_suite to derive needed sizes and pass to build_suite
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 17 out of 18 changed files in this pull request and generated 1 comment.

Comment thread docs/src/benchmarks.md Outdated
jpfairbanks and others added 3 commits June 2, 2026 10:38
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants