Add benchmarking suite #63
Codecov Report
✅ All modified and coverable lines are covered by tests.

Coverage Diff

| | main | #63 | +/- |
|---|---|---|---|
| Coverage | 73.60% | 73.60% | |
| Files | 36 | 36 | |
| Lines | 2550 | 2550 | |
| Hits | 1877 | 1877 | |
| Misses | 673 | 673 | |
Pull request overview
Adds a dedicated benchmarking suite to CellularSheaves.jl, integrating BenchmarkTools.jl/PkgBenchmark.jl with shardable “small” (CI/local) and “large” (SLURM) tiers, plus docs and automation to run and render reports without impacting tests/docs builds.
Changes:
- Introduces a benchmark framework under `bench/` (suite definition, shard selection, result capture, and report rendering) and a `benchmark/benchmarks.jl` entrypoint for `PkgBenchmark.jl`.
- Adds a GitHub Actions workflow to run `*-small` shards in parallel and aggregate artifacts into merged reports.
- Adds a SLURM submission/aggregation script and documentation pages/README updates describing usage.
Reviewed changes
Copilot reviewed 17 out of 18 changed files in this pull request and generated 13 comments.
| File | Description |
|---|---|
| README.md | Documents benchmark setup, running, comparing, and report rendering. |
| docs/src/benchmarks.md | Adds a docs page describing tiers/shards and how to run in local/CI/SLURM contexts. |
| docs/src/benchmark_report.md | Adds a placeholder docs page for a benchmark report without running benchmarks during docs builds. |
| docs/make.jl | Adds “Benchmarks” section to the Documenter navigation. |
| CLAUDE.md | Updates contributor/agent command snippets to use the new benchmark entrypoints. |
| benchmark/benchmarks.jl | Provides PkgBenchmark.jl-compatible SUITE definition driven by env-based filtering. |
| bench/src/CellularSheavesBenchmarks.jl | Defines the benchmark module and exports the benchmark/report APIs. |
| bench/src/BenchmarkSuite.jl | Implements benchmark fixtures and the full benchmark group tree. |
| bench/src/BenchmarkShards.jl | Implements shard manifests, profile/shard selection, and CI runner guardrails. |
| bench/src/BenchmarkReports.jl | Implements artifact writing plus Markdown/HTML summary and comparison report rendering. |
| bench/slurm_benchmarks.sh | Adds SLURM array submission and dependent aggregation for “large” shards. |
| bench/run_benchmarks.jl | CLI entrypoint to run benchmarks from environment configuration. |
| bench/render_report.jl | CLI entrypoint to render merged reports from saved artifacts. |
| bench/compare_benchmarks.jl | CLI entrypoint to run PkgBenchmark comparisons from environment configuration. |
| bench/Project.toml | Adds benchmark-environment dependencies (JSON, PkgBenchmark) and compat bounds. |
| bench/benchmarks.jl | Redirects legacy script name to the new runner entrypoint. |
| .gitignore | Ignores benchmark result directories under bench/results/. |
| .github/workflows/benchmarks.yml | Adds CI workflow to run shards and upload/aggregate benchmark artifacts. |
## Generated report

The latest checked-in benchmark report placeholder lives here:

- [Generated benchmark report](generated/benchmark_report.md)
- [Generated benchmark report](benchmark_report.md)
    const STALK_DIM = 2
    const SMALL_SIZES = [20, 100]
    const LARGE_SIZES = [500]
    const ALL_SIZES = vcat(SMALL_SIZES, LARGE_SIZES)
    const GRAPH_FAMILIES = ["cycle", "path"]

    function make_path_sheaf(n, d)
        g = path_graph(n)
        sheaf_from_graph(g, d, k -> Matrix{Float64}(I, k, k); symmetric_edges=true)
    end

    function make_cycle_sheaf(n, d)
        g = cycle_graph(n)
        sheaf_from_graph(g, d, k -> Matrix{Float64}(I, k, k); symmetric_edges=true)
    end

    const FACTORIES = Dict(
        "cycle" => make_cycle_sheaf,
        "path" => make_path_sheaf,
    )

    const SHEAVES = Dict(
        (family, n) => FACTORIES[family](n, STALK_DIM)
        for family in GRAPH_FAMILIES
        for n in ALL_SIZES
    )
Fixed in the latest commit. build_suite now accepts a sizes::Vector{Int} parameter (defaulting to ALL_SIZES) and builds sheaf fixtures only for those sizes. The module-level SHEAVES, HARMONIC_BOUNDARIES, and GLOBAL_SECTION_INPUTS constants (which eagerly constructed all fixtures at load time) have been removed and replaced with local variables computed inside build_suite.
A new sizes_for_shards helper in BenchmarkShards.jl derives the minimal size set from the selected shard names, and filtered_suite now passes that to build_suite — so a small profile run only constructs 20- and 100-node sheaves, skipping the 500-node ones entirely.
Fixed in 7a38f54. build_suite now accepts a sizes::Vector{Int} parameter (defaulting to ALL_SIZES) and constructs sheaves, boundary fixtures, and random inputs only for those sizes. A new sizes_for_shards helper derives the minimal set of sizes from the selected shards, and filtered_suite passes that to build_suite — so small runs never build n=500 fixtures.
Run all small shards locally:

```bash
julia --project=bench bench/run_benchmarks.jl
```

Run one shard explicitly:

```bash
BENCHMARK_PROFILE=small BENCHMARK_SHARD=assembly-small julia --project=bench bench/run_benchmarks.jl
```

Render a merged report from prior artifacts:

```bash
BENCHMARK_INPUT_DIR=bench/results BENCHMARK_OUTPUT_DIR=bench/results julia --project=bench bench/render_report.jl
```

```bash
julia --project=bench bench/run_benchmarks.jl
BENCHMARK_INPUT_DIR=bench/results BENCHMARK_OUTPUT_DIR=bench/results BENCHMARK_DOCS_MARKDOWN=docs/src/benchmark_report.md julia --project=bench bench/render_report.jl
```

**Compare against a baseline ref:**
```
BENCHMARK_PROFILE=small BENCHMARK_BASELINE_REF=main julia --project=bench bench/compare_benchmarks.jl
```

**Render benchmark reports:**
```
BENCHMARK_INPUT_DIR=bench/results BENCHMARK_OUTPUT_DIR=bench/results julia --project=bench bench/render_report.jl
```
@copilot fix all problems that stem from the same underlying issue. There is no need to use timestamped directories for the results. Since the results directory is being passed via ENV, users can use either git commits or timestamps in bash scripts to configure this.
Fixed in the latest commit. default_result_dir() now returns bench/results directly (no timestamp subdirectory), so the run and render commands both use the same default path. The slurm script's RESULT_DIR is similarly simplified to bench/results/slurm (no SLURM job ID derivation), and BENCHMARK_RESULT_DIR is now passed explicitly to the aggregate job so it uses the same directory as the array tasks.
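The suggestion above, that users pick a per-commit or per-timestamp results directory from their own scripts, could look roughly like the following. This is a minimal sketch, not part of the PR: `result_dir_for` is a hypothetical helper, and it assumes that `bench/run_benchmarks.jl` reads `BENCHMARK_RESULT_DIR` in the same way the SLURM script exports it.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper (not in the PR): map a label to a results directory.
result_dir_for() {
  local label="$1"
  printf 'bench/results/%s' "$label"
}

# Prefer the short commit hash; fall back to a UTC timestamp when
# git is unavailable or the working directory is not a checkout.
if label="$(git rev-parse --short HEAD 2>/dev/null)"; then
  :
else
  label="$(date -u +%Y%m%dT%H%M%SZ)"
fi

# Assumption: the runner honours BENCHMARK_RESULT_DIR.
BENCHMARK_RESULT_DIR="$(result_dir_for "$label")"
export BENCHMARK_RESULT_DIR
echo "results will be written to ${BENCHMARK_RESULT_DIR}"

# Run and render against the same directory (commented out in this sketch):
# julia --project=bench bench/run_benchmarks.jl
# BENCHMARK_INPUT_DIR="$BENCHMARK_RESULT_DIR" BENCHMARK_OUTPUT_DIR="$BENCHMARK_RESULT_DIR" \
#   julia --project=bench bench/render_report.jl
```

Keeping the naming policy in the caller's script means the benchmark code only ever sees a single directory, which is what lets the run and render commands share one default.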
    sbatch --parsable \
      --array="1-${count}" \
      --export=ALL,BENCHMARK_PROFILE="${PROFILE}",BENCHMARK_RESULT_DIR="${RESULT_DIR}" \
      "$0" run-array
    )"
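For readers unfamiliar with the pattern, the array submission plus dependent aggregation described in this thread might be structured as below. This is a sketch under assumptions rather than the PR's script: the function names, the `SBATCH_BIN` override, and the `aggregate` mode argument are hypothetical. Only the `sbatch` flags `--parsable`, `--array`, `--export`, and `--dependency=afterok:<id>` are standard Slurm options.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical override so the functions can be exercised without a cluster.
SBATCH_BIN="${SBATCH_BIN:-sbatch}"

# Submit one array task per shard and print the bare job ID.
submit_array() {
  local count="$1" profile="$2" result_dir="$3" script="$4"
  local out
  out="$("$SBATCH_BIN" --parsable \
    --array="1-${count}" \
    --export=ALL,BENCHMARK_PROFILE="${profile}",BENCHMARK_RESULT_DIR="${result_dir}" \
    "$script" run-array)"
  # --parsable may print "jobid;cluster"; keep only the job ID.
  printf '%s\n' "${out%%;*}"
}

# Submit the aggregation job so it starts only after every array task
# has succeeded, and hand it the same results directory explicitly.
submit_aggregate() {
  local array_job_id="$1" result_dir="$2" script="$3"
  local out
  out="$("$SBATCH_BIN" --parsable \
    --dependency="afterok:${array_job_id}" \
    --export=ALL,BENCHMARK_RESULT_DIR="${result_dir}" \
    "$script" aggregate)"   # "aggregate" is an assumed mode name
  printf '%s\n' "${out%%;*}"
}

# Example wiring (commented out; requires a Slurm cluster):
# array_id="$(submit_array 4 large bench/results/slurm "$0")"
# submit_aggregate "$array_id" bench/results/slurm "$0"
```

Passing `BENCHMARK_RESULT_DIR` to both submissions is the point of the fix discussed above: the aggregate job reads from exactly the directory the array tasks wrote to.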
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
- Remove module-level SHEAVES, HARMONIC_BOUNDARIES, GLOBAL_SECTION_INPUTS constants (they were built for ALL_SIZES at module load time)
- Add sizes parameter to build_suite; fixtures are now constructed inside the function for only the given sizes
- Add sizes_for_shards helper in BenchmarkShards.jl
- Update filtered_suite to derive needed sizes and pass to build_suite
This PR adds a benchmark suite, including a GitHub Actions workflow and SLURM sharding. The SLURM part is untested at this point.