Add minimal benchmark runner for comparing refs using ASV #128
sanskarmit wants to merge 1 commit into sympy:master
Conversation
Copilot reviewed 2 out of 2 changed files in this pull request and generated 4 comments.
```python
def rev_parse(ref: str) -> str:
    """Resolve a git ref to a commit hash."""
    return subprocess.run(
        ["git", "rev-parse", ref], check=True, capture_output=True, text=True
    ).stdout.strip()
```
rev_parse() runs git rev-parse in the current working directory, which resolves refs/commits for this repository, not the repository being benchmarked by ASV (configured via asv.conf.json as https://github.com/sympy/sympy.git). This will typically fail for defaults like origin/master (if this repo’s default branch is main) and, even if it succeeds, it will produce commit hashes that won’t match the ASV result filenames, leading to empty/incorrect reports. Consider removing this and instead (a) accept commit hashes as inputs, or (b) determine the commit hash from the ASV results produced in results_dir (e.g., by inspecting the created *-*.json filenames / metadata), or (c) run git rev-parse inside ASV’s checkout of the benchmarked repo.
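A minimal sketch of option (b) from the comment above: recover the commit hash from the ASV result files themselves rather than running `git rev-parse` in the wrong repository. The helper name `commit_hash_from_results` is hypothetical, and it relies on ASV result files carrying a top-level `commit_hash` field (an assumption about ASV's result format; verify against the ASV version in use).

```python
import json
from pathlib import Path
from typing import Optional


def commit_hash_from_results(results_dir: Path) -> Optional[str]:
    """Return the commit hash recorded in the first ASV result file found.

    ASV names result files "<hash-prefix>-<env-name>.json"; the full commit
    hash is assumed to be stored in the file's "commit_hash" field.
    """
    for result_file in sorted(results_dir.glob("**/*-*.json")):
        try:
            data = json.loads(result_file.read_text())
        except json.JSONDecodeError:
            continue  # skip non-result JSON such as partial writes
        commit = data.get("commit_hash")
        if commit:
            return commit
    return None
```

This keeps the runner independent of the current working directory's git state, since the hash comes from whatever repository ASV actually benchmarked.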
```python
result_files = list(results_dir.glob(f"**/{commit_hash[:8]}-*.json")) + list(
    results_dir.glob(f"**/{commit_hash}-*.json")
)

merged: Dict[str, float] = {}
for result_file in result_files:
    data = json.loads(result_file.read_text())
    for benchmark_name, payload in data.get("results", {}).items():
        value = extract_numeric(payload.get("result") if isinstance(payload, dict) else payload)
        if value is not None:
            merged[benchmark_name] = value
```
load_results() merges results across all matching JSON files but iterates files in filesystem order and overwrites merged[benchmark_name] on collisions. If ASV produces multiple result files for the same commit (e.g., multiple envs from the matrix), this makes the output nondeterministic and can silently drop data. Consider sorting result_files and either grouping results by environment/file or erroring when the same benchmark appears with different values.
Suggested change:

```diff
-result_files = list(results_dir.glob(f"**/{commit_hash[:8]}-*.json")) + list(
-    results_dir.glob(f"**/{commit_hash}-*.json")
-)
-merged: Dict[str, float] = {}
-for result_file in result_files:
-    data = json.loads(result_file.read_text())
-    for benchmark_name, payload in data.get("results", {}).items():
-        value = extract_numeric(payload.get("result") if isinstance(payload, dict) else payload)
-        if value is not None:
-            merged[benchmark_name] = value
+result_files = sorted(
+    list(results_dir.glob(f"**/{commit_hash[:8]}-*.json"))
+    + list(results_dir.glob(f"**/{commit_hash}-*.json")),
+    key=lambda p: str(p),
+)
+merged: Dict[str, float] = {}
+benchmark_sources: Dict[str, Path] = {}
+for result_file in result_files:
+    data = json.loads(result_file.read_text())
+    for benchmark_name, payload in data.get("results", {}).items():
+        value = extract_numeric(
+            payload.get("result") if isinstance(payload, dict) else payload
+        )
+        if value is None:
+            continue
+        if benchmark_name in merged:
+            if merged[benchmark_name] != value:
+                raise ValueError(
+                    f"Conflicting results for benchmark '{benchmark_name}': "
+                    f"{merged[benchmark_name]} from {benchmark_sources[benchmark_name]} "
+                    f"vs {value} from {result_file}"
+                )
+        else:
+            merged[benchmark_name] = value
+            benchmark_sources[benchmark_name] = result_file
```
```python
    parser.add_argument("--base-ref", default="origin/master")
    parser.add_argument("--target-ref", default="HEAD")
    parser.add_argument("--config", default="asv.conf.json")
    parser.add_argument("--output", default="benchmark_results/runner_report.json")
    parser.add_argument("--machine", default="benchmark-runner")
    parser.add_argument("--bench", default=".*")
    return parser.parse_args()


def main() -> None:
    args = parse_args()
    output_path = Path(args.output)
    output_path.parent.mkdir(parents=True, exist_ok=True)

    base_hash = rev_parse(args.base_ref)
    target_hash = rev_parse(args.target_ref)

    base_results_dir = output_path.parent / "base_results"
    target_results_dir = output_path.parent / "target_results"
    if base_results_dir.exists():
        shutil.rmtree(base_results_dir)
    if target_results_dir.exists():
        shutil.rmtree(target_results_dir)
```
The default output location (benchmark_results/...) and the derived base_results/target_results directories are not covered by .gitignore (whereas results/ is). Running this script will likely leave many untracked files in the working tree. Consider defaulting output/results under an already-ignored directory (like results/) or updating the ignore patterns accordingly.
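One possible fix, assuming the repository uses a plain `.gitignore` with a `results/` entry as described: add the runner's default output directory (the directory name is taken from the script's `--output` default) to the ignore patterns.

```gitignore
# Output of the benchmark runner (default --output location),
# including the derived base_results/ and target_results/ directories.
benchmark_results/
```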
```python
def main() -> None:
    args = parse_args()
    output_path = Path(args.output)
    output_path.parent.mkdir(parents=True, exist_ok=True)

    base_hash = rev_parse(args.base_ref)
    target_hash = rev_parse(args.target_ref)

    base_results_dir = output_path.parent / "base_results"
    target_results_dir = output_path.parent / "target_results"
    if base_results_dir.exists():
        shutil.rmtree(base_results_dir)
    if target_results_dir.exists():
        shutil.rmtree(target_results_dir)

    run_asv(args.base_ref, args.config, base_results_dir, machine=args.machine, bench=args.bench)
    run_asv(args.target_ref, args.config, target_results_dir, machine=args.machine, bench=args.bench)

    report = {
        "metadata": {
            "base_ref": args.base_ref,
            "target_ref": args.target_ref,
            "bench": args.bench,
            "machine": args.machine,
        },
        "base": {
            "ref": args.base_ref,
            "commit": base_hash,
            "results": load_results(base_results_dir, base_hash),
        },
        "target": {
            "ref": args.target_ref,
            "commit": target_hash,
            "results": load_results(target_results_dir, target_hash),
        },
    }

    output_path.write_text(json.dumps(report, indent=2, sort_keys=True))
    print(f"Wrote {output_path}")
```
The CLI path in main() (arg parsing + running both refs + writing the report) is currently untested; the tests cover run_asv() command construction and result parsing only. Adding an integration-style unit test that monkeypatches subprocess.run and creates fake ASV result files would help prevent regressions (and would catch issues like resolving commit hashes in the wrong repo).
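A minimal sketch of the scaffolding such a test could use. The `load_results` below is a simplified stand-in mirroring the PR's logic (in a real test you would import `load_results` and `main` from the runner module, and use pytest's `tmp_path`/`monkeypatch` fixtures instead of the manual setup shown here); the result filename pattern `<hash-prefix>-<env>.json` is an assumption about ASV's naming.

```python
import json
import subprocess
import tempfile
from pathlib import Path
from typing import Dict
from unittest import mock


def load_results(results_dir: Path, commit_hash: str) -> Dict[str, float]:
    """Stand-in for the PR's load_results: merge numeric results by name."""
    merged: Dict[str, float] = {}
    for result_file in sorted(results_dir.glob(f"**/{commit_hash[:8]}-*.json")):
        data = json.loads(result_file.read_text())
        for name, payload in data.get("results", {}).items():
            value = payload.get("result") if isinstance(payload, dict) else payload
            if isinstance(value, (int, float)):
                merged[name] = float(value)
    return merged


with tempfile.TemporaryDirectory() as tmp:
    results_dir = Path(tmp) / "base_results"
    results_dir.mkdir(parents=True)
    commit = "deadbeefcafebabe"

    # Fake ASV result file, named <hash-prefix>-<env>.json (assumed pattern).
    (results_dir / f"{commit[:8]}-py3.11.json").write_text(
        json.dumps({"results": {"bench.TimeSuite.time_foo": 0.0123}})
    )

    # Patch subprocess.run so `git rev-parse` "resolves" without touching git.
    fake = subprocess.CompletedProcess(
        ["git", "rev-parse", "HEAD"], 0, stdout=commit + "\n", stderr=""
    )
    with mock.patch.object(subprocess, "run", return_value=fake):
        resolved = subprocess.run(
            ["git", "rev-parse", "HEAD"], check=True, capture_output=True, text=True
        ).stdout.strip()

    assert resolved == commit
    assert load_results(results_dir, commit) == {"bench.TimeSuite.time_foo": 0.0123}
    print("fake plumbing works")
```

With this plumbing in place, a test that patches `subprocess.run`, pre-creates the fake result files, and then calls the module's `main()` would exercise the full CLI path end to end.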
References to other Issues or PRs
No.
Brief description of what is fixed or changed
Adds a minimal ASV-based benchmark runner that compares performance between two refs and generates a structured JSON report. Unit tests are included.
AI Generation Disclosure
I used online resources and AI tools to assist in understanding ASV usage patterns, structuring parts of the implementation, and improving code readability. All of the code was reviewed, adapted, and tested by me to ensure accuracy and suitability for the project's requirements.
I am familiar with the implementation, and I can explain or modify any part of the code.