
Conversation

@sarahyurick
Contributor

No description provided.

Signed-off-by: Sarah Yurick <[email protected]>

greptile-apps bot commented Jan 15, 2026

Greptile Summary

This PR consolidates duplicate utility functions across benchmark scripts into a shared benchmarking/scripts/utils.py module. The refactoring removes code duplication by extracting three common patterns: executor setup, dataset file loading, and benchmark results writing. Additionally, all benchmark scripts now use a standardized metric name time_taken_s instead of the inconsistent time_taken.

Key changes:

  • Created new benchmarking/scripts/utils.py with shared functions: setup_executor(), load_dataset_files(), and write_benchmark_results()
  • Removed write_benchmark_results() from benchmarking/runner/utils.py (now consolidated in scripts/utils.py)
  • Updated all 7 benchmark scripts to import from local utils module instead of duplicating functions
  • Fixed redundant executor initialization in domain_classification_benchmark.py and embedding_generation_benchmark.py (line 66 in both files was creating a second executor that overrode the first)
  • Standardized metric naming: time_taken → time_taken_s across all benchmarks
  • Enhanced write_benchmark_results() to accept Path | str for flexibility
  • Removed complex sys.path manipulation in audio_fleurs_benchmark.py

Confidence Score: 5/5

  • This PR is safe to merge with no risk
  • The refactoring is straightforward and well-executed: it consolidates duplicate code into a shared module without changing functionality. The changes improve maintainability by eliminating code duplication across 7 files and fix actual bugs (redundant executor initialization). All function signatures are preserved correctly, and the new Path | str parameter type makes the API more flexible.
  • No files require special attention

Important Files Changed

Filename Overview
benchmarking/runner/utils.py Removed write_benchmark_results function and related imports (json, pickle) to consolidate in scripts/utils.py
benchmarking/scripts/utils.py New utility file consolidating common benchmark functions: setup_executor, load_dataset_files, and write_benchmark_results with Path | str support
benchmarking/scripts/domain_classification_benchmark.py Removed duplicate functions (write_results, load_dataset_files), consolidated executor setup, fixed redundant executor initialization, standardized metric name
benchmarking/scripts/embedding_generation_benchmark.py Removed duplicate utility functions, consolidated executor setup, removed redundant executor initialization, standardized metric naming

Sequence Diagram

sequenceDiagram
    participant Script as Benchmark Script
    participant Utils as scripts/utils.py
    participant Executor as Executor (Xenna/RayData/RayActors)
    participant Pipeline as Pipeline
    participant Writer as Result Writer

    Script->>Utils: setup_executor(executor_name)
    Utils->>Executor: Initialize executor instance
    Executor-->>Script: Return executor

    alt Scripts with dataset loading
        Script->>Utils: load_dataset_files(path, size_gb)
        Utils->>Utils: Calculate file subset by size
        Utils-->>Script: Return file list
    end

    Script->>Pipeline: Create pipeline with stages
    Script->>Pipeline: run(executor, initial_tasks)
    Pipeline->>Executor: Execute pipeline stages
    Executor-->>Pipeline: Return output tasks
    Pipeline-->>Script: Return results

    Script->>Script: Calculate metrics (time, documents, etc.)
    Script->>Script: Build results dict (params, metrics, tasks)
    
    Script->>Utils: write_benchmark_results(results, output_path)
    Utils->>Utils: Convert str to Path if needed
    Utils->>Writer: Create output directory
    Utils->>Writer: Write params.json
    Utils->>Writer: Write metrics.json
    Utils->>Writer: Write tasks.pkl
    Writer-->>Script: Results written

@greptile-apps greptile-apps bot left a comment

9 files reviewed, 1 comment


Signed-off-by: Sarah Yurick <[email protected]>
@greptile-apps greptile-apps bot left a comment

Additional Comments (2)

  1. benchmarking/scripts/common_crawl_benchmark.py, line 152 (link)

    syntax: Missing type=Path for this argument. write_benchmark_results expects a Path object but will receive a string, causing AttributeError: 'str' object has no attribute 'mkdir'

  2. benchmarking/scripts/dedup_removal_benchmark.py, line 158-160 (link)

    syntax: Missing type=Path for this argument. write_benchmark_results expects a Path object but will receive a string, causing AttributeError: 'str' object has no attribute 'mkdir'
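The fix both comments ask for is the standard argparse pattern of casting the CLI string to Path at parse time. A short sketch (the flag name here is hypothetical; the real argument names are in the linked files):

```python
import argparse
from pathlib import Path

parser = argparse.ArgumentParser()
# type=Path converts the raw CLI string to a Path object,
# so downstream calls like output_path.mkdir() work as expected.
parser.add_argument("--output-path", type=Path, required=True)

args = parser.parse_args(["--output-path", "results/run1"])
assert isinstance(args.output_path, Path)
```

Alternatively, since write_benchmark_results() was changed to accept Path | str, converting with Path(output_path) inside the function also guards against this class of bug regardless of how callers parse their arguments.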

9 files reviewed, 3 comments


Signed-off-by: Sarah Yurick <[email protected]>
@praateekmahajan praateekmahajan left a comment

this is awesome! thank you 🙏

