
Refactor to do more work in Python #382

Open · mdboom wants to merge 13 commits into main from refactor-to-python
Conversation

@mdboom (Contributor) commented Mar 27, 2025

This is the first step of #380.

@mdboom force-pushed the refactor-to-python branch from 50b4624 to 75a085b on March 28, 2025 18:16
@mdboom marked this pull request as ready for review on March 28, 2025 18:16
@mdboom (Contributor, Author) commented Mar 28, 2025

@diegorusso: This moves as much work as possible from the GHA yml files to Python, so it should be portable across automation systems. I realize it's a fairly large PR, but I would be interested in your feedback to confirm it is going to meet your needs, etc.

@mdboom requested a review from Copilot on March 28, 2025 18:20
@Copilot (Copilot AI) left a comment

Pull Request Overview

This PR is the first step in the refactor to shift more work into Python as part of #380. The changes update tests, centralize benchmark hash computation via benchmark_definitions, remove the old should_run script, and adjust workflow and template scripts to use the new commands.

  • Updated tests in tests/test_run_benchmarks.py to use the new workflow.should_run and hardcoded benchmark hash logic.
  • Removed the obsolete util.get_benchmark_hash in favor of benchmark_definitions.get_benchmark_hash and updated references across the codebase.
  • Revised workflow and template scripts (including workflow_bootstrap.py, _pystats.src.yml, and _benchmark.src.yml) to align with the new command structure.

Reviewed Changes

Copilot reviewed 14 out of 15 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| tests/test_run_benchmarks.py | Updated tests to call hardcode_benchmark_hash and use workflow.should_run |
| bench_runner/util.py | Removed legacy benchmark hash function |
| bench_runner/templates/workflow_bootstrap.py | Added new bootstrap script for setting up the workflow |
| bench_runner/templates/benchmark.src.yml, _pystats.src.yml, _benchmark.src.yml | Adjusted steps to call workflow_bootstrap.py instead of the old commands |
| bench_runner/scripts/workflow.py | Major refactoring of workflow logic with benchmark hash updates |
| bench_runner/scripts/should_run.py | Removed in favor of updated workflow script |
| bench_runner/scripts/run_benchmarks.py | Updated to reference benchmark_definitions for hash |
| bench_runner/scripts/install.py, get_merge_base.py | Minor changes for YAML generation and hash retrieval |
| bench_runner/git.py | Updated clone function with context management changes |
| bench_runner/benchmark_definitions.py | New centralized benchmark hash computation |
| bench_runner/main.py | Updated commands reflecting the refactored workflow |
Files not reviewed (1)
  • bench_runner/templates/env.yml: Language not supported
Comments suppressed due to low confidence (1)

bench_runner/benchmark_definitions.py:34

  • [nitpick] The variable name 'hash' shadows the built-in Python function hash(). Consider renaming it to something like 'sha' or 'hasher' to improve clarity.
    hash = hashlib.sha256()
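A hypothetical sketch of the suggested rename, for illustration only; the field name `repo.hash`, the inputs to the digest, and the 6-character truncation are guesses based on the short hashes (e.g. 'dcfded') mentioned later in this review:

```python
import hashlib

def get_benchmark_hash() -> str:
    hasher = hashlib.sha256()  # `hasher`, not `hash`, to avoid shadowing the builtin
    for repo in BENCHMARK_REPOS:  # assumed: fold in each pinned commit sha
        hasher.update(repo.hash.encode("ascii"))
    return hasher.hexdigest()[:6]
```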

```python
if not sys.platform.startswith("linux"):
    return

# `perf and "reset" or "tune"` evaluates to "reset" when profiling under perf,
# and to "tune" otherwise.
run_in_venv(venv, "pyperf", ["system", perf and "reset" or "tune"], sudo=True)
```


Does this mean that the CPU frequency scaling governors are set to "powersave" when profiling and "performance" when benchmarking?
Could this not lead to different characteristics between the profiled run and the scored run? Though I admit that the profiling will create different characteristics anyway. So I guess my question is just, what is the reason for this?

@mdboom (Contributor, Author) replied:

The reason for this is pretty recent. When @mpage added events to pyperf so that we can turn perf on and off tightly around benchmarking code, I discovered that the tuning made some of these events not fire all the time, probably due to a lower sampling rate. This resolved that and did not seem to have a significant effect on the overall perf results.

But you're right in general that the perf measurement makes the results slightly different anyway, which is unavoidable.

@GeorgeWort commented Apr 7, 2025

It would be nice to have a way to locally set the nickname for the host as used here:

```python
_clean(runners.get_nickname_for_hostname(socket.gethostname())),
```

Otherwise `generate_results` trips over here:

```python
return order.index(val.split()[0]), val
```

@GeorgeWort left a comment

Looks reasonable, in that it does everything we want it to do, thanks! Though I haven't gone through and understood everything this PR changes, just confirmed that it works for our workflow.

I'm still getting `error: use of undeclared identifier 'lastopcode'` with the current CPython main, so `python3.11 workflow_bootstrap.py python main linux-aarch64 all clang --pystats` fails.

If I fix that build failure for the tail-calling interpreter and I want the bytecode profiles, I can run something like `python3.11 workflow_bootstrap.py python main linux-aarch64 all clang --pystats` and `python3.11 workflow_bootstrap.py python main linux-aarch64 all clang --pystats`, but I then have to separately move the pystats results files from `results` to `profiling` for `profiling_plot` to pick them up. Is there currently a script somewhere that does this move that needs including in this PR, or am I missing something?

if Path(".debug").exists():
shutil.rmtree(".debug")

pystats_dir = Path("/tmp") / "pystats"


Should this be py_stats?

@mdboom (Contributor, Author) replied:

Good catch. Yes.
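(For context, CPython's `--enable-pystats` builds write their statistics to /tmp/py_stats by default, which is presumably why the directory name matters. A one-line sketch of the fix:)

```python
from pathlib import Path

pystats_dir = Path("/tmp") / "py_stats"  # match CPython's default pystats dump directory
```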

```python
# Now that we've installed the full bench_runner library,
# continue on in a new process...

last_arg = sys.argv.index("workflow_bootstrap.py")
```


This means that I have to copy workflow_bootstrap.py out of templates into the repository that I'm running bench_runner from, could this be changed to check the index of the argument that includes "workflow_bootstrap.py"?
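A minimal sketch of the suggested lookup, assuming a match on the argument's suffix rather than its exact value:

```python
import sys

# Hypothetical variant: find the bootstrap script in argv even when it is
# invoked via a path such as bench_runner/templates/workflow_bootstrap.py.
last_arg = next(
    i for i, arg in enumerate(sys.argv) if arg.endswith("workflow_bootstrap.py")
)
```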

@mdboom (Contributor, Author) replied:

The `python -m bench_runner install` script will copy the `workflow_bootstrap.py` script to the results repository for you (just as it already does for all of the generated GitHub workflow files).

The bootstrap script is required because we need to have a working venv to run the rest of the workflow.

@GeorgeWort replied Apr 8, 2025:

Ah yes thanks, that got lost in my fiddling.


args = []
if pystats:
args.append("--with-pystats")


I think this should be `--enable-pystats`.

@mdboom (Contributor, Author) replied:

Thanks, good catch.
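A sketch of the corrected branch (`--enable-pystats` is CPython's actual configure flag):

```python
args = []
if pystats:
    args.append("--enable-pystats")  # CPython configure flag; --with-pystats does not exist
```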

@mdboom (Contributor, Author) commented Apr 8, 2025

> It would be nice to have a way to locally set the nickname for the host as used here:
>
> ```python
> _clean(runners.get_nickname_for_hostname(socket.gethostname())),
> ```
>
> Otherwise `generate_results` trips over here:
>
> ```python
> return order.index(val.split()[0]), val
> ```

Sorry, I'm not following. Can you be more specific? The code at the second link should just sort unknown nicknames to the beginning.

@mdboom force-pushed the refactor-to-python branch from 02c4dc1 to 6e8270f on April 8, 2025 13:04
@mdboom requested review from Copilot and GeorgeWort on April 8, 2025 13:04
@Copilot (Copilot AI) left a comment

Copilot reviewed 15 out of 16 changed files in this pull request and generated no comments.

Files not reviewed (1)
  • bench_runner/templates/env.yml: Language not supported
Comments suppressed due to low confidence (2)

bench_runner/git.py:157

  • The code uses 'contextlib.chdir', which is not a standard library function. Please verify that a valid context manager for changing the working directory is defined or imported.
with contextlib.chdir(dirname):

bench_runner/benchmark_definitions.py:33

  • The updated benchmark hash logic now produces a different hash (e.g., 'dcfded') compared to earlier expectations (like '215d35'). Confirm that this change is intentional and that all tests and dependent logic have been updated accordingly.
def get_benchmark_hash() -> str:

@GeorgeWort replied:

> Sorry, I'm not following. Can you be more specific? The code at the second link should just sort unknown nicknames to the beginning.

I don't understand why it should do that. If the string returned by `val.split()[0]` is not in the `order` list, then `order.index` will throw an exception, as it should.

@GeorgeWort left a comment

Looks good to me; the only nice-to-haves left are moving the pystats files and the "unknown" nickname issue.

@mdboom (Contributor, Author) commented Apr 8, 2025

> Sorry, I'm not following. Can you be more specific? The code at the second link should just sort unknown nicknames to the beginning.
>
> I don't understand why it should do that. If the string returned by `val.split()[0]` is not in the `order` list, then `order.index` will throw an exception, as it should.

You're right -- I'm confusing it with find. I think the thing to do here is provide a default.
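A sketch of what that default could look like; judging by a later Copilot pass, the PR settled on a try/except that falls back to -1, sorting unknown nicknames to the beginning (the `order` list is assumed to be in scope):

```python
def sort_key(val: str) -> tuple[int, str]:
    # Unknown runner nicknames fall back to -1 instead of raising ValueError,
    # so they sort before all known runners.
    try:
        idx = order.index(val.split()[0])
    except ValueError:
        idx = -1
    return idx, val
```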

```python
BENCHMARK_REPOS = [
    BenchmarkRepo(
        "56d12a8fd7cc1432835965d374929bfa7f6f7a07",
        "https://github.com/mdboom/pyperformance.git",
```
A collaborator commented:

Why are you pointing to your own fork?

@mdboom (Contributor, Author) replied:

Oh, this is just an artifact of needing some modifications to pyperformance early on. There's no need for this anymore; I'll change it.

```python
# the virtual environment to run the full bench_runner.


# NOTE: This file should import in Python 3.9 or later so it can at least print
```
A collaborator commented:

Later we require Python 3.11, though.

@mdboom (Contributor, Author) replied:

Good catch. I'll update this comment.

@mdboom (Contributor, Author) added:

Oh, actually, this is correct. bench_runner requires 3.11 and above and is only tested on that. However, the oldest LTS Linux we use installs Python 3.9 by default. It's nice to be able to run this script and get a clear error message that Python 3.11 needs to be installed.
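A hypothetical sketch of the kind of guard described, assuming it sits near the top of workflow_bootstrap.py:

```python
import sys

# The file stays importable on Python 3.9 so that the failure mode is a clear
# message rather than a SyntaxError from 3.11-only syntax elsewhere.
if sys.version_info < (3, 11):
    print("bench_runner requires Python 3.11 or later", file=sys.stderr)
    sys.exit(1)
```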

@mdboom requested review from diegorusso and Copilot on April 9, 2025 12:11
@Copilot (Copilot AI) left a comment

Copilot reviewed 16 out of 17 changed files in this pull request and generated no comments.

Files not reviewed (1)
  • bench_runner/templates/env.yml: Language not supported
Comments suppressed due to low confidence (4)

bench_runner/benchmark_definitions.py:34

  • [nitpick] The variable name 'hash' shadows the built-in hash() function; consider renaming it to 'hasher' or a similar alternative.
hash = hashlib.sha256()

bench_runner/scripts/generate_results.py:130

  • The try/except fallback for unknown runner names defaults the index to -1; please document or reconsider if this behavior is intended for proper runner ordering.
            idx = order.index(val.split()[0])

bench_runner/scripts/install.py:248

  • The condition now skips files with a '.py' extension; verify that legitimate template files are not unintentionally omitted from processing.
if not (ROOT_PATH / path.name).is_file() or path.suffix == ".py":

bench_runner/git.py:157

  • Using 'contextlib.chdir' is nonstandard since the standard library does not provide this context manager; ensure that it is defined elsewhere or replace it with a custom context manager for changing directories.
with contextlib.chdir(dirname):
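(For what it's worth, `contextlib.chdir` was added to the standard library in Python 3.11, which matches bench_runner's minimum supported version, so this comment appears to be a false positive. A minimal usage sketch:)

```python
import contextlib

# Available since Python 3.11; restores the previous working directory on exit.
with contextlib.chdir("/tmp"):
    ...  # code here runs with /tmp as the current working directory
```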
