Improvements to Match Modern Python Package Practices #1366

@nicholaslivingstone

Currently, setting up Benchpark (according to the docs) involves:

  1. Cloning the repository
  2. Running the setup script (which adds a bash script to the path)
  3. Separately installing the dependencies via the provided requirements file.

This has proven tedious when trying to integrate Benchpark into an existing environment (e.g. venv, conda, uv, pixi, etc.). Specifically, it would be a lot smoother to be able to run pip install BENCHPARK_URL or add it to a dependency list. It seems there are changes that could be made to bring Benchpark in line with modern Python packaging and make it easier to use. I have come up with the following suggestions for improving the Benchpark package. If there's a motivation behind the design decisions I propose changing, I'd love to hear it to better understand how Benchpark works. For the time being, my group has started its own fork to implement some of these changes to match our workflow.

Suggested Improvements

1. Forgo the Bash script and add the main file as a pyproject script.

Currently the benchpark launch script just calls lib/main.py; besides the optional coverage, it doesn't do anything more. This could be added as an executable script in pyproject.toml with something akin to the following:

[project.scripts]
benchpark = "lib.main:main"

Of course, the coverage option would either need to be integrated into main.py or be provided by a second script.
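As a rough sketch of the second-script option, both commands could be declared side by side. The benchpark-coverage command and the main_with_coverage function are hypothetical names that do not exist in Benchpark today, and this assumes lib is importable as a package once Benchpark is installed.

```toml
[project.scripts]
# Regular entry point, written as "module.path:function" without call parentheses.
benchpark = "lib.main:main"
# Hypothetical wrapper that would start coverage collection and then call main().
benchpark-coverage = "lib.main:main_with_coverage"
```

Keeping coverage in a separate command would leave the normal benchpark entry point free of any development-only dependency.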

2. Requirements.txt Improvements

A) Add requirements.txt to pyproject.toml

The docs currently describe installing the dependencies in requirements.txt as a separate step. If they were integrated into pyproject.toml, they could be installed automatically; I can't imagine a reason why someone would want to install Benchpark without its dependencies. This can be done without modifying the requirements file itself:

[tool.setuptools.dynamic]
dependencies = {file = ["requirements.txt"]}
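For setuptools to pick the file up, dependencies also has to be declared as dynamic in the [project] table, and it cannot be listed statically there at the same time. A slightly fuller fragment, assuming setuptools is the build backend and that requirements.txt contains only plain requirement lines (no -r or -e options), might look like this:

```toml
[project]
# name, version, and the other metadata fields are omitted from this sketch
dynamic = ["dependencies"]

[tool.setuptools.dynamic]
# Each non-comment line of requirements.txt becomes an install requirement.
dependencies = {file = ["requirements.txt"]}
```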

B) Remove the Ramble requirements from requirements.txt and just list ramble as a dependency.

Half of requirements.txt is just the dependencies defined by ramble. Wouldn't it be better to list ramble as a dependency and let ramble manage its own dependencies? The current approach also requires Benchpark developers to keep this list up to date with the ramble package.

C) Split requirements.txt into a core requirements file and a dev file.

requirements.txt currently lists the following:

pyyaml
deepdiff
pytest
flake8
xmltodict
coverage
pytest-cov
pre-commit

If you're not developing Benchpark, you don't need all of these. It'd be nice to split the dependencies into a requirements.txt and a requirements-dev.txt, the latter for things like pytest, flake8, coverage, etc. that aren't actually needed to run Benchpark as a user.
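If the split were made, the dev file could be exposed as an optional extra rather than a separate manual step. The sketch below assumes setuptools as the build backend; the extra name dev is an arbitrary choice, and requirements-dev.txt is the proposed file, not one that exists yet.

```toml
[project]
# name, version, and the other metadata fields are omitted from this sketch
dynamic = ["dependencies", "optional-dependencies"]

[tool.setuptools.dynamic]
# Runtime requirements, installed for every user.
dependencies = {file = ["requirements.txt"]}
# Development-only requirements (pytest, flake8, coverage, ...), installed on request.
optional-dependencies = {dev = {file = ["requirements-dev.txt"]}}
```

With this in place, a plain pip install . would pull in only the core requirements, while pip install ".[dev]" would add the development tools on top.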

3. Tidy Up Project Folder Structure

This is more of a broad idea than a specific proposal, but most modern Python packages keep all Python source in either the src-layout or the flat-layout. When trying to make some of the above modifications in my own fork, I ran into some pathing issues that could be resolved by shifting to one of the aforementioned layouts. Specifically, Benchpark's source is currently scattered across many root folders. Most modern packages keep a single root folder with all the Python source and/or leverage the package discovery capabilities of setuptools. In addition, using setuptools to include things like config/ might be a good idea as well.
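To make the src-layout option a bit more concrete, here is a minimal sketch of what the setuptools configuration could look like. It assumes a hypothetical src/benchpark/ package directory with config/ moved inside it; neither exists in the repository today, and recursive ** globs need a reasonably recent setuptools.

```toml
[tool.setuptools.packages.find]
# Look for importable packages under src/ only (hypothetical layout).
where = ["src"]

[tool.setuptools.package-data]
# Ship the non-Python files under src/benchpark/config/ with the installed package.
benchpark = ["config/**/*"]
```

Under such a layout the console script would point at the package instead of lib, for example "benchpark.main:main", again assuming main.py moves into the package.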
