48 changes: 48 additions & 0 deletions .github/actions/cache-keys/action.yml
@@ -0,0 +1,48 @@
---

name: placeholder
description: placeholder
Member

Do these appear somewhere in the output? I've never used the composite actions feature, so I'm not super familiar with it. I'd think these need updating, but maybe it doesn't matter?

Member Author

No, these are used when publishing standalone actions (in the repo root) to Marketplace. It's not limited to composite actions.

I think the missing fields were making either jsonschema or actionlint unhappy so I added dummy strings. They aren't actually needed for the action to function.
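Purely as an illustration (not something this PR ships), the placeholders could later be swapped for something descriptive, e.g.:

    name: cache-keys
    description: Compute cache key fragments from the dependency declaration files.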


outputs:
  cache-key-for-dep-files:
    description: >-
      A cache key string derived from the dependency declaration files.
    value: ${{ steps.calc-cache-key-files.outputs.files-hash-key }}

runs:
  using: composite
  steps:
  - name: >-
      Calculate dependency files' combined hash value
      for use in the cache key
    id: calc-cache-key-files
    run: |
      from os import environ
      from pathlib import Path

      FILE_APPEND_MODE = 'a'

      files_derived_hash = '${{
          hashFiles(
            'tox.ini',
            'pyproject.toml',
            '.pre-commit-config.yaml',
            'pytest.ini',
            'dependencies/**',
            'dependencies/*/**',
            'setup.cfg'
          )
      }}'

      print(f'Computed file-derived hash is {files_derived_hash}.')

      with Path(environ['GITHUB_OUTPUT']).open(
          mode=FILE_APPEND_MODE,
      ) as outputs_file:
          print(
              f'files-hash-key={files_derived_hash}',
              file=outputs_file,
          )
    shell: python
Member

So far, this embedded script and the other ones like it in ci.yml are what I find most surprising in this PR.
If I'm reading correctly, the near-equivalent bash would be this?

      hash='${{
          hashFiles(
            'tox.ini',
            'pyproject.toml',
            '.pre-commit-config.yaml',
            'pytest.ini',
            'dependencies/**',
            'dependencies/*/**',
            'setup.cfg'
          )
      }}'
      echo "computed hash is $hash"
      echo "files-hash-key=$hash" >> "${GITHUB_OUTPUT}"

I have no objection to using Python over bash as the shell for a step, but I was reading it with the expectation that I'd see some Python-specific feature here. It is more verbose in this case and I'm not sure I understand why you made this choice -- am I missing something important that I should note?

If the explanation is merely a matter of preference, then let's mark this resolved, that works for me. But I didn't want to overlook some essential detail here.

Member Author

You're right. There's nothing special about this. I think that originally, even before I distilled some of this into https://github.com/marketplace/actions/cache-python-deps, I was trying to unify bits of caching, separate components, and reduce job sizes (it seemed necessary at the time). So I came up with a part that computes the file-based portion of the cache key and the Python runtime-derived one. Eventually, it turned out that this only needs to be called once, but it was already separated and I was synchronizing changes across projects, so I left it as is.

The Python syntax side of this likely came from me copying some scaffolding from other steps where I was already using Python. By the way, that second cache key portion does access version data through Python imports. I don't remember but maybe they were combined at some point prior to refactoring.

Anyway, I was thinking that it'd be good to replace GHA's hashFiles() w/ pure Python (and filter out the non-existent file path wildcards dynamically), turning this into an external action later. Maybe we'd just have a list of the most commonly used file patterns, and it could even be called w/o inputs.
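As a rough sketch of that idea (hypothetical, not part of this PR; the pattern list and the files-hash-key output name are just illustrative), the step body could stay shell: python and do something like:

    from glob import glob
    from hashlib import sha256
    from os import environ
    from pathlib import Path

    # Hypothetical built-in pattern list; patterns that match nothing are skipped naturally.
    PATTERNS = ('tox.ini', 'pyproject.toml', 'setup.cfg', 'dependencies/**')

    digest = sha256()
    for pattern in PATTERNS:
        for matched_path in sorted(glob(pattern, recursive=True)):
            if Path(matched_path).is_file():
                digest.update(Path(matched_path).read_bytes())

    with Path(environ['GITHUB_OUTPUT']).open(mode='a') as outputs_file:
        print(f'files-hash-key={digest.hexdigest()}', file=outputs_file)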


...
@@ -0,0 +1,82 @@
---

inputs:
  calling-job-context:
    description: A JSON with the calling job inputs
    type: string
  current-job-steps:
    description: >-
      The `$ {{ steps }}` context passed from the reusable workflow's
      tox job encoded as a JSON string. The caller passes it this input
      as follows:
      `current-job-steps: $ {{ toJSON(steps) }}`.
    type: string
  job-dependencies-context:
    default: >-
      {}
    description: >-
      The `$ {{ needs }}` context passed from the calling workflow
      encoded as a JSON string. The caller is expected to form this
      input as follows:
      `job-dependencies-context: $ {{ toJSON(needs) }}`.
    required: false
    type: string

runs:
  using: composite
  steps:
  - name: Verify that the artifacts with expected names got created
    if: fromJSON(inputs.calling-job-context).toxenv == 'build-dists'
    run: >
      # Verify that the artifacts with expected names got created


      ls -1
      'dist/${{
        fromJSON(
          inputs.job-dependencies-context
        ).pre-setup.outputs.sdist-artifact-name
      }}'
      'dist/${{
        fromJSON(
          inputs.job-dependencies-context
        ).pre-setup.outputs.wheel-artifact-name
      }}'
    shell: bash
  - name: Store the distribution packages
    if: fromJSON(inputs.calling-job-context).toxenv == 'build-dists'
    uses: actions/upload-artifact@v4
    with:
      name: >-
        ${{
          fromJSON(
            inputs.job-dependencies-context
          ).pre-setup.outputs.dists-artifact-name
        }}
      # NOTE: Exact expected file names are specified here
      # NOTE: as a safety measure — if anything weird ends
      # NOTE: up being in this dir or not all dists will be
      # NOTE: produced, this will fail the workflow.
      path: |
        dist/${{
          fromJSON(
            inputs.job-dependencies-context
          ).pre-setup.outputs.sdist-artifact-name
        }}
        dist/${{
          fromJSON(
            inputs.job-dependencies-context
          ).pre-setup.outputs.wheel-artifact-name
        }}
      retention-days: >-
        ${{
          fromJSON(
            fromJSON(
              inputs.job-dependencies-context
            ).pre-setup.outputs.release-requested
          )
          && 90
          || 30
        }}

...
Member

I haven't fully worked this out yet, but I think we should look for a way to make the hook mechanism easier to follow. The "double delegation" pattern (we delegate to reusable-tox, which delegates back to this action) results in some difficult-to-understand logic here in order to extract values.

i.e., we need fromJSON(inputs.job-dependencies-context).pre-setup.outputs.release-requested, which is a pretty long expression, in order to get an input which we had in hand in a workflow in this repo (when we did the pre-setup step).

I think I would prefer if it were the responsibility of the ci.yml workflow to specify data to pass to the hooks, and then thread that down into this job. I haven't explored if this is possible, but something like...

# in ci.yml
 build:
    name: >-
      📦 Build dists
    needs:
    - pre-setup
    uses: tox-dev/workflow/.github/workflows/reusable-tox.yml@...
    with:
      # is it possible to make this something we pass in? (I'm guessing not, but it's an idea)
      post-tox-action: .github/workflows/reusables/reusable-tox/post-tox/action.yml
      # this is the important part: passing a blob of action inputs
      post-tox-action-call-args:
        release-requested: ${{ jobs.pre-setup.outputs.release-requested }}
        ...

# in action.yml , using that data
      retention-days: >-
        ${{
            fromJSON(
              fromJSON(inputs.call-args).release-requested
            )
            && 90
            || 30
        }}

It's not more powerful than having full access to the needs job context data, but I think the explicit arg-passing style is much easier to follow. I'm not sure how much of this is possible within the constraints of GitHub Actions -- it's been a while since I tinkered with custom actions, but I'm aware that there are some limitations on the system which could make this difficult.


All of the above is more of a comment on reusable-tox than it is about the integration here. I've followed enough to understand that we're getting context data from the original ci.yml passed down into here and we can move forward with this. But it was difficult to follow and I think it may be hard for future contributors to read.

Member Author

Yeah, we can work on this in the reusable workflow. Some things (like passing an arbitrary hook path) are indeed impossible within the native constraints of GHA. They could probably be hacked through manual repo cloning but that's a can of worms.

I wasn't sure about passing the hook args, but maybe we can think of something later. Your example of a nested input might not work natively, and you'd have to turn it into JSON (or a YAML string, via |, perhaps).
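For illustration only, a serialized variant of that idea might look roughly like this (the post-tox-action-call-args input is hypothetical and borrowed from the example above):

    # in ci.yml, at the call site of the reusable workflow
    with:
      post-tox-action-call-args: >-
        ${{ toJSON(needs.pre-setup.outputs) }}

    # in action.yml, reading it back
    retention-days: >-
      ${{
        fromJSON(
          fromJSON(inputs.post-tox-action-call-args).release-requested
        )
        && 90
        || 30
      }}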

246 changes: 244 additions & 2 deletions .github/workflows/ci.yml
@@ -14,14 +14,43 @@ on:
      description: >-
        A JSON string with pip versions
        to test against under CPython.
      required: true
      required: false
      type: string
    cpython-versions:
      description: >-
        A JSON string with CPython versions
        to test against.
      required: true
      required: false
      type: string
    release-version:
      description: >-
        Target PEP440-compliant version to release.
        Please, don't prepend `v`.
      required: false
      type: string
    release-committish:
      default: ''
      description: >-
        The commit to be released to PyPI and tagged
        in Git as `release-version`. Normally, you
        should keep this empty.
      type: string

  outputs:
    dists-artifact-name:
      description: Workflow artifact name containing dists.
      value: ${{ jobs.pre-setup.outputs.dists-artifact-name }}
    is-upstream-repository:
      description: >-
        A flag representing whether the workflow runs in the upstream
        repository or a fork.
      value: ${{ jobs.pre-setup.outputs.is-upstream-repository }}
    project-name:
      description: PyPI project name.
      value: ${{ jobs.pre-setup.outputs.project-name }}
    project-version:
      description: PyPI project version string.
      value: ${{ jobs.pre-setup.outputs.dist-version }}

concurrency:
  group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.sha }}
@@ -44,8 +73,221 @@ env:
    PYTEST_THEME
    PYTEST_THEME_MODE
    PRE_COMMIT_COLOR
  UPSTREAM_REPOSITORY_ID: >-
    5746963

jobs:
  pre-setup:
    name: ⚙️ Pre-set global build settings

    runs-on: ubuntu-latest

    timeout-minutes: 2  # network is slow sometimes when fetching from Git

    defaults:
      run:
        shell: python

    outputs:
      # NOTE: These aren't env vars because the `${{ env }}` context is
      # NOTE: inaccessible when passing inputs to reusable workflows.
      dists-artifact-name: python-package-distributions
      dist-version: >-
        ${{
            steps.request-check.outputs.release-requested == 'true'
            && inputs.release-version
            || steps.scm-version.outputs.dist-version
        }}
      project-name: ${{ steps.metadata.outputs.project-name }}
      release-requested: >-
        ${{
            steps.request-check.outputs.release-requested || false
        }}
      cache-key-for-dep-files: >-
        ${{ steps.calc-cache-key-files.outputs.cache-key-for-dep-files }}
      sdist-artifact-name: ${{ steps.artifact-name.outputs.sdist }}
      wheel-artifact-name: ${{ steps.artifact-name.outputs.wheel }}
      is-upstream-repository: >-
        ${{ toJSON(env.UPSTREAM_REPOSITORY_ID == github.repository_id) }}

    steps:
    - name: Switch to using Python 3.14 by default
      uses: actions/setup-python@v6
      with:
        python-version: 3.14
    - name: >-
        Mark the build as untagged '${{
          github.event.repository.default_branch
        }}' branch build
      id: untagged-check
      if: >-
        github.event_name == 'push' &&
        github.ref_type == 'branch' &&
        github.ref_name == github.event.repository.default_branch
      run: |
        from os import environ
        from pathlib import Path

        FILE_APPEND_MODE = 'a'

        with Path(environ['GITHUB_OUTPUT']).open(
            mode=FILE_APPEND_MODE,
        ) as outputs_file:
            print('is-untagged-devel=true', file=outputs_file)
    - name: Mark the build as "release request"
      id: request-check
      if: inputs.release-version != ''
      run: |
        from os import environ
        from pathlib import Path

        FILE_APPEND_MODE = 'a'

        with Path(environ['GITHUB_OUTPUT']).open(
            mode=FILE_APPEND_MODE,
        ) as outputs_file:
            print('release-requested=true', file=outputs_file)
Member

The two scripts above seem significantly longer than their bash equivalents. More than in the case of the hashing action, I wonder if these should be bash shell steps, and then the tomllib and setuptools_scm scripts below can set shell: python.
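For comparison, a sketch of what the bash form of the "release request" step might look like (same GITHUB_OUTPUT contract; just an illustration, not something in this PR):

    - name: Mark the build as "release request"
      id: request-check
      if: inputs.release-version != ''
      run: echo 'release-requested=true' >> "${GITHUB_OUTPUT}"
      shell: bash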

Member Author

Yes, bash is often shorter.

This job is about computing a lot of things so I wanted to make use of Python more w/o being caught in potential Bash traps. Plus it's in sync with other repos, which would mean having to do this in other places at the same time.

It may be a good idea to look into making it more reusable too at some point.

    - name: Check out src from Git
      uses: actions/checkout@v4
      with:
        fetch-depth: >-
          ${{
              steps.request-check.outputs.release-requested == 'true'
              && 1 || 0
          }}
        ref: ${{ inputs.release-committish }}
    - name: Scan static PEP 621 core packaging metadata
      id: metadata
      run: |
        from os import environ
        from pathlib import Path
        from tomllib import loads as parse_toml_from_string

        FILE_APPEND_MODE = 'a'

        pyproject_toml_txt = Path('pyproject.toml').read_text()
        metadata = parse_toml_from_string(pyproject_toml_txt)['project']
        project_name = metadata["name"]

        with Path(environ['GITHUB_OUTPUT']).open(
            mode=FILE_APPEND_MODE,
        ) as outputs_file:
            print(f'project-name={project_name}', file=outputs_file)
    - name: >-
        Calculate dependency files' combined hash value
        for use in the cache key
      if: >-
        steps.request-check.outputs.release-requested != 'true'
      id: calc-cache-key-files
      uses: ./.github/actions/cache-keys
    - name: Set up pip cache
      if: >-
        steps.request-check.outputs.release-requested != 'true'
      uses: re-actors/cache-python-deps@release/v1
      with:
        cache-key-for-dependency-files: >-
          ${{ steps.calc-cache-key-files.outputs.cache-key-for-dep-files }}
    - name: Drop Git tags from HEAD for non-release requests
      if: >-
        steps.request-check.outputs.release-requested != 'true'
      run: >-
        git tag --points-at HEAD
        |
        xargs git tag --delete
      shell: bash -eEuxo pipefail {0}
    - name: Set up versioning prerequisites
      if: >-
        steps.request-check.outputs.release-requested != 'true'
      run: >-
        python -m
        pip install
        --user
        setuptools-scm
      shell: bash -eEuxo pipefail {0}
    - name: Set the current dist version from Git
      if: steps.request-check.outputs.release-requested != 'true'
      id: scm-version
      run: |
        from os import environ
        from pathlib import Path

        import setuptools_scm

        FILE_APPEND_MODE = 'a'

        ver = setuptools_scm.get_version(local_scheme='dirty-tag')
        with Path(environ['GITHUB_OUTPUT']).open(
            mode=FILE_APPEND_MODE,
        ) as outputs_file:
            print(f'dist-version={ver}', file=outputs_file)
            print(
                f'dist-version-for-filenames={ver.replace("+", "-")}',
                file=outputs_file,
            )
Member

Seeing this snippet makes me think of a tool I've been working on which might interest you:
https://mddj.readthedocs.io/en/latest/

The advantage over calling into setuptools_scm directly is that it's generic, so if you wanted a version of this script which you can port across projects, it might look like...

# /// script
# dependencies = ["mddj==0.4.2"]
# ///
from mddj.api import DJ
dj = DJ()
ver = dj.read.version()
... # rest of script follows

And that will read project.version if it's static, or call into build if it's dynamic.

The project still needs a lot of work; it's just an alpha right now. But I find I frequently want to be able to script around project metadata, so it's an interesting space to try to make a good utility.

Member Author
@webknjaz Dec 7, 2025

Yeah, this looks fun. The thing is that this snippet is supposed to predict what version is to be expected from Git so that it can be used for sanity checks in other jobs that actually grab the metadata. Perhaps it's fine to move to reading the metadata, but up until recently, setuptools-scm didn't have an easy way to let us influence its settings from outside the config. In some projects, I was modifying pyproject.toml on disk in multiple jobs and would --assume-unchanged it in Git. There are 3 version modes this (in combination with other steps) was supposed to achieve:

  • release (an exact matching tag on the same commit; sometimes emulated — in case of workflow_dispatch the tag is created locally but then actually pushed to GitHub way later)
  • merge to the main branch (a version that has the .dev counter but no local portion)
  • PR run (both .dev counter and the local version portion)

The first two are publishable to Test(PyPI). But since I wasn't copying all the bits from other repos just yet, this may not be as obvious.
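As a rough PEP 440 illustration of those three shapes (the version numbers and local segment here are made up):

    1.2.3                # release: the tag matches the released commit exactly
    1.2.4.dev5           # merge to the main branch: .dev counter, no local part
    1.2.4.dev5+g1abc234  # PR run: .dev counter plus a local version part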

P.S. It's always scary to modify these steps because I often mess up there with the context spread across multiple lines. So I'm usually trying to avoid that until I can turn it into something reusable.

    - name: Set the expected dist artifact names
      id: artifact-name
      env:
        PROJECT_NAME: ${{ steps.metadata.outputs.project-name }}
      run: |
        from os import environ
        from pathlib import Path

        FILE_APPEND_MODE = 'a'

        whl_file_prj_base_name = environ['PROJECT_NAME'].replace('-', '_')
        sdist_file_prj_base_name = (
            whl_file_prj_base_name.
            replace('.', '_').
            lower()
        )

        with Path(environ['GITHUB_OUTPUT']).open(
            mode=FILE_APPEND_MODE,
        ) as outputs_file:
            print(
                f"sdist={sdist_file_prj_base_name !s}-${{
                  steps.request-check.outputs.release-requested == 'true'
                  && inputs.release-version
                  || steps.scm-version.outputs.dist-version
                }}.tar.gz",
                file=outputs_file,
            )
            print(
                f"wheel={whl_file_prj_base_name !s}-${{
                  steps.request-check.outputs.release-requested == 'true'
                  && inputs.release-version
                  || steps.scm-version.outputs.dist-version
                }}-py3-none-any.whl",
                file=outputs_file,
            )

  build:
    name: >-
      📦 Build dists
    needs:
    - pre-setup  # transitive, for accessing settings

    uses: tox-dev/workflow/.github/workflows/reusable-tox.yml@617ca35caa695c572377861016677905e58a328c  # yamllint disable-line rule:line-length
    with:
      cache-key-for-dependency-files: >-
        ${{ needs.pre-setup.outputs.cache-key-for-dep-files }}
      check-name: Build dists under 🐍3.12
      checkout-src-git-committish: >-
        ${{ inputs.release-committish }}
      checkout-src-git-fetch-depth: >-
        ${{
          fromJSON(needs.pre-setup.outputs.release-requested)
          && 1
          || 0
        }}
      job-dependencies-context: >-  # context for hooks
        ${{ toJSON(needs) }}
      python-version: 3.12
      runner-vm-os: ubuntu-latest
      timeout-minutes: 2
Member

2 minutes seems a bit tight if there's any slowdown (e.g., network flakiness). Should we maybe loosen it a little, e.g. to 5 minutes?

Member Author

For this one, this shouldn't be necessary. It takes 30s, and having 4x more time is already more than enough; 1-2 extra mins for network flakiness has shown good results in the past. I'm usually playing with longer timeouts for tests and linters, though; for example, things like Sphinx's linkcheck.

      toxenv: build-dists
      xfail: false

  linters:
    name: Linters
    uses: ./.github/workflows/reusable-qa.yml