ci: add dependency caching and split lint from test matrix #190

Open
framsouza wants to merge 7 commits into mistralai:main from framsouza:ci/add-caching-split-lint-from-tests

@framsouza (Contributor) commented Feb 21, 2026

Dependency caching: setup-uv has built-in cache support keyed on uv.lock but it wasn't enabled in the test or docs jobs. Added enable-cache: true and cache-dependency-glob: "uv.lock" to all jobs. The docs workflow already had its own caching setup so this makes things consistent.

Lint runs once: ruff and mypy were running inside the 4-version test matrix, which means they were running 4 times and producing the same result every time. Moved them into a separate lint job on Python 3.12. The test job now has needs: lint so if linting fails nothing else starts.

Coverage artifact: pytest was generating coverage.xml on every run but the file was never saved anywhere. Added an artifact upload so it's accessible from the Actions tab per Python version.

Each PR commit currently triggers 4 redundant package downloads and runs linting 4 times for no gain. With caching and the lint split, subsequent runs skip the downloads entirely and linting consumes one job instead of four. This translates directly to faster feedback and fewer GitHub Actions compute minutes burned, which at scale has a real cost impact.

Testing

Tested locally with act pull_request. All jobs pass.

| ================================ tests coverage ================================
| _______________ coverage: platform linux, python 3.12.3-final-0 ________________
| 
| Coverage XML written to file coverage.xml
| ================ 506 passed, 16 skipped, 38 warnings in 57.61s =================

NOTE: test_from_url will show as failing on this PR in CI until #188 is merged into main first.

Linting (ruff, mypy) was running on all 4 Python versions even though
it produces identical results regardless of version. Split it into a
separate job that runs once on 3.12, and added needs: lint so the test
matrix doesn't spin up if linting fails.

Also enabled setup-uv's built-in caching in all jobs — the docs workflow
already had this but the test workflow didn't. And the coverage.xml that
pytest was generating was just getting thrown away, so added an artifact
upload to make it accessible after runs.
@juliendenize (Contributor)

> Added enable-cache: true and cache-dependency-glob: "uv.lock" to all jobs.

The default settings should be correct here and take care of this. It might be worth updating the action, but not the rest.

> ruff and mypy were running inside the 4-version test matrix, which means they were running 4 times and producing the same result every time.

I don't think that's true: some issues arise only on certain Python versions, so we need to ensure everything works as expected on all of them.

@framsouza (Contributor, Author)

Thanks @juliendenize, you've raised two valid points that I had overlooked.

You're right that setup-uv handles caching well by default. I've removed the explicit cache-dependency-glob since the default already covers uv.lock, pyproject.toml, and other dependency files, which is broader and better than what I had. I kept enable-cache: true to be explicit, though on GitHub-hosted runners auto would also work.

About linting: I moved mypy back to the test matrix so it runs across all four Python versions. The runtime Python does matter here, and different versions can surface different type errors, as you said.
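To illustrate why the interpreter version can change mypy's result, here is a minimal sketch. By default mypy targets the version of the interpreter that runs it, and it only type-checks the `sys.version_info` branch that applies to that target. `parse_toml` is a hypothetical helper invented for this example, not code from this repository.

```python
import sys

if sys.version_info >= (3, 11):
    # Only analysed when mypy targets Python 3.11 or newer.
    import tomllib  # part of the standard library from 3.11 onward

    def parse_toml(text: str) -> dict:
        """Parse a TOML document with the standard-library parser."""
        return tomllib.loads(text)
else:
    # Only analysed when mypy targets an older Python.
    def parse_toml(text: str) -> dict:
        """Fallback for interpreters that do not ship tomllib."""
        raise RuntimeError("tomllib requires Python 3.11+")
```

A type error inside one branch is reported only on the matrix entries whose Python version selects that branch, which is the kind of version-specific issue a single-version lint job would miss.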

Ruff stays in the single-version lint job because, AFAIU (please correct me if I'm wrong), it works differently: it determines its target Python version from requires-python in pyproject.toml (https://docs.astral.sh/ruff/configuration/#python-version), not from the runtime interpreter. It's a Rust binary that parses source files directly and never loads version-specific stubs or interacts with the Python runtime, so the output is identical regardless of which Python runs it.

Oh, and I also moved uv lock --check to the lint job. It just verifies that uv.lock is in sync with pyproject.toml and doesn't depend on the Python version, so running it 4 times in the matrix was redundant.

@abdelhadi703 (Contributor)

Great improvement! Separating lint from the test matrix is a solid optimization — running ruff 4 times across Python versions when the result is identical wastes CI minutes.

The coverage artifact upload is also valuable — it makes coverage data available for future integration with tools like Codecov.

I reviewed the changes and they look clean. +1 for merging.

@juliendenize (Contributor) left a comment

There are still adjustments to make, as we now support 3.14, and I don't think we should split linting from tests: we want the linter to work on all versions.

framsouza and others added 5 commits March 8, 2026 15:38
Co-authored-by: Julien Denize <40604584+juliendenize@users.noreply.github.com>
test_from_url made live HTTP requests to example.com and
download.samplelib.com. On macOS with uv-managed Python, SSL cert
verification fails before the request completes, causing the test to
hit the wrong error branch and fail.

Beyond the local failure, live network calls in unit tests are fragile
by design: they depend on external availability and can cause
non-deterministic CI failures.

Replace all three cases with mocked responses:
- failed request: mock requests.get raising RequestException
- invalid content: mock a 200 response with HTML body
- valid audio: mock a 200 response with a real wav file built in memory

No behavior changes to production code.
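
The three mocked cases above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code from this PR: `load_audio_from_url`, `FetchError`, and `make_wav_bytes` are hypothetical names, and to stay free of third-party imports the sketch injects the HTTP getter instead of patching `requests.get`, with the built-in `ConnectionError` standing in for `requests.RequestException`.

```python
import io
import wave
from unittest.mock import Mock


class FetchError(Exception):
    """Hypothetical error type standing in for the project's own."""


def load_audio_from_url(url, http_get):
    """Hypothetical loader: fetch url via http_get, return the WAV frame count."""
    try:
        response = http_get(url)
    except ConnectionError as exc:  # stand-in for requests.RequestException
        raise FetchError("request failed") from exc
    try:
        with wave.open(io.BytesIO(response.content), "rb") as wav:
            return wav.getnframes()
    except (wave.Error, EOFError) as exc:
        raise FetchError("response is not valid audio") from exc


def make_wav_bytes(n_frames=800, sample_rate=8000):
    """Build a real mono 16-bit WAV file of silence entirely in memory."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)
        wav.setframerate(sample_rate)
        wav.writeframes(b"\x00\x00" * n_frames)
    return buffer.getvalue()


def raises_fetch_error(http_get):
    """Return True if loading through http_get raises FetchError."""
    try:
        load_audio_from_url("https://example.invalid/audio.wav", http_get)
    except FetchError:
        return True
    return False


def test_failed_request():
    # Case 1: the request itself fails before any response exists.
    assert raises_fetch_error(Mock(side_effect=ConnectionError("boom")))


def test_invalid_content():
    # Case 2: a 200 response whose body is HTML rather than audio.
    html = Mock(status_code=200, content=b"<html><body>not audio</body></html>")
    assert raises_fetch_error(Mock(return_value=html))


def test_valid_audio():
    # Case 3: a 200 response carrying a WAV file built in memory.
    ok = Mock(status_code=200, content=make_wav_bytes(n_frames=800))
    assert load_audio_from_url("https://example.invalid/audio.wav", Mock(return_value=ok)) == 800
```

None of the three tests opens a socket, so they behave the same on macOS, Linux, and CI runners regardless of certificate setup or external availability.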