
ci: bound seeded Go cache size and speed up disk cleanup#38048

Merged: bircni merged 6 commits into go-gitea:main from silverwind:ci-bound-go-cache-size on Jun 11, 2026

@silverwind silverwind commented Jun 9, 2026

Member

Reduces the CI cache growth and disk pressure behind the flaky No space left on device failures in #37974.

go-cache: the cache-seeder saved its caches with a restore-keys prefix fallback, so every go.sum change restored the previous cache and re-saved the union. Old module versions and stale build objects accumulated (~3 GB → ~7 GB) and overflowed the disk on smaller runners. Drop restore-keys from the seeder save branches so each go.sum seeds a clean, size-bounded cache. PR runs keep restore-keys for warm-start fallback.
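
The union growth described above can be illustrated with a toy model. This is only a sketch, not the action's code: the module names, the `REVISIONS` list, and the `simulate_seeder` function are hypothetical, and a temporary directory stands in for the real cache. It counts how many entries get saved per go.sum revision when the previous cache is restored first ("fallback") versus when each revision starts empty ("clean").

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical module versions required by three successive go.sum revisions.
REVISIONS=(
  "mod-a@v1 mod-b@v1"
  "mod-a@v2 mod-b@v1"
  "mod-a@v3 mod-b@v2"
)

# simulate_seeder MODE: print the number of cached entries saved per revision.
#   fallback: keep the previous cache (prefix restore), then save the union
#   clean:    start from an empty cache for every revision
simulate_seeder() {
  local mode=$1 cache rev mod
  cache=$(mktemp -d)
  for rev in "${REVISIONS[@]}"; do
    if [ "$mode" = clean ]; then
      rm -rf -- "$cache"
      mkdir -p -- "$cache"
    fi
    for mod in $rev; do
      : > "$cache/$mod"   # one empty file per cached module version
    done
    find "$cache" -type f | wc -l | tr -d ' '
  done
  rm -rf -- "$cache"
}

echo "fallback: $(simulate_seeder fallback | xargs)"
echo "clean:    $(simulate_seeder clean | xargs)"
```

In this model the fallback seeder saves 2, 3, then 5 entries, while the clean seeder stays at 2 for every revision, which is the bounded behavior the change is after.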

free-disk-space: delete the unused preinstalled toolchains in parallel (~86 s → ~54 s) and log df -h / before and after.
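
A minimal sketch of the parallel-delete pattern, assuming plain bash background jobs. The `delete_parallel` helper and the `toolchain-*` directories are placeholders for illustration; the real toolchain paths and the exact commands in the action are not shown in this conversation.

```shell
#!/usr/bin/env bash
set -euo pipefail

# delete_parallel DIR...: remove every directory concurrently, wait for all
# removals, and return non-zero if any of them failed.
delete_parallel() {
  [ "$#" -gt 0 ] || return 0
  local pids=() dir pid rc=0
  for dir in "$@"; do
    rm -rf -- "$dir" &
    pids+=("$!")
  done
  for pid in "${pids[@]}"; do
    wait "$pid" || rc=1
  done
  return "$rc"
}

# Demo on throwaway directories (placeholders, not the real toolchain paths).
demo_root=$(mktemp -d)
mkdir -p "$demo_root/toolchain-a" "$demo_root/toolchain-b" "$demo_root/toolchain-c"

echo "before:"; df -h /
delete_parallel "$demo_root"/toolchain-*
echo "after:"; df -h /
rmdir "$demo_root"
```

Waiting on each PID individually, rather than a bare `wait`, lets the step report a failed removal instead of silently succeeding.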

Measured during review: the hosted ubuntu-latest fleet is heterogeneous — most runners have ~89 GB free on / (a full pgsql integration shard peaks at ~17 GB used), but a minority arrive nearly full and fail mid cache-restore. The toolchain deletion is the headroom that keeps those runners green, so it stays; the cache bound shrinks the footprint for every runner.

Authored with assistance from Claude (Opus 4.8).

The cache-seeder saved its caches with a restore-keys prefix fallback, so
every go.sum change restored the previous cache and re-saved the union. Old
module versions and stale build objects accumulated, growing the cache from
~3GB to ~7GB and exhausting runner disk (No space left on device). Drop
restore-keys from the seeder save branches so each go.sum seeds a clean,
bounded cache; PR runs keep restore-keys for warm-start fallback.

Also delete the unused preinstalled toolchains in parallel and log free
space before and after, to shorten the cleanup (~86 s to ~54 s) and make
headroom visible.

Refs: go-gitea#37974

Assisted-by: Claude:Opus-4.8
@GiteaBot GiteaBot added the lgtm/need 2 This PR needs two approvals by maintainers to be considered for merging. label Jun 9, 2026
@silverwind

Member Author

The current df output shows 89G free before cleanup, so the GHA runner disk may have grown recently. I'm now investigating whether we can remove the disk cleanup entirely.

Temporary experiment: df shows ~89G free on / before any cleanup, so
disable the toolchain deletion and run the pgsql shards on this
action-only change to confirm db-tests still have ample disk headroom.
Adds an end-of-job df to capture peak usage. Revert once measured.

Refs: go-gitea#37974

Assisted-by: Claude:Opus-4.8

Disabling the deletion reproduced the go-gitea#37974 "No space left on device"
failure on a disk-starved runner mid cache-restore, while sibling jobs on
the common ~89G-free runners passed: the hosted fleet is heterogeneous and
the deletion is the headroom that keeps the small-disk minority green. Keep
the parallelized deletion and df logging; revert the db-test gate and
end-of-job df scaffolding used for the experiment.

Refs: go-gitea#37974

Assisted-by: Claude:Opus-4.8
@silverwind silverwind marked this pull request as ready for review June 9, 2026 12:38
@silverwind

silverwind commented Jun 9, 2026

Member Author

Verdict: heterogeneous runner fleet. Most runners have 89G free, but some seem to have less than 17GB free, leading to disk space failures. Disk space cleanup is kept for those cases, and the df call stays so we can debug future disk space issues.
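
The headroom reasoning can be written down as a small check. This is a sketch only: the PR itself just logs `df -h /`, and the `avail_gb` and `has_headroom` helpers are hypothetical. It compares the free space on / against the ~17G peak measured for a full pgsql shard, treating df's 1024-based units as close enough to the figures quoted here.

```shell
#!/usr/bin/env bash
set -euo pipefail

# avail_gb PATH: free space on PATH's filesystem in whole GiB,
# read from the "Available" column of POSIX-format df output.
avail_gb() {
  df -Pk "$1" | awk 'NR == 2 { print int($4 / 1024 / 1024) }'
}

# has_headroom AVAIL_GB NEEDED_GB: succeed when free space exceeds the need.
has_headroom() {
  [ "$1" -gt "$2" ]
}

PEAK_GB=17  # peak usage of a full pgsql integration shard, per this PR
free=$(avail_gb /)
if has_headroom "$free" "$PEAK_GB"; then
  echo "ok: ${free}G free on /, above the ~${PEAK_GB}G peak"
else
  echo "warning: only ${free}G free on /, below the ~${PEAK_GB}G peak"
fi
```

A runner with 89G free passes this check with room to spare, while one that arrives with less than 17G free does not, which matches the failures seen when the cleanup was disabled.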

Copilot AI left a comment

Contributor

Pull request overview

This PR aims to reduce GitHub Actions “No space left on device” CI flakes by preventing unbounded growth of the seeded Go caches and by speeding up/logging disk cleanup on runners before large cache restores.

Changes:

  • Remove restore-keys from the cache-seeder save branches so each go.sum seeds a clean, bounded cache (PR runs still use restore-keys for warm-start fallback).
  • Parallelize deletion of unused preinstalled toolchains and log df -h / before/after cleanup.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| .github/actions/go-cache/action.yml | Stops cache-seeder from using restore-keys when saving caches to avoid cache “union” growth across go.sum changes. |
| .github/actions/free-disk-space/action.yml | Speeds disk cleanup by deleting multiple toolchain directories in parallel and adds before/after free-space logging. |


Comment thread .github/actions/free-disk-space/action.yml
@github-actions github-actions Bot added the skip-changelog This PR is irrelevant for the (next) changelog, for example bug fixes for unreleased features. label Jun 9, 2026
@GiteaBot GiteaBot added lgtm/need 1 This PR needs approval from one additional maintainer to be merged. and removed lgtm/need 2 This PR needs two approvals by maintainers to be considered for merging. labels Jun 9, 2026
Comment thread .github/actions/free-disk-space/action.yml Outdated
Signed-off-by: silverwind <me@silverwind.io>
@GiteaBot GiteaBot added lgtm/done This PR has enough approvals to get merged. There are no important open reservations anymore. and removed lgtm/need 1 This PR needs approval from one additional maintainer to be merged. labels Jun 11, 2026
@bircni bircni added the reviewed/wait-merge This pull request is part of the merge queue. It will be merged soon. label Jun 11, 2026
@bircni bircni enabled auto-merge (squash) June 11, 2026 10:38
@bircni bircni merged commit 360f34d into go-gitea:main Jun 11, 2026
25 checks passed
@GiteaBot GiteaBot added this to the 1.28.0 milestone Jun 11, 2026
@GiteaBot GiteaBot removed the reviewed/wait-merge This pull request is part of the merge queue. It will be merged soon. label Jun 11, 2026
@bircni bircni modified the milestones: 1.28.0, 1.27.0 Jun 11, 2026

5 participants