perf: remove sandbox image cache (~4.5 min overhead) #2796

Merged
rh-hemartin merged 2 commits into main from remove-sandbox-image-cache
Jul 2, 2026

@ralphbean

Member

Summary

  • Remove the sandbox image cache from action.yml — it adds ~4.5 min overhead per run without saving pull time
  • Drop skopeo install (only used for digest comparison in the cache logic)
  • Simplify pre-pull step to a straight podman pull (~30s)

Closes #2222

Test plan

  • Verify CI runs complete successfully without the cache steps
  • Confirm sandbox image is still pulled and available for agent runs
  • Observe ~4 min improvement in total action runtime

🤖 Generated with Claude Code

@ralphbean ralphbean requested a review from a team as a code owner June 30, 2026 19:24
The sandbox image cache never provides a net time savings. The cache key
is based on the image name (not its digest), so the cached tar is always
stale. Every run pays the full cycle: restore, load, pull, save, upload
(~4.5 min) when a bare pull takes ~30s.
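
To illustrate why a key built from the image name cannot track image updates, here is a minimal Python sketch contrasting the two keying strategies. The key format, the image reference, and the digest values are hypothetical placeholders and are not taken from action.yml.

```python
import hashlib

# Hypothetical image reference and two successive digests published under the same tag.
IMAGE = "ghcr.io/example/sandbox:latest"
OLD_DIGEST = "sha256:" + "a" * 64
NEW_DIGEST = "sha256:" + "b" * 64


def name_key(image_ref: str, digest: str) -> str:
    """Key derived from the image name only; the digest is deliberately ignored."""
    return "sandbox-image-" + hashlib.sha256(image_ref.encode()).hexdigest()[:16]


def digest_key(image_ref: str, digest: str) -> str:
    """Key that also covers the immutable content digest."""
    material = f"{image_ref}@{digest}".encode()
    return "sandbox-image-" + hashlib.sha256(material).hexdigest()[:16]


# Re-publishing the tag leaves the name-based key unchanged, so the cache
# entry under that key can still hold the old image contents.
same_name_key = name_key(IMAGE, OLD_DIGEST) == name_key(IMAGE, NEW_DIGEST)

# The digest-based key changes whenever the image contents change.
new_digest_key = digest_key(IMAGE, OLD_DIGEST) != digest_key(IMAGE, NEW_DIGEST)

print(same_name_key, new_digest_key)
```

Keying on the digest would fix staleness, but it would require resolving the remote digest before the restore step, which is the role skopeo played in the removed logic.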

Remove the cache/restore, load, save, and cache/save steps entirely.
Drop skopeo (only used for digest comparison in the cache logic).
Simplify the pre-pull step to a straight podman pull.

Closes #2222

Signed-off-by: Ralph Bean <rbean@redhat.com>
Assisted-by: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Ralph Bean <rbean@redhat.com>
@ralphbean ralphbean force-pushed the remove-sandbox-image-cache branch from bb696b1 to 112cee6 Compare June 30, 2026 19:24
@qodo-code-review

PR Summary by Qodo

Remove sandbox image tar cache and skopeo from GitHub Action pre-pull

🐞 Bug fix ⚙️ Configuration changes 🕐 10-20 Minutes

AI Description

• Remove sandbox image cache restore/save steps that add ~4.5 minutes per run.
• Drop skopeo install and digest-comparison logic used only by the cache.
• Simplify sandbox pre-pull to a single podman pull with a timeout and warning fallback.
Diagram

graph TD
  R["GitHub Runner"] --> A["action.yml"] --> I["Install podman"] --> P["Pre-pull image"] --> G[("GHCR image")]
  P --> AV["Image available"]
  A -. removes .-> C["actions/cache (removed)"]

  subgraph Legend
    direction LR
    _cfg["Config (action.yml)"] ~~~ _step["Step"] ~~~ _img[("Image")]
  end
High-Level Assessment

The following are alternative approaches to this PR:

1. Fix the cache by keying on image digest
  • ➕ Could make caching beneficial if the tar is reused across runs
  • ➕ Avoids stale cache uploads caused by name-only keys
  • ➖ Still pays cache upload/download + podman save/load overhead, often slower than pulling layers
  • ➖ Requires remote digest resolution (e.g., registry API/skopeo) and more failure modes
2. Preload the sandbox image on self-hosted runners
  • ➕ Eliminates per-run pull time and avoids cache tar overhead entirely
  • ➕ Most reliable option for high-frequency CI
  • ➖ Requires maintaining runner images/AMIs and updating the preloaded image operationally
  • ➖ Not applicable if using only GitHub-hosted runners
3. Use a registry/layer cache mirror (or GHCR proximity)
  • ➕ Keeps the simple podman pull flow while improving pull performance
  • ➕ Avoids tar save/load complexity
  • ➖ Requires additional infra/configuration and may not help all runner regions equally

Recommendation: Given the stated measurements (cache adds ~4.5 minutes while podman pull is ~30s), removing the tar-based cache is the best default for GitHub-hosted runners. Only revisit caching if you can key on immutable digests and demonstrate net wins, or if you move to self-hosted runners where preloading is cheaper than per-run pulls.
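
The "net win" test in the recommendation can be written as a small calculation. This is a sketch using only the two approximate figures quoted in this PR (~4.5 min for the full cache cycle, ~30 s for a bare pull); the function name is illustrative.

```python
def net_saving_seconds(cached_path_s: float, bare_pull_s: float) -> float:
    """Time saved per run by caching; a negative value means caching is slower."""
    return bare_pull_s - cached_path_s


# Approximate figures stated in the PR description.
CACHE_CYCLE_S = 4.5 * 60  # restore, load, pull, save, upload
BARE_PULL_S = 30.0        # plain podman pull

saving = net_saving_seconds(CACHE_CYCLE_S, BARE_PULL_S)
print(f"{saving / 60:+.1f} min per run")  # -4.0 min per run
```

A negative result of about four minutes per run matches the improvement the test plan expects after removing the cache.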

Files changed (1) +1 / -55

Bug fix (1) +1 / -55
action.yml: Remove sandbox image cache pipeline; pull image directly with Podman (+1/-55)

• Stops restoring/saving a cached sandbox image tar via 'actions/cache' and removes the associated digest-check logic. Drops 'skopeo' from the apt dependencies and reduces the pre-pull step to a straightforward 'podman pull' with a timeout and warning fallback.
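
The "timeout and warning fallback" behavior described for this first revision can be sketched in Python as follows. The real step lives in action.yml and is not shown in this conversation, so the function name, the timeout value, and the image reference in the usage note are assumptions for illustration only.

```python
import subprocess


def pre_pull(cmd: list[str], timeout_s: float) -> bool:
    """Run a pull command; on failure or timeout, warn and carry on.

    Returns True if the command succeeded, False otherwise. The caller is
    expected to continue either way, since the image can be pulled later.
    """
    try:
        subprocess.run(cmd, check=True, timeout=timeout_s)
        return True
    except (
        subprocess.CalledProcessError,   # command exited non-zero
        subprocess.TimeoutExpired,       # command exceeded timeout_s
        FileNotFoundError,               # binary is not installed
    ) as exc:
        # "::warning::" is the GitHub Actions workflow command that
        # surfaces a warning annotation instead of failing the step.
        print(f"::warning::pre-pull failed, continuing without it: {exc}")
        return False
```

A caller might invoke it as `pre_pull(["podman", "pull", "ghcr.io/example/sandbox:latest"], timeout_s=120)`, where both the image reference and the 120-second limit are hypothetical.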

@ralphbean ralphbean changed the title from "fix: remove sandbox image cache (~4.5 min overhead)" to "perf: remove sandbox image cache (~4.5 min overhead)" Jun 30, 2026
@qodo-code-review

Code Review by Qodo

🐞 Bugs (0) 📘 Rule violations (0) 📎 Requirement gaps (0)

Great, no issues found!

Qodo reviewed your code and found no material issues that require review

@github-actions

github-actions Bot commented Jun 30, 2026

Site preview

Preview: https://f4827c9b-site.fullsend-ai.workers.dev

Commit: 75ad774d1e19d0499f66fb5198cd047deec86f37

@fullsend-ai-review

fullsend-ai-review Bot commented Jun 30, 2026

🤖 Finished Review · ✅ Success · Started 7:28 PM UTC · Completed 7:36 PM UTC
Commit: 112cee6 · View workflow run →

@codecov

codecov Bot commented Jun 30, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.

@fullsend-ai-review

fullsend-ai-review Bot commented Jun 30, 2026

Looks good to me


Labels: PR removes sandbox image caching infrastructure from the composite action (CI performance)

Previous run

Looks good to me


Labels: PR modifies sandbox image caching in the composite action (CI infrastructure)

fullsend-ai-review[bot]

This comment was marked as outdated.

@fullsend-ai-review fullsend-ai-review Bot added the ready-for-merge (all reviewers approved, ready to merge), component/ci (CI pipelines and checks), and component/sandbox (OpenShell sandbox environment) labels Jun 30, 2026

@rh-hemartin rh-hemartin left a comment

Member

The code (and fix) agents are the least frequently run, and the other agents do not use the code image but the base one. Drop the pull entirely and let the sandbox do its work.

The code and fix agents that use the fullsend-code image are the least
frequently run agents, and the others use the base image. Let the
sandbox pull the image on demand rather than pre-pulling it in the
action setup.

Assisted-by: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Ralph Bean <rbean@redhat.com>
@fullsend-ai-review

fullsend-ai-review Bot commented Jul 1, 2026

🤖 Review · ❌ Terminated · Started 6:40 PM UTC · Ended 6:49 PM UTC
Commit: 0123b0b · View workflow run →

@fullsend-ai-review fullsend-ai-review Bot added and removed the ready-for-merge (all reviewers approved, ready to merge) label Jul 1, 2026
@fullsend-ai-review

🤖 Finished Review · ✅ Success · Started 6:40 PM UTC · Completed 6:49 PM UTC
Commit: 75ad774 · View workflow run →

@rh-hemartin rh-hemartin added this pull request to the merge queue Jul 2, 2026
Merged via the queue into main with commit 30b0b87 Jul 2, 2026
22 of 23 checks passed
@rh-hemartin rh-hemartin deleted the remove-sandbox-image-cache branch July 2, 2026 06:12
@fullsend-ai-retro

fullsend-ai-retro Bot commented Jul 2, 2026

🤖 Finished Retro · ✅ Success · Started 6:14 AM UTC · Completed 6:21 AM UTC
Commit: 75ad774 · View workflow run →

@fullsend-ai-retro

Retro: PR #2796 — Remove sandbox image cache

Timeline: Human-authored PR by ralphbean (Jun 30 – Jul 2) removing ~60 lines of tar-based image caching from action.yml to eliminate ~4.5 min overhead per CI run. The review agent approved on the first pass with "Looks good to me." A human reviewer (rh-hemartin) then requested changes, pointing out that code/fix agents are the least-run agent types and other agents use a different base image — so the remaining podman pull step should also be dropped entirely. After the author updated, the PR was re-approved and merged.

Key finding: The review agent missed a valid optimization opportunity that a human caught. The human's insight required operational knowledge about which agents use which container images and how frequently each agent type runs. The review agent evaluated the diff as correct (which it was) but couldn't assess whether the change went far enough because it lacked this runtime/infrastructure context.

1 proposal filed (review agent CI infrastructure awareness). No proposals were skipped as duplicates, though #2235 (verify PRs achieve stated goals given deployment environment) is thematically adjacent — it addresses tool availability rather than runtime topology and usage patterns.

Proposals filed

Labels

component/ci (CI pipelines and checks), component/sandbox (OpenShell sandbox environment), ready-for-merge (all reviewers approved, ready to merge)

Development

Successfully merging this pull request may close these issues.

Sandbox image cache adds ~4.5 min overhead per run without saving pull time

3 participants