
Conversation

@luolanzone
Contributor

Multiple E2E jobs can be scheduled on the same Jenkins node.
If stale image files from previous jobs are not deleted, they
accumulate over time, consuming significant disk space. This
can eventually lead to 'no disk space left' errors when new
jobs start.

This change ensures that saved image files are cleaned up
when an E2E job exits, preventing disk space exhaustion.

        cleanup_multicluster_antrea "$kubeconfig"
    done
fi
rm -f "${WORKDIR}/antrea-ubuntu.tar" "${WORKDIR}/antrea-mcs.tar" "${WORKDIR}/nginx.tar" "${WORKDIR}/agnhost.tar"
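For illustration, a minimal sketch (not the actual CI script) of how an EXIT trap could guarantee this tar cleanup even when a job fails partway through, assuming a bash job script where WORKDIR is already set:

# Sketch only: register the tar cleanup as an EXIT trap so the files are
# removed whether the job succeeds or aborts. WORKDIR is assumed to be
# set by the job environment.
clean_tar_files() {
    rm -f "${WORKDIR}/antrea-ubuntu.tar" "${WORKDIR}/antrea-mcs.tar" \
          "${WORKDIR}/nginx.tar" "${WORKDIR}/agnhost.tar"
}
trap clean_tar_files EXIT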
Contributor
@XinShuYang Nov 25, 2025

@luolanzone Thanks for the enhancement. I'm concerned this deletion could lead to a race condition, as multiple concurrent jobs will be placing the antrea-ubuntu.tar file in the same path. I suggest reconfiguring the tar file path to WORKSPACE instead because Jenkins can resolve this environment variable to the current job's real workspace path.

Also, I investigated this issue and found that the go cache was consuming 11 GB of space on the testbed. After running go clean -cache and go clean -modcache, this space can be recovered. I feel this can provide more benefits in mitigating the disk space shortage issue.
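A minimal sketch of that suggestion, assuming the job shell sees the Jenkins-provided WORKSPACE variable (the image name and the temp-dir fallback below are illustrative, not the actual script):

# Sketch only: keep per-job artifacts under the Jenkins-resolved workspace so
# that concurrent jobs on the same node never write to the same tar path.
# WORKSPACE is set by Jenkins per build; fall back to a temp dir when unset.
WORKDIR="${WORKSPACE:-$(mktemp -d)}"
docker save antrea/antrea-ubuntu:latest -o "${WORKDIR}/antrea-ubuntu.tar"

# Optional follow-up from the comment above: reclaim the Go caches.
go clean -cache -modcache || true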

Contributor Author

@XinShuYang do we allow the same e2e job to be scheduled and run in parallel on the same Node? If not, there should be no concurrency issue with this deletion step.
What I observed is that different e2e jobs land on the same Node, but the paths of the *.tar files are different, e.g.:
/var/lib/jenkins/workspace/antrea-kind-ipv6-ds-conformance-for-pull-request/antrea-ubuntu.tar
/var/lib/jenkins/workspace/antrea-kind-ipv6-ds-e2e-for-pull-request/antrea-ubuntu.tar
...

Contributor Author

Are you only referring to the MC-related tar files? You are right: even if there is no such concurrency issue, we should place them under the WORKSPACE path. I will update the MC e2e script.

Contributor Author

Updated the path of the tar files in the MC script.

I am not so sure about go clean -cache and go clean -modcache, considering it would require each e2e job to download the cache again if we clean up the caches on every execution.

@antoninbas do you have any suggestion?

Contributor

@XinShuYang do we allow the same e2e job to be scheduled and run in parallel on the same Node? If not, there should be no concurrency issue with this deletion step. What I observed is that different e2e jobs land on the same Node, but the paths of the *.tar files are different, e.g.:
/var/lib/jenkins/workspace/antrea-kind-ipv6-ds-conformance-for-pull-request/antrea-ubuntu.tar
/var/lib/jenkins/workspace/antrea-kind-ipv6-ds-e2e-for-pull-request/antrea-ubuntu.tar
...

Yes, the scripts have supported this since #5734. Although we disabled it in the Jenkins configuration after the CI migration to save resource costs, we should still account for it in new code changes in case we enable it again in the future.

Contributor

@luolanzone I expect the Go build cache to be pretty small because we build Antrea inside docker and we only use Go "natively" to run e2e tests.
You could check the size with du -sh $(go env GOCACHE), and I read that Go deletes cache files if they haven't been used for at least 5 days.
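For reference, the same check extended to the module cache as well (assuming Go is on PATH on the testbed):

# Report the on-disk size of the Go build cache and module cache.
du -sh "$(go env GOCACHE)" "$(go env GOMODCACHE)"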

Contributor Author

Hi @XinShuYang, do we ever support running the same e2e job on the same Node? How do we avoid two jobs generating the image files and overwriting each other in the same directory? I think the capability added in #5734 is to enable multiple jobs (but of different kinds) on the same Node.

Contributor

When two builds of the same job run concurrently on a single node, Jenkins creates a separate workspace for each. Here is the explanation from the Jenkins documentation: Each concurrently executed build occurs in its own build workspace, isolated from any other builds. By default, Jenkins appends "@<num>" to the workspace directory name, e.g. "@2". @luolanzone

@luolanzone
Contributor Author

/test-multicluster-e2e

@antoninbas removed their assignment Nov 25, 2025
@antoninbas self-requested a review Nov 25, 2025 17:10
1. Multiple E2E jobs can be scheduled on the same Jenkins node.
If stale image files from previous jobs are not deleted, they
accumulate over time, consuming significant disk space. This
can eventually lead to 'no disk space left' errors when new
jobs start.

This change ensures that saved image files are cleaned up
when an E2E job exits, preventing disk space exhaustion.

2. Clean up Golang caches unconditionally

Signed-off-by: Lan Luo <[email protected]>

function check_and_upgrade_golang() {
    echo "====== Clean up Golang cache ======"
    go clean -cache -modcache -testcache || true
Contributor Author

Met the disk space issue again. I checked the Jenkins Node, and the caches grow to over 1 GB quite quickly, so I added a step here to clean them up unconditionally. @XinShuYang @antoninbas can you take another look? Thanks.

root@antrea-kind-testbed:/var/lib/jenkins# du -sh $(go env GOCACHE)
1.9G	/var/lib/jenkins//.cache/go-build
root@antrea-kind-testbed:/var/lib/jenkins# du -sh $(go env GOMODCACHE)
2.1G	/var/lib/jenkins/go/pkg/mod

Contributor
@XinShuYang Dec 29, 2025

Is it necessary to delete this cache on every run? Can we instead clean it only when the free storage space is below a certain threshold, similar to the existing check?

if [[ $free_space -lt $free_space_threshold ]]; then
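A minimal sketch of that threshold-based approach, with placeholder variable names and an illustrative 20 GB threshold (not the existing script's values):

# Sketch only: clean the Go caches when free space on the workspace
# filesystem drops below a threshold, instead of on every run.
free_space=$(df --output=avail -BG "${WORKSPACE:-.}" | tail -n 1 | tr -dc '0-9')
free_space_threshold=20   # GB, illustrative value

if [[ $free_space -lt $free_space_threshold ]]; then
    echo "Free space ${free_space}G is below ${free_space_threshold}G, cleaning Go caches"
    go clean -cache -modcache -testcache || true
fi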
