
Self-hosted runner cleanup/update bloat growing over time #2708

Open
@jschwartz-cray

Description

Describe the bug
NOTE: I recognize this is on the boundary between bug, feature, and documentation request since I'm not entirely sure what I'm looking at, so please be gentle. Starting out here.

The self-hosted runner update process appears to be leaving behind multiple directories, e.g.

bin.2.296.3
externals.2.296.3
_work/_update/externals

plus another copy of externals under _work:

_work/__externals__

In an environment with a significant number of self-hosted persistent runners this really starts to add up. It's not clear what is safe to clean up here, or whether (and how) the normal update process will ever clean any of this up.

In my case the root externals entry in my runners is a symlink to externals.2.299.2, so I suspect it's safe to remove externals.2.296.3 from the old version (and likewise for bin)? What about _work/_update in general, and _work/_update/externals specifically? Can _work/__externals__ be cleaned up at the end of jobs?
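To make the question concrete, here is a minimal sketch that only lists the versioned bin.* and externals.* directories the current symlinks do not point to. It assumes the layout described above (bin and externals in the runner root are symlinks to versioned siblings) and GNU `readlink -f`. The function name `list_stale_versions` is made up for illustration, and nothing here confirms that these directories are actually safe to delete.

```shell
# Hypothetical helper: print bin.*/externals.* directories that are NOT the
# target of the current bin/externals symlink. It deletes nothing.
# Assumption: "$1" is the runner install root.
list_stale_versions() (
    runner_dir="$1"
    for name in bin externals; do
        # Skip layouts where the entry is not a symlink (e.g. never updated).
        [ -L "$runner_dir/$name" ] || continue
        # Resolve the symlink to the directory currently in use.
        current="$(readlink -f "$runner_dir/$name")"
        for candidate in "$runner_dir/$name".*; do
            # An unmatched glob stays literal and fails this check.
            [ -d "$candidate" ] || continue
            if [ "$(readlink -f "$candidate")" != "$current" ]; then
                printf '%s\n' "$candidate"
            fi
        done
    done
)
```

For example, `list_stale_versions /home/users/runner_user/runner_dir` would print the old versioned directories as removal candidates, leaving the decision to a human.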

It seems that perhaps the runner update process is being conservative in removing things to allow rolling back to a previous version, but given how much space each old copy takes I'm not sure that's a great strategy. It would be nice if this were exposed as a configurable choice, or if there were at least a config.sh (or similar) tool to remove old unneeded versions. The alternative would be for me to periodically remove all my runners and reinstall them from scratch, but that's a lot of churn.

Another alternative would be a feature to allow multiple self-hosted runners on the same machine to share a common install/bin/externals, with each one being able to point to a different version. I wouldn't be too concerned with cleaning this up if there was only one copy of it instead of N.

To Reproduce

  1. Install an old version of a self-hosted persistent runner
  2. du -hs the install directory
  3. Update it
  4. du -hs the install directory
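The steps above can be sketched as a small before/after measurement. This is only an illustration: the helper names `dir_size_kb` and `growth_kb` are made up here, and the update itself is not scripted, since it happens through the runner's own self-update.

```shell
# Print the size of a directory in KiB (du -sk is portable across du variants).
dir_size_kb() {
    du -sk "$1" | awk '{print $1}'
}

# Print how many KiB a directory has grown since an earlier measurement.
# $1: earlier size in KiB, $2: directory to measure now.
growth_kb() (
    before_kb="$1"
    after_kb="$(dir_size_kb "$2")"
    echo "$((after_kb - before_kb))"
)
```

Usage would look like `before="$(dir_size_kb "$RUNNER_DIR")"`, then letting the runner update, then `growth_kb "$before" "$RUNNER_DIR"`, where `RUNNER_DIR` is a placeholder for the install directory.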

Expected behavior
There is no significant difference between the size of a freshly installed runner and one which has been updated.

Runner Version and Platform

2.299.2, x86_64 Linux

OS of the machine running the runner? OSX/Windows/Linux/...
CentOS 7

What's not working?

Multiple directories are being left behind:

$ du -hs bin.2.296.3 externals.2.296.3 _work/_update/externals _work/__externals__
74M	bin.2.296.3
318M	externals.2.296.3
318M	_work/_update/externals
318M	_work/__externals__

Job Log Output

N/A

Runner and Worker's Diagnostic Logs

Example SelfUpdate-20230627-011005.log.succeed:

[2023-06-26 20:10:05-3953] --------whoami--------
runner_user
[2023-06-26 20:10:05-3982] --------whoami--------
[2023-06-26 20:10:05-3995] Waiting for Runner.Listener (21314) to complete
[2023-06-26 20:10:05-4007] Process 21314 still running
[2023-06-26 20:10:07-0022] Process 21314 still running
[2023-06-26 20:10:09-0023] Process 21314 finished running
[2023-06-26 20:10:09-0036] Sleep 1 more second to make sure process exited
[2023-06-26 20:10:10-0024] move /home/users/runner_user/runner_dir/bin /home/users/runner_user/runner_dir/bin.2.296.3
‘/home/users/runner_user/runner_dir/bin’ -> ‘/home/users/runner_user/runner_dir/bin.2.296.3’
[2023-06-26 20:10:10-0055] move /home/users/runner_user/runner_dir/externals /home/users/runner_user/runner_dir/externals.2.296.3
‘/home/users/runner_user/runner_dir/externals’ -> ‘/home/users/runner_user/runner_dir/externals.2.296.3’
[2023-06-26 20:10:10-0085] Create junction bin folder
[2023-06-26 20:10:10-0111] Create junction externals folder
[2023-06-26 20:10:10-0163] Update succeed
[2023-06-26 20:10:10-0186] update.finished file creation succeed
[2023-06-26 20:10:10-0197] Rename /home/users/runner_user/runner_dir/_diag/SelfUpdate-20230627-011005.log to be /home/users/runner_user/runner_dir/_diag/SelfUpdate-20230627-011005.log.succeed
‘/home/users/runner_user/runner_dir/_diag/SelfUpdate-20230627-011005.log’ -> ‘/home/users/runner_user/runner_dir/_diag/SelfUpdate-20230627-011005.log.succeed’
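Since the self-update log records each move, one way to find out which versioned directories an update left behind is to pull the destinations out of the `move` lines. This is a rough sketch that assumes the log format shown above (timestamp in the first two fields, then `move <source> <destination>`) and paths without spaces. `list_moved_dirs` is a made-up name.

```shell
# Hypothetical helper: print the destination of every "move" recorded in the
# given SelfUpdate logs, de-duplicated.
# Assumed line format: [date time-ms] move <source> <destination>
list_moved_dirs() {
    awk '$3 == "move" { print $5 }' "$@" | sort -u
}
```

For example, `list_moved_dirs _diag/SelfUpdate-*.log.succeed` run from the runner root would list entries such as the bin.2.296.3 and externals.2.296.3 paths seen in the log above.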


Labels: bug, keep
