fix: etcd client leak in the (legacy) Upgrade API#13508
Conversation
There was a problem hiding this comment.
Pull request overview
This PR addresses resource leaks in the legacy Upgrade API’s etcd validation path by ensuring temporary etcd clients are properly closed, and it narrows when etcd upgrade pre-checks run to only cases where preserve=false (i.e., when the request implies wiping behavior), aligning the checks with the legacy API’s intended semantics.
Changes:
- Close per-member etcd clients created during member health validation to prevent file descriptor leaks.
- Close the etcd client created during legacy
Upgraderequest validation to prevent accumulation across repeated calls. - Gate etcd pre-checks (mutex +
ValidateForUpgrade) under!preservein the legacy Upgrade API path.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| internal/pkg/etcd/etcd.go | Ensures temporary etcd clients created for member health/quorum checks are closed. |
| internal/app/machined/internal/server/v1alpha1/v1alpha1_server.go | Ensures the Upgrade validation etcd client is closed and runs etcd pre-checks only when preserve=false. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
frezbo
left a comment
There was a problem hiding this comment.
I'm surprised no linters caught this, i guess we need better linters that catches these, I guess copilot could have caught this
this is also super old/legacy code path not touched for ages |
This doesn't affect new upgrade API. There were two leaks in the validate path, usually they are harmless, as the Upgrade triggers a reboot, so a couple of leaking clients doesn't matter, but if they API is called repeatedly (and it fails to upgrade), the leak accumulates until `etcd` container runs out of file descriptor, eventually preventing new connections to `etcd` from being established. Also put the etcd pre-checks under the `!preserve` condition (means the API requests wipe). The wipe behavior was disabled by default for a long time, and all etcd pre-checks only make sense if the wipe is called, otherwise upgrade doesn't affect etcd membership in any way. Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
6644fca to
89e307e
Compare
|
/m |
This doesn't affect new upgrade API.
There were two leaks in the validate path, usually they are harmless, as the Upgrade triggers a reboot, so a couple of leaking clients doesn't matter, but if they API is called repeatedly (and it fails to upgrade), the leak accumulates until
etcdcontainer runs out of file descriptor, eventually preventing new connections toetcdfrom being established.Also put the etcd pre-checks under the
!preservecondition (means the API requests wipe). The wipe behavior was disabled by default for a long time, and all etcd pre-checks only make sense if the wipe is called, otherwise upgrade doesn't affect etcd membership in any way.