vfkit: More robust state management #20506
pkg/drivers/vfkit/vfkit.go
Outdated
		return nil
	}
	return nil
}
crc has some similar code: https://github.com/crc-org/crc/blob/4bdcc16077a84583ad85ea5b8b75eefab13ba82d/pkg/drivers/vfkit/driver_darwin.go#L403-L455. It would be useful to move this to vfkit instead of each project having its own implementation.
pkg/drivers/vfkit/vfkit.go
Outdated
	if err != nil {
		return err
	}
	log.Infof("Sending signal %q to pid %v", sig, pid)
crc checks that the process name contains vfkit, in case the pid file has stale content and the pid was reused by some unrelated process.
Do you have a link to this code? This is very much needed.
This is implemented now using the process package #20528.
	if err := d.SetVFKitState("Stop"); err != nil {
		// vfkit may be already stopped, shutting down, or not listening.
		log.Debugf("Failed to set vfkit state to 'Stop': %s", err)
		return signalPidfile(d.pidfilePath(), syscall.SIGTERM)
	}
Maybe remove the pid file here so that GetState returns Stopped after Stop was called?
signalPidfile removes the pid when the process does not exist. But if the signal is sent the process still exists and the caller will discover it by polling on GetState(). Stop() is not responsible for waiting until the process is stopped, the caller is implementing this for all drivers.
What I mean is that if d.SetVFKitState("Stop") succeeds, you have a small window for a race when GetState will return running while vfkit is stopping. It's probably not really important, as GetState will eventually get to a consistent state.
We don't use the API to get the running state, so we don't have this race. But the vfkit docs tell about VirtualMachineStopping; don't we return this state after calling Stop?
pkg/drivers/vfkit/vfkit.go
Outdated
@@ -136,6 +136,49 @@ func (d *Driver) GetIP() (string, error) {
	return d.IPAddress, nil
}

func writePidfile(pidfile string, pid int) error {
There might be a util or lib that we are already using in other parts of the code for writing or locking PID files.
Now using the process package (#20528), which uses the same method used in other packages for managing pids.
pkg/drivers/vfkit/vfkit.go
Outdated
	}
	log.Infof("Sending signal %q to pid %v", sig, pid)
	if err := process.Signal(sig); err != nil {
		if err != os.ErrProcessDone {
It may be more portable to use process.Kill() instead of process.Signal(SIGKILL).
NB: one issue with Kill / SIGKILL is that the process does not have the opportunity to catch the signal and do some cleanup on its own before exiting. See containers/gvisor-tap-vsock#485 for an example of this, but vfkit could have exactly the same problem.
This ties back to your comment in crc-org/vfkit#278: the vfkit client library should provide code which does the right thing.
Yes, kill should be the last thing we try if stop did not work after some retries. Minikube uses kill only when deleting a cluster, but maybe it would be better to try to stop gracefully and fall back to kill if there is no other way.
Current code contains multiple implementations for managing a process using pids, with various issues:

- Some are unsafe, terminating a process by pid without validating that the pid belongs to the right process.
- Some use unclear terms like checkPid() (what does it mean?)
- Some are missing tests

Let's clean up the mess by introducing a process package. The package provides:

- process.WritePidfile(): write a pid to file
- process.ReadPidfile(): read pid from file
- process.Exists(): tells if process matching pid and name exists
- process.Terminate(): terminates a process matching pid and name
- process.Kill(): kills a process matching pid and name

The library is tested on linux, darwin, and windows. On windows we don't have a standard way to terminate a process gracefully, so process.Terminate() is the same as process.Kill().

I want to use this package in vfkit and the new vment package, and later we can use it for qemu, hyperkit, and other code managing processes with pids.
- Simplify GetState() using process.ReadPidfile()
- Simplify Start() using process.WritePidfile()
GetState() had several issues:

- When accessing the vfkit HTTP API, we handled only "running", "VirtualMachineStateRunning", "stopped", and "VirtualMachineStateStopped", but there are 10 other possible states, which we handled as state.None even when vfkit is running and needs to be stopped. This can lead to wrong handling in the caller.
- When handling "stopped" and "VirtualMachineStateStopped" we returned state.Stopped, but did not remove the pidfile. This can cause termination of an unrelated process or reporting the wrong status when the pid is reused.
- Accessing the HTTP API fails after we stop or kill vfkit. This causes GetState() to fail when the process is actually stopped, which can lead to unnecessary retries and long delays (kubernetes#20503).
- When returning state.None during Remove(), we tried to do a graceful shutdown, which does not make sense in the minikube delete flow and is not consistent with state.Running handling.

Accessing the vfkit API to check the state does not add much value for our use case, checking if the vfkit process is running, and it is not reliable.

Fix all the issues by not using the HTTP API in GetState(), and use only the process state. We still use the API for stopping and killing vfkit to do graceful shutdown. This also simplifies Remove(), since we need to handle only the state.Running state.

With this change we consider vfkit as stopped only when the process does not exist, which takes about 3 seconds after the state is reported as "stopped".

Example stop flow:

    I0309 18:15:40.260249 18857 main.go:141] libmachine: Stopping "minikube"...
    I0309 18:15:40.263225 18857 main.go:141] libmachine: set state: {State:Stop}
    I0309 18:15:46.266902 18857 main.go:141] libmachine: Machine "minikube" was stopped.
    I0309 18:15:46.267122 18857 stop.go:75] duration metric: took 6.127761459s to stop

Example delete flow:

    I0309 17:00:49.483078 18127 out.go:177] * Deleting "minikube" in vfkit ...
    I0309 17:00:49.499252 18127 main.go:141] libmachine: set state: {State:HardStop}
    I0309 17:00:49.569938 18127 lock.go:35] WriteFile acquiring /Users/nir/.kube/config: ...
    I0309 17:00:49.573977 18127 out.go:177] * Removed all traces of the "minikube" cluster.
Previously we did not check the process name when checking a pid from a pidfile. If the pidfile became stale we would assume that vfkit is running and try to stop it via the HTTP API, which would never succeed. Now we detect a stale pidfile and remove it.
If setting vfkit state to "Stop" fails, we used to return an error. Retrying the operation may never succeed. Fix by falling back to terminating vfkit using a signal. This terminates vfkit immediately, similar to HardStop[1].

We can still fail if the pidfile is corrupted but this is unlikely and requires manual cleanup.

In the case when we are sure the vfkit process does not exist, we remove the pidfile immediately, avoiding a leftover pidfile if the caller does not call GetState() after Stop().

[1] crc-org/vfkit#284
We know that setting the state to `HardStop` typically fails:

    I0309 19:19:42.378591 21795 out.go:177] 🔥 Deleting "minikube" in vfkit ...
    W0309 19:19:42.397472 21795 delete.go:106] remove failed, will retry: kill: Post "http://_/vm/state": EOF

This may lead to unnecessary retries and delays. Fix by falling back to sending a SIGKILL signal.

Example delete flow when setting vfkit state fails:

    I0309 20:07:41.688259 25540 out.go:177] 🔥 Deleting "minikube" in vfkit ...
    I0309 20:07:41.712017 25540 main.go:141] libmachine: Failed to set vfkit state to 'HardStop': Post "http://_/vm/state": EOF
We get and set vfkit state using the HTTP API. This is not very robust since the API is provided by the process we are terminating:

- HardStop typically fails when vfkit is terminated, leading to unneeded retries and delays.
- Stop may fail if vfkit is already stopped or shutting down, or does not listen to the socket.

This change fixes the issues by using the vfkit API only for setting state, and falling back to the process package pid management functions if the API fails.
Based on #20528 for testing.