-
Notifications
You must be signed in to change notification settings - Fork 1.2k
yugabyted: stop child processes with SIGTERM instead of SIGKILL
#29676
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
✅ Deploy Preview for infallible-bardeen-164bc9 ready!Built without sensitive environment variables
To edit notification comments on pull requests, go to your Netlify project configuration. |
c599bae to
3fa7e40
Compare
yugabyted: cleanup shared memory segments from tserver on interruptyugabyted: stop child processes with SIGTERM instead of SIGKILL
|
I tested simply using To simplify the PR, I've just switched the signal and remove the additional cleanup logic. However, I can add it back if you think having an extra safety measure is worthwhile (I'm not convinced it makes sense to live in Thanks! |
bin/yugabyted
Outdated
| for p in self.processes.values(): | ||
| p.kill() | ||
| self.script.delete_pidfile() | ||
| time.sleep(0.5) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good. I'd just ask you to bump the wait time, 0.5 is extremely short. Let's do 10 seconds.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, a 10 second blind wait would be very annoying. Alright, 2 alternatives:
- Keep the blind wait, and change the sleep to 3 seconds
- Remove the blind wait - poll process status and only wait as long as any of the signalled processes are still alive.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Gotcha, makes sense. I added a wait_until_stop calls for each process, which looks like it should do the trick? Although this might just check the pidfile presence which isn't really the same as checking whether the process is actually running. Can you advise whether this approach is sufficient?
Currently,
yugabytedkills child processes withSIGKILL, which does not allow them to perform any cleanup. Fortserver, this means not cleaning up shared memory segments that were created. The current approach to cleaning these up is to use the previoustserverUUID from the configuration files in the base directory to determine the previous segment name and remove it. However, if the base directory no longer exists, this cleanup cannot be performed. As a consequence, the shared memory segments build up over time. On my machine, the limit for shared memory identifiers is 32, which means it doesn't take too long to encounter an issue with this.This PR proposes a way to cleanup these shared segments at termination without replacing the
SIGKILLsignal. It doesn't replace the previous approach, so anything that slips through here should still be cleaned up on the next startup. However, I think this may be a slightly cleaner way to handle this.When
yugabytedis cleaning up child processes, it now does the following:tserverprocess using the pid stored inself.processesyugabyted, which should preserve existing behaviortserverUUID fromself.configsto generate the segment nameThis should be a relatively minimal change that fails gracefully if any of the required information isn't available, but may help to avoid building up orphaned shared segments in certain cases.