(v1-2/3) tea.go: teardown-related deadlock & race condition fixes #1373
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Overview
The PR addresses and resolves three possible deadlocks and a multitude of race conditions:
p.Wait()
deadlocking (externally) after the program aborts/is killed.The proposed change also factors in program abort/kill and reports such completions too.
multiple
p.Wait()
deadlocking (externally) as thep.finished
channel was not closed.The proposed change closes the respective channel before program exit, releasing such waiting functions.
p.shutdown()
deadlocking (internally) on multiple executions due top.finished
buffer constraintsThe proposed change does not send into the completion channel, but rather closes it, to signal all waiting functions.
p.Kill()
race conditions and points of failure callingp.shutdown()
again, despite it being called at program end.The proposed change cancels the program's context instead, untying
p.Kill()
from program state and allowing the natural program teardown eventually resulting inp.shutdown()
as needed.p.Kill()
is now idempotent, safe and versatile.Specifics of occasional test failures regarding
p.shutdown()
andp.Wait()
:The failed test helped uncover multiple deadlock situations that the initial code (if clause) was probably put in place to treat symptomatically (causing deadlocks on its own in the process, albeit external ones).
p.shutdown()
may get called multiple times throughout the code, in the (sometimes failing, sometimes not - depending on the race condition's timing) test's case once by our explicitp.Kill()
call and then upon exit of thep.eventLoop()
another time. This resulted in sending to thep.finished
channel twice despite it only having a buffer of one.To mitigate this problem, the logic of reporting completion through that channel was moved closer to the actual end of the program using a logically placed
defer
call right after the channel's establishment (for logic and readability).The code was also refactored to closing the
p.finished
channel on teardown, instead of sending into it, to allow unblocking of multiple blockedp.Wait()
calls instead of just one and having the rest deadlock (just one value being drained).Specifics of occasional (older and new) test failures regarding
p.Kill()
:First off, we're lucky this happened exactly here - because it doesn't always:
You may have seen this in other commits and considered it an occasional blip of the CI:
https://github.com/charmbracelet/bubbletea/actions/runs/13250713846/job/36987719494
https://github.com/charmbracelet/bubbletea/actions/runs/13796125964/job/38588086635
https://github.com/charmbracelet/bubbletea/actions/runs/13268485968/job/37041941350
But, at present there's a myriad of race conditions (seen in these test failures) and possible pitfalls associated with calling
p.Kill()
either too soon or too late. This is with the program not having fully initialised yet or another shutdown already being in progress from inside the program. It all stems from directly callingp.shutdown(true)
inside ofp.Kill()
:bubbletea/tea.go
Lines 710 to 712 in 6a1ebaa
This call makes no sense, because at the end of the natural program flow (either natural or accelerated by a quit/kill) it always gets called anyway - shutdown function then runs doubly, waits doubly, attempts to restore the terminal doubly, you get the idea.
bubbletea/tea.go
Line 658 in 6a1ebaa
Ideally a user should never need to care about when it is safe to call
p.Kill()
, it should just ensure the program is aware it needs to tear down fast and do that. Since the program is already (internal) context aware, it is safer to just cancel that context insidep.Kill()
and let the program tear down itself - always eventually ending in the neededp.shutdown()
at the end ofp.Run()
.This change makes the
p.Kill()
function versatile, idempotent and not tied to the program's state.Plus, it also eliminates all race conditions, double executions and related points of failure - which I think is sweet. 😎
Testing the proposed changes