hostagent: stop leaking inotify watchers and re-spawn the gRPC stream after guest-agent reconnect #4895

Open
mn-ram wants to merge 1 commit into lima-vm:master from mn-ram:fix/inotify-watcher-leak

mn-ram commented Apr 27, 2026

Fixes #4894.

Previously, startInotify never called notify.Stop, so the rjeczalik/notify reader goroutine and the per-mount inotify_init FDs leaked, and it never called inotifyClient.CloseSend(), so the guest agent's PostInotify handler stayed parked. It was also spawned exactly once and only logged Send errors. After the first guest-agent restart the inotify gRPC stream was therefore dead while the goroutine kept "running", and host-side file changes silently stopped reaching the guest until limactl stop && limactl start.
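To illustrate the failure mode, here is a minimal sketch that contrasts a warn-and-loop forwarder with one that returns on the first Send error. It is not the hostagent code: `sender`, `deadStream`, `warnAndLoop`, `failFast`, and `feed` are hypothetical stand-ins, and the stream is simulated rather than a real gRPC client.

```go
package main

import (
	"errors"
	"fmt"
)

// sender is a hypothetical stand-in for the Send side of the
// PostInotify client stream.
type sender interface {
	Send(path string) error
}

// deadStream simulates a stream whose peer (the guest agent) has
// restarted: every Send fails. It counts how often Send was attempted.
type deadStream struct{ attempts int }

func (d *deadStream) Send(string) error {
	d.attempts++
	return errors.New("stream closed")
}

// warnAndLoop mirrors the old behaviour: log the Send error and keep
// consuming events. It returns how many events were dropped.
func warnAndLoop(events <-chan string, s sender) (dropped int) {
	for ev := range events {
		if err := s.Send(ev); err != nil {
			fmt.Printf("warn: failed to send %q: %v\n", ev, err)
			dropped++
		}
	}
	return dropped
}

// failFast is the shape the fix moves to: the first Send error ends
// the loop, so the caller can notice and reconnect.
func failFast(events <-chan string, s sender) error {
	for ev := range events {
		if err := s.Send(ev); err != nil {
			return fmt.Errorf("send %q: %w", ev, err)
		}
	}
	return nil
}

// feed returns a closed, pre-filled channel of file paths.
func feed(paths ...string) <-chan string {
	ch := make(chan string, len(paths))
	for _, p := range paths {
		ch <- p
	}
	close(ch)
	return ch
}

func main() {
	old := &deadStream{}
	fmt.Println("dropped:", warnAndLoop(feed("a", "b", "c"), old))

	fresh := &deadStream{}
	fmt.Println("error:", failFast(feed("a", "b", "c"), fresh))
}
```

In the warn-and-loop variant every event is consumed and lost while the goroutine looks healthy. The fail-fast variant stops after one failed Send, which is what lets a supervisor re-establish the stream.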

This change:

  • adds defer notify.Stop(mountWatchCh) and defer inotifyClient.CloseSend(),
  • returns an error on inotifyClient.Send failure (instead of warn-and-loop), and
  • wraps the spawn site in a 10s reconnect loop so a fresh startInotify is launched against the new client whenever the stream dies — matching what watchGuestAgentEvents already does for the Events stream.
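The three changes above can be sketched as follows. This is a simplified illustration under stated assumptions, not the code in this PR: `stream`, `fakeStream`, `forward`, `supervise`, and `demo` are hypothetical names, the `stop` callback stands in for `notify.Stop(mountWatchCh)`, and the retry interval is shortened to 10ms so the demo finishes quickly (the PR uses 10s).

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// stream is a hypothetical stand-in for the PostInotify client stream.
type stream interface {
	Send(path string) error
	CloseSend() error
}

// fakeStream fails every Send when broken is set, like a stream whose
// guest agent has restarted.
type fakeStream struct{ broken bool }

func (f *fakeStream) Send(string) error {
	if f.broken {
		return errors.New("stream closed")
	}
	return nil
}

func (f *fakeStream) CloseSend() error { return nil }

// forward sends events until ctx is cancelled, events is closed, or
// Send fails. Both cleanups run on every exit path.
func forward(ctx context.Context, events <-chan string, stop func(), s stream) error {
	defer stop()        // stands in for notify.Stop(mountWatchCh)
	defer s.CloseSend() // error ignored in this sketch
	for {
		select {
		case <-ctx.Done():
			return ctx.Err()
		case ev, ok := <-events:
			if !ok {
				return nil
			}
			if err := s.Send(ev); err != nil {
				return fmt.Errorf("inotify send: %w", err)
			}
		}
	}
}

// supervise calls attempt repeatedly, waiting retry between attempts,
// until ctx is done.
func supervise(ctx context.Context, retry time.Duration, attempt func(context.Context) error) {
	for {
		if err := attempt(ctx); err != nil && ctx.Err() == nil {
			fmt.Println("inotify stream ended, retrying:", err)
		}
		select {
		case <-ctx.Done():
			return
		case <-time.After(retry):
		}
	}
}

// demo runs supervise against a stream that is dead on the first
// attempt and healthy on the second. It returns the attempt count and
// how many times the watcher stop function ran.
func demo() (attempts, stops int) {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()
	supervise(ctx, 10*time.Millisecond, func(ctx context.Context) error {
		attempts++
		events := make(chan string, 1)
		events <- "/tmp/file"
		close(events)
		s := &fakeStream{broken: attempts == 1}
		if attempts == 2 {
			defer cancel() // end the demo once the fresh stream has worked
		}
		return forward(ctx, events, func() { stops++ }, s)
	})
	return attempts, stops
}

func main() {
	attempts, stops := demo()
	fmt.Println("attempts:", attempts, "stops:", stops)
}
```

Because the cleanups are deferred, each failed attempt releases its watcher and closes its send side before the next attempt starts, so a reconnect does not accumulate leaked watchers.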

Test plan

  • go build ./...
  • go vet ./pkg/hostagent/...
  • go test ./pkg/hostagent/...
  • Manual: on a Lima instance with mountInotify: true, touch a file on the host before and after running systemctl restart lima-guestagent inside the guest. With the fix, file changes continue to propagate after the restart; on master they stop after the first restart.

… after guest-agent reconnect

startInotify never called notify.Stop on the channel returned to the
rjeczalik/notify watchers, so each writable mount permanently leaked an
inotify_init FD and the library's internal reader goroutine. The
PostInotify gRPC client-stream was likewise never finalized with
CloseSend, leaving the guest-agent handler parked.

Worse, startInotify was spawned exactly once and looped silently on
inotifyClient.Send failures, so once the guest agent restarted the
stream was dead but the goroutine kept "running" with no path to
re-establish it. Host-side file changes silently stopped propagating
into the guest until the next limactl stop && limactl start.

Add `defer notify.Stop` and `defer inotifyClient.CloseSend()`, return an
error on Send failure, and wrap the spawn in a 10s reconnect loop that
mirrors what watchGuestAgentEvents already does for the Events stream.

Signed-off-by: mn-ram <235066282+mn-ram@users.noreply.github.com>
Development

Successfully merging this pull request may close these issues.

hostagent: startInotify leaks watchers + gRPC stream and silently stops working after guest-agent reconnect
