Skip to content

Fix server crash detection#137

Merged
DRMacIver merged 2 commits intomainfrom
fix/server-crash-detection
Mar 27, 2026
Merged

Fix server crash detection#137
DRMacIver merged 2 commits intomainfrom
fix/server-crash-detection

Conversation

@DRMacIver
Copy link
Copy Markdown
Member

Fix the client hanging when the hegel server process exits unexpectedly:

  • Drop channel senders in the background reader on pipe close, unblocking any waiting threads
  • Add atexit handler to kill the server subprocess on process exit (statics aren't dropped)
  • Handle server exit properly in the event loop instead of .expect()

@DRMacIver DRMacIver force-pushed the fix/server-crash-detection branch 2 times, most recently from fe0cc80 to bd2f62e Compare March 27, 2026 09:23
Clear all channel senders when the background reader detects server exit,
add atexit handler to kill the server process, and replace the event loop's
expect() with proper match-based error handling that detects server crashes.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@DRMacIver DRMacIver force-pushed the fix/server-crash-detection branch from bd2f62e to 92e1eca Compare March 27, 2026 10:52
@DRMacIver DRMacIver marked this pull request as ready for review March 27, 2026 10:53
@DRMacIver DRMacIver marked this pull request as draft March 27, 2026 12:49
@DRMacIver DRMacIver marked this pull request as ready for review March 27, 2026 15:25
Copy link
Copy Markdown
Member

@Liam-DeVoe Liam-DeVoe left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

registering that I think our server handling logic in general is dirty and overly complicated and probably missing some cases, but I am happy to go with "it works" for now

@DRMacIver
Copy link
Copy Markdown
Member Author

registering that I think our server handling logic in general is dirty and overly complicated and probably missing some cases, but I am happy to go with "it works" for now

Yes I agree. I don't like it at all, and I think there are demonstrably bugs lurking - we're getting some intermittent hangs in CI which I'm very suspicious about. I'm hoping that for now we can just cover it in tests until the problems all give up in terror.

Part of the problem is that we've moved to the background thread model for performance and this makes error detection much harder to do. One of the virtues of the cooperative model is that streams just broke under you and you could immediately tell. I think it's worth considering moving back to that but with a better implementation.

@DRMacIver DRMacIver merged commit fe505c7 into main Mar 27, 2026
14 checks passed
@DRMacIver DRMacIver deleted the fix/server-crash-detection branch March 27, 2026 16:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants