Conversation
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #580 +/- ##
==========================================
- Coverage 67.88% 66.14% -1.74%
==========================================
Files 51 63 +12
Lines 8295 8658 +363
==========================================
+ Hits 5631 5727 +96
- Misses 2664 2931 +267
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
e7c84fb to
b173bca
Compare
|
@aviatesk Does this PR work correctly in Zed? I tried it in Helix for a while, and it does not seem to cause any major issues. It also appears that the original problem has been fixed, so it might be fine to merge it. |
When the client (e.g. Neovim) terminates without sending an `exit` notification, `read_task` exits its loop but `in_msg_queue` remains open. This causes `take!` in `iterate` to block indefinitely. Apply the same sentinel pattern already used for `out_msg_queue`: `read_task` puts `nothing` into `in_msg_queue` on loop exit, and `iterate` checks for it to terminate the server loop gracefully. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
When a client like Neovim terminates without waiting for the shutdown response, `write_task` dies from EPIPE while messages remain in `out_msg_queue`. Since `isready` only checks whether the queue has data and does not consider consumer liveness, `flush` becomes an infinite busy loop (`yield()` forever). Check `istaskdone(endpoint.write_task)` to break out of the loop when there is no consumer alive to drain the queue. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
|
Thanks for the thorough analysis and the fix — I strongly agree with the overall approach. Let's move forward with this. I've confirmed that the normal server lifecycle (Zed, VSCode) is not affected by these changes. Since this code is fairly subtle, I'll push a few additional comments and safeguards for future reference. Once I push those, could you take a look and confirm? Then we can merge. |
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
b173bca to
25bc059
Compare
| # the server loop hangs forever when the input stream closes. | ||
| # Guard with `isopen` since `close(endpoint)` may have already | ||
| # closed the channel during normal shutdown. | ||
| isopen(in_msg_queue) && put!(in_msg_queue, nothing) |
There was a problem hiding this comment.
When read_task exits during normal shutdown, close(endpoint) may have already closed in_msg_queue. Without this guard, put! might throw InvalidStateException on the closed channel. The exception is silently swallowed (since read_task is never waited on), but the guard makes the intent explicit and avoids the unnecessary exception.
|
Also, please update CHANGLEOG.md. This is an important enhancement. |
Summary
Fix a bug where the JETLS process remains running indefinitely when a client such as Neovim terminates without sending the
exitnotification.Two hang points in
Endpointare addressed.Background
In some setups, the client process appears to terminate before the language server returns the ShutdownResponse. This behavior has been observed with Neovim under their default configuration.
In such cases, JETLS is expected to shut down gracefully by monitoring the process Id.
However, two issues in
Endpointprevented this from happening and caused the server process to remain running indefinitely.Where the server hangs
sequenceDiagram participant N as Neovim participant R as read_task participant S as Server loop<br/>(calls iterate) participant W as write_task participant M as processId monitor N->>S: ShutdownRequest S->>W: ShutdownResponse (via out_msg_queue) N-xN: Process exits (no exit notification) Note over N: stdin/stdout/stderr pipes break R-->>R: readlsp returns nothing Note over R: Loop exits W-->>W: writelsp hits EPIPE W-->>W: err_handler writes to stderr Note over W: stderr also broken<br/>-> double EPIPE<br/>-> task dies S->>S: take!(in_msg_queue) Note over S: ❌ Hang 1<br/>Queue still open, blocks forever M->>M: Polls every 60s M->>S: SelfShutdownNotification Note over S: Loop exits -> finally S->>S: close(endpoint) -> flush() Note over S: ❌ Hang 2<br/>isready stays true (write_task dead)<br/>yield() loops forever(Both Mermaid charts were initially generated by Claude; I have manually reviewed them for correctness.)
Hang 1:
take!(in_msg_queue)blocksWhen stdin closes,
read_taskexits its loop, butin_msg_queueremains open. As a result,take!(in_msg_queue)initeratekeeps waiting for a message and blocks indefinitely.The processId monitor (polling every 60 seconds) eventually resolves this by putting
SelfShutdownNotificationinto the queue, but the server still fails to terminate due to Hang 2.Hang 2:
flush(endpoint)loops indefinitelywrite_taskencounters EPIPE when writing to stdout. The code attempts to recover (catch->err_handler->continue), but the defaulterr_handlerwrites to stderr using@erroranddisplay_error.After the client exits, stderr is also a broken pipe. As a result,
err_handleritself throws EPIPE, which is not caught and terminateswrite_task.After the server loop exits,
finallycallsclose(endpoint)->flush(endpoint). Theflushimplementation waits for the queue to drain:However,
isreadyonly checks whether the queue contains data; it does not consider whether a consumer is still alive. Sincewrite_taskhas already terminated, the queue never drains and the loop runs forever. This appears to be one of the main reasons the server hangs.Fixes
Commit 1: Send a sentinel to
in_msg_queueWhen
read_taskexits, it insertsnothingintoin_msg_queue.iteratedetects this sentinel and returnsnothing, which ends the server loop and proceeds to thefinallyblock.Commit 2: Detect
write_tasktermination influshThe
flushloop now checksand exits when the consumer task has already terminated. This allows
close(endpoint)in thefinallyblock to complete and the server process to exit normally.The underlying cause of Hang 2 is the unintended termination of
write_taskdue to the double EPIPE inerr_handler. That issue could be addressed separately by fixingerr_handler. This change instead ensures robustly thatflushdoes not hang regardless of whywrite_taskterminates.After the fix
Before the fix, the process would remain active every time, but now it closes immediately once the editor is shut down!
sequenceDiagram participant N as Neovim participant R as read_task participant S as Server loop<br/>(calls iterate) participant W as write_task N->>S: ShutdownRequest S->>W: ShutdownResponse (via out_msg_queue) N-xN: Process exits (no exit notification) Note over N: stdin/stdout/stderr pipes break R-->>R: readlsp returns nothing Note over R: Loop exits R->>S: put!(in_msg_queue, nothing) Note over R: ✅ Fix 1: sentinel W-->>W: double EPIPE -> task dies S->>S: take! returns nothing Note over S: Loop exits -> finally S->>W: istaskdone? -> true Note over S: ✅ Fix 2: flush breaks Note over S: Process exits normallyRelated issue
This may resolve #561, but there could be other contributing factors, so it may be better to wait before closing it. For example, I noticed that the process can remain when Neovim is closed immediately after startup (this does not seem to occur with other editors).
Notes
Testing every possible scenario is quite challenging, so this has not been verified comprehensively. However, after implementing the fix and using it for a while with Neovim, Helix, and VSCode, I confirmed that processes start, communicate, and terminate correctly.
Feedback from Neovim & other editors would be appreciated.