fix(core): never let shell exit results hang on the output drain (#25166) by MartinCajiao · Pull Request #27842 · google-gemini/gemini-cli

MartinCajiao · 2026-06-11T01:50:52Z

TLDR

Shell commands could complete while the CLI stayed stuck showing the shell as awaiting input (#25166). The exit result of a PTY execution is gated on the output-processing chain, and that gate had no error handling and no bound: a single chunk that threw anywhere in the rendering pipeline — or whose xterm write callback was never invoked — left the execution unresolved forever. The tool call then never left executing, so activeBackgroundExecutionId stayed set and the UI kept reporting an active shell after the process had already exited.

Failure chain

useShellInactivityStatus shows the awaiting/focus state while activePtyId is set.
For model-initiated commands, activePtyId derives from the executing tool call (useGeminiStream.ts): it clears only when the tool's result promise settles.
That promise settles only in finalize() inside ptyProcess.onExit (shellExecutionService.ts), which ran exclusively through:
```
Promise.race([processingChain.then(() => 'processed'), abortFired]).then(() => {
  finalize();
});
```
Three structural holes:
- a rejected processingChain rejects the race, and with no rejection handler finalize() never runs (the CLI's global unhandledRejection handler logs and continues, so this manifests as a silent hang, not a crash);
- a chunk whose headlessTerminal.write callback never fires (xterm swallows callbacks on disposed/paused terminals; Windows ConPTY keeps flushing data after exit while the PTY is destroyed immediately on exit) leaves the chain pending forever, and no timeout existed;
- finalize() itself could throw (render(true), final serialization), skipping completeWithResult.
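
The first hole can be reproduced in isolation. The following is a minimal sketch, not the code from shellExecutionService.ts: `gateUnguarded`, `gateGuarded`, and `DrainOutcome` are hypothetical names, and `finalize` is passed in as a plain callback. It contrasts the original race shape with one that treats a rejected chain as drained.

```typescript
// Illustrative stand-ins; these are not the real shellExecutionService internals.
type DrainOutcome = 'processed' | 'aborted';

// Pre-fix shape: finalize() runs only when the race fulfils.
function gateUnguarded(
  processingChain: Promise<void>,
  abortFired: Promise<DrainOutcome>,
  finalize: () => void,
): Promise<void> {
  return Promise.race([
    processingChain.then((): DrainOutcome => 'processed'),
    abortFired,
  ]).then(() => {
    finalize();
  });
}

// Guarded shape: a rejected chain counts as drained, and finalize() runs
// whether the race fulfils or rejects.
function gateGuarded(
  processingChain: Promise<void>,
  abortFired: Promise<DrainOutcome>,
  finalize: () => void,
): Promise<void> {
  const drained = processingChain.then(
    (): DrainOutcome => 'processed',
    (): DrainOutcome => 'processed',
  );
  return Promise.race([drained, abortFired]).then(
    () => finalize(),
    () => finalize(),
  );
}
```

With a chain that rejects and an abort promise that never settles, `gateUnguarded` returns a rejected promise and never calls `finalize`, while `gateGuarded` calls it exactly once.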

What changed

Commit 1 — pure correctness, no behavior change on the happy path:

- every output-chunk link settles even if it throws (try/catch around the chunk executor, try/finally around the write callback), with a debug log instead of a poisoned chain;
- the drain race treats a rejected chain as drained and calls finalize() on both race outcomes;
- finalize() is idempotent and throw-proof end to end: a failure while rendering or serializing the final buffer degrades the captured output instead of hanging the execution;
- the deferred (debounced) render is guarded: it runs in a 68ms timer outside any caller's try/catch, so a throw there was an uncaught exception that killed the whole CLI. This is a sibling failure mode of the same unguarded rendering pipeline, surfaced by the regression tests for this change.
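
As a rough illustration of the first and third points, here is a sketch under stated assumptions: `TerminalLike`, `createOutputChain`, and `makeFinalize` are hypothetical names, the `errors` array stands in for the debug log, and the inner callback uses try/catch/finally so the sketch does not rethrow into the terminal. The actual change in shellExecutionService.ts may be structured differently.

```typescript
// Hypothetical minimal terminal surface; the real headless terminal is richer.
interface TerminalLike {
  write(data: string, callback?: () => void): void;
}

interface OutputChain {
  enqueue(chunk: string): void;
  drained(): Promise<void>;
  readonly errors: unknown[];
}

function createOutputChain(
  terminal: TerminalLike,
  afterWrite: () => void,
): OutputChain {
  let chain: Promise<void> = Promise.resolve();
  const errors: unknown[] = []; // stands in for the debug log

  const processChunk = (chunk: string): Promise<void> =>
    new Promise<void>((resolve) => {
      try {
        terminal.write(chunk, () => {
          try {
            afterWrite(); // for example, schedule a render
          } catch (err) {
            errors.push(err);
          } finally {
            resolve(); // the link settles even if post-write work throws
          }
        });
      } catch (err) {
        errors.push(err); // record the failure instead of poisoning the chain
        resolve();
      }
    });

  return {
    enqueue: (chunk) => {
      chain = chain.then(() => processChunk(chunk));
    },
    drained: () => chain,
    errors,
  };
}

// Idempotent, throw-proof finalize: a failing render degrades the output
// but never skips the completion callback.
function makeFinalize(
  render: () => string,
  completeWithResult: (output: string) => void,
): () => void {
  let finalized = false;
  return () => {
    if (finalized) return;
    finalized = true;
    let output = '';
    try {
      output = render();
    } catch {
      // keep the degraded (empty) output
    }
    completeWithResult(output);
  };
}
```

Note that this sketch still waits forever if the terminal never invokes the write callback; that case is what the bounded drain in the second commit covers.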

Commit 2 — bounded drain (idle watchdog):

after exit, if no chunk settles for a full DRAIN_STALL_TIMEOUT_MS window (2s, polled at 250ms, unref'd and cleared on finalize), the execution finalizes with the output buffered so far and logs a warning. The watchdog is idle-based — every settled chunk resets the window — so a slow but advancing drain (large final bursts against a 300k-line scrollback) is never cut short; only a genuinely stuck chain trips it.
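
An idle watchdog of this kind could look roughly like the sketch below. It is an illustration, not the PR's code: `createDrainWatchdog`, `pollIntervalMs`, and the exposed `poll` hook are hypothetical, and the clock is injectable only so the behavior can be exercised deterministically. The 2s window and 250ms poll from the description would be passed in as options.

```typescript
interface DrainWatchdogOptions {
  stallTimeoutMs: number; // e.g. 2000, the DRAIN_STALL_TIMEOUT_MS window
  pollIntervalMs: number; // e.g. 250
  now?: () => number; // monotonic clock; defaults to performance.now()
}

interface DrainWatchdog {
  markActivity(): void; // call whenever a chunk settles
  poll(): void; // one stall check; also run by the interval
  stop(): void; // call from finalize()
  readonly stalled: Promise<'drain-stalled'>;
}

function createDrainWatchdog(options: DrainWatchdogOptions): DrainWatchdog {
  const now = options.now ?? (() => performance.now());
  let lastActivityAt = now();

  let resolveStalled: (outcome: 'drain-stalled') => void = () => {};
  const stalled = new Promise<'drain-stalled'>((resolve) => {
    resolveStalled = resolve;
  });

  const poll = (): void => {
    // Idle-based: only the time since the last settled chunk matters.
    if (now() - lastActivityAt >= options.stallTimeoutMs) {
      clearInterval(timer);
      resolveStalled('drain-stalled');
    }
  };

  const timer = setInterval(poll, options.pollIntervalMs);
  // On Node, unref the timer so it cannot keep the process alive.
  (timer as unknown as { unref?: () => void }).unref?.();

  return {
    markActivity: () => {
      lastActivityAt = now();
    },
    poll,
    stop: () => clearInterval(timer),
    stalled,
  };
}
```

In use, `stalled` would be raced against the processing chain and the abort signal, with `stop()` called from `finalize()` so the interval is always cleared.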

The exit result now always reaches the scheduler; in the worst pathological case the trailing render is degraded, never the exit code, and the stall is logged for diagnosis.

Tests

shellExecutionService.test.ts (existing harness, real headless terminal):
- rendering throws while processing output → result still resolves with the exit code and the buffer-extracted output;
- a chunk throws before reaching the terminal → result still resolves, warning logged.
shellExecutionService.drain.test.ts (new, controllable terminal mock):
- a write callback that is never invoked → watchdog finalizes after the stall window, warning logged;
- a slow drain that keeps making progress past the stall window in total time → never cut short, no warning;
- nothing left to drain → resolves immediately, no watchdog side effects.
Red/green: the two resilience tests and the stuck-callback test hang (time out) against main and pass with this change.
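
The description does not show the mock used in shellExecutionService.drain.test.ts. As a hedged sketch of what a controllable terminal mock could look like, the hypothetical `createControllableTerminal` below holds every write callback until the test releases it, which is enough to model both a stuck callback and a slow but advancing drain.

```typescript
// Hypothetical test double; the mock in the actual test file may differ.
interface ControllableTerminal {
  write(data: string, callback?: () => void): void;
  releaseNext(): boolean; // invoke the oldest held callback, if any
  pendingCount(): number;
}

function createControllableTerminal(): ControllableTerminal {
  const pending: Array<() => void> = [];
  return {
    write(_data, callback) {
      // Hold the callback instead of invoking it, so the test decides
      // when (or whether) the write is acknowledged.
      if (callback) pending.push(callback);
    },
    releaseNext() {
      const callback = pending.shift();
      if (callback === undefined) return false;
      callback();
      return true;
    },
    pendingCount: () => pending.length,
  };
}
```

Never calling `releaseNext()` models the swallowed callback; calling it at intervals shorter than the stall window models the drain that must not be cut short.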

Known boundaries (intentionally out of scope)

- If node-pty never emits onExit, nothing in this file can recover, because the watchdog lives inside the exit handler. That variant, if it exists in the wild, needs process-liveness tracking and separate evidence.
- The child_process fallback path waits on close (stdio drain), which a grandchild holding the pipes can delay indefinitely on Windows. This is the same symptom family with a different mechanism, and it is not the default path (enableInteractiveShell defaults to true). Happy to follow up separately.

Fixes #25166

The exit result of a PTY execution is gated on the output processing chain: finalize() - the only path that resolves the result - ran exclusively through Promise.race(processingChain, abortFired) with no rejection handling. A chunk that threw anywhere in the rendering pipeline poisoned the chain, the race rejected, and finalize() never ran. The tool call then stayed `executing` forever and the UI kept reporting the shell as awaiting input after the process had already exited (google-gemini#25166); the global unhandledRejection handler logs and continues, so this manifested as a silent hang rather than a crash.

- Settle every output chunk even when it throws (try/catch around the chunk executor, try/finally around the terminal write callback), logging instead of poisoning the chain.
- Treat a rejected chain as drained and run finalize() on both race outcomes.
- Make finalize() idempotent and throw-proof: failures while rendering or serializing the final buffer degrade the captured output instead of skipping completeWithResult().
- Guard the deferred (debounced) render: it runs in a timer outside any caller try/catch, where a throw becomes an uncaught exception that kills the CLI.

Even with every chunk settling, the exit result still waits on the output chain draining through the headless terminal, and a write callback that is never invoked (xterm swallows callbacks on disposed or paused terminals; Windows ConPTY keeps flushing data after exit while the PTY is destroyed immediately) used to leave the execution unresolved forever - the visible symptom of google-gemini#25166. After exit, an idle watchdog now polls drain progress: every settled chunk resets the window, so a slow but advancing drain (large final bursts against a 300k-line scrollback) is never cut short, and only a genuinely stuck chain trips it. When it fires, the execution finalizes with the output buffered so far and logs a warning so field reports can confirm which vector was hit. The interval is unref'd and cleared by finalize().

gemini-code-assist · 2026-06-11T01:51:02Z

Summary of Changes

Hello, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request addresses a critical issue where shell executions could hang indefinitely in an 'awaiting input' state after the process had actually exited. The root cause was an unguarded output-processing chain that would block the finalization of the execution if any chunk failed or if the terminal write callback was never invoked. The changes introduce a robust, fault-tolerant processing pipeline and an idle-based watchdog that guarantees the execution lifecycle completes, ensuring the UI accurately reflects the process status.

Highlights

- Resilience in Output Processing: Added comprehensive error handling to the shell output processing chain, ensuring that exceptions during rendering or serialization do not leave the execution in a permanently hung state.
- Exit Drain Watchdog: Implemented an idle-based watchdog timer that monitors the output drain process after a shell exits, forcing finalization if the drain stalls for more than 2 seconds.
- Idempotent Finalization: Refactored the finalization logic to be idempotent and robust, ensuring the CLI always completes the execution lifecycle even if final rendering or serialization fails.
- Regression Testing: Added new unit tests in shellExecutionService.drain.test.ts and regression cases in shellExecutionService.test.ts to verify behavior under stalled drain conditions and rendering failures.


¹ Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

github-actions · 2026-06-11T01:51:05Z

📊 PR Size: size/L

Lines changed: 551
Additions: +492
Deletions: -59
Files changed: 3

gemini-code-assist

Code Review

This pull request introduces a post-exit output drain watchdog to resolve issues where shell executions could hang indefinitely due to stalled output processing or swallowed terminal write callbacks. It adds robust error handling around the rendering pipeline and chunk processing to ensure that failures do not block exit finalization. Additionally, new unit tests are added to verify exit finalization resilience and the watchdog behavior. The reviewer feedback suggests replacing Date.now() with performance.now() to provide a monotonic clock source, ensuring that system clock adjustments do not cause premature or delayed timeouts.

gemini-code-assist · 2026-06-11T01:54:02Z

+      let lastDrainActivityAt = Date.now();
+      const markDrainActivity = () => {
+        lastDrainActivityAt = Date.now();
+      };


Using Date.now() for measuring elapsed time or timeouts can be unreliable if the system clock is adjusted (e.g., via NTP synchronization, VM migration, or manual changes). If the clock jumps forward, it can cause premature timeouts; if it jumps backward, it can delay the timeout.

To ensure robustness, prefer using performance.now(), which provides a monotonic clock that is guaranteed to only increase and is immune to system clock adjustments.

Suggested change

-      let lastDrainActivityAt = Date.now();
-      const markDrainActivity = () => {
-        lastDrainActivityAt = Date.now();
-      };
+      let lastDrainActivityAt = performance.now();
+      const markDrainActivity = () => {
+        lastDrainActivityAt = performance.now();
+      };

gemini-code-assist · 2026-06-11T01:54:02Z

+              if (Date.now() - lastDrainActivityAt >= DRAIN_STALL_TIMEOUT_MS) {
+                res('drain-stalled');
+              }


Use performance.now() instead of Date.now() to ensure monotonic time measurement, preventing issues caused by system clock adjustments.

Suggested change

-              if (Date.now() - lastDrainActivityAt >= DRAIN_STALL_TIMEOUT_MS) {
-                res('drain-stalled');
-              }
+              if (performance.now() - lastDrainActivityAt >= DRAIN_STALL_TIMEOUT_MS) {
+                res('drain-stalled');
+              }

Addresses review feedback: Date.now() is wall-clock time, so an NTP adjustment or VM migration could fire the stall watchdog prematurely (clock jumps forward) or delay it (clock jumps backward). performance.now() is monotonic and immune to clock adjustments. The wall-clock Date.now() uses for history timestamps are untouched.
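
The difference between the two clocks can be shown with simulated values. This is a small sketch, assuming a hypothetical `hasStalled` helper and made-up clock readings: 100 ms of real time passes while the wall clock is stepped forward by one hour, as an NTP correction or VM migration might do.

```typescript
type Clock = () => number;

// Same comparison as the stall check, with the clock passed in.
function hasStalled(now: Clock, lastActivityAt: number, timeoutMs: number): boolean {
  return now() - lastActivityAt >= timeoutMs;
}

// Simulated readings; none of these values come from the PR.
const monotonicStart = 5_000;
const wallStart = 1_700_000_000_000;
const realElapsedMs = 100;
const wallClockStepMs = 60 * 60 * 1000; // wall clock adjusted forward by 1 h

const monotonicNow: Clock = () => monotonicStart + realElapsedMs;
const wallNow: Clock = () => wallStart + realElapsedMs + wallClockStepMs;

// The monotonic reading sees only the 100 ms that actually elapsed.
const stalledByMonotonic = hasStalled(monotonicNow, monotonicStart, 2_000);
// The wall-clock reading includes the adjustment and reports a stall.
const stalledByWallClock = hasStalled(wallNow, wallStart, 2_000);
```

Here `stalledByMonotonic` is false and `stalledByWallClock` is true, which is the premature-timeout case the review comment describes.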

MartinCajiao · 2026-06-11T01:59:30Z

Both suggestions addressed in 5a0083b: the drain watchdog now uses the monotonic clock (performance.now()) for both the activity marker and the stall check, so NTP/wall-clock adjustments can neither fire it prematurely nor delay it. The Date.now() uses for history timestamps are intentionally unchanged (those are genuine wall-clock values). Full suite still green: 71/71, typecheck clean.

MartinCajiao added 2 commits June 10, 2026 19:36

MartinCajiao requested a review from a team as a code owner June 11, 2026 01:50

github-actions Bot added the size/l (A large sized PR) label Jun 11, 2026

gemini-code-assist Bot reviewed Jun 11, 2026

gemini-cli Bot added the priority/p1 (Important and should be addressed in the near term.), area/core (Issues related to User Interface, OS Support, Core Functionality), and 🔒 maintainer only (⛔ Do not contribute. Internal roadmap item.) labels Jun 11, 2026

MartinCajiao wants to merge 3 commits into google-gemini:main from MartinCajiao:fix/shell-exit-stuck-awaiting-input
