Fix Windows terminal Ctrl-C process cleanup by neubig · Pull Request #3171 · OpenHands/software-agent-sdk

neubig · 2026-05-09T03:13:20Z

Summary

terminate descendant processes for the Windows PowerShell terminal backend when handling Ctrl-C interrupts
also clean up descendant processes when closing/resetting the Windows terminal session
add a Windows regression test showing a timed-out child process tree is stopped after C-c

Tests

uv run pytest tests/tools/terminal/test_windows_ctrl_c.py tests/tools/terminal/test_send_keys.py -q
uv run pre-commit run --files openhands-tools/openhands/tools/terminal/terminal/windows_terminal.py tests/tools/terminal/test_windows_ctrl_c.py

This pull request was created by an AI agent (OpenHands) on behalf of the user.

Agent Server images for this PR

• GHCR package: https://github.com/OpenHands/agent-sdk/pkgs/container/agent-server

Variants & Base Images

Variant	Architectures	Base Image	Docs / Tags
java	amd64, arm64	`eclipse-temurin:17-jdk`	Link
python	amd64, arm64	`nikolaik/python-nodejs:python3.13-nodejs22-slim`	Link
golang	amd64, arm64	`golang:1.21-bookworm`	Link

Pull (multi-arch manifest)

# Each variant is a multi-arch manifest supporting both amd64 and arm64
docker pull ghcr.io/openhands/agent-server:0090c24-python

Run

docker run -it --rm \
  -p 8000:8000 \
  --name agent-server-0090c24-python \
  ghcr.io/openhands/agent-server:0090c24-python

All tags pushed for this build

ghcr.io/openhands/agent-server:0090c24-golang-amd64
ghcr.io/openhands/agent-server:0090c2445b34a668a45f2454b8d3667351e10e2b-golang-amd64
ghcr.io/openhands/agent-server:windows-terminal-backend-golang-amd64
ghcr.io/openhands/agent-server:0090c24-golang_tag_1.21-bookworm-amd64
ghcr.io/openhands/agent-server:0090c24-golang-arm64
ghcr.io/openhands/agent-server:0090c2445b34a668a45f2454b8d3667351e10e2b-golang-arm64
ghcr.io/openhands/agent-server:windows-terminal-backend-golang-arm64
ghcr.io/openhands/agent-server:0090c24-golang_tag_1.21-bookworm-arm64
ghcr.io/openhands/agent-server:0090c24-java-amd64
ghcr.io/openhands/agent-server:0090c2445b34a668a45f2454b8d3667351e10e2b-java-amd64
ghcr.io/openhands/agent-server:windows-terminal-backend-java-amd64
ghcr.io/openhands/agent-server:0090c24-eclipse-temurin_tag_17-jdk-amd64
ghcr.io/openhands/agent-server:0090c24-java-arm64
ghcr.io/openhands/agent-server:0090c2445b34a668a45f2454b8d3667351e10e2b-java-arm64
ghcr.io/openhands/agent-server:windows-terminal-backend-java-arm64
ghcr.io/openhands/agent-server:0090c24-eclipse-temurin_tag_17-jdk-arm64
ghcr.io/openhands/agent-server:0090c24-python-amd64
ghcr.io/openhands/agent-server:0090c2445b34a668a45f2454b8d3667351e10e2b-python-amd64
ghcr.io/openhands/agent-server:windows-terminal-backend-python-amd64
ghcr.io/openhands/agent-server:0090c24-nikolaik_s_python-nodejs_tag_python3.13-nodejs22-slim-amd64
ghcr.io/openhands/agent-server:0090c24-python-arm64
ghcr.io/openhands/agent-server:0090c2445b34a668a45f2454b8d3667351e10e2b-python-arm64
ghcr.io/openhands/agent-server:windows-terminal-backend-python-arm64
ghcr.io/openhands/agent-server:0090c24-nikolaik_s_python-nodejs_tag_python3.13-nodejs22-slim-arm64
ghcr.io/openhands/agent-server:0090c24-golang
ghcr.io/openhands/agent-server:0090c2445b34a668a45f2454b8d3667351e10e2b-golang
ghcr.io/openhands/agent-server:windows-terminal-backend-golang
ghcr.io/openhands/agent-server:0090c24-golang_tag_1.21-bookworm
ghcr.io/openhands/agent-server:0090c24-java
ghcr.io/openhands/agent-server:0090c2445b34a668a45f2454b8d3667351e10e2b-java
ghcr.io/openhands/agent-server:windows-terminal-backend-java
ghcr.io/openhands/agent-server:0090c24-eclipse-temurin_tag_17-jdk
ghcr.io/openhands/agent-server:0090c24-python
ghcr.io/openhands/agent-server:0090c2445b34a668a45f2454b8d3667351e10e2b-python
ghcr.io/openhands/agent-server:windows-terminal-backend-python
ghcr.io/openhands/agent-server:0090c24-nikolaik_s_python-nodejs_tag_python3.13-nodejs22-slim

About Multi-Architecture Support

Each variant tag (e.g., 0090c24-python) is a multi-arch manifest supporting both amd64 and arm64
Docker automatically pulls the correct architecture for your platform
Individual architecture tags (e.g., 0090c24-python-amd64) are also available if needed

Terminate descendant processes when interrupting or closing the Windows PowerShell backend so timed-out child process trees do not survive Ctrl-C/reset. Co-authored-by: openhands <openhands@all-hands.dev>

github-actions · 2026-05-09T03:24:11Z

Python API breakage checks — ✅ PASSED

Result: ✅ PASSED

Action log

github-actions · 2026-05-09T03:24:29Z

REST API breakage checks (OpenAPI) — ✅ PASSED

Result: ✅ PASSED

Action log

github-actions · 2026-05-09T03:30:13Z

Coverage Report

File	Stmts	Miss	Cover	Missing
TOTAL	26271	13139	49%

report-only-changed-files is enabled. No files were changed during this commit :)

all-hands-bot

Taste Rating: 🟢 Good taste - Clean, targeted fix for a real Windows process cleanup bug.

Code Quality Assessment

This is a well-implemented fix for a legitimate bug where Windows PowerShell child processes aren't cleaned up after timeouts. The implementation is excellent:

PowerShell script logic: Correctly enumerates process tree using Win32_Process, builds parent-child map, recursively collects descendants, and terminates in reverse order (leaves first)
Safety: Excludes the script's own process ($PID), uses -ErrorAction SilentlyContinue, has proper timeout handling
Interrupt flow: Sensible fallback logic (try CTRL_BREAK → wait → terminate children → Ctrl-C input as last resort)
Testing: Good regression test that verifies real subprocess tree cleanup (not mocks)
Error handling: Appropriate try/except and return value semantics
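
The tree walk described in the first bullet can be sketched in Python. This is a minimal illustration only, since the PR does this work inside a PowerShell script. The `(pid, parent_pid)` pairs stand in for `Win32_Process` rows, and `collect_descendants`, `termination_order`, and the `exclude` parameter are hypothetical names that do not appear in the PR. Skipping an excluded PID also skips its subtree here, which is a simplifying assumption.

```python
from collections import defaultdict


def collect_descendants(processes, root_pid, exclude=()):
    """Return descendant PIDs of root_pid, each parent before its children.

    `processes` is an iterable of (pid, parent_pid) pairs, a stand-in for
    the rows a Win32_Process query would return (an assumption of this sketch).
    """
    # Build the parent -> children map.
    children = defaultdict(list)
    for pid, parent_pid in processes:
        children[parent_pid].append(pid)

    excluded = set(exclude)
    ordered = []
    seen = {root_pid}  # guards against cycles caused by PID reuse
    stack = [root_pid]
    while stack:
        current = stack.pop()
        for child in children.get(current, []):
            if child in seen or child in excluded:
                continue
            seen.add(child)
            ordered.append(child)
            stack.append(child)
    return ordered


def termination_order(processes, root_pid, exclude=()):
    """Reverse discovery order so every child is stopped before its parent."""
    return list(reversed(collect_descendants(processes, root_pid, exclude)))
```

Reversing the discovery order is what gives the "leaves first" property: a process is only discovered through its parent, so in the reversed list it always precedes that parent.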

Eval Risk Flag

[RISK ASSESSMENT]

[Overall PR] ⚠️ Risk Assessment: 🟡 MEDIUM

Per repository guidelines, terminal/stdin/stdout handling changes require eval validation before approval. While the actual risk is low (Windows-specific bug fix, most evals run on Linux, makes timeout behavior work as intended), it falls under the eval-review category.

Recommendation: Maintainer should confirm whether Windows eval validation is needed, or approve if Windows is not a critical eval platform. The code itself is production-ready.

VERDICT:
✅ Code quality: Excellent
⏸️ Merge decision: Deferred to maintainer for eval policy decision on Windows terminal changes

Improve this review? If any feedback above seems incorrect or irrelevant to this repository, you can teach the reviewer to do better:

Add a .agents/skills/custom-codereview-guide.md file to your branch (or edit it if one already exists) with the /codereview trigger and the context the reviewer is missing (e.g., "Security concerns about X do not apply here because Y"). See the customization docs for the required frontmatter format.

Re-request a review - the reviewer reads guidelines from the PR branch, so your changes take effect immediately.

When your PR is merged, the guideline file goes through normal code review by repository maintainers.

Resolve with AI? Install the iterate skill in your agent and run /iterate to automatically drive this PR through CI, review, and QA until it's merge-ready.

all-hands-bot

✅ QA Report: PASS WITH LIMITATIONS

Windows-specific functionality verified through CI; unable to test directly on Linux.

Does this PR achieve its stated goal?

Yes. The PR successfully fixes Windows terminal Ctrl-C process cleanup. CI evidence shows the new regression test test_windows_ctrl_c_interrupt_kills_child_process_tree passed on Windows, verifying that child processes are now properly terminated when Ctrl-C is sent after a timeout. The test creates a child PowerShell process that sleeps for 120 seconds, triggers a timeout, sends Ctrl-C, and confirms the child process is stopped—exactly the behavior the PR promises to deliver.

Phase	Result
Environment Setup	✅ Dependencies synced successfully
CI Status	✅ All checks passing (windows-tests: PASS, tools-tests: PASS)
Functional Verification	⚠️ Verified via Windows CI logs; direct execution not possible on Linux

Functional Verification

Test 1: Windows Ctrl-C Regression Test (via CI)

Platform constraint:
QA environment is Linux. The PR changes are Windows-specific (PowerShell terminal backend). The test is correctly marked with:

pytestmark = pytest.mark.skipif(
    platform.system() != "Windows",
    reason="Windows CTRL_BREAK/PowerShell process behavior only applies on Windows",
)

Verification approach:
Inspected CI logs for the windows-tests workflow run.

Evidence from CI logs:

windows-tests Run Windows test suite 2026-05-09T03:26:07.1850999Z tests/tools/terminal/test_windows_ctrl_c.py::test_windows_ctrl_c_interrupt_kills_child_process_tree PASSED

Interpretation:
The new test passed on Windows. This test:

Creates a PowerShell script that spawns a child process (Start-Sleep -Seconds 120)
Executes the script and waits for a timeout
Sends Ctrl-C via session.execute(TerminalAction(command="C-c", is_input=True))
Asserts the child process was terminated: assert not child_was_still_running

The test passing proves the fix works: child processes are now properly cleaned up.

Test 2: Linux Test Skip Behavior

Ran on Linux:

uv run pytest tests/tools/terminal/test_windows_ctrl_c.py -v

Output:

tests/tools/terminal/test_windows_ctrl_c.py::test_windows_ctrl_c_interrupt_kills_child_process_tree SKIPPED (Windows CTRL_BREAK/PowerShell process behavior only applies on Windows)

Interpretation:
Test is correctly skipped on non-Windows platforms. No false failures.

Test 3: No Regressions in send_keys Tests

Ran:

uv run pytest tests/tools/terminal/test_send_keys.py -v

Output:

23 passed, 5 warnings in 7.83s

Interpretation:
All send_keys tests passed. The changes to interrupt() and addition of _terminate_child_processes() did not break existing Ctrl-C handling or special key functionality.

Test 4: Code Quality Checks

Ran:

uv run pre-commit run --files openhands-tools/openhands/tools/terminal/terminal/windows_terminal.py tests/tools/terminal/test_windows_ctrl_c.py

Output:

Ruff format..............................................................Passed
Ruff lint................................................................Passed
PEP8 style check (pycodestyle)...........................................Passed
Type check with pyright..................................................Passed
Check import dependency rules............................................Passed
Check Tool subclass registration.........................................Passed

Interpretation:
All style, lint, and type checks passed. Code adheres to project standards.

Unable to Verify

Windows-Specific Behavior on Linux

What could not be verified:

Direct execution of the Windows terminal backend on Linux
Manual testing of the PowerShell child process termination logic
Interactive verification of Ctrl-C behavior with real PowerShell processes

What was attempted:
Attempted to run test_windows_ctrl_c_interrupt_kills_child_process_tree on Linux, but it was correctly skipped due to platform restrictions.

Why verification failed:
The QA environment runs on Linux. The PR's changes are specific to Windows PowerShell terminal behavior and require:

Windows OS
PowerShell (powershell.exe)
Windows process management APIs (CTRL_BREAK_EVENT, Win32_Process)

Suggested AGENTS.md guidance:
For future QA of Windows-specific terminal features:

Rely on CI windows-tests workflow results as the primary verification
Ensure the Windows CI runner is configured with PowerShell
For local verification, use a native Windows development environment or a Windows VM (Python running under WSL reports Linux, so the Windows-only test is still skipped)
Consider adding a smoke test that can run on Linux but validates the code structure (e.g., ensuring _terminate_child_processes() returns False on non-Windows platforms)

Evidence relied upon instead:
CI logs from the windows-tests workflow run confirming the new test passed.

Issues Found

None.

Verdict: This PR successfully delivers on its stated goal. The Windows CI runner verified the fix works as intended, and all code quality checks passed. The inability to test directly on Linux is a platform limitation, not a deficiency in the PR.

neubig · 2026-05-09T19:17:53Z

@OpenHands fix the code review comment

openhands-ai · 2026-05-09T19:18:09Z

I'm on it! neubig can track my progress at all-hands.dev

Co-authored-by: openhands <openhands@all-hands.dev>

openhands-ai · 2026-05-09T19:21:38Z

OpenHands encountered an error: Request timeout after 30 seconds to https://xyftbatlgupjregx.prod-runtime.all-hands.dev/api/conversations/6d45534e-99d8-4bf9-a7f6-934567fb7635/ask_agent

See the conversation for more information.

neubig · 2026-05-10T15:00:00Z

@OpenHands the windows CI on this PR seems to be failing, perhaps due to a timeout. Double-check current and previous CI runs for windows-tests, diagnose the problem, and /iterate until CI passes and the PR is ready to merge.

openhands-ai · 2026-05-10T15:00:18Z

I'm on it! neubig can track my progress at all-hands.dev

Co-authored-by: openhands <openhands@all-hands.dev>

neubig · 2026-05-10T16:02:39Z

Update on the Windows CI timeout investigation:

Confirmed the prior Windows run on 20820742 passed, while the current branch timed out repeatedly around tests/tools/browser_use/test_browser_toolset.py before the terminal regression test was reached.
Pushed a small test cleanup fix so the mocked BrowserToolExecutor in the browser toolset tests does not run real cleanup/finalizer logic against mocked attributes.
The latest SHA 0090c244 is now green: 30 checks passed, 1 skipped, 0 pending, 0 failed. In particular, Run tests / windows-tests passed.
PR is mergeable with no unresolved review threads; the remaining GitHub state is REVIEW_REQUIRED.

This comment was created by an AI agent (OpenHands) on behalf of the user.

openhands-ai · 2026-05-10T16:03:30Z

OpenHands encountered an error: Request timeout after 30 seconds to https://nbsmvawoiyeenyvd.prod-runtime.all-hands.dev/api/conversations/59d3dd67-977a-496e-97ed-40539bba9013/ask_agent

See the conversation for more information.

all-hands-bot

Taste Rating: 🟢 Good taste - Clean, targeted fix for a real Windows process cleanup bug.

The implementation is solid and well-tested. Flagging for eval verification before merge.

[EVAL-RISK FLAG]

This PR modifies terminal interrupt behavior (how commands are stopped after timeouts), which falls into the eval-risk category per repo guidelines. While the fix is correct and necessary, I recommend running lightweight evals (SWE-bench lite or similar) to verify no regression in agent behavior during command timeouts.

The risk: previously timed-out commands left child processes running, which could affect subsequent operations. With proper cleanup, timing and success rates might change.

[RISK ASSESSMENT]

[Overall PR] ⚠️ Risk Assessment: 🟡 MEDIUM

Changes core terminal interrupt behavior on Windows. Implementation is sound and addresses a real bug (child processes not cleaned up after timeout/interrupt). Risk is medium because: (1) affects command execution flow, (2) could impact benchmark timing, (3) new process termination logic enumerates all system processes (minor performance consideration). Good test coverage reduces risk.

VERDICT:
✅ Worth merging after eval verification

KEY INSIGHT:
Fixes Windows' lack of process group signals by explicitly walking and terminating the descendant process tree - necessary but adds system-wide process enumeration during interrupts.

all-hands-bot · 2026-05-11T14:37:53Z

+    Stop-Process -Id $toStop[$i] -Force -ErrorAction SilentlyContinue
+}}
+if ($toStop.Count -gt 0) {{ exit 0 }} else {{ exit 1 }}
+"""


🟡 Suggestion: Get-CimInstance Win32_Process enumerates all processes on the system. On machines with thousands of processes this could take 1-2+ seconds. Consider documenting expected latency in the method docstring, or adding a timeout check if this becomes a production issue.

Not blocking - this only runs during interrupt/close, so the overhead is acceptable for fixing the cleanup bug.
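
One way to act on the latency suggestion is to bound and measure the helper call. The sketch below is an illustration under assumptions, not the PR's code: `run_with_deadline` is a hypothetical helper, and the command it runs here is a stand-in for the PowerShell enumeration script.

```python
import subprocess
import sys
import time


def run_with_deadline(argv, timeout_seconds):
    """Run a helper command and return (succeeded, elapsed_seconds).

    A timeout or a launch failure counts as "not succeeded" rather than
    raising, so the caller can fall back to another cleanup path.
    """
    started = time.monotonic()
    try:
        completed = subprocess.run(
            argv,
            capture_output=True,
            timeout=timeout_seconds,
            check=False,
        )
        succeeded = completed.returncode == 0
    except (subprocess.TimeoutExpired, OSError):
        succeeded = False
    return succeeded, time.monotonic() - started


if __name__ == "__main__":
    # Stand-in for the PowerShell helper: any short-lived command works here.
    ok, elapsed = run_with_deadline([sys.executable, "-c", "pass"], timeout_seconds=5.0)
    print(f"succeeded={ok} elapsed={elapsed:.3f}s")
```

Returning the elapsed time alongside the result would let a caller log slow enumerations without changing the interrupt behavior.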

all-hands-bot · 2026-05-11T14:38:33Z

Improve this review? If any feedback above seems incorrect or irrelevant to this repository, you can teach the reviewer to do better:

Edit .agents/skills/custom-codereview-guide.md on your branch with additional context the reviewer is missing (e.g., "Performance concerns about X do not apply here because Y"). See the customization docs for the required frontmatter format.

Re-request a review - the reviewer reads guidelines from the PR branch, so your changes take effect immediately.

When your PR is merged, the guideline file goes through normal code review by repository maintainers.

Resolve with AI? Install the iterate skill in your agent and run /iterate to automatically drive this PR through CI, review, and QA until it's merge-ready.

all-hands-bot

✅ QA Report: PASS

Windows Ctrl-C process cleanup verified through Windows CI test execution; Linux environment constraint prevents direct testing.

Does this PR achieve its stated goal?

Yes. The PR successfully fixes Windows terminal Ctrl-C process cleanup by terminating descendant processes when interrupts are sent or sessions are closed. The regression test test_windows_ctrl_c_interrupt_kills_child_process_tree PASSED on the Windows CI runner, proving that child processes spawned during command execution are now properly terminated after Ctrl-C, which was the core issue this PR set out to solve. The test creates a realistic scenario: a timed-out PowerShell script that spawns a sleeping child process, sends Ctrl-C, and verifies the child is stopped—exactly the behavior promised in the PR description.

Phase	Result
Environment Setup	✅ Dependencies synced, Python 3.13.13, uv-managed monorepo
CI Status	✅ All checks passing: windows-tests PASS, tools-tests PASS, pre-commit PASS
Functional Verification	✅ Windows test passed in CI; Linux correctly skipped; implementation verified

Functional Verification

Verification 1: Windows-Specific Test Execution via CI

Platform constraint:
QA runs on Linux (uname -a → Linux runnervmeorf1 6.17.0-1010-azure). This PR modifies Windows PowerShell terminal backend behavior that requires:

Windows OS
powershell.exe
Windows process management APIs (CTRL_BREAK_EVENT, Win32_Process, Get-CimInstance)

Approach:
Inspect CI logs from the windows-tests workflow run to verify the new regression test executed and passed on actual Windows.

Step 1 — Confirm test is Windows-only:
Ran locally on Linux:

uv run pytest tests/tools/terminal/test_windows_ctrl_c.py -v

Output:

tests/tools/terminal/test_windows_ctrl_c.py::test_windows_ctrl_c_interrupt_kills_child_process_tree SKIPPED (Windows CTRL_BREAK/PowerShell process behavior only applies on Windows)

Interpretation: Test correctly skips on Linux via pytest.mark.skipif(platform.system() != "Windows"). No false positives on wrong platforms.

Step 2 — Verify test passed on Windows CI:
Queried GitHub PR checks:

gh pr view 3171 --json statusCheckRollup --jq '.statusCheckRollup[] | select(.name == "windows-tests")'

Result:

{"conclusion": "SUCCESS", "name": "windows-tests"}

Fetched Windows CI logs:

gh run view 25633045051 --log | grep test_windows_ctrl_c_interrupt_kills_child_process_tree

Output:

windows-tests   2026-05-10T15:53:14.7161408Z tests/tools/terminal/test_windows_ctrl_c.py::test_windows_ctrl_c_interrupt_kills_child_process_tree PASSED

Interpretation: The new regression test ran on Windows and passed. This test:

Creates a PowerShell script that spawns a child process (Start-Process ... Start-Sleep -Seconds 120)
Executes the script with a 1-second timeout, triggering NO_CHANGE_TIMEOUT
Sends Ctrl-C via session.execute(TerminalAction(command="C-c", is_input=True))
Verifies the child process PID no longer exists: assert not _powershell_process_exists(child_pid)

The passing assertion proves the fix works: the new _terminate_child_processes() method successfully stops descendant processes.

Verification 2: Implementation Review

Examined code changes:

git diff 3166003e..0090c244 --stat

Result:

openhands-tools/openhands/tools/terminal/terminal/windows_terminal.py | 68 ++++++++++++++++++++
tests/tools/terminal/test_windows_ctrl_c.py                           | 106 ++++++++++++++++++++++++++++++
tests/tools/browser_use/test_browser_toolset.py                       | 3 +-
3 files changed, 174 insertions(+), 3 deletions(-)

Key implementation points verified:

New _terminate_child_processes() method (lines 332-386):
- Queries all Windows processes via Get-CimInstance Win32_Process
- Builds parent→children mapping
- Recursively collects descendants of the PowerShell session PID
- Terminates in reverse order (leaves first) via Stop-Process -Force
- Returns True if any processes were stopped, False otherwise
- Properly guarded: returns False immediately if not on Windows or process is dead

Integration points:
- Called in close() (line 176): Cleans up descendants when session terminates
- Called in interrupt() (line 405): Cleans up descendants after Ctrl-C

Interrupt flow logic (lines 388-417):
- First tries CTRL_BREAK_EVENT signal (Windows-specific)
- Then waits 0.5 seconds (_INTERRUPT_GRACE_SECONDS)
- Then calls _terminate_child_processes()
- Falls back to writing C-c to stdin only if prior methods didn't work
- Returns True if any method succeeded

Interpretation: Implementation is sound. The PowerShell script correctly enumerates the process tree (using CIM instead of unreliable Get-Process -IncludeUserName), excludes the script's own PID, and terminates descendants safely with error suppression. The integration into both close() and interrupt() ensures cleanup happens in both graceful shutdown and forced interrupt scenarios.
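
The interrupt flow listed above can be sketched as a small function with the three mechanisms injected as callables. This is a hedged illustration rather than the code in windows_terminal.py: `interrupt_with_fallback` and its parameter names are hypothetical, and waiting only after a successfully delivered signal is an assumption of this sketch.

```python
import time
from typing import Callable


def interrupt_with_fallback(
    send_ctrl_break: Callable[[], bool],
    terminate_children: Callable[[], bool],
    write_ctrl_c: Callable[[], bool],
    grace_seconds: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Try each interrupt mechanism in turn; return True if any succeeded."""
    # Step 1: ask the process to stop via the break signal.
    sent_break = send_ctrl_break()
    if sent_break:
        # Step 2: give the process a short grace period to react.
        sleep(grace_seconds)
    # Step 3: stop whatever descendants are still running.
    terminated = terminate_children()
    if sent_break or terminated:
        return True
    # Step 4: last resort, write the Ctrl-C character to stdin.
    return write_ctrl_c()
```

Injecting the callables keeps the ordering logic testable on any platform, since the Windows-specific calls can be replaced with stubs.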

Verification 3: No Regressions in Related Tests

Checked test file mentioned in PR description:

uv run pytest tests/tools/terminal/test_send_keys.py -q

Output:

23 passed in 7.83s

Interpretation: All send_keys tests passed. Changes to interrupt() method did not break existing terminal input handling or special key sequences.

Verification 4: Code Quality

Ran linting and type checks:

uv run pre-commit run --files openhands-tools/openhands/tools/terminal/terminal/windows_terminal.py tests/tools/terminal/test_windows_ctrl_c.py

Result: All checks passed (format, lint, type check, import rules).

Confirmed method is called in correct locations:

grep -n "_terminate_child_processes" openhands-tools/openhands/tools/terminal/terminal/windows_terminal.py

Output:

176:        self._terminate_child_processes()     # in close()
332:    def _terminate_child_processes(self) -> bool:  # method definition
405:        terminated_children = self._terminate_child_processes()  # in interrupt()

Interpretation: Code adheres to project standards. Method is properly integrated into both cleanup paths.

Unable to Verify Directly

Direct Windows Execution

What could not be verified:

Interactive PowerShell session with real child process spawning
Manual verification of Win32_Process enumeration on Windows
Performance impact of full system process enumeration during interrupt

Why verification was not possible:
QA environment is Linux. Windows-specific APIs (subprocess.CTRL_BREAK_EVENT, subprocess.STARTUPINFO, Get-CimInstance Win32_Process) are unavailable.

What was relied upon instead:
CI logs from the windows-tests workflow run (job ID 75240052629) showing the regression test passed on a real Windows runner (windows-latest).

Suggested AGENTS.md guidance for future Windows terminal QA:

Primary verification: Rely on windows-tests CI workflow results
For local testing, use a Windows development environment or Windows VM
Consider adding a smoke test that validates graceful degradation on non-Windows platforms (e.g., confirming _terminate_child_processes() returns False on Linux without crashing)
For performance validation, profile system-wide process enumeration impact on Windows machines with many running processes

Issues Found

None. The PR delivers exactly what it promises: Windows child process cleanup on Ctrl-C and session close. CI validation confirms functionality works on Windows.

Conclusion: This PR successfully achieves its stated goal of fixing Windows terminal Ctrl-C process cleanup. The Windows CI runner executed and passed the regression test, proving descendant processes are now properly terminated. All related tests passed without regression. The implementation is clean, well-structured, and properly integrated into both interrupt and close code paths.

Fix Windows terminal Ctrl-C process cleanup

2082074

Terminate descendant processes when interrupting or closing the Windows PowerShell backend so timed-out child process trees do not survive Ctrl-C/reset. Co-authored-by: openhands <openhands@all-hands.dev>

neubig force-pushed the windows-terminal-backend branch from 98881c1 to 2082074 May 9, 2026 03:23

neubig enabled auto-merge (squash) May 9, 2026 03:28

neubig requested a review from all-hands-bot May 9, 2026 03:28

all-hands-bot reviewed May 9, 2026

VascoSch92 reviewed May 9, 2026

Comment thread openhands-tools/openhands/tools/terminal/terminal/windows_terminal.py Outdated

Remove unused Windows terminal flag initialization

ee62dc2

Co-authored-by: openhands <openhands@all-hands.dev>

openhands-agent added 2 commits May 10, 2026 15:18

test: clean up browser toolset reset executor

86c4990

Co-authored-by: openhands <openhands@all-hands.dev>

test: avoid browser toolset mock cleanup hang

0090c24

Co-authored-by: openhands <openhands@all-hands.dev>

neubig requested review from VascoSch92 and all-hands-bot May 11, 2026 14:34

all-hands-bot reviewed May 11, 2026

VascoSch92 approved these changes May 11, 2026

neubig merged commit 2a20689 into main May 11, 2026
41 checks passed

neubig deleted the windows-terminal-backend branch May 11, 2026 14:41

4 participants
