fix(playground): propagate AbortSignal through cancel chain#2535
Open
quanru wants to merge 5 commits into
Open
fix(playground): propagate AbortSignal through cancel chain#2535quanru wants to merge 5 commits into
quanru wants to merge 5 commits into
Conversation
…tion Add an end-to-end test that drives PlaygroundServer + ComputerAgent with a custom SlowAction sleeping 60s while listening to ExecutorContext.abortSignal. After POST /execute, POST /cancel must propagate the abort within 25s, and the resulting error string must mention abort/cancel — proving the rejection came from the cooperative cancel chain rather than the natural timer expiring. The test is gated to Linux via describe.runIf so local mac runs skip the libnut healthCheck mouse jiggle. Also add a focused PlaygroundServer unit test that asserts /cancel invokes abortController.abort() and the in-flight callActionInActionSpace observes signal.aborted, plus a documenting test for the buggy shape where executeAction never forwards abortSignal. Wire packages/computer to depend on @midscene/playground as a devDependency and add a headless-linux-cancel job to .github/workflows/headless-linux.yml that runs the new e2e suite under Xvfb. The previously dropped chrome-extension: job header (lost during an earlier insert) is restored so all 7 jobs are parsed. This commit intentionally lands the tests without the fix so the new CI job demonstrates the regression. The fix follows in the next commit.
Deploying midscene with
|
| Latest commit: |
4ea2767
|
| Status: | ✅ Deploy successful! |
| Preview URL: | https://0b2e7204.midscene.pages.dev |
| Branch Preview URL: | https://fix-playground-stop-propagat.midscene.pages.dev |
The cancel regression test never invokes a model — SlowAction just listens to ExecutorContext.abortSignal. But PlaygroundServer's lazy ModelConfigManager.initialize() throws when MIDSCENE_MODEL_NAME is unset, which is the default on the headless-linux CI runner. The bare /execute call was therefore failing in 33ms with "Model configuration is incomplete" instead of exercising the SlowAction path, masking the real abort assertion. Set dummy MIDSCENE_MODEL_NAME / BASE_URL / API_KEY at module load time so the test runs in any environment without bleeding fake config into other processes. Locally the existing shell values still take precedence.
PlaygroundServer dispatches /execute through a code path that calls
throwErrorIfNonVLModel before invoking the customAction. With MIDSCENE_MODEL_FAMILY
unset the test was failing in 61ms with "MIDSCENE_MODEL_FAMILY is not set to
a visual language model" instead of reaching the SlowAction abort assertion.
Set a valid dummy VL family ('qwen3-vl') so the check passes without bleeding
real model behavior into the test (SlowAction never calls a model).
Hitting Stop in the playground after a long action used to keep driving the
mouse/keyboard for the rest of the natural action timer. Root cause: /cancel
only ran agent.destroy() + recreate, never telling the in-flight
executeAction → callActionInActionSpace → device action to abort. Turn the
whole path into a cooperative AbortController chain.
L1 packages/playground/src/server.ts
- Hold a Map<requestId, AbortController> and create one per /execute.
- /cancel now fires controller.abort() before falling back to
destroy+recreate, so in-flight LLM streams and Device sleep loops exit
within milliseconds instead of after the natural timer.
- Plumb abortController.signal into executeAction.
L2 packages/playground/src/common.ts + types.ts
- executeAction forwards abortSignal to all three branches
(callActionInActionSpace, aiAssert, agent[actionType]).
L3 packages/core/src/agent/{agent,tasks,task-builder}.ts + types.ts
- callActionInActionSpace accepts { abortSignal } and throws
immediately on already-aborted signals.
- runPlans threads abortSignal through convertPlanToExecutable and
checks signal.aborted at each task boundary.
- Agent.destroy aborts the active controller before flushing the
report, so the destroy path can't strand an in-flight LLM call.
L4 packages/computer/src/device.ts
- destroy() sets this.destroyed = true on its first line.
- Replace internal `await sleep(...)` loops in smoothMoveMouse,
performScroll, typeText, smartTypeString, dragAndDrop with an
abortable variant that throws once destroyed flips.
- Public entry points (mouseMove, tap, hover, keyboardType, scroll,
drag, screenshotBase64) gate on this.destroyed up front so a stale
handle can't drive libnut after cancel.
L5 packages/visualizer/src/hooks/usePlaygroundExecution.ts
- handleStop now flips interruptedFlagRef and setLoading(false)
synchronously, then awaits cancelExecution — the spinner no longer
waits on the server's destroy/flush tail.
The headless-linux-cancel CI job (added in the previous commits) was red
on this branch with "abort signal did not propagate" and SlowAction
running its full 60s timer; this commit should turn it green.
… arg
The previous fix added a third `{ abortSignal }` argument when forwarding to
callActionInActionSpace. Two existing common.test.ts assertions still
expected the two-argument shape and broke on CI. Update them to expect the
new options bag explicitly.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
After running multiple steps in the Computer playground and clicking Stop, the
mouse/keyboard would keep being driven — because
/cancelonly ranagent.destroy()+ recreate and never told the in-flightexecuteAction → callActionInActionSpace → device actionto stop. The fix turns the cancelpath into a cooperative abort chain.
Commit order (intentional)
test(computer): add regression coverage ...— adds the e2e test, thefocused unit test, the
headless-linux-cancelCI job, and wires@midscene/playgroundas a devDependency ofpackages/computer. No fixyet — this commit is meant to turn the new
headless-linux-canceljobred on CI, proving the test catches the regression.
test(computer): set dummy model env ...— small follow-up: the testnever calls a model, but
PlaygroundServervalidatesMIDSCENE_MODEL_NAMEat startup, so dummy values are needed for theheadless runner to even reach the abort assertion.
fix(playground): ...(next push) — the actual L1–L5 fix:taskAbortControllersmap +/cancelcallscontroller.abort()before falling back to destroy+recreateexecuteAction: forwardsabortSignalto all three branchescallActionInActionSpace,runPlans, andAgent.destroyplumb / fire the signal
ComputerDevice:destroy()setsdestroyed=truefirst; internalsleeploops insmoothMoveMouse / performScroll / typeText / dragAndDropbecome abortablehandleStop: flipsinterruptedFlagRefandsetLoading(false)synchronously before awaiting the cancel POSTOnce the fix lands,
headless-linux-cancelshould turn green.Test plan
headless-linux-cancelred on commits 1+2 (regression caught)headless-linux-cancelgreen on commit 3 (fix verified)fix to reproduce → restore fix to verify)
pnpm run lintclean before each commit