
Commit 945f7a6

Haruhiyuki and claude committed
fix: 0.2.1 - macOS capsule.raise now really waits for the focus switch to complete
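P0 (macOS): a subagent experiment on real hardware found a deadlock: capsule.raise reports ok=true, but validate_geometry immediately sees is_foreground=false.

Root cause: the Swift helper's activate(pid:) is fire-and-forget:

    NSWorkspace.shared.runningApplications.first(...)?.activate(...)
    return ["ok": true]

NSWorkspace activate is asynchronous. Under macOS 14+ Stage Manager / focus-stealing protection, frontmostApplication is still the caller for 100-500ms. The sleep(180) on the JS side is not enough (the comment on the osascript path says explicitly "100-200ms gives false reports, only 500ms is stable", but the helper path was never updated to match).

Fix:
- Swift helper: new activateAndWaitForeground(pid, timeoutMs=1500). After issuing activate, it polls NSWorkspace.frontmostApplication until it matches the target PID or the timeout expires. Returns (success, frontPid).
- window.activate / window.raise use this function. On success they return {ok:true, frontmost_pid}; on timeout they return {ok:false, reason:"foreground_timeout", target_pid, frontmost_pid, hint}. The hint tells the caller explicitly: "macOS focus-stealing protection; have the user cmd+tab manually".
- The activate at the end of moveWindow also uses the new function (the previous usleep(120_000), i.e. 120 ms, was likewise not enough).
- darwin-helper.raiseWindow checks for ok=false and throws VisionMcpError(GEOMETRY_MISMATCH) carrying details.reason, so the upper-layer ERROR_HINTS can show a hint per case.
- ERROR_HINTS.GEOMETRY_MISMATCH rewritten:
  · foreground_timeout → states plainly that repair_minimal cannot fix it; the user has to cmd+tab
  · size → repair_minimal
  · window dragged out → capsule.migrate_window
  The previous one-size-fits-all "use repair_minimal" misled the agent into repeated useless attempts.

Verification (manual JSON-RPC test):

    > {"method":"window.activate","params":{"handle":"96680:0"}}   # Notes
    < {"ok":true,"frontmost_pid":96680}   # waits synchronously until the switch completes

Previously this returned {"ok":true} immediately while frontmost was still Claude Code/Terminal.

Still patch-level (behavior is more accurate, API unchanged). 56/56 tests pass. Helper recompiled (254KB → 255KB).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>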
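The two response shapes described in the commit message can be modeled on the client side as a discriminated union. The following is a minimal sketch, not code from this commit: the names `ActivateResult` and `describeActivateResult` are hypothetical, and only the field names (`ok`, `reason`, `target_pid`, `frontmost_pid`, `hint`) come from the message.

```typescript
// Hypothetical client-side model of the window.activate / window.raise reply.
// Field names follow the commit message; the type and function names are illustrative.
type ActivateResult =
  | { ok: true; frontmost_pid: number }
  | {
      ok: false;
      reason: string;
      target_pid: number;
      frontmost_pid: number;
      hint: string;
    };

// Turn a reply into a one-line diagnostic. Narrowing on `ok` lets the
// compiler check that the failure-only fields are read only on failure.
function describeActivateResult(r: ActivateResult): string {
  if (r.ok) {
    return `foreground confirmed: pid ${r.frontmost_pid}`;
  }
  return (
    `raise failed (${r.reason}): wanted pid ${r.target_pid}, ` +
    `frontmost is pid ${r.frontmost_pid}. ${r.hint}`
  );
}
```

Narrowing on the `ok` flag keeps a caller from reading `reason` or `hint` on a successful reply, where the helper does not send them.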
1 parent 32133b3 commit 945f7a6

12 files changed

Lines changed: 89 additions & 18 deletions


.claude-plugin/marketplace.json

Lines changed: 2 additions & 2 deletions
@@ -1,6 +1,6 @@
 {
   "name": "vision-mcp",
-  "version": "0.2.0",
+  "version": "0.2.1",
   "description": "Vision-MCP plugins: performance / long-term cost optimization layer for desktop GUI operation",
   "owner": {
     "name": "Haruhiyuki",
@@ -11,7 +11,7 @@
       "name": "vision-mcp",
       "description": "Vision-MCP: a framework that lets an Agent distill instruction-style operating procedures while using desktop software (macOS + Windows)",
       "source": "./",
-      "version": "0.2.0",
+      "version": "0.2.1",
       "keywords": [
         "mcp",
         "vision",

.claude-plugin/plugin.json

Lines changed: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
 {
   "name": "vision-mcp",
-  "version": "0.2.0",
+  "version": "0.2.1",
   "description": "Vision-MCP: a vision-first MCP server + Skill for desktop GUI operation. An Agent can use real desktop applications the way a person does (screenshot → dual-track vision/AX recognition → click/type/key), with support for continuous correction + safety approval.",
   "author": {
     "name": "Vision-MCP Authors",

native/macos/src/main.swift

Lines changed: 39 additions & 6 deletions
@@ -597,6 +597,24 @@ func activate(pid: pid_t) {
     }
 }
 
+/// Activate, then poll until the target PID becomes frontmostApplication or timeoutMs expires.
+/// Under macOS 14+ Stage Manager / focus-stealing protection, NSWorkspace.activate is asynchronous: calling
+/// validate_geometry right away sees is_foreground=false and misleads the caller. Here we wait synchronously
+/// until NSWorkspace has actually switched to the target PID before returning.
+/// Returns (success, frontmost_pid_at_exit). When success=false the caller should throw an error directly.
+func activateAndWaitForeground(pid: pid_t, timeoutMs: Int = 1500) -> (Bool, pid_t) {
+    activate(pid: pid)
+    let deadline = Date().addingTimeInterval(Double(timeoutMs) / 1000.0)
+    var frontPid: pid_t = NSWorkspace.shared.frontmostApplication?.processIdentifier ?? 0
+    if frontPid == pid { return (true, frontPid) }
+    while Date() < deadline {
+        usleep(60_000) // 60ms
+        frontPid = NSWorkspace.shared.frontmostApplication?.processIdentifier ?? 0
+        if frontPid == pid { return (true, frontPid) }
+    }
+    return (false, frontPid)
+}
+
 func moveWindow(handle: String, rect: CGRect) -> [String: Any]? {
     guard let (_, axWin, _) = findWindow(handle: handle) else { return nil }
     if axBoolAttr(axWin, kAXMinimizedAttribute) == true {
@@ -609,8 +627,8 @@ func moveWindow(handle: String, rect: CGRect) -> [String: Any]? {
     AXUIElementSetAttributeValue(axWin, kAXPositionAttribute as CFString, posVal)
     AXUIElementSetAttributeValue(axWin, kAXSizeAttribute as CFString, sizeVal)
     let pid = pid_t(handle.split(separator: ":").first!)!
-    activate(pid: pid)
-    usleep(120_000)
+    // Wait synchronously until the PID is really frontmost. 120ms is often not enough under macOS 14+ focus-stealing protection.
+    _ = activateAndWaitForeground(pid: pid)
     if let (_, _, desc) = findWindow(handle: handle) {
         return toWindowInfo(desc)
     }
@@ -926,14 +944,29 @@ func handle(method: String, params: [String: Any]) -> Any {
     case "window.activate":
         guard let handle = params["handle"] as? String,
               let pid = pid_t(handle.split(separator: ":").first.map(String.init) ?? "") else { return ["error": "bad handle"] }
-        activate(pid: pid)
-        return ["ok": true]
+        let (ok, frontPid) = activateAndWaitForeground(pid: pid)
+        if ok { return ["ok": true, "frontmost_pid": Int(frontPid)] }
+        // Timed out: give the caller detailed diagnostics so the agent knows why raise failed
+        return [
+            "ok": false,
+            "reason": "foreground_timeout",
+            "target_pid": Int(pid),
+            "frontmost_pid": Int(frontPid),
+            "hint": "macOS focus-stealing protection blocked the activation; the caller may remain frontmost. Ask the user to cmd+tab to the target app manually, or click the desktop in Claude Code first and retry"
+        ]
     case "window.raise":
         // window.raise is equivalent to window.activate (kept for backward compatibility)
         guard let handle = params["handle"] as? String,
               let pid = pid_t(handle.split(separator: ":").first.map(String.init) ?? "") else { return ["error": "bad handle"] }
-        activate(pid: pid)
-        return ["ok": true]
+        let (ok, frontPid) = activateAndWaitForeground(pid: pid)
+        if ok { return ["ok": true, "frontmost_pid": Int(frontPid)] }
+        return [
+            "ok": false,
+            "reason": "foreground_timeout",
+            "target_pid": Int(pid),
+            "frontmost_pid": Int(frontPid),
+            "hint": "macOS focus-stealing protection blocked the activation; the caller may remain frontmost. Ask the user to cmd+tab to the target app manually, or click the desktop in Claude Code first and retry"
+        ]
     case "ax.dump":
         guard let handle = params["handle"] as? String else { return ["error": "handle required"] }
         let maxNodes = (params["max_nodes"] as? NSNumber)?.intValue ?? 500
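The poll-until-deadline loop in `activateAndWaitForeground` can be restated in TypeScript to make its control flow easy to test in isolation. This is a rough sketch under stated assumptions, not the helper's implementation: `Clock`, `realClock`, `waitForForeground`, and the injected `getFrontmostPid` probe are hypothetical names, and the real helper reads `NSWorkspace.shared.frontmostApplication` from Swift. The 1500 ms timeout and 60 ms poll interval mirror the values in the diff.

```typescript
// Hypothetical clock abstraction so the loop can be driven by a fake timer in tests.
interface Clock {
  now(): number; // current time in milliseconds
  sleep(ms: number): Promise<void>;
}

const realClock: Clock = {
  now: () => Date.now(),
  sleep: (ms) => new Promise<void>((resolve) => setTimeout(resolve, ms)),
};

// Poll `getFrontmostPid` (an assumed probe, injected by the caller) until it
// reports `targetPid` or the deadline passes. Mirrors the Swift loop: check
// once immediately, then sleep and re-check until the timeout.
async function waitForForeground(
  targetPid: number,
  getFrontmostPid: () => number,
  timeoutMs = 1500,
  pollMs = 60,
  clock: Clock = realClock,
): Promise<{ ok: boolean; frontmostPid: number }> {
  const deadline = clock.now() + timeoutMs;
  let frontmostPid = getFrontmostPid();
  if (frontmostPid === targetPid) return { ok: true, frontmostPid };
  while (clock.now() < deadline) {
    await clock.sleep(pollMs);
    frontmostPid = getFrontmostPid();
    if (frontmostPid === targetPid) return { ok: true, frontmostPid };
  }
  // Timed out: report whichever PID was frontmost at exit for diagnostics.
  return { ok: false, frontmostPid };
}
```

Returning the last observed PID on failure, as the Swift function does, gives the caller enough information to tell "nothing happened" apart from "some other app took focus".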

native/macos/vision-mcp-helper

1.45 KB
Binary file not shown.

packages/cli/CHANGELOG.md

Lines changed: 9 additions & 0 deletions
@@ -1,5 +1,14 @@
 # @vision-mcp/cli
 
+## 0.2.1
+
+### Patch Changes
+
+- Fix the macOS deadlock where capsule.raise reported ok but the window never actually came to the foreground. The Swift helper's `window.activate` / `window.raise` now poll internally until the target PID really is frontmost (timeout 1500ms), and on timeout return `{ok: false, reason: "foreground_timeout", target_pid, frontmost_pid, hint}`. The JS adapter `raiseWindow` checks for ok=false and throws `GEOMETRY_MISMATCH` with detailed diagnostics. `ERROR_HINTS.GEOMETRY_MISMATCH` is rewritten as per-case hints, making clear that foreground_timeout is focus-stealing protection, that repair_minimal cannot fix it, and that the user should cmd+tab manually once.
+- Updated dependencies
+  - @vision-mcp/core@0.2.1
+  - @vision-mcp/server@0.2.1
+
 ## 0.2.0
 
 ### Minor Changes

packages/cli/package.json

Lines changed: 3 additions & 3 deletions
@@ -1,6 +1,6 @@
 {
   "name": "@vision-mcp/cli",
-  "version": "0.2.0",
+  "version": "0.2.1",
   "description": "Vision-MCP CLI: a vision-first MCP server + toolset for desktop GUI operation (init / serve / capsule / snapshot / patch / install-helper, etc.)",
   "keywords": [
     "mcp",
@@ -41,8 +41,8 @@
     "prepublishOnly": "node ./scripts/prepublish.mjs"
   },
   "dependencies": {
-    "@vision-mcp/core": "^0.2.0",
-    "@vision-mcp/server": "^0.2.0",
+    "@vision-mcp/core": "^0.2.1",
+    "@vision-mcp/server": "^0.2.1",
     "yaml": "^2.7.0"
   },
   "repository": {

packages/core/CHANGELOG.md

Lines changed: 6 additions & 0 deletions
@@ -1,5 +1,11 @@
 # @vision-mcp/core
 
+## 0.2.1
+
+### Patch Changes
+
+- Fix the macOS deadlock where capsule.raise reported ok but the window never actually came to the foreground. The Swift helper's `window.activate` / `window.raise` now poll internally until the target PID really is frontmost (timeout 1500ms), and on timeout return `{ok: false, reason: "foreground_timeout", target_pid, frontmost_pid, hint}`. The JS adapter `raiseWindow` checks for ok=false and throws `GEOMETRY_MISMATCH` with detailed diagnostics. `ERROR_HINTS.GEOMETRY_MISMATCH` is rewritten as per-case hints, making clear that foreground_timeout is focus-stealing protection, that repair_minimal cannot fix it, and that the user should cmd+tab manually once.
+
 ## 0.2.0
 
 ### Minor Changes

packages/core/package.json

Lines changed: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
 {
   "name": "@vision-mcp/core",
-  "version": "0.2.0",
+  "version": "0.2.1",
   "description": "Vision-MCP core library: data model, Capsule, Runtime, Repair, Locator, Trace",
   "type": "module",
   "main": "./dist/index.js",

packages/core/src/platform/darwin-helper.ts

Lines changed: 17 additions & 2 deletions
@@ -213,8 +213,23 @@ export class DarwinHelperAdapter implements PlatformAdapter {
   }
 
   async raiseWindow(handle: string): Promise<void> {
-    await this.bridge.request("window.activate", { handle });
-    await sleep(180);
+    // The helper polls internally until the target PID is really frontmost or the timeout (1500ms) expires.
+    // The previous sleep(180) was often not enough under macOS 14+ Stage Manager / focus-stealing protection,
+    // so capsule.raise reported ok=true while the validate_geometry that followed saw is_foreground=false.
+    const r = await this.bridge.request<{
+      ok: boolean;
+      reason?: string;
+      target_pid?: number;
+      frontmost_pid?: number;
+      hint?: string;
+    }>("window.activate", { handle });
+    if (!r.ok) {
+      throw new VisionMcpError(
+        "GEOMETRY_MISMATCH",
+        `raise failed: ${r.reason ?? "unknown"} (target_pid=${r.target_pid}, frontmost_pid=${r.frontmost_pid}). ${r.hint ?? ""}`,
+        { details: { reason: r.reason, target_pid: r.target_pid, frontmost_pid: r.frontmost_pid } },
+      );
+    }
   }
 
   /**
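The commit message says `ERROR_HINTS.GEOMETRY_MISMATCH` now gives a different hint per case, keyed on the `details.reason` thrown above, but that file is not part of the visible diff. One way such a lookup could be structured is sketched below. Everything here is an assumption for illustration: `MismatchReason`, `GEOMETRY_MISMATCH_HINTS`, `hintForGeometryMismatch`, and the reason codes `size_mismatch` and `window_off_display` are hypothetical; only `foreground_timeout` and the three recommended actions come from the commit message.

```typescript
// Hypothetical reason codes. Only "foreground_timeout" appears in the commit;
// the other two are placeholders for the "size" and "window dragged out" cases.
type MismatchReason = "foreground_timeout" | "size_mismatch" | "window_off_display";

// Illustrative hint table; the wording is not taken from the project.
const GEOMETRY_MISMATCH_HINTS: Record<MismatchReason, string> = {
  foreground_timeout:
    "macOS focus-stealing protection blocked activation; repair_minimal cannot fix this. Ask the user to cmd+tab to the target app.",
  size_mismatch: "Window size differs from the capsule; try repair_minimal.",
  window_off_display: "Window was dragged out of its expected area; use capsule.migrate_window.",
};

// Pick a hint for a thrown GEOMETRY_MISMATCH, falling back to a neutral
// message when the reason is missing or unrecognized.
function hintForGeometryMismatch(reason: string | undefined): string {
  if (
    reason !== undefined &&
    Object.prototype.hasOwnProperty.call(GEOMETRY_MISMATCH_HINTS, reason)
  ) {
    return GEOMETRY_MISMATCH_HINTS[reason as MismatchReason];
  }
  return "Geometry mismatch with no recognized reason; inspect details before retrying.";
}
```

Keeping a fallback that does not recommend `repair_minimal` avoids the failure mode the commit describes, where a blanket hint sent the agent into repeated useless repair attempts.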

packages/server/CHANGELOG.md

Lines changed: 8 additions & 0 deletions
@@ -1,5 +1,13 @@
 # @vision-mcp/server
 
+## 0.2.1
+
+### Patch Changes
+
+- Fix the macOS deadlock where capsule.raise reported ok but the window never actually came to the foreground. The Swift helper's `window.activate` / `window.raise` now poll internally until the target PID really is frontmost (timeout 1500ms), and on timeout return `{ok: false, reason: "foreground_timeout", target_pid, frontmost_pid, hint}`. The JS adapter `raiseWindow` checks for ok=false and throws `GEOMETRY_MISMATCH` with detailed diagnostics. `ERROR_HINTS.GEOMETRY_MISMATCH` is rewritten as per-case hints, making clear that foreground_timeout is focus-stealing protection, that repair_minimal cannot fix it, and that the user should cmd+tab manually once.
+- Updated dependencies
+  - @vision-mcp/core@0.2.1
+
 ## 0.2.0
 
 ### Minor Changes
