You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
> 调整 Open Browser Use skill 说明:连续相关任务不应每次新开重复 tab,而应能从已完成分组里找回已有 tab 继续使用。
12
+
13
+
### 🛠 Changes Overview
14
+
15
+
**Scope:**`skills/open-browser-use`
16
+
17
+
**Key Actions:**
18
+
19
+
-**[Skill guidance]**: Added a core workflow step requiring agents to inspect existing user tabs before opening a new tab.
20
+
-**[Tab lifecycle]**: Documented that related follow-up tasks should claim matching tabs from `✅ Open Browser Use` or handoff groups and finalize them back as deliverables when appropriate.
21
+
-**[Release]**: Bumped runtime/package versions to `0.1.40` and added the user-facing release note.
Copy file name to clipboardExpand all lines: skills/open-browser-use/SKILL.md
+11-7Lines changed: 11 additions & 7 deletions
Display the source diff
Display the rich diff
Original file line number
Diff line number
Diff line change
@@ -14,13 +14,14 @@ Open Browser Use connects an MV3 Chrome extension, a local native messaging host
14
14
1. Check setup with `open-browser-use ping` or `obu ping`. If it fails because setup is missing, read [references/installation.md](references/installation.md).
15
15
2. Pick the right browser/profile if multiple are installed. See "Browser and profile handling" below before issuing browser commands.
16
16
3. Choose a unique browser session id for the current agent task before opening or claiming tabs. Prefer the surrounding runtime's conversation/session id when available; otherwise create a short unique id such as `obu-<task-slug>-<timestamp>`. Reuse that same id for every Open Browser Use command in this task.
17
-
3. Name the current browser task group before opening or claiming tabs. Use a short task label followed by ` - OBU`; if no better task label is available, use `Task - OBU`.
18
-
4. Use the CLI for simple inspection or one-shot actions: `info`, `tabs`, `user-tabs`, `history`, `open-tab`, `navigate`, `cdp`, and `call`.
19
-
5. Use `open-browser-use run` / `obu run` for CLI-level multi-step orchestration when a small line-oriented action plan is enough and writing SDK code would be unnecessary.
20
-
6. If the surrounding agent runtime supports local MCP servers, configure `obu mcp` and call the exposed browser tools directly. Use the `run_action_plan` MCP tool for the same line-oriented orchestration from MCP. Read [references/sdk-and-protocol.md](references/sdk-and-protocol.md).
21
-
7. Use the JavaScript, Python, or Go SDK for larger multi-step workflows, event subscriptions, richer control flow, or when the surrounding agent runtime already runs code. Read [references/sdk-and-protocol.md](references/sdk-and-protocol.md).
22
-
8. Before ending browser work, release or keep session tabs with `open-browser-use finalize-tabs --session-id "$OBU_SESSION_ID" --keep '<json-array>'`, the MCP `finalize_tabs` tool, or the SDK `finalizeTabs` / `finalize_tabs` / `FinalizeTabs` method.
23
-
9. If communication fails after setup, read [references/troubleshooting.md](references/troubleshooting.md).
17
+
4. Name the current browser task group before opening or claiming tabs. Use a short task label followed by ` - OBU`; if no better task label is available, use `Task - OBU`.
18
+
5. Before opening a new tab, run `user-tabs` / `user_tabs` and check whether the task continues from an existing tab, including tabs in `✅ Open Browser Use` or an earlier `handoff` task group. If the URL/title/group clearly matches the current task, claim that tab and continue from it instead of opening a duplicate.
19
+
6. Use the CLI for simple inspection or one-shot actions: `info`, `tabs`, `user-tabs`, `history`, `open-tab`, `navigate`, `cdp`, and `call`.
20
+
7. Use `open-browser-use run` / `obu run` for CLI-level multi-step orchestration when a small line-oriented action plan is enough and writing SDK code would be unnecessary.
21
+
8. If the surrounding agent runtime supports local MCP servers, configure `obu mcp` and call the exposed browser tools directly. Use the `run_action_plan` MCP tool for the same line-oriented orchestration from MCP. Read [references/sdk-and-protocol.md](references/sdk-and-protocol.md).
22
+
9. Use the JavaScript, Python, or Go SDK for larger multi-step workflows, event subscriptions, richer control flow, or when the surrounding agent runtime already runs code. Read [references/sdk-and-protocol.md](references/sdk-and-protocol.md).
23
+
10. Before ending browser work, release or keep session tabs with `open-browser-use finalize-tabs --session-id "$OBU_SESSION_ID" --keep '<json-array>'`, the MCP `finalize_tabs` tool, or the SDK `finalizeTabs` / `finalize_tabs` / `FinalizeTabs` method.
24
+
11. If communication fails after setup, read [references/troubleshooting.md](references/troubleshooting.md).
24
25
25
26
## Operating Rules
26
27
@@ -29,6 +30,8 @@ Open Browser Use connects an MV3 Chrome extension, a local native messaging host
29
30
- Do not assume Codex.app helpers, Node REPL globals, or a bundled plugin UI are available. Use the installed `open-browser-use` / `obu` CLI or the published SDKs.
30
31
- Do not guess tab ids. List tabs first, then use ids returned by `tabs`, `user-tabs`, `open-tab`, or SDK calls.
31
32
- Prefer `claim-tab` / `claimUserTab` for existing user tabs. Claiming should be based on the current `user-tabs` result and visible evidence such as URL, title, recency, or group.
33
+
- For follow-up tasks, inspect `user-tabs` before opening a tab and reuse a matching tab from `✅ Open Browser Use` or a previous handoff group. A deliverable tab can be claimed back into the new task session, worked on, and finalized as `deliverable` again when it remains the user-facing result. This keeps repeated work on the same page converged to one live tab.
34
+
- Do not claim unrelated deliverable tabs just because they are in `✅ Open Browser Use`. If several tabs plausibly match, prefer the most recent exact URL/title match; ask the user when the match is ambiguous.
32
35
- Use `--socket` only when the user or runtime provides an explicit socket. Otherwise let the CLI and SDKs discover the active socket registry.
33
36
- Do not rely on the CLI fallback session `obu-cli` for agent tasks. Always pass a task-unique `--session-id` to CLI and MCP commands, or set `sessionId` / `session_id` / `SessionID` in SDK clients. The fallback exists for quick manual use and can reuse stale task groups across unrelated agent sessions.
34
37
- Direct CLI subcommands and `open-browser-use run` can share the same browser session only when they use the same explicit `--session-id`. Finalize that same session before ending browser work.
@@ -155,6 +158,7 @@ each individual browser operation.
155
158
- Session tabs are tabs Open Browser Use has created or claimed for the current agent workflow.
156
159
- Use one unique session id per agent task or conversation. Do not share the fallback `obu-cli` session across unrelated tasks.
157
160
- Task session groups should be named from the task, using the pattern `<short task> - OBU`. Use `Task - OBU` as the fallback name.
161
+
- At the start of a related follow-up task, list all user tabs and check `tabGroup`, `title`, and `url` before creating anything new. Claim an existing matching deliverable or handoff tab into the current session; only open a new tab when no suitable tab exists.
158
162
- Keep no tabs by default: `open-browser-use finalize-tabs --session-id "$OBU_SESSION_ID" --keep '[]'`.
159
163
- Keep a tab only when the user needs that live page after the turn. Omit research, source, search, intermediate, duplicate, blank, error, and login/navigation tabs after extracting what you need.
160
164
- Keep a tab with `status: "deliverable"` when the tab itself is the user-facing output or requested open page, such as a created or edited document, dashboard, checkout/cart, submitted form result, or a page the user explicitly asked to inspect directly.
0 commit comments