Commit 1431522

Improve skill tab reuse guidance
1 parent 2a62e64 commit 1431522

10 files changed

Lines changed: 53 additions & 14 deletions

apps/chrome-extension/manifest.json

Lines changed: 1 addition & 1 deletion
@@ -2,7 +2,7 @@
   "manifest_version": 3,
   "name": "Open Browser Use",
   "description": "Open Browser Use Chrome automation extension.",
-  "version": "0.1.39",
+  "version": "0.1.40",
   "icons": {
     "16": "icons/icon-16.png",
     "32": "icons/icon-32.png",

cmd/open-browser-use/main.go

Lines changed: 1 addition & 1 deletion
@@ -27,7 +27,7 @@ import (
 	"github.com/spf13/cobra"
 )
 
-const version = "0.1.39"
+const version = "0.1.40"
 const defaultChromeExtensionID = "bgjoihaepiejlfjinojjfgokghnodnhd"
 const defaultCLISessionID = "obu-cli"
 const defaultMCPSessionID = "obu-mcp"
Lines changed: 34 additions & 0 deletions
@@ -0,0 +1,34 @@
+## [2026-05-22 15:59] | Task: Reuse deliverable tabs
+
+### 🤖 Execution Context
+
+- **Agent ID**: `Codex`
+- **Base Model**: `GPT-5`
+- **Runtime**: `Codex CLI`
+
+### 📥 User Query
+
+> Adjust the Open Browser Use skill guidance: consecutive related tasks should not open a duplicate tab each time; they should be able to recover an existing tab from the completed group and keep using it.
+
+### 🛠 Changes Overview
+
+**Scope:** `skills/open-browser-use`
+
+**Key Actions:**
+
+- **[Skill guidance]**: Added a core workflow step requiring agents to inspect existing user tabs before opening a new tab.
+- **[Tab lifecycle]**: Documented that related follow-up tasks should claim matching tabs from `✅ Open Browser Use` or handoff groups and finalize them back as deliverables when appropriate.
+- **[Release]**: Bumped runtime/package versions to `0.1.40` and added the user-facing release note.
+
+### 🧠 Design Intent (Why)
+
+Consecutive related tasks often target the same page. Explicitly requiring agents to first look for and claim an existing deliverable / handoff tab keeps duplicate pages from piling up in the completed group, while still preserving the safety boundary of opening a new tab when there is no clear match.
+
+### 📁 Files Modified
+
+- `skills/open-browser-use/SKILL.md`
+- `docs/releases/feature-release-notes.md`
+- `cmd/open-browser-use/main.go`
+- `apps/chrome-extension/manifest.json`
+- `packages/*/package.json`
+- `packages/open-browser-use-python/pyproject.toml`

docs/releases/feature-release-notes.md

Lines changed: 1 addition & 0 deletions
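@@ -4,6 +4,7 @@
 
 | Date | Feature area | User value | Change summary |
 | --- | --- | --- | --- |
+| 2026-05-22 | Skill Tab Reuse | When handling consecutive related tasks, agents prefer reusing a matching delivered or handoff tab instead of repeatedly piling the same page into the completed group. | Released the `0.1.40` patch version, strengthening Core Workflow, Operating Rules, and Tab Lifecycle in `skills/open-browser-use`: check `user-tabs` before opening a new tab, claim a clearly matching tab from `✅ Open Browser Use` or an older handoff group, and avoid mistaken claims when the match is ambiguous. |
 | 2026-05-17 | macOS Native Host Socket | macOS users can connect to the native host reliably even with a long `TMPDIR`; the popup no longer shows only `Native host has exited` because the Unix socket path is too long. | Released the `0.1.39` patch version: the default Unix socket root is restored to the fixed short path `/tmp/open-browser-use`, while Windows continues to use the system temp directory; bind failure errors now include the socket path length, and regression tests were added for the default path, the actual bind, and long-path diagnostics. |
 | 2026-05-16 | Chrome Extension Popup | Users opening the popup see the current extension version directly, along with the CLI install command appropriate for their OS; macOS keeps both the npm and Homebrew paths, while Windows/Linux show only npm. | Released the `0.1.38` patch version: the popup adds a version pill and an OS-aware CLI install panel, adds OBU render harness verification, and the README skill install command is synced to drop the no-longer-needed `--copy`. |
 | 2026-05-16 | Windows CLI Install | Windows users can complete native host registration and the Chrome Web Store extension connection with `npm i -g open-browser-use && open-browser-use setup`, or download the Windows zip from the GitHub Release and run the CLI directly. | Released the `0.1.37` patch version, filling in Windows registry setup, stable exe copy, `--parent-window` native host launch detection, Windows socket/profile paths, and the release Windows zip artifact, and verified `info`, `user-tabs`, `open-tab`, `page-info`, and `finalize-tabs` end to end on real Windows Chrome. |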

packages/browser-client-rewrite/package.json

Lines changed: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
 {
   "name": "@open-browser-use/browser-client-rewrite",
-  "version": "0.1.39",
+  "version": "0.1.40",
   "private": true,
   "type": "module",
   "scripts": {

packages/browser-use-protocol/package.json

Lines changed: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
 {
   "name": "@open-browser-use/browser-use-protocol",
-  "version": "0.1.39",
+  "version": "0.1.40",
   "private": true,
   "description": "Browser Use native pipe framing and JSON-RPC helpers.",
   "type": "module",

packages/open-browser-use-cli/package.json

Lines changed: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
 {
   "name": "open-browser-use",
-  "version": "0.1.39",
+  "version": "0.1.40",
   "description": "Open Browser Use native host and CLI binary.",
   "license": "MIT",
   "type": "module",

packages/open-browser-use-js/package.json

Lines changed: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
 {
   "name": "open-browser-use-sdk",
-  "version": "0.1.39",
+  "version": "0.1.40",
   "description": "JavaScript/TypeScript SDK for Open Browser Use.",
   "license": "MIT",
   "type": "module",

packages/open-browser-use-python/pyproject.toml

Lines changed: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
 [project]
 name = "open-browser-use-sdk"
-version = "0.1.39"
+version = "0.1.40"
 description = "Python SDK for Open Browser Use."
 readme = "README.md"
 requires-python = ">=3.10"

skills/open-browser-use/SKILL.md

Lines changed: 11 additions & 7 deletions
@@ -14,13 +14,14 @@ Open Browser Use connects an MV3 Chrome extension, a local native messaging host
 1. Check setup with `open-browser-use ping` or `obu ping`. If it fails because setup is missing, read [references/installation.md](references/installation.md).
 2. Pick the right browser/profile if multiple are installed. See "Browser and profile handling" below before issuing browser commands.
 3. Choose a unique browser session id for the current agent task before opening or claiming tabs. Prefer the surrounding runtime's conversation/session id when available; otherwise create a short unique id such as `obu-<task-slug>-<timestamp>`. Reuse that same id for every Open Browser Use command in this task.
-3. Name the current browser task group before opening or claiming tabs. Use a short task label followed by ` - OBU`; if no better task label is available, use `Task - OBU`.
-4. Use the CLI for simple inspection or one-shot actions: `info`, `tabs`, `user-tabs`, `history`, `open-tab`, `navigate`, `cdp`, and `call`.
-5. Use `open-browser-use run` / `obu run` for CLI-level multi-step orchestration when a small line-oriented action plan is enough and writing SDK code would be unnecessary.
-6. If the surrounding agent runtime supports local MCP servers, configure `obu mcp` and call the exposed browser tools directly. Use the `run_action_plan` MCP tool for the same line-oriented orchestration from MCP. Read [references/sdk-and-protocol.md](references/sdk-and-protocol.md).
-7. Use the JavaScript, Python, or Go SDK for larger multi-step workflows, event subscriptions, richer control flow, or when the surrounding agent runtime already runs code. Read [references/sdk-and-protocol.md](references/sdk-and-protocol.md).
-8. Before ending browser work, release or keep session tabs with `open-browser-use finalize-tabs --session-id "$OBU_SESSION_ID" --keep '<json-array>'`, the MCP `finalize_tabs` tool, or the SDK `finalizeTabs` / `finalize_tabs` / `FinalizeTabs` method.
-9. If communication fails after setup, read [references/troubleshooting.md](references/troubleshooting.md).
+4. Name the current browser task group before opening or claiming tabs. Use a short task label followed by ` - OBU`; if no better task label is available, use `Task - OBU`.
+5. Before opening a new tab, run `user-tabs` / `user_tabs` and check whether the task continues from an existing tab, including tabs in `✅ Open Browser Use` or an earlier `handoff` task group. If the URL/title/group clearly matches the current task, claim that tab and continue from it instead of opening a duplicate.
+6. Use the CLI for simple inspection or one-shot actions: `info`, `tabs`, `user-tabs`, `history`, `open-tab`, `navigate`, `cdp`, and `call`.
+7. Use `open-browser-use run` / `obu run` for CLI-level multi-step orchestration when a small line-oriented action plan is enough and writing SDK code would be unnecessary.
+8. If the surrounding agent runtime supports local MCP servers, configure `obu mcp` and call the exposed browser tools directly. Use the `run_action_plan` MCP tool for the same line-oriented orchestration from MCP. Read [references/sdk-and-protocol.md](references/sdk-and-protocol.md).
+9. Use the JavaScript, Python, or Go SDK for larger multi-step workflows, event subscriptions, richer control flow, or when the surrounding agent runtime already runs code. Read [references/sdk-and-protocol.md](references/sdk-and-protocol.md).
+10. Before ending browser work, release or keep session tabs with `open-browser-use finalize-tabs --session-id "$OBU_SESSION_ID" --keep '<json-array>'`, the MCP `finalize_tabs` tool, or the SDK `finalizeTabs` / `finalize_tabs` / `FinalizeTabs` method.
+11. If communication fails after setup, read [references/troubleshooting.md](references/troubleshooting.md).
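
The naming conventions in workflow steps 3 and 4 can be sketched as two small helpers. This is a minimal Python sketch and not part of the commit: `slugify`, `make_session_id`, and `make_group_name` are hypothetical names, and using Unix seconds for `<timestamp>` is an assumption, since the skill text only gives the `obu-<task-slug>-<timestamp>` shape.

```python
import re
import time
from typing import Optional


def slugify(label: str) -> str:
    """Lowercase the label and collapse runs of non-alphanumerics into '-'."""
    return re.sub(r"[^a-z0-9]+", "-", label.lower()).strip("-")


def make_session_id(task_label: str, now: Optional[float] = None) -> str:
    """Build an id shaped like `obu-<task-slug>-<timestamp>`.

    Assumption: the timestamp is whole Unix seconds; the skill does not
    prescribe a format.
    """
    timestamp = int(time.time() if now is None else now)
    slug = slugify(task_label) or "task"
    return f"obu-{slug}-{timestamp}"


def make_group_name(task_label: Optional[str]) -> str:
    """Build `<short task> - OBU`, falling back to `Task - OBU`."""
    label = (task_label or "").strip()
    return f"{label or 'Task'} - OBU"
```

Whichever id is produced, the same value would then be passed as `--session-id` to every command in the task, as step 3 requires.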
 
 ## Operating Rules
 
@@ -29,6 +30,8 @@ Open Browser Use connects an MV3 Chrome extension, a local native messaging host
 - Do not assume Codex.app helpers, Node REPL globals, or a bundled plugin UI are available. Use the installed `open-browser-use` / `obu` CLI or the published SDKs.
 - Do not guess tab ids. List tabs first, then use ids returned by `tabs`, `user-tabs`, `open-tab`, or SDK calls.
 - Prefer `claim-tab` / `claimUserTab` for existing user tabs. Claiming should be based on the current `user-tabs` result and visible evidence such as URL, title, recency, or group.
+- For follow-up tasks, inspect `user-tabs` before opening a tab and reuse a matching tab from `✅ Open Browser Use` or a previous handoff group. A deliverable tab can be claimed back into the new task session, worked on, and finalized as `deliverable` again when it remains the user-facing result. This keeps repeated work on the same page converged to one live tab.
+- Do not claim unrelated deliverable tabs just because they are in `✅ Open Browser Use`. If several tabs plausibly match, prefer the most recent exact URL/title match; ask the user when the match is ambiguous.
 - Use `--socket` only when the user or runtime provides an explicit socket. Otherwise let the CLI and SDKs discover the active socket registry.
 - Do not rely on the CLI fallback session `obu-cli` for agent tasks. Always pass a task-unique `--session-id` to CLI and MCP commands, or set `sessionId` / `session_id` / `SessionID` in SDK clients. The fallback exists for quick manual use and can reuse stale task groups across unrelated agent sessions.
 - Direct CLI subcommands and `open-browser-use run` can share the same browser session only when they use the same explicit `--session-id`. Finalize that same session before ending browser work.
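
The reuse rules added in this hunk amount to a three-way decision: claim, open a new tab, or ask the user. The sketch below is one possible reading in Python, not the project's implementation. It assumes the `user-tabs` result has already been parsed into a list of dicts carrying the `url`, `title`, and `tabGroup` fields named in the skill; the `id` key, the function name `decide_tab_reuse`, and the decision labels are hypothetical. The "prefer the most recent match" tie-break is left out because the source does not name a recency field, so multiple exact matches are treated as ambiguous.

```python
from typing import Optional, Tuple


def decide_tab_reuse(
    user_tabs: list,
    target_url: str,
    target_title: Optional[str] = None,
) -> Tuple[str, Optional[dict]]:
    """Decide whether to claim an existing tab, open a new one, or ask the user.

    Matching is on exact URL (and title, when given), never on group
    membership alone, so unrelated deliverable tabs are not claimed.
    """
    matches = [
        tab
        for tab in user_tabs
        if tab.get("url") == target_url
        and (target_title is None or tab.get("title") == target_title)
    ]
    if not matches:
        return ("open_new", None)   # no suitable tab exists
    if len(matches) == 1:
        return ("claim", matches[0])  # one clear match: reuse it
    return ("ask_user", None)       # several candidates: ambiguous
```

A caller would act on `"claim"` with `claim-tab` / `claimUserTab` and on `"open_new"` with `open-tab`, both under the task's own session id.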
@@ -155,6 +158,7 @@ each individual browser operation.
 - Session tabs are tabs Open Browser Use has created or claimed for the current agent workflow.
 - Use one unique session id per agent task or conversation. Do not share the fallback `obu-cli` session across unrelated tasks.
 - Task session groups should be named from the task, using the pattern `<short task> - OBU`. Use `Task - OBU` as the fallback name.
+- At the start of a related follow-up task, list all user tabs and check `tabGroup`, `title`, and `url` before creating anything new. Claim an existing matching deliverable or handoff tab into the current session; only open a new tab when no suitable tab exists.
 - Keep no tabs by default: `open-browser-use finalize-tabs --session-id "$OBU_SESSION_ID" --keep '[]'`.
 - Keep a tab only when the user needs that live page after the turn. Omit research, source, search, intermediate, duplicate, blank, error, and login/navigation tabs after extracting what you need.
 - Keep a tab with `status: "deliverable"` when the tab itself is the user-facing output or requested open page, such as a created or edited document, dashboard, checkout/cart, submitted form result, or a page the user explicitly asked to inspect directly.
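
The finalize step in these lifecycle rules can be sketched as building the CLI argument list. This Python sketch uses only the `finalize-tabs`, `--session-id`, and `--keep` names that appear in the skill text; the helper name `finalize_tabs_argv` is hypothetical, and the shape of each keep entry is not specified here beyond its `status` field, so entries are passed through unchanged.

```python
import json
from typing import Optional


def finalize_tabs_argv(session_id: str, keep: Optional[list] = None) -> list:
    """Build the argv for `open-browser-use finalize-tabs`.

    With no keep entries this yields `--keep '[]'`, the documented
    keep-nothing default. Entries are serialized as given; their exact
    schema is an assumption left to the caller.
    """
    keep_json = json.dumps(keep or [])
    return [
        "open-browser-use",
        "finalize-tabs",
        "--session-id",
        session_id,
        "--keep",
        keep_json,
    ]
```

Passing the list to `subprocess.run(argv, check=True)` avoids shell quoting of the JSON array.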
