Commit 4889958

Updates agent-browser skill
1 parent 5618137 commit 4889958

2 files changed

Lines changed: 77 additions & 6 deletions

.claude/skills/agent-browser/SKILL.md

Lines changed: 76 additions & 5 deletions
@@ -25,6 +25,46 @@ All commands are run via Bash: `agent-browser <command> [args]`.
 | Wait | `agent-browser wait <selector>` or `agent-browser wait <ms>` |
 | Semantic find | `agent-browser find role button click` |
 | Close | `agent-browser close` |
+| Hover | `agent-browser hover <selector-or-ref>` |
+| Double-click | `agent-browser dblclick <selector-or-ref>` |
+| Select dropdown | `agent-browser select <selector-or-ref> <value>` |
+| Check/Uncheck | `agent-browser check <sel>` / `agent-browser uncheck <sel>` |
+| Scroll | `agent-browser scroll <up\|down\|left\|right> [px]` |
+| Upload file | `agent-browser upload <sel> <file...>` |
+| Evaluate JS | `agent-browser eval "<js>"` |
+| Save PDF | `agent-browser pdf <path>` |
+| Go back/forward | `agent-browser back` / `agent-browser forward` |
+| Reload | `agent-browser reload` |
+| Get element count | `agent-browser get count <sel>` |
+| Check visibility | `agent-browser is visible <sel>` |
+| Check enabled | `agent-browser is enabled <sel>` |
+| Drag and drop | `agent-browser drag <src> <dst>` |
+
+### Debug Quick Reference
+
+| Action | Command |
+|--------|---------|
+| View console logs | `agent-browser console` |
+| View page errors | `agent-browser errors` |
+| Start trace | `agent-browser trace start` |
+| Stop trace | `agent-browser trace stop [path]` |
+| Start video recording | `agent-browser record start <path>` |
+| Stop video recording | `agent-browser record stop` |
+| Highlight element | `agent-browser highlight <sel>` |
+| Debug mode | Add `--debug` flag to any command |
+| Headed mode (visible) | Add `--headed` flag to any command |
+| Network requests log | `agent-browser network requests` |
+
+### Browser Settings
+
+| Setting | Command |
+|---------|---------|
+| Set viewport | `agent-browser set viewport <w> <h>` |
+| Set device | `agent-browser set device <name>` |
+| Set dark/light mode | `agent-browser set media light` or `agent-browser set media dark` |
+| Set geolocation | `agent-browser set geo <lat> <lng>` |
+| Go offline | `agent-browser set offline on` |
+| Set headers | `agent-browser set headers '<json>'` |

 ## Selectors

@@ -33,13 +73,34 @@ Three types:
 - **Snapshot refs**: `@e1`, `@e2` (from `snapshot` output)
 - **Semantic**: `find role button "Submit"` (ARIA roles, text, labels)

+## Defaults
+
+- **Default test URL**: `http://localhost:5173`
+- **Default login credentials**: `admin@example.com` / `test1234`
+- **Default screenshot mode**: Light mode (run `agent-browser set media light` before capturing); required when creating docs for `@wiki`
+
+## Dev Server Check (REQUIRED)
+
+Before running any browser automation, **always** check whether a dev server is already listening on the target port and whether it belongs to this worktree, so that you don't accidentally use a dev server from a different worktree:
+
+```bash
+# 1. Check what's already running on the target port
+lsof -i :5173 -t 2>/dev/null && echo "Port in use" || echo "Port free"
+
+# 2. If in use, verify the process CWD matches this worktree
+ls -l /proc/$(lsof -i :5173 -t 2>/dev/null | head -1)/cwd 2>/dev/null
+```
+
+If the running server's working directory does NOT match the current worktree, start a new dev server on an available port and use that instead.
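
A note on the check above: `/proc/<pid>/cwd` exists on Linux only, and `lsof -i :5173 -t` also lists processes that merely hold a connection to the port, not just the listener. A minimal sketch of a more portable check follows, assuming `lsof` is installed. The function names (`canonical_dir`, `is_inside`, `listener_cwd`) are hypothetical and not part of the skill, and treating a server started from a subdirectory of the worktree as a match is an assumption of this sketch.

```bash
# Hypothetical helpers; not part of agent-browser or the skill file.

# Resolve a directory to its canonical physical path (symlinks followed).
canonical_dir() {
  (cd "$1" 2>/dev/null && pwd -P)
}

# Succeed if directory $1 is directory $2 or lies somewhere inside it.
is_inside() {
  local child parent
  child=$(canonical_dir "$1") || return 1
  parent=$(canonical_dir "$2") || return 1
  case "$child/" in
    "${parent%/}"/*) return 0 ;;
    *) return 1 ;;
  esac
}

# Print the working directory of the first process LISTENING on TCP port $1.
listener_cwd() {
  local pid
  pid=$(lsof -iTCP:"$1" -sTCP:LISTEN -t 2>/dev/null | head -1)
  [ -n "$pid" ] || return 1
  if [ -e "/proc/$pid/cwd" ]; then
    readlink "/proc/$pid/cwd"                              # Linux
  else
    lsof -a -p "$pid" -d cwd -Fn 2>/dev/null | sed -n 's/^n//p'  # e.g. macOS
  fi
}

# Example (run from the worktree root):
#   cwd=$(listener_cwd 5173) && is_inside "$cwd" "$PWD" \
#     && echo "server belongs to this worktree" \
#     || echo "start a separate dev server"
```

Comparing canonical paths rather than raw strings avoids false mismatches when the worktree is reached through a symlink.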
+
 ## Agent Workflow

-1. **Navigate + snapshot**: `agent-browser open <url> && agent-browser snapshot --json`
-2. **Parse refs** from JSON output to identify interactive elements
-3. **Act** using refs: `agent-browser click @e2`, `agent-browser fill @e3 "hello"`
-4. **Re-snapshot** after each action to observe new state
-5. **Screenshot** when visual verification is needed
+1. **Check dev server** (see above); start one if needed
+2. **Navigate + snapshot**: `agent-browser open <url> && agent-browser snapshot --json`
+3. **Parse refs** from JSON output to identify interactive elements
+4. **Act** using refs: `agent-browser click @e2`, `agent-browser fill @e3 "hello"`
+5. **Re-snapshot** after each action to observe new state
+6. **Screenshot** when visual verification is needed (light mode for wiki docs)
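
The navigate, act, and re-snapshot steps in the new list can be sketched as two small shell helpers. This is only an illustration: the helper names and the `AGENT_BROWSER_BIN` variable are hypothetical, and the sketch uses only subcommands that appear in this skill (`open`, `snapshot --json`, `click`, `fill`). Parsing refs out of the JSON is left to the caller, since the snapshot schema is not shown in this diff.

```bash
# Hypothetical wrappers; AGENT_BROWSER_BIN is an illustrative override,
# not an option of agent-browser itself.
AB="${AGENT_BROWSER_BIN:-agent-browser}"

# Navigate + snapshot: open a URL, then print the snapshot as JSON.
open_and_snapshot() {
  "$AB" open "$1" && "$AB" snapshot --json
}

# Act + re-snapshot: run one action, then print the fresh snapshot,
# so refs are always re-read after the page may have changed.
act_and_snapshot() {
  "$AB" "$@" && "$AB" snapshot --json
}

# Example:
#   open_and_snapshot http://localhost:5173
#   act_and_snapshot fill @e3 "hello"
#   act_and_snapshot click @e2
```

Pairing every action with a snapshot keeps the caller from reusing refs that a page mutation has invalidated.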

 ## JSON Output

@@ -52,6 +113,16 @@ Add `--json` to any command for structured output:

 Use `--session <name>` for isolated sessions or `--profile <path>` for persistent cookies/storage.

+## Troubleshooting
+
+If you run into issues with any command, run `agent-browser --help` to check available commands and flags. Common fixes:
+
+- **Element not found**: Re-run `agent-browser snapshot` to get fresh refs; refs change after page mutations
+- **Timeout**: Use `agent-browser wait <sel>` before interacting with dynamically loaded elements
+- **Can't see what's happening**: Add `--headed` to see the browser, or `--debug` for verbose output
+- **Stale session**: Run `agent-browser close` and start fresh
+- **Unexpected command output**: Add `--json` (supported by most commands) to get structured output that is easier to inspect
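
The **Timeout** fix in the list above can be wrapped in a small helper so the wait is never forgotten. A minimal sketch, assuming the argument order shown elsewhere in this skill (`wait <selector>`, `fill <selector> <text>`); the name `wait_then`, the `AGENT_BROWSER_BIN` variable, and the example selectors are hypothetical.

```bash
# Hypothetical helper; AGENT_BROWSER_BIN is an illustrative override only.
AB="${AGENT_BROWSER_BIN:-agent-browser}"

# Wait for a selector to appear, then run an action against it.
# Usage: wait_then <selector> <action> [extra args...]
wait_then() {
  local sel action
  sel=$1
  action=$2
  shift 2
  "$AB" wait "$sel" && "$AB" "$action" "$sel" "$@"
}

# Example (selectors are made up for illustration):
#   wait_then "#email" fill "admin@example.com"
#   wait_then "#submit" click
```

If the wait itself fails, the action is skipped and the helper returns the non-zero status from `wait`.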
+
 ## Full Documentation

 See https://github.com/vercel-labs/agent-browser for complete docs, cloud provider setup, and WebSocket streaming.

wiki

Submodule wiki updated from 9b80455 to fa70ff1
