Commit 3e6d53f

Merge pull request #16 from proyecto26/develop
Release 1.0.0
2 parents 12846cd + 4cab243 commit 3e6d53f

647 files changed: +37318 -43330 lines changed

.claude/agents/backend-engineer.md

Lines changed: 868 additions & 0 deletions
Large diffs are not rendered by default.

.claude/agents/devops-engineer.md

Lines changed: 497 additions & 0 deletions
Large diffs are not rendered by default.

.claude/agents/frontend-engineer.md

Lines changed: 801 additions & 0 deletions
Large diffs are not rendered by default.
Lines changed: 253 additions & 0 deletions
@@ -0,0 +1,253 @@
---
name: agent-browser
description: Automates browser interactions for web testing, form filling, screenshots, and data extraction. Use when the user needs to navigate websites, interact with web pages, fill forms, take screenshots, test web applications, or extract information from web pages.
allowed-tools: Bash(agent-browser:*)
---

# Browser Automation with agent-browser

## Quick start

```bash
agent-browser open <url>        # Navigate to page
agent-browser snapshot -i       # Get interactive elements with refs
agent-browser click @e1         # Click element by ref
agent-browser fill @e2 "text"   # Fill input by ref
agent-browser close             # Close browser
```

## Core workflow

1. Navigate: `agent-browser open <url>`
2. Snapshot: `agent-browser snapshot -i` (returns elements with refs like `@e1`, `@e2`)
3. Interact using refs from the snapshot
4. Re-snapshot after navigation or significant DOM changes

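The four steps above can be sketched as two small shell helpers. This is only an illustration: the `ab` wrapper, the `AGENT_BROWSER` override variable, and the function names are assumptions introduced here, not part of agent-browser. The override exists so the sketch can be dry-run with `AGENT_BROWSER=echo`.

```bash
# Hypothetical wrapper: runs agent-browser unless AGENT_BROWSER names another command.
ab() { "${AGENT_BROWSER:-agent-browser}" "$@"; }

# Steps 1 and 2: navigate, then list interactive elements with their refs.
open_and_snapshot() {
  ab open "$1" && ab snapshot -i
}

# Steps 3 and 4: run one interaction (e.g. click @e1), then re-snapshot
# so that later commands use refs from the current page state.
act_and_resnapshot() {
  ab "$@" && ab snapshot -i
}
```

A typical call sequence would be `open_and_snapshot https://example.com`, reading the refs from its output, and then `act_and_resnapshot click @e1`.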
## Commands

### Navigation
```bash
agent-browser open <url>   # Navigate to URL
agent-browser back         # Go back
agent-browser forward      # Go forward
agent-browser reload       # Reload page
agent-browser close        # Close browser
```

### Snapshot (page analysis)
```bash
agent-browser snapshot              # Full accessibility tree
agent-browser snapshot -i           # Interactive elements only (recommended)
agent-browser snapshot -c           # Compact output
agent-browser snapshot -d 3         # Limit depth to 3
agent-browser snapshot -s "#main"   # Scope to CSS selector
```

### Interactions (use @refs from snapshot)
```bash
agent-browser click @e1             # Click
agent-browser dblclick @e1          # Double-click
agent-browser focus @e1             # Focus element
agent-browser fill @e2 "text"       # Clear and type
agent-browser type @e2 "text"       # Type without clearing
agent-browser press Enter           # Press key
agent-browser press Control+a       # Key combination
agent-browser keydown Shift         # Hold key down
agent-browser keyup Shift           # Release key
agent-browser hover @e1             # Hover
agent-browser check @e1             # Check checkbox
agent-browser uncheck @e1           # Uncheck checkbox
agent-browser select @e1 "value"    # Select dropdown
agent-browser scroll down 500       # Scroll page
agent-browser scrollintoview @e1    # Scroll element into view
agent-browser drag @e1 @e2          # Drag and drop
agent-browser upload @e1 file.pdf   # Upload files
```

### Get information
```bash
agent-browser get text @e1        # Get element text
agent-browser get html @e1        # Get innerHTML
agent-browser get value @e1       # Get input value
agent-browser get attr @e1 href   # Get attribute
agent-browser get title           # Get page title
agent-browser get url             # Get current URL
agent-browser get count ".item"   # Count matching elements
agent-browser get box @e1         # Get bounding box
```

### Check state
```bash
agent-browser is visible @e1   # Check if visible
agent-browser is enabled @e1   # Check if enabled
agent-browser is checked @e1   # Check if checked
```

### Screenshots & PDF
```bash
agent-browser screenshot            # Screenshot to stdout
agent-browser screenshot path.png   # Save to file
agent-browser screenshot --full     # Full page
agent-browser pdf output.pdf        # Save as PDF
```

### Video recording
```bash
agent-browser record start ./demo.webm      # Start recording (uses current URL + state)
agent-browser click @e1                     # Perform actions
agent-browser record stop                   # Stop and save video
agent-browser record restart ./take2.webm   # Stop current + start new recording
```
Recording creates a fresh browser context but preserves cookies and storage from your session. If no URL is provided, it automatically returns to your current page. For smooth demos, explore the page first, then start recording.

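A recording session can be wrapped so that `record stop` always runs, even if one of the recorded actions fails. The sketch below is an assumption-laden illustration: `ab`, `AGENT_BROWSER`, and `record_clicks` are hypothetical names introduced here, and only the `record start`, `click`, and `record stop` commands listed above are used.

```bash
# Hypothetical wrapper: runs agent-browser unless AGENT_BROWSER names another command.
ab() { "${AGENT_BROWSER:-agent-browser}" "$@"; }

# record_clicks VIDEO REF...
# Starts a recording, clicks each ref in order, and always stops the
# recording so the video file is saved even when a click fails.
record_clicks() {
  video="$1"
  shift
  ab record start "$video" || return 1
  status=0
  for ref in "$@"; do
    ab click "$ref" || { status=1; break; }
  done
  ab record stop || status=1
  return "$status"
}
```

For example, `record_clicks ./demo.webm @e1 @e2` would record two clicks, assuming both refs come from a snapshot taken before recording started.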
### Wait
```bash
agent-browser wait @e1                    # Wait for element
agent-browser wait 2000                   # Wait milliseconds
agent-browser wait --text "Success"       # Wait for text
agent-browser wait --url "**/dashboard"   # Wait for URL pattern
agent-browser wait --load networkidle     # Wait for network idle
agent-browser wait --fn "window.ready"    # Wait for JS condition
```

### Mouse control
```bash
agent-browser mouse move 100 200   # Move mouse
agent-browser mouse down left      # Press button
agent-browser mouse up left        # Release button
agent-browser mouse wheel 100      # Scroll wheel
```

### Semantic locators (alternative to refs)
```bash
agent-browser find role button click --name "Submit"
agent-browser find text "Sign In" click
agent-browser find label "Email" fill "user@test.com"
agent-browser find first ".item" click
agent-browser find nth 2 "a" text
```

### Browser settings
```bash
agent-browser set viewport 1920 1080        # Set viewport size
agent-browser set device "iPhone 14"        # Emulate device
agent-browser set geo 37.7749 -122.4194     # Set geolocation
agent-browser set offline on                # Toggle offline mode
agent-browser set headers '{"X-Key":"v"}'   # Extra HTTP headers
agent-browser set credentials user pass     # HTTP basic auth
agent-browser set media dark                # Emulate color scheme
```

### Cookies & Storage
```bash
agent-browser cookies                  # Get all cookies
agent-browser cookies set name value   # Set cookie
agent-browser cookies clear            # Clear cookies
agent-browser storage local            # Get all localStorage
agent-browser storage local key        # Get specific key
agent-browser storage local set k v    # Set value
agent-browser storage local clear      # Clear all
```

### Network
```bash
agent-browser network route <url>               # Intercept requests
agent-browser network route <url> --abort       # Block requests
agent-browser network route <url> --body '{}'   # Mock response
agent-browser network unroute [url]             # Remove routes
agent-browser network requests                  # View tracked requests
agent-browser network requests --filter api     # Filter requests
```

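Mocking a response usually pairs a `network route` call with a matching `network unroute` once the page has been exercised. The following sketch shows one way to keep the two together. The names `ab`, `AGENT_BROWSER`, and `with_mocked_route` are hypothetical, and the example URL is a placeholder; how agent-browser matches the `<url>` argument is not specified above, so treat the route argument as an assumption.

```bash
# Hypothetical wrapper: runs agent-browser unless AGENT_BROWSER names another command.
ab() { "${AGENT_BROWSER:-agent-browser}" "$@"; }

# with_mocked_route ROUTE_URL BODY PAGE_URL
# Installs a mocked response, opens the page, then removes the route
# again regardless of whether the page load succeeded.
with_mocked_route() {
  route_url="$1"
  body="$2"
  page_url="$3"
  ab network route "$route_url" --body "$body" || return 1
  ab open "$page_url"
  status=$?
  ab network unroute "$route_url"
  return "$status"
}
```

A call such as `with_mocked_route https://example.com/api/users '{}' https://example.com` would serve an empty JSON object for that request while the page loads.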
### Tabs & Windows
```bash
agent-browser tab             # List tabs
agent-browser tab new [url]   # New tab
agent-browser tab 2           # Switch to tab
agent-browser tab close       # Close tab
agent-browser window new      # New window
```

### Frames
```bash
agent-browser frame "#iframe"   # Switch to iframe
agent-browser frame main        # Back to main frame
```

### Dialogs
```bash
agent-browser dialog accept [text]   # Accept dialog
agent-browser dialog dismiss         # Dismiss dialog
```

### JavaScript
```bash
agent-browser eval "document.title"   # Run JavaScript
```

## Example: Form submission

```bash
agent-browser open https://example.com/form
agent-browser snapshot -i
# Output shows: textbox "Email" [ref=e1], textbox "Password" [ref=e2], button "Submit" [ref=e3]

agent-browser fill @e1 "user@example.com"
agent-browser fill @e2 "password123"
agent-browser click @e3
agent-browser wait --load networkidle
agent-browser snapshot -i   # Check result
```

## Example: Authentication with saved state

```bash
# Login once
agent-browser open https://app.example.com/login
agent-browser snapshot -i
agent-browser fill @e1 "username"
agent-browser fill @e2 "password"
agent-browser click @e3
agent-browser wait --url "**/dashboard"
agent-browser state save auth.json

# Later sessions: load saved state
agent-browser state load auth.json
agent-browser open https://app.example.com/dashboard
```

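The two halves of the example above can be combined into one helper that logs in only when no saved state file exists. This is a sketch under stated assumptions: `ab`, `AGENT_BROWSER`, `ensure_logged_in`, `APP_USER`, and `APP_PASS` are hypothetical names introduced here, and the refs `@e1` to `@e3` are assumed to match the login form as in the example.

```bash
# Hypothetical wrapper: runs agent-browser unless AGENT_BROWSER names another command.
ab() { "${AGENT_BROWSER:-agent-browser}" "$@"; }

# ensure_logged_in STATE_FILE
# Reuses saved state when the file exists; otherwise performs the login
# flow and saves the state. Credentials come from the (hypothetical)
# APP_USER and APP_PASS environment variables.
ensure_logged_in() {
  state_file="$1"
  if [ -f "$state_file" ]; then
    ab state load "$state_file" || return 1
  else
    ab open https://app.example.com/login || return 1
    ab snapshot -i || return 1
    ab fill @e1 "$APP_USER" || return 1
    ab fill @e2 "$APP_PASS" || return 1
    ab click @e3 || return 1
    ab wait --url "**/dashboard" || return 1
    ab state save "$state_file" || return 1
  fi
  ab open https://app.example.com/dashboard
}
```

The sketch does not detect an expired session inside an existing state file; in that case deleting the file and calling `ensure_logged_in auth.json` again would force a fresh login.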
## Sessions (parallel browsers)

```bash
agent-browser --session test1 open site-a.com
agent-browser --session test2 open site-b.com
agent-browser session list
```

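Because each session is addressed by name, several pages can be opened concurrently from one script. The sketch below assumes that separate `--session` invocations are safe to run as background jobs, which the section title suggests but does not spell out. `ab`, `AGENT_BROWSER`, and `open_in_sessions` are hypothetical names used only for illustration.

```bash
# Hypothetical wrapper: runs agent-browser unless AGENT_BROWSER names another command.
ab() { "${AGENT_BROWSER:-agent-browser}" "$@"; }

# open_in_sessions URL...
# Opens each URL in its own session (test1, test2, ...) as a background
# job, waits for all of them, then lists the sessions.
open_in_sessions() {
  i=0
  for url in "$@"; do
    i=$((i + 1))
    ab --session "test$i" open "$url" &
  done
  wait
  ab session list
}
```

Calling `open_in_sessions site-a.com site-b.com` mirrors the example above, with both `open` commands running at the same time.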
## JSON output (for parsing)

Add `--json` for machine-readable output:
```bash
agent-browser snapshot -i --json
agent-browser get text @e1 --json
```

## Debugging

```bash
agent-browser open example.com --headed   # Show browser window
agent-browser --cdp 9222 snapshot         # Connect via CDP
agent-browser console                     # View console messages
agent-browser console --clear             # Clear console
agent-browser errors                      # View page errors
agent-browser errors --clear              # Clear errors
agent-browser highlight @e1               # Highlight element
agent-browser record start ./debug.webm   # Record from current page
agent-browser record stop                 # Save recording
agent-browser trace start                 # Start recording trace
agent-browser trace stop trace.zip        # Stop and save trace
```
