skills/browser/SKILL.md
Lines changed: 4 additions & 7 deletions
@@ -1,12 +1,10 @@
  ---
  name: browser-automation
  description: |
- Vision-driven browser automation using Midscene. Operates entirely from screenshots — no DOM or accessibility labels required. Can interact with all visible elements on screen regardless of technology stack.
+ Vision-driven browser automation using Midscene. Operates from screenshots — no DOM or accessibility labels needed.

- Runs in a headless Puppeteer browser — does NOT take over the user's mouse or keyboard. The user can continue using their computer while automation runs.
- This is the preferred skill for testing web applications. Only use "Desktop Computer Automation" for native desktop apps.
-
- Also supports CDP (Chrome DevTools Protocol) mode — connect to an existing Chrome browser started with --remote-debugging-port, including remote browsers in Docker or CI environments.
+ Runs in headless Puppeteer — does NOT take over the user's mouse or keyboard.
+ Also supports CDP mode to connect to an existing Chrome via remote debugging.

  Use this skill when the user wants to:
  - Browse, navigate, or open web pages
@@ -16,8 +14,7 @@ description: |
  - Take screenshots of web pages
  - Automate multi-step web workflows
  - Test what was just built, see if it works in browser
- - Connect to an existing Chrome via CDP, DevTools Protocol, or remote debugging
- - Automate a remote browser in Docker, cloud, or CI environment
+ - Connect to Chrome via CDP, DevTools Protocol, or remote debugging