feat: add smart commands — natural language element interaction without LLM
Like Stagehand's act("Click login"), but with zero LLM cost:
- `smart-click <text>`: click by visible text, fuzzy matching across
  textContent, aria-label, title, placeholder, value. Shows score and alternatives
- `smart-fill <label> <value>`: fill input by label/placeholder text,
  React-compatible, searches associated labels, aria-label, name, id
- `smart-select <label> <option>`: select dropdown option by text
All three commands use fuzzy scoring: exact (100) > startsWith (80) > includes (60) > partial (20-50)
No CSS selectors needed. No LLM API calls. Pure DOM intelligence.
Available as MCP tools: browser_smart_click, browser_smart_fill, browser_smart_select
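The scoring tiers listed above could be sketched roughly as follows. This is an illustration, not the code from this commit: the function names `fuzzy_score` and `rank` are hypothetical, case-insensitive comparison is an assumption, and the commit message does not say how the partial tier (20-50) is computed, so the word-overlap formula below is only one plausible choice.

```python
def fuzzy_score(query: str, candidate: str) -> int:
    """Score a candidate string against a query using tiered matching.

    Tiers follow the commit message: exact 100, startsWith 80, includes 60,
    partial 20-50. Lowercasing and the partial formula are assumptions.
    """
    q = query.strip().lower()
    c = candidate.strip().lower()
    if not q or not c:
        return 0
    if c == q:
        return 100
    if c.startswith(q):
        return 80
    if q in c:
        return 60
    # Assumed partial tier: share of query words found anywhere in the
    # candidate, mapped linearly onto the 20-50 range.
    words = q.split()
    hits = sum(1 for word in words if word in c)
    if hits == 0:
        return 0
    return 20 + round(30 * hits / len(words))


def rank(query: str, candidates: list[str], limit: int = 3) -> list[tuple[int, str]]:
    """Return the best-scoring candidates first, dropping non-matches."""
    scored = [(fuzzy_score(query, text), text) for text in candidates]
    scored = [pair for pair in scored if pair[0] > 0]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:limit]
```

Under these assumptions, `rank("login", ["Login", "Login with SSO", "Sign up"])` would put the exact match first and keep the prefix match as an alternative, which mirrors the "score and alternatives" output the commit describes.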
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
"""Run a .cdp script file — sequential commands, one per line.
@@ -3878,6 +4165,12 @@ def _register_tools(self):
"inputSchema": {"type": "object", "properties": {"selector": {"type": "string", "description": "CSS selector to match elements (e.g. 'table tr', '.product', 'a')"}, "format": {"type": "string", "enum": ["text", "json", "list"], "description": "Output format: 'text' (one per line), 'json' (full structure with attrs), 'list' (numbered)", "default": "text"}}, "required": ["selector"]}},
{"name": "browser_observe", "description": "List all interactive elements on the current page with their available actions (CLICK, FILL, NAVIGATE, TOGGLE, SELECT, SUBMIT, UPLOAD). Like Stagehand observe() but deterministic — no LLM needed. Shows what you CAN DO on the page. Use this to understand page structure before acting.",
+ {"name": "browser_smart_click", "description": "Click an element by its visible text — no CSS selector needed. Uses fuzzy matching across text content, aria-label, title, placeholder, and value. Returns match score and alternatives. Like Stagehand act('Click login') but without LLM cost. Use when you know WHAT to click but not the exact selector.",
+ "inputSchema": {"type": "object", "properties": {"text": {"type": "string", "description": "Visible text of the element to click (e.g. 'Login', 'Submit Order', 'Learn more')"}}, "required": ["text"]}},
+ {"name": "browser_smart_fill", "description": "Fill an input by its label or placeholder text — no CSS selector needed. Finds input by associated label, placeholder, aria-label, name, or id. React-compatible value setting. Use when you know WHAT field to fill but not the exact selector.",
+ "inputSchema": {"type": "object", "properties": {"label": {"type": "string", "description": "Label or placeholder text of the input (e.g. 'Email', 'Password', 'Search')"}, "value": {"type": "string", "description": "Value to fill in the input"}}, "required": ["label", "value"]}},
+ {"name": "browser_smart_select", "description": "Select a dropdown option by label and option text — no CSS selector needed. Finds the select element by label/name, then selects the matching option. Use for dropdown interactions without knowing selectors.",
+ "inputSchema": {"type": "object", "properties": {"label": {"type": "string", "description": "Label text of the select dropdown"}, "option": {"type": "string", "description": "Text of the option to select"}}, "required": ["label", "option"]}},
{"name": "browser_describe", "description": "Get a comprehensive page description combining three data sources: (1) accessibility tree with @N references for interactive elements, (2) a PNG screenshot saved to disk, and (3) visible text content. Use this when browser_a11y alone is insufficient — for canvas/WebGL content, visual verification, or complex dynamic UIs. This is the vision fallback tool.",
{"name": "browser_assert", "description": "Assert that an element matching the CSS selector exists and optionally contains expected text. Returns PASS or FAIL with details. Use this for automated testing — verify page state after navigation or interaction. Checks visibility by default.",
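For readers who want to call the three new tools from an MCP client, a request could be assembled along these lines. The tool names and required argument keys are taken from the `inputSchema` entries in the diff; the JSON-RPC 2.0 envelope with method `tools/call` follows the MCP specification. The helper name `build_tool_call` and the `SMART_TOOL_REQUIRED` table are hypothetical, and how the request is transported to this particular server is not shown in the diff.

```python
import json

# Required argument keys, copied from the "required" lists in the diff.
SMART_TOOL_REQUIRED = {
    "browser_smart_click": ["text"],
    "browser_smart_fill": ["label", "value"],
    "browser_smart_select": ["label", "option"],
}


def build_tool_call(name: str, arguments: dict, request_id: int = 1) -> dict:
    """Build an MCP tools/call request after checking required arguments."""
    if name not in SMART_TOOL_REQUIRED:
        raise ValueError(f"unknown smart tool: {name}")
    missing = [key for key in SMART_TOOL_REQUIRED[name] if key not in arguments]
    if missing:
        raise ValueError(f"{name} is missing required arguments: {missing}")
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


if __name__ == "__main__":
    request = build_tool_call(
        "browser_smart_fill", {"label": "Email", "value": "user@example.com"}
    )
    print(json.dumps(request, indent=2))
```

Checking the required keys on the client side is optional, since a server can validate against its own schema, but it surfaces mistakes such as a missing `value` before any browser action is attempted.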