agent-browser upload command cannot handle dynamically created file input elements #1102
Description
The upload command in agent-browser is less flexible than Playwright MCP's browser_file_upload: it requires a selector for the file input, so it cannot handle inputs that the page creates dynamically.
Comparison
Playwright MCP (Working)
// 1. Click button to trigger file dialog
await page.getByRole('button', { name: '添加材料' }).click(); // button label means "Add materials"
// → File chooser dialog opens, MCP detects it
// 2. Upload directly without specifying element
await browser_file_upload({ paths: ["test.md"] });
// → Works! File is uploaded successfully
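For reference, the dialog-driven flow above matches Playwright's own `filechooser` event. The following is a minimal sketch, assuming `page` is a Playwright `Page` (or any object exposing `waitForEvent`). `uploadViaChooser` and `trigger` are illustrative names, not part of Playwright or Playwright MCP.

```javascript
// Sketch: upload through the file chooser instead of targeting an <input>.
// `uploadViaChooser` and `trigger` are hypothetical names used for illustration.
async function uploadViaChooser(page, trigger, paths) {
  // Start waiting before the click so the event cannot be missed.
  const chooserPromise = page.waitForEvent('filechooser');
  await trigger();
  const chooser = await chooserPromise;
  // No selector is needed: the chooser already references the input that opened it.
  await chooser.setFiles(paths);
  return chooser;
}
```

A call might look like `await uploadViaChooser(page, () => page.getByRole('button', { name: '添加材料' }).click(), ['test.md'])`.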
agent-browser (Not Working for dynamic inputs)
1. Need to manually specify the input element
agent-browser upload "input[type=file]" "test.md"
→ Fails if the input cannot be located by a selector (for example, it has no ID)
2. Inputs created dynamically via JavaScript cannot be located at all
Root Cause
Both set files through CDP under the hood (the relevant method is DOM.setFileInputFiles; setInputFiles is the name of Playwright's API), but agent-browser lacks the ability to:
- Detect when a native file dialog is open
- Associate the open dialog with the correct input element
- Handle elements created dynamically by JavaScript
Mechanism difference:
- Playwright MCP: Reactive - detects file chooser dialog automatically
- agent-browser: Imperative - requires manual element targeting
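The reactive approach can be sketched directly on top of CDP. This is a minimal sketch, not agent-browser's or Playwright MCP's actual implementation. It assumes `session` exposes `send(method, params)` and `on(event, handler)`, as the CDP session objects in Playwright and Puppeteer do. The class name `FileChooserTracker` and its methods are hypothetical. The CDP names used are `Page.setInterceptFileChooserDialog`, the `Page.fileChooserOpened` event, and `DOM.setFileInputFiles`.

```javascript
// Sketch of dialog-aware upload over a CDP session (hypothetical class name).
class FileChooserTracker {
  constructor(session) {
    this.session = session;
    this.pending = null; // most recently intercepted chooser, if any
  }

  async enable() {
    // With interception on, the browser emits this event instead of
    // showing the native dialog.
    this.session.on('Page.fileChooserOpened', (event) => {
      this.pending = { backendNodeId: event.backendNodeId, mode: event.mode };
    });
    await this.session.send('Page.setInterceptFileChooserDialog', { enabled: true });
  }

  async upload(paths) {
    if (!this.pending) throw new Error('No file chooser is open');
    const { backendNodeId, mode } = this.pending;
    // backendNodeId is only present when an <input type="file"> opened the chooser.
    if (backendNodeId === undefined) throw new Error('Chooser has no input element');
    if (mode === 'selectSingle' && paths.length > 1) {
      throw new Error('This input accepts a single file');
    }
    this.pending = null;
    // The input is addressed by node id, so no selector is required.
    // Assumption: callers pass absolute paths.
    await this.session.send('DOM.setFileInputFiles', { files: paths, backendNodeId });
  }
}
```

Because the event carries the node id of the input that opened the chooser, this design would not depend on the input having an ID or any other selectable attribute.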
Example Scenario
On many web apps (e.g., meeting-canvas), file upload buttons trigger a dynamically created <input type="file"> element that has no ID or other attribute a selector could target:
// Frontend code creates input dynamically
const input = document.createElement("input");
input.type = "file";
input.accept = ".md,.txt,.pptx,..."; // Accepts various file types
input.click(); // Opens native file dialog
For this pattern:
- Playwright MCP: ✅ Can upload via dialog detection
- agent-browser: ❌ Cannot locate the dynamic input element
Request
Consider adding similar dialog-aware functionality to agent-browser:
- Detect native file chooser dialog state
- Auto-associate open dialogs with the triggering input
- Allow upload without explicitly specifying the input selector when a dialog is open
This would make agent-browser's upload command as user-friendly as Playwright MCP's browser_file_upload.
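One way the requested behavior could look at the command level is an optional selector that falls back to an open dialog. The sketch below is purely illustrative: `resolveUploadTarget`, its argument shape, and the `pendingChooser` object (a record of the currently open file chooser, holding a `backendNodeId`) are assumptions, not existing agent-browser code.

```javascript
// Hypothetical target resolution for an upload command with an optional selector.
function resolveUploadTarget({ selector, files }, pendingChooser) {
  if (!files || files.length === 0) {
    throw new Error('At least one file path is required');
  }
  // An explicit selector keeps today's behavior.
  if (selector) return { kind: 'selector', selector, files };
  // Otherwise fall back to the input behind the open file chooser.
  if (pendingChooser) {
    return { kind: 'chooser', backendNodeId: pendingChooser.backendNodeId, files };
  }
  throw new Error('No selector given and no file chooser is open');
}
```

Keeping the explicit selector as the first choice would leave existing invocations unchanged, while a selector-less call would only succeed when a dialog is actually open.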