agent-browser upload command cannot handle dynamically created file input elements #1102

@moyaduo

Issue: agent-browser upload command cannot handle dynamically created file input elements

Description

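The upload command in agent-browser requires an explicit element selector, whereas Playwright MCP's browser_file_upload attaches files to whichever file chooser is currently open. This makes agent-browser's upload less usable and less flexible, particularly for file inputs that a page creates on the fly.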

Comparison

Playwright MCP (Working)

// 1. Click the button ("添加材料" = "Add material") to trigger the file dialog
await page.getByRole('button', { name: '添加材料' }).click();
// → File chooser dialog opens and MCP detects it

// 2. Call the browser_file_upload tool without specifying an element
await browser_file_upload({ paths: ["test.md"] });
// → Works: the file is uploaded successfully

agent-browser (Not Working for dynamic inputs)

1. Need to manually specify the input element

agent-browser upload "input[type=file]" "test.md"

→ Fails if the selector does not match an element in the DOM (for example, the input has no ID or other stable attribute)

2. Inputs created dynamically via JS cannot be located by a selector at all

Root Cause

Both ultimately set the files on an input element through CDP (the relevant method is DOM.setFileInputFiles), but agent-browser lacks the ability to:

  1. Detect when a native file dialog is open
  2. Associate the open dialog with the correct input element
  3. Handle elements created dynamically in JavaScript

Mechanism difference:

  • Playwright MCP: Reactive - detects file chooser dialog automatically
  • agent-browser: Imperative - requires manual element targeting
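
The reactive approach can be sketched with Playwright's page-level filechooser event. This is a minimal illustration, assuming Playwright MCP builds on this event (the issue does not show MCP internals). The helper name uploadViaFileChooser and its parameters are hypothetical; page.waitForEvent('filechooser') and fileChooser.setFiles() are existing Playwright APIs.

```javascript
// Sketch only: `uploadViaFileChooser` is a hypothetical helper, not part of
// Playwright MCP or agent-browser. `page` is assumed to be a Playwright Page.
async function uploadViaFileChooser(page, trigger, paths) {
  // Subscribe before triggering the dialog so the event cannot be missed.
  const chooserPromise = page.waitForEvent('filechooser');

  // `trigger` is any action that opens the dialog, e.g. a button click.
  await trigger();

  // The chooser object carries a reference to the input that opened it,
  // so the caller never has to supply a selector.
  const chooser = await chooserPromise;
  await chooser.setFiles(paths);
  return chooser;
}

// Possible usage inside a Playwright script:
// await uploadViaFileChooser(
//   page,
//   () => page.getByRole('button', { name: '添加材料' }).click(),
//   ['test.md'],
// );
```

The point of the design is ordering: the wait is registered first, the click happens second, and the files are attached to whatever input the browser reports as the dialog's owner.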

Example Scenario

On many web apps (e.g., meeting-canvas), file upload buttons trigger a dynamically created <input type="file"> element without any ID or selector:

// Frontend code creates input dynamically
const input = document.createElement("input");
input.type = "file";
input.accept = ".md,.txt,.pptx,..."; // Accepts various file types
input.click(); // Opens native file dialog

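Because the input is never appended to the document, a selector such as input[type=file] has nothing to match. For this pattern: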

  • Playwright MCP: ✅ Can upload via dialog detection
  • agent-browser: ❌ Cannot locate the dynamic input element

Request

Consider adding similar dialog-aware functionality to agent-browser:

  1. Detect native file chooser dialog state
  2. Auto-associate open dialogs with the triggering input
  3. Allow upload without explicitly specifying the input selector when a dialog is open

This would make agent-browser's upload command as user-friendly as Playwright MCP's browser_file_upload.
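
The three requested capabilities could map onto existing CDP primitives roughly as follows. This is a sketch under stated assumptions, not agent-browser's implementation: the `cdp` object with send(method, params) and on(event, handler) is a hypothetical session wrapper, and enableDialogAwareUpload is an invented name. The CDP methods and event used (Page.enable, Page.setInterceptFileChooserDialog, Page.fileChooserOpened, DOM.setFileInputFiles) do exist, although backendNodeId on the event is marked experimental and optional.

```javascript
// Sketch only: `cdp` is a hypothetical CDP session exposing
// send(method, params) and on(event, handler).
async function enableDialogAwareUpload(cdp) {
  let pending = null; // most recent intercepted file chooser, if any

  // (1) Detect dialog state: remember each chooser the page tries to open.
  cdp.on('Page.fileChooserOpened', (event) => {
    pending = event; // { frameId, mode, backendNodeId? }
  });

  await cdp.send('Page.enable');
  // Suppress the native dialog and emit Page.fileChooserOpened instead.
  await cdp.send('Page.setInterceptFileChooserDialog', { enabled: true });

  return {
    isDialogOpen: () => pending !== null,

    // (3) Upload without a selector while a chooser is pending.
    async upload(files) {
      if (pending === null) {
        throw new Error('No file chooser is open');
      }
      if (pending.backendNodeId === undefined) {
        throw new Error('Browser did not report the triggering input');
      }
      // (2) Associate the dialog with its input via backendNodeId.
      const { backendNodeId } = pending;
      pending = null;
      // `files` are paths as seen by the browser process.
      await cdp.send('DOM.setFileInputFiles', { files, backendNodeId });
    },
  };
}
```

With something like this in place, an upload command invoked without a selector could check isDialogOpen() and fall back to the current selector-based path when no chooser is pending.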

upload_test_demo.html
