feat: Add screenshot capture functionality #33

Open
halapenyoharry wants to merge 2 commits into eyalzh:main from halapenyoharry:add-screenshot-feature

@halapenyoharry

Summary

This PR adds screenshot capture functionality to browser-control-mcp, enabling AI assistants to capture visual snapshots of browser tabs. This is particularly useful for visual collaboration scenarios like working with ComfyUI workflows.

Changes

  • Added new capture-browser-screenshot MCP tool
  • Implemented screenshot capture using Firefox's captureVisibleTab API
  • Screenshots are saved to the system temp directory with timestamps
  • Added required permissions to manifest.json
  • Updated TypeScript types for proper message handling
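
As a rough illustration of the save step listed above, here is a minimal sketch of how the server side might write a capture to the temp directory. It assumes the extension hands back the base64 PNG data URL that captureVisibleTab resolves to by default; the helper names (buildScreenshotPath, decodeDataUrl, saveScreenshot) and the file name pattern are hypothetical and not taken from this PR.

```typescript
import { promises as fs } from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// Hypothetical helper: builds a timestamped file path inside a directory
// (the system temp directory by default).
export function buildScreenshotPath(
  now: Date = new Date(),
  dir: string = os.tmpdir()
): string {
  // ISO timestamps contain ":" and ".", which are awkward in file names.
  const stamp = now.toISOString().replace(/[:.]/g, "-");
  return path.join(dir, `screenshot-${stamp}.png`);
}

// Hypothetical helper: extracts the raw image bytes from a base64 PNG data URL.
export function decodeDataUrl(dataUrl: string): Buffer {
  const match = /^data:image\/png;base64,([A-Za-z0-9+/=]+)$/.exec(dataUrl);
  if (!match) {
    throw new Error("Expected a base64-encoded PNG data URL");
  }
  return Buffer.from(match[1], "base64");
}

// Writes the decoded image to the temp directory and returns the file path.
export async function saveScreenshot(
  dataUrl: string,
  now: Date = new Date()
): Promise<string> {
  const filePath = buildScreenshotPath(now);
  await fs.writeFile(filePath, decodeDataUrl(dataUrl));
  return filePath;
}
```

Returning the file path rather than the image bytes keeps the tool response small and matches the "saved locally" behaviour described in this PR; the actual implementation may differ.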

Implementation Details

  • Server side: New tool in server.ts and method in browser-api.ts
  • Extension side: Handler in message-handler.ts using native browser API
  • Security: Screenshots saved locally only, no network transmission
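
For readers unfamiliar with the extension side, the handler described above could look roughly like the sketch below. The message shapes (cmd, correlationId, resource) and the function name handleCaptureScreenshot are assumptions made for illustration, not the types added in this PR. The tabs API is passed in as a parameter so the sketch runs outside a browser; in the extension it would be browser.tabs, whose captureVisibleTab resolves to a data URL of the visible tab area.

```typescript
// Hypothetical request/response shapes; the real types in the PR may differ.
interface CaptureScreenshotRequest {
  cmd: "capture-screenshot";
  correlationId: string;
}

type CaptureScreenshotResponse =
  | { resource: "screenshot"; correlationId: string; dataUrl: string }
  | { resource: "error"; correlationId: string; message: string };

// Minimal slice of the WebExtensions tabs API used by this sketch.
// Called with no arguments, captureVisibleTab captures the active tab
// of the current window as a PNG data URL.
interface TabsApi {
  captureVisibleTab(): Promise<string>;
}

// Captures the visible tab and wraps the result (or the failure) in a
// response that carries the request's correlation ID.
export async function handleCaptureScreenshot(
  req: CaptureScreenshotRequest,
  tabs: TabsApi
): Promise<CaptureScreenshotResponse> {
  try {
    const dataUrl = await tabs.captureVisibleTab();
    return { resource: "screenshot", correlationId: req.correlationId, dataUrl };
  } catch (err) {
    // Typical failures include a missing permission or a restricted page.
    const message = err instanceof Error ? err.message : String(err);
    return { resource: "error", correlationId: req.correlationId, message };
  }
}
```

In the extension this would be invoked as handleCaptureScreenshot(request, browser.tabs). Modelling the response as a discriminated union lets the server distinguish a capture from a failure without inspecting free-form strings.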

Testing

  • Built successfully with npm run build
  • Ready for testing with Firefox/Zen browser

Use Case

Perfect for AI assistants helping with visual workflows:

  • Debugging UI issues
  • Capturing workflow states (e.g., ComfyUI)
  • Visual documentation
  • Collaborative troubleshooting

Future Enhancements

  • Full-page screenshots (currently viewport only)
  • Multiple image formats
  • Element-specific captures

Fixes #[issue-number] (if applicable)

halapenyoharry and others added 2 commits August 13, 2025 11:31
- Add new 'capture-browser-screenshot' tool to MCP server
- Implement screenshot capture in Firefox extension using captureVisibleTab API
- Save screenshots to temp directory with timestamp
- Add required permissions to manifest.json
- Update TypeScript types for screenshot messages
- Add comprehensive documentation for the feature

This enables AI assistants to capture visual snapshots of browser tabs,
making it perfect for workflow collaboration scenarios like ComfyUI.

- Add activeTab permission to manifest.json for better screenshot support
- Add comprehensive debug logging to screenshot capture process
- Improve error handling and API availability checking