Web app workflow automation: forms, dashboards, SaaS tools via Playwright #661

@kovtcharov

Summary

Extend browser automation (#458) to enable reusable workflow automation for common web apps — banking dashboards, SaaS tools, government portals, internal tools. The agent learns multi-step web workflows and replays them on demand.

Strategic Context

From the OpenClaw strategy (§9.5):

Financial tracking and budgeting — Sensitive Data: Yes, AMD Local Advantage: Very strong (privacy) — Tier 2: Fast follow

Browser automation is the universal integration layer. Instead of building dedicated API adapters for every SaaS tool, the agent uses the same web interface humans use.

Use Cases Enabled

Financial Tracking (Tier 2)

  • Log into banking dashboard, extract transaction history
  • Categorize spending, generate budget summaries
  • Monitor account balances and alert on anomalies
  • "Show me my spending this month by category"
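A minimal sketch of the categorize-and-summarize step, assuming transactions have already been extracted as (description, amount) pairs. The `RULES` table, `categorize`, and `spending_by_category` are hypothetical names for illustration; the issue does not specify how categorization is done, and a real agent might use a model rather than keyword rules.

```python
from collections import defaultdict
from decimal import Decimal
from typing import Dict, Iterable, Tuple

# Hypothetical keyword-to-category rules (an assumption, not part of the issue).
RULES: Dict[str, str] = {
    "grocery": "Food",
    "restaurant": "Food",
    "electric": "Utilities",
    "rent": "Housing",
}


def categorize(description: str) -> str:
    """Return the category of the first rule keyword found in the description."""
    text = description.lower()
    for keyword, category in RULES.items():
        if keyword in text:
            return category
    return "Other"


def spending_by_category(
    transactions: Iterable[Tuple[str, str]],
) -> Dict[str, Decimal]:
    """Sum outgoing amounts (negative values) per category."""
    totals: Dict[str, Decimal] = defaultdict(Decimal)
    for description, amount in transactions:
        value = Decimal(amount)  # Decimal avoids float rounding on currency
        if value < 0:  # skip deposits and refunds
            totals[categorize(description)] += -value
    return dict(totals)


summary = spending_by_category([
    ("Grocery store", "-54.20"),
    ("City Electric Co", "-80.00"),
    ("Salary", "2500.00"),
    ("Corner restaurant", "-23.80"),
])
```

Amounts are passed as strings and parsed with `Decimal` so that the totals stay exact, which matters once the summary is compared against a budget.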

Business Tool Automation

  • Fill out expense reports, timesheets, HR forms
  • Extract data from CRM dashboards (Salesforce, HubSpot)
  • Submit support tickets, check order status
  • "Fill out my timesheet for this week"

Government and Utility Portals

  • Check permit status, download documents
  • Pay utility bills, monitor usage
  • "Check if my building permit was approved"

Architecture

Builds on #458 (BrowserToolsMixin) and extends it with:

  1. Workflow recorder — Agent watches user demonstrate a task, records steps
  2. Parameterized replay — Replay recorded workflows with different inputs
  3. Structured data extraction — Parse tables, forms, dashboards into agent-readable data
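The recorder and parameterized replay could be sketched roughly as follows. This is an illustration under stated assumptions, not the design from #458: the `Step` and `Workflow` classes, the `${name}` placeholder syntax, and the `execute` callback are all hypothetical. The callback stands in for whatever browser driver the agent uses, so the sketch runs without a browser.

```python
from dataclasses import dataclass, field
from string import Template
from typing import Callable, Dict, List


@dataclass
class Step:
    """One recorded browser action (hypothetical schema, not from #458)."""
    action: str      # e.g. "goto", "fill", "click"
    target: str      # URL or selector; may contain ${name} placeholders
    value: str = ""  # text to type; may contain ${name} placeholders


@dataclass
class Workflow:
    name: str
    steps: List[Step] = field(default_factory=list)

    def record(self, action: str, target: str, value: str = "") -> None:
        """Append a step as the user demonstrates the task."""
        self.steps.append(Step(action, target, value))

    def bind(self, params: Dict[str, str]) -> List[Step]:
        """Return concrete steps with every ${name} placeholder filled in.

        Template.substitute raises KeyError when a parameter is missing,
        so a replay never runs with an unfilled placeholder.
        """
        return [
            Step(
                s.action,
                Template(s.target).substitute(params),
                Template(s.value).substitute(params),
            )
            for s in self.steps
        ]


def replay(workflow: Workflow, params: Dict[str, str],
           execute: Callable[[Step], None]) -> None:
    """Bind parameters, then hand each concrete step to a browser driver."""
    for step in workflow.bind(params):
        execute(step)


# Record once with placeholders (hr.example.com is a made-up URL)...
timesheet = Workflow("fill_timesheet")
timesheet.record("goto", "https://hr.example.com/timesheet?week=${week}")
timesheet.record("fill", "#hours", "${hours}")
timesheet.record("click", "button[type=submit]")

# ...then replay with different inputs. log.append stands in for a real driver.
log: List[Step] = []
replay(timesheet, {"week": "2025-W14", "hours": "40"}, log.append)
```

Keeping the recorded steps as plain data separates what was demonstrated from how it is executed, so the same workflow can be replayed, inspected, or stored locally. One caveat of using `string.Template`: a literal `$` in a selector (for example the CSS `$=` attribute operator) would need to be written as `$$`.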

Dependencies

Acceptance Criteria

  • Agent can navigate a multi-step web workflow (login → navigate → extract → report)
  • Agent can fill out web forms from natural language instructions
  • Agent can extract tabular data from web dashboards
  • Workflow recording captures steps for parameterized replay
  • Privacy: all browser data stays local, no cloud relay
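To make the tabular-extraction criterion concrete, here is a minimal sketch that turns an HTML table into a list of records using only the standard library. It assumes the agent already holds the page HTML as a string (with Playwright, `page.content()` would be one way to get it) and that the first row holds the column headers. `TableExtractor` and `table_to_records` are hypothetical names.

```python
from html.parser import HTMLParser
from typing import Dict, List


class TableExtractor(HTMLParser):
    """Collect every <tr> in the document as a list of cell texts."""

    def __init__(self) -> None:
        super().__init__()
        self.rows: List[List[str]] = []
        self._row: List[str] = []
        self._cell: List[str] = []
        self._in_cell = False

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th"):
            self._in_cell = True
            self._cell = []

    def handle_data(self, data):
        # Text outside a cell (whitespace between tags) is ignored.
        if self._in_cell:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th"):
            self._row.append("".join(self._cell).strip())
            self._in_cell = False
        elif tag == "tr" and self._row:
            self.rows.append(self._row)


def table_to_records(html: str) -> List[Dict[str, str]]:
    """Treat the first row as headers and map each later row onto them."""
    parser = TableExtractor()
    parser.feed(html)
    if not parser.rows:
        return []
    header, *body = parser.rows
    return [dict(zip(header, row)) for row in body]


sample = """
<table>
  <tr><th>Date</th><th>Description</th><th>Amount</th></tr>
  <tr><td>2025-04-01</td><td>Grocery store</td><td>-54.20</td></tr>
  <tr><td>2025-04-03</td><td>Salary</td><td>2500.00</td></tr>
</table>
"""
records = table_to_records(sample)
```

This handles a single, simple table. Real dashboards often use nested tables, `colspan`, or rows rendered by JavaScript into `div` grids, which would need more than this sketch covers.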

Labels

  • browser-use (Browser automation and control features)
  • domain:agent-core (Framework, tools, registry, memory, skills, orchestration)
  • enhancement (New feature or request)
  • track:consumer-app (Hermes-competitor consumer product: mobile-first, voice + messaging + memory + skills)
