Open
Description
I’m observing significant performance overhead when using Skyvern for relatively simple browser automation tasks, specifically job application form filling.
For context, even a basic workflow involving 5 to 6 input fields (text/email/select fields, no complex client-side validation) is taking around 4–5 minutes end-to-end to complete. This latency feels disproportionately high for the workload size and becomes a blocker for scaling high-volume automation use cases.
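As a rough back-of-the-envelope check, the figures above can be turned into a per-field cost. This uses only the numbers quoted in this report (5 to 6 fields, 4 to 5 minutes) and assumes the time is spread evenly across fields, which may not hold in practice.

```python
# Figures taken from the report above: 5-6 fields, 4-5 minutes end-to-end.
fields_min, fields_max = 5, 6
total_s_min, total_s_max = 4 * 60, 5 * 60

# Best case: the shortest run spread over the most fields.
best_per_field_s = total_s_min / fields_max
# Worst case: the longest run spread over the fewest fields.
worst_per_field_s = total_s_max / fields_min

print(f"{best_per_field_s:.0f}-{worst_per_field_s:.0f} s per field")
```

Under that even-spread assumption, each field costs somewhere between 40 and 60 seconds.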
Observed Behavior
- Each interaction (page load, DOM inspection, field inference, typing) appears to incur noticeable delays
- Overall execution time appears to grow linearly with the number of fields, even for trivial forms
- The majority of time seems spent between steps rather than on actual browser rendering or network requests
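To back the observations above with numbers, something like the following could separate time spent in browser actions from time spent between steps. This is a minimal sketch using only the Python standard library; `StepTimer` and the phase names are hypothetical and are not part of Skyvern, and the `time.sleep` calls are stand-ins for real work.

```python
import time
from contextlib import contextmanager


class StepTimer:
    """Accumulates wall-clock time per named phase (hypothetical helper)."""

    def __init__(self, clock=time.perf_counter):
        self._clock = clock
        self.totals = {}

    @contextmanager
    def phase(self, name):
        # Record elapsed time for this phase even if the body raises.
        start = self._clock()
        try:
            yield
        finally:
            elapsed = self._clock() - start
            self.totals[name] = self.totals.get(name, 0.0) + elapsed

    def share(self, name):
        # Fraction of all recorded time that was spent in the given phase.
        total = sum(self.totals.values())
        return self.totals.get(name, 0.0) / total if total else 0.0


timer = StepTimer()
with timer.phase("browser"):
    time.sleep(0.01)  # stand-in for page load / typing
with timer.phase("between_steps"):
    time.sleep(0.03)  # stand-in for snapshotting, inference, waits

print(f"between_steps share: {timer.share('between_steps'):.0%}")
```

Wrapping the real calls in `timer.phase(...)` would show whether the between-step share dominates, as it appears to here.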
Expected Behavior
For simple deterministic form-filling flows:
- Sub-minute execution time
- Lower per-action overhead once the DOM is stable
Possible Causes (Speculative)
From initial investigation, the slowdown may be related to one or more of the following:
- Repeated DOM snapshotting / accessibility tree extraction per action
- LLM-in-the-loop decision making for each small interaction instead of a compiled action plan
- Conservative wait strategies (implicit waits, retries, or safety buffers)
Questions / Workarounds
- Are there recommended performance-oriented configs for simple, deterministic flows (e.g., reduced reasoning depth, aggressive timeouts, disabled re-validation)?
- Is it possible to:
  - Cache DOM state between actions?
  - Precompute an execution plan instead of invoking reasoning per step?
- Are there benchmarks or guidance on expected latency per action?
- Any roadmap plans around fast-path execution for low-complexity automations?
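To make the "precompute an execution plan" question concrete, here is a minimal sketch of the pattern I have in mind. Everything in it is hypothetical and is not Skyvern's API: `PlanCache`, `naive_planner` (a stand-in for the expensive reasoning step), and the `fill` callback (which could wrap something like Playwright's `page.fill(selector, value)`). The idea is to plan once per form structure and then replay the plan without reasoning per step.

```python
import hashlib
import json


class PlanCache:
    """Caches one plan per form structure (hypothetical, not Skyvern's API)."""

    def __init__(self, planner):
        self._planner = planner
        self._plans = {}
        self.planner_calls = 0

    @staticmethod
    def fingerprint(fields):
        # Hash only the form structure so the key is stable across visits.
        ordered = sorted(fields, key=lambda f: f["name"])
        canonical = json.dumps(ordered, sort_keys=True)
        return hashlib.sha256(canonical.encode()).hexdigest()

    def plan_for(self, fields):
        key = self.fingerprint(fields)
        if key not in self._plans:
            # The expensive step runs only on a cache miss.
            self.planner_calls += 1
            self._plans[key] = self._planner(fields)
        return self._plans[key]


def naive_planner(fields):
    # Stand-in for the expensive reasoning step: map each field to a
    # CSS selector and the applicant-data key that should fill it.
    return [(f"[name='{f['name']}']", f["name"]) for f in fields]


def run(cache, fields, applicant, fill):
    # Replay the cached plan with no per-step reasoning.
    for selector, key in cache.plan_for(fields):
        fill(selector, applicant[key])


form = [{"name": "email", "type": "email"}, {"name": "city", "type": "text"}]
cache = PlanCache(naive_planner)
filled = []
record = lambda selector, value: filled.append((selector, value))

run(cache, form, {"email": "a@example.com", "city": "Pune"}, record)
run(cache, form, {"email": "b@example.com", "city": "Oslo"}, record)
print(cache.planner_calls, len(filled))
```

The second run reuses the first plan, so the planner is invoked once for both applicants. Whether something equivalent is feasible inside Skyvern is exactly what I am asking.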
Use Case Impact
This significantly affects:
- High-throughput job application automation
- Cost efficiency when running multiple agents concurrently
- Comparison with lower-level tools (e.g., Playwright/Puppeteer + heuristics), which complete similar tasks in seconds
I would appreciate any guidance on tuning, architectural insights, or upcoming optimizations in this area.