- 4b5fc78: ### Bug Fixes
  - Material Design checkbox/radio parity - Restored Playwright-parity behavior for `check`/`uncheck` actions on Material Design controls. These components hide the native `<input>` off-screen and use overlay elements that intercept coordinate-based clicks; the actions now detect this pattern and fall back to a JS `.click()` to toggle state correctly. Also improves `ischecked` to handle nested hidden inputs and ARIA-only checkboxes (#837)
  - Punctuation handling in `type` command - Fixed incorrect virtual key (VK) codes being used for punctuation characters (e.g. `.`, `@`) in the `type` action, which previously caused those characters to be dropped or mistyped (#836)
- a3d9662: ### Bug Fixes
  - Restored WebSocket streaming - Fixed broken WebSocket streaming in the native daemon by keeping the StreamServer instance alive so the broadcast channel remains open, and ensuring CDP session IDs and connection status are correctly propagated to stream clients (#826)
  - Filtered internal Chrome targets - Fixed auto-connect discovery incorrectly attempting to attach to Chrome-internal pages (e.g. `chrome://`, `chrome-extension://`, `devtools://` URLs), which could cause unexpected connection failures (#827)
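
The target filtering described in a3d9662 might look roughly like the sketch below. It is illustrative only: the helper names `is_internal_target` and `attachable_targets` are hypothetical, and the prefix list is taken from the examples in the entry rather than from the project's source.

```rust
/// URL prefixes treated as Chrome-internal (from the examples in the entry above).
const INTERNAL_PREFIXES: [&str; 3] = ["chrome://", "chrome-extension://", "devtools://"];

/// Returns true when a discovered target should be skipped during auto-connect.
/// Hypothetical helper name, not the project's actual API.
fn is_internal_target(url: &str) -> bool {
    INTERNAL_PREFIXES.iter().any(|prefix| url.starts_with(*prefix))
}

/// Keeps only the target URLs that are candidates for attaching.
fn attachable_targets<'a>(urls: &[&'a str]) -> Vec<&'a str> {
    urls.iter()
        .copied()
        .filter(|url| !is_internal_target(url))
        .collect()
}
```

Filtering before attaching keeps the discovery loop from ever opening a session on a page that cannot be automated.
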
- 51d9ab4: ### Bug Fixes
  - Appium v3 iOS capabilities - Added `appium:` vendor prefix to iOS capabilities (e.g., `appium:automationName`, `appium:deviceName`, `appium:platformVersion`) to comply with the Appium v3 WebDriver protocol requirements (#810)
  - Snapshot `--selector` scoping - Fixed `snapshot --selector` so that the output is properly scoped to the matched element's subtree rather than returning the full accessibility tree. The selector now resolves the target DOM node's backend IDs and filters the accessibility tree to only include nodes within that subtree (#825)
- daf7263: ### Bug Fixes
  - Fixed video duration being reported incorrectly when using real-time ffmpeg encoding for screen recording (#812)
  - Removed obsolete `BrowserManager` TypeScript API references that no longer reflect the current CLI-based usage model (#821)
  - Updated README to replace outdated `BrowserManager` programmatic API examples with the current CLI-based approach using `execSync` and `agent-browser` commands (#821)
  - Removed the Programmatic API section covering `BrowserManager` screencast and input injection methods, which are no longer part of the public API (#821)
- 25a1526: ### New Features
  - Brave Browser support - Added auto-discovery of Brave Browser for CDP connections on macOS, Linux, and Windows. The agent will now automatically detect and connect to Brave alongside Chrome, Chromium, and Canary installations (#817)
  - Postinstall message - The post-install message now detects existing Chrome installations on the system. If a compatible browser is found, it confirms the path and notes it will be used automatically instead of prompting an install. If no browser is detected, the warning is clearer and mentions that installation can be skipped when using `--cdp`, `--provider`, `--engine`, or `--executable-path` (#815)
- fa91c22: ### Bug Fixes
  - Stale accessibility tree reference fallback - Fixed an issue where interacting with an element whose `backend_node_id` had become stale (e.g. after the DOM was replaced) would fail with a `Could not compute box model` CDP error. Element resolution now re-queries the accessibility tree using role/name lookup to obtain a fresh node ID before retrying the operation (#806)
- fc091d2: ### Bug Fixes
  - Daemon panic on broken stderr pipe - Replaced all `eprintln!` calls with `writeln!(std::io::stderr(), ...)` wrapped in `let _ =` to silently discard write errors, preventing the daemon from panicking when the parent process drops the stderr pipe during Chrome launch (#802)
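
For readers unfamiliar with the pattern in fc091d2, here is a minimal sketch of best-effort logging. `eprintln!` panics when the write to stderr fails, whereas discarding the `Result` of `writeln!` does not. The names `log_best_effort`, `log_stderr`, and `ClosedPipe` are made up for this example and are not taken from the daemon.

```rust
use std::io::{self, Write};

/// Best-effort logging to any writer. `let _ =` discards the `io::Result`,
/// so a failed write is ignored instead of panicking the way `eprintln!` does.
fn log_best_effort<W: Write>(mut out: W, message: &str) {
    let _ = writeln!(out, "{message}");
}

/// Convenience wrapper for the stderr case described in the entry above.
fn log_stderr(message: &str) {
    log_best_effort(io::stderr(), message);
}

/// Test double standing in for a stderr pipe whose reader has gone away.
struct ClosedPipe;

impl Write for ClosedPipe {
    fn write(&mut self, _buf: &[u8]) -> io::Result<usize> {
        Err(io::Error::new(io::ErrorKind::BrokenPipe, "reader closed"))
    }

    fn flush(&mut self) -> io::Result<()> {
        Ok(())
    }
}
```

Calling `log_best_effort(ClosedPipe, "dropped")` returns normally even though every write fails, which is the behavior the fix relies on.
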
- e2ebde2: ### Bug Fixes
  - Broadcast channel lag handling - Fixed an issue where broadcast channel lag errors were incorrectly treated as stream closure, causing premature termination of event listeners in reload, response body, download, and navigation wait operations. Lagged messages are now skipped and the loop continues instead of breaking (#797)
  - Removed unused pnpm setup steps from the `global-install` CI job, simplifying the workflow configuration (#798)
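
The lag-versus-closure distinction in e2ebde2 can be sketched without any async runtime. The `RecvError` enum below is a local stand-in whose variants mirror `tokio::sync::broadcast::error::RecvError`; that the daemon uses Tokio's broadcast channel is an assumption of this example, and `drain_events` is a hypothetical helper.

```rust
/// Local stand-in for a broadcast receiver's error type (assumed to mirror
/// Tokio's `RecvError`; not taken from the project's source).
#[derive(Debug)]
enum RecvError {
    /// The receiver fell behind and this many messages were dropped.
    Lagged(u64),
    /// Every sender is gone; no more messages will arrive.
    Closed,
}

/// Consumes receive results in order, returning the events seen and the
/// number of messages lost to lag.
fn drain_events<I>(results: I) -> (Vec<String>, u64)
where
    I: IntoIterator<Item = Result<String, RecvError>>,
{
    let mut events = Vec::new();
    let mut skipped = 0;
    for result in results {
        match result {
            Ok(event) => events.push(event),
            // Lag only means some messages were missed: keep listening.
            Err(RecvError::Lagged(n)) => skipped += n,
            // Only a genuinely closed channel ends the loop.
            Err(RecvError::Closed) => break,
        }
    }
    (events, skipped)
}
```

Treating `Lagged` like `Closed` would end the loop at the first slow consumer, which matches the premature termination the entry describes.
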
- e365909: ### Bug Fixes
  - Chrome launch retry - Chrome will now retry launching up to 3 times with a 500ms delay between attempts, improving resilience against transient startup failures (#791)
  - Remote CDP snapshot hang - Resolved an issue where snapshots would hang indefinitely over remote CDP (WSS) connections by removing WebSocket message and frame size limits to accommodate large responses (e.g. `Accessibility.getFullAXTree`), accepting binary frames from remote proxies such as Browserless, and immediately clearing pending commands when the connection closes rather than waiting for the 30-second timeout (#792)
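
A bounded retry like the Chrome launch retry in e365909 could be written along these lines. This is a generic sketch, not the daemon's code: `launch_with_retry` is a hypothetical name, and the attempt count and delay are parameters so the values from the entry (3 attempts, 500ms) are supplied by the caller.

```rust
use std::thread;
use std::time::Duration;

/// Calls `launch` until it succeeds or `max_attempts` is reached, sleeping
/// `delay` after each failed attempt. Returns the last error if all fail.
/// The closure receives the 1-based attempt number.
fn launch_with_retry<T, E>(
    max_attempts: u32,
    delay: Duration,
    mut launch: impl FnMut(u32) -> Result<T, E>,
) -> Result<T, E> {
    let mut attempt = 1;
    loop {
        match launch(attempt) {
            Ok(value) => return Ok(value),
            // Out of attempts: surface the most recent failure.
            Err(err) if attempt >= max_attempts => return Err(err),
            Err(_) => {
                thread::sleep(delay);
                attempt += 1;
            }
        }
    }
}
```

With the numbers from the entry, a call would look like `launch_with_retry(3, Duration::from_millis(500), |_| spawn_chrome())`, where `spawn_chrome` is a placeholder for whatever actually starts the browser.
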
- 944fa01: ### New Features
  - Linux musl (Alpine) builds - Added pre-built binaries for linux-musl targeting both x64 and arm64 architectures, enabling native support for Alpine Linux and other musl-based distributions without requiring glibc (#784)
  - Consecutive `--auto-connect` commands - Added support for issuing multiple consecutive `--auto-connect` commands without requiring a full browser relaunch; external connections are now correctly identified and reused (#786)
  - External browser disconnect behavior - When using `--auto-connect` or `--cdp`, closing the agent session now disconnects cleanly without shutting down the user's browser process
  - Restored `refs` dict in `--json` snapshot output - The `refs` map containing role and name metadata for referenced elements is now correctly included in JSON snapshot responses (#787)
  - Fixed e2e test assertions for `diff_snapshot` and `domain_filter` to correctly reflect expected behavior (#783)
  - Fixed Chrome temp-dir cleanup test failing on Windows (#766)
- bd05917: ### Bug Fixes
  - Fixed AX tree deserialization to accept integer `nodeId` and `childIds` values for compatibility with Lightpanda, which sends numeric IDs where Chrome sends strings (#775)
  - Fixed misleading SIGPIPE comment to accurately describe the default Rust SIGPIPE behavior and why it is reset to `SIG_DFL` (#776)
  - Fixed WebM recording output to use the VP9 codec (`libvpx-vp9`) instead of H.264, producing valid WebM files; also adds a padding filter to ensure even frame dimensions (#779)
- 235fa88: ### Full Native Rust
  - 100% native Rust -- Removed the entire Node.js/Playwright daemon. The Rust native daemon is now the only implementation. No Node.js runtime or Playwright dependency required. (#754)
  - 99x smaller install -- Install size reduced from 710 MB to 7 MB by eliminating the Node.js dependency tree.
  - 18x less memory -- Daemon memory usage reduced from 143 MB to 8 MB.
  - 1.6x faster cold start -- Cold start time reduced from 1002ms to 617ms.
  - Benchmarks -- Added benchmark suite comparing native vs Node.js daemon performance.
  - Chromium installer hardened -- Fixed zip path traversal vulnerability in Chrome for Testing installer.
  - Fixed `--headed false` flag not being respected in CLI (#757)
  - Fixed "not found" error pattern in `to_ai_friendly_error` incorrectly catching non-element errors (#759)
  - Fixed storage local key lookup parsing and text output (#761)
  - Fixed Lightpanda engine launch with release binaries (#760)
  - Hardened Lightpanda startup timeouts (#762)
- 56bb92b: ### New Features
  - Browserless.io provider -- Added browserless.io as a browser provider, supported in both Node.js and native daemon paths. Connect to remote Browserless instances with `--provider browserless` or `AGENT_BROWSER_PROVIDER=browserless`. Configurable via `BROWSERLESS_API_KEY`, `BROWSERLESS_API_URL`, and `BROWSERLESS_BROWSER_TYPE` environment variables. (#502, #746)
  - `clipboard` command -- Read from and write to the browser clipboard. Supports `read`, `write <text>`, `copy` (simulates Ctrl+C), and `paste` (simulates Ctrl+V) operations. (#749)
  - Screenshot output configuration -- New global flags `--screenshot-dir`, `--screenshot-quality`, `--screenshot-format` and corresponding `AGENT_BROWSER_SCREENSHOT_DIR`, `AGENT_BROWSER_SCREENSHOT_QUALITY`, `AGENT_BROWSER_SCREENSHOT_FORMAT` environment variables for persistent screenshot settings. (#749)
  - Fixed `wait --text` not working in native daemon path (#749)
  - Fixed `BrowserManager.navigate()` and package entry point (#748)
  - Fixed extensions not being loaded from `config.json` (#750)
  - Fixed scroll on page load (#747)
  - Fixed HTML retrieval by using `browser.getLocator()` for selector operations (#745)
- 942b8cd: ### New Features
  - `inspect` command - Opens Chrome DevTools for the active page by launching a local proxy server that forwards the DevTools frontend to the browser's CDP WebSocket. Commands continue to work while DevTools is open. Implemented in both Node.js and native paths. (#736)
  - `get cdp-url` subcommand - Retrieve the Chrome DevTools Protocol WebSocket URL for the active page, useful for external debugging tools. (#736)
  - Native screenshot annotate - The `--annotate` flag for screenshots now works in the native Rust daemon, bringing parity with the Node.js path. (#706)
  - KERNEL_API_KEY now optional - External credential injection no longer requires `KERNEL_API_KEY` to be set, making it easier to use Kernel with pre-configured environments. (#687)
  - Browserbase simplified - Removed the `BROWSERBASE_PROJECT_ID` requirement, reducing setup friction for Browserbase users. (#625)
  - Fixed Browserbase API using incorrect endpoint to release sessions (#707)
  - Fixed CDP connect paths using hardcoded 10s timeout instead of `getDefaultTimeout()` (#704)
  - Fixed lone Unicode surrogates causing errors by sanitizing with `toWellFormed()` (#720)
  - Fixed CDP connection failure on IPv6-first systems (#717)
  - Fixed recordings not inheriting the current viewport settings (#718)
- 94cd888: Added support for device scale factor (retina display) in the viewport command via an optional scale parameter. Also added webview target type support for better Electron application compatibility, and the pages list now includes target type information.
- 94521e7: ### New Features
  - Lightpanda browser engine support - Added `--engine <name>` flag to select the browser engine (`chrome` by default, or `lightpanda`), implying `--native` mode. Configurable via `AGENT_BROWSER_ENGINE` environment variable (#646)
  - Dialog dismiss command - Added support for `dismiss` subcommand in dialog command parsing (#605)
  - Daemon startup error reporting - Daemon startup errors are now surfaced directly instead of showing an opaque timeout message (#614)
  - CDP port discovery - Replaced broken hand-rolled HTTP client with `reqwest` for more reliable CDP port discovery (#619)
  - Chrome extensions - Extensions now load correctly by forcing headed mode when extensions are present (#652)
  - Google Translate bar suppression - Suppressed the Google Translate bar in native headless mode to avoid interference (#649)
  - Auth cookie persistence - Auth cookies are now persisted on browser close in native mode (#650)
  - Fixed native auth login failing due to incompatible encryption format (#648)
  - Improved snapshot usage guidance and added reproducibility check (#630)
  - Added `--engine` flag to the README options table
  - Added benchmarks to the CLI codebase (#637)
- 7d2c895: Fixed an issue where the --native flag was being passed to child processes even when not explicitly specified on the command line. The flag is now only forwarded when the user explicitly provides it, consistent with how other CLI flags like --allow-file-access and --download-path are handled.
- 01ac557: Added AGENT_BROWSER_HEADED environment variable support for running the browser in headed mode, and improved temporary profile cleanup when launching Chrome directly. Also includes documentation clarification that browser extensions work in both headed and headless modes.
- c4180c8: Improved Chrome launch reliability by automatically detecting containerized environments (Docker, Podman, Kubernetes) and enabling --no-sandbox when needed. Added support for discovering Playwright-installed Chromium browsers and enhanced error messages with helpful diagnostics when Chrome fails to launch.
- 05018b3: Added experimental native Rust daemon (`--native` flag, `AGENT_BROWSER_NATIVE=1` env, or `"native": true` in config). The native daemon communicates with Chrome directly via CDP, eliminating Node.js and Playwright dependencies. Supports 150+ commands with full parity to the default Node.js daemon. Includes WebDriver backend for Safari/iOS, CDP protocol codegen, request tracking, frame context management, and comprehensive e2e and parity tests.
- 62241b5: Fixed Windows compatibility issues including proper handling of extended-length path prefixes from canonicalize(), prevention of MSYS/Git Bash path translation that could mangle arguments, and improved daemon startup reliability. Also added ARM64 Windows support in postinstall shims and expanded CI testing with a full daemon lifecycle test on Windows.
- 6aea316: Documentation site improvements and internal tooling updates including enhanced code blocks, mobile navigation, and docs chat components. CLI connection and output handling refinements. Skill creator reference documentation and scripts have been reorganized.
- 7bd8ce9: Added support for chrome:// and chrome-extension:// URLs in navigation and recording commands. These special browser URLs are now preserved as-is instead of having https:// incorrectly prepended.
- 2e38882:
  - Added security hardening: authentication vault, content boundary markers, domain allowlist, action policy, action confirmation, and output length limits.
  - Added `--download-path` flag (and `AGENT_BROWSER_DOWNLOAD_PATH` env / `downloadPath` config key) to set a default download directory.
  - Added `--selector` flag to `scroll` command for scrolling within specific container elements.
- b7665e5:
  - Added `keyboard` command for raw keyboard input -- type with real keystrokes, insert text, and press shortcuts at the currently focused element without needing a selector.
  - Added `--color-scheme` flag and `AGENT_BROWSER_COLOR_SCHEME` env var for persistent dark/light mode preference across browser sessions.
  - Fixed IPC EAGAIN errors (os error 35/11) by adding backpressure-aware socket writes, command serialization, and lowering the default Playwright timeout to 25s (configurable via `AGENT_BROWSER_DEFAULT_TIMEOUT`).
  - Fixed remote debugging (CDP) reconnection.
  - Fixed state load failing when no browser is running.
  - Fixed `--annotate` flag warning appearing when not explicitly passed via CLI.
- ebd8717: Added new diff commands for comparing snapshots, screenshots, and URLs between page states. You can now run visual pixel diffs against baseline images, compare accessibility tree snapshots with customizable depth and selectors, and diff two URLs side-by-side with optional screenshot comparison.
- 69ffad0: Add annotated screenshots with the new --annotate flag, which overlays numbered labels on interactive elements and prints a legend mapping each label to its element ref. This enables multimodal AI models to reason about visual layout while using the same @eN refs for subsequent interactions. The flag can also be set via the AGENT_BROWSER_ANNOTATE environment variable.
- c6fc7df: Added documentation for command chaining with && across README, CLI help output, docs, and skill files, explaining how to efficiently chain multiple agent-browser commands in a single shell invocation since the browser persists via a background daemon.
- 5dc40b4: Added configuration file support with automatic loading from user and project directories, new profiler commands for Chrome DevTools profiling, computed styles getter, browser extension loading, storage state management, and iOS device emulation. Expanded click command with new-tab option, improved find command with additional actions and filtering options, and enhanced CDP connection to accept WebSocket URLs. Documentation has been significantly expanded with new sections for configuration, profiling, and proxy support.
- 1112a16: Added session persistence with automatic save/restore of cookies and localStorage across browser restarts using --session-name flag, with optional AES-256-GCM encryption for saved state data. New state management commands allow listing, showing, renaming, clearing, and cleaning up old session files. Also added --new-tab option for click commands to open links in new tabs.
- 323b6cd: Fix all Clippy lint warnings in the Rust CLI: remove redundant import, use `.first()` instead of `.get(0)`, use `.copied()` instead of `.map(|s| *s)`, use `.contains()` instead of `.iter().any()`, use `then_some` instead of lazy `then`, and simplify redundant match guards.
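
The idioms named in 323b6cd are shown below in their preferred form, with the replaced form in each doc comment. The functions are hypothetical examples written for this note and do not come from the CLI source.

```rust
/// `.first()` replaces `.get(0)` for reading the first element.
fn first_arg<'a>(args: &[&'a str]) -> Option<&'a str> {
    args.first().copied()
}

/// `.copied()` replaces `.map(|p| *p)` when copying out of an iterator.
fn copy_ports(ports: &[u16]) -> Vec<u16> {
    ports.iter().copied().collect()
}

/// `.contains()` replaces `.iter().any(|f| *f == flag)` for membership tests.
fn has_flag(flags: &[&str], flag: &str) -> bool {
    flags.contains(&flag)
}

/// `then_some(value)` replaces `then(|| value)` when the value is cheap to build.
fn headed_arg(headed: bool) -> Option<&'static str> {
    headed.then_some("--headed")
}
```

Each pair behaves identically; the preferred forms are shorter and state the intent directly.
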
- d03e238: Added support for custom executable path in CLI browser launch options. Documentation site received UI improvements including a new chat component with sheet-based interface and updated dependencies.
- 76d23db: Documentation site migrated to MDX for improved content authoring, added AI-powered docs chat feature, and updated README with Homebrew installation instructions for macOS users.
- ae34945: Added --allow-file-access flag to enable opening and interacting with local file:// URLs (PDFs, HTML files) by passing Chromium flags that allow JavaScript access to local files. Added -C/--cursor flag for snapshots to include cursor-interactive elements like divs with onclick handlers or cursor:pointer styles, which is useful for modern web apps using custom clickable elements.
- 9d021bd: Add iOS Simulator and real device support for mobile Safari testing via Appium. New CLI commands include `device list` to show available simulators, `tap` and `swipe` for touch interactions, and the `--device` flag to specify which iOS device to use. Configure with `-p ios` provider flag or `AGENT_BROWSER_PROVIDER=ios` environment variable.
- 17dba8f: Add --stdin flag for eval command to read JavaScript from stdin, enabling heredoc usage for multiline scripts
- daeede4: Add --stdin flag for the eval command to read JavaScript from stdin, enabling heredoc usage for multiline scripts. Also fix binary permission issues on macOS/Linux when postinstall scripts don't run (e.g., with bun).
- 0dc36f2: Add --stdin flag for eval command to read JavaScript from stdin, enabling heredoc usage for multiline scripts
- 2771588: Added base64 encoding support for the eval command with -b/--base64 flag to avoid shell escaping issues when executing JavaScript. Updated documentation with AI agent setup instructions and reorganized the docs structure by consolidating agent mode content into the installation page.
- d24f753: Fixed browser launch options not being passed correctly when using persistent profiles, ensuring args, userAgent, proxy, and ignoreHTTPSErrors settings now work properly. Added pre-flight checks for socket path length limits and directory write permissions to provide clearer error messages when daemon startup fails. Improved error handling to properly exit with failure status when browser launch fails.
- d75350a: Improved daemon connection reliability by adding automatic retry logic for transient errors like connection resets, broken pipes, and temporary resource unavailability. The CLI now cleans up stale socket and PID files before starting a new daemon, and includes better detection of daemon responsiveness to handle race conditions during shutdown.
- cb2f8c3: Fixed version synchronization to automatically update Cargo.lock alongside Cargo.toml during releases, and made the CLI binary executable. This ensures the Rust CLI version stays in sync with the npm package version.
- 759302e: Fixed "Daemon not found" error when running through AI agents (e.g., Claude Code) by resolving symlinks in the executable path. Previously, npm global bin symlinks weren't being resolved correctly, causing intermittent daemon discovery failures.
- 4116a8a: Replaced shell-based CLI wrappers with a cross-platform Node.js wrapper to enable npx support on Windows. Added postinstall logic to patch npm's bin entry on global installs, allowing the native binary to be invoked directly with zero overhead. Added CI tests to verify global installation works correctly across all platforms.
- 7e6336f: Fixed the Windows CMD wrapper to use the native binary directly instead of routing through Node.js, improving startup performance and reliability. Added retry logic to the CI install command to handle transient failures during browser installation.
- 8eec634: Improved release workflow to validate binary file sizes and ensure binaries are executable after npm install. Updated documentation site with a new mobile navigation system and added v0.8.0 changelog entries. Reformatted CHANGELOG.md for better readability.
- Kernel cloud browser provider - Connect to Kernel (https://kernel.sh) for remote browser infrastructure via `-p kernel` flag or `AGENT_BROWSER_PROVIDER=kernel`. Supports stealth mode, persistent profiles, and automatic profile find-or-create.
- Ignore HTTPS certificate errors - New `--ignore-https-errors` flag for working with self-signed certificates and development environments
- Enhanced cookie management - Extended `cookies set` command with `--url`, `--domain`, `--path`, `--httpOnly`, `--secure`, `--sameSite`, and `--expires` flags for setting cookies before page load
- Fixed tab list command not recognizing new pages opened via clicks or `target="_blank"` links (#275)
- Fixed `check` command hanging indefinitely (#272)
- Fixed `set device` not applying deviceScaleFactor - HiDPI screenshots now work correctly (#270)
- Fixed state load and profile persistence not working in v0.7.6 (#268)
- Screenshots now save to temp directory when no path is provided (#247)
- Daemon and stream server now reject cross-origin connections (#274)
- a4d0c26: Allow null values for the screenshot selector field. Previously, passing a null selector would fail validation, but now it is properly handled as an optional value.
- 8c2a6ec: Fix GitHub release workflow to handle existing releases. If a release already exists, binaries are uploaded to it instead of failing.
- 957b5e5: Fix binary permissions on install. npm doesn't preserve execute bits, so postinstall now ensures the native binary is executable.
- 161d8f5: Fix native binary distribution in npm package. Native binaries for all platforms (Linux x64/arm64, macOS x64/arm64, Windows x64) are now correctly included when publishing.
- 6afede2: Fix native binary distribution in npm package

  Native binaries for all platforms (Linux x64/arm64, macOS x64/arm64, Windows x64) are now included in the npm package. Previously, the release workflow published to npm before building binaries, causing "No binary found" errors on installation.
- 316e649: ## New Features
  - Cloud browser providers - Connect to Browserbase or Browser Use for remote browser infrastructure via `-p` flag or `AGENT_BROWSER_PROVIDER` env var
  - Persistent browser profiles - Store cookies, localStorage, and login sessions across browser restarts with `--profile`
  - Remote CDP WebSocket URLs - Connect to remote browser services via WebSocket URL (e.g., `--cdp "wss://..."`)
  - Download commands - New `download` command and `wait --download` for file downloads with ref support
  - Browser launch configuration - New `--args`, `--user-agent`, and `--proxy-bypass` flags for fine-grained browser control
  - Enhanced skills - Hierarchical structure with references and templates for Claude Code
  - Screenshot command now supports refs and has improved error messages
  - WebSocket URLs work in `connect` command
  - Fixed socket file location (uses `~/.agent-browser` instead of TMPDIR)
  - Windows binary path fix (.exe extension)
  - State load and path-based actions now show correct output messages
  - Added Claude Code marketplace plugin installation instructions
  - Updated skill documentation with references and templates
  - Improved error documentation