Bug: snapshot -i output format changed in v0.22.1 (Rust daemon) — breaks structured data.refs parsing #1024

@vinitslal

Description

agent-browser version: v0.22.1 (Rust daemon)
Last working version: v0.19.0 (Node.js daemon)
Platform: macOS 14.6 (darwin arm64)
Node.js: v22.x
Remote browser: Browserless v2.x (headless Chrome, remote over WSS)
Connection: --cdp wss://...


Summary

The snapshot -i command in v0.22.1 produces a different JSON output structure than v0.19.0 for certain pages. Specifically, the structured data.refs dictionary — which maps element references to their role and name — is either absent or structured differently on some pages. This is a breaking change for consumers that rely on the data.refs dict for programmatic element extraction.

The format change is page-specific: some pages (e.g., a simple listing page) produce parseable output, while others (e.g., a form page with radio buttons) do not.

Reproduction

Setup

# Connect to a remote browser instance via CDP
agent-browser navigate "https://www.oakstreethealth.com/book-appointment/patient-info?slot_id=38316423&clinic_id=5&provider_id=1922685056" --cdp "$CDP_URL"
agent-browser wait --load networkidle --cdp "$CDP_URL"

This page contains a form with 3 radio buttons and a submit button.

v0.19.0 output (snapshot -i)

The JSON response includes a data.refs dictionary with structured entries for every interactive element:

{
  "success": true,
  "data": {
    "refs": {
      "e14": {
        "name": "I've never scheduled an appointment at Oak Street Health",
        "role": "radio"
      },
      "e15": {
        "name": "I've previously scheduled an appointment with Oak Street Health",
        "role": "radio"
      },
      "e16": {
        "name": "None of these",
        "role": "radio"
      },
      "e17": {
        "name": "Save & Continue",
        "role": "button"
      }
    },
    "snapshot": "- radio \"I've never scheduled...\" [ref=e14]\n- radio \"I've previously scheduled...\" [ref=e15]\n..."
  }
}

This allows reliable programmatic extraction:

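import json

# `output` holds the JSON text returned by `snapshot -i`
data = json.loads(output)
refs = data["data"]["refs"]
# Keep only the entries whose role is "radio"
radios = {k: v for k, v in refs.items() if v["role"] == "radio"}
# → {'e14': {...}, 'e15': {...}, 'e16': {...}}  ← 3 radio options found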

v0.22.1 output (snapshot -i)

The same page produces output where the data.refs dict is absent or differently structured, forcing fallback to regex parsing of the inline data.snapshot text. On this particular page, the inline text also does not contain parseable radio elements in the expected format.

Result: 0 radio elements extracted (vs. 3 on v0.19.0).
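For reference, the regex fallback mentioned above looks roughly like the sketch below. It assumes the inline line shape seen in the v0.19.0 sample (`- role "name" [ref=eNN]`); the helper name `refs_from_snapshot` and the pattern are illustrative, not part of agent-browser, and the v0.22.1 text may not match this shape at all.

```python
import re

# Assumed line shape, copied from the v0.19.0 sample:
#   - radio "Some accessible name" [ref=e14]
# The quoted name is treated as optional so unnamed elements still match.
REF_LINE = re.compile(
    r'^\s*-\s+(?P<role>[\w-]+)'
    r'(?:\s+"(?P<name>(?:[^"\\]|\\.)*)")?'
    r'.*?\[ref=(?P<ref>e\d+)\]'
)

def refs_from_snapshot(snapshot_text):
    """Rebuild a refs-style dict from the inline snapshot text (hypothetical helper)."""
    refs = {}
    for line in snapshot_text.splitlines():
        match = REF_LINE.match(line)
        if match:
            refs[match.group("ref")] = {
                "role": match.group("role"),
                "name": match.group("name") or "",
            }
    return refs
```

Lines without a `[ref=...]` annotation are skipped, so an empty result means the text carried no annotations in this shape.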

The inconsistency is page-specific

On a simpler page (doctor listing with links and buttons), v0.22.1 produces output that can be parsed. On a form page with radio buttons, it cannot:

Page 1 (simple listing page):  v0.22.1 ✅ — refs extractable
Page 2 (form with radio buttons): v0.22.1 ❌ — refs NOT extractable

Both pages work correctly on v0.19.0.

Impact

This is a production-impacting regression. Our automation pipeline:

  1. Takes a snapshot -i of the page
  2. Parses data.refs to find radio buttons, text fields, buttons
  3. Presents extracted options to the user or clicks the appropriate element

When step 2 returns 0 results, the system falls back to an LLM agent which hallucinates form options instead of extracting them from the actual page. Multiple user sessions were impacted before we diagnosed and reverted to v0.19.0.
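As a stopgap on our side, step 2 can be made to fail loudly instead of returning an empty result that triggers the LLM fallback. A minimal sketch, assuming the response is the JSON shape shown for v0.19.0; `SnapshotParseError` and `extract_by_role` are hypothetical names from our own code, not agent-browser APIs.

```python
import json

class SnapshotParseError(RuntimeError):
    """Raised when a snapshot response carries no usable data.refs (hypothetical)."""

def extract_by_role(raw_output, role):
    """Return the refs with the given role, or raise if data.refs is unusable."""
    payload = json.loads(raw_output)
    refs = (payload.get("data") or {}).get("refs")
    # Treat a missing, non-dict, or empty refs value as a hard error so the
    # caller cannot mistake "format changed" for "page has no such elements".
    if not isinstance(refs, dict) or not refs:
        raise SnapshotParseError("snapshot response has no usable data.refs")
    return {
        ref_id: entry
        for ref_id, entry in refs.items()
        if isinstance(entry, dict) and entry.get("role") == role
    }
```

With this in place, a format change surfaces as an exception in step 2 rather than as invented form options further down the pipeline.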

Expected behavior

snapshot -i should produce a consistent data.refs dictionary across all pages, matching the v0.19.0 format:

{
  "data": {
    "refs": {
      "<ref_id>": { "name": "<accessible name>", "role": "<aria role>" }
    },
    "snapshot": "<inline text representation>"
  }
}

All interactive elements (radio buttons, checkboxes, buttons, text fields, links) should appear in the data.refs dict regardless of page complexity or DOM structure.
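The expected shape above could be checked mechanically, for example in a regression test that runs against both daemons. The sketch below is one possible check; `check_snapshot_contract` is a hypothetical helper, and the rule that every key in data.refs also appears as a `[ref=...]` annotation in data.snapshot is an assumption drawn from the v0.19.0 sample, not a documented guarantee.

```python
def check_snapshot_contract(payload):
    """Return a list of problems found in a parsed snapshot response (hypothetical)."""
    data = payload.get("data")
    if not isinstance(data, dict):
        return ["missing data object"]

    problems = []
    refs = data.get("refs")
    snapshot = data.get("snapshot")
    if not isinstance(refs, dict):
        problems.append("data.refs is missing or not an object")
        refs = {}
    if not isinstance(snapshot, str):
        problems.append("data.snapshot is missing or not a string")
        snapshot = ""

    for ref_id, entry in refs.items():
        # Each entry should carry a string name and a string role.
        if not isinstance(entry, dict) or not all(
            isinstance(entry.get(key), str) for key in ("name", "role")
        ):
            problems.append(f"{ref_id}: expected string name and role")
        # Assumed cross-check: the same ref id is annotated in the inline text.
        if f"[ref={ref_id}]" not in snapshot:
            problems.append(f"{ref_id}: no [ref=...] annotation in data.snapshot")
    return problems
```

An empty list means the response matches the shape requested here; any other result names the field that diverged.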

Request

  1. Was the data.refs dict intentionally removed in the Rust daemon? If so, is there a flag (e.g., --json-refs, --legacy-format) to restore it?
  2. If not intentional, can the Rust daemon be updated to emit the same data.refs structure as the Node.js daemon (v0.19.0)?
  3. If the format is changing, can the inline data.snapshot text be guaranteed to contain parseable [ref=eNN] annotations for all interactive elements, including radio buttons in form contexts?

We are currently pinned to v0.19.0 and cannot upgrade until snapshot -i output is backward-compatible or we have a migration path.

Workaround

Pinned to v0.19.0 (Node.js daemon):

{
  "dependencies": {
    "agent-browser": "0.19.0"
  }
}

Environment details

  • OS: macOS 14.6 (darwin arm64)
  • Node.js: v22.x
  • Browser: Headless Chrome via remote CDP (--cdp wss://...)
  • Test page: Oak Street Health booking form (public URL)
  • agent-browser flags: snapshot -i -c --cdp <wss-url>
