Skip to content

Stream input_keyboard ignores CDP commands — macOS native key-binding chords are no-ops in the viewport #1453

Description

@nedtwigg

generated with Opus 4.8 high and verified by a human

Summary

The streaming WebSocket's input_keyboard message can deliver text, plain navigation keys, and modifier-bit chords, but it cannot trigger macOS native key-binding commands — and that's a whole class of chords, not just clipboard ops. On macOS, Chromium routes select-all, copy/cut/paste, undo/redo, caret navigation (line/document), and line/word deletion through the responder chain (the standard key bindings), which CDP exposes via the commands array on Input.dispatchKeyEvent (e.g. commands: ["selectAll"], ["moveToBeginningOfLine"]). The stream protocol's input_keyboard message has no field that maps to commands, and the daemon drops the field when it's sent inline. As a result, a human pair-browsing in the viewport (or any stream client) loses these chords entirely inside the remote page on macOS.

This is distinct from web-app JS shortcuts: pages still receive the chord keydowns (so a page's own cmd+k handler fires). What's missing is OS-level editing/navigation behavior in native inputs, textareas, and contentEditable.

Scope (verified against 0.27.0)

Affected — no-ops over the stream, need CDP commands:

Chord Intended action CDP command
Cmd+A select all selectAll
Cmd+C / Cmd+X copy / cut copy / cut
Cmd+Z / Cmd+Shift+Z undo / redo undo / redo
Cmd+← / Cmd+→ move to line start / end moveToBeginningOfLine / moveToEndOfLine
Cmd+↑ / Cmd+↓ move to document start / end moveToBeginningOfDocument / moveToEndOfDocument
Cmd+⌫ delete to line start deleteToBeginningOfLine

(The Shift-held selection variants — …AndModifySelection — are equally affected.)

Works already over the stream — NOT part of this bug, included for contrast:

  • typing, plain arrows, Home/End, PageUp/Down
  • Shift+arrows / Shift+Home (selection extension via the modifiers bitfield)
  • Option+⌫ (delete previous word — Chromium handles this from the Alt modifier directly)

Environment

  • agent-browser 0.27.0
  • macOS, Chrome for Testing (default chrome engine)
  • Connecting directly to the session stream WS (stream status --jsonws://127.0.0.1:<port>)

Reproduction

Standalone, no dashboard required:

// keychords-repro.mjs — node keychords-repro.mjs  (agent-browser on PATH, macOS)
import { execFileSync } from 'node:child_process';
const SESSION = 'keychords-repro';
const ab = (...a) => execFileSync('agent-browser', ['--session', SESSION, ...a], { encoding: 'utf8' });
const evj = (js) => JSON.parse(ab('eval', js, '--json')).data.result;
// textarea content "line one\nline two": len 17, newline at index 8.
ab('open', 'data:text/html,' + encodeURIComponent('<textarea id=t style="width:320px;height:90px" autofocus>line one\nline two</textarea>'));
const port = JSON.parse(ab('stream', 'status', '--json')).data.port;
const ws = new WebSocket(`ws://127.0.0.1:${port}`);
await new Promise((res, rej) => { ws.onopen = res; ws.onerror = () => rej(new Error('ws failed')); });
ws.onmessage = () => {};
const sleep = (ms) => new Promise(r => setTimeout(r, ms));
const send = (o) => ws.send(JSON.stringify({ type:'input_keyboard', text:'', ...o }));
const reset = (c) => evj(`(()=>{const t=document.getElementById('t');t.value='line one\\nline two';t.focus();t.setSelectionRange(${c},${c});return t.selectionStart})()`);
const caret = () => evj(`document.getElementById('t').selectionStart`);
const value = () => evj(`document.getElementById('t').value`);
const sel = () => evj(`(()=>{const t=document.getElementById('t');return t.value.slice(t.selectionStart,t.selectionEnd)})()`);
const tap = async (key, code, vk, modifiers) => { send({eventType:'keyDown',key,code,windowsVirtualKeyCode:vk,modifiers}); send({eventType:'keyUp',key,code,windowsVirtualKeyCode:vk,modifiers}); await sleep(300); };

console.log('--- need CDP `commands` (all no-ops over the stream) ---');
reset(11); await tap('a','KeyA',65,4);              console.log('Cmd+A  select-all    -> sel  ', JSON.stringify(sel()),   '(want "line one\\nline two")');
reset(11); await tap('ArrowLeft','ArrowLeft',37,4); console.log('Cmd+Left line-start  -> caret', caret(),                 '          (want 9 = start of line two)');
reset(11); await tap('ArrowUp','ArrowUp',38,4);     console.log('Cmd+Up  doc-start    -> caret', caret(),                 '          (want 0)');
reset(11); await tap('Backspace','Backspace',8,4);  console.log('Cmd+Bksp del-to-bol  -> val  ', JSON.stringify(value()), '(want "line one\\n")');
reset(0);
send({eventType:'keyDown',key:'Q',code:'KeyQ',windowsVirtualKeyCode:81,text:'Q',modifiers:0}); send({eventType:'keyUp',key:'Q',code:'KeyQ',windowsVirtualKeyCode:81,modifiers:0}); await sleep(200);
await tap('z','KeyZ',90,4);                          console.log('Cmd+Z   undo "Q"     -> val  ', JSON.stringify(value().slice(0,9)), '(want "line one" = Q undone)');

console.log('--- the `commands` field is dropped even when sent inline ---');
reset(11); send({eventType:'keyDown',key:'a',code:'KeyA',windowsVirtualKeyCode:65,modifiers:4,commands:['selectAll']}); send({eventType:'keyUp',key:'a',code:'KeyA',windowsVirtualKeyCode:65,modifiers:4}); await sleep(300);
console.log('input_keyboard.commands -> sel  ', JSON.stringify(sel()), '(want "line one\\nline two")');

console.log('--- control: already work over the stream (NOT part of this bug) ---');
reset(8); await tap('Home','Home',36,8);            console.log('Shift+Home selection -> sel  ', JSON.stringify(sel()),   '(want "line one")');
reset(8); await tap('Backspace','Backspace',8,1);   console.log('Option+Bksp del-word -> val  ', JSON.stringify(value()), '(want "line \\nline two")');

ws.close(); ab('close');

Expected vs. Actual

Observed output (agent-browser 0.27.0, macOS):

--- need CDP `commands` (all no-ops over the stream) ---
Cmd+A  select-all    -> sel   ""                    (want "line one\nline two")   ❌
Cmd+Left line-start  -> caret 11                    (want 9)                       ❌
Cmd+Up  doc-start    -> caret 11                    (want 0)                       ❌
Cmd+Bksp del-to-bol  -> val   "line one\nline two"  (want "line one\n")            ❌
Cmd+Z   undo "Q"     -> val   "Qline one"           (want "line one")              ❌
--- the `commands` field is dropped even when sent inline ---
input_keyboard.commands -> sel   ""                 (want "line one\nline two")    ❌
--- control: already work over the stream (NOT part of this bug) ---
Shift+Home selection -> sel   "line one"            (want "line one")              ✅
Option+Bksp del-word -> val   "line \nline two"     (want "line \nline two")       ✅

The two control cases confirm the modifiers bitfield and ordinary key dispatch work end-to-end; only the native key-binding-command path is missing.

Root cause

CDP Input.dispatchKeyEvent accepts an optional commands array of editing/navigation commands (selectAll, copy, cut, paste, undo, redo, moveToBeginningOfLine, moveToEndOfDocument, deleteToBeginningOfLine, … and their …AndModifySelection variants). On macOS these are how Chromium applies the standard key bindings — a raw keyDown with metaKey/altKey set is not sufficient for this command class. The stream input_keyboard message has no field that maps to commands, and the daemon's deserializer drops the field when sent inline, so it can never reach Input.dispatchKeyEvent.

Suggested fix

  1. Add an optional commands: string[] field to the input_keyboard stream message and forward it to CDP Input.dispatchKeyEvent.commands.
  2. In the dashboard viewport's keyboard forwarder, populate it for the macOS key-binding chords above, alongside the existing modifier-bit handling — e.g.:
    • Meta+a/c/x/v["selectAll"] / ["copy"] / ["cut"] / ["paste"]
    • Meta+z / Meta+Shift+z["undo"] / ["redo"]
    • Meta+Arrow{Left,Right,Up,Down}["moveToBeginningOfLine"] / ["moveToEndOfLine"] / ["moveToBeginningOfDocument"] / ["moveToEndOfDocument"] (use the …AndModifySelection variant when Shift is held)
    • Meta+Backspace["deleteToBeginningOfLine"]

Prior art

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions