generated with Opus 4.8 high and verified by a human
Summary
The streaming WebSocket's input_keyboard message can deliver text, plain navigation keys, and modifier-bit chords, but it cannot trigger macOS native key-binding commands — and that's a whole class of chords, not just clipboard ops. On macOS, Chromium routes select-all, copy/cut/paste, undo/redo, caret navigation (line/document), and line/word deletion through the responder chain (the standard key bindings), which CDP exposes via the commands array on Input.dispatchKeyEvent (e.g. commands: ["selectAll"], ["moveToBeginningOfLine"]). The stream protocol's input_keyboard message has no field that maps to commands, and the daemon drops the field when it's sent inline. As a result, a human pair-browsing in the viewport (or any stream client) loses these chords entirely inside the remote page on macOS.
This is distinct from web-app JS shortcuts: pages still receive the chord keydowns (so a page's own cmd+k handler fires). What's missing is OS-level editing/navigation behavior in native inputs, textareas, and contentEditable.
Scope (verified against 0.27.0)
Affected — no-ops over the stream, need CDP commands:
| Chord |
Intended action |
CDP command |
| Cmd+A |
select all |
selectAll |
| Cmd+C / Cmd+X |
copy / cut |
copy / cut |
| Cmd+Z / Cmd+Shift+Z |
undo / redo |
undo / redo |
| Cmd+← / Cmd+→ |
move to line start / end |
moveToBeginningOfLine / moveToEndOfLine |
| Cmd+↑ / Cmd+↓ |
move to document start / end |
moveToBeginningOfDocument / moveToEndOfDocument |
| Cmd+⌫ |
delete to line start |
deleteToBeginningOfLine |
(The Shift-held selection variants — …AndModifySelection — are equally affected.)
Works already over the stream — NOT part of this bug, included for contrast:
- typing, plain arrows, Home/End, PageUp/Down
- Shift+arrows / Shift+Home (selection extension via the
modifiers bitfield)
- Option+⌫ (delete previous word — Chromium handles this from the Alt modifier directly)
Environment
- agent-browser 0.27.0
- macOS, Chrome for Testing (default
chrome engine)
- Connecting directly to the session stream WS (
stream status --json → ws://127.0.0.1:<port>)
Reproduction
Standalone, no dashboard required:
// keychords-repro.mjs — node keychords-repro.mjs (agent-browser on PATH, macOS)
import { execFileSync } from 'node:child_process';
const SESSION = 'keychords-repro';
const ab = (...a) => execFileSync('agent-browser', ['--session', SESSION, ...a], { encoding: 'utf8' });
const evj = (js) => JSON.parse(ab('eval', js, '--json')).data.result;
// textarea content "line one\nline two": len 17, newline at index 8.
ab('open', 'data:text/html,' + encodeURIComponent('<textarea id=t style="width:320px;height:90px" autofocus>line one\nline two</textarea>'));
const port = JSON.parse(ab('stream', 'status', '--json')).data.port;
const ws = new WebSocket(`ws://127.0.0.1:${port}`);
await new Promise((res, rej) => { ws.onopen = res; ws.onerror = () => rej(new Error('ws failed')); });
ws.onmessage = () => {};
const sleep = (ms) => new Promise(r => setTimeout(r, ms));
const send = (o) => ws.send(JSON.stringify({ type:'input_keyboard', text:'', ...o }));
const reset = (c) => evj(`(()=>{const t=document.getElementById('t');t.value='line one\\nline two';t.focus();t.setSelectionRange(${c},${c});return t.selectionStart})()`);
const caret = () => evj(`document.getElementById('t').selectionStart`);
const value = () => evj(`document.getElementById('t').value`);
const sel = () => evj(`(()=>{const t=document.getElementById('t');return t.value.slice(t.selectionStart,t.selectionEnd)})()`);
const tap = async (key, code, vk, modifiers) => { send({eventType:'keyDown',key,code,windowsVirtualKeyCode:vk,modifiers}); send({eventType:'keyUp',key,code,windowsVirtualKeyCode:vk,modifiers}); await sleep(300); };
console.log('--- need CDP `commands` (all no-ops over the stream) ---');
reset(11); await tap('a','KeyA',65,4); console.log('Cmd+A select-all -> sel ', JSON.stringify(sel()), '(want "line one\\nline two")');
reset(11); await tap('ArrowLeft','ArrowLeft',37,4); console.log('Cmd+Left line-start -> caret', caret(), ' (want 9 = start of line two)');
reset(11); await tap('ArrowUp','ArrowUp',38,4); console.log('Cmd+Up doc-start -> caret', caret(), ' (want 0)');
reset(11); await tap('Backspace','Backspace',8,4); console.log('Cmd+Bksp del-to-bol -> val ', JSON.stringify(value()), '(want "line one\\n")');
reset(0);
send({eventType:'keyDown',key:'Q',code:'KeyQ',windowsVirtualKeyCode:81,text:'Q',modifiers:0}); send({eventType:'keyUp',key:'Q',code:'KeyQ',windowsVirtualKeyCode:81,modifiers:0}); await sleep(200);
await tap('z','KeyZ',90,4); console.log('Cmd+Z undo "Q" -> val ', JSON.stringify(value().slice(0,9)), '(want "line one" = Q undone)');
console.log('--- the `commands` field is dropped even when sent inline ---');
reset(11); send({eventType:'keyDown',key:'a',code:'KeyA',windowsVirtualKeyCode:65,modifiers:4,commands:['selectAll']}); send({eventType:'keyUp',key:'a',code:'KeyA',windowsVirtualKeyCode:65,modifiers:4}); await sleep(300);
console.log('input_keyboard.commands -> sel ', JSON.stringify(sel()), '(want "line one\\nline two")');
console.log('--- control: already work over the stream (NOT part of this bug) ---');
reset(8); await tap('Home','Home',36,8); console.log('Shift+Home selection -> sel ', JSON.stringify(sel()), '(want "line one")');
reset(8); await tap('Backspace','Backspace',8,1); console.log('Option+Bksp del-word -> val ', JSON.stringify(value()), '(want "line \\nline two")');
ws.close(); ab('close');
Expected vs. Actual
Observed output (agent-browser 0.27.0, macOS):
--- need CDP `commands` (all no-ops over the stream) ---
Cmd+A select-all -> sel "" (want "line one\nline two") ❌
Cmd+Left line-start -> caret 11 (want 9) ❌
Cmd+Up doc-start -> caret 11 (want 0) ❌
Cmd+Bksp del-to-bol -> val "line one\nline two" (want "line one\n") ❌
Cmd+Z undo "Q" -> val "Qline one" (want "line one") ❌
--- the `commands` field is dropped even when sent inline ---
input_keyboard.commands -> sel "" (want "line one\nline two") ❌
--- control: already work over the stream (NOT part of this bug) ---
Shift+Home selection -> sel "line one" (want "line one") ✅
Option+Bksp del-word -> val "line \nline two" (want "line \nline two") ✅
The two control cases confirm the modifiers bitfield and ordinary key dispatch work end-to-end; only the native key-binding-command path is missing.
Root cause
CDP Input.dispatchKeyEvent accepts an optional commands array of editing/navigation commands (selectAll, copy, cut, paste, undo, redo, moveToBeginningOfLine, moveToEndOfDocument, deleteToBeginningOfLine, … and their …AndModifySelection variants). On macOS these are how Chromium applies the standard key bindings — a raw keyDown with metaKey/altKey set is not sufficient for this command class. The stream input_keyboard message has no field that maps to commands, and the daemon's deserializer drops the field when sent inline, so it can never reach Input.dispatchKeyEvent.
Suggested fix
- Add an optional
commands: string[] field to the input_keyboard stream message and forward it to CDP Input.dispatchKeyEvent.commands.
- In the dashboard viewport's keyboard forwarder, populate it for the macOS key-binding chords above, alongside the existing modifier-bit handling — e.g.:
Meta+a/c/x/v → ["selectAll"] / ["copy"] / ["cut"] / ["paste"]
Meta+z / Meta+Shift+z → ["undo"] / ["redo"]
Meta+Arrow{Left,Right,Up,Down} → ["moveToBeginningOfLine"] / ["moveToEndOfLine"] / ["moveToBeginningOfDocument"] / ["moveToEndOfDocument"] (use the …AndModifySelection variant when Shift is held)
Meta+Backspace → ["deleteToBeginningOfLine"]
Prior art
generated with Opus 4.8 high and verified by a human
Summary
The streaming WebSocket's
input_keyboardmessage can deliver text, plain navigation keys, and modifier-bit chords, but it cannot trigger macOS native key-binding commands — and that's a whole class of chords, not just clipboard ops. On macOS, Chromium routes select-all, copy/cut/paste, undo/redo, caret navigation (line/document), and line/word deletion through the responder chain (the standard key bindings), which CDP exposes via thecommandsarray onInput.dispatchKeyEvent(e.g.commands: ["selectAll"],["moveToBeginningOfLine"]). The stream protocol'sinput_keyboardmessage has no field that maps tocommands, and the daemon drops the field when it's sent inline. As a result, a human pair-browsing in the viewport (or any stream client) loses these chords entirely inside the remote page on macOS.This is distinct from web-app JS shortcuts: pages still receive the chord keydowns (so a page's own
cmd+khandler fires). What's missing is OS-level editing/navigation behavior in native inputs, textareas, and contentEditable.Scope (verified against 0.27.0)
Affected — no-ops over the stream, need CDP
commands:selectAllcopy/cutundo/redomoveToBeginningOfLine/moveToEndOfLinemoveToBeginningOfDocument/moveToEndOfDocumentdeleteToBeginningOfLine(The Shift-held selection variants —
…AndModifySelection— are equally affected.)Works already over the stream — NOT part of this bug, included for contrast:
modifiersbitfield)Environment
chromeengine)stream status --json→ws://127.0.0.1:<port>)Reproduction
Standalone, no dashboard required:
Expected vs. Actual
Observed output (agent-browser 0.27.0, macOS):
The two control cases confirm the
modifiersbitfield and ordinary key dispatch work end-to-end; only the native key-binding-command path is missing.Root cause
CDP
Input.dispatchKeyEventaccepts an optionalcommandsarray of editing/navigation commands (selectAll,copy,cut,paste,undo,redo,moveToBeginningOfLine,moveToEndOfDocument,deleteToBeginningOfLine, … and their…AndModifySelectionvariants). On macOS these are how Chromium applies the standard key bindings — a rawkeyDownwithmetaKey/altKeyset is not sufficient for this command class. The streaminput_keyboardmessage has no field that maps tocommands, and the daemon's deserializer drops the field when sent inline, so it can never reachInput.dispatchKeyEvent.Suggested fix
commands: string[]field to theinput_keyboardstream message and forward it to CDPInput.dispatchKeyEvent.commands.Meta+a/c/x/v→["selectAll"]/["copy"]/["cut"]/["paste"]Meta+z/Meta+Shift+z→["undo"]/["redo"]Meta+Arrow{Left,Right,Up,Down}→["moveToBeginningOfLine"]/["moveToEndOfLine"]/["moveToBeginningOfDocument"]/["moveToEndOfDocument"](use the…AndModifySelectionvariant when Shift is held)Meta+Backspace→["deleteToBeginningOfLine"]Prior art
press Control+aand other modifier key chords #980 / press does not correctly send modifier chords like Meta+, #983 added modifier-chord parsing (andtextsuppression under Ctrl/Meta) to the CLIpresscommand, but it sends modifier bits only — it does not populate CDPcommands, so none of the native chords above are addressed by that fix.windowsVirtualKeyCode" thread for the sameinput_keyboardforwarding path; this report is the separate, still-open key-binding-command gap.