One Swift file, ~500 lines. This doc explains the shape.
    Parent (shell / Claude Code / MCP server)
      │
      │  spawn("webview-cli", [args])
      │  stdin  = A2UI JSONL or agent:// load commands
      │  stdout = final JSON result
      │  stderr = diagnostics
      ▼
    webview-cli (NSApplication.accessory)
      ├── NSWindow (borderless, programmatic, no XIB)
      │     └── WKWebView
      │           ├── WKURLSchemeHandler for agent://
      │           ├── WKScriptMessageHandler for complete / ready
      │           └── WKNavigationDelegate (didFinish, didFail*)
      ├── stdin reader (DispatchSourceRead on FD 0)
      ├── Timer (for --timeout)
      └── NSEvent local monitor (Escape key)

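From the parent's side, the contract in the diagram comes down to "write stdin, close it, read stdout, check the exit code". The following is a minimal sketch of such a parent in Swift using Foundation's Process; the helper name runTool and the ToolResult type are illustrative and not part of webview-cli.

```swift
import Foundation

// Illustrative result type: the child's exit code plus everything it wrote to stdout.
struct ToolResult {
    let exitCode: Int32
    let stdout: Data
}

// Spawns a child process, feeds `input` to its stdin, and collects stdout.
// stderr is left inherited so the child's diagnostics appear in the parent's stderr.
func runTool(executable: URL, arguments: [String], input: Data) throws -> ToolResult {
    let process = Process()
    process.executableURL = executable
    process.arguments = arguments

    let stdinPipe = Pipe()
    let stdoutPipe = Pipe()
    process.standardInput = stdinPipe
    process.standardOutput = stdoutPipe

    try process.run()

    // Write the whole input, then close the pipe so the child sees EOF.
    // Fine for small payloads; large inputs should be written from a separate queue
    // to avoid blocking while the child is itself blocked writing to stdout.
    stdinPipe.fileHandleForWriting.write(input)
    stdinPipe.fileHandleForWriting.closeFile()

    let output = stdoutPipe.fileHandleForReading.readDataToEndOfFile()
    process.waitUntilExit()
    return ToolResult(exitCode: process.terminationStatus, stdout: output)
}
```

A caller would pass the path where webview-cli is installed (which depends on the machine) together with arguments such as ["--a2ui"], and then parse ToolResult.stdout as JSON.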
WKWebView cannot run without a live NSApplication main run loop — there is no "headless WKWebView" on macOS. Minimum viable setup:

    import AppKit

    let app = NSApplication.shared
    app.setActivationPolicy(.accessory) // no Dock icon
    app.delegate = delegate             // NSApplicationDelegate instance created earlier
    app.run()                           // blocks; drives the event loop

The .accessory activation policy means no Dock icon and no menu bar, so the binary looks and feels like a CLI tool even though it owns a full Cocoa app internally.
- main parses CLI args into Config
- NSApplication starts, activation policy set to .accessory
- applicationDidFinishLaunching fires → AppCoordinator.run():
  - Configures WKWebViewConfiguration (message handlers, user script, scheme handler)
  - Creates NSWindow + WKWebView
  - If --a2ui: loads the built-in renderer from the scheme handler, reads stdin JSONL in a background thread, flushes to JS on didFinish
  - If --url: validates URL, loads it
  - Starts optional --timeout timer
- User interacts. Web content calls window.webkit.messageHandlers.complete.postMessage(...).
- userContentController(_:didReceive:) catches the message → emitAndExit(status: "completed", data: body, code: 0)
- emitAndExit serializes JSON, writes to stdout, calls exit(0).
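The first lifecycle step, parsing CLI args into Config, could look roughly like the sketch below. Only the flag names --a2ui, --url and --timeout come from this document; the field names, the error type, the set of accepted URL schemes, and the reading of --timeout as seconds are all assumptions made for illustration.

```swift
import Foundation

// Hypothetical shape of Config; the real struct may carry more fields.
struct Config: Equatable {
    var a2ui = false
    var url: URL?
    var timeout: TimeInterval?   // assumed to be seconds
}

enum ConfigError: Error, Equatable {
    case missingValue(String)
    case invalidValue(String)
    case unknownFlag(String)
}

func parseConfig(_ args: [String]) throws -> Config {
    var config = Config()
    var iterator = args.makeIterator()
    while let arg = iterator.next() {
        switch arg {
        case "--a2ui":
            config.a2ui = true
        case "--url":
            guard let raw = iterator.next() else { throw ConfigError.missingValue(arg) }
            // "Validates URL" is modelled here as: parses, and uses an allowed scheme.
            // The allow-list itself is an assumption.
            guard let url = URL(string: raw),
                  let scheme = url.scheme,
                  ["http", "https", "agent"].contains(scheme) else {
                throw ConfigError.invalidValue(raw)
            }
            config.url = url
        case "--timeout":
            guard let raw = iterator.next() else { throw ConfigError.missingValue(arg) }
            guard let seconds = TimeInterval(raw), seconds > 0 else {
                throw ConfigError.invalidValue(raw)
            }
            config.timeout = seconds
        default:
            throw ConfigError.unknownFlag(arg)
        }
    }
    return config
}
```

Rejecting bad input before NSApplication starts keeps argument errors on the plain CLI path, with no window ever created.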
NSApp.terminate always calls exit(0) internally, which loses our exit code semantics. Direct exit(N) preserves the cancel/timeout/error codes. applicationWillTerminate still fires via the OS's SIGTERM handler to provide a safety-net JSON emit if the process is killed externally before emitAndExit runs.
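A minimal sketch of the "serialize, write, exit(N)" path described above, using only Foundation. The numeric exit codes and the JSON field names are assumptions: the document only says that cancel, timeout and error map to distinct codes and that the call site passes status, data and code.

```swift
import Foundation

// Hypothetical code assignments; only "completed = 0" is implied by the text above.
enum ExitCode: Int32 {
    case completed = 0
    case cancelled = 1
    case timeout = 2
    case error = 3
}

// Builds the single-line JSON result. Keys are sorted so the output is deterministic.
func resultLine(status: String, data: Any?) -> String {
    var payload: [String: Any] = ["status": status]
    // Only attach data that JSONSerialization can actually encode.
    if let data = data, JSONSerialization.isValidJSONObject(["data": data]) {
        payload["data"] = data
    }
    let bytes = (try? JSONSerialization.data(withJSONObject: payload, options: [.sortedKeys]))
        ?? Data("{\"status\":\"error\"}".utf8)
    return String(decoding: bytes, as: UTF8.self)
}

// Writes the result to stdout and exits directly, bypassing NSApp.terminate
// so the chosen exit code reaches the parent process.
func emitAndExit(status: String, data: Any?, code: ExitCode) -> Never {
    FileHandle.standardOutput.write(Data((resultLine(status: status, data: data) + "\n").utf8))
    exit(code.rawValue)
}
```

Keeping the serialization in a separate pure function makes the output format testable without terminating the test process.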
The WKURLSchemeHandler is registered by calling WKWebViewConfiguration.setURLSchemeHandler(_:forURLScheme:) with the scheme "agent". Requests to agent://host/<path> reach AgentSchemeHandler.webView(_:start:), which looks up <path> in an in-memory [String: (Data, String)] map (content + MIME type).
Resources are pushed into the map via stdin load commands. No HTTP server, no port allocation, no temp files. This enables:
- --a2ui mode: the renderer HTML/CSS/JS are preloaded before the webview navigates to agent://host/index.html
- Custom HTML mode: agents pipe arbitrary static sites (base64-encoded) and the webview serves them from memory
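The in-memory map behind the scheme handler can be sketched without any WebKit types. ResourceStore and its method names below are hypothetical, and the exact wire format of the stdin load command is not specified here, so the sketch takes the already-parsed path, base64 body and MIME type as arguments.

```swift
import Foundation

// Hypothetical wrapper around the [String: (Data, String)] map described above.
struct ResourceStore {
    private var entries: [String: (Data, String)] = [:]

    // Stores one resource. Returns false when the body is not valid base64.
    mutating func put(path: String, base64: String, mimeType: String) -> Bool {
        guard let data = Data(base64Encoded: base64) else { return false }
        entries[normalize(path)] = (data, mimeType)
        return true
    }

    // Resolves agent://host/<path> to (content, MIME type), or nil when missing.
    func lookup(_ url: URL) -> (Data, String)? {
        guard url.scheme == "agent" else { return nil }
        return entries[normalize(url.path)]
    }

    // "/index.html" and "index.html" address the same entry.
    private func normalize(_ path: String) -> String {
        path.hasPrefix("/") ? String(path.dropFirst()) : path
    }
}
```

In the real handler, a nil lookup would presumably be turned into a failed WKURLSchemeTask, and a hit into a URLResponse carrying the stored MIME type followed by the data.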
The A2UI renderer is embedded as string literals in main.swift (a2uiRendererHTML, a2uiRendererCSS, a2uiRendererJS). The JS is ~180 lines of vanilla ES: no React, no framework, no build step. It:
- Accepts a JSONL array via window.__a2uiLoad(jsonString) (called from Swift via evaluateJavaScript)
- Parses messages into a Map<componentId, component> (adjacency list model)
- Recursively renders from the beginRendering.root ID → DOM
- Wires Button clicks to window.webkit.messageHandlers.complete.postMessage({action, data, context})
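The adjacency list model is easier to see in a few lines of code. The real renderer is JavaScript; the sketch below expresses the same idea in Swift to match the rest of the codebase. The Component shape is deliberately simplified and hypothetical (real A2UI components carry more fields), and the output is an HTML string rather than live DOM nodes.

```swift
// Simplified, hypothetical component: children are referenced by id, not nested.
struct Component {
    let id: String
    let type: String        // e.g. "Column", "Text", "Button"
    let text: String?
    let children: [String]  // ids of child components
}

// Renders the subtree rooted at `id` by following child ids through the map.
// `ancestors` guards against cycles in malformed input. Text is not HTML-escaped here.
func renderHTML(from id: String,
                in components: [String: Component],
                ancestors: Set<String> = []) -> String {
    guard let component = components[id], !ancestors.contains(id) else { return "" }
    let path = ancestors.union([id])
    let inner = component.children
        .map { renderHTML(from: $0, in: components, ancestors: path) }
        .joined()
    let tag = component.type.lowercased()
    return "<\(tag)>\(component.text ?? "")\(inner)</\(tag)>"
}
```

Because components reference each other by id, messages can arrive in any order; rendering only needs the map to be complete by the time the root id is known.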
Why embed instead of ship as separate files? The binary-is-the-release story. brew install drops one file. No post-install steps, no surprising paths, no version skew between binary and JS.
Naive approach: evaluateJavaScript("__a2uiLoad('\(payload)')") breaks on quotes and multibyte chars. Current approach: base64-encode in Swift, decode in JS with TextDecoder('utf-8').
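
    // Base64 output contains only [A-Za-z0-9+/=], so it is safe inside the JS string literal.
    let b64 = Data(payload.utf8).base64EncodedString()
    // atob yields a binary string; map it to bytes, then decode those bytes as UTF-8.
    let js = "window.__a2uiLoad(new TextDecoder('utf-8').decode(Uint8Array.from(atob('\(b64)'), c => c.charCodeAt(0))))"

This preserves em-dashes, emoji, CJK, and any other non-ASCII content the agent might generate.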
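The safety argument can be checked on the Swift side alone: whatever the payload contains, the encoded form uses only the base64 alphabet, and decoding it restores the original bytes. The helper names below are illustrative and do not exist in main.swift.

```swift
import Foundation

// Encodes a payload the same way the snippet above does.
func encodeForJS(_ payload: String) -> String {
    Data(payload.utf8).base64EncodedString()
}

// True when every character belongs to the standard base64 alphabet,
// i.e. nothing can terminate or escape a single-quoted JS string.
func isSafeInsideSingleQuotes(_ s: String) -> Bool {
    let alphabet = Set("ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/=")
    return s.allSatisfy { alphabet.contains($0) }
}

// Mirrors what atob + TextDecoder do on the JS side.
func decodeFromJS(_ b64: String) -> String? {
    Data(base64Encoded: b64).map { String(decoding: $0, as: UTF8.self) }
}
```

A payload such as {"label":"it's 完了 🎉"} fails the safety check in raw form because of its quotes and non-ASCII characters, yet passes once encoded and round-trips unchanged.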
A subtle bug in an earlier version: stdin could finish before the webview had loaded the renderer script, so window.__a2uiLoad was undefined when Swift called evaluateJavaScript. Fix:
- stdin read completes → store payload in pendingA2UIPayload
- WKNavigationDelegate.webView(_:didFinish:) fires → set rendererReady = true
- flushA2UIIfReady() runs whenever either side completes, guarded by both flags
This handles both "stdin first" and "navigation first" orderings correctly.
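The two-flag handshake can be sketched in isolation. The names pendingA2UIPayload, rendererReady and flushA2UIIfReady come from the description above; the surrounding class, the two entry-point method names and the injected send closure (standing in for evaluateJavaScript) are assumptions for illustration. The sketch also assumes both entry points are called on the main thread, so no locking is shown.

```swift
// Hypothetical holder for the two flags; the real code may keep them on AppCoordinator.
final class A2UIFlusher {
    private var pendingA2UIPayload: String?
    private var rendererReady = false
    private let send: (String) -> Void   // stands in for evaluateJavaScript

    init(send: @escaping (String) -> Void) {
        self.send = send
    }

    // Called when the stdin reader has the full payload.
    func stdinDidFinish(payload: String) {
        pendingA2UIPayload = payload
        flushA2UIIfReady()
    }

    // Called from webView(_:didFinish:) once the renderer page has loaded.
    func navigationDidFinish() {
        rendererReady = true
        flushA2UIIfReady()
    }

    // Sends only when both conditions hold, and clears the payload so it is sent once.
    private func flushA2UIIfReady() {
        guard rendererReady, let payload = pendingA2UIPayload else { return }
        pendingA2UIPayload = nil
        send(payload)
    }
}
```

Whichever event arrives second triggers the flush, so neither ordering needs special handling.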
- No persistent state: each invocation is isolated. Cookies, localStorage, and WKWebsiteDataStore are ephemeral. OAuth flows that require persisted sessions need a separate mechanism (out of scope for v0.1).
- No IPC beyond stdio: no sockets, no D-Bus, no XPC. If the parent wants to stream updates, use evaluateJavaScript via a future patch command (roadmap v1.1).
- No sandboxing: v0.1 assumes trusted agents. CSP + App Sandbox entitlements planned for v1.1.