tui-driver Specification

Version: 0.1.0 (Draft)

tui-driver is a protocol for driving terminals programmatically. Like WebDriver/CDP for browsers, but for terminals.

Overview

import { Terminal } from 'tui-driver'

let term = await Terminal.launch({ cols: 100, rows: 35 })

await term.type('ls -la')
await term.press('Enter')
await term.waitForStable()
await term.screenshot({ path: 'output.png' })

await term.close()

Configuration

Configuration is defined in tui-driver.config.js:

import { defineConfig } from 'tui-driver'

export default defineConfig({
  // Required: which driver to use
  driver: 'x11',  // 'x11' | 'pty' | 'iterm2' | 'kitty'

  // Terminal dimensions (in characters)
  cols: 80,
  rows: 24,

  // Font configuration (driver-dependent)
  font: {
    family: 'monospace',
    size: 14
  },

  // Environment variables to inject
  env: {},

  // Working directory for commands
  cwd: process.cwd(),

  // Screenshot output directory
  screenshotDir: '.tui-driver/screenshots'
})

Configuration can be overridden at launch:

let term = await Terminal.launch({
  cols: 120,  // Override config
  rows: 40
})
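
As a rough illustration of the override rule, the sketch below merges launch-time options over config-file values. The helper name resolveLaunchOptions is hypothetical, and the shallow merge that ignores undefined overrides is an assumption, not something this spec defines.

```javascript
// Hypothetical helper: merge launch-time options over config-file values.
// Assumes a shallow merge in which `undefined` overrides are ignored.
function resolveLaunchOptions(config, overrides = {}) {
  let resolved = { ...config }
  for (let [key, value] of Object.entries(overrides)) {
    if (value !== undefined) resolved[key] = value
  }
  return resolved
}

let config = { driver: 'x11', cols: 80, rows: 24, env: {} }
let options = resolveLaunchOptions(config, { cols: 120, rows: 40 })
// options.cols and options.rows come from the launch call;
// options.driver still comes from the config object.
```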

Terminal API

Terminal.launch(options?): Promise<TerminalSession>

Launch a new terminal session.

Options: (all optional; unset values fall back to the config file)

| Option | Type | Description |
| --- | --- | --- |
| cols | number | Terminal width in columns |
| rows | number | Terminal height in rows |
| env | Record<string, string> | Environment variables |
| cwd | string | Working directory |

Returns: A TerminalSession instance.

Throws:

  • DriverNotFoundError - Driver specified in config not available
  • DependencyError - Driver dependencies not installed

TerminalSession API

session.type(text): Promise<void>

Type text into the terminal character by character.

await term.type('echo "hello world"')

Parameters:

  • text: string - Text to type

Notes:

  • Does not press Enter automatically
  • Special characters are typed literally
  • For control characters, use press()
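
Because type() never submits the line, callers usually pair it with press('Enter'). A minimal sketch of that pairing follows; runCommand is a hypothetical convenience helper, not part of this spec, and it accepts any object exposing the session methods described here.

```javascript
// Hypothetical helper, not part of the spec: type a command, submit it,
// and wait for the output to settle. `term` is any object with the
// type/press/waitForStable methods described in this document.
async function runCommand(term, command, waitOptions) {
  await term.type(command)              // types the text only; no newline is sent
  await term.press('Enter')             // submit explicitly
  await term.waitForStable(waitOptions) // let the output settle before returning
}
```

Usage would look like await runCommand(term, 'ls -la', { timeout: 5000 }).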

session.press(key): Promise<void>

Press a key or key combination.

await term.press('Enter')
await term.press('ctrl+c')
await term.press('alt+F4')

Parameters:

  • key: string - Key to press

Supported keys:

| Category | Keys |
| --- | --- |
| Special | Enter, Escape, Tab, Backspace, Delete, Space |
| Navigation | Up, Down, Left, Right, Home, End, PageUp, PageDown |
| Function | F1 through F12 |
| Modifiers | ctrl+<key>, alt+<key>, shift+<key>, meta+<key> |

Examples:

await term.press('Enter')       // Enter key
await term.press('ctrl+c')      // Ctrl+C (interrupt)
await term.press('ctrl+l')      // Ctrl+L (clear)
await term.press('alt+Tab')     // Alt+Tab
await term.press('shift+Tab')   // Shift+Tab (reverse tab)
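
The spec does not say how a driver should split a key string into modifiers and a key. One possible approach is sketched below; parseKey is a hypothetical name, and both the '+' separator rule and the case-insensitive modifier matching are assumptions.

```javascript
// Hypothetical parser, for illustration only. Assumes '+' separates
// modifiers from the final key and that modifier names are
// case-insensitive; the spec does not state either rule.
const MODIFIERS = new Set(['ctrl', 'alt', 'shift', 'meta'])

function parseKey(combo) {
  let parts = combo.split('+')
  let key = parts.pop()                        // last segment is the key itself
  if (key === '') throw new Error(`Missing key in: ${combo}`)
  let modifiers = parts.map(part => part.toLowerCase())
  for (let modifier of modifiers) {
    if (!MODIFIERS.has(modifier)) {
      throw new Error(`Unknown modifier: ${modifier}`)
    }
  }
  return { key, modifiers }
}
```

Under these assumptions, parseKey('ctrl+c') yields the key c with a single ctrl modifier, and parseKey('Enter') yields the key Enter with no modifiers.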

session.waitForStable(options?): Promise<void>

Wait for the terminal output to stabilize (stop changing).

await term.waitForStable()
await term.waitForStable({ timeout: 5000 })

Options:

| Option | Type | Default | Description |
| --- | --- | --- | --- |
| timeout | number | 10000 | Max wait time in ms |
| interval | number | 100 | Time between stability checks in ms |
| stable | number | 3 | Consecutive stable frames required |

Throws:

  • TimeoutError - Terminal did not stabilize within timeout

Notes:

  • Compares screenshots to detect changes
  • Useful after running commands or interacting with TUIs
  • More reliable than arbitrary sleep() calls
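
To make the three options concrete, here is a minimal sketch of a stability loop. It is one reading of the option table, not the reference implementation: capture is a hypothetical function resolving to the current frame as a Buffer, "stable" is interpreted as that many consecutive identical captures, and a plain Error stands in for TimeoutError.

```javascript
// Sketch of a polling loop under the assumptions stated above.
// `capture` is hypothetical; real drivers may compare frames differently.
async function waitUntilStable(capture, { timeout = 10000, interval = 100, stable = 3 } = {}) {
  let deadline = Date.now() + timeout
  let previous = await capture()
  let unchanged = 1  // the first frame counts as one observation

  while (unchanged < stable) {
    if (Date.now() >= deadline) {
      // A real implementation would throw TimeoutError here
      throw new Error(`Terminal did not stabilize within ${timeout}ms`)
    }
    await new Promise(resolve => setTimeout(resolve, interval))
    let current = await capture()
    // Any change resets the count; an identical frame extends the streak
    unchanged = current.equals(previous) ? unchanged + 1 : 1
    previous = current
  }
}
```

Byte-for-byte comparison is the simplest choice; a driver whose output includes a blinking cursor would likely need a more tolerant comparison.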

session.waitForText(text, options?): Promise<void>

Wait for specific text to appear in the terminal.

await term.waitForText('$')  // Wait for prompt
await term.waitForText('Build complete')

Parameters:

  • text: string - Text to wait for

Options:

| Option | Type | Default | Description |
| --- | --- | --- | --- |
| timeout | number | 10000 | Max wait time in ms |

Throws:

  • TimeoutError - Text did not appear within timeout

Notes:

  • Implementation is driver-dependent
  • Some drivers read terminal buffer directly
  • Others may use OCR (less reliable)
  • Not all drivers support this method
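
For drivers that can read the terminal buffer as text, the wait could be a simple poll. The sketch below assumes a hypothetical readText function that resolves to the visible screen text; the interval option is also an assumption, since the table above lists only timeout, and a plain Error stands in for TimeoutError.

```javascript
// Sketch for buffer-reading drivers. `readText` and `interval` are
// assumptions made for this example, not part of the spec.
async function pollForText(readText, text, { timeout = 10000, interval = 100 } = {}) {
  let deadline = Date.now() + timeout
  while (true) {
    let screen = await readText()
    if (screen.includes(text)) return  // plain substring match
    if (Date.now() >= deadline) {
      // A real implementation would throw TimeoutError here
      throw new Error(`Text "${text}" did not appear within ${timeout}ms`)
    }
    await new Promise(resolve => setTimeout(resolve, interval))
  }
}
```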

session.screenshot(options?): Promise<Buffer | string>

Capture a screenshot of the terminal.

// Get buffer
let buffer = await term.screenshot()

// Save to file
let path = await term.screenshot({ path: 'output.png' })

// Save with auto-generated name
let namedPath = await term.screenshot({ name: 'help-screen' })
// -> .tui-driver/screenshots/help-screen.png

Options:

| Option | Type | Description |
| --- | --- | --- |
| path | string | Full path to save screenshot |
| name | string | Name for screenshot (saved to screenshotDir) |

Returns:

  • If path or name provided: string (path to saved file)
  • Otherwise: Buffer (PNG image data)
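
The return rules can be read as a small decision function. In the sketch below, resolveScreenshotPath is a hypothetical helper, and giving path precedence when both options are present is an assumption; the spec does not say which one wins.

```javascript
// Hypothetical helper mirroring the return rules above.
// Assumption: `path` wins when both `path` and `name` are given.
function resolveScreenshotPath(options = {}, screenshotDir = '.tui-driver/screenshots') {
  if (options.path) return options.path
  if (options.name) return `${screenshotDir}/${options.name}.png`
  return null  // no destination: the caller returns the PNG Buffer instead
}
```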

session.close(): Promise<void>

Close the terminal session and clean up resources.

await term.close()

Notes:

  • Kills any running processes
  • Cleans up temporary files
  • Safe to call multiple times
  • Should be called in test cleanup/finally block
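
One way to guarantee the cleanup described above is a small wrapper around try/finally. withTerminal is a hypothetical helper, not part of this spec; launch is any function that resolves to a session, for example () => Terminal.launch().

```javascript
// Hypothetical wrapper, not part of the spec.
async function withTerminal(launch, fn) {
  let term = await launch()
  try {
    return await fn(term)
  } finally {
    await term.close()  // runs whether fn resolved or threw
  }
}
```

A test body can then be written as await withTerminal(() => Terminal.launch(), async term => { ... }) without its own cleanup code.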

Driver Interface

Drivers must implement the Driver interface:

/**
 * @typedef {Object} Driver
 * @property {string} name - Driver identifier
 * @property {() => boolean} isAvailable - Check if driver can run in this environment
 * @property {(options: LaunchOptions) => Promise<DriverSession>} launch - Create session
 */

/**
 * @typedef {Object} DriverSession
 * @property {(text: string) => Promise<void>} type
 * @property {(key: string) => Promise<void>} press
 * @property {(options?: WaitForStableOptions) => Promise<void>} waitForStable
 * @property {(text: string, options?: WaitForTextOptions) => Promise<void>} waitForText
 * @property {(options?: ScreenshotOptions) => Promise<Buffer>} screenshot
 * @property {() => Promise<void>} close
 */
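
To show the shape of the two typedefs, here is a hypothetical in-memory driver. It is a sketch for exercising calling code without a real terminal: the input and closed properties are additions for inspection only, and screenshot() returns an empty Buffer rather than PNG data.

```javascript
// Hypothetical in-memory driver; records input instead of driving a terminal.
let fakeDriver = {
  name: 'fake',

  isAvailable() {
    return true  // nothing to install
  },

  async launch(options = {}) {
    let input = []
    let closed = false
    return {
      input,  // extra property, not part of DriverSession
      async type(text) { input.push({ type: text }) },
      async press(key) { input.push({ press: key }) },
      async waitForStable() {},  // nothing renders, so always stable
      async waitForText() {},    // resolves immediately in this fake
      async screenshot() { return Buffer.alloc(0) },  // placeholder, not a PNG
      async close() { closed = true },                // safe to call repeatedly
      get closed() { return closed }                  // extra property for tests
    }
  }
}
```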

Driver Registration

Drivers register themselves with the core library:

// In tui-driver-x11/index.js
import { registerDriver } from 'tui-driver'

registerDriver({
  name: 'x11',

  isAvailable() {
    // Check for Xvfb, xterm, xdotool, etc.
    return checkDependencies()
  },

  async launch(options) {
    // Create and return a DriverSession
    return createX11Session(options)
  }
})

Driver Discovery

When Terminal.launch() is called:

  1. Load config file (tui-driver.config.js)
  2. Look up driver by name from config
  3. Check driver.isAvailable()
  4. If available, call driver.launch(options)
  5. Wrap in TerminalSession and return
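
The steps above can be sketched as follows. This is an illustration under assumptions: the Map-based registry, the loadConfig parameter, and the mapping of failures to error codes (unknown name to DRIVER_NOT_FOUND, failed availability check to DEPENDENCY_ERROR) are guesses at the intent, and plain Errors carrying a code stand in for the spec's error classes.

```javascript
// Sketch only; see the assumptions in the paragraph above.
let drivers = new Map()

// Local stand-in for the library's registerDriver
function registerDriver(driver) {
  drivers.set(driver.name, driver)
}

function fail(code, message) {
  return Object.assign(new Error(message), { code })
}

async function launch(options = {}, loadConfig = async () => ({})) {
  let config = await loadConfig()            // step 1: load config
  let merged = { ...config, ...options }     // launch options override config
  let driver = drivers.get(merged.driver)    // step 2: look up by name
  if (!driver) {
    throw fail('DRIVER_NOT_FOUND', `No driver named "${merged.driver}"`)
  }
  if (!driver.isAvailable()) {               // step 3: availability check
    throw fail('DEPENDENCY_ERROR', `Driver "${driver.name}" cannot run here`)
  }
  let session = await driver.launch(merged)  // step 4: create the session
  return session                             // step 5: TerminalSession wrapping omitted
}
```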

Errors

All errors extend TuiDriverError:

class TuiDriverError extends Error {
  code: string
}

class DriverNotFoundError extends TuiDriverError {
  code = 'DRIVER_NOT_FOUND'
  driver: string  // Requested driver name
}

class DependencyError extends TuiDriverError {
  code = 'DEPENDENCY_ERROR'
  missing: string[]  // List of missing dependencies
}

class TimeoutError extends TuiDriverError {
  code = 'TIMEOUT'
  timeout: number  // Timeout value in ms
}

class SessionClosedError extends TuiDriverError {
  code = 'SESSION_CLOSED'
}
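
The declarations above use type annotations that plain JavaScript does not accept. A runnable rendering of two of the classes might look like the sketch below; the constructor signatures and messages are assumptions, and describeError is a hypothetical helper showing how callers could branch on code.

```javascript
// Runnable JavaScript rendering of part of the hierarchy above.
// Constructor signatures and messages are assumptions.
class TuiDriverError extends Error {
  constructor(message, code) {
    super(message)
    this.name = new.target.name  // e.g. 'TimeoutError' for subclasses
    this.code = code
  }
}

class TimeoutError extends TuiDriverError {
  constructor(timeout) {
    super(`Timed out after ${timeout}ms`, 'TIMEOUT')
    this.timeout = timeout  // timeout value in ms
  }
}

class SessionClosedError extends TuiDriverError {
  constructor() {
    super('Session is closed', 'SESSION_CLOSED')
  }
}

// Hypothetical helper: branch on `code` rather than on message text.
function describeError(error) {
  if (!(error instanceof TuiDriverError)) return 'unexpected error'
  switch (error.code) {
    case 'TIMEOUT': return `timed out after ${error.timeout}ms`
    case 'SESSION_CLOSED': return 'session already closed'
    default: return `tui-driver error: ${error.code}`
  }
}
```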

Available Drivers

x11 (tui-driver-x11)

Uses Xvfb + xterm + xdotool for headless terminal rendering on Linux.

System dependencies:

  • xvfb - Virtual framebuffer
  • xterm - Terminal emulator
  • xdotool - Keyboard/mouse automation
  • imagemagick - Screenshot conversion

Platforms: Linux

Install deps: apt-get install xvfb xterm xdotool imagemagick

pty (tui-driver-pty) [Future]

JavaScript driver using xterm-headless and node-pty.

System dependencies: None at runtime (node-pty is a native addon, so installing it may require a C++ build toolchain)

Platforms: Linux, macOS, Windows

Notes:

  • Most portable option
  • Deterministic rendering
  • No real terminal emulator - renders internally

iterm2 (tui-driver-iterm2) [Future]

Uses iTerm2's Python scripting API for native macOS testing.

System dependencies:

  • iTerm2 with Python API enabled

Platforms: macOS

kitty (tui-driver-kitty) [Future]

Uses Kitty terminal's remote control protocol.

System dependencies:

  • Kitty terminal

Platforms: Linux, macOS


Integration with Vizzly

tui-driver produces PNG screenshots. Vizzly compares screenshots. They work together naturally:

// tests/cli-visual.test.js
import { test } from 'node:test'
import { Terminal } from 'tui-driver'

test('vizzly help renders correctly', async () => {
  let term = await Terminal.launch()

  try {
    await term.type('vizzly --help')
    await term.press('Enter')
    await term.waitForStable()
    await term.screenshot({ name: 'vizzly-help' })
  } finally {
    // Close the session even if a step above throws
    await term.close()
  }
})

Run with Vizzly:

vizzly tdd run "node --test tests/cli-visual.test.js"

Vizzly picks up the screenshots from .tui-driver/screenshots/ and handles comparison, baselines, and review.


Changelog

0.1.0 (Draft)

  • Initial specification
  • Core API: launch, type, press, waitForStable, waitForText, screenshot, close
  • Driver interface defined
  • x11 driver specified