Large-recording capture/replay — chunked disk layout, lazy step-load, GB-scale #144

@thomas-stegemann

Why

Recordings today work from a few KB up to single-digit MB before the architecture pushes back:

  • localStorage (bowire_recordings) — 5-10 MB origin quota. Past that, the workbench fails to save new steps.
  • Disk (~/.bowire/recordings.json) — single combined JSON, loaded into memory on every read, rewritten on every write. A few MB starts feeling laggy; tens of MB blocks the UI; a few GB is impossible.
  • In-flight memory — each recording is held fully in recordingsList[] in the workbench. Big bodies (binary responses, large GraphQL payloads, image base64) sit in memory the whole session.

That's fine for the "capture a 6-step login flow, replay it as a test" use case. It breaks the moment an operator wants to:

  • Record an MQTT subscription with 10k messages for replay analysis.
  • Capture a gRPC stream that runs for an hour (large body + duration).
  • Snapshot the response-set of a paginated API across thousands of pages.
  • Build a fixture for an LLM context evaluation (large structured responses).

All of these are real workflows that the workbench's positioning ("debug + test + mock everything multi-protocol") implies are in scope.

Proposal — chunked, lazy-loaded recording storage

Storage shape

Each recording lives in its own directory under ~/.bowire/recordings/<recordingId>/:

~/.bowire/recordings/<recordingId>/
  recording.json          # metadata + step-index manifest
  steps/0000.json         # individual step bodies, chunked
  steps/0001.json
  steps/...
  bodies/<contentHash>    # large binary bodies, content-addressed

  • recording.json carries only the recording-level metadata + a step manifest (id, timestamp, method, byte count, body hash). Stays small (< 1 MB even for 100k-step recordings).
  • Step bodies live in their own JSON files. Append-only during capture; lazy-read on replay.
  • Large bodies (> 1 MB threshold) get content-hashed into bodies/ so duplicate large blobs (same response repeated 50×) deduplicate naturally.
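
The layout and the de-duplication rule above can be sketched as follows. This is a minimal illustration, not the proposed implementation: SHA-256 as the content hash and the helper names `step_path` and `store_body` are assumptions, since the issue does not fix either.

```python
import hashlib
from pathlib import Path

BODY_THRESHOLD = 1024 * 1024  # the 1 MB threshold named in the proposal


def step_path(rec_dir: Path, index: int) -> Path:
    """Map a step index to steps/0000.json, steps/0001.json, ..."""
    return rec_dir / "steps" / f"{index:04d}.json"


def store_body(rec_dir: Path, body: bytes) -> str:
    """Store a large body under bodies/<hash> and return the hash.

    Assumption: SHA-256 hex digest as the content address.
    A body that is already present is not written again.
    """
    digest = hashlib.sha256(body).hexdigest()
    target = rec_dir / "bodies" / digest
    if not target.exists():
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_bytes(body)
    return digest
```

Storing the same response 50 times through `store_body` leaves a single file in bodies/; each step's manifest entry only needs to carry the returned hash.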

Memory shape

  • recordingsList[] carries only the metadata + step count, not the bodies.
  • Step bodies are fetched on demand when the operator opens the detail view or replay starts.
  • Streaming replay reads + dispatches one step at a time, so memory stays flat regardless of total recording size.
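
A rough sketch of that split between list-level metadata and on-demand bodies. The manifest keys (`id`, `steps`, `byteCount`) and the `RecordingSummary` type are hypothetical names chosen for illustration; the issue only says the manifest carries id, timestamp, method, byte count, and body hash per step.

```python
import json
from dataclasses import dataclass
from pathlib import Path


@dataclass
class RecordingSummary:
    """What a list entry would hold: metadata and counts, no bodies."""
    recording_id: str
    step_count: int
    total_bytes: int


def load_summary(rec_dir: Path) -> RecordingSummary:
    """Read only recording.json; step files are not touched."""
    manifest = json.loads((rec_dir / "recording.json").read_text(encoding="utf-8"))
    steps = manifest["steps"]  # assumed key for the step-index manifest
    return RecordingSummary(
        recording_id=manifest["id"],
        step_count=len(steps),
        total_bytes=sum(entry["byteCount"] for entry in steps),
    )


def load_step(rec_dir: Path, index: int) -> dict:
    """Fetch a single step body, e.g. when the detail view opens it."""
    path = rec_dir / "steps" / f"{index:04d}.json"
    return json.loads(path.read_text(encoding="utf-8"))
```

With this shape the sidebar can be built from `load_summary` alone, and `load_step` runs only for the step the operator actually opens.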

Capture

  • Each captured step writes its body to the corresponding steps/N.json immediately (append-write, fsync per step). No in-memory accumulation.
  • Workbench keeps a circular "last 50 steps" preview cache for the detail-pane scroll buffer.
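
The capture path could look roughly like this. It is a sketch under assumptions: `CaptureWriter` is a hypothetical name, steps are plain dicts, and durability is limited to flushing and fsyncing each step file (the containing directory is not synced here).

```python
import json
import os
from collections import deque
from pathlib import Path

PREVIEW_SIZE = 50  # the "last 50 steps" preview cache


class CaptureWriter:
    """Writes each step to its own file as it arrives; keeps only a bounded preview."""

    def __init__(self, rec_dir: Path) -> None:
        self.steps_dir = rec_dir / "steps"
        self.steps_dir.mkdir(parents=True, exist_ok=True)
        self.count = 0
        # deque(maxlen=...) drops the oldest entry once the cache is full.
        self.preview = deque(maxlen=PREVIEW_SIZE)

    def append(self, step: dict) -> Path:
        path = self.steps_dir / f"{self.count:04d}.json"
        # Mode "x" fails instead of overwriting an existing step file.
        with open(path, "x", encoding="utf-8") as f:
            json.dump(step, f)
            f.flush()
            os.fsync(f.fileno())  # push this step's bytes to disk before returning
        self.preview.append(step)
        self.count += 1
        return path
```

After 60 captured steps, 60 files exist on disk while the in-memory preview holds only steps 10 through 59.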

Replay

  • Replay reads steps in order from disk, decoding one at a time.
  • Very large bodies are stream-decoded directly into the target socket / channel without materializing the whole body in JS.
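
One way the replay loop might be shaped, shown as a sketch only. The step fields `body` and `bodyHash` are hypothetical, and the "target" is modeled as any writable binary stream rather than a real protocol channel.

```python
import json
from pathlib import Path
from typing import BinaryIO, Iterator

CHUNK = 64 * 1024  # assumed chunk size for streaming large bodies


def iter_steps(rec_dir: Path) -> Iterator[dict]:
    """Yield steps in capture order; only one decoded step is alive at a time."""
    paths = sorted((rec_dir / "steps").glob("*.json"), key=lambda p: int(p.stem))
    for path in paths:
        with open(path, encoding="utf-8") as f:
            yield json.load(f)


def stream_body(rec_dir: Path, content_hash: str, sink: BinaryIO) -> None:
    """Copy bodies/<hash> into the sink chunk by chunk, never loading it whole."""
    with open(rec_dir / "bodies" / content_hash, "rb") as src:
        while chunk := src.read(CHUNK):
            sink.write(chunk)


def replay(rec_dir: Path, sink: BinaryIO) -> int:
    """Dispatch every step to the sink; returns the number of steps replayed."""
    replayed = 0
    for step in iter_steps(rec_dir):
        if "bodyHash" in step:  # large body stored under bodies/
            stream_body(rec_dir, step["bodyHash"], sink)
        else:  # small body stored inline in the step file
            sink.write(step["body"].encode("utf-8"))
        replayed += 1
    return replayed
```

Sorting by the numeric file stem rather than by name keeps the order correct if a recording ever exceeds the four-digit padding shown in the layout.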

Storage backends

Limits + UX

  • A hard size cap (default 5 GB per recording, configurable via Bowire:Recording:MaxBytes) so a runaway capture doesn't fill the disk.
  • The Recordings sidebar shows per-recording size + step count.
  • A "Trim recording" action lets the operator drop steps before a chosen index (e.g. discard the warm-up calls).
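
The cap check and the trim action reduce to simple bookkeeping over the manifest. A sketch, assuming each manifest entry carries a `byteCount` field (hypothetical key name) and that the configured `Bowire:Recording:MaxBytes` value is passed in as `max_bytes`; `RecordingFull`, `admit`, and `trim_before` are illustrative names.

```python
DEFAULT_MAX_BYTES = 5 * 1024**3  # 5 GB default from the proposal


class RecordingFull(Exception):
    """Raised when accepting a step would exceed the per-recording cap."""


def admit(current_bytes: int, step_bytes: int, max_bytes: int = DEFAULT_MAX_BYTES) -> int:
    """Return the new running total, or refuse the step if it would pass the cap."""
    new_total = current_bytes + step_bytes
    if new_total > max_bytes:
        raise RecordingFull(f"cap of {max_bytes} bytes reached; step of {step_bytes} bytes rejected")
    return new_total


def trim_before(manifest: list, index: int) -> tuple:
    """Drop manifest entries before `index`; returns (kept entries, bytes freed)."""
    dropped, kept = manifest[:index], manifest[index:]
    freed = sum(entry["byteCount"] for entry in dropped)
    return kept, freed
```

Deleting the dropped step files, and deciding whether a body in bodies/ is still referenced by a kept step, would sit on top of this and is left open here.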

Acceptance

  • ~/.bowire/recordings/<id>/recording.json + steps/N.json layout.
  • Append-write per step during capture (no in-memory batch).
  • Body content-addressing for de-duplication.
  • Lazy step-fetch in the detail view.
  • Streaming replay reads one step at a time.
  • Max-bytes hard cap with operator-visible warning.
  • Per-recording size + step count in the sidebar.
  • localStorage holds metadata only from here on; bodies never go there.
  • Existing single-JSON recordings auto-migrate on first launch.
  • Optional SQLite backend behind a config switch.

Composes with

Out of scope

  • Cloud / S3 / blob storage backends — local-first is the foundation; remote backends as a separate plugin once the seam exists.
  • Compression — body content typically compresses well, but adding it changes the on-disk format; it deserves a separate decision.
  • Recording-of-recording (meta-capture) — too narrow.

Labels: area:workbench (UI / workbench surface), roadmap (Tracked on the public Project board)