Switch from CBOR plus postcard to a custom codec called phon. #369

Open
fasterthanlime wants to merge 226 commits into main from phon-codec

@fasterthanlime

The new codec is purpose-built for Vox, and it has both a self-describing mode and a compact mode. It is a close fit for what we are doing. It does not use variable-length integers, because it prioritizes decode speed over payload size. The assumption is that we're dealing with non-embedded hardware, and that we care about throughput.

We still assume that both ends might have different schemas, and we bank on having to do some amount of translation. We are building a JIT from day one, but it is not based on Cranelift; it is based on the copy-and-patch technique. The JIT is a day-one concern, not just for Rust, but also for Swift, and also for TypeScript, in the form of emitted JavaScript code that is evaluated through new Function.

Besides using plain fixed-width little-endian integers, the codec also tries to maintain alignment. In practice, I think it should be really nice to use.
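To make the "fixed-width plus alignment" idea concrete, here is a minimal sketch. It assumes each integer is padded with zero bytes to its natural alignment relative to the start of the buffer; the helper names (`pad_to`, `put_u32`, and so on) are hypothetical and not phon's API, and phon's actual padding rules may differ.

```rust
/// Hypothetical helper: append zero bytes until `buf.len()` is a multiple of `align`.
fn pad_to(buf: &mut Vec<u8>, align: usize) {
    let rem = buf.len() % align;
    if rem != 0 {
        buf.resize(buf.len() + (align - rem), 0);
    }
}

/// A single byte needs no alignment.
fn put_u8(buf: &mut Vec<u8>, v: u8) {
    buf.push(v);
}

/// Fixed-width little-endian u32, aligned to 4 bytes (assumed rule).
fn put_u32(buf: &mut Vec<u8>, v: u32) {
    pad_to(buf, 4);
    buf.extend_from_slice(&v.to_le_bytes());
}

/// Fixed-width little-endian u64, aligned to 8 bytes (assumed rule).
fn put_u64(buf: &mut Vec<u8>, v: u64) {
    pad_to(buf, 8);
    buf.extend_from_slice(&v.to_le_bytes());
}

/// Encodes the tuple (0xAB_u8, 7_u32, 9_u64) under the assumed rules.
fn encode_example() -> Vec<u8> {
    let mut buf = Vec::new();
    put_u8(&mut buf, 0xAB); // offset 0
    put_u32(&mut buf, 7);   // three padding bytes, then offsets 4..8
    put_u64(&mut buf, 9);   // already 8-aligned, offsets 8..16
    buf
}
```

The trade-off is visible in the byte count: the three values occupy 16 bytes where a varint encoding would need 3, but a decoder can load each integer with one aligned read.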

Branch for migrating vox's codec to phon. Repoints all facet/facet-format/
facet-cargo-toml crates at the local sibling checkouts (the same facet phon uses)
via [patch.crates-io], and forces them local in the lockfile (facet* 0.46.5,
facet-format* local). Fixes the first sealed-enum fallout: vox-postcard's
serialize_dynamic_value now has a _ arm for the non_exhaustive ValueType.
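
The sealed-enum fallout mentioned above follows a common pattern, sketched here. The `ValueType` enum below is a local stand-in with made-up variants, not facet's actual definition, and `type_tag` is a hypothetical function. Within a single crate the compiler does not require the wildcard; for a `#[non_exhaustive]` enum defined in another crate it does.

```rust
/// Stand-in for a sealed, non-exhaustive enum owned by another crate.
#[non_exhaustive]
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum ValueType {
    Null,
    Bool,
    Number,
    String,
}

/// Hypothetical serializer step: map a value type to a wire tag.
pub fn type_tag(t: ValueType) -> Result<u8, String> {
    match t {
        ValueType::Null => Ok(0),
        ValueType::Bool => Ok(1),
        ValueType::Number => Ok(2),
        // Catch-all arm: handles variants this code does not know about,
        // including ones the upstream crate may add later.
        other => Err(format!("unsupported value type: {other:?}")),
    }
}
```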

BLOCKED on the rest of the tree: figue 2.0.6 (crates.io, what vox pins) does not
compile against the sealed facet, and the local figue is 4.0.3 (incompatible
major) so it can't be patched in. moire and others likely need the same sealed-
enum migration. Gated on the facet+figue+moire releases, or a vox->figue-4 bump.
…acet

xtask's figue 2.0.x predated the facet seal; bump to the local figue 4.0.3
(patched in, with its own sealed-enum _ arms) so the whole vox workspace compiles
against the same local facet phon uses. cargo check --workspace is green.

Add phon_encode / phon_decode / phon_decode_borrowed to the codec::args
divan module so phon's copy-and-patch JIT is measured in the SAME harness,
on the same logical GnarlyPayload, as vox-jit (Cranelift) and vox-postcard
(reflective). Programs + NativeEncode/NativeDecode are compiled outside the
timed loop, mirroring the vox-jit arms.

phon wire is NOT byte-compatible with vox-postcard for this payload: postcard
uses LEB128 varints for ints/lengths, phon uses fixed-width LE (GnarlyPayload
n=1: phon 572 bytes vs postcard 486). Per the task, we don't force one onto
the other — each codec round-trips its own wire. Correctness is asserted
before each bench.
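
The size gap comes from how integers are written. The sketch below contrasts an unsigned LEB128 varint with a fixed-width little-endian u64; the function names are illustrative and belong to neither postcard nor phon, and alignment padding is left out.

```rust
/// Unsigned LEB128: 7 payload bits per byte, high bit set on all but the last byte.
fn write_leb128(buf: &mut Vec<u8>, mut v: u64) {
    loop {
        let byte = (v & 0x7f) as u8;
        v >>= 7;
        if v == 0 {
            buf.push(byte);
            break;
        }
        buf.push(byte | 0x80);
    }
}

/// Fixed-width little-endian: always 8 bytes for a u64.
fn write_fixed_u64(buf: &mut Vec<u8>, v: u64) {
    buf.extend_from_slice(&v.to_le_bytes());
}

/// Returns (varint total, fixed-width total) in bytes for the same values.
fn encoded_sizes(values: &[u64]) -> (usize, usize) {
    let mut varint = Vec::new();
    let mut fixed = Vec::new();
    for &v in values {
        write_leb128(&mut varint, v);
        write_fixed_u64(&mut fixed, v);
    }
    (varint.len(), fixed.len())
}
```

Small values favor the varint heavily, which is consistent with the phon payload being larger; the fixed-width form wins on decode because there is no per-byte branch.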

phon{,-engine,-ir,-jit} added as path deps to ../../../phon/rust; transitive
facet resolves through vox's local [patch.crates-io].

A net-new crate (touches nothing existing) that encodes/decodes #[derive(Facet)]
values through phon's typed schema-driven path, mirroring the slice of
vox-postcard's surface the driver uses: to_vec / from_slice. It derives the
schema + descriptor from the facet Shape and runs phon's interpreter codec.

The wire is phon-compact (fixed-width LE + u32 length prefixes + alignment
padding), deliberately NOT postcard-compatible — the codec swap breaks the old
wire by design.

Foundation for the Rust codec migration: declares the phon crates in
[workspace.dependencies] and adds vox-phon to the workspace. Tests round-trip a
rich struct (scalars/String/Option/Vec/nested/enum/u64), every enum variant
(unit/newtype/struct), and empty collections + None. clippy clean.

Follow-ups: per-type descriptor/program caching, the native JIT fast path,
borrowed/zero-copy decode, and Message-envelope handling for its opaque
Payload/CborPayload fields (the driver swap).

First step of removing the failed retry/idempotency experiment. Deletes
vox-core's operation_store module + its lib.rs wiring. Build is RED until the
rest of the excision lands (driver dedup/seal/replay block, Handler::retry_policy,
the macro's retry codegen + snapshots, vox-codegen targets, RetryPolicy/
OperationId/retry_support, the supports_retry handshake wire field, testbed/spec,
tests). Committed to save progress; not buildable yet.

Removes the entire operation-dedup/seal/replay machinery from the driver:
LiveOperationTracker + LiveOperation + AdmitResult + CancelResult, the
request-loop dedup block, the persistent-store lookup/admit/seal, the
replay path (replay_sealed_response + prepare_replay_schemas +
prepare_response_from_shape + incoming_args_bytes), DriverShared.operations
+ next_operation_id, Driver.live_operations, DriverReplySink's operation
fields, with_operation_store, and the operation_id generation. Cancel now
just aborts the in-flight handler; the retry.idem/persist branches collapse
to their non-retry behavior. session/mod + session/builders drop the
operation_store config field, 5 builder methods, and the PendingConnection
plumbing.

RetryPolicy / retry_policy / OperationId / supports_retry remain as inert
vestiges (removed in the next passes: vox-types, the #[service] macro +
snapshots, vox-codegen, the handshake wire field, testbed/spec, tests).
Workspace lib+bins compile; retry/store tests still reference removed APIs
and are handled next.

vox-types: removed RetryPolicy + MethodDescriptor.retry, retry_support module
(OperationId, PostcardPayload, metadata helpers, ChannelRetryMode),
Handler::retry_policy, supports_retry/peer_supports_retry handshake fields;
collapsed method_descriptor_with_retry into method_descriptor.
macro: stopped generating retry_policy impl, RetryPolicy in descriptors,
channel_retry_mode + ensure_channel_retry_mode, the persist-channel validation.
vox-core: removed the retry-on-resume select arm.

Still red: peer_supports_retry cascade (handshake/session/driver), vox-codegen
(.retry reads + retry_policy gen), tests + snapshots. Committed to save progress.
…eld gone

vox-core fully clean (lib): removed peer_supports_retry from DriverCaller +
Driver + ConnectionHandle + the caller-side resume re-send + the handshake
negotiation (supports_retry param dropped from handshake_as_initiator/acceptor
+ all call sites); Handler::retry_policy + the ErasedHandler forwarding
overrides gone. vox-types: handshake supports_retry/peer_supports_retry fields
gone (wire change). session/mod cleaned.

Remaining (build red): vox-codegen emits retry into generated Swift/TS clients
(+ the Swift runtime call(retry:) API), the testbed/spec, tests + snapshots.

Completes the retry removal across the library + binary surface:
- vox-types: RetryPolicy, OperationId, retry_support, Handler::retry_policy,
  supports_retry/peer_supports_retry handshake fields;
  method_descriptor_with_retry collapsed into method_descriptor.
- macro: no longer generates retry_policy impl, RetryPolicy descriptors,
  channel_retry_mode, ensure_channel_retry_mode, or the persist validation.
- vox-codegen: Swift client/server + TS schema no longer emit retry.
- vox-core: full peer_supports_retry + caller-resume removal; handshake_as_*
  drop the supports_retry param (+ all call sites incl. spec-tests).
- vox: dropped all retry re-exports.

cargo build (lib+bins) is green. Remaining: tests + snapshots still reference
removed retry/store APIs (driver_tests, tests/utils, macro tests, codegen
tests, .snap files); handled next. The Swift runtime package (RetryPolicy +
call(retry:)) is a separate Swift-side follow-up.

Completes the removal across the test surface:
- vox-core tests: deleted the operation-store/dedup/replay/persist tests and
  their helpers (PersistentReplyingHandler, OperationIdHandler, ReplayHandler,
  CountingOperationStore, the BreakableLink resume-test infra); stripped
  retry/peer_supports_retry from the remaining tests + helpers.
- vox-macros-core: deleted method_retry_helper_attributes +
  rejects_persist_methods_with_channel_arguments; regenerated all service
  snapshots (no more retry_policy impl / RetryPolicy descriptors /
  method_descriptor_with_retry); removed the orphaned retry snapshot.
- vox-codegen: deleted the swift/ts retry-emission tests + imports.
- vox + spec-tests: stripped peer_supports_retry from integration tests.
- request_context_opt_in_end_to_end now expects describe:0 — the client no
  longer auto-injects operation-id metadata, so the request carries none.

Whole vox workspace: cargo build + 1031 nextest tests + clippy all green. The
retry/idempotency/operation-store/stable-conduit-resume experiment is fully
gone from the Rust side. (Swift runtime package's RetryPolicy + call(retry:)
remain as a separate Swift-side follow-up.)

Add from_slice_borrowed (zero-copy: &str/&[u8]/Cow/opaque payloads point into
the input, lifetime-tied). A full Message (RequestCall with an inline
Payload::Value) now round-trips through phon: the envelope encodes with the
opaque payload sub-encoded inline, borrowed-decode yields a zero-copy payload
span + Cow metadata borrowing the wire, and the span re-decodes to the args.
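
For readers unfamiliar with borrowed decode, a minimal sketch of the idea: the decoded `&str` is a view into the input buffer, so no bytes are copied and the result cannot outlive the input. The wire shape here (a u32 little-endian length followed by UTF-8 bytes, with no alignment padding) is a simplification, and `write_str` / `read_str_borrowed` are hypothetical names, not vox-phon functions.

```rust
/// Writes a string as a u32 little-endian length followed by its UTF-8 bytes.
pub fn write_str(buf: &mut Vec<u8>, s: &str) {
    buf.extend_from_slice(&(s.len() as u32).to_le_bytes());
    buf.extend_from_slice(s.as_bytes());
}

/// Decodes one string without copying: the returned `&str` points into `input`.
/// Also returns the remaining, not yet decoded bytes.
pub fn read_str_borrowed(input: &[u8]) -> Option<(&str, &[u8])> {
    let head = input.get(..4)?;
    let len = u32::from_le_bytes([head[0], head[1], head[2], head[3]]) as usize;
    let end = 4usize.checked_add(len)?;
    let body = input.get(4..end)?; // None if the input is truncated
    let text = std::str::from_utf8(body).ok()?;
    Some((text, &input[end..]))
}
```

Lifetime elision ties both returned references to `input`, which is the "lifetime-tied" property the commit describes.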

This validates the phon opaque op against vox's real Message + PayloadAdapter.

The Message envelope encode/decode now goes through phon (vox_phon::to_vec /
from_slice_borrowed) instead of vox-jit (Cranelift) + vox-postcard. The envelope
is a fixed protocol type, so it is single-schema: no translation plan, no
per-conduit encoder/decoder, and the native/wasm codec split collapses (the phon
interpreter runs everywhere). MessagePlan is now vestigial — with_message_plan
accepts and drops it so session construction sites are unchanged for now.
BareConduitError now wraps vox_phon::Error. Removed the dead JIT-decoder
deserialize helper. bare_conduit round-trip test passes.

The codec swap for schema exchange (CBOR self-describing -> phon self-describing):
- schema_bytes::<T>() encodes T's phon schema closure (root id + reachable
  composites) as self-describing bytes (framed: u64 root, u32 count, per-schema
  u32 len + schema_to_bytes).
- parse_schema_bytes -> SchemaBundle { root, schemas }.
- build_decode_program::<T>(writer) reconciles the writer schema against T's
  derived descriptor via phon lower_decode (r[compat.plan-first]) — cache per
  (writer root, T), reuse per message. decode_with_program runs it (zero-copy).
- decode_compat one-shot convenience.
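
The framing in the first bullet (u64 root, u32 count, then a u32 length plus bytes per schema) can be sketched as follows. This assumes little-endian integers and treats each schema as an opaque byte string; `frame_bundle` and `parse_bundle` are hypothetical names, and the field types of `SchemaBundle` are guesses rather than vox-phon's real definition.

```rust
#[derive(Debug, PartialEq)]
pub struct SchemaBundle {
    pub root: u64,
    pub schemas: Vec<Vec<u8>>, // each entry stands in for one schema's bytes
}

/// Frames a bundle as: u64 root, u32 count, then (u32 len, bytes) per schema.
pub fn frame_bundle(bundle: &SchemaBundle) -> Vec<u8> {
    let mut out = Vec::new();
    out.extend_from_slice(&bundle.root.to_le_bytes());
    out.extend_from_slice(&(bundle.schemas.len() as u32).to_le_bytes());
    for schema in &bundle.schemas {
        out.extend_from_slice(&(schema.len() as u32).to_le_bytes());
        out.extend_from_slice(schema);
    }
    out
}

/// Splits `n` bytes off the front of `input`, or returns None if too short.
fn take<'a>(input: &mut &'a [u8], n: usize) -> Option<&'a [u8]> {
    let current: &'a [u8] = *input;
    if current.len() < n {
        return None;
    }
    let (head, tail) = current.split_at(n);
    *input = tail;
    Some(head)
}

fn u32_le(b: &[u8]) -> u32 {
    u32::from_le_bytes([b[0], b[1], b[2], b[3]])
}

fn u64_le(b: &[u8]) -> u64 {
    let mut a = [0u8; 8];
    a.copy_from_slice(b);
    u64::from_le_bytes(a)
}

/// Parses the framing back; rejects truncated input and trailing bytes.
pub fn parse_bundle(mut input: &[u8]) -> Option<SchemaBundle> {
    let root = u64_le(take(&mut input, 8)?);
    let count = u32_le(take(&mut input, 4)?);
    let mut schemas = Vec::new();
    for _ in 0..count {
        let len = u32_le(take(&mut input, 4)?) as usize;
        schemas.push(take(&mut input, len)?.to_vec());
    }
    if input.is_empty() {
        Some(SchemaBundle { root, schemas })
    } else {
        None
    }
}
```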

This is THE decode path for evolvable wire types — no single-schema shortcut.
Test: a real exchange reconciling writer-only (skipped) + reader-only-default
fields end to end.
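
A rough picture of what "plan first" reconciliation produces, under the assumption that fields are matched by name: one step per writer field in wire order, plus the list of reader fields that must be defaulted. The `Step`, `Plan`, and `plan` names are hypothetical; phon's lower_decode works on real schemas and descriptors, not on bare field names.

```rust
#[derive(Debug, PartialEq, Clone, Copy)]
pub enum Step {
    /// Decode this writer field into the reader field at `reader_index`.
    Read { reader_index: usize },
    /// Writer-only field: decode past it and discard the value.
    Skip,
}

#[derive(Debug, PartialEq)]
pub struct Plan {
    /// One step per writer field, in wire order.
    pub steps: Vec<Step>,
    /// Reader field indices absent from the writer; filled with defaults.
    pub defaults: Vec<usize>,
}

/// Builds the plan once per (writer schema, reader type); it can then be
/// cached and reused for every message from that writer.
pub fn plan(writer_fields: &[&str], reader_fields: &[&str]) -> Plan {
    let mut steps = Vec::new();
    for w in writer_fields {
        match reader_fields.iter().position(|r| r == w) {
            Some(i) => steps.push(Step::Read { reader_index: i }),
            None => steps.push(Step::Skip),
        }
    }
    let mut defaults = Vec::new();
    for (i, r) in reader_fields.iter().enumerate() {
        if !writer_fields.contains(r) {
            defaults.push(i);
        }
    }
    Plan { steps, defaults }
}
```

With a writer that sends `id, legacy, name` and a reader that declares `name, id, added`, the plan reads `id` into slot 1, skips `legacy`, reads `name` into slot 0, and defaults slot 2.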
…alue)

Per the design: metadata becomes a flat self-describing facet_value::Value map
(carried on the wire as phon Dynamic), no duplicate keys, and per-key flags
become well-known keys (vox:sensitive / vox:no-propagate). The Cow-based
MetadataEntry/MetadataValue wire enums are deleted.

- metadata.rs: type Metadata = Value; fluent metadata() builder; MetadataExt
  read accessors (meta_str/u64/bytes/len/is_sensitive/...); meta_set construction
  primitive; MetadataFlags kept as an API-only convenience (not a wire type) that
  maps to the well-known keys.
- Message/handshake/calls/channel: metadata fields lose their lifetime; the 7
  metadata-only message structs are now lifetime-free with identity Reborrow.
- client_middleware: dropped the OwnedMetadata borrow machinery (Value owns its
  data); push_*_metadata go through meta_set.
- request_context/server_middleware: borrow &Metadata instead of &[MetadataEntry].

Lean-on-defaults: empty metadata is just Value::default() (null), reads as empty.
- ConnectionRequest borrows &Metadata; metadata construction goes through the
  metadata() builder / meta_set; metadata reads through MetadataExt (meta_len/etc).
- The session builders + SessionConfig lose their now-unused 'a lifetime
  (metadata is owned). The 6 owned message structs drop their <'static> args.
- driver: drop the OwnedMetadata borrow machinery (ClientRequest::new takes just
  the call now).
- golden-vectors: sample_metadata via the builder + meta_set (flagged 'auth').
- client/server logging redaction rewritten for the Value model: redact values of
  keys marked sensitive (well-known vox:sensitive key), hide the flag keys,
  format string/bytes/u64 values; via MetadataExt.
- ConnectBuilder loses its now-unused 'a (metadata owned); its IntoFuture is 'static.
- lib re-exports MetadataExt instead of the deleted MetadataEntry/MetadataValue.
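
The logging redaction described in this list could look roughly like the sketch below. It makes a loud assumption that the source does not confirm: that the `vox:sensitive` entry holds a list of the key names it applies to. The `Value` enum, the `BTreeMap` representation, and `redacted_for_log` are all stand-ins, not facet_value or vox types.

```rust
use std::collections::BTreeMap;

/// Stand-in for a self-describing value; not facet_value::Value.
#[derive(Debug, Clone, PartialEq)]
pub enum Value {
    Str(String),
    U64(u64),
    Bytes(Vec<u8>),
    List(Vec<Value>),
}

/// Flat map with unique keys (assumed representation).
pub type Metadata = BTreeMap<String, Value>;

const SENSITIVE_KEY: &str = "vox:sensitive";
const NO_PROPAGATE_KEY: &str = "vox:no-propagate";

/// Assumption: the flag key lists the names of the keys it marks.
fn is_sensitive(meta: &Metadata, key: &str) -> bool {
    match meta.get(SENSITIVE_KEY) {
        Some(Value::List(names)) => names
            .iter()
            .any(|n| matches!(n, Value::Str(s) if s == key)),
        _ => false,
    }
}

/// Formats metadata for logs: flag keys are hidden, sensitive values redacted.
pub fn redacted_for_log(meta: &Metadata) -> Vec<String> {
    let mut out = Vec::new();
    for (k, v) in meta {
        if k == SENSITIVE_KEY || k == NO_PROPAGATE_KEY {
            continue; // hide the flag keys themselves
        }
        let shown = if is_sensitive(meta, k) {
            "<redacted>".to_string()
        } else {
            match v {
                Value::Str(s) => format!("{s:?}"),
                Value::U64(n) => n.to_string(),
                Value::Bytes(b) => format!("<{} bytes>", b.len()),
                Value::List(items) => format!("<list of {}>", items.len()),
            }
        };
        out.push(format!("{k}={shown}"));
    }
    out
}
```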
…/spec)

All metadata construction goes through the metadata() builder / meta_set; all
reads through MetadataExt. Empty metadata leans on Default (Value::null). The
now-owned message structs drop their <'static> args everywhere; close_connection
and the acceptor-reject paths pass Default::default(). vox re-exports
metadata/meta_set/MetadataBuilder. Obsolete vox-jit MetadataEntry calibration
test removed (the type is gone). Whole workspace + all targets compile.

SchemaRecvTracker now stores the raw phon schema-closure bytes per (method,
direction); record_received is best-effort/idempotent (relaxed duplicate rule).
schema_deser builds a phon DecodeProgram (lower_decode) from the writer's schema
bytes reconciled against the reader type, cached on the tracker -- killing
vox-jit/vox-postcard in the args/response decode path. vox-phon gains
schema_bytes_for_shape (send side works with &Shape).

Send side (session/mod.rs CBOR production + proxy) still on the old path -> RED.

The schemas binding now carries phon schema-closure bytes end to end:
- send: SchemaSendTracker simplified to track sent (method,direction) bindings and
  attach vox_phon::schema_bytes_for_shape(wire_shape); relaxed dedup (best-effort,
  no per-schema hashing). PreparedSchemaPlan carries the bytes.
- proxy/relay: forward the peer's stored schema bytes from the recv tracker
  (get_or_plan_binding_from_tracker) instead of reconstructing from a SchemaSource.
- driver error path: encode the erased Result<(),VoxError<Infallible>> inline via
  Payload::outgoing and advertise the method's real response schema
  (prepare_response_for_shape) -- no more vox_postcard::to_vec + PostcardBytes.
- vox_phon: schema_bytes_for_shape + decode_owned_with_program; DecodeProgram is
  Send+Sync (cacheable on the shared tracker).

Args/response decode (schema_deser) no longer touches vox-jit/vox-postcard. Channel
+ envelope + the test-only lib helper are next; then the crate deletes.

decode_channel_payload now uses phon instead of vox-jit + vox-postcard's identity
plan; RxError::Deserialize carries a String. Flagged FIXME(channel-compat): this
stays single-schema (no method/tracker context at item-decode time) -- proper
r[compat.plan-first] needs the writer's element schema threaded from the
channel-establishing method down to the Rx, same gap the identity plan had.

Replace the per-handle inline-ChannelId proxy with the out-of-band design from
the spec (r[rpc.request], r[rpc.channel.payload-encoding]): each Tx/Rx encodes
only a u32 index, and the ChannelIds travel in a new RequestCall.channels list
(mirrors the Fd -> fd-table indirection).

- Tx/Rx: #[facet(proxy = ChannelId)] -> #[facet(opaque = {Tx,Rx}ChannelAdapter<T>)];
  add wire_index: AtomicU32. phon handles opaque adapters (not proxy), so this
  unblocks channel-bearing args through phon with no proxy support needed.
- channel.rs: collect_channels/provide_channels thread-locals + collector/source
  (Fd-isomorphic). The index re-associates the handle to its ChannelId at decode.
- Add RequestCall.channels: Vec<ChannelId>; thread through driver/session forward.
- Validated: case1-4 + round-trip + two_channels_get_distinct_indices.
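
The indirection can be sketched with two thread-locals: a collector that is active while the caller encodes, and a source that is active while the callee decodes. The function names `collect_channels` and `provide_channels` come from the commit, but the signatures below are assumptions, as are `register_channel`, `resolve_channel`, and the `u64` ChannelId. The sketch also does not restore state if the closure panics.

```rust
use std::cell::RefCell;

/// Stand-in; the real ChannelId type is not shown in the source.
pub type ChannelId = u64;

thread_local! {
    static COLLECTOR: RefCell<Option<Vec<ChannelId>>> = RefCell::new(None);
    static SOURCE: RefCell<Option<Vec<ChannelId>>> = RefCell::new(None);
}

/// Send side: runs `encode` with a collector installed and returns the
/// encode result together with the out-of-band channels list.
pub fn collect_channels<R>(encode: impl FnOnce() -> R) -> (R, Vec<ChannelId>) {
    COLLECTOR.with(|c| *c.borrow_mut() = Some(Vec::new()));
    let result = encode();
    let channels = COLLECTOR.with(|c| c.borrow_mut().take()).unwrap_or_default();
    (result, channels)
}

/// Called while encoding a handle: records the id and returns the u32 index
/// that goes on the wire. Returns None outside `collect_channels`.
pub fn register_channel(id: ChannelId) -> Option<u32> {
    COLLECTOR.with(|c| {
        let mut guard = c.borrow_mut();
        let list = guard.as_mut()?;
        list.push(id);
        Some((list.len() - 1) as u32)
    })
}

/// Receive side: runs `decode` with the request's channels list available.
pub fn provide_channels<R>(channels: Vec<ChannelId>, decode: impl FnOnce() -> R) -> R {
    SOURCE.with(|s| *s.borrow_mut() = Some(channels));
    let result = decode();
    SOURCE.with(|s| *s.borrow_mut() = None);
    result
}

/// Called while decoding a handle: maps the wire index back to its ChannelId.
pub fn resolve_channel(index: u32) -> Option<ChannelId> {
    SOURCE.with(|s| s.borrow().as_ref()?.get(index as usize).copied())
}
```

Because each handle carries an explicit index, decode order does not matter: resolving index 1 before index 0 still yields the right ids.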

Envelope + handshake now decode through the phon compat path (lower_decode), not
single-schema: MessagePlan carries the peer's Message schema bytes; BareConduitRx
builds the compat program lazily; handshake is phon self-describing (was CBOR).

Delete vox-jit, vox-jit-abi, vox-jit-cal, vox-jit-tests, vox-swift-abi,
vox-postcard, vox-bench. Migrate Fd and channel-item tests to phon. Fix
schema-tracker tests to the phon schema-bytes API (duplicate-is-error ->
best-effort, per the relaxed rule).

Send: in SessionCore::prepare_outbound_batch, after schema attach, if a Call's
args shape contains channels, pre-encode the args under collect_channels + the
binder so allocated ChannelIds land in call.channels and each handle encodes a
u32 index; swap args to the pre-encoded bytes (lifetime narrowed via message
covariance) and encode the envelope. Channel-free calls keep the lazy zero-copy
path. Add vox_phon::to_vec_for_shape for the type-erased pre-encode.

Recv: the generated dispatch wraps args decode in provide_channels(
request_call.channels) so each handle's inline index re-associates to its
ChannelId before binding (r[rpc.channel.binding]).

Re-export collect_channels/provide_channels from vox.
…an up test fixtures

- Route Payload::PostcardBytes through vox_phon::raw_opaque_bytes (phon's opaque
  passthrough sentinel) so proxy-forwarded / re-sent payloads emit verbatim rather
  than phon trying to derive the RawPostcardBorrowed sentinel — fixes the SIGABRT
  in proxy_connections_forwards_calls.
- Migrate vox-core tests + generate_golden_vectors off vox_postcard to vox_phon;
  add the channels field to all RequestCall literals; rewrite the MessagePlan and
  borrowed-decode tests to the phon compat path.
- Regenerate vox-macros-core snapshots (channels field + provide_channels wrap).

All 62 vox-core tests pass, including channel-over-driver and proxy forwarding.

The send tracker now emits phon schema-closure bytes (not CBOR), so the
bidirectional-bindings and transitive-deps tests parse via vox_phon::parse_schema_bytes
and compare bundle roots. The transitive-deps test now nests a composite (Inner)
since phon's closure carries composites, not inline scalars. Also update the
schema-incompat integration test to match phon's 'Incompatible' error wording.

All pure-Rust tests green (vox-types/vox-phon/vox-core/vox/...). Remaining failures
are TS/Swift codegen + cross-language conformance (the wire moved to phon) — task #7.
…unit)

r[rpc.channel.payload-encoding] / r[rpc.channel.binding]: Tx/Rx encode an explicit
index into the out-of-band channels list and the callee resolves the ChannelId by
that index — keeping re-association correct under the field reorder/skip the
compat path already allows, rather than relying on payload position.
…ostcardBytes -> Payload::Encoded

After the phon codec swap these carry phon bytes, not CBOR/postcard. Clean
whole-word rename across the workspace (~50 sites), the generated client/dispatch
code in vox-macros-core, and the insta snapshots. Refresh the now-stale
'CBOR-encoded'/'Raw bytes' doc comments on the renamed types.

Build + tests green except the two pre-existing vox-codegen channel-schema tests
(task #7: TS/Swift codegen still on the old wire).

The generate_golden_vectors binary now emits via vox_phon::to_vec; the 94
fixtures under test-fixtures/golden-vectors are the phon wire (not postcard).
Consumed only by the TS/Swift cross-language conformance tests (task #7 TS
migration); no Rust reader.

Foundation for emitting phon schema bytes + registry from vox-codegen (delegating
schema collection to phon-codegen's Module). Deps build clean; not yet wired into
the TS emission (that lands with the vox-wire/vox-core TS rewrite).
…rkspace

Add ../phon/typescript/packages/* to the pnpm workspace (mirrors the Rust path
deps on ../phon) and depend on @bearcove/phon-schema + @bearcove/phon-engine from
vox-wire and vox-core. Verified phon-ts resolves and type-checks from vox-wire.
vox-postcard kept alongside for now; retired once the migration lands.
…ility)

The application handshake is versioned, but the byte-framing layer underneath had no
magic/version — so growing the unix header for fd-passing (4->8 bytes) failed silently
as 'link closed during transport prologue' instead of a clear error.

Every byte-stream link now opens with a 6-byte prologue: [magic 'VOXL'][u8 version][u8
flags], flags bit0 = fd-capable. Reader validates magic+version and that the peer's
fd-capability matches its own link type, failing loudly on mismatch. StreamLink sends
fd_capable=0 (TCP/stdio/plain), FdStreamLink sends fd_capable=1 (unix + SCM_RIGHTS), so
the header-size difference is negotiated/validated rather than silently assumed.
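
A minimal sketch of the 6-byte prologue and its validation. The layout (magic, version, flags with bit0 = fd-capable) is from the description above; the version number 1, the error type, and the function names are assumptions for illustration, not vox-stream's API.

```rust
pub const MAGIC: [u8; 4] = *b"VOXL";
/// Assumed value; the commit does not state the current version number.
pub const VERSION: u8 = 1;
pub const FLAG_FD_CAPABLE: u8 = 0b0000_0001;

#[derive(Debug, PartialEq)]
pub enum PrologueError {
    BadMagic([u8; 4]),
    UnsupportedVersion(u8),
    FdCapabilityMismatch { local: bool, peer: bool },
}

/// Builds [magic 'VOXL'][u8 version][u8 flags].
pub fn encode_prologue(fd_capable: bool) -> [u8; 6] {
    let flags = if fd_capable { FLAG_FD_CAPABLE } else { 0 };
    [MAGIC[0], MAGIC[1], MAGIC[2], MAGIC[3], VERSION, flags]
}

/// Validates the peer's prologue against the local link type.
pub fn validate_prologue(bytes: &[u8; 6], local_fd_capable: bool) -> Result<(), PrologueError> {
    let magic = [bytes[0], bytes[1], bytes[2], bytes[3]];
    if magic != MAGIC {
        return Err(PrologueError::BadMagic(magic));
    }
    if bytes[4] != VERSION {
        return Err(PrologueError::UnsupportedVersion(bytes[4]));
    }
    let peer_fd_capable = (bytes[5] & FLAG_FD_CAPABLE) != 0;
    if peer_fd_capable != local_fd_capable {
        return Err(PrologueError::FdCapabilityMismatch {
            local: local_fd_capable,
            peer: peer_fd_capable,
        });
    }
    Ok(())
}
```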

Breaks wire compat (intentional). Tests updated to write the prologue; 12/12 pass.
…rt header

Aligns the Swift unix transport with Rust's FdStreamLink, which Swift never tracked
when fd-passing was added (Swift wrote a 4-byte header; Rust read 8 -> silent
'link closed during transport prologue').

- 8-byte fd-capable header [u32 len][u32 fd_count] on unix; 4-byte [u32 len] on TCP,
  selected per transport (UnixListener/UnixConnector fdFramed=true; Tcp* false).
- Versioned link prologue [magic 'VOXL'][u8 version][u8 flags] written once per
  connection and validated on read (magic, version, fd-capability), matching vox-stream.
  A future framing change now fails loudly instead of silently.
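
The two header shapes in the first bullet, sketched in Rust to match the rest of the branch. Little-endian integers are an assumption (the commit does not state the byte order), and the function names are hypothetical.

```rust
/// Header size in bytes: 8 with fd framing (unix), 4 without (TCP).
pub fn header_len(fd_framed: bool) -> usize {
    if fd_framed { 8 } else { 4 }
}

/// Writes [u32 len] or [u32 len][u32 fd_count], depending on the transport.
pub fn encode_header(fd_framed: bool, len: u32, fd_count: u32) -> Vec<u8> {
    let mut out = Vec::with_capacity(header_len(fd_framed));
    out.extend_from_slice(&len.to_le_bytes());
    if fd_framed {
        out.extend_from_slice(&fd_count.to_le_bytes());
    }
    out
}

fn u32_at(bytes: &[u8], at: usize) -> u32 {
    u32::from_le_bytes([bytes[at], bytes[at + 1], bytes[at + 2], bytes[at + 3]])
}

/// Returns (len, fd_count), or None if fewer bytes than the header needs.
pub fn decode_header(fd_framed: bool, bytes: &[u8]) -> Option<(u32, u32)> {
    if bytes.len() < header_len(fd_framed) {
        return None;
    }
    let len = u32_at(bytes, 0);
    let fd_count = if fd_framed { u32_at(bytes, 4) } else { 0 };
    Some((len, fd_count))
}
```

Feeding a 4-byte header to a reader that expects 8 yields `None` here; on a real stream the reader would instead block or misread the payload, which is the mismatch the prologue now catches up front.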

fd_count is written 0 / parsed-but-ignored for now; full SCM_RIGHTS fd-passing on the
Swift side is the next step (needs raw sendmsg/recvmsg under NIO).