Skip to content

Add CBOR serialization for complex JavaScript types in IPC#86

Merged
lambdalisue merged 5 commits intomainfrom
fix/bigint
Jan 5, 2026
Merged

Add CBOR serialization for complex JavaScript types in IPC#86
lambdalisue merged 5 commits intomainfrom
fix/bigint

Conversation

@lambdalisue
Copy link
Member

@lambdalisue lambdalisue commented Jan 5, 2026

Summary

  • Add custom CBOR serializer supporting BigInt, Date, Symbol, Function, Map, Set, RegExp, Error, TypedArray, and circular references
  • Extract common subprocess communication patterns into reusable factories
  • Add comprehensive test coverage for serialization edge cases

Why

The previous NDJSON-based IPC serialization couldn't handle complex JavaScript types like BigInt, Symbol, Date, TypedArray, and circular references. This limitation caused Reporter output to lose type information when displaying step results containing these types.

CBOR (RFC 8742) was chosen because:

  1. Native support for JavaScript types: BigInt, Date, TypedArray, ArrayBuffer, undefined
  2. Self-delimiting format: No length prefixes needed for streaming
  3. Efficient binary encoding: Smaller payload than JSON
  4. Extensible via tagged values: Custom serializers for Function, Symbol, Map, Set, RegExp, Error, and circular references

The refactoring reduces ~190 lines of duplicated subprocess bootstrap and cleanup logic by extracting common patterns into:

  • runSubprocess() factory for subprocess-side lifecycle management
  • startSubprocess()/runSubprocessToCompletion() for parent-side orchestration
  • createOutputValidator() for type-safe protocol validation

Test Plan

  • All existing tests pass
  • Added 10 serialization edge case scenarios (probitas/80-89-*.probitas.ts)
  • Added unit tests for serializer (serializer_test.ts)
  • Added unit tests for subprocess utilities (utils_test.ts)
  • Verified BigInt, Symbol, Map, Set, RegExp, Error, Date, TypedArray, Function, circular references, and undefined round-trip correctly

Copilot AI review requested due to automatic review settings January 5, 2026 09:14
JSON.stringify cannot handle various JavaScript types (BigInt, Symbol,
Function, circular references, etc.), which could cause failures in
IPC-based step context passing. These scenarios verify the framework
handles these edge cases correctly.

Covers: BigInt, Function, Symbol, undefined, circular references,
Map, Set, RegExp, Date, Error, TypedArray, ArrayBuffer, DataView.

Addresses #80
Standard JSON.stringify fails on BigInt, circular references, Symbol,
Function, Map, Set, RegExp, TypedArray, and other JavaScript types.
This causes step context values to be lost or corrupted when passed
between processes via IPC.

The custom serializer uses a marker object pattern ($type property)
to preserve type information across the IPC boundary, enabling steps
to receive BigInt, Map, circular references, and other complex values
through ctx.previous and ctx.results.

Handles: BigInt, Symbol, Function, undefined, circular references,
Map, Set, WeakMap, WeakSet, RegExp, Date, Error, TypedArray,
ArrayBuffer, DataView, WeakRef.

Addresses #80
Reduces code duplication by extracting common IPC patterns into
reusable factories and helpers across parent and subprocess sides.

## Subprocess side (_templates/utils.ts)
- Add runSubprocess() factory for standardized bootstrap lifecycle
- Add createOutputValidator() for type-safe protocol validation
- Add serializeError/deserializeError for unified error handling
- Extract ~90 lines of duplicated bootstrap code from list.ts and run.ts

## Parent side (subprocess.ts)
- Add startSubprocess() for connection setup and stream creation
- Add runSubprocessToCompletion() for simple request-response pattern
- Add cleanupSubprocess() for proper resource cleanup ordering
- Consolidate ~100 lines of duplicated spawn/connect/cleanup logic

## Protocol refactoring
- Migrate list_protocol.ts and run_protocol.ts to use shared validators
- Re-export error utilities from utils.ts for consistency
- Document architectural asymmetry between parent and subprocess

## Testing
- Add utils_test.ts covering all new utility functions
- Verify error serialization, port parsing, and validator logic

Total: ~190 lines of duplication eliminated, improved maintainability.
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR replaces NDJSON-based IPC serialization with CBOR serialization to support complex JavaScript types (BigInt, Symbol, Date, Function, Map, Set, RegExp, Error, TypedArray, and circular references) for inter-step communication in scenarios. It also refactors subprocess communication patterns into reusable utilities to reduce code duplication.

Key changes:

  • Implements custom CBOR serializer with tagged values for JavaScript types not natively supported by CBOR
  • Extracts common subprocess bootstrap logic into runSubprocess() factory and related utilities
  • Adds comprehensive unit tests for serializer and subprocess utilities, plus 10 end-to-end scenario tests

Reviewed changes

Copilot reviewed 22 out of 23 changed files in this pull request and generated 14 comments.

Show a summary per file
File Description
src/cli/_templates/serializer.ts New CBOR serializer with custom tagging for complex JavaScript types and circular reference support
src/cli/_templates/serializer_test.ts New comprehensive unit tests for CBOR serializer covering all supported types
src/cli/_templates/utils.ts Refactored subprocess-side utilities with runSubprocess() factory and CBOR stream handling
src/cli/_templates/utils_test.ts New unit tests for subprocess utilities including error serialization and output validation
src/cli/subprocess.ts Updated parent-side IPC with CBOR streams, new helper functions (encodeToCbor, startSubprocess, runSubprocessToCompletion, cleanupSubprocess)
src/cli/commands/run.ts Refactored to use new startSubprocess() and cleanupSubprocess() utilities, reducing boilerplate
src/cli/commands/list.ts Refactored to use new runSubprocessToCompletion() utility, significantly simplified code
src/cli/_templates/run_protocol.ts Updated to use shared error serialization from utils and createOutputValidator() factory
src/cli/_templates/list_protocol.ts Updated to use createOutputValidator() factory for type guards
src/cli/_templates/run.ts Refactored to use runSubprocess() factory, removing manual bootstrap code
src/cli/_templates/list.ts Refactored to use runSubprocess() factory, removing manual bootstrap code
probitas/80-89-*.probitas.ts Ten new test scenarios verifying serialization of BigInt, Function, Symbol, circular references, undefined, Map/Set, RegExp, Error, Date, and TypedArray
deno.json Added @std/cbor dependency
deno.lock Updated lock file with @std/cbor@0.1.9 dependency

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

… issues

Fixes four critical issues in CBOR serialization:

1. Error custom properties now recursively convert nested complex types
   (Map, Set, etc.) instead of treating them as plain objects
2. Error deserialization registers errors in refs map before restoring
   custom properties to support circular references
3. Function deserialization replaces unsafe `new Function()` with
   `Object.defineProperty` to prevent code injection
4. Invalid output error handling uses try-catch around JSON.stringify
   to handle non-serializable values

These fixes address security vulnerabilities and ensure correct
round-trip serialization of complex JavaScript types in IPC.
…alization

Updates comments in serialization test scenarios to reflect CBOR usage
instead of JSON.stringify. Previous comments incorrectly described
JSON limitations, but subprocess IPC uses CBOR which natively supports
complex JavaScript types (BigInt, Date, undefined, Map, Set, RegExp,
Error, TypedArrays, circular references, Symbols, and Functions).

These scenarios verify that Reporter correctly displays these types in
test output after they are transmitted via CBOR IPC.
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 22 out of 23 changed files in this pull request and generated 5 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +314 to +317
case Tag.RegExp: {
const [source, flags] = content as [string, string];
return new RegExp(source, flags);
}
Copy link

Copilot AI Jan 5, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The RegExp deserialization does not validate the pattern for potentially malicious regular expressions that could cause ReDoS (Regular Expression Denial of Service) attacks. An attacker could craft a RegExp with catastrophic backtracking patterns and send it through IPC.

Consider adding validation or a timeout mechanism when deserializing RegExp patterns, or document this as a known limitation if the IPC channel is trusted.

Copilot uses AI. Check for mistakes.
Comment on lines +14 to +18
.step("This step should also run", (ctx) => {
// This step accesses the BigInt values
return {
received: typeof ctx.previous.value === "bigint",
};
Copy link

Copilot AI Jan 5, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The test only checks the type of ctx.previous.value but doesn't verify that the actual BigInt values were preserved correctly. The test should validate that ctx.previous.value equals the original value and that ctx.previous.timestamp is also a bigint. Additionally, the test comment on line 8 claims "CBOR natively supports BigInt" but according to the serializer tests (lines 71-86 in serializer_test.ts), small BigInts are converted to numbers by CBOR, which may not be the expected behavior for this scenario.

Copilot uses AI. Check for mistakes.
return error;
}
case Tag.Circular: {
const refPath = content as string;
Copy link

Copilot AI Jan 5, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The circular reference resolution returns the value from the refs map, but if the reference path doesn't exist in the map, this will return undefined without any error. This could happen if a Circular tag references an invalid path or if the referenced object hasn't been processed yet. Consider adding validation to throw an error if the reference path is not found in the map.

Suggested change
const refPath = content as string;
const refPath = content as string;
if (!refs.has(refPath)) {
throw new Error(
`Invalid circular reference: no value registered for path "${refPath}"`,
);
}

Copilot uses AI. Check for mistakes.
Comment on lines +12 to +13
iterator: Symbol.iterator,
};
Copy link

Copilot AI Jan 5, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The test includes Symbol.iterator which is a well-known symbol from the global registry. However, looking at the serializer implementation (lines 159-165 in serializer.ts), well-known symbols like Symbol.iterator will be serialized using Symbol.keyFor(). Since Symbol.keyFor(Symbol.iterator) returns undefined (well-known symbols are not registered in the global symbol registry), they will be serialized as local symbols with their description. This means Symbol.iterator will not round-trip correctly - it will become a new local symbol with the description "Symbol.iterator" instead of the original well-known symbol. The test should verify that the deserialized symbol is actually Symbol.iterator, not just any symbol.

Copilot uses AI. Check for mistakes.
case Tag.WeakSet:
return new WeakSet();
case Tag.WeakRef:
return { [Symbol.toStringTag]: "WeakRef (unrestorable)" };
Copy link

Copilot AI Jan 5, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

WeakRef is deserialized as a plain object with a Symbol.toStringTag property rather than an actual WeakRef instance. This is inconsistent with how WeakMap and WeakSet are handled (which are restored as empty instances). While WeakRef cannot truly be restored without the original object, returning a plain object that masquerades as a WeakRef could cause issues in code that checks instanceof WeakRef. Consider either throwing an error, returning null, or at minimum documenting this behavior more clearly in the module documentation.

Suggested change
return { [Symbol.toStringTag]: "WeakRef (unrestorable)" };
return { type: "WeakRef", unrestorable: true };

Copilot uses AI. Check for mistakes.
@lambdalisue lambdalisue merged commit 2d47deb into main Jan 5, 2026
10 checks passed
@lambdalisue lambdalisue deleted the fix/bigint branch January 5, 2026 09:51
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants