Fix std::byte and fixed std::array<char,N> handling in CBOR/MsgPack/BSON#2650

Merged: stephenberry merged 1 commit into main from fix/binary-byte-char-arrays on Jun 19, 2026

Conversation

@stephenberry

Owner

Follow-up to #2649 (the BEVE fix for #2647), addressing the same family of issues in the other binary codecs. Found by reproducing each case against CBOR, MsgPack, and BSON.

Summary

| Codec | std::byte ranges | std::array<char,N> read |
|---|---|---|
| CBOR | ❌ ambiguous (vector/span) / unreadable (array) → fixed | ❌ compile error → fixed |
| MsgPack | ✅ already compact (bin) | ❌ undefined read → fixed |
| BSON | ✅ already compact (binary) | ❌ no viable operator= → fixed |

CBOR

  • std::vector<std::byte> / std::span<std::byte> were ambiguous (the byte-string specialization tied with the generic array specialization) and did not compile on write or read. The byte-string specs now require a contiguous std::byte range, and the generic array specs exclude contiguous std::byte ranges, so exactly one matches.
  • std::array<std::byte,N> wrote a valid byte string but couldn't read it back. The byte-string reader is generalized to store into a fixed-size buffer (bounds-checked, remainder zero-filled), for both definite- and indefinite-length byte strings.
  • std::array<uint8_t,N> had the same write/read asymmetry (a pre-existing gap surfaced during review): it wrote a byte string with no matching reader. Added a dedicated reader mirroring the existing std::vector<uint8_t> one. std::array<uint8_t,N> and std::vector<uint8_t> are now cross-readable.
  • std::array<char,N> failed to compile on read (the text reader called resize/append/assign). The text reader now handles fixed char arrays.

MsgPack

  • std::byte ranges were already encoded as the compact bin format. Only std::array<char,N> read was missing (write produced a string; read was an undefined template). Added a from<MSGPACK, array_char_t> reader.

BSON

  • std::byte ranges were already compact binary. A std::array<char,N> string member failed to compile on read (no viable operator=); the string reader now handles fixed char arrays.

Semantics

Consistent with #2649: reading a payload longer than a fixed array yields error_code::syntax_error; a shorter payload is copied and the remainder zero-filled. Wire-driven paths and non-contiguous byte ranges (e.g. std::deque<std::byte> → CBOR array of numbers) are unchanged.

Tests

11 regression tests added (CBOR 6, MsgPack 3, BSON 3 — incl. the uint8_t array case): compact/byte-string encoding, cross-readability, fixed std::array<std::byte,N>/std::array<char,N> round trips, short-payload zero-fill, oversize-payload rejection, and struct members. All existing suites pass: CBOR 235, MsgPack 54, BSON 120.

Verification

Reviewed by an adversarial multi-agent pass (overload-resolution, memory-safety/bounds, round-trip semantics). No new defects; round-trips verified exact under ASan+UBSan. The std::array<uint8_t,N> reader was added in response to that review.

Follow-up to the BEVE fix for #2647, addressing the same family of issues
in the other binary codecs.

CBOR:
- std::vector<std::byte> / std::span<std::byte> were ambiguous (the
  byte-string specialization tied with the generic array specialization)
  and would not compile on either write or read. The byte-string specs now
  require a contiguous std::byte range and the generic array specs exclude
  contiguous std::byte ranges, so exactly one matches.
- std::array<std::byte, N> wrote a valid byte string but failed to read it
  back. The byte-string reader is generalized to store into a fixed-size
  buffer (bounds-checked, remainder zero-filled) for both definite- and
  indefinite-length byte strings.
- std::array<uint8_t, N> had the same write/read asymmetry (a pre-existing
  gap): it wrote a byte string with no matching reader. Added a dedicated
  reader mirroring the std::vector<uint8_t> one.
- std::array<char, N> failed to compile on read (the text reader called
  resize/append/assign). The text reader now handles fixed char arrays.

MsgPack:
- std::array<char, N> had no read specialization (write produced a string
  but read was undefined). Added a from<MSGPACK, array_char_t> reader.

BSON:
- A std::array<char, N> string member failed to compile on read
  ("no viable operator="). The string reader now handles fixed char arrays.

std::byte ranges in MsgPack (bin) and BSON (binary) were already compact
and correct, so only the fixed char array reads were needed there. Reading
a payload longer than a fixed array yields error_code::syntax_error; a
shorter payload is copied and the remainder zero-filled, matching the BEVE
behavior.
@stephenberry stephenberry merged commit e0ca96f into main Jun 19, 2026
58 of 61 checks passed
@stephenberry stephenberry deleted the fix/binary-byte-char-arrays branch June 19, 2026 01:38
stephenberry added a commit that referenced this pull request Jun 19, 2026
Follow-up to #2650. CBOR handled std::byte ranges with one concept-based
specialization but uint8_t ranges only via enumerated std::vector<uint8_t> /
std::array<uint8_t, N> specializations, so other contiguous uint8_t ranges
(e.g. std::span<uint8_t>) silently fell through to RFC 8746 typed-array
encoding instead of a byte string.

Collapse the four byte-string writers and three byte-string readers into one
read/write pair each, constrained on glz::contiguous_byte_range (byte_like:
std::byte, unsigned char, uint8_t), and widen the generic array specs'
exclusion to match. Every contiguous byte-like range now encodes as a CBOR
byte string (major type 2) and the variants are cross-readable on the wire,
regardless of element type or container. Net -293/+72 lines.

Adds regression tests for identical byte/uint8 encodings, span<uint8_t> as a
byte string, and byte<->uint8 cross-readability for vectors and fixed arrays.

Development

Successfully merging this pull request may close these issues.

Inconsistency in BEVE array handling for some primitives (char, std::byte)