Encode std::byte ranges compactly and read fixed char arrays in BEVE #2649

Merged: stephenberry merged 1 commit into main from fix/beve-byte-char-array on Jun 19, 2026

@stephenberry (Owner)

Fixes #2647.

Problem

The BEVE codec mishandled arrays of two primitive element types:

  1. std::array<char, N> failed to compile on read. It is classified as str_t (via array_char_t), and from<BEVE, str_t> unconditionally called value.resize(n), which std::array does not have.
  2. std::vector<std::byte> / std::array<std::byte, N> used ~2× the space. std::byte is a scoped enum (not num_t), so it fell through to the generic_array branch — a one-byte type header per element. This was inconsistent with std::vector<uint8_t> (a compact u8 typed array) and with a scalar std::byte, which was already encoded as a u8 number.
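
The ~2× figure follows from simple arithmetic. Below is a minimal sketch of the size model, assuming a one-byte array header and a one-byte length prefix (an assumption that only holds for short ranges). The function names are hypothetical and are not part of the library:

```cpp
#include <cstddef>

// Simplified size model for a range of n one-byte elements.
// Assumption: one-byte array header plus one-byte length prefix (short ranges only).

// Generic array: every element carries its own one-byte type header.
constexpr std::size_t generic_array_size(std::size_t n)
{
   return 1 /* array header */ + 1 /* length */ + n * (1 /* element header */ + 1 /* payload */);
}

// Typed u8 array: the element type is stated once, in the array header.
constexpr std::size_t typed_u8_array_size(std::size_t n)
{
   return 1 /* array header */ + 1 /* length */ + n /* payload */;
}

static_assert(generic_array_size(16) == 34);
static_assert(typed_u8_array_size(16) == 18);
```

Under these assumptions the model reproduces the 34 → 18 byte figure quoted for a 16-element vector.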

Fix

  • New beve_num_t = num_t<T> || std::same_as<T, std::byte> concept opts std::byte into the numeric typed-array dispatch in write / read / size. The existing numeric formulas already yield byte_count == 0 and an unsigned u8 header for std::byte, so the wire output is byte-for-byte identical to uint8_t (e.g. a 16-element vector goes 34 → 18 bytes). std::byte and uint8_t ranges are now cross-readable in both directions.
  • The generic-enum reader gains a tagged overload that delegates to its underlying integer reader, so the typed-array conversion paths work for std::byte.
  • from<BEVE, str_t> now stores into resizable strings, string views, and fixed std::array<char, N> (bounds-checked, remainder zero-filled) through a shared helper used by both the tagged and untagged read paths.

Wire-driven paths (skip, beve_to_json, ptr) need no change — they handle the resulting u8 typed array purely from the tag.

Wire-format note

This intentionally changes the BEVE encoding of std::byte arrays. Data written by older versions using the generic-array form is rejected cleanly with error_code::syntax_error (no UB). This aligns std::byte arrays with both scalar std::byte and std::vector<uint8_t>.
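
A rough sketch of what clean rejection can look like in a tag-checking reader. Everything here is hypothetical: the tag values are placeholders chosen for the example (not taken from the BEVE specification), the length is assumed to fit in one byte, and `error_code` is a local stand-in for the library's enum.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Local stand-in for the library's error enum.
enum class error_code { none, syntax_error };

// Placeholder tag values for this sketch only.
inline constexpr std::uint8_t tag_typed_u8_array = 0x01;
inline constexpr std::uint8_t tag_generic_array = 0x02;

// Reads [tag][one-byte length][payload] into out.
// Any unexpected tag or truncated input is reported as an error;
// out is left untouched on failure.
error_code read_u8_array(const std::vector<std::uint8_t>& in, std::vector<std::byte>& out)
{
   if (in.size() < 2) return error_code::syntax_error;
   // Data written in the older generic-array form fails this check.
   if (in[0] != tag_typed_u8_array) return error_code::syntax_error;
   const std::size_t n = in[1];
   if (in.size() - 2 < n) return error_code::syntax_error; // truncated payload
   out.resize(n);
   for (std::size_t i = 0; i < n; ++i) {
      out[i] = std::byte{in[2 + i]};
   }
   return error_code::none;
}
```

Because the tag and length are validated before any element is written, a mismatched encoding yields an error value and no out-of-bounds access.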

Tests

11 regression tests added to beve_test.cpp covering: compact encoding + wire-equality with uint8_t, cross-readability, std::array<std::byte, N>, empty and non-contiguous (std::deque) byte ranges, std::byte struct members, beve_to_json rendering, full and short (zero-filled) std::array<char, N> round trips, oversized-payload rejection, and untagged mode. Full beve_test suite passes (381 tests), along with lazy_beve_test and skip_null_on_read_test.

Fixes #2647.

BEVE mishandled arrays of two primitive element types:

- std::array<char, N> failed to compile on read: from<BEVE, str_t>
  unconditionally called value.resize(n), which std::array lacks.

- std::vector<std::byte> / std::array<std::byte, N> serialized as a
  generic_array (a one-byte type header per element, ~2x the size),
  inconsistent with std::vector<uint8_t> (a compact u8 typed array) and
  with a scalar std::byte (already a u8 number).

std::byte is now encoded as a u8 typed array, byte-for-byte identical to
uint8_t (the existing numeric formulas yield byte_count 0 and an unsigned
u8 header for it). A new beve_num_t concept opts std::byte into the
numeric typed-array paths in write/read/size. The generic-enum reader
gains a tagged overload (delegating to its underlying integer reader) so
the typed-array conversion paths work for std::byte. from<BEVE, str_t>
now stores into resizable strings, string views, and fixed
std::array<char, N> (bounds-checked, remainder zero-filled) via a shared
helper used by both the tagged and untagged read paths.

Note: this changes the BEVE wire format for std::byte arrays; data
written by older versions using the generic-array form is rejected
cleanly with a syntax_error.
@stephenberry stephenberry merged commit 3600194 into main Jun 19, 2026
53 checks passed
@stephenberry stephenberry deleted the fix/beve-byte-char-array branch June 19, 2026 01:54

Linked issue: Inconsistency in BEVE array handling for some primitives (char, std::byte)