Unify CBOR std::byte and uint8_t byte-string handling #2652

Merged: stephenberry merged 1 commit into main from fix/cbor-unify-byte-ranges on Jun 19, 2026

@stephenberry

Owner

Follow-up to #2650. That PR fixed the handling of std::byte and of fixed-size std::array<char, N> across the binary codecs, but left CBOR's std::byte and uint8_t byte handling implemented in two different ways:

  • std::byte ranges were handled by a single concept-based specialization (any contiguous std::byte range → byte string).
  • uint8_t ranges were handled only by enumerated specializations for std::vector<uint8_t> and std::array<uint8_t, N>.

So any other contiguous uint8_t range (e.g. std::span<uint8_t>) silently fell through to the generic array writer and was encoded as an RFC 8746 typed array instead of a byte string, while the equivalent std::span<std::byte> was a byte string. MsgPack already treats these uniformly via byte_like; CBOR was the outlier.

Change

Collapse the duplicated specializations into one read/write pair each, constrained on the existing glz::contiguous_byte_range concept (byte_like = std::byte, unsigned char, uint8_t), and widen the generic array specializations' exclusion to match:

  • write.hpp: 4 byte-string writers (std::byte range + std::vector<uint8_t> + std::array<std::byte, N> + std::array<uint8_t, N>) → 1.
  • read.hpp: 3 byte-string readers (std::byte range + std::vector<uint8_t> + std::array<uint8_t, N>) → 1. The existing std::byte reader already handled both resizable and fixed-size targets with the max_array_size safety checks, so the two uint8_t readers were redundant subsets of it.

Net −293 / +72 lines.

Behavior

Every contiguous byte-like range now encodes as a CBOR byte string (major type 2) and the variants are cross-readable on the wire, regardless of element type or container. std::vector<uint8_t> / std::array<uint8_t, N> round-trips and wire bytes are unchanged.

One intentional wire-format change: previously-uncovered uint8_t containers such as std::span<uint8_t> now encode as a byte string instead of an RFC 8746 typed array, making them consistent with std::span<std::byte>. (Numeric typed arrays for other element types, e.g. int32_t/float, are unaffected.)

Tests

All existing CBOR tests pass (239), plus 4 new regression tests: identical byte/uint8 encodings, std::span<uint8_t> as a byte string, and byte ↔ uint8 cross-readability for vectors and fixed arrays. Verified locally under clang -std=c++23, plain and ASan+UBSan (243 tests / 961 asserts, 0 failures).

Follow-up to #2650. CBOR handled std::byte ranges with one concept-based
specialization but uint8_t ranges only via enumerated std::vector<uint8_t> /
std::array<uint8_t, N> specializations, so other contiguous uint8_t ranges
(e.g. std::span<uint8_t>) silently fell through to RFC 8746 typed-array
encoding instead of a byte string.

Collapse the four byte-string writers and three byte-string readers into one
read/write pair each, constrained on glz::contiguous_byte_range (byte_like:
std::byte, unsigned char, uint8_t), and widen the generic array specs'
exclusion to match. Every contiguous byte-like range now encodes as a CBOR
byte string (major type 2) and the variants are cross-readable on the wire,
regardless of element type or container. Net -293/+72 lines.

Adds regression tests for identical byte/uint8 encodings, span<uint8_t> as a
byte string, and byte<->uint8 cross-readability for vectors and fixed arrays.
@stephenberry stephenberry merged commit dbe8a0f into main Jun 19, 2026
54 checks passed
@stephenberry stephenberry deleted the fix/cbor-unify-byte-ranges branch June 19, 2026 02:23