| 1 | +# BEVE Proposal: Aligned Typed Arrays for Zero-Copy Access |
| 2 | + |
| 3 | +**Status:** Working Draft |
| 4 | + |
| 5 | +## Motivation |
| 6 | + |
| 7 | +BEVE's typed arrays store contiguous numerical data (floats, integers, etc.) in a compact layout that is already close to the native in-memory representation. However, the current specification does not guarantee that the data payload of a typed array begins at a memory address that satisfies the alignment requirement of the element type. Without alignment, a decoder must copy the data into a suitably aligned buffer before it can be reinterpreted as a native span of `float`, `double`, `int32_t`, etc. |
| 8 | + |
| 9 | +Depending on the architecture, unaligned access can carry a performance penalty or fault outright, and in C and C++ dereferencing a misaligned pointer is undefined behavior. By introducing optional alignment padding, a decoder that holds the entire BEVE message in a contiguous, aligned buffer can hand back a `std::span<T>` (or equivalent) that points directly into the message buffer: **zero copies, zero allocations**. |
| 10 | + |
| 11 | +### Design Goals |
| 12 | + |
| 13 | +1. **Zero-copy typed arrays** — typed array data can be reinterpreted in-place as `span<T>` where `T` is the element type. |
| 14 | +2. **Deterministic padding** — the padding length is computable from the element type and the current byte offset; it does not need to be stored explicitly. |
| 15 | +3. **Contiguous memory requirement** — the entire BEVE message from its start up to and including any aligned typed array must reside in a single contiguous buffer. |
| 16 | +4. **Composability** — any extension that embeds a typed array (matrices, complex numbers, timestamps) gains zero-copy support automatically. |
| 17 | +5. **Backward compatibility** — standard BEVE decoders that predate this proposal will encounter a clean failure (unknown sub-type), not silent misinterpretation. |
| 18 | + |
| 19 | +## Byte Offset Origin |
| 20 | + |
| 21 | +**Byte 0 is the first byte of the message buffer.** All offset calculations for alignment padding are relative to this origin. If a framing header (Extension 5) is present, it occupies bytes 0–1 and the root value begins at byte offset 2. If no framing header is present, the root value begins at byte offset 0. |
| 22 | + |
| 23 | +Both encoder and decoder inherently track their position from the start of the message buffer, so they always agree on byte offsets regardless of whether a framing header is present. |
| 24 | + |
| 25 | +## Buffer Alignment Requirement |
| 26 | + |
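| 27 | +For zero-copy access, the memory buffer that holds the BEVE message **must** be aligned to at least the maximum alignment of any typed array element in the message. In practice, general-purpose allocators on mainstream 64-bit platforms typically return memory aligned to at least 16 bytes, which covers all standard types up to `int128_t` / `float128_t`. Buffers obtained another way (for example, a sub-range of a larger receive buffer) carry no such guarantee and should be checked. |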
| 28 | + |
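| 29 | +If the buffer address is aligned to `A`, where `A` is a multiple of `alignof(T)`, and the data payload of a typed array begins at byte offset `O` where `O % alignof(T) == 0`, then the absolute address of the payload, `buffer + O`, is also a multiple of `alignof(T)` and is therefore correctly aligned for `T`. |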
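
Whether a particular buffer meets this requirement can be checked at run time before a zero-copy view is handed out. The following is a minimal C++ sketch of such a check; `is_aligned` and `payload_is_aligned` are hypothetical helper names, not part of BEVE, and the code assumes `alignment` is a nonzero power of two.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: true when address `p` is a multiple of `alignment`.
// Assumes `alignment` is a nonzero power of two, as alignof(T) always is.
inline bool is_aligned(const void* p, std::size_t alignment) {
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return reinterpret_cast<std::uintptr_t>(p) % alignment == 0;
}

// True when a payload at byte offset `offset` inside `buffer` can be viewed
// in place as elements with the given alignment.
inline bool payload_is_aligned(const unsigned char* buffer, std::size_t offset,
                               std::size_t alignment) {
    return is_aligned(buffer + offset, alignment);
}
```

A decoder could fall back to copying when this check fails, for example when the message sits at an odd offset inside a larger receive buffer.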
| 30 | + |
| 31 | +## Aligned Typed Arrays — Built Into the Typed Array Tag |
| 32 | + |
| 33 | +Rather than consuming an extension ID, aligned typed arrays are encoded as a new sub-type within the existing typed array category 3 (boolean/string). This approach means that any BEVE extension that embeds a typed array — matrices, complex numbers, timestamps — gains zero-copy alignment support automatically, with no changes to those extensions. |
| 34 | + |
| 35 | +### Background: Typed Array Category 3 |
| 36 | + |
| 37 | +In the current specification, typed array category 3 (bits 3–4 = `11`) uses bit 5 to distinguish between two sub-types: |
| 38 | + |
| 39 | +``` |
| 40 | +0 -> boolean 0b00'0'11'100 |
| 41 | +1 -> string 0b00'1'11'100 |
| 42 | +``` |
| 43 | + |
| 44 | +In the current specification, bits 6–7 are unused and must be zero. This proposal treats bits 5–7 together as a sub-type field and defines a third sub-type. |
| 45 | + |
| 46 | +### Sub-Type 2: Aligned Numeric Array |
| 47 | + |
| 48 | +When bits 5–7 of a typed array header encode the value `2` (bit 6 set, bits 5 and 7 clear), the typed array is an **aligned numeric array**: |
| 49 | + |
| 50 | +``` |
| 51 | +2 -> aligned 0b010'11'100 → 0x5C |
| 52 | +``` |
| 53 | + |
| 54 | +The next byte is a **numeric typed array header** — identical to a standard BEVE typed array header for a numeric type. This second header byte encodes the element category (floating point, signed integer, or unsigned integer) and the byte count, using the same bit layout as a normal typed array header byte. The decoder already knows how to parse this; it simply reads it from the second byte instead of the first. |
| 55 | + |
| 56 | +### Layout |
| 57 | + |
| 58 | +``` |
| 59 | +TYPED_ARRAY_HEADER(aligned) | NUMERIC_HEADER | SIZE | PADDING | DATA |
| 60 | +``` |
| 61 | + |
| 62 | +Where: |
| 63 | + |
| 64 | +- `TYPED_ARRAY_HEADER(aligned)` — 1 byte (`0x5C`), a typed array header with category 3, sub-type 2, indicating an aligned numeric array. |
| 65 | +- `NUMERIC_HEADER` — 1 byte, a standard typed array header encoding the element category (bits 3–4: 0=float, 1=signed, 2=unsigned) and byte count (bits 5–7). Bits 0–2 **must** be `0b100` (the typed array type tag); decoders **must** reject the message if they are not. This is the same byte you would write for a non-aligned typed array of the same element type. |
| 66 | +- `SIZE` — a compressed unsigned integer giving the number of elements (same semantics as standard typed arrays). |
| 67 | +- `PADDING` — 0 to `(alignment - 1)` bytes, inserted so that the first byte of `DATA` falls at a byte offset from the message origin that is a multiple of the element alignment. The contents of padding bytes are unspecified; decoders **must** ignore them. |
| 68 | +- `DATA` — the raw element data, identical to a standard typed array payload. |
| 69 | + |
| 70 | +### Alignment Calculation |
| 71 | + |
| 72 | +Given: |
| 73 | + |
| 74 | +- `offset_after_size` — the byte offset (from byte 0 of the message buffer) of the first byte after the `SIZE` field. |
| 75 | +- `alignment` — the natural alignment of the element type in bytes (equal to the element size for all standard numeric types). |
| 76 | + |
| 77 | +The number of padding bytes is: |
| 78 | + |
| 79 | +``` |
| 80 | +padding = (alignment - (offset_after_size % alignment)) % alignment |
| 81 | +``` |
| 82 | + |
| 83 | +This value is deterministic. The encoder inserts exactly this many bytes; the decoder computes the same value and skips them. |
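
The formula above can be written as a small C++ helper. This is only a sketch: `aligned_padding` is a hypothetical name, and the checked value (offset 3, alignment 8) is the case that appears in the worked example later in this document.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: number of padding bytes needed so that DATA starts at
// a multiple of `alignment`. `offset_after_size` is measured from byte 0 of
// the message buffer, and `alignment` must be nonzero.
constexpr std::size_t aligned_padding(std::size_t offset_after_size,
                                      std::size_t alignment) {
    return (alignment - (offset_after_size % alignment)) % alignment;
}

// float64 array whose SIZE field ends at offset 3: five padding bytes.
static_assert(aligned_padding(3, 8) == 5, "DATA then starts at offset 8");
// An offset that is already aligned needs no padding.
static_assert(aligned_padding(16, 8) == 0, "no padding when already aligned");
```

The outer `% alignment` is what maps the already-aligned case to 0 rather than to a full `alignment` bytes.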
| 84 | + |
| 85 | +### Alignment Values by Element Type |
| 86 | + |
| 87 | +| Element Type | Element Size | Required Alignment | |
| 88 | +|---|---|---| |
| 89 | +| `bfloat16_t` | 2 | 2 | |
| 90 | +| `float16_t` | 2 | 2 | |
| 91 | +| `float32_t` | 4 | 4 | |
| 92 | +| `float64_t` | 8 | 8 | |
| 93 | +| `float128_t` | 16 | 16 | |
| 94 | +| `int8_t` / `uint8_t` | 1 | 1 (no padding needed) | |
| 95 | +| `int16_t` / `uint16_t` | 2 | 2 | |
| 96 | +| `int32_t` / `uint32_t` | 4 | 4 | |
| 97 | +| `int64_t` / `uint64_t` | 8 | 8 | |
| 98 | +| `int128_t` / `uint128_t` | 16 | 16 | |
| 99 | + |
| 100 | +Note: 1-byte element types trivially satisfy alignment and never require padding. Implementations may use standard typed arrays for single-byte elements, as there is no alignment benefit. |
| 101 | + |
| 102 | +### Restrictions |
| 103 | + |
| 104 | +- The entire message from byte 0 through the end of the aligned typed array's `DATA` **must** reside in contiguous memory. |
| 105 | +- The `NUMERIC_HEADER` **must** encode a numeric type (category 0, 1, or 2). Encoders **must not** write an aligned typed array with a boolean or string header. Decoders **must** reject such combinations. Boolean arrays are bit-packed and string arrays have variable-length elements, so alignment is not meaningful for these types. |
| 106 | + |
| 107 | +## Decoding Procedure |
| 108 | + |
| 109 | +1. **Begin at byte offset 0 of the message buffer.** If a framing header is present, decode it and advance the offset accordingly. Track the current byte offset throughout decoding. |
| 110 | +2. **Decode the root VALUE normally**, tracking the current byte offset at each point. |
| 111 | +3. **Upon encountering a typed array with category 3, sub-type 2 (aligned):** |
| 112 | + a. Read the `NUMERIC_HEADER` byte to determine element type and size. |
| 113 | + b. Read the `SIZE` compressed unsigned integer to get the element count. |
| 114 | + c. Record `offset_after_size` — the current byte offset. |
| 115 | + d. Compute `padding = (alignment - (offset_after_size % alignment)) % alignment`. |
| 116 | + e. Skip `padding` bytes. |
| 117 | + f. The next `element_count * element_size` bytes are the data payload, **already aligned**. Return a pointer/span directly into the buffer. |
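
Step 3 can be sketched in C++ as follows. This is an illustration under stated assumptions, not a reference decoder: `AlignedArrayView`, `read_compressed`, and `decode_aligned` are hypothetical names, and the compressed unsigned integer layout (low two bits of the first byte select a width of 1, 2, 4, or 8 bytes; the little-endian value is stored shifted left by two) is assumed from the worked example in this document. The sketch also checks that the whole payload fits in the buffer, which goes slightly beyond the steps listed above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>

// Hypothetical result type: a view into the message buffer, no copy made.
struct AlignedArrayView {
    const unsigned char* data;  // first byte of DATA inside the message buffer
    std::size_t count;          // number of elements
    std::size_t element_size;   // bytes per element (also its alignment)
    unsigned category;          // 0 = float, 1 = signed, 2 = unsigned
};

// Assumed compressed-uint layout (see lead-in). Advances `pos` on success.
inline std::optional<std::uint64_t> read_compressed(const unsigned char* buf,
                                                    std::size_t len,
                                                    std::size_t& pos) {
    if (pos >= len) return std::nullopt;
    const std::size_t width = std::size_t{1} << (buf[pos] & 0x03);
    if (len - pos < width) return std::nullopt;
    std::uint64_t raw = 0;
    for (std::size_t i = 0; i < width; ++i)
        raw |= std::uint64_t{buf[pos + i]} << (8 * i);
    pos += width;
    return raw >> 2;
}

// Decodes one aligned typed array whose first header byte is at `pos`, where
// `pos` is an offset from byte 0 of the message buffer `buf`.
inline std::optional<AlignedArrayView> decode_aligned(const unsigned char* buf,
                                                      std::size_t len,
                                                      std::size_t pos) {
    if (pos > len || len - pos < 2) return std::nullopt;
    if (buf[pos] != 0b010'11'100) return std::nullopt;  // category 3, sub-type 2

    // a. NUMERIC_HEADER: typed array tag, numeric category, byte count.
    const unsigned char numeric = buf[pos + 1];
    if ((numeric & 0x07) != 0b100) return std::nullopt;
    const unsigned category = (numeric >> 3) & 0x03;
    if (category == 3) return std::nullopt;  // boolean/string is not allowed
    const std::size_t element_size = std::size_t{1} << (numeric >> 5);
    pos += 2;

    // b. SIZE; afterwards `pos` is offset_after_size (step c).
    const auto count = read_compressed(buf, len, pos);
    if (!count) return std::nullopt;

    // d, e. Compute and skip the padding, staying inside the buffer.
    const std::size_t padding =
        (element_size - pos % element_size) % element_size;
    if (len - pos < padding) return std::nullopt;
    pos += padding;

    // f. Bounds-check the payload without overflowing count * element_size.
    if (*count > (len - pos) / element_size) return std::nullopt;
    return AlignedArrayView{buf + pos, static_cast<std::size_t>(*count),
                            element_size, category};
}
```

The view deliberately exposes raw bytes; turning it into a typed span is left to the caller, who also has to confirm that the buffer itself is suitably aligned.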
| 118 | + |
| 119 | +## Encoding Procedure |
| 120 | + |
| 121 | +1. **Begin at byte offset 0 of the message buffer.** If writing a framing header, do so first and advance the offset accordingly. Track byte offsets throughout encoding. |
| 122 | +2. **Encode the root VALUE normally**, tracking offsets. |
| 123 | +3. **When encoding a typed array that should be aligned:** |
| 124 | + a. Write the `TYPED_ARRAY_HEADER(aligned)` byte (`0x5C`). |
| 125 | + b. Write the `NUMERIC_HEADER` byte (same as a standard numeric typed array header). |
| 126 | + c. Write the `SIZE` compressed unsigned integer. |
| 127 | + d. Compute `padding = (alignment - (current_offset % alignment)) % alignment`. |
| 128 | + e. Write `padding` bytes (contents are unspecified; zero is conventional). |
| 129 | + f. Write the raw element data. |
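
As an illustration of step 3, the sketch below appends an aligned `float64` array to a growing byte vector. It is a simplified example under several assumptions: `encode_aligned_f64` is a hypothetical name, the vector holds the message from byte 0 (so `out.size()` is the current offset), the host is little-endian with IEEE 754 doubles, and only the 1-byte form of the compressed size (fewer than 64 elements) is handled.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Appends an aligned float64 typed array to `out`. Offsets are taken from
// out.size(), so `out` must contain the message starting at byte 0.
inline void encode_aligned_f64(std::vector<unsigned char>& out,
                               const double* values, std::size_t count) {
    assert(count < 64 && "sketch only writes the 1-byte compressed size");

    out.push_back(0b010'11'100);  // a. aligned header: category 3, sub-type 2
    out.push_back(0b011'00'100);  // b. NUMERIC_HEADER: float, 8-byte elements
    out.push_back(static_cast<unsigned char>(count << 2));  // c. SIZE

    // d. Padding is computed from the offset just after SIZE.
    const std::size_t alignment = sizeof(double);
    const std::size_t padding =
        (alignment - out.size() % alignment) % alignment;

    // e. Padding contents are unspecified; zero is used here.
    out.insert(out.end(), padding, static_cast<unsigned char>(0));

    // f. Raw element data, copied as-is (little-endian host assumed).
    const std::size_t start = out.size();
    out.resize(start + count * sizeof(double));
    if (count != 0)
        std::memcpy(out.data() + start, values, count * sizeof(double));
}
```

Called on an empty vector with `{1.0, 2.0, 3.0}`, this produces a 32-byte message with five padding bytes and the data starting at offset 8.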
| 130 | + |
| 131 | +## Worked Example |
| 132 | + |
| 133 | +Consider encoding a message containing a single aligned `float64_t` typed array with 3 elements: `[1.0, 2.0, 3.0]`, without a framing header. |
| 134 | + |
| 135 | +``` |
| 136 | +Offset Bytes Description |
| 137 | +------ ----- ----------- |
| 138 | +0 5C TYPED_ARRAY_HEADER: aligned typed array |
| 139 | + (0b010'11'100: category=3, sub-type=2=aligned) |
| 140 | +1 64 NUMERIC_HEADER: float64 typed array |
| 141 | + (0b011'00'100: byte_count=3→8 bytes, float, typed array) |
| 142 | +2 0C SIZE: 3 elements (3 << 2 | 0 = 0x0C, 1-byte compressed uint) |
| 143 | +3 xx xx xx xx xx PADDING: 5 bytes (contents unspecified) |
| 144 | + (alignment=8, offset_after_size=3, padding=(8-3%8)%8=5) |
| 145 | +8 00 00 00 00 DATA[0]: 1.0 as float64 little-endian |
| 146 | + 00 00 F0 3F |
| 147 | +16 00 00 00 00 DATA[1]: 2.0 as float64 little-endian |
| 148 | + 00 00 00 40 |
| 149 | +24 00 00 00 00 DATA[2]: 3.0 as float64 little-endian |
| 150 | + 00 00 08 40 |
| 151 | +------ |
| 152 | +Total: 32 bytes |
| 153 | +``` |
| 154 | + |
| 155 | +The data begins at offset 8, which is a multiple of 8 (`alignof(float64_t)`). If the buffer itself is 8-byte aligned, the decoder can return a `span<double>` pointing at buffer offset 8 with no copy. |
| 156 | + |
| 157 | +## Composability with Existing Extensions |
| 158 | + |
| 159 | +Because alignment is a property of the typed array itself, every extension that embeds a typed array benefits automatically. |
| 160 | + |
| 161 | +### Matrices (Extension 2) |
| 162 | + |
| 163 | +A matrix stores its data as a typed array. By using an aligned typed array as the inner `VALUE`, the matrix data payload is automatically aligned: |
| 164 | + |
| 165 | +``` |
| 166 | +EXT(2) | MATRIX_HEADER | EXTENTS | ALIGNED_TYPED_ARRAY |
| 167 | +``` |
| 168 | + |
| 169 | +No changes to the matrix extension are required. |
| 170 | + |
| 171 | +### Complex Numbers (Extension 3) |
| 172 | + |
| 173 | +Complex arrays store pairs of numerical values in a typed array. Using an aligned typed array as the inner data automatically aligns the complex data: |
| 174 | + |
| 175 | +``` |
| 176 | +EXT(3) | COMPLEX_HEADER | SIZE | ALIGNED_TYPED_ARRAY_DATA |
| 177 | +``` |
| 178 | + |
| 179 | +No changes to the complex number extension are required. |
| 180 | + |
| 181 | +## Nested / Multiple Aligned Arrays |
| 182 | + |
| 183 | +A message may contain multiple aligned typed arrays (for example, as values in an object). Each one computes its own padding independently based on its offset from byte 0. The contiguous-memory requirement applies to the entire message. |
| 184 | + |
| 185 | +Because the headers, sizes, and keys between typed arrays will vary in length, each aligned typed array may have a different amount of padding. This is expected and correct. |
| 186 | + |
| 187 | +## Impact on Message Size |
| 188 | + |
| 189 | +An aligned typed array uses one extra byte compared to a standard typed array (the additional `NUMERIC_HEADER` byte), plus at most `alignment - 1` bytes of padding. For typical payloads containing large arrays, this overhead is negligible. For messages with many small aligned arrays, the overhead could be more significant. Implementations should consider using standard (unaligned) typed arrays for small arrays where the copy cost is trivial. |
| 190 | + |
| 191 | +As a guideline: the copy cost of re-aligning `N` bytes is roughly proportional to `N`, while the padding overhead is bounded by a constant. For arrays larger than about one cache line (e.g., more than 64 bytes of data on typical hardware), alignment padding is almost always worthwhile. |
| 192 | + |
| 193 | +## Backward Compatibility |
| 194 | + |
| 195 | +- Decoders that predate this proposal will encounter typed array category 3 with an unrecognized sub-type value of 2. This is a clean failure — the decoder knows it is dealing with a typed array but does not recognize the sub-type. This is no worse than an unknown extension ID, and arguably better since the context is preserved. |
| 196 | + |
| 197 | +## Security Considerations |
| 198 | + |
| 199 | +- Padding bytes are unspecified and **must** be ignored by decoders. Because zeros are valid data in a binary format, requiring zero-padding provides no security benefit and adds unnecessary verification cost in the decode path. |
| 200 | +- Decoders **must** validate that the computed padding does not extend beyond the message buffer. |
| 201 | + |
| 202 | +## Summary |
| 203 | + |
| 204 | +This proposal adds zero-copy typed array support to BEVE through a new sub-type within the existing typed array tag: |
| 205 | + |
| 206 | +**Aligned Typed Array** (typed array category 3, sub-type 2): uses a second header byte to encode the numeric element type, followed by the element count, deterministic padding, and the data payload. Because alignment lives within the typed array tag itself, every extension that embeds a typed array — matrices, complex numbers, timestamps — gains zero-copy support automatically with no modifications. |
| 207 | + |
| 208 | +This allows decoders to return direct pointers into the message buffer as typed spans, eliminating copy and allocation overhead for large numerical arrays — a critical optimization for scientific computing, real-time data processing, and high-throughput serialization pipelines. |