Commit 2590947

Merge pull request #34 from beve-org/aligned-typed-arrays
Aligned typed arrays
2 parents 133a8b9 + cb41ab0 commit 2590947

3 files changed

Lines changed: 53 additions & 7 deletions

README.md

Lines changed: 29 additions & 3 deletions
@@ -1,5 +1,5 @@
 # BEVE - Binary Efficient Versatile Encoding
-Version 1.0
+Version 1
 
 *High performance, tagged binary data specification like JSON, MessagePack, CBOR, etc. But, designed for higher performance and scientific computing.*
 
@@ -199,16 +199,17 @@ The next two bits indicate the type stored in the array:
 0 -> floating point
 1 -> signed integer
 2 -> unsigned integer
-3 -> boolean or string
+3 -> boolean, string, or aligned
 ```
 
 For integral and floating point types, the next three bits of the type header are the BYTE COUNT.
 
-For boolean or string types the next bit indicates whether the type is a boolean or a string
+For boolean or string types the next bit indicates whether the type is a boolean, string, or an aligned numeric array
 
 ```c++
 0 -> boolean // packed as single bits to the nearest byte
 1 -> string // an array of strings (not an array of characters)
+2 -> aligned // zero-copy aligned numeric array
 ```
 
 Layout: `HEADER | SIZE | data`
@@ -237,6 +238,31 @@ String arrays do not include the string HEADER for each element.
 
 Layout: `HEADER | SIZE | string[0] | ... string[N]`
 
+### Aligned Typed Arrays
+
+Aligned typed arrays enable zero-copy access by padding the data payload to the element type's natural alignment. The message buffer must be aligned to at least the maximum element alignment in the message.
+
+Layout: `HEADER | NUMERIC_HEADER | SIZE | PADDING_LENGTH | PADDING | DATA`
+
+- `HEADER` — 1 byte (`0x5C`), typed array with category 3, sub-type 2.
+- `NUMERIC_HEADER` — 1 byte, a standard numeric typed array header encoding the element category (bits 3–4: 0=float, 1=signed, 2=unsigned) and BYTE COUNT (bits 5–7). Bits 0–2 must be `0b100`.
+- `SIZE` — compressed unsigned integer, element count.
+- `PADDING_LENGTH` — 1 byte, number of padding bytes that follow (0 to `alignment - 1`).
+- `PADDING` — `PADDING_LENGTH` bytes (contents unspecified, must be ignored by decoders).
+- `DATA` — raw element data, aligned to `alignof(T)`.
+
+The encoder computes padding as:
+
+```
+padding = (alignment - (offset_after_padding_length % alignment)) % alignment
+```
+
+where `offset_after_padding_length` is the byte offset from the start of the message buffer.
+
+Aligned typed arrays must only encode numeric types. Boolean and string sub-types must not be used.
+
+> Extensions that embed typed arrays (matrices, complex numbers) gain zero-copy support automatically by using an aligned typed array as the inner value.
+
 ## 5 - Generic Array
 
 Generic arrays expect elements to have headers.
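
The header bytes described in the README hunk above can be checked with a few lines of C++. This is a minimal sketch, not code from the BEVE repository: `typed_array_header`, `is_valid_numeric_header`, `numeric_category`, and the constants are hypothetical names, and only the bit positions (type in bits 0-2, category in bits 3-4, sub-type or BYTE COUNT in bits 5-7) are taken from the text.

```c++
#include <cassert>
#include <cstdint>

// Hypothetical names; only the bit layout comes from the spec text.
constexpr std::uint8_t kTypedArrayType = 0x04;  // bits 0-2 == 0b100
constexpr std::uint8_t kCategoryOther  = 3;     // boolean, string, or aligned
constexpr std::uint8_t kSubTypeAligned = 2;     // aligned numeric array

// Pack a typed array header: type in bits 0-2, category in bits 3-4,
// and the remaining field (sub-type or BYTE COUNT) in bits 5-7.
constexpr std::uint8_t typed_array_header(std::uint8_t category, std::uint8_t upper) {
    return static_cast<std::uint8_t>(kTypedArrayType |
                                     ((category & 0x03) << 3) |
                                     ((upper & 0x07) << 5));
}

constexpr std::uint8_t kAlignedHeader =
    typed_array_header(kCategoryOther, kSubTypeAligned);
static_assert(kAlignedHeader == 0x5C, "category 3, sub-type 2 packs to 0x5C");

// A NUMERIC_HEADER must itself be a typed array header (bits 0-2 == 0b100)
// with a numeric category: 0 = float, 1 = signed, 2 = unsigned.
constexpr bool is_valid_numeric_header(std::uint8_t byte) {
    return (byte & 0x07) == kTypedArrayType && ((byte >> 3) & 0x03) <= 2;
}

static_assert(!is_valid_numeric_header(kAlignedHeader),
              "category 3 is not numeric, so 0x5C cannot be a NUMERIC_HEADER");

// Extract the element category from a NUMERIC_HEADER that was already validated.
inline std::uint8_t numeric_category(std::uint8_t numeric_header) {
    assert(is_valid_numeric_header(numeric_header));
    return static_cast<std::uint8_t>((numeric_header >> 3) & 0x03);
}
```

A decoder written along these lines would reject a `NUMERIC_HEADER` whose category is 3, which matches the rule that aligned typed arrays must only encode numeric types.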

docs/index.html

Lines changed: 24 additions & 4 deletions
@@ -44,7 +44,7 @@
 <div class="container">
 <h1>BEVE</h1>
 <p class="subtitle">Binary Efficient Versatile Encoding</p>
-<p class="version">Version 1.0</p>
+<p class="version">Version 1</p>
 <p class="tagline">High performance, tagged binary data specification like JSON, MessagePack, CBOR, etc. But, designed for higher performance and scientific computing.</p>
 <div class="cta-buttons">
 <a href="#specification" class="btn btn-primary">Read Specification</a>
@@ -350,11 +350,12 @@ <h4>4 - Typed Array</h4>
 <pre><code class="language-cpp">0 -&gt; floating point
 1 -&gt; signed integer
 2 -&gt; unsigned integer
-3 -&gt; boolean or string</code></pre>
+3 -&gt; boolean, string, or aligned</code></pre>
 <p>For integral and floating point types, the next three bits of the type header are the BYTE COUNT.</p>
-<p>For boolean or string types the next bit indicates whether the type is a boolean or a string:</p>
+<p>For boolean or string types the next bit indicates whether the type is a boolean, string, or an aligned numeric array:</p>
 <pre><code class="language-cpp">0 -&gt; boolean // packed as single bits to the nearest byte
-1 -&gt; string // an array of strings (not an array of characters)</code></pre>
+1 -&gt; string // an array of strings (not an array of characters)
+2 -&gt; aligned // zero-copy aligned numeric array</code></pre>
 <p>Layout: <code>HEADER | SIZE | data</code></p>
 
 <h5>Boolean Arrays</h5>
@@ -375,6 +376,25 @@ <h5>String Arrays</h5>
 <p>String arrays do not include the string HEADER for each element.</p>
 <p>Layout: <code>HEADER | SIZE | string[0] | ... string[N]</code></p>
 
+<h5>Aligned Typed Arrays</h5>
+<p>Aligned typed arrays enable zero-copy access by padding the data payload to the element type's natural alignment. The message buffer must be aligned to at least the maximum element alignment in the message.</p>
+<p>Layout: <code>HEADER | NUMERIC_HEADER | SIZE | PADDING_LENGTH | PADDING | DATA</code></p>
+<ul>
+<li><code>HEADER</code> — 1 byte (<code>0x5C</code>), typed array with category 3, sub-type 2.</li>
+<li><code>NUMERIC_HEADER</code> — 1 byte, a standard numeric typed array header encoding the element category (bits 3–4: 0=float, 1=signed, 2=unsigned) and BYTE COUNT (bits 5–7). Bits 0–2 must be <code>0b100</code>.</li>
+<li><code>SIZE</code> — compressed unsigned integer, element count.</li>
+<li><code>PADDING_LENGTH</code> — 1 byte, number of padding bytes that follow (0 to <code>alignment - 1</code>).</li>
+<li><code>PADDING</code> — <code>PADDING_LENGTH</code> bytes (contents unspecified, must be ignored by decoders).</li>
+<li><code>DATA</code> — raw element data, aligned to <code>alignof(T)</code>.</li>
+</ul>
+<p>The encoder computes padding as:</p>
+<pre><code class="language-text">padding = (alignment - (offset_after_padding_length % alignment)) % alignment</code></pre>
+<p>where <code>offset_after_padding_length</code> is the byte offset from the start of the message buffer.</p>
+<p>Aligned typed arrays must only encode numeric types. Boolean and string sub-types must not be used.</p>
+<div class="note">
+<p>Extensions that embed typed arrays (matrices, complex numbers) gain zero-copy support automatically by using an aligned typed array as the inner value.</p>
+</div>
+
 <h4>5 - Generic Array</h4>
 <p>Generic arrays expect elements to have headers.</p>
 <p>Layout: <code>HEADER | SIZE | VALUE[0] | ... VALUE[N]</code></p>
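
The padding rule added in this commit is easy to get off by one, because the offset is measured after the `PADDING_LENGTH` byte rather than before it. The following is a minimal sketch under stated assumptions, not the reference implementation: `aligned_array_padding` and `write_padding` are hypothetical names, `out` is assumed to hold the message from its first byte, and the padding is filled with zeros even though the spec leaves its contents unspecified.

```c++
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: padding bytes needed so that DATA starts on a multiple
// of `alignment`. The offset is measured from the start of the message buffer
// to the first byte after PADDING_LENGTH.
inline std::size_t aligned_array_padding(std::size_t offset_after_padding_length,
                                         std::size_t alignment) {
    assert(alignment != 0 && "alignment must be non-zero");
    return (alignment - (offset_after_padding_length % alignment)) % alignment;
}

// Hypothetical encoder step: appends PADDING_LENGTH and PADDING to `out`
// (assumed to contain HEADER, NUMERIC_HEADER, and SIZE already) and returns
// the offset at which DATA will begin.
inline std::size_t write_padding(std::vector<std::uint8_t>& out, std::size_t alignment) {
    // PADDING_LENGTH itself occupies one byte, so the offset after it is size() + 1.
    const std::size_t offset_after_padding_length = out.size() + 1;
    const std::size_t padding =
        aligned_array_padding(offset_after_padding_length, alignment);
    assert(padding < 256 && "PADDING_LENGTH is a single byte");

    out.push_back(static_cast<std::uint8_t>(padding));  // PADDING_LENGTH
    out.insert(out.end(), padding, std::uint8_t{0});    // PADDING (zeros chosen here)
    return out.size();                                  // DATA starts here
}
```

As a worked case, assume HEADER, NUMERIC_HEADER, and a one-byte SIZE occupy offsets 0 to 2. `PADDING_LENGTH` then sits at offset 3, `offset_after_padding_length` is 4, and an array of 8-byte elements needs 4 padding bytes so that `DATA` begins at offset 8. The result is only meaningful for zero-copy reads if the message buffer itself is aligned as the spec requires.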
File renamed without changes.
