Commit cfd15e3

Update design doc and spec with trait value table and fixed-size index
1 parent 33bd1fe commit cfd15e3

2 files changed

Lines changed: 44 additions & 0 deletions

designs/compact-binary-model-format.md

Lines changed: 24 additions & 0 deletions
@@ -260,6 +260,8 @@ performance), the remaining 30-40% matters.
 ├────────────────────────────────┤
 │ Symbol Table │
 ├────────────────────────────────┤
+│ Trait Value Table │
+├────────────────────────────────┤
 │ Shape Index (optional) │
 ├────────────────────────────────┤
 │ Metadata Section │
@@ -360,6 +362,28 @@ Writers SHOULD assign local symbol IDs in descending frequency order (most
 frequently referenced strings get the lowest IDs) to minimize VarUInt
 encoding size.
 
+### Trait Value Table
+
+The trait value table deduplicates trait values that appear multiple times
+across shapes. Each unique trait value is stored once in this table and
+referenced by index from within shape trait encodings.
+
+```
+VarUInt valueCount
+DynamicValue[] values (valueCount complete encoded values)
+```
+
+In the shapes section, each trait is encoded as:
+
+```
+SymRef traitId
+VarUInt traitValueRef (index into the trait value table)
+```
+
+This replaces inline `DynamicValue` encoding with a compact integer reference.
+Annotation traits (`{}`) that appear thousands of times are stored once and
+referenced by a 1-2 byte VarUInt instead of being encoded inline each time.
+
 ### Shape Index
 
 The shape index maps shape IDs to byte offsets within the shapes section and

docs/source-2.0/spec/smf.rst

Lines changed: 20 additions & 0 deletions
@@ -56,6 +56,8 @@ File structure
 ├────────────────────────────────┤
 │ Symbol Table │
 ├────────────────────────────────┤
+│ Trait Value Table │
+├────────────────────────────────┤
 │ Shape Index (optional) │
 ├────────────────────────────────┤
 │ Metadata Section │
@@ -138,6 +140,24 @@ Writers SHOULD assign local symbol IDs in descending frequency order to
 minimize VarUInt encoding size.
 
 
+.. _smf-trait-value-table:
+
+-----------------
+Trait value table
+-----------------
+
+Deduplicates trait values that appear multiple times across shapes. Each
+unique value is stored once and referenced by index.
+
+.. code-block:: none
+
+VarUInt valueCount
+DynamicValue[] values (valueCount encoded values)
+
+In shapes, each trait is encoded as ``SymRef traitId`` + ``VarUInt valueRef``
+(index into this table) instead of an inline ``DynamicValue``.
+
+
 .. _smf-shape-index:
 
 -----------
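As a reading aid for the trait value table added in this commit, the Python sketch below shows how a writer might deduplicate trait values and emit the `SymRef traitId` + `VarUInt valueRef` pair. It is an illustration under stated assumptions, not the format's reference implementation: `VarUInt` is assumed to be an LEB128-style base-128 integer, `SymRef` is assumed to be a symbol ID written as a `VarUInt`, and a canonical JSON string stands in for the real `DynamicValue` encoding when comparing values. The names `TraitValueTable`, `intern`, and `encode_trait` are hypothetical.

```python
import json


def encode_varuint(n: int) -> bytes:
    """Encode a non-negative int as an LEB128-style varint.

    Assumption: the diff does not define VarUInt, so this layout is a guess.
    """
    if n < 0:
        raise ValueError("VarUInt cannot encode negative values")
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)  # final byte, high bit clear
            return bytes(out)


class TraitValueTable:
    """Hypothetical writer-side table: each unique value is stored once."""

    def __init__(self) -> None:
        self._index: dict[str, int] = {}
        self.values: list = []

    def intern(self, value) -> int:
        """Return the table index for value, appending it if unseen."""
        # Canonical JSON is a stand-in equality key for DynamicValue.
        key = json.dumps(value, sort_keys=True, separators=(",", ":"))
        ref = self._index.get(key)
        if ref is None:
            ref = len(self.values)
            self._index[key] = ref
            self.values.append(value)
        return ref


def encode_trait(trait_symbol_id: int, value, table: TraitValueTable) -> bytes:
    """Encode one trait as SymRef traitId followed by VarUInt valueRef."""
    return encode_varuint(trait_symbol_id) + encode_varuint(table.intern(value))
```

Under these assumptions, an empty annotation value `{}` applied to many shapes occupies one table slot, and each use costs only the symbol reference plus a reference that fits in one byte for indexes below 128 and two bytes for indexes below 16384.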
