-
Notifications
You must be signed in to change notification settings - Fork 0
Description
Champion
What's the motivation for this proposal?
We want to be as efficient as possible in encoding our state, and currently we're putting it in a recursive jagged array (array with arrays), which is terrible for performance/memory.
Note, this is an optimization and can be done at any point in the future without any breakage.
Proposed Solution / Feature
Instead, we ideally encode a byte stream. However, encoding strings is annoying, and html supports utf-8 which is quite space efficient, so we'll encode those separately and index them.
We still need to encode types, references and variable length arrays.
We need to encode a lot of integers of various sizes, so we'll encode those efficiently.
We also encounter a lot of arrays all of the same type, so we'll special-case that.
The approach is again a typeid + data, except now it's all in a flat array, and the data is variable length. If we put the constants in the typeid, about 60 typeIds. We can expand the types some more with various types of integers, refs, and array lengths and use the leftovers for more constants, like numbers that are common.
Because strings are references to existing memory, and the encoded state is a copy, we stream the state as base64 starting with ["
and we complete with ",...${the JSON-encoded and script-escaped strings}]
. This means we have an array with only strings, which is stored efficiently.
To encode arrays, we need to know where in the byte stream it starts and stops. However, we are streaming so we don't know what the length will be, so instead we'll write all typeIds behind their data, and parse backwards when deserializing.
Furthermore, it is common to have the same type of items in arrays. So we will have a special type of array for that. To allow easy switching, we merge all arrays that follow each other, except when separated by a marker.
If you have an array of constants, MixArray
is best. But if it is an array of Int8 numbers, even if some of them are also constants, it should be a SingleArray
.
However, this is quite complex, so perhaps just start with only MixArray
.
We will encode the root state array itself as an array so we know how big an array to make when looking for indexes.
E.g. these could be the meanings of the typeId byte:
Undefined
Null
True
False
EmptyString
EMPTY_ARRAY
EMPTY_OBJ
NEEDS_COMPUTATION
STORE_ALL_PROPS
Slot
Fragment
NaN
PositiveInfinity
NegativeInfinity
MaxSafeInt
AlmostMaxSafeInt
(used for close fragment)MinSafeInt
+Int8
+ 1 byte. We could in theory substract what we have as constants but that's extra work-Int8
+ 1+Int16
+ 2-Int16
+ 2+Int32
+ 4-Int32
+ 4Number
+ 8 - non-integers, stored little-endianStringIdx8
StringIdx16
StringIdx24
we don't support more than 16.7 million stringsRefIdx8
RefIdx16
RefIdx24
we don't support more than 16.7 MB state excluding strings (or should we?)SingleArray8
: + ...data (encoded without typeId), typeId, offset 1 byte (so the previous item will be at 1(item typeid) + 1(offset int) + 1 (SingleArray8
marker) + offset bytes backwards).SingleArray16
: + ...data (encoded without typeId), typeId, offset 2 byte (so the previous item will be at 1(item typeid) + 2(1-65536 bytes count) + 1 (SingleArray16
marker) + offset bytes backwards).MixArray8
: + ...data (encoded with typeId) + offset 1 byteMixArray16
: + ...data (encoded with typeId) + offset 2 byteArrayEnd
: + count of items (encoded). Marks the end of the previous arrays. To know total offset, add up the previousArray
+MixArray
offsets. An empty array that is not the constantEMPTY_ARRAY
has count 0.VNode
RefVNode
Date
: + 4 bytes epoch timeURL
: + stringIndex (encoded int)Regex
: + stringIndex (encoded int), byte with the 8 "dgymsuvy" flagsBigInt
: + stringIndex (encoded int) (just keep it as bigint in the strings array and stream as string)URLSearchParams
Error
Object
Promise
Set
Map
: + mixarray of key+value pairsUint8Array
QRL
Task
Resource
Component
Signal
WrappedSignal
ComputedSignal
SerializerSignal
Store
StoreArray
FormData
JSXNode
PropsProxy
EffectData
- next 64: number 0-64
- next 64: refIdx 0-64
- rest: stringIdx 0-64 (a few more actually)
For example, the roots [32, 2.573, "hi", Slot, [Slot]]
would encode as [base64(Num32Id <8 bytes> NumberId String0Id SlotId SlotId 1 MixArray8Id Int1Id ArrayEnd 17 MixArray8Id Num5Id ArrayEnd) (30 bytes base64), "hi"]
To decode, JSON.parse the result, take the first item, base64-decode, reconstruct the uint8array, splice off the first item so you are left with the strings array, and then go backwards to reconstruct the roots:
- ArrayEnd, Num5Id, MixArray8Id, 17: create an array of 5 and stop when you get to 17 bytes earlier
- index 4:
- ArrayEnd, Num1Id, MixArray8Id, 1: create an array of 5 and stop when you get to 17 bytes earlier
- index 0:
- SlotId: assign the const Slot
- index 0:
- ArrayEnd, Num1Id, MixArray8Id, 1: create an array of 5 and stop when you get to 17 bytes earlier
- index 3:
- SlotId: assign the const Slot
- index 2:
- String0Id: assing the string at index 0
- index 1:
- NumberId, 8 bytes: reconstruct the float64
- index 0:
- Num32Id: assign 32
PRs/ Links / References
No response
Metadata
Metadata
Assignees
Labels
Type
Projects
Status