Commit 224715f

Move description of message binary format into new document
1 parent 18f7eaf commit 224715f

2 files changed

+93
-92
lines changed

MESSAGE_BINARY_FORMAT.md

Lines changed: 92 additions & 0 deletions
@@ -0,0 +1,92 @@
1+
### Introduction
2+
3+
The main offering of this crate is a consistent and known representation of Rust types. As such, the format is
4+
considered to be part of our stable API, and changing the format requires a major version number bump. To aid you
5+
in debugging, the current version of that format is documented here.
6+
7+
### High-level overview
8+
9+
***Connection Initial Description***
10+
11+
Version 1 of the protocol did not send this description; version 2 is the first version to send one.
12+
The first 8 bytes sent on the channel are the protocol version number, encoded as a little-endian integer. If the reader is not compatible with the protocol version provided, it must
13+
terminate immediately. Directly after the version, the sender describes the features of the message stream it is about to send. Those features are as follows.
14+
15+
**Version 1**
16+
17+
Version 1 did not send an initial description. All optional features are disabled in version 1.
18+
19+
**Version 2**
20+
- `name`: checksum_enabled, `size`: 1 byte, `possible values`: 2 or 3, `notes`: 2 indicates checksums will be sent, 3 indicates checksums will not be sent.
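
The startup exchange described above can be sketched as a small reader. This is only an illustration under stated assumptions: `InitialDescription` and `read_initial_description` are hypothetical names, not part of the crate's API, and the sketch accepts only version 2 because the text does not say how a version 1 peer should be detected.

```rust
use std::io::{self, Read};

/// Features announced by the sender at the start of a connection.
/// Hypothetical type used only for this sketch.
#[derive(Debug, PartialEq, Eq)]
pub struct InitialDescription {
    pub protocol_version: u64,
    pub checksums_enabled: bool,
}

/// Reads the 8-byte little-endian protocol version followed by the
/// 1-byte `checksum_enabled` field. Assumption: any version other than 2
/// is rejected, since handling of version 1 peers is not specified here.
pub fn read_initial_description<R: Read>(reader: &mut R) -> io::Result<InitialDescription> {
    // The first 8 bytes are the little-endian protocol version number.
    let mut version_bytes = [0u8; 8];
    reader.read_exact(&mut version_bytes)?;
    let protocol_version = u64::from_le_bytes(version_bytes);

    if protocol_version != 2 {
        // An incompatible reader must terminate immediately.
        return Err(io::Error::new(
            io::ErrorKind::InvalidData,
            format!("unsupported protocol version {}", protocol_version),
        ));
    }

    // Version 2 describes one feature: checksum_enabled, 1 byte, value 2 or 3.
    let mut flag = [0u8; 1];
    reader.read_exact(&mut flag)?;
    let checksums_enabled = match flag[0] {
        2 => true,  // checksums will be sent
        3 => false, // checksums will not be sent
        other => {
            return Err(io::Error::new(
                io::ErrorKind::InvalidData,
                format!("unexpected checksum_enabled value {}", other),
            ))
        }
    };

    Ok(InitialDescription {
        protocol_version,
        checksums_enabled,
    })
}
```

Because `&[u8]` implements `Read`, the function can be tried against a captured byte dump; whatever remains in the slice afterwards is the start of the message stream.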
21+
22+
***Message stream***
23+
24+
After the initial description, the byte stream is split up into messages. Every message begins with a `length` value, followed by `length` bytes of
25+
message data. This `length` value is the entirety of the header of a
26+
message. If checksums are enabled, the 8 bytes immediately following the message are the checksum of the message. This checksum is determined by hashing
27+
the bytes of the message using SipHash 2-4. If checksums are disabled, the next message begins immediately after the message bytes. The bytes of the message are then
28+
deserialized into a Rust type via [`bincode`](https://github.com/bincode-org/bincode), using the following configuration.
29+
30+
```rust,ignore
31+
bincode::DefaultOptions::new()
32+
.with_limit(size_limit)
33+
.with_little_endian()
34+
.with_varint_encoding()
35+
.reject_trailing_bytes()
36+
```
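
To make the framing concrete, here is a minimal sketch of splitting one message out of a buffer once its `length` has been decoded. `Frame` and `split_frame` are hypothetical names for illustration. The checksum is returned as raw bytes and is not verified, because the SipHash key and the byte order of the checksum are not stated in this document.

```rust
/// One message body split out of the stream, plus its raw checksum bytes
/// if checksums are enabled. Hypothetical type used only for this sketch.
#[derive(Debug, PartialEq, Eq)]
pub struct Frame<'a> {
    pub body: &'a [u8],
    pub checksum: Option<[u8; 8]>,
}

/// Splits `length` message bytes (and, if enabled, the 8 checksum bytes
/// that follow them) off the front of `input`.
/// Returns the frame and the remaining bytes, or `None` if `input` does
/// not yet hold the whole frame.
pub fn split_frame(
    input: &[u8],
    length: usize,
    checksums_enabled: bool,
) -> Option<(Frame<'_>, &[u8])> {
    if input.len() < length {
        return None;
    }
    let (body, rest) = input.split_at(length);

    if !checksums_enabled {
        // Without checksums the next message begins right after the body.
        return Some((Frame { body, checksum: None }, rest));
    }

    if rest.len() < 8 {
        return None;
    }
    // With checksums, the 8 bytes after the body belong to this message.
    let (checksum_bytes, rest) = rest.split_at(8);
    let mut checksum = [0u8; 8];
    checksum.copy_from_slice(checksum_bytes);
    Some((Frame { body, checksum: Some(checksum) }, rest))
}
```

The returned `body` slice is what would be handed to `bincode` for deserialization with the configuration shown above.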
37+
38+
### Length encoding
39+
40+
The length is encoded using a variably sized integer encoding scheme. To understand this scheme, first we need a few constant values.
41+
42+
```ignore
43+
u16_marker; decimal: 252, hex: FC
44+
u32_marker; decimal: 253, hex: FD
45+
u64_marker; decimal: 254, hex: FE
46+
zst_marker; decimal: 255, hex: FF
47+
stream_end; decimal: 0, hex: 00
48+
```
49+
50+
Any length less than `u16_marker` and greater than 0 is encoded as a single byte whose value is the length.
51+
A length of zero is encoded with the `zst_marker`. The stream is ended with the `stream_end` value. When a peer reads
52+
this value, it is expected to close the connection.
53+
54+
`async-io-typed` always uses little-endian byte order for the values it encodes itself. The user data being sent may contain values that are not
55+
little-endian.
56+
57+
If the first byte is `u16_marker`, then the length is 16 bits wide and is encoded in the following 2 bytes. Once
58+
those 2 bytes are read, the message begins. `u32_marker` and `u64_marker` work the same way, with the length encoded in the following
59+
4 bytes and 8 bytes respectively.
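
The decoding rules above can be sketched as a function over a byte slice. The constant values come from the table in this section; `LengthPrefix` and `decode_length` are hypothetical names chosen for illustration and are not taken from the crate.

```rust
const U16_MARKER: u8 = 252;
const U32_MARKER: u8 = 253;
const U64_MARKER: u8 = 254;
const ZST_MARKER: u8 = 255;
const STREAM_END: u8 = 0;

/// Result of decoding one length prefix. Hypothetical type for this sketch.
#[derive(Debug, PartialEq, Eq)]
pub enum LengthPrefix {
    /// A message of this many bytes follows.
    Message(u64),
    /// The sender ended the stream; the peer is expected to close the connection.
    StreamEnd,
}

/// Decodes a length prefix from the front of `input`.
/// Returns the prefix and the number of bytes it occupied, or `None` if
/// `input` is too short to contain the whole prefix.
pub fn decode_length(input: &[u8]) -> Option<(LengthPrefix, usize)> {
    let (&first, rest) = input.split_first()?;
    let decoded = match first {
        STREAM_END => (LengthPrefix::StreamEnd, 1),
        ZST_MARKER => (LengthPrefix::Message(0), 1),
        U16_MARKER => {
            let mut buf = [0u8; 2];
            buf.copy_from_slice(rest.get(..2)?);
            (LengthPrefix::Message(u64::from(u16::from_le_bytes(buf))), 3)
        }
        U32_MARKER => {
            let mut buf = [0u8; 4];
            buf.copy_from_slice(rest.get(..4)?);
            (LengthPrefix::Message(u64::from(u32::from_le_bytes(buf))), 5)
        }
        U64_MARKER => {
            let mut buf = [0u8; 8];
            buf.copy_from_slice(rest.get(..8)?);
            (LengthPrefix::Message(u64::from_le_bytes(buf)), 9)
        }
        // Any other byte (1 through 251) is the length itself.
        small => (LengthPrefix::Message(u64::from(small)), 1),
    };
    Some(decoded)
}
```

Returning the number of consumed bytes lets a caller advance its buffer to the first byte of the message body.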
60+
61+
### Examples
62+
63+
64+
Length 12
65+
```ignore
66+
0C
67+
```
68+
69+
Length 0
70+
```ignore
71+
FF
72+
```
73+
74+
Length 252 (First byte is u16_marker)
75+
```ignore
76+
FC, FC, 00
77+
```
78+
79+
Length 253 (First byte is u16_marker)
80+
```ignore
81+
FC, FD, 00
82+
```
83+
84+
Length 65,536 (aka 2^16) (First byte is u32_marker)
85+
```ignore
86+
FD, 00, 00, 01, 00
87+
```
88+
89+
Length 4,294,967,296 (aka 2^32) (First byte is u64_marker)
90+
```ignore
91+
FE, 00, 00, 00, 00, 01, 00, 00, 00
92+
```
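
The examples above can be reproduced with a short encoder. This is a sketch rather than the crate's implementation: `encode_length` is a hypothetical name, and it assumes the sender always picks the narrowest width that fits, which matches every example listed here.

```rust
const U16_MARKER: u8 = 252;
const U32_MARKER: u8 = 253;
const U64_MARKER: u8 = 254;
const ZST_MARKER: u8 = 255;

/// Encodes a message length as a length prefix.
/// Assumption: the narrowest encoding that can hold `length` is used.
pub fn encode_length(length: u64) -> Vec<u8> {
    if length == 0 {
        // Zero cannot be written as 00, because 00 means end of stream.
        return vec![ZST_MARKER];
    }
    if length < u64::from(U16_MARKER) {
        // 1 through 251: a single byte whose value is the length.
        return vec![length as u8];
    }

    let mut out = Vec::with_capacity(9);
    if length <= u64::from(u16::MAX) {
        out.push(U16_MARKER);
        out.extend_from_slice(&(length as u16).to_le_bytes());
    } else if length <= u64::from(u32::MAX) {
        out.push(U32_MARKER);
        out.extend_from_slice(&(length as u32).to_le_bytes());
    } else {
        out.push(U64_MARKER);
        out.extend_from_slice(&length.to_le_bytes());
    }
    out
}
```

Note that 252 through 255 need the 3-byte form even though they fit in one byte, because those single-byte values are reserved as markers.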

README.md

Lines changed: 1 addition & 92 deletions
@@ -31,95 +31,4 @@ it will help. Consider using protobufs or JSON if Rust adoption is a blocker.
3131

3232
## Binary format
3333

34-
### Introduction
35-
36-
The main offering of this crate is a consistent and known representation of Rust types. As such, the format is
37-
considered to be part of our stable API, and changing the format requires a major version number bump. To aid you
38-
in debugging, that format is documented here.
39-
40-
### High-level overview
41-
42-
***Connection Initial Description***
43-
44-
Version 1 of the protocol did not send this description. Version 2 is the first version that sends a startup description.
45-
The first 8 bytes sent on the channel are the little endian protocol version number. If the reader is not compatible with the message version provided, it must
46-
terminate immediately. Immediately following the version, the sender will describe the features of the message stream it is about to send. Those features are as follows.
47-
48-
**Version 1**
49-
50-
Version 1 did not send an initial description. All optional features are disabled in version 1.
51-
52-
**Version 2**
53-
- `name`: checksum_enabled, `size`: 1 byte, `possible values`: 2 or 3, `notes`: 2 indicates checksums will be sent, 3 indicates checksums will not be sent.
54-
55-
***Message stream***
56-
57-
After the initial description, the byte stream is split up into messages. Every message begins with a `length` value. After `length` bytes have
58-
been read, a new message can begin immediately afterward. This `length` value is the entirety of the header of a
59-
message. If checksums are enabled, the 8 bytes immediately following the message are the checksum of the message. This checksum is determined by hashing
60-
the bytes of the message using SipHash 2-4. If checksums are disabled, instead the next message begins immediately. The bytes from the message are then
61-
deserialized into a Rust type via [`bincode`](https://github.com/bincode-org/bincode), using the following configuration.
62-
63-
```rust,ignore
64-
bincode::DefaultOptions::new()
65-
.with_limit(size_limit)
66-
.with_little_endian()
67-
.with_varint_encoding()
68-
.reject_trailing_bytes()
69-
```
70-
71-
### Length encoding
72-
73-
The length is encoded using a variably sized integer encoding scheme. To understand this scheme, first we need a few constant values.
74-
75-
```ignore
76-
u16_marker; decimal: 252, hex: FC
77-
u32_marker; decimal: 253, hex: FD
78-
u64_marker; decimal: 254, hex: FE
79-
zst_marker; decimal: 255, hex: FF
80-
stream_end; decimal: 0, hex: 00
81-
```
82-
83-
Any length less than `u16_marker` and greater than 0 is encoded as a single byte whose value is the length.
84-
A length of zero is encoded with the `zst_marker`. The stream is ended with the `stream_end` value. When this is
85-
read the peer is expected to close the connection.
86-
87-
`async-io-typed` always uses little-endian. The user data being sent may contain values that are not
88-
little-endian, but `async-io-typed` itself always uses little-endian.
89-
90-
If the first byte is `u16_marker`, then the length is 16 bits wide, and encoded in the following 2 bytes. Once
91-
those 2 bytes are read, the message begins. `u32_marker` and `u64_marker` are used in a similar way, each of
92-
those being 4 bytes, and 8 bytes respectively.
93-
94-
### Examples
95-
96-
97-
Length 12
98-
```ignore
99-
0C
100-
```
101-
102-
Length 0
103-
```ignore
104-
FF
105-
```
106-
107-
Length 252 (First byte is u16_marker)
108-
```ignore
109-
FC, FC, 00
110-
```
111-
112-
Length 253 (First byte is u16_marker)
113-
```ignore
114-
FC, FD, 00
115-
```
116-
117-
Length 65,536 (aka 2^16) (First byte is u32_marker)
118-
```ignore
119-
FD, 00, 00, 01, 00
120-
```
121-
122-
Length 4,294,967,296 (aka 2^32) (First byte is u64_marker)
123-
```ignore
124-
FE, 00, 00, 00, 00, 01, 00, 00, 00
125-
```
34+
Details on the binary format used by this crate can be found in [the binary format specification](https://github.com/Xaeroxe/async-io-typed/blob/main/MESSAGE_BINARY_FORMAT.md).
