Summary
The specification does not explicitly define the bit ordering within bytes when serializing the bitstring prior to GZIP compression. This ambiguity is causing interoperability issues between independent implementations.
Current specification text
Section 3.3 (Bitstring Generation Algorithm) states:
"Let bitstring be a list of bits with a minimum size of 16KB, where each bit is initialized to 0 (zero)."
The algorithm describes the bitstring as an abstract list of bits, but does not specify how this list should be packed into bytes before compression.
The ambiguity
When mapping bit index n to a byte array, there are two possible conventions:
| Convention | Byte index | Bit position within byte | Example: bit 0 set |
| --- | --- | --- | --- |
| MSB-first | floor(n / 8) | 7 - (n mod 8) | 0b10000000 (128) |
| LSB-first | floor(n / 8) | n mod 8 | 0b00000001 (1) |
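To make the two mappings concrete, here is a minimal Python sketch. The helper name `set_bit` and its `msb_first` flag are hypothetical and chosen only for illustration; they are not taken from the specification or from any existing implementation.

```python
def set_bit(buf: bytearray, n: int, msb_first: bool = True) -> None:
    """Set bit index n in buf using the chosen bit-to-byte convention."""
    byte_index = n // 8              # floor(n / 8) in both conventions
    offset = n % 8                   # n mod 8
    # MSB-first: bit position 7 - (n mod 8); LSB-first: bit position n mod 8
    shift = 7 - offset if msb_first else offset
    buf[byte_index] |= 1 << shift


msb = bytearray(2)
lsb = bytearray(2)
set_bit(msb, 0, msb_first=True)
set_bit(lsb, 0, msb_first=False)

print(f"{msb[0]:#010b}", f"{lsb[0]:#010b}")  # 0b10000000 0b00000001
```

Both calls touch the same byte; only the position of the bit inside that byte differs, which is why the mismatch goes unnoticed until the status is read back.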
Real-world impact
We encountered this issue when integrating two independently developed implementations: an issuer that writes the bitstring LSB-first and a validator that reads it MSB-first. Status verification failed even though both implementations follow the specification as written.
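The failure mode can be reproduced in a few lines. This is a simplified sketch, not code from either implementation: the function names, the list size, and the choice of index 5 are assumptions made for the example, and GZIP compression is left out because it does not affect the outcome.

```python
def write_lsb_first(buf: bytearray, n: int) -> None:
    """Issuer side (assumed): store bit n at bit position n mod 8."""
    buf[n // 8] |= 1 << (n % 8)


def read_msb_first(buf: bytes, n: int) -> int:
    """Validator side (assumed): read bit n from bit position 7 - (n mod 8)."""
    return (buf[n // 8] >> (7 - n % 8)) & 1


status_list = bytearray(16)
write_lsb_first(status_list, 5)  # issuer sets the status bit at index 5

# The validator scans every index and reports the ones it sees as set.
seen_as_set = [i for i in range(len(status_list) * 8)
               if read_msb_first(status_list, i)]
print(seen_as_set)  # [2]
```

The validator reports index 2 as set and index 5 as clear, so one credential's status is misread and another credential is wrongly flagged.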
Ecosystem observation
Digital Bazaar's implementation uses MSB-first ordering, which appears to be the prevailing convention in the ecosystem.
Proposal
Add normative language to Section 3.3 specifying the bit-to-byte mapping. Suggested addition:
"When serializing the bitstring to a byte array, implementations MUST use most-significant-bit-first (MSB-first) ordering. Specifically, bit index n MUST be stored in byte floor(n / 8) at bit position 7 - (n mod 8), where bit position 7 represents the most significant bit of the byte."
Questions for the working group
- Is MSB-first the intended convention?
- Should this be explicitly documented in the specification?
- Would it be useful to include a test vector (e.g., "a bitstring with only bit 0 set compresses to X") to help implementers verify correctness?
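On the test vector question, one consideration is that the compressed bytes can differ between GZIP libraries and settings (compression level, header fields such as the modification time), so a vector may be more robust if it is checked after decompression. The sketch below assumes the MSB-first convention proposed above and a 16 KB list (131072 bits); `pack_msb_first` is a hypothetical helper, not part of the specification.

```python
import gzip


def pack_msb_first(set_indices, size_bits: int = 131072) -> bytes:
    """Pack the given bit indices into bytes, MSB-first (assumed convention)."""
    buf = bytearray(size_bits // 8)
    for n in set_indices:
        buf[n // 8] |= 1 << (7 - n % 8)
    return bytes(buf)


raw = pack_msb_first([0])                 # only bit 0 set
compressed = gzip.compress(raw, mtime=0)  # mtime=0 keeps the header stable
decoded = gzip.decompress(compressed)

# Check the decompressed bytes, not the compressed ones.
assert decoded == raw
assert decoded[0] == 0x80 and not any(decoded[1:])
```

A vector phrased as "decompresses to 0x80 followed by 16383 zero bytes" would let implementers verify the bit ordering regardless of which GZIP library they use.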
Thank you for your consideration.