Commit 00ec734

Add D4036 Why Not Span
1 parent 8c3c452 commit 00ec734

source/d4036-why-not-span.md

Lines changed: 296 additions & 0 deletions
@@ -0,0 +1,296 @@
---
title: "Why Span Is Not Enough"
document: D4036R0
date: 2026-06-15
reply-to:
  - "Vinnie Falco <vinnie.falco@gmail.com>"
audience: LEWG
---
9+
10+
## Abstract
11+
12+
C++ has bytes. A contiguous region of bytes needs a type. A sequence of such regions needs another. This paper examines the types that predictably come to mind, and their consequences.
13+
14+
---
15+
16+
## Revision History
17+
18+
### R0: June 2026
19+
20+
- Initial version.
21+
22+
---
23+
24+
## Disclosure
25+
26+
The author maintains [Boost.Beast](https://github.com/boostorg/beast)<sup>[7]</sup>, a published HTTP and WebSocket library built on [Boost.Asio](https://www.boost.org/doc/libs/release/doc/html/boost_asio.html)<sup>[1]</sup>'s buffer model, and develops [Capy, Corosio, Http, Beast2, and Burl](https://github.com/cppalliance)<sup>[10]</sup> - libraries that define or consume buffer abstractions. The author published [P4003R0](https://wg21.link/p4003r0)<sup>[8]</sup>. The author holds a neutral position on the Networking TS (changed from positive). This body of work creates a bias toward dedicated buffer types. Such types have costs: one more vocabulary type to learn, and interoperability friction with code that uses raw `span<byte>`.
27+
28+
## Credit Where Due
29+
30+
`std::span` is a well-established vocabulary type. It turns a pointer and a size into a single thing. Perfectly. The vocabulary need is profound and this paper does not propose to diminish it.
31+
32+
The question is whether `span` is also the right vocabulary for I/O buffer descriptors.
33+
34+
## 1. `span<byte>`
35+
36+
Question. How do we represent a contiguous region of bytes?
37+
38+
Answer. `span<byte>`. A pointer and a size. That works.
39+
40+
However...
41+
42+
Platform scatter/gather I/O takes an array of region descriptors, not a single region:
43+
| Platform | Descriptor             | Used By                                            |
| -------- | ---------------------- | -------------------------------------------------- |
| POSIX    | `struct iovec`         | `readv()` / `writev()`                             |
| POSIX    | `struct msghdr`        | `sendmsg()` / `recvmsg()`                          |
| Windows  | `WSABUF`               | `WSARecv()` / `WSASend()`                          |
| Windows  | `FILE_SEGMENT_ELEMENT` | `ReadFileScatter()` / `WriteFileGather()`          |
| Linux    | `struct iovec`         | `io_uring_prep_readv()` / `io_uring_prep_writev()` |
51+
52+
`span<byte>` can describe one region. Wrap it in a one-element array and `readv()` accepts it. But I/O rarely involves a single contiguous region. A message has a header and a body. A protocol has framing and payload. Sending two regions with `write()` means two syscalls. Sending them with `writev()` means one - this is scatter/gather I/O. `span<byte>` is an insufficient type for representing an array of buffers.
53+
54+
## 2. `span<span<byte>>`
55+
56+
Question. How do we represent several such regions?
57+
58+
Answer. A span of spans. `span<span<byte>>`.
59+
60+
A single buffer is a view of someone's data. The bytes exist somewhere - in a `vector`, in a memory-mapped page, in a stack array. The buffer borrows them. Non-owning is natural. The data has a natural owner elsewhere.
61+
62+
A buffer sequence is different. Nobody "naturally" has an array of `span<byte>` objects lying around. The sequence is an assembled grouping - a data structure constructed to collect regions together. Making it non-owning means the grouping itself cannot be stored, returned, or passed across an asynchronous boundary.
63+
```cpp
class message {
    span<span<byte>> buffers_; // borrows... what?
};

span<span<byte>> prepare_message(span<byte> hdr, span<byte> body) {
    span<byte> bufs[] = { hdr, body };
    return { bufs }; // dangling
}

void start_send(socket& s, span<byte> hdr, span<byte> body) {
    span<byte> bufs[] = { hdr, body };
    s.async_send(span<span<byte>>(bufs), callback);
    // returns immediately; bufs destroyed; dangling
}
```
80+
81+
## 3. `range<span<byte>>`
82+
83+
Question. How do we own a collection of byte regions?
84+
85+
Answer. Use a range. `vector<span<byte>>`, `array<span<byte>, N>`, any range whose value type is `span<byte>`.
86+
87+
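Ranges solve the ownership problem: a `vector` owns its elements, so the grouping can be stored, returned, and carried across an asynchronous boundary. The bytes themselves are still borrowed.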
88+
89+
Ranges create a byte consumption problem. Consider a JSON stream arriving in two chunks:
90+
```cpp
// chunk 1 [100 bytes]: {"name":"Alice","age":30}{"name":"B
// chunk 2 [100 bytes]: ob","age":25}...

range<span<byte>> input = { chunk1, chunk2 };

// parser finds first complete object: {"name":"Alice","age":30}
// that is 25 bytes - consume them
//
// views::drop(input, 1) drops all of chunk1 (100 bytes) - too much
// views::drop(input, 0) drops nothing - too little
// no standard range operation removes exactly 25 bytes
```

The parse boundary (25 bytes) does not align with the buffer boundary (100 bytes). Consuming 25 bytes means advancing chunk 1 by 25 bytes - 75 remain - without touching chunk 2. No range adaptor does this. [`std::ranges`](https://eel.is/c++draft/ranges)<sup>[5]</sup> operates on elements. Parsing operates on bytes, not elements.
106+
107+
Incremental parsers with this need - JSON, XML, CSV, protobuf - go unserved.
108+
109+
## 4. `byte`
110+
111+
Question. What if we add byte-level algorithms to a range of `span<byte>`?
112+
113+
Answer. The range is fine for ownership and iteration. The element type is not.
114+
115+
`span<byte>` already serves too many needs: serialization, cryptography, hashing, memory-mapped regions. If buffer sequences also use `span<byte>`, the type system cannot distinguish a buffer from any other byte span. A concept, an overload, or a constraint that separates "buffer in a sequence" from "hash input" or "encryption key" is impossible to write.
116+
117+
### Boost.Asio
118+
119+
A separate type enables run-time safety checks:
120+
| Capability                         | Asio `mutable_buffer`<sup>[1]</sup>  | `span<byte>` |
| ---------------------------------- | ------------------------------------ | ------------ |
| Implementation-defined members     | Possible                             | Closed       |
| Detect dangling after reallocation | Possible                             | No           |
| Future diagnostic aids             | Possible                             | No           |
| Conditional debug callback         | `BOOST_ASIO_ENABLE_BUFFER_DEBUGGING` | No           |
127+
128+
Every function signature that takes `span<byte>` gives up these safety capabilities.
129+
130+
## 5. Six Ecosystems Already Arrived Here
131+
132+
Six I/O ecosystems, designed independently, all arrived at similar solutions:
133+
| Ecosystem | Buffer Type                       | Layout                                                          |
| --------- | --------------------------------- | --------------------------------------------------------------- |
| POSIX     | `iovec`                           | `void*` + `size_t`                                              |
| Windows   | `WSABUF`                          | `ULONG` + `char*`                                               |
| Asio      | `const_buffer` / `mutable_buffer` | `void const*` + `size_t`, with range concepts<sup>[1]</sup>     |
| libuv     | `uv_buf_t`                        | `char*` + `size_t`<sup>[2]</sup>                                |
| Go        | `net.Buffers`                     | scatter/gather over `[][]byte`<sup>[3]</sup>                    |
| .NET      | `ReadOnlySequence<T>`             | linked list of discontiguous `Memory<T>` segments<sup>[4]</sup> |
142+
143+
Everybody converged on custom types independently.
144+
145+
## 6. The Final Straw
146+
147+
The committee already endorsed this principle.
148+
149+
[P0298R3](https://wg21.link/p0298r3)<sup>[6]</sup> introduced `std::byte` because `unsigned char` performed triple duty. Neil MacIntosh wrote:
150+
151+
> "these types perform a 'triple duty'. Not only are they used for byte addressing, but also as arithmetic types, and as character types. This multiplicity of roles opens the door for programmer error"<sup>[6]</sup>
152+
153+
> "The key motivation here is to make byte a distinct type - to improve program safety by leveraging the type system."<sup>[6]</sup>
154+
155+
`unsigned char` had the right size and alignment. The committee added `std::byte` anyway - same size, same alignment, but no arithmetic, no implicit conversions. The generic type's operations did not match the domain. The committee restricted the interface.
156+
157+
`span<byte>` performs double duty - general-purpose byte view and I/O buffer descriptor. A bespoke type restricts the interface to `data()` and `size()`. Same principle, one level of abstraction higher.
158+
159+
**The precise fit is bespoke.**
160+
161+
### Almost There
162+
163+
`std::byte` kept the shift operators despite the stated goal of removing arithmetic. The principle was right. The execution left a gap.
164+
165+
## 7. Finally Correct
166+
167+
New buffer types give us the principled option. Only what we need: `data()` and `size()`.
168+
169+
### `void*`, Not `byte*`
170+
171+
`void*` is maximally accepting and minimally permissive. Any object pointer converts to it implicitly, subject to cv-qualification. The user must perform an explicit cast to go back. The asymmetry is by design.
172+
| Risk                        | `void*` | `byte*` | Cost                            |
| --------------------------- | ------- | ------- | ------------------------------- |
| Requires `reinterpret_cast` | No      | Yes     | Invites superfluous casts       |
| Dereferenceable             | No      | Yes     | Invites accidental access       |
| Pointer arithmetic          | No      | Yes     | Invites accidental arithmetic   |
| Assignable to `span<byte>`  | No      | Yes     | Invites full span API misuse    |
| Promises byte-level meaning | No      | Yes     | Invites false type assertions   |
| Contradicts type erasure    | No      | Yes     | Invites type erasure violations |
| C++17 only                  | No      | Yes     | Disinvites C users              |
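
The implicit-in, explicit-out asymmetry of `void*` can be stated as a pair of compile-time checks. The helper names `erase` and `restore` are hypothetical.

```cpp
#include <cassert>
#include <type_traits>

// Into void*: implicit. Out of void*: not without a cast.
static_assert( std::is_convertible_v<int*, void*>);
static_assert(!std::is_convertible_v<void*, int*>);

inline void* erase(int* p) noexcept { return p; }   // implicit conversion
inline int* restore(void* p) noexcept
{
    return static_cast<int*>(p);                    // explicit, visible in review
}
```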
182+
183+
### A Buffer Sequence Is Distinct
184+
185+
Buffer sequences are not served by existing concepts. They are a new concept.
186+
187+
### What the Standard Needs
188+
- A read-only byte region type (`void const*` + `size_t`)
- A writable byte region type (`void*` + `size_t`)
- Concepts for sequences of read-only and writable byte regions
- Algorithms: total byte count, byte-granular slicing, copy between buffer sequences
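
Two of the listed algorithms can be sketched over plain structs. Everything named here (`const_region`, `mutable_region`, `total_size`, `copy_bytes`) is a hypothetical stand-in chosen for illustration, and the sequences are fixed to `std::vector` rather than constrained by a concept.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

struct const_region   { void const* data; std::size_t size; };  // hypothetical
struct mutable_region { void*       data; std::size_t size; };  // hypothetical

// Total byte count of a sequence.
inline std::size_t total_size(std::vector<const_region> const& seq)
{
    std::size_t n = 0;
    for (auto const& r : seq) n += r.size;
    return n;
}

// Copy bytes from src to dst across region boundaries; returns bytes copied.
inline std::size_t copy_bytes(std::vector<mutable_region> const& dst,
                              std::vector<const_region> const& src)
{
    std::size_t copied = 0, di = 0, doff = 0, si = 0, soff = 0;
    while (di < dst.size() && si < src.size()) {
        std::size_t const n =
            std::min(dst[di].size - doff, src[si].size - soff);
        if (n != 0)
            std::memcpy(static_cast<unsigned char*>(dst[di].data) + doff,
                        static_cast<unsigned char const*>(src[si].data) + soff,
                        n);
        copied += n; doff += n; soff += n;
        if (doff == dst[di].size) { ++di; doff = 0; }  // destination region full
        if (soff == src[si].size) { ++si; soff = 0; }  // source region drained
    }
    return copied;
}
```

The copy stops when either sequence is exhausted, so the return value is the smaller of the two total sizes.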
193+
194+
The types already exist:
195+
```cpp
#include <cstddef> // std::size_t

// Writable byte region: void* + size.
class mutable_buffer {
    unsigned char* p_ = nullptr;
    std::size_t n_ = 0;
public:
    mutable_buffer() = default;
    mutable_buffer(mutable_buffer const&) = default;
    mutable_buffer& operator=(mutable_buffer const&) = default;
    constexpr mutable_buffer(void* data, std::size_t size) noexcept
        : p_(static_cast<unsigned char*>(data)), n_(size) { }
    constexpr void* data() const noexcept { return p_; }
    constexpr std::size_t size() const noexcept { return n_; }
};

// Read-only byte region: void const* + size.
class const_buffer {
    unsigned char const* p_ = nullptr;
    std::size_t n_ = 0;
public:
    const_buffer() = default;
    const_buffer(const_buffer const&) = default;
    const_buffer& operator=(const_buffer const&) = default;
    constexpr const_buffer(void const* data, std::size_t size) noexcept
        : p_(static_cast<unsigned char const*>(data)), n_(size) { }
    // A writable region converts implicitly to a read-only one.
    constexpr const_buffer(mutable_buffer const& b) noexcept
        : p_(static_cast<unsigned char const*>(b.data())), n_(b.size()) { }
    constexpr void const* data() const noexcept { return p_; }
    constexpr std::size_t size() const noexcept { return n_; }
};
```
225+
226+
These are the [Networking TS](https://wg21.link/n4771)<sup>[9]</sup> types.
227+
228+
## 8. Side by Side
229+
| Task                  | `span<byte>`                                         | `mutable_buffer`                     |
| --------------------- | ---------------------------------------------------- | ------------------------------------ |
| Construct from vector | `span<byte>{reinterpret_cast<byte*>(v.data()), ...}` | `mutable_buffer{v.data(), v.size()}` |
| Consume N bytes       | `buf = span<byte>{buf.data() + n, buf.size() - n}`   | `buf += n`                           |
| Detect dangling       | Requires ABI Break                                   | *see-below*                          |
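
The `buf += n` entry presumes a consuming `operator+=`. A minimal sketch of such an operator follows, on a hypothetical type named `byte_region`; it is assumed here to clamp at the end of the region, as Asio's operator does, so that over-consuming yields an empty buffer rather than undefined behavior.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical minimal buffer type with a consuming operator+=.
class byte_region {
    void* p_ = nullptr;
    std::size_t n_ = 0;
public:
    byte_region() = default;
    byte_region(void* p, std::size_t n) noexcept : p_(p), n_(n) { }
    void* data() const noexcept { return p_; }
    std::size_t size() const noexcept { return n_; }

    // Advance past the first n bytes, clamping at the end of the region.
    byte_region& operator+=(std::size_t n) noexcept
    {
        std::size_t const k = n < n_ ? n : n_;
        p_ = static_cast<unsigned char*>(p_) + k;
        n_ -= k;
        return *this;
    }
};
```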
235+
236+
Safety feature:
237+
```cpp
class mutable_buffer {
    void* p_ = nullptr;
    size_t n_ = 0;
    void(*check_)() = nullptr;
public:
    void* data() const { if(check_) check_(); return p_; }
    size_t size() const noexcept { return n_; }
};
```
248+
249+
Smaller to write, safer to use, open to diagnostics.
250+
251+
## 9. But
252+
253+
### But this is standardizing Asio's types
254+
255+
Yes. They earn their keep.
256+
257+
### But `vector<span<byte>>` is enough
258+
259+
Users opt out of types which do not let them opt out of allocations.
260+
261+
### But `mdspan` is enough
262+
263+
Buffer sequences only need one dimension. [`mdspan`](https://eel.is/c++draft/mdspan.overview)<sup>[5]</sup> provides several.
264+
265+
### But `span<void>` is enough
266+
267+
Even if `span<void>` were possible, what remains after removing the impossible is `data()` and `size()`. That is just a less-capable `mutable_buffer`.
268+
269+
### But `span<byte>` is enough
270+
271+
`span<byte>` is also a less-capable `mutable_buffer`. It is `span<void>` with added harm.
272+
273+
## Suggested Straw Poll
274+
275+
> LEWG agrees that a contiguous byte region descriptor for I/O should be a dedicated type, not `span<byte>`.
276+
277+
---
278+
279+
## Acknowledgements
280+
281+
The buffer model described here draws on twenty years of Asio's buffer sequence abstractions, due to Chris Kohlhoff.
282+
283+
---
284+
285+
## References
286+
1. [Boost.Asio](https://www.boost.org/doc/libs/release/doc/html/boost_asio.html) - Buffer types and buffer sequence requirements (Chris Kohlhoff). https://www.boost.org/doc/libs/release/doc/html/boost_asio.html
2. [libuv](https://docs.libuv.org/en/v1.x/) - `uv_buf_t` buffer type. https://docs.libuv.org/en/v1.x/
3. [Go standard library](https://pkg.go.dev/) - `net.Buffers` (https://pkg.go.dev/net#Buffers). https://pkg.go.dev/
4. [.NET System.IO.Pipelines](https://learn.microsoft.com/en-us/dotnet/api/system.io.pipelines) - `ReadOnlySequence<T>`. https://learn.microsoft.com/en-us/dotnet/api/system.io.pipelines
5. [C++ Working Draft](https://eel.is/c++draft/) - `span` ([span.overview](https://eel.is/c++draft/span.overview)), `mdspan` ([mdspan.overview](https://eel.is/c++draft/mdspan.overview)), ranges ([ranges](https://eel.is/c++draft/ranges)). https://eel.is/c++draft/
6. [P0298R3](https://wg21.link/p0298r3) - A byte type definition (Neil MacIntosh). https://wg21.link/p0298r3
7. [Boost.Beast](https://github.com/boostorg/beast) - HTTP and WebSocket built on Boost.Asio (Vinnie Falco). https://github.com/boostorg/beast
8. [P4003R0](https://wg21.link/p4003r0) - (Vinnie Falco). https://wg21.link/p4003r0
9. [N4771](https://wg21.link/n4771) - Working Draft, C++ Extensions for Networking. https://wg21.link/n4771
10. [C++ Alliance](https://github.com/cppalliance) - Capy, Corosio, Http, Beast2, Burl (Vinnie Falco). https://github.com/cppalliance
