Add SMB connection pool, pipelined reads, and TCP tuning by lukekim · Pull Request #11 · spiceai/spiceio

lukekim · 2026-04-02T20:32:29Z

Summary

Connection pool: N authenticated SMB TCP connections (default 4, SPICEIO_SMB_CONNECTIONS), round-robin dispatched. Concurrent S3 requests fan out across connections instead of serializing on a single mutex — the S3-compatible API proxy over SMB 3.1.1 #1 bottleneck under concurrency.
Pipelined reads: Streaming GetObject sends batches of 8 read requests before collecting responses, filling the network pipe and hiding per-request RTT.
TCP socket tuning: SO_SNDBUF/SO_RCVBUF enlarged to 1 MB on SMB sockets via raw setsockopt FFI (no libc crate).
Configurable I/O cap: SPICEIO_SMB_MAX_IO (default 64 KB) lets users raise standalone read/write size for compliant servers. Compound operations stay at 64 KB. MaxTransactSize now parsed from negotiate response.

Test plan

102 unit tests pass
make lint clean (fmt + check + clippy + rustdoc)
S3 API integration: 14/14 assertions pass (ListBuckets, HeadBucket, Put/Get small/64KB/1MB, HeadObject, ListObjectsV2, CopyObject, DeleteObject, nested paths, overwrite)
sccache integration (self-build): 44 cache hits, 97.78% hit rate, 0 write errors
sccache extended (spiceai build): 1070 cache hits, 97.90% hit rate, 0 write errors, 50% warm build speedup
Log analysis: zero unexpected errors across 1195 log lines; all SMB errors are expected cache-miss lookups (0xC0000034/0xC000003A)

Eliminate the single-TCP-connection serialization bottleneck by pooling N authenticated SMB sessions (default 4, configurable via SPICEIO_SMB_CONNECTIONS). Concurrent S3 requests now fan out across connections via round-robin instead of queuing behind one mutex. Streaming GetObject sends batches of 8 read requests before collecting responses (pipelined reads), hiding per-request round-trip latency. Additional changes: - Enlarge SO_SNDBUF/SO_RCVBUF to 1 MB on SMB TCP sockets - Add configurable standalone I/O cap via SPICEIO_SMB_MAX_IO (default 64 KB; raisable for servers that handle larger I/O) - Parse MaxTransactSize from negotiate response - Separate compound-safe sizes (64 KB) from standalone I/O sizes - New smb::pool module with SmbPool round-robin dispatcher - ShareSession backed by pool; FileHandle pinned to opening connection

Copilot

Pull request overview

This PR improves SMB-backed S3 throughput by adding connection fan-out and read pipelining, plus tuning negotiated/requested I/O sizes and TCP socket buffers.

Changes:

Add an SmbPool to maintain multiple authenticated SMB TCP connections and dispatch operations round-robin.
Add pipelined SMB reads for streaming GetObject to reduce RTT impact.
Parse MaxTransactSize, add SPICEIO_SMB_MAX_IO cap, and apply TCP buffer tuning on SMB sockets.

Reviewed changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated 4 comments.

Show a summary per file

File	Description
src/smb/protocol.rs	Parses `max_transact_size` from negotiate response for downstream I/O sizing.
src/smb/pool.rs	New SMB connection pool that creates N authenticated sessions and round-robins work.
src/smb/ops.rs	Migrates share/session ops to pool-based dispatch; adds `FileHandle::read_pipeline`.
src/smb/mod.rs	Exposes the new `pool` module and re-exports `SmbPool`.
src/smb/client.rs	Adds socket tuning, configurable I/O caps, and implements `pipelined_read`.
src/s3/router.rs	Uses compound caps for fast path and switches streaming reads to pipelined batches.
src/main.rs	Adds env-config for pool size and max I/O cap; wires up pool-based share session.
CLAUDE.md	Documents the new environment variables and architecture changes.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copilot · 2026-04-02T20:38:42Z

+            let fd = stream.as_raw_fd();
+            let buf_size: i32 = 1024 * 1024;
+            let ptr = std::ptr::from_ref(&buf_size).cast();
+            let len = size_of::<i32>() as u32;


size_of isn’t in scope here, so this won’t compile. Import it (e.g., use std::mem::size_of;) or fully-qualify the call (std::mem::size_of::<i32>()).

Suggested change

let len = size_of::<i32>() as u32;

let len = std::mem::size_of::<i32>() as u32;

Copilot · 2026-04-02T20:38:43Z

+                let body = msg[SMB2_HEADER_SIZE..].to_vec();
+                let data = decode_read_response_owned(body).ok_or_else(|| {
+                    io::Error::new(io::ErrorKind::InvalidData, "invalid read response")
+                })?;
+                results.push(data);


pipelined_read assumes responses arrive in the same order requests were sent (it just pushes each chunk). SMB2 responses can be returned out-of-order, so this can reorder/corrupt the byte stream (and can also prematurely stop if an EOF response for a later offset arrives early). Track message_id→request index/offset when sending and place each decoded chunk into the correct slot before returning (or otherwise reorder by offset).

Copilot · 2026-04-02T20:38:43Z

+        'outer: while offset < stream_end {
+            let remaining = stream_end - offset;
+            match handle.read_pipeline(offset, chunk_size, remaining).await {


This loop can send more bytes than requested for HTTP Range requests: read_pipeline always requests chunk_size bytes per SMB read, even when remaining < chunk_size, and the returned chunk is forwarded as-is. For non-EOF ranges, SMB will happily return the full chunk_size, causing the response body to exceed Content-Length/Content-Range. Ensure the last request in a batch uses min(chunk_size, remaining) (and similarly for later requests in the batch) and/or truncate the final chunk(s) to remaining before sending.

Copilot · 2026-04-02T20:38:43Z

    region: String,
+    /// Number of SMB TCP connections in the pool (default 4)
+    smb_connections: usize,
+    /// Max I/O size for standalone read/write operations (default 1MB)


The field comment says the default max I/O is 1MB, but SPICEIO_SMB_MAX_IO defaults to 0 here (which the SMB client interprets as the internal default of 64KB). Update the comment to match the actual default behavior to avoid confusion.

Suggested change

/// Max I/O size for standalone read/write operations (default 1MB)

/// Max I/O size for standalone read/write operations (default 0, meaning the SMB client uses its internal default of 64KB)

Copilot AI review requested due to automatic review settings April 2, 2026 20:32

Copilot started reviewing on behalf of lukekim April 2, 2026 20:33 View session

lukekim merged commit 13a48f0 into trunk Apr 2, 2026
6 checks passed

lukekim deleted the perf/connection-pool-pipelining branch April 2, 2026 20:37

Copilot AI reviewed Apr 2, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add SMB connection pool, pipelined reads, and TCP tuning#11

Add SMB connection pool, pipelined reads, and TCP tuning#11
lukekim merged 1 commit into
trunkfrom
perf/connection-pool-pipelining

lukekim commented Apr 2, 2026

Uh oh!

Uh oh!

Copilot AI left a comment

Uh oh!

Copilot AI Apr 2, 2026

Uh oh!

Copilot AI Apr 2, 2026

Uh oh!

Copilot AI Apr 2, 2026

Uh oh!

Copilot AI Apr 2, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

	let len = size_of::<i32>() as u32;
	let len = std::mem::size_of::<i32>() as u32;

	/// Max I/O size for standalone read/write operations (default 1MB)
	/// Max I/O size for standalone read/write operations (default 0, meaning the SMB client uses its internal default of 64KB)

Conversation

lukekim commented Apr 2, 2026

Summary

Test plan

Uh oh!

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Copilot AI Apr 2, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 2, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 2, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 2, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants