Floor zero negotiated I/O sizes + tighten read decoder bounds (v0.5.5)#24
Open
lukekim wants to merge 2 commits into
Open
Floor zero negotiated I/O sizes + tighten read decoder bounds (v0.5.5)#24lukekim wants to merge 2 commits into
lukekim wants to merge 2 commits into
Conversation
Addresses the two unresolved Copilot threads from #23, which auto-merge landed before they could be applied. Floor zero values from the SMB negotiate response. If a server (or a config edge case) returns max_read_size, max_write_size, or max_transact_size = 0, every downstream pipeline call does `remaining.div_ceil(0)` and panics the request task. The panic shows up to clients as "connection refused" on subsequent requests since spiceio's HTTP server is in the same process — exactly the failure mode behind the reported sccache flake against a long-running spiceio. negotiate_and_auth now substitutes the configured io_cap (default 64 KiB) for any zero, and logs at error level so the bug is visible. Two further safety nets: - pipelined_read returns an InvalidInput error if called with chunk_size = 0, so any future caller that skipped the floor still can't reach the div_ceil panic. - handle_get_object short-circuits to a 500 InternalError if handle.max_chunk somehow regressed to 0, instead of spawning a task that panics on the first read_pipeline call. Tighten decode_read_response_from_msg. The earlier rejection of `data_offset < SMB2_HEADER_SIZE` only ruled out offsets pointing at the SMB2 header — but the read response has 16 bytes of fixed fields (StructureSize/DataOffset/Reserved/DataLength/DataRemaining/Flags) before the Buffer. A malformed offset of e.g. SMB2_HEADER_SIZE + 4 would slice into DataLength and return those bytes as the file payload. The decoder now rejects anything before SMB2_HEADER_SIZE + 16 (a new exported READ_RESPONSE_FIXED_PART constant). Two new unit tests cover the response-fixed-fields band and the minimum-valid-offset boundary. Bumps version to v0.5.5.
There was a problem hiding this comment.
Pull request overview
This PR hardens SMB I/O negotiation and read-response decoding to avoid divide-by-zero panics and malformed read payload slicing, while bumping the crate version to v0.5.5.
Changes:
- Floors zero negotiated SMB I/O sizes to the configured/default I/O cap.
- Adds defensive zero-chunk handling for GetObject and pipelined reads.
- Tightens
decode_read_response_from_msgbounds and adds decoder boundary tests.
Reviewed changes
Copilot reviewed 4 out of 5 changed files in this pull request and generated 3 comments.
Show a summary per file
| File | Description |
|---|---|
src/smb/protocol.rs |
Adds fixed-field read response offset validation and related tests. |
src/smb/client.rs |
Floors zero negotiated I/O sizes and adds a zero chunk-size guard. |
src/s3/router.rs |
Returns an internal error before streaming if GetObject chunk size is zero. |
Cargo.toml |
Bumps package version to 0.5.5. |
Cargo.lock |
Updates locked package version to 0.5.5. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
- Move zero-chunk guard upstream of the div_ceil panic site. FileHandle::read_pipeline now returns InvalidInput before its own remaining.div_ceil(chunk_size as u64) call (the prior guard inside SmbClient::pipelined_read was unreachable from the high-level path). Same guard added to assemble_parts before its max_read div_ceil. - Extract the negotiate-flooring logic into a tested free function effective_io_sizes(neg_max_read, neg_max_write, neg_max_transact, io_cap, on_zero) -> EffectiveIoSizes. Six new unit tests cover the typical case, compound cap behavior, each individual zero field (with on_zero-callback verification), the all-zeros case, and the transact clamp. - Apply the SMB2_HEADER_SIZE + READ_RESPONSE_FIXED_PART minimum offset to decode_read_response (used by create_read_close compound) and decode_read_response_owned (used by SmbClient::read), replacing their prior saturating_sub-style accept. Three new tests for the rejection and acceptance boundaries on those paths.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Addresses the two Copilot threads from #23 that auto-merge landed before they could be applied, plus the most plausible root cause for the sccache → spiceio "Connection refused" flake reported against v0.5.4.
1. Floor zero values from the SMB negotiate response
If a server returns
max_read_size,max_write_size, ormax_transact_size = 0(a buggy or unusual server, or a config edge case),self.max_read_size = neg_resp.max_read_size.min(transact).min(io_cap)propagates the zero, and every downstream pipeline call doesremaining.div_ceil(0)— which panics. Because spiceio's HTTP server and SMB workers share the same process, that panic kills the listener; subsequent sccache compiles seeConnection refused (os error 61)until the supervisor (if any) restarts the binary. That matches the failure pattern in the reported run: thousands of cache hits, then a sudden mid-build wall.negotiate_and_authnow substitutes the configuredio_cap(default 64 KiB) for any negotiated zero and emits aserr!so the situation is visible in logs.2. Two safety nets so a future bug can't regress the divide-by-zero
pipelined_readreturns anio::Error::InvalidInputif called withchunk_size = 0, instead of panicking ondiv_ceil.handle_get_objectshort-circuits to 500 InternalError ifhandle.max_chunksomehow regresses to 0, instead of spawning a task that panics on its firstread_pipelinecall.3. Tighten
decode_read_response_from_msgto reject offsets inside the read-response fixed fieldsThe previous
data_offset < SMB2_HEADER_SIZErejection only ruled out offsets pointing at the SMB2 header. But the read response has 16 bytes of fixed fields (StructureSize/DataOffset/Reserved/DataLength/DataRemaining/Flags) before theBufferfield. A malformeddata_offsetof e.g.SMB2_HEADER_SIZE + 4would slice intoDataLengthand return those bytes as the file payload — wrong content but no detection.The decoder now rejects anything before
SMB2_HEADER_SIZE + READ_RESPONSE_FIXED_PART(a new exportedREAD_RESPONSE_FIXED_PART = 16constant inprotocol.rs). Two new unit tests cover the response-fixed-fields rejection band (+4and+15) and confirm the minimum-valid-offset boundary (SMB2_HEADER_SIZE + 16) is still accepted.4. Bumps version to v0.5.5.
Test plan
make lint— fmt, clippy strict, rustdoc all cleancargo test --locked --lib— 157/157 pass (was 155 on trunk; +2 for the new decoder boundary tests)./target/release/spiceio --version→spiceio 0.5.5serr!line is the new diagnostic if any negotiated value comes back zero. With the floor in place the process should no longer die mid-build.[spiceio] error:lines.Notes
This is purely a defensive / correctness fix — no perf changes, no API surface changes beyond the new
READ_RESPONSE_FIXED_PARTconst which is exported for symmetry withSMB2_HEADER_SIZE. The hot path is identical at runtime (the floor is a one-shot at session init; thechunk_size == 0guards are dead branches in normal operation).Auto-merge note: I've intentionally not enabled auto-merge on this PR — the prior cascade (#22 merged before review, #23 merged before review) was at least partly caused by auto-merge firing on green CI before the Copilot pass posted. Worth waiting for Copilot to weigh in here before merging.