Phase 2: Security Hardening#8
Conversation
…ction - TestDecodeVersionMismatch: version 99 and version 0 should be rejected - TestDecodeLegacyRejected: unrecognized magic bytes should be rejected - TestSentinelErrors: errors.Is() works through wrapping - TestDecodeRoundTripRegression: valid encode/decode still works
- Add ErrVersionMismatch and ErrInvalidFormat sentinel errors - Validate Header.Version == Version in readHeader - Remove legacy fallback branch that attempted zstd decompression without magic - Unrecognized magic bytes now return ErrInvalidFormat
- TestDecodeBoundsRGSet, TestDecodeBoundsPathDirectory - TestDecodeBoundsStringIndexes, TestDecodeBoundsTrigramIndexes - TestDecodeBoundsDocIDMapping, TestDecodeBoundsBloomFilter - TestDecodeBoundsHLLRegisters, TestDecodeBoundsNumericRGs - TestDecodeBoundsStringLengthRGs, TestDecodeCraftedPayload
- Add maxRGSetSize, maxNumPaths, maxTermsPerPath, maxTrigramsPerPath, maxBloomWords, maxHLLRegisters constants - Guard readRGSet, readPathDirectory, readBloomFilter, readStringIndexes, readStringLengthIndexes, readNumericIndexes, readNullIndexes, readTrigramIndexes, readHyperLogLogs, readDocIDMapping - Thread maxRGs from Header.NumRowGroups into readNumericIndexes and readStringLengthIndexes; thread maxDocs into readDocIDMapping - All bounds violations return ErrInvalidFormat with actual/max values
|
@claude PTAL |
|
Claude finished @tazarov's task in 1m 57s —— View job Code Review: Phase 2 Security HardeningTasks:
Overall the PR is solid and covers the stated requirements well. The TDD approach is evident and the bounds constants are clearly justified. A few issues worth addressing before merging: Issues1. Inner magic bytes check doesn't wrap
|
- Wrap ErrInvalidFormat in "data too short" and inner magic checks - Add numRGs bounds check to readRGSet using header-derived max - Deduplicate TestDecodeCraftedPayload into TestDecodeCraftedInnerMagic - Add TestDecodeBoundsRGSetNumRGs for new bounds check
|
Claude finished @tazarov's task in 1m 32s —— View job Review Follow-up CompleteTasks:
What was foundThe two blocking issues from the previous review were already addressed in the current commits:
The only remaining polish item was the missing GoDoc on the exported sentinels, which I've now added: // ErrVersionMismatch is returned by Decode when the binary format version
// does not match the expected version (Version constant).
ErrVersionMismatch = errors.New("version mismatch")
// ErrInvalidFormat is returned by Decode when the binary data is structurally
// invalid: unrecognized magic bytes, oversized allocations, or corrupt fields.
ErrInvalidFormat = errors.New("invalid format")The two non-blocking observations from the previous review (near-duplicate tests, |
…entinels Co-authored-by: Trayan Azarov <tazarov@users.noreply.github.com>
|
Claude finished @tazarov's task in 2m 6s —— View job Code Review: Phase 2 Security Hardening (Final State)Tasks:
The PR is in excellent shape. All previous blocking issues have been resolved, and the implementation is clean and correct. Here's what I verified: All Previous Issues Addressed
Allocation Site ReviewAll 10 stream-controlled allocation sites are correctly bounded:
The choice of Test CoverageThe near-duplicate test concern from the previous review was resolved cleanly: One Minor Observation (non-blocking)In SummaryThis PR is merge-ready. The security hardening is thorough, the sentinel error contract is consistent throughout all code paths, the bounds constants are clearly justified, and the test suite directly validates each constraint. The TDD approach and clean commit sequence make the changes easy to audit. |
Summary
Phase 2: Security Hardening
Goal: The deserialization path is safe to expose to untrusted inputs
Status: Verified (5/5 must-haves)
Hardens all deserialization paths in
Decode()against crafted/corrupt.ginfiles. Adds bounds checks to all 10 stream-controlled allocation sites, introducesErrVersionMismatchandErrInvalidFormatsentinel errors, rejects unknown binary format versions, and removes the legacy fallback branch that silently re-attempted zstd decompression.Changes
Plan 02-01: Deserialization Security Hardening
Bounds checks on all 10 stream-controlled allocation sites, version validation with sentinel errors, legacy fallback removed.
Key files:
serialize_security_test.go— 14 new security-focused testsserialize.go— 7 bounds constants, 2 sentinel errors, version validation, bounds checks in all read functions, legacy fallback removalTask commits (TDD):
8a1c0a9test: failing tests for version validation and legacy rejection0221d5ffix: sentinel errors, version check, legacy removal43fa9aatest: failing tests for bounds checks on all allocation sites5d93c6efix: constants and guards on all sitesRequirements Addressed
Decode()— reject unknown versionsVerification
Key Decisions
readRGSetbounds instead of parameter threading — avoids signature changes to 3+ callersreadDocIDMapping,readNumericIndexes,readStringLengthIndexes— structural anchors existErrInvalidFormatwith actual/max values for debugging