Introduce FD-based code-signature verifier#942
Conversation
…fier
A new subpackage that verifies a Mach-O slice's code signature against
expected (cdhash, signing_id, team_id) using only a borrowed file
descriptor. Designed for use on the ES_AUTH_EXEC hot path. Streams the
file once: full-file SHA-256, page-hash validation against the picked
CodeDirectory, and cdhash compute share a single pass — every byte read
from disk is observed at most once per Run() call.
Public surface in Source/common/verifyinghasher/VerifyingHasher.h:
VerifyingHasher::Run(fd, cputype, cpusubtype, Expected) -> Result
Status ::= { kError, kNoMatch, kMatchCDHash, kMatchSidTidDrift }
Result ::= { Status, std::optional<std::array<uint8_t, 32>> sha256 }
Expected ::= { std::span<const uint8_t> cdhash,
std::string_view signing_id,
std::string_view team_id }
Exact cdhash equality returns kMatchCDHash. A (signing_id, team_id)
drift fallback returns kMatchSidTidDrift only when both Expected.team_id
and the parsed CD's team_id are non-empty (ad-hoc binaries are not
eligible). The strongest available CodeDirectory by hashType is picked
(SHA-384 > SHA-256 > SHA-256-truncated > SHA-1), mirroring xnu's
hashPriorities in bsd/kern/ubc_subr.c.
Internal layers (package-private; visibility encapsulated by BUILD):
FileReader / FdFileReader pread-only positional file access
HeaderParser fat / mach_header / LC_CODE_SIGNATURE
CodeSignatureParser SuperBlob / CodeDirectory / cdhash
HashTraits Sha{1,256,256Truncated,384}Traits
PageVerifier per-page hash check + gap detection
VerifyingHasherCore phase orchestration + full-file digest
UninitBuffer uninit-on-allocate byte buffer
Structural caps and validations track xnu where it has its own bound or
runtime check (LC_CODE_SIGNATURE.datasize, sizeofcmds, CD pageSize,
scatter rejection, the nCodeSlots-vs-codeLimit asymmetry).
Tests under Source/common/verifyinghasher/:
- Eight unit-test targets, one per layer plus end-to-end.
- The CountingMemoryFileReader oracle asserts MaxReadsAnyByte <= 1
across default, small-buffer, and post-malformed-CS paths.
- hw_universal: ad-hoc multi-CD fat fixture (SHA-1/256/256t/384).
- hw_team_signed: team-signed counterpart for the drift path.
Tests under Testing/:
- //Testing/OneOffs:verifying_hasher_smoke runs the public facade
against /usr/bin/yes.
- //Testing/Fuzzing:VerifyingHasherFuzzer and HeaderParserFuzzer with
seed corpora; the end-to-end fuzzer additionally aborts on
MaxReadsAnyByte > 1. libFuzzer + ASan wired via --config=fuzz in
.bazelrc and the objc_fuzz_test macro in Testing/Fuzzing/fuzzing.bzl.
Out of scope for this PR (separate follow-ups):
- santad integration (SNTExecutionController, SNTPolicyProcessor)
- handling for unsigned binaries (LC_CODE_SIGNATURE absent → kError;
fallback hashing handled by the consuming caller)
- SNTFileInfo / MOLCodesignChecker edits
- configuration keys, telemetry, rollout flags
- CS_KILL/CS_HARD page-hash skip optimization
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 WalkthroughWalkthroughAdds a streaming Mach-O header parser, CodeDirectory parser, per-page verifier, a multi-phase VerifyingHasher core with FD API, comprehensive unit tests, Bazel fuzzing targets/tooling and corpora, a CLI verifier, and supporting build/config updates. ChangesVerifyingHasher Core System
Fuzzing Infrastructure
CLI Utilities
Sequence Diagram(s)sequenceDiagram
participant CLI as VerifyingHasher CLI
participant Reader as FileReader / FdFileReader
participant Core as VerifyingHasherCore
participant Header as HeaderParser
participant CsParser as CodeSignatureParser
participant PageVerifier as PageVerifierT
CLI->>Reader: open file, construct reader
CLI->>Core: Run(reader, ArchSelector)
Core->>Header: stream bytes via Update()
Header-->>Core: slice info, cs_blob_offset/size
Core->>Reader: pread CS blob region (overlap-aware)
Core->>CsParser: ParseCodeSignature(blob)
CsParser-->>Core: ParsedCodeDirectory (slot_hashes, cdhash)
Core->>PageVerifier: instantiate with slot_hashes/page_size
Core->>Reader: stream signed-region bytes -> PageVerifier & SHA256
PageVerifier-->>Core: mismatches / complete
Core-->>CLI: Result (status, optional sha256, cdhash, mismatches)
Estimated code review effort🎯 4 (Complex) | ⏱️ ~75 minutes Possibly related PRs
✨ Finishing Touches🧪 Generate unit tests (beta)
|
There was a problem hiding this comment.
Actionable comments posted: 7
🧹 Nitpick comments (2)
Source/common/verifyinghasher/HashTraits.h (1)
51-64: 💤 Low valueConsider avoiding leading underscores in macro-local variable names.
The
_p,_n,_step_sz,_step,_rcidentifiers use leading underscores. While technically allowed in block scope, the C++ standard reserves identifiers beginning with underscore for implementation use in global namespace, which can cause confusion. A common macro hygiene pattern uses a project-specific prefix instead (e.g.,vh_p_orcc_loop_p_).This is a minor style point; the logic itself correctly handles inputs exceeding
CC_LONGcapacity.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@Source/common/verifyinghasher/HashTraits.h` around lines 51 - 64, The macro VERIFYINGHASHER_CC_UPDATE_LOOP uses macro-local identifiers with leading underscores (_p, _n, _step_sz, _step, _rc); rename them to non-reserved, project-prefixed names (e.g., vh_p, vh_n, vh_step_sz, vh_step, vh_rc or cc_loop_p, cc_loop_n, etc.) throughout the macro body so no global-reserved identifiers are introduced, preserving all existing casts, loop logic, CC_LONG conversion, and return semantics in the cc_update_fn((ctx), ...) invocation.Source/common/verifyinghasher/CodeSignatureParser.h (1)
29-29: 💤 Low valueUnused include:
<vector>appears unnecessary in this header.The
ParsedCodeDirectorystruct andParseCodeSignaturedeclaration don't usestd::vector. If it's only needed in the implementation, consider moving it to the.mmfile.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@Source/common/verifyinghasher/CodeSignatureParser.h` at line 29, The header unnecessarily includes <vector>; remove that include from CodeSignatureParser.h and keep the declarations for ParsedCodeDirectory and ParseCodeSignature unchanged; if any implementation files need std::vector, add `#include` <vector> in the corresponding .mm (implementation) file (e.g., where ParseCodeSignature is defined) so only the implementation, not the header, depends on <vector>.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@Source/common/verifyinghasher/CodeSignatureParserTest.mm`:
- Around line 61-108: When scanning a FAT binary you never check whether a
matching architecture was found, so slice_off/slice_size can remain 0 and the
code falls through to parse invalid data; after the fat branch (i.e. after the
loop over archs when magic == FAT_MAGIC || magic == FAT_CIGAM) add a fast-fail:
if the file was FAT (magic == FAT_MAGIC || magic == FAT_CIGAM) and no matching
slice was selected (slice_size == 0 or slice_off == 0) then close(fd) and return
{} immediately; reference the existing variables magic, FAT_MAGIC/FAT_CIGAM,
slice_off and slice_size to implement this check so subsequent pread/mach-header
parsing only runs when a valid slice was found.
In `@Source/common/verifyinghasher/CountingMemoryFileReader.h`:
- Around line 36-38: The Pread override in CountingMemoryFileReader returns -1
for negative off but doesn't set errno; update Pread (method name Pread in class
CountingMemoryFileReader) to set errno = EINVAL before returning -1 for off < 0
to satisfy the FileReader error contract, and ensure any required header for
errno (e.g., <errno.h>) is available in the translation unit.
In `@Source/common/verifyinghasher/MemoryFileReader.h`:
- Around line 33-39: Pread in MemoryFileReader returns -1 on synthetic/invalid
paths but doesn't set errno; update MemoryFileReader::Pread so before returning
-1 it sets errno appropriately (e.g. set errno = EIO when failing the synthetic
failure path controlled by fail_next_, and set errno = EINVAL when off < 0) so
it matches the FileReader/FdFileReader contract and callers can inspect errno;
locate the Pread method and add the errno assignments just prior to each return
-1.
In `@Source/common/verifyinghasher/VerifyingHasherCore.mm`:
- Around line 210-225: The tail phase loop in VerifyingHasherCore.mm currently
treats reader_.Pread(...) == 0 as a normal break and proceeds to finalize
full_digest_, which yields a digest for a truncated file; instead, when cursor_
< total and Pread returns 0 treat it as an I/O error: set last_error_ (e.g.,
"pread returned 0 before EOF in tail phase") and return Status::kIoError. Update
the loop that calls reader_.Pread(...) and the Pread==0 branch so it mirrors
earlier phases’ handling of premature EOF (use cursor_, total, last_error_, and
Status::kIoError to locate the change).
In `@Testing/Fuzzing/install_libclang_fuzzer.sh`:
- Around line 49-55: The current CLANG_VERSION/DST_PATH derivation is brittle;
update the logic in this script to first try deriving the resource dir via
`clang -print-resource-dir` and check for the existence of
libclang_rt.fuzzer_osx.a under that path (reference DST_PATH and the `clang
-print-resource-dir` call), and if the file is missing fall back to a more
robust Apple-Clang version parse (use grep/awk to extract the Apple Clang
version into CLANG_VERSION rather than cut) and reconstruct DST_PATH under
XcodeDefault.xctoolchain; always verify the final DST_PATH file exists and exit
with an explicit error if neither method finds the fuzzer runtime.
In `@Testing/Fuzzing/README.md`:
- Around line 31-33: The README has unlabeled fenced code blocks causing MD040
failures; update each fence in Testing/Fuzzing/README.md to include a language
tag: use ```bash for shell snippets (the
./Testing/Fuzzing/install_libclang_fuzzer.sh line, the bazel test --config=fuzz
block, the bazel run --config=fuzz blocks, and
./Testing/Fuzzing/regenerate_corpus.sh) and use ```starlark for the
objc_fuzz_test example (the MyNewFuzzer snippet); ensure each opening fence is
changed accordingly so all unlabeled fences at the reported locations are
labeled.
In `@Testing/OneOffs/verifying_hasher_smoke.sh`:
- Around line 67-76: The hard-coded tamper offset (49152 + 1024) in
verifying_hasher_smoke.sh can end up past EOF or outside the signed region;
modify the script to derive the tamper offset from the verifier output (e.g.,
parse codeLimit or another signed-region boundary emitted by the verifier) and
choose an offset well inside that limit and within the file size before flipping
a byte; update the Python one-liner (or replace it with a small shell+python
helper) to take the computed offset (using the existing $TAMP file and $TAMP
size) and flip a byte there, ensuring you validate the offset is < file size and
< codeLimit to avoid flakes or false negatives.
---
Nitpick comments:
In `@Source/common/verifyinghasher/CodeSignatureParser.h`:
- Line 29: The header unnecessarily includes <vector>; remove that include from
CodeSignatureParser.h and keep the declarations for ParsedCodeDirectory and
ParseCodeSignature unchanged; if any implementation files need std::vector, add
`#include` <vector> in the corresponding .mm (implementation) file (e.g., where
ParseCodeSignature is defined) so only the implementation, not the header,
depends on <vector>.
In `@Source/common/verifyinghasher/HashTraits.h`:
- Around line 51-64: The macro VERIFYINGHASHER_CC_UPDATE_LOOP uses macro-local
identifiers with leading underscores (_p, _n, _step_sz, _step, _rc); rename them
to non-reserved, project-prefixed names (e.g., vh_p, vh_n, vh_step_sz, vh_step,
vh_rc or cc_loop_p, cc_loop_n, etc.) throughout the macro body so no
global-reserved identifiers are introduced, preserving all existing casts, loop
logic, CC_LONG conversion, and return semantics in the cc_update_fn((ctx), ...)
invocation.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: f2015db9-8bff-454a-b6a1-6af56567f52f
📒 Files selected for processing (46)
.bazelrcMODULE.bazelSource/common/BUILDSource/common/verifyinghasher/BUILDSource/common/verifyinghasher/CodeSignatureParser.hSource/common/verifyinghasher/CodeSignatureParser.mmSource/common/verifyinghasher/CodeSignatureParserTest.mmSource/common/verifyinghasher/CountingMemoryFileReader.hSource/common/verifyinghasher/FileReader.hSource/common/verifyinghasher/FileReader.mmSource/common/verifyinghasher/FileReaderTest.mmSource/common/verifyinghasher/HashTraits.hSource/common/verifyinghasher/HashTraitsTest.mmSource/common/verifyinghasher/HeaderParser.hSource/common/verifyinghasher/HeaderParser.mmSource/common/verifyinghasher/HeaderParserTest.mmSource/common/verifyinghasher/MemoryFileReader.hSource/common/verifyinghasher/PageVerifier.hSource/common/verifyinghasher/PageVerifierTest.mmSource/common/verifyinghasher/UninitBuffer.hSource/common/verifyinghasher/UninitBufferTest.mmSource/common/verifyinghasher/VerifyingHasher.hSource/common/verifyinghasher/VerifyingHasher.mmSource/common/verifyinghasher/VerifyingHasherCore.hSource/common/verifyinghasher/VerifyingHasherCore.mmSource/common/verifyinghasher/VerifyingHasherCoreTest.mmSource/common/verifyinghasher/VerifyingHasherTest.mmSource/common/verifyinghasher/testdata/hw_team_signedSource/common/verifyinghasher/testdata/hw_universalTesting/Fuzzing/BUILDTesting/Fuzzing/HeaderParserFuzzer.mmTesting/Fuzzing/HeaderParserFuzzer_corpus/fat32_hdrTesting/Fuzzing/HeaderParserFuzzer_corpus/fat64_synthetic_hdrTesting/Fuzzing/HeaderParserFuzzer_corpus/hw_universal_hdrTesting/Fuzzing/HeaderParserFuzzer_corpus/thin_arm64e_hdrTesting/Fuzzing/README.mdTesting/Fuzzing/VerifyingHasherFuzzer.mmTesting/Fuzzing/VerifyingHasherFuzzer_corpus/fat32Testing/Fuzzing/VerifyingHasherFuzzer_corpus/fat64_syntheticTesting/Fuzzing/VerifyingHasherFuzzer_corpus/thin_arm64eTesting/Fuzzing/fuzzing.bzlTesting/Fuzzing/install_libclang_fuzzer.shTesting/Fuzzing/regenerate_corpus.shTesting/OneOffs/BUILDTesting/OneOffs/VerifyingHasher.mmTesting/OneOffs/verifying_hasher_smoke.sh
The \`markdown-check\` job runs \`git grep -EIn $'[ \t]+$' -- ':(exclude)*.patch'\` across the working tree and fails on any trailing whitespace. A single line in \`docs/src/lib/santaconfig.ts\` has one trailing space, which currently fails CI on every open PR (e.g. #942). This commit removes that one trailing character so the gate returns green.
Bundles review-driven changes: - VerifyingHasherCore.mm phase 5 (tail) now returns kIoError when pread yields 0 bytes with cursor_ < total. The previous silent break finalized the SHA-256 over only a prefix of the file and returned kOk / kPagesMismatched, violating the "32-byte SHA-256 of the full file" contract. Matches phases 1 and 3's existing premature-EOF handling. VerifyingHasherCoreTest.mm adds a regression test using a local TruncatedMemoryFileReader that claims a Size() larger than its data to simulate mid-verification truncation. - MemoryFileReader and CountingMemoryFileReader now set errno before returning -1 (EINVAL on negative offset, EIO on MemoryFileReader's synthetic ScheduleErrorOnNextPread() path). FileReader.h's interface documents "-1 = error (errno set)"; FdFileReader satisfied this via pread() itself, but the in-memory readers left errno untouched. - CodeSignatureParserTest.mm's ExtractCsBlob fast-fails when the fat scan finds no matching architecture, rather than falling through into mach_header_64 parsing of the fat header bytes (which could worst-case allocate std::vector<uint8_t> for an arbitrary uint32). - install_libclang_fuzzer.sh derives the resource directory via `clang -print-resource-dir` instead of parsing the human-facing `clang --version` output. Also removes a latent dependency on the /clang/<full-version> -> /clang/<major> compatibility symlink, which isn't created on all Xcode installs. - HashTraits.h's VERIFYINGHASHER_CC_UPDATE_LOOP macro renames its loop-locals from _p/_n/_step_sz/_step/_rc to vh_-prefixed names. All current expansion sites are at block scope where the underscore names are permitted, but at namespace scope they're reserved; the prefix also avoids silent shadowing of caller-side identifiers. - CodeSignatureParser.h drops an unused <vector> include. The header's ParsedCodeDirectory and ParseCodeSignature declarations use std::span / std::string only.
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@Source/common/verifyinghasher/VerifyingHasherCoreTest.mm`:
- Around line 266-267: The test performs raw buffer ops on the vector returned
by Slurp (variable bytes) and currently uses non-fatal XCTAssert* checks before
indexing into bytes; change those precondition assertions to be fatal by either
returning immediately or calling XCTSkipIf(...) right after the precondition
check so the subsequent bytes[3 * bytes.size() / 4] ^= 0xFF (and other places
that index/memset/memcmp) cannot run on an invalid-sized buffer. Locate the
occurrences around Slurp/bytes in VerifyingHasherCoreTest.mm (e.g., the shown
bytes = Slurp(...) and the other ranges 354-358, 403-405, 503-505, 514-522,
551-563, 596-598) and add a guard like if (bytes.size() < expected) {
XCTSkipIf(true, "fixture missing or wrong size"); return; } or place an early
return immediately after the XCTAssert* to make the failure fatal.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: ba034f32-87dd-4443-b97b-d000a88fb872
📒 Files selected for processing (8)
Source/common/verifyinghasher/CodeSignatureParser.hSource/common/verifyinghasher/CodeSignatureParserTest.mmSource/common/verifyinghasher/CountingMemoryFileReader.hSource/common/verifyinghasher/HashTraits.hSource/common/verifyinghasher/MemoryFileReader.hSource/common/verifyinghasher/VerifyingHasherCore.mmSource/common/verifyinghasher/VerifyingHasherCoreTest.mmTesting/Fuzzing/install_libclang_fuzzer.sh
✅ Files skipped from review due to trivial changes (1)
- Source/common/verifyinghasher/CountingMemoryFileReader.h
🚧 Files skipped from review as they are similar to previous changes (5)
- Source/common/verifyinghasher/MemoryFileReader.h
- Testing/Fuzzing/install_libclang_fuzzer.sh
- Source/common/verifyinghasher/HashTraits.h
- Source/common/verifyinghasher/CodeSignatureParser.h
- Source/common/verifyinghasher/CodeSignatureParserTest.mm
Replace 20 manual `::close(fd)` calls across the verifyinghasher unit tests with `santa::ScopedFile` for RAII closure. Helpers with multiple early-return paths (CodeSignatureParserTest::ExtractCsBlob, the various Slurp helpers) no longer repeat `::close` on each error branch. testErrorOnNonMachOFile now uses `ScopedFile::CreateTemporary` instead of a fixed-name path under NSTemporaryDirectory(); the mkstemp-backed file is unlinked at creation and closed on scope exit, removing the post-failure leftover risk and the cross-run collision risk on the fixed-name path. No behavior change. //Source/common/verifyinghasher:unit_tests 8/8.
Two whitespace-only changes from `Testing/fix.sh`: * VerifyingHasherTest.mm: re-wrap two XCTAssertTrue lines added in the ScopedFile-adoption commit so they fit at the project column width. * HashTraits.h: re-align trailing backslashes in VERIFYINGHASHER_CC_UPDATE_LOOP. The vh_-prefix rename in the review- response commit shifted the column widths and the macro wasn't re- formatted at the time.
There was a problem hiding this comment.
♻️ Duplicate comments (1)
Source/common/verifyinghasher/VerifyingHasherCoreTest.mm (1)
258-264:⚠️ Potential issue | 🟡 Minor | ⚡ Quick winMake these test preconditions fatal before raw buffer ops.
XCTAssert*keeps executing. These paths still index intobytes,memsettrailing regions, ormemcmpfixed-length hashes after non-fatal precondition checks, so a missing fixture / short read / unexpected digest length can turn an ordinary test failure into OOB or UB. Please add an immediatereturn(orXCTSkipIf(...)) after each prerequisite check, including after thecpstep intestDetectsTamperOnCopy.Representative fix pattern
- XCTAssertEqual(std::system(cp.c_str()), 0); + const int cp_rc = std::system(cp.c_str()); + XCTAssertEqual(cp_rc, 0); + if (cp_rc != 0) return; auto bytes = Slurp(tampered.c_str()); + XCTAssertFalse(bytes.empty()); + if (bytes.empty()) return; bytes[3 * bytes.size() / 4] ^= 0xFF;XCTAssertGreaterThanOrEqual(bytes.size(), 32u * 1024u); + if (bytes.size() < 32u * 1024u) return; std::memset(bytes.data() + bytes.size() - 32 * 1024, 0, 32 * 1024);XCTAssertEqual(got.size(), static_cast<size_t>(CS_CDHASH_LEN)); + if (got.size() != static_cast<size_t>(CS_CDHASH_LEN) || + expected.size() != static_cast<size_t>(CS_CDHASH_LEN)) return; XCTAssertEqual(0, std::memcmp(got.data(), expected.data(), CS_CDHASH_LEN));Also applies to: 351-355, 399-401, 501-502, 507-519, 524-525, 549-560, 578-595
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@Source/common/verifyinghasher/VerifyingHasherCoreTest.mm` around lines 258 - 264, Make the precondition assertions fatal before performing raw buffer operations: in testDetectsTamperOnCopy and the other affected tests, after the cp creation/assertion and each Slurp call or XCTAssert that validates fixture/digest sizes (e.g., checks on cp, tampered, bytes.size(), expected digest length) immediately return or call XCTSkipIf(...) on failure instead of relying on non-fatal XCTAssert*. This prevents subsequent code that mutates or indexes the raw buffer (bytes[...], memset of trailing regions, memcmp of fixed-length hashes) from running on invalid data; update the checks surrounding Slurp(...), bytes, memset, and memcmp sites to early-return/skip when a prerequisite check fails.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Duplicate comments:
In `@Source/common/verifyinghasher/VerifyingHasherCoreTest.mm`:
- Around line 258-264: Make the precondition assertions fatal before performing
raw buffer operations: in testDetectsTamperOnCopy and the other affected tests,
after the cp creation/assertion and each Slurp call or XCTAssert that validates
fixture/digest sizes (e.g., checks on cp, tampered, bytes.size(), expected
digest length) immediately return or call XCTSkipIf(...) on failure instead of
relying on non-fatal XCTAssert*. This prevents subsequent code that mutates or
indexes the raw buffer (bytes[...], memset of trailing regions, memcmp of
fixed-length hashes) from running on invalid data; update the checks surrounding
Slurp(...), bytes, memset, and memcmp sites to early-return/skip when a
prerequisite check fails.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 18fe9425-2338-490c-9379-4d918a01cf36
📒 Files selected for processing (7)
Source/common/verifyinghasher/BUILDSource/common/verifyinghasher/CodeSignatureParserTest.mmSource/common/verifyinghasher/FileReaderTest.mmSource/common/verifyinghasher/HashTraits.hSource/common/verifyinghasher/HeaderParserTest.mmSource/common/verifyinghasher/VerifyingHasherCoreTest.mmSource/common/verifyinghasher/VerifyingHasherTest.mm
🚧 Files skipped from review as they are similar to previous changes (6)
- Source/common/verifyinghasher/FileReaderTest.mm
- Source/common/verifyinghasher/VerifyingHasherTest.mm
- Source/common/verifyinghasher/HashTraits.h
- Source/common/verifyinghasher/BUILD
- Source/common/verifyinghasher/HeaderParserTest.mm
- Source/common/verifyinghasher/CodeSignatureParserTest.mm
Closes the four open review threads on the PR.
* Testing/Fuzzing/README.md: tag the five unlabeled fenced code
blocks with bash/starlark (markdownlint MD040).
* Testing/OneOffs/verifying_hasher_smoke.sh: derive Case 3's tamper
offset from Case 1's verifier output (slice_off + codeLimit/2)
instead of a hardcoded 49152 + 1024. Keeps the smoke test valid
across future /usr/bin/yes layout changes (size shrink, arch
reorder, thin-vs-fat).
* VerifyingHasherCore.mm: promote premature EOF in the header phase
to kIoError to match the phase-3 / phase-5 classification added in
the prior review-response commit. Also add a "reader served past
reported size" check in all three phases — production hook for
fd-backed readers is the phase-1 case (file growth between fstat
and a later pread); phases 3 and 5 add defense-in-depth against
misbehaving custom FileReader implementations. Two regression
tests cover the truncation and over-serve directions for the
header phase.
* VerifyingHasherCoreTest.mm: make 8 test preconditions fatal via
if (cond) { XCTFail(...); return; } so a missing or mis-sized
fixture produces a clean test failure rather than OOB access in
the next memset/memcmp/index op.
Closes the open human-review threads on the PR. * HeaderParser: reject LE-on-disk fat (host-order FAT_MAGIC / FAT_MAGIC_64) at the magic dispatch. ProcessFatHeader and ProcessFatArchTable assume BE-on-disk and unconditionally OSSwap-from-BE, so accepting LE here would silently corrupt every field downstream. Apple's tools and dyld produce only BE-on-disk fat; xnu's mach_loader also rejects LE via OSSwapBigToHostInt32(magic) == FAT_MAGIC. New regression test testRejectsLittleEndianFatMagic covers both magics. * HeaderParser: factor the mh_swap_ ternaries through a templated SwapIfNeeded member. Eight call sites (four ternaries plus two if-blocks doing pairwise swaps) collapse to direct assignments. * CodeSignatureParser: drop the redundant compare_size field on ParsedCodeDirectory. Slot stride and compare size already live on the HashTraits picked by the RunStreamingPhases<T> dispatch; no hashing or verification code read the runtime field. * CodeSignatureParser: reuse the existing `version` local in ParseCodeDirectory instead of byteswapping cd->version twice. * HeaderParser / CodeSignatureParser: convert XNU file:line citations to permalinks pinned at xnu-12377.101.15, and add precise lines to the three citations that previously named a function without a line. Also correct the "no hash to validate" citation from ubc_subr.c:5849 (a local buffer declaration) to ubc_subr.c:5922-5928 (the actual `validated = FALSE` path inside cs_validate_hash). * BUILD: drop the internal "Task 13" reference to the :VerifyingHasher facade target — point at the rule's actual definition below. * Testing/Fuzzing/fuzzing.bzl: generalize the module docstring; objc_fuzz_test is a generic macro, not VerifyingHasher-specific. * Bump Copyright 2025 -> 2026 on the five files that missed the earlier sweep.
There was a problem hiding this comment.
🧹 Nitpick comments (2)
Source/common/verifyinghasher/HeaderParserTest.mm (2)
127-150: ⚡ Quick winPin the fat-path tests to the checked-in universal fixture.
These assertions only exercise the fat path while
/usr/bin/filehappens to stay universal on the runner. If that binary ever becomes thin on a supported macOS image,slice_offset > 0stops being a stable invariant andtestChunkSizeInvariance/testSliceOffsetIfKnownstop covering the fat-header flow. Thehw_universalfixture mentioned in this PR would make these tests deterministic across host architectures and OS updates.Also applies to: 257-287
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@Source/common/verifyinghasher/HeaderParserTest.mm` around lines 127 - 150, The tests testParsesFatBinary and testChunkSizeInvariance (and the other referenced block) rely on /usr/bin/file being a universal binary; switch them to use the checked-in universal fixture (e.g. the hw_universal fixture) instead of Slurp("/usr/bin/file") so the fat-path is deterministic across runners. Update calls that pass the data into FeedChunked (and the SliceInfo assertions) to read the fixture (keep using kHostArch, FeedChunked, and santa::SliceInfo) and also update the related testSliceOffsetIfKnown to use the same hw_universal fixture.
243-250: ⚡ Quick winAvoid the 256 MiB backing buffer in the size-cap regression.
HeaderParseronly needs the declared file size plus the header/load-command bytes to reject this case, so materializing the whole 256 MiB slice just adds a large heap spike to unit and ASan runs. Consider feeding only the small prefix needed for the header while constructing the parser withkTotal, or teachFeedChunkedto take a logical file size separate from the supplied bytes.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@Source/common/verifyinghasher/HeaderParserTest.mm` around lines 243 - 250, The test testRejectsImplausibleCsBlobSize currently allocates a full 256 MiB backing buffer via MakeThin64WithCsBlob which is unnecessary; instead produce only the small prefix bytes required for the Mach-O header and load-commands (and the code-signature LC entry) and pass the declared total size kTotal as the logical file size to the parser path: either call FeedChunked with the small prefix buffer and an added parameter representing the logical file size (or modify FeedChunked to accept a separate fileSize argument) or construct the HeaderParser/its input using kTotal while supplying only the prefix bytes; keep MakeThin64WithCsBlob to return just the header+LC prefix and ensure FeedChunked/HeaderParser still sees kTotal so the implausible cs blob size check triggers without allocating 256 MiB.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Nitpick comments:
In `@Source/common/verifyinghasher/HeaderParserTest.mm`:
- Around line 127-150: The tests testParsesFatBinary and testChunkSizeInvariance
(and the other referenced block) rely on /usr/bin/file being a universal binary;
switch them to use the checked-in universal fixture (e.g. the hw_universal
fixture) instead of Slurp("/usr/bin/file") so the fat-path is deterministic
across runners. Update calls that pass the data into FeedChunked (and the
SliceInfo assertions) to read the fixture (keep using kHostArch, FeedChunked,
and santa::SliceInfo) and also update the related testSliceOffsetIfKnown to use
the same hw_universal fixture.
- Around line 243-250: The test testRejectsImplausibleCsBlobSize currently
allocates a full 256 MiB backing buffer via MakeThin64WithCsBlob which is
unnecessary; instead produce only the small prefix bytes required for the Mach-O
header and load-commands (and the code-signature LC entry) and pass the declared
total size kTotal as the logical file size to the parser path: either call
FeedChunked with the small prefix buffer and an added parameter representing the
logical file size (or modify FeedChunked to accept a separate fileSize argument)
or construct the HeaderParser/its input using kTotal while supplying only the
prefix bytes; keep MakeThin64WithCsBlob to return just the header+LC prefix and
ensure FeedChunked/HeaderParser still sees kTotal so the implausible cs blob
size check triggers without allocating 256 MiB.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 6cfe9c76-ea9a-4588-949a-8a367d68ef74
📒 Files selected for processing (13)
Source/common/verifyinghasher/BUILDSource/common/verifyinghasher/CodeSignatureParser.hSource/common/verifyinghasher/CodeSignatureParser.mmSource/common/verifyinghasher/CodeSignatureParserTest.mmSource/common/verifyinghasher/HeaderParser.hSource/common/verifyinghasher/HeaderParser.mmSource/common/verifyinghasher/HeaderParserTest.mmSource/common/verifyinghasher/VerifyingHasherCore.hSource/common/verifyinghasher/VerifyingHasherCore.mmSource/common/verifyinghasher/VerifyingHasherCoreTest.mmTesting/Fuzzing/HeaderParserFuzzer.mmTesting/Fuzzing/VerifyingHasherFuzzer.mmTesting/Fuzzing/fuzzing.bzl
🚧 Files skipped from review as they are similar to previous changes (11)
- Testing/Fuzzing/HeaderParserFuzzer.mm
- Testing/Fuzzing/fuzzing.bzl
- Source/common/verifyinghasher/VerifyingHasherCore.h
- Source/common/verifyinghasher/CodeSignatureParser.h
- Source/common/verifyinghasher/HeaderParser.h
- Source/common/verifyinghasher/CodeSignatureParser.mm
- Source/common/verifyinghasher/HeaderParser.mm
- Source/common/verifyinghasher/VerifyingHasherCore.mm
- Source/common/verifyinghasher/BUILD
- Source/common/verifyinghasher/VerifyingHasherCoreTest.mm
- Source/common/verifyinghasher/CodeSignatureParserTest.mm
* HeaderParserTest: switch testParsesFatBinary, testChunkSizeInvariance, and testSliceOffsetIfKnown from /usr/bin/file to the checked-in hw_universal fixture for determinism. The system binary is fat today, but Apple has been progressively shipping arm64-only system binaries; a future CI runner could silently lose fat coverage in these tests. The fixture carries arm64 + x86_64 (no arm64e), so a host-conditional kHwUniversalArch selector picks the slice actually present rather than reusing kHostArch (which resolves to arm64e on Apple Silicon). * HeaderParserTest: rework testRejectsImplausibleCsBlobSize to feed just the header + LC prefix (~48 bytes) instead of allocating a 256 MiB zero-filled buffer per test run. The parser hits the kMaxCsBlobSize cap on the LC_CODE_SIGNATURE entry before reading the declared CS region, so we tell it the logical file size via kTotal and only hand it the prefix.
There was a problem hiding this comment.
Actionable comments posted: 3
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@Source/common/verifyinghasher/CodeSignatureParser.mm`:
- Around line 130-150: The code currently sets saw_canonical as soon as
slot_type == CSSLOT_CODEDIRECTORY before the CD blob is validated, allowing a
malformed canonical CD to be ignored if an alternate parses; fix by deferring
saw_canonical and candidate insertion until after the blob is fully validated:
move the checks for blob_magic == CSMAGIC_CODEDIRECTORY, blob_len range, and
blob_len >= kMinCdBlobLen before touching saw_canonical or pushing into
candidates; additionally, if is_canonical and any of those validations fail, set
err to a descriptive message (e.g., "malformed code signature (CD blob ... )")
and return false so a malformed canonical CD rejects the whole superblob, while
keeping the existing skip behavior for malformed alternates (is_alternate).
- Around line 89-90: The parser must clear the ParsedCodeDirectory output before
attempting any parse to avoid carrying stale identifier/team_id/spans/hashes
into a new parse; at the start of ParseCodeSignature(std::span<const uint8_t>,
uint64_t, ParsedCodeDirectory&, std::string&) assign a fresh value to out (e.g.
out = ParsedCodeDirectory{} or explicitly clear identifier, team_id, span/hash
fields and offsets) so every early return leaves a clean state, and apply the
same reset to the other code-directory parsing function in this file that also
populates a ParsedCodeDirectory to prevent stale values from surviving between
calls.
In `@Source/common/verifyinghasher/HeaderParserTest.mm`:
- Around line 58-66: The Slurp helper currently calls pread once and treats any
short or interrupted read as a failure; change Slurp to perform a read loop
using pread on santa::ScopedFile::UnsafeFD() that accumulates bytes until the
requested st.st_size is read or a terminal error/EOF occurs: use a ssize_t total
= 0 and while (total < static_cast<ssize_t>(v.size())) { n = ::pread(fd,
v.data() + total, v.size() - total, total); if (n > 0) total += n; else if (n ==
0) return {}; else if (errno == EINTR) continue; else return {}; } and only
return the filled vector when total == v.size(); reference Slurp, ::pread,
santa::ScopedFile, and ::fstat to locate and update the function.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 0c5ac9bf-0fae-420e-91d8-bda02b7aab6c
📒 Files selected for processing (13)
Source/common/verifyinghasher/BUILDSource/common/verifyinghasher/CodeSignatureParser.hSource/common/verifyinghasher/CodeSignatureParser.mmSource/common/verifyinghasher/CodeSignatureParserTest.mmSource/common/verifyinghasher/HeaderParser.hSource/common/verifyinghasher/HeaderParser.mmSource/common/verifyinghasher/HeaderParserTest.mmSource/common/verifyinghasher/VerifyingHasherCore.hSource/common/verifyinghasher/VerifyingHasherCore.mmSource/common/verifyinghasher/VerifyingHasherCoreTest.mmTesting/Fuzzing/HeaderParserFuzzer.mmTesting/Fuzzing/VerifyingHasherFuzzer.mmTesting/Fuzzing/fuzzing.bzl
🚧 Files skipped from review as they are similar to previous changes (10)
- Testing/Fuzzing/HeaderParserFuzzer.mm
- Source/common/verifyinghasher/CodeSignatureParser.h
- Source/common/verifyinghasher/VerifyingHasherCore.h
- Testing/Fuzzing/VerifyingHasherFuzzer.mm
- Source/common/verifyinghasher/HeaderParser.h
- Source/common/verifyinghasher/VerifyingHasherCore.mm
- Source/common/verifyinghasher/BUILD
- Source/common/verifyinghasher/HeaderParser.mm
- Source/common/verifyinghasher/CodeSignatureParserTest.mm
- Source/common/verifyinghasher/VerifyingHasherCoreTest.mm
* CodeSignatureParser: reset the output ParsedCodeDirectory at the
top of ParseCodeSignature so callers reusing it across parses
don't leak stale identifier/team_id (conditionally-overwritten
string fields) from a prior successful call into the next one.
Latent today — VerifyingHasherCore's parsed_cd_ is effectively
single-use — but a future caller that reuses the struct would
silently feed stale team_id into the SID/TID drift path.
* HeaderParserTest: loop in Slurp's pread so a short read or EINTR
doesn't truncate to {} and surface as an opaque assertion failure
downstream. Test-only helper; rare in practice on local files
but the loop is the correct shape.
…andidate ParseCodeSignature's candidate-collection loop previously set saw_canonical from the index entry's slot_type alone (before the CD blob's own bytes were inspected) and silently skipped candidates with the wrong magic or with length < kMinCdBlobLen. Two divergences from xnu's cs_validate_csblob fell out: * A slot-0 index entry pointing at non-CodeDirectory bytes still flipped saw_canonical. With a well-formed alternate present, the parser would succeed picking the alternate. xnu rejects the whole superblob via cs_validate_codedirectory on the canonical. * A malformed high-rank alternate (wrong magic or undersized) would silently `continue`, letting the picker fall back to a lower-rank valid CD. An attacker could use this to steer which (cdhash, identifier, team_id) triple the parser publishes downstream — xnu rejects the whole superblob in this case too. Move the magic + length floor checks before saw_canonical / push_back, and promote every failure (canonical *and* alternate) to a hard reject with a descriptive error. Matches the preamble of xnu's cs_validate_codedirectory (length >= sizeof(*cd), magic == CSMAGIC_CODEDIRECTORY) for every CD candidate. The post-loop rationale comment is rewritten to describe the new invariant: non-picked candidates still skip the deeper per-CD structural pass xnu runs, but the floor closes the picker-steering attack and we only hash the picked CD's bytes anyway. Tests: * testRejectsTinyCdBlobLen — the existing assertion expected the old "no CodeDirectory blobs found" error from a silent skip; updated to expect the new "CD blob too small" rejection. * testRejectsMalformedCanonicalWithValidAlternate (new) — slot-0 payload has wrong magic, alternate is well-formed; must reject. * testRejectsUndersizedCanonicalWithValidAlternate (new) — slot-0 payload has CD magic but length < kMinCdBlobLen, alternate is well-formed; must reject. * testRejectsMalformedAlternateWithValidCanonical (new) — the picker-steering case: well-formed SHA-1 canonical plus malformed SHA-256 alternate; must reject rather than fall back to the canonical. Verified clean against the VerifyingHasherFuzzer corpus (60s, 128k iterations, no crashes; new rejection branches covered).
A new subpackage that verifies a Mach-O slice's code signature against
expected (cdhash, signing_id, team_id) using only a borrowed file
descriptor. Designed for use on the ES_AUTH_EXEC hot path. Streams the
file once: full-file SHA-256, page-hash validation against the picked
CodeDirectory, and cdhash compute share a single pass — every byte read
from disk is observed at most once per Run() call.
Public surface in
Source/common/verifyinghasher/VerifyingHasher.h:Exact cdhash equality returns
kMatchCDHash. A (signing_id, team_id)drift fallback returns
kMatchSidTidDriftonly when bothExpected.team_idand the parsed CD's team_id are non-empty (ad-hocbinaries are not eligible). The strongest available CodeDirectory by
hashType is picked (SHA-384 > SHA-256 > SHA-256-truncated > SHA-1),
mirroring xnu's
hashPrioritiesinbsd/kern/ubc_subr.c.Internal layers (package-private; visibility encapsulated by BUILD)
FileReader/FdFileReaderHeaderParserCodeSignatureParserHashTraitsSha{1,256,256Truncated,384}TraitsPageVerifierVerifyingHasherCoreUninitBufferStructural caps and validations track xnu where it has its own bound or
runtime check (LC_CODE_SIGNATURE.datasize, sizeofcmds, CD pageSize,
scatter rejection, the nCodeSlots-vs-codeLimit asymmetry).
Tests under
Source/common/verifyinghasher/CountingMemoryFileReaderoracle assertsMaxReadsAnyByte <= 1across default, small-buffer, and post-malformed-CS paths.
hw_universal: ad-hoc multi-CD fat fixture (SHA-1/256/256t/384).hw_team_signed: team-signed counterpart for the drift path.Tests under
Testing///Testing/OneOffs:verifying_hasher_smokeruns the public facadeagainst
/usr/bin/yes.//Testing/Fuzzing:VerifyingHasherFuzzerandHeaderParserFuzzerwith seed corpora; the end-to-end fuzzer additionally aborts on
MaxReadsAnyByte > 1. libFuzzer + ASan wired via--config=fuzzin.bazelrcand theobjc_fuzz_testmacro inTesting/Fuzzing/fuzzing.bzl.Out of scope for this PR (separate follow-ups)
SNTExecutionController,SNTPolicyProcessor)kError;fallback hashing handled by the consuming caller)
SNTFileInfo/MOLCodesignCheckeredits