Skip to content

Introduce FD-based code-signature verifier#942

Merged
mlw merged 9 commits into
mainfrom
mlw/verifyinghasher
May 12, 2026
Merged

Introduce FD-based code-signature verifier#942
mlw merged 9 commits into
mainfrom
mlw/verifyinghasher

Conversation

@mlw
Copy link
Copy Markdown
Contributor

@mlw mlw commented May 10, 2026

A new subpackage that verifies a Mach-O slice's code signature against
expected (cdhash, signing_id, team_id) using only a borrowed file
descriptor. Designed for use on the ES_AUTH_EXEC hot path. Streams the
file once: full-file SHA-256, page-hash validation against the picked
CodeDirectory, and cdhash compute share a single pass — every byte read
from disk is observed at most once per Run() call.

Public surface in Source/common/verifyinghasher/VerifyingHasher.h:

VerifyingHasher::Run(fd, cputype, cpusubtype, Expected) -> Result
  Status   ::= { kError, kNoMatch, kMatchCDHash, kMatchSidTidDrift }
  Result   ::= { Status, std::optional<std::array<uint8_t, 32>> sha256 }
  Expected ::= { std::span<const uint8_t> cdhash,
                 std::string_view signing_id,
                 std::string_view team_id }

Exact cdhash equality returns kMatchCDHash. A (signing_id, team_id)
drift fallback returns kMatchSidTidDrift only when both
Expected.team_id and the parsed CD's team_id are non-empty (ad-hoc
binaries are not eligible). The strongest available CodeDirectory by
hashType is picked (SHA-384 > SHA-256 > SHA-256-truncated > SHA-1),
mirroring xnu's hashPriorities in bsd/kern/ubc_subr.c.

Internal layers (package-private; visibility encapsulated by BUILD)

Target Purpose
FileReader / FdFileReader pread-only positional file access
HeaderParser fat / mach_header / LC_CODE_SIGNATURE
CodeSignatureParser SuperBlob / CodeDirectory / cdhash
HashTraits Sha{1,256,256Truncated,384}Traits
PageVerifier per-page hash check + gap detection
VerifyingHasherCore phase orchestration + full-file digest
UninitBuffer uninit-on-allocate byte buffer

Structural caps and validations track xnu where it has its own bound or
runtime check (LC_CODE_SIGNATURE.datasize, sizeofcmds, CD pageSize,
scatter rejection, the nCodeSlots-vs-codeLimit asymmetry).

Tests under Source/common/verifyinghasher/

  • Eight unit-test targets, one per layer plus end-to-end.
  • The CountingMemoryFileReader oracle asserts MaxReadsAnyByte <= 1
    across default, small-buffer, and post-malformed-CS paths.
  • hw_universal: ad-hoc multi-CD fat fixture (SHA-1/256/256t/384).
  • hw_team_signed: team-signed counterpart for the drift path.

Tests under Testing/

  • //Testing/OneOffs:verifying_hasher_smoke runs the public facade
    against /usr/bin/yes.
  • //Testing/Fuzzing:VerifyingHasherFuzzer and HeaderParserFuzzer
    with seed corpora; the end-to-end fuzzer additionally aborts on
    MaxReadsAnyByte > 1. libFuzzer + ASan wired via --config=fuzz in
    .bazelrc and the objc_fuzz_test macro in
    Testing/Fuzzing/fuzzing.bzl.

Out of scope for this PR (separate follow-ups)

  • santad integration (SNTExecutionController, SNTPolicyProcessor)
  • handling for unsigned binaries (LC_CODE_SIGNATURE absent → kError;
    fallback hashing handled by the consuming caller)
  • SNTFileInfo / MOLCodesignChecker edits
  • configuration keys, telemetry, rollout flags
  • CS_KILL/CS_HARD page-hash skip optimization

…fier

A new subpackage that verifies a Mach-O slice's code signature against
expected (cdhash, signing_id, team_id) using only a borrowed file
descriptor. Designed for use on the ES_AUTH_EXEC hot path. Streams the
file once: full-file SHA-256, page-hash validation against the picked
CodeDirectory, and cdhash compute share a single pass — every byte read
from disk is observed at most once per Run() call.

Public surface in Source/common/verifyinghasher/VerifyingHasher.h:

  VerifyingHasher::Run(fd, cputype, cpusubtype, Expected) -> Result
    Status   ::= { kError, kNoMatch, kMatchCDHash, kMatchSidTidDrift }
    Result   ::= { Status, std::optional<std::array<uint8_t, 32>> sha256 }
    Expected ::= { std::span<const uint8_t> cdhash,
                   std::string_view signing_id,
                   std::string_view team_id }

Exact cdhash equality returns kMatchCDHash. A (signing_id, team_id)
drift fallback returns kMatchSidTidDrift only when both Expected.team_id
and the parsed CD's team_id are non-empty (ad-hoc binaries are not
eligible). The strongest available CodeDirectory by hashType is picked
(SHA-384 > SHA-256 > SHA-256-truncated > SHA-1), mirroring xnu's
hashPriorities in bsd/kern/ubc_subr.c.

Internal layers (package-private; visibility encapsulated by BUILD):

  FileReader / FdFileReader        pread-only positional file access
  HeaderParser                     fat / mach_header / LC_CODE_SIGNATURE
  CodeSignatureParser              SuperBlob / CodeDirectory / cdhash
  HashTraits                       Sha{1,256,256Truncated,384}Traits
  PageVerifier                     per-page hash check + gap detection
  VerifyingHasherCore              phase orchestration + full-file digest
  UninitBuffer                     uninit-on-allocate byte buffer

Structural caps and validations track xnu where it has its own bound or
runtime check (LC_CODE_SIGNATURE.datasize, sizeofcmds, CD pageSize,
scatter rejection, the nCodeSlots-vs-codeLimit asymmetry).

Tests under Source/common/verifyinghasher/:
- Eight unit-test targets, one per layer plus end-to-end.
- The CountingMemoryFileReader oracle asserts MaxReadsAnyByte <= 1
  across default, small-buffer, and post-malformed-CS paths.
- hw_universal: ad-hoc multi-CD fat fixture (SHA-1/256/256t/384).
- hw_team_signed: team-signed counterpart for the drift path.

Tests under Testing/:
- //Testing/OneOffs:verifying_hasher_smoke runs the public facade
  against /usr/bin/yes.
- //Testing/Fuzzing:VerifyingHasherFuzzer and HeaderParserFuzzer with
  seed corpora; the end-to-end fuzzer additionally aborts on
  MaxReadsAnyByte > 1. libFuzzer + ASan wired via --config=fuzz in
  .bazelrc and the objc_fuzz_test macro in Testing/Fuzzing/fuzzing.bzl.

Out of scope for this PR (separate follow-ups):
- santad integration (SNTExecutionController, SNTPolicyProcessor)
- handling for unsigned binaries (LC_CODE_SIGNATURE absent → kError;
  fallback hashing handled by the consuming caller)
- SNTFileInfo / MOLCodesignChecker edits
- configuration keys, telemetry, rollout flags
- CS_KILL/CS_HARD page-hash skip optimization
@mlw mlw requested a review from a team as a code owner May 10, 2026 18:59
@github-actions github-actions Bot added lang/objc++ PRs modifying files in ObjC++ comp/common size/xl Size: extra large labels May 10, 2026
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented May 10, 2026

Review Change Stack

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • ✅ Review completed - (🔄 Check again to review again)
📝 Walkthrough

Walkthrough

Adds a streaming Mach-O header parser, CodeDirectory parser, per-page verifier, a multi-phase VerifyingHasher core with FD API, comprehensive unit tests, Bazel fuzzing targets/tooling and corpora, a CLI verifier, and supporting build/config updates.

Changes

VerifyingHasher Core System

Layer / File(s) Summary
Build & Dependency Setup
.bazelrc, MODULE.bazel, Source/common/BUILD, Testing/Fuzzing/fuzzing.bzl
Added build:fuzz Bazel config, rules_fuzzing dep, registered verifyinghasher tests, and introduced a fuzzing Starlark helper.
File I/O Abstraction
Source/common/verifyinghasher/FileReader.h, FileReader.mm, MemoryFileReader.h, CountingMemoryFileReader.h, FileReaderTest.mm
Added FileReader interface, FdFileReader, MemoryFileReader with failure injection, CountingMemoryFileReader tracking per-byte reads; included unit tests.
Hash Algorithm Traits
Source/common/verifyinghasher/HashTraits.h, HashTraitsTest.mm
Defined SHA-1/SHA-256/SHA-256-truncated/SHA-384 trait wrappers with chunked CommonCrypto updates; added known-vector and chunked-equivalence tests.
Utility Buffers
Source/common/verifyinghasher/UninitBuffer.h, UninitBufferTest.mm
Added move-only uninitialized byte buffer and tests for allocation and accessors.
Mach-O Header Parsing
Source/common/verifyinghasher/HeaderParser.h, HeaderParser.mm, HeaderParserTest.mm
Phase-driven streaming parser for thin/fat/fat64 Mach-O; locates and validates LC_CODE_SIGNATURE; extensive positive/negative tests.
Code Signature Parsing
Source/common/verifyinghasher/CodeSignatureParser.h, CodeSignatureParser.mm, CodeSignatureParserTest.mm
Parses CS_SuperBlob and CodeDirectory candidates, selects strongest supported hash, validates structure, computes cdhash, extracts identifier/team-id, and includes many regression tests.
Page Verification
Source/common/verifyinghasher/PageVerifier.h, PageVerifierTest.mm
Templated PageVerifier that streams bytes, computes per-page digests, compares to slot-hashes, detects gaps/corruption, and records mismatches; tests verify trait-specific behavior and regressions.
Core Verifier
Source/common/verifyinghasher/VerifyingHasherCore.h, VerifyingHasherCore.mm, VerifyingHasherCoreTest.mm
Implements multi-phase orchestration: header streaming, CS blob extraction, page verification, and full-file SHA-256 hashing; exposes results, mismatches, and errors; tests cover real and synthetic fixtures.
Public Facade
Source/common/verifyinghasher/VerifyingHasher.h, VerifyingHasher.mm, VerifyingHasherTest.mm
High-level VerifyingHasher::Run(fd, arch, expected) returning status and optional SHA-256; supports cdhash match and signing-id/team-id drift fallback; unit tests added.
VerifyingHasher BUILD
Source/common/verifyinghasher/BUILD
Declared libraries, unit tests, two fixture filegroups, visibility for fuzzing/one-offs, and aggregated test_suite.

Fuzzing Infrastructure

Layer / File(s) Summary
Fuzzing Build Rules
Testing/Fuzzing/fuzzing.bzl, Testing/Fuzzing/BUILD
Added objc_fuzz_test macro and Bazel fuzz targets VerifyingHasherFuzzer and HeaderParserFuzzer with corpus wiring and seed filegroup.
Fuzz Target Implementations
Testing/Fuzzing/HeaderParserFuzzer.mm, Testing/Fuzzing/VerifyingHasherFuzzer.mm
libFuzzer entrypoints that feed inputs to HeaderParser and VerifyingHasherCore; VerifyingHasher fuzzer enforces a single-observation oracle via CountingMemoryFileReader.
Fuzzing Tooling & Docs
Testing/Fuzzing/install_libclang_fuzzer.sh, regenerate_corpus.sh, README.md
Tooling to install libFuzzer runtime into Xcode toolchain, regenerate corpora (thin/fat/fat64/universal), and README documenting usage and Bazel --config=fuzz.

CLI Utilities

Layer / File(s) Summary
CLI Tool
Testing/OneOffs/VerifyingHasher.mm, Testing/OneOffs/BUILD
Command-line verifier that runs VerifyingHasherCore and prints parsed CD metadata, full-file digest, CD hash, and mismatches.
Smoke Test
Testing/OneOffs/verifying_hasher_smoke.sh, Testing/OneOffs/BUILD
Shell smoke-test exercising the CLI against real and tampered binaries; wired as a Bazel sh_test.

Sequence Diagram(s)

sequenceDiagram
  participant CLI as VerifyingHasher CLI
  participant Reader as FileReader / FdFileReader
  participant Core as VerifyingHasherCore
  participant Header as HeaderParser
  participant CsParser as CodeSignatureParser
  participant PageVerifier as PageVerifierT
  CLI->>Reader: open file, construct reader
  CLI->>Core: Run(reader, ArchSelector)
  Core->>Header: stream bytes via Update()
  Header-->>Core: slice info, cs_blob_offset/size
  Core->>Reader: pread CS blob region (overlap-aware)
  Core->>CsParser: ParseCodeSignature(blob)
  CsParser-->>Core: ParsedCodeDirectory (slot_hashes, cdhash)
  Core->>PageVerifier: instantiate with slot_hashes/page_size
  Core->>Reader: stream signed-region bytes -> PageVerifier & SHA256
  PageVerifier-->>Core: mismatches / complete
  Core-->>CLI: Result (status, optional sha256, cdhash, mismatches)
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~75 minutes

Possibly related PRs

  • northpolesec/santa#859: Both PRs touch the fuzzing subsystem; that PR removes legacy fuzzing targets and the rules_fuzzing dependency which this change adds.
✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch mlw/verifyinghasher

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 7

🧹 Nitpick comments (2)
Source/common/verifyinghasher/HashTraits.h (1)

51-64: 💤 Low value

Consider avoiding leading underscores in macro-local variable names.

The _p, _n, _step_sz, _step, _rc identifiers use leading underscores. While technically allowed in block scope, the C++ standard reserves identifiers beginning with underscore for implementation use in global namespace, which can cause confusion. A common macro hygiene pattern uses a project-specific prefix instead (e.g., vh_p_ or cc_loop_p_).

This is a minor style point; the logic itself correctly handles inputs exceeding CC_LONG capacity.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@Source/common/verifyinghasher/HashTraits.h` around lines 51 - 64, The macro
VERIFYINGHASHER_CC_UPDATE_LOOP uses macro-local identifiers with leading
underscores (_p, _n, _step_sz, _step, _rc); rename them to non-reserved,
project-prefixed names (e.g., vh_p, vh_n, vh_step_sz, vh_step, vh_rc or
cc_loop_p, cc_loop_n, etc.) throughout the macro body so no global-reserved
identifiers are introduced, preserving all existing casts, loop logic, CC_LONG
conversion, and return semantics in the cc_update_fn((ctx), ...) invocation.
Source/common/verifyinghasher/CodeSignatureParser.h (1)

29-29: 💤 Low value

Unused include: <vector> appears unnecessary in this header.

The ParsedCodeDirectory struct and ParseCodeSignature declaration don't use std::vector. If it's only needed in the implementation, consider moving it to the .mm file.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@Source/common/verifyinghasher/CodeSignatureParser.h` at line 29, The header
unnecessarily includes <vector>; remove that include from CodeSignatureParser.h
and keep the declarations for ParsedCodeDirectory and ParseCodeSignature
unchanged; if any implementation files need std::vector, add `#include` <vector>
in the corresponding .mm (implementation) file (e.g., where ParseCodeSignature
is defined) so only the implementation, not the header, depends on <vector>.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@Source/common/verifyinghasher/CodeSignatureParserTest.mm`:
- Around line 61-108: When scanning a FAT binary you never check whether a
matching architecture was found, so slice_off/slice_size can remain 0 and the
code falls through to parse invalid data; after the fat branch (i.e. after the
loop over archs when magic == FAT_MAGIC || magic == FAT_CIGAM) add a fast-fail:
if the file was FAT (magic == FAT_MAGIC || magic == FAT_CIGAM) and no matching
slice was selected (slice_size == 0 or slice_off == 0) then close(fd) and return
{} immediately; reference the existing variables magic, FAT_MAGIC/FAT_CIGAM,
slice_off and slice_size to implement this check so subsequent pread/mach-header
parsing only runs when a valid slice was found.

In `@Source/common/verifyinghasher/CountingMemoryFileReader.h`:
- Around line 36-38: The Pread override in CountingMemoryFileReader returns -1
for negative off but doesn't set errno; update Pread (method name Pread in class
CountingMemoryFileReader) to set errno = EINVAL before returning -1 for off < 0
to satisfy the FileReader error contract, and ensure any required header for
errno (e.g., <errno.h>) is available in the translation unit.

In `@Source/common/verifyinghasher/MemoryFileReader.h`:
- Around line 33-39: Pread in MemoryFileReader returns -1 on synthetic/invalid
paths but doesn't set errno; update MemoryFileReader::Pread so before returning
-1 it sets errno appropriately (e.g. set errno = EIO when failing the synthetic
failure path controlled by fail_next_, and set errno = EINVAL when off < 0) so
it matches the FileReader/FdFileReader contract and callers can inspect errno;
locate the Pread method and add the errno assignments just prior to each return
-1.

In `@Source/common/verifyinghasher/VerifyingHasherCore.mm`:
- Around line 210-225: The tail phase loop in VerifyingHasherCore.mm currently
treats reader_.Pread(...) == 0 as a normal break and proceeds to finalize
full_digest_, which yields a digest for a truncated file; instead, when cursor_
< total and Pread returns 0 treat it as an I/O error: set last_error_ (e.g.,
"pread returned 0 before EOF in tail phase") and return Status::kIoError. Update
the loop that calls reader_.Pread(...) and the Pread==0 branch so it mirrors
earlier phases’ handling of premature EOF (use cursor_, total, last_error_, and
Status::kIoError to locate the change).

In `@Testing/Fuzzing/install_libclang_fuzzer.sh`:
- Around line 49-55: The current CLANG_VERSION/DST_PATH derivation is brittle;
update the logic in this script to first try deriving the resource dir via
`clang -print-resource-dir` and check for the existence of
libclang_rt.fuzzer_osx.a under that path (reference DST_PATH and the `clang
-print-resource-dir` call), and if the file is missing fall back to a more
robust Apple-Clang version parse (use grep/awk to extract the Apple Clang
version into CLANG_VERSION rather than cut) and reconstruct DST_PATH under
XcodeDefault.xctoolchain; always verify the final DST_PATH file exists and exit
with an explicit error if neither method finds the fuzzer runtime.

In `@Testing/Fuzzing/README.md`:
- Around line 31-33: The README has unlabeled fenced code blocks causing MD040
failures; update each fence in Testing/Fuzzing/README.md to include a language
tag: use ```bash for shell snippets (the
./Testing/Fuzzing/install_libclang_fuzzer.sh line, the bazel test --config=fuzz
block, the bazel run --config=fuzz blocks, and
./Testing/Fuzzing/regenerate_corpus.sh) and use ```starlark for the
objc_fuzz_test example (the MyNewFuzzer snippet); ensure each opening fence is
changed accordingly so all unlabeled fences at the reported locations are
labeled.

In `@Testing/OneOffs/verifying_hasher_smoke.sh`:
- Around line 67-76: The hard-coded tamper offset (49152 + 1024) in
verifying_hasher_smoke.sh can end up past EOF or outside the signed region;
modify the script to derive the tamper offset from the verifier output (e.g.,
parse codeLimit or another signed-region boundary emitted by the verifier) and
choose an offset well inside that limit and within the file size before flipping
a byte; update the Python one-liner (or replace it with a small shell+python
helper) to take the computed offset (using the existing $TAMP file and $TAMP
size) and flip a byte there, ensuring you validate the offset is < file size and
< codeLimit to avoid flakes or false negatives.

---

Nitpick comments:
In `@Source/common/verifyinghasher/CodeSignatureParser.h`:
- Line 29: The header unnecessarily includes <vector>; remove that include from
CodeSignatureParser.h and keep the declarations for ParsedCodeDirectory and
ParseCodeSignature unchanged; if any implementation files need std::vector, add
`#include` <vector> in the corresponding .mm (implementation) file (e.g., where
ParseCodeSignature is defined) so only the implementation, not the header,
depends on <vector>.

In `@Source/common/verifyinghasher/HashTraits.h`:
- Around line 51-64: The macro VERIFYINGHASHER_CC_UPDATE_LOOP uses macro-local
identifiers with leading underscores (_p, _n, _step_sz, _step, _rc); rename them
to non-reserved, project-prefixed names (e.g., vh_p, vh_n, vh_step_sz, vh_step,
vh_rc or cc_loop_p, cc_loop_n, etc.) throughout the macro body so no
global-reserved identifiers are introduced, preserving all existing casts, loop
logic, CC_LONG conversion, and return semantics in the cc_update_fn((ctx), ...)
invocation.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: f2015db9-8bff-454a-b6a1-6af56567f52f

📥 Commits

Reviewing files that changed from the base of the PR and between 4a0f326 and 1f513f3.

📒 Files selected for processing (46)
  • .bazelrc
  • MODULE.bazel
  • Source/common/BUILD
  • Source/common/verifyinghasher/BUILD
  • Source/common/verifyinghasher/CodeSignatureParser.h
  • Source/common/verifyinghasher/CodeSignatureParser.mm
  • Source/common/verifyinghasher/CodeSignatureParserTest.mm
  • Source/common/verifyinghasher/CountingMemoryFileReader.h
  • Source/common/verifyinghasher/FileReader.h
  • Source/common/verifyinghasher/FileReader.mm
  • Source/common/verifyinghasher/FileReaderTest.mm
  • Source/common/verifyinghasher/HashTraits.h
  • Source/common/verifyinghasher/HashTraitsTest.mm
  • Source/common/verifyinghasher/HeaderParser.h
  • Source/common/verifyinghasher/HeaderParser.mm
  • Source/common/verifyinghasher/HeaderParserTest.mm
  • Source/common/verifyinghasher/MemoryFileReader.h
  • Source/common/verifyinghasher/PageVerifier.h
  • Source/common/verifyinghasher/PageVerifierTest.mm
  • Source/common/verifyinghasher/UninitBuffer.h
  • Source/common/verifyinghasher/UninitBufferTest.mm
  • Source/common/verifyinghasher/VerifyingHasher.h
  • Source/common/verifyinghasher/VerifyingHasher.mm
  • Source/common/verifyinghasher/VerifyingHasherCore.h
  • Source/common/verifyinghasher/VerifyingHasherCore.mm
  • Source/common/verifyinghasher/VerifyingHasherCoreTest.mm
  • Source/common/verifyinghasher/VerifyingHasherTest.mm
  • Source/common/verifyinghasher/testdata/hw_team_signed
  • Source/common/verifyinghasher/testdata/hw_universal
  • Testing/Fuzzing/BUILD
  • Testing/Fuzzing/HeaderParserFuzzer.mm
  • Testing/Fuzzing/HeaderParserFuzzer_corpus/fat32_hdr
  • Testing/Fuzzing/HeaderParserFuzzer_corpus/fat64_synthetic_hdr
  • Testing/Fuzzing/HeaderParserFuzzer_corpus/hw_universal_hdr
  • Testing/Fuzzing/HeaderParserFuzzer_corpus/thin_arm64e_hdr
  • Testing/Fuzzing/README.md
  • Testing/Fuzzing/VerifyingHasherFuzzer.mm
  • Testing/Fuzzing/VerifyingHasherFuzzer_corpus/fat32
  • Testing/Fuzzing/VerifyingHasherFuzzer_corpus/fat64_synthetic
  • Testing/Fuzzing/VerifyingHasherFuzzer_corpus/thin_arm64e
  • Testing/Fuzzing/fuzzing.bzl
  • Testing/Fuzzing/install_libclang_fuzzer.sh
  • Testing/Fuzzing/regenerate_corpus.sh
  • Testing/OneOffs/BUILD
  • Testing/OneOffs/VerifyingHasher.mm
  • Testing/OneOffs/verifying_hasher_smoke.sh

Comment thread Source/common/verifyinghasher/CodeSignatureParserTest.mm
Comment thread Source/common/verifyinghasher/CountingMemoryFileReader.h
Comment thread Source/common/verifyinghasher/MemoryFileReader.h
Comment thread Source/common/verifyinghasher/VerifyingHasherCore.mm
Comment thread Testing/Fuzzing/install_libclang_fuzzer.sh Outdated
Comment thread Testing/Fuzzing/README.md Outdated
Comment thread Testing/OneOffs/verifying_hasher_smoke.sh Outdated
mlw added a commit that referenced this pull request May 11, 2026
The \`markdown-check\` job runs \`git grep -EIn $'[ \t]+$' --
':(exclude)*.patch'\` across the working tree and fails on any trailing
whitespace. A single line in \`docs/src/lib/santaconfig.ts\` has one
trailing space, which currently fails CI on every open PR (e.g. #942).

This commit removes that one trailing character so the gate returns
green.
Bundles review-driven changes:

- VerifyingHasherCore.mm phase 5 (tail) now returns kIoError when pread
  yields 0 bytes with cursor_ < total. The previous silent break
  finalized the SHA-256 over only a prefix of the file and returned
  kOk / kPagesMismatched, violating the "32-byte SHA-256 of the full
  file" contract. Matches phases 1 and 3's existing premature-EOF
  handling. VerifyingHasherCoreTest.mm adds a regression test using
  a local TruncatedMemoryFileReader that claims a Size() larger than
  its data to simulate mid-verification truncation.

- MemoryFileReader and CountingMemoryFileReader now set errno before
  returning -1 (EINVAL on negative offset, EIO on
  MemoryFileReader's synthetic ScheduleErrorOnNextPread() path).
  FileReader.h's interface documents "-1 = error (errno set)";
  FdFileReader satisfied this via pread() itself, but the in-memory
  readers left errno untouched.

- CodeSignatureParserTest.mm's ExtractCsBlob fast-fails when the fat
  scan finds no matching architecture, rather than falling through
  into mach_header_64 parsing of the fat header bytes (which could
  worst-case allocate std::vector<uint8_t> for an arbitrary uint32).

- install_libclang_fuzzer.sh derives the resource directory via
  `clang -print-resource-dir` instead of parsing the human-facing
  `clang --version` output. Also removes a latent dependency on the
  /clang/<full-version> -> /clang/<major> compatibility symlink, which
  isn't created on all Xcode installs.

- HashTraits.h's VERIFYINGHASHER_CC_UPDATE_LOOP macro renames its
  loop-locals from _p/_n/_step_sz/_step/_rc to vh_-prefixed names.
  All current expansion sites are at block scope where the underscore
  names are permitted, but at namespace scope they're reserved; the
  prefix also avoids silent shadowing of caller-side identifiers.

- CodeSignatureParser.h drops an unused <vector> include. The header's
  ParsedCodeDirectory and ParseCodeSignature declarations use std::span
  / std::string only.
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@Source/common/verifyinghasher/VerifyingHasherCoreTest.mm`:
- Around line 266-267: The test performs raw buffer ops on the vector returned
by Slurp (variable bytes) and currently uses non-fatal XCTAssert* checks before
indexing into bytes; change those precondition assertions to be fatal by either
returning immediately or calling XCTSkipIf(...) right after the precondition
check so the subsequent bytes[3 * bytes.size() / 4] ^= 0xFF (and other places
that index/memset/memcmp) cannot run on an invalid-sized buffer. Locate the
occurrences around Slurp/bytes in VerifyingHasherCoreTest.mm (e.g., the shown
bytes = Slurp(...) and the other ranges 354-358, 403-405, 503-505, 514-522,
551-563, 596-598) and add a guard like if (bytes.size() < expected) {
XCTSkipIf(true, "fixture missing or wrong size"); return; } or place an early
return immediately after the XCTAssert* to make the failure fatal.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: ba034f32-87dd-4443-b97b-d000a88fb872

📥 Commits

Reviewing files that changed from the base of the PR and between 1f513f3 and cb226d7.

📒 Files selected for processing (8)
  • Source/common/verifyinghasher/CodeSignatureParser.h
  • Source/common/verifyinghasher/CodeSignatureParserTest.mm
  • Source/common/verifyinghasher/CountingMemoryFileReader.h
  • Source/common/verifyinghasher/HashTraits.h
  • Source/common/verifyinghasher/MemoryFileReader.h
  • Source/common/verifyinghasher/VerifyingHasherCore.mm
  • Source/common/verifyinghasher/VerifyingHasherCoreTest.mm
  • Testing/Fuzzing/install_libclang_fuzzer.sh
✅ Files skipped from review due to trivial changes (1)
  • Source/common/verifyinghasher/CountingMemoryFileReader.h
🚧 Files skipped from review as they are similar to previous changes (5)
  • Source/common/verifyinghasher/MemoryFileReader.h
  • Testing/Fuzzing/install_libclang_fuzzer.sh
  • Source/common/verifyinghasher/HashTraits.h
  • Source/common/verifyinghasher/CodeSignatureParser.h
  • Source/common/verifyinghasher/CodeSignatureParserTest.mm

Comment thread Source/common/verifyinghasher/VerifyingHasherCore.mm
Comment thread Source/common/verifyinghasher/VerifyingHasherCoreTest.mm
mlw added 2 commits May 11, 2026 10:20
Replace 20 manual `::close(fd)` calls across the verifyinghasher unit
tests with `santa::ScopedFile` for RAII closure. Helpers with multiple
early-return paths (CodeSignatureParserTest::ExtractCsBlob, the various
Slurp helpers) no longer repeat `::close` on each error branch.

testErrorOnNonMachOFile now uses `ScopedFile::CreateTemporary` instead
of a fixed-name path under NSTemporaryDirectory(); the mkstemp-backed
file is unlinked at creation and closed on scope exit, removing the
post-failure leftover risk and the cross-run collision risk on the
fixed-name path.

No behavior change. //Source/common/verifyinghasher:unit_tests 8/8.
Two whitespace-only changes from `Testing/fix.sh`:

* VerifyingHasherTest.mm: re-wrap two XCTAssertTrue lines added in the
  ScopedFile-adoption commit so they fit at the project column width.
* HashTraits.h: re-align trailing backslashes in
  VERIFYINGHASHER_CC_UPDATE_LOOP. The vh_-prefix rename in the review-
  response commit shifted the column widths and the macro wasn't re-
  formatted at the time.
@mlw mlw added this to the 2026.4 milestone May 11, 2026
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

♻️ Duplicate comments (1)
Source/common/verifyinghasher/VerifyingHasherCoreTest.mm (1)

258-264: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Make these test preconditions fatal before raw buffer ops.

XCTAssert* keeps executing. These paths still index into bytes, memset trailing regions, or memcmp fixed-length hashes after non-fatal precondition checks, so a missing fixture / short read / unexpected digest length can turn an ordinary test failure into OOB or UB. Please add an immediate return (or XCTSkipIf(...)) after each prerequisite check, including after the cp step in testDetectsTamperOnCopy.

Representative fix pattern
-  XCTAssertEqual(std::system(cp.c_str()), 0);
+  const int cp_rc = std::system(cp.c_str());
+  XCTAssertEqual(cp_rc, 0);
+  if (cp_rc != 0) return;

   auto bytes = Slurp(tampered.c_str());
+  XCTAssertFalse(bytes.empty());
+  if (bytes.empty()) return;
   bytes[3 * bytes.size() / 4] ^= 0xFF;
   XCTAssertGreaterThanOrEqual(bytes.size(), 32u * 1024u);
+  if (bytes.size() < 32u * 1024u) return;
   std::memset(bytes.data() + bytes.size() - 32 * 1024, 0, 32 * 1024);
   XCTAssertEqual(got.size(), static_cast<size_t>(CS_CDHASH_LEN));
+  if (got.size() != static_cast<size_t>(CS_CDHASH_LEN) ||
+      expected.size() != static_cast<size_t>(CS_CDHASH_LEN)) return;
   XCTAssertEqual(0, std::memcmp(got.data(), expected.data(), CS_CDHASH_LEN));

Also applies to: 351-355, 399-401, 501-502, 507-519, 524-525, 549-560, 578-595

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@Source/common/verifyinghasher/VerifyingHasherCoreTest.mm` around lines 258 -
264, Make the precondition assertions fatal before performing raw buffer
operations: in testDetectsTamperOnCopy and the other affected tests, after the
cp creation/assertion and each Slurp call or XCTAssert that validates
fixture/digest sizes (e.g., checks on cp, tampered, bytes.size(), expected
digest length) immediately return or call XCTSkipIf(...) on failure instead of
relying on non-fatal XCTAssert*. This prevents subsequent code that mutates or
indexes the raw buffer (bytes[...], memset of trailing regions, memcmp of
fixed-length hashes) from running on invalid data; update the checks surrounding
Slurp(...), bytes, memset, and memcmp sites to early-return/skip when a
prerequisite check fails.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Duplicate comments:
In `@Source/common/verifyinghasher/VerifyingHasherCoreTest.mm`:
- Around line 258-264: Make the precondition assertions fatal before performing
raw buffer operations: in testDetectsTamperOnCopy and the other affected tests,
after the cp creation/assertion and each Slurp call or XCTAssert that validates
fixture/digest sizes (e.g., checks on cp, tampered, bytes.size(), expected
digest length) immediately return or call XCTSkipIf(...) on failure instead of
relying on non-fatal XCTAssert*. This prevents subsequent code that mutates or
indexes the raw buffer (bytes[...], memset of trailing regions, memcmp of
fixed-length hashes) from running on invalid data; update the checks surrounding
Slurp(...), bytes, memset, and memcmp sites to early-return/skip when a
prerequisite check fails.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 18fe9425-2338-490c-9379-4d918a01cf36

📥 Commits

Reviewing files that changed from the base of the PR and between cb226d7 and 6c1c840.

📒 Files selected for processing (7)
  • Source/common/verifyinghasher/BUILD
  • Source/common/verifyinghasher/CodeSignatureParserTest.mm
  • Source/common/verifyinghasher/FileReaderTest.mm
  • Source/common/verifyinghasher/HashTraits.h
  • Source/common/verifyinghasher/HeaderParserTest.mm
  • Source/common/verifyinghasher/VerifyingHasherCoreTest.mm
  • Source/common/verifyinghasher/VerifyingHasherTest.mm
🚧 Files skipped from review as they are similar to previous changes (6)
  • Source/common/verifyinghasher/FileReaderTest.mm
  • Source/common/verifyinghasher/VerifyingHasherTest.mm
  • Source/common/verifyinghasher/HashTraits.h
  • Source/common/verifyinghasher/BUILD
  • Source/common/verifyinghasher/HeaderParserTest.mm
  • Source/common/verifyinghasher/CodeSignatureParserTest.mm

tburgin
tburgin previously approved these changes May 11, 2026
Closes the four open review threads on the PR.

* Testing/Fuzzing/README.md: tag the five unlabeled fenced code
  blocks with bash/starlark (markdownlint MD040).

* Testing/OneOffs/verifying_hasher_smoke.sh: derive Case 3's tamper
  offset from Case 1's verifier output (slice_off + codeLimit/2)
  instead of a hardcoded 49152 + 1024. Keeps the smoke test valid
  across future /usr/bin/yes layout changes (size shrink, arch
  reorder, thin-vs-fat).

* VerifyingHasherCore.mm: promote premature EOF in the header phase
  to kIoError to match the phase-3 / phase-5 classification added in
  the prior review-response commit. Also add a "reader served past
  reported size" check in all three phases — production hook for
  fd-backed readers is the phase-1 case (file growth between fstat
  and a later pread); phases 3 and 5 add defense-in-depth against
  misbehaving custom FileReader implementations. Two regression
  tests cover the truncation and over-serve directions for the
  header phase.

* VerifyingHasherCoreTest.mm: make 8 test preconditions fatal via
  if (cond) { XCTFail(...); return; } so a missing or mis-sized
  fixture produces a clean test failure rather than OOB access in
  the next memset/memcmp/index op.
Comment thread Source/common/verifyinghasher/BUILD Outdated
Comment thread Source/common/verifyinghasher/CodeSignatureParser.h Outdated
Comment thread Source/common/verifyinghasher/CodeSignatureParser.mm Outdated
Comment thread Source/common/verifyinghasher/CodeSignatureParser.mm Outdated
Comment thread Source/common/verifyinghasher/HeaderParser.mm Outdated
Comment thread Source/common/verifyinghasher/VerifyingHasherCore.mm Outdated
Comment thread Source/common/verifyinghasher/VerifyingHasherCoreTest.mm Outdated
Comment thread Testing/Fuzzing/fuzzing.bzl Outdated
Comment thread Source/common/verifyinghasher/HeaderParser.mm Outdated
Comment thread Source/common/verifyinghasher/HeaderParser.mm Outdated
Closes the open human-review threads on the PR.

* HeaderParser: reject LE-on-disk fat (host-order FAT_MAGIC /
  FAT_MAGIC_64) at the magic dispatch. ProcessFatHeader and
  ProcessFatArchTable assume BE-on-disk and unconditionally
  OSSwap-from-BE, so accepting LE here would silently corrupt
  every field downstream. Apple's tools and dyld produce only
  BE-on-disk fat; xnu's mach_loader also rejects LE via
  OSSwapBigToHostInt32(magic) == FAT_MAGIC. New regression test
  testRejectsLittleEndianFatMagic covers both magics.

* HeaderParser: factor the mh_swap_ ternaries through a templated
  SwapIfNeeded member. Eight call sites (four ternaries plus two
  if-blocks doing pairwise swaps) collapse to direct assignments.

* CodeSignatureParser: drop the redundant compare_size field on
  ParsedCodeDirectory. Slot stride and compare size already live
  on the HashTraits picked by the RunStreamingPhases<T> dispatch;
  no hashing or verification code read the runtime field.

* CodeSignatureParser: reuse the existing `version` local in
  ParseCodeDirectory instead of byteswapping cd->version twice.

* HeaderParser / CodeSignatureParser: convert XNU file:line
  citations to permalinks pinned at xnu-12377.101.15, and add
  precise lines to the three citations that previously named a
  function without a line. Also correct the "no hash to validate"
  citation from ubc_subr.c:5849 (a local buffer declaration) to
  ubc_subr.c:5922-5928 (the actual `validated = FALSE` path
  inside cs_validate_hash).

* BUILD: drop the internal "Task 13" reference to the
  :VerifyingHasher facade target — point at the rule's actual
  definition below.

* Testing/Fuzzing/fuzzing.bzl: generalize the module docstring;
  objc_fuzz_test is a generic macro, not VerifyingHasher-specific.

* Bump Copyright 2025 -> 2026 on the five files that missed the
  earlier sweep.
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (2)
Source/common/verifyinghasher/HeaderParserTest.mm (2)

127-150: ⚡ Quick win

Pin the fat-path tests to the checked-in universal fixture.

These assertions only exercise the fat path while /usr/bin/file happens to stay universal on the runner. If that binary ever becomes thin on a supported macOS image, slice_offset > 0 stops being a stable invariant and testChunkSizeInvariance / testSliceOffsetIfKnown stop covering the fat-header flow. The hw_universal fixture mentioned in this PR would make these tests deterministic across host architectures and OS updates.

Also applies to: 257-287

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@Source/common/verifyinghasher/HeaderParserTest.mm` around lines 127 - 150,
The tests testParsesFatBinary and testChunkSizeInvariance (and the other
referenced block) rely on /usr/bin/file being a universal binary; switch them to
use the checked-in universal fixture (e.g. the hw_universal fixture) instead of
Slurp("/usr/bin/file") so the fat-path is deterministic across runners. Update
calls that pass the data into FeedChunked (and the SliceInfo assertions) to read
the fixture (keep using kHostArch, FeedChunked, and santa::SliceInfo) and also
update the related testSliceOffsetIfKnown to use the same hw_universal fixture.

243-250: ⚡ Quick win

Avoid the 256 MiB backing buffer in the size-cap regression.

HeaderParser only needs the declared file size plus the header/load-command bytes to reject this case, so materializing the whole 256 MiB slice just adds a large heap spike to unit and ASan runs. Consider feeding only the small prefix needed for the header while constructing the parser with kTotal, or teach FeedChunked to take a logical file size separate from the supplied bytes.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@Source/common/verifyinghasher/HeaderParserTest.mm` around lines 243 - 250,
The test testRejectsImplausibleCsBlobSize currently allocates a full 256 MiB
backing buffer via MakeThin64WithCsBlob which is unnecessary; instead produce
only the small prefix bytes required for the Mach-O header and load-commands
(and the code-signature LC entry) and pass the declared total size kTotal as the
logical file size to the parser path: either call FeedChunked with the small
prefix buffer and an added parameter representing the logical file size (or
modify FeedChunked to accept a separate fileSize argument) or construct the
HeaderParser/its input using kTotal while supplying only the prefix bytes; keep
MakeThin64WithCsBlob to return just the header+LC prefix and ensure
FeedChunked/HeaderParser still sees kTotal so the implausible cs blob size check
triggers without allocating 256 MiB.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Nitpick comments:
In `@Source/common/verifyinghasher/HeaderParserTest.mm`:
- Around line 127-150: The tests testParsesFatBinary and testChunkSizeInvariance
(and the other referenced block) rely on /usr/bin/file being a universal binary;
switch them to use the checked-in universal fixture (e.g. the hw_universal
fixture) instead of Slurp("/usr/bin/file") so the fat-path is deterministic
across runners. Update calls that pass the data into FeedChunked (and the
SliceInfo assertions) to read the fixture (keep using kHostArch, FeedChunked,
and santa::SliceInfo) and also update the related testSliceOffsetIfKnown to use
the same hw_universal fixture.
- Around line 243-250: The test testRejectsImplausibleCsBlobSize currently
allocates a full 256 MiB backing buffer via MakeThin64WithCsBlob which is
unnecessary; instead produce only the small prefix bytes required for the Mach-O
header and load-commands (and the code-signature LC entry) and pass the declared
total size kTotal as the logical file size to the parser path: either call
FeedChunked with the small prefix buffer and an added parameter representing the
logical file size (or modify FeedChunked to accept a separate fileSize argument)
or construct the HeaderParser/its input using kTotal while supplying only the
prefix bytes; keep MakeThin64WithCsBlob to return just the header+LC prefix and
ensure FeedChunked/HeaderParser still sees kTotal so the implausible cs blob
size check triggers without allocating 256 MiB.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 6cfe9c76-ea9a-4588-949a-8a367d68ef74

📥 Commits

Reviewing files that changed from the base of the PR and between 1e9c6bc and 8762323.

📒 Files selected for processing (13)
  • Source/common/verifyinghasher/BUILD
  • Source/common/verifyinghasher/CodeSignatureParser.h
  • Source/common/verifyinghasher/CodeSignatureParser.mm
  • Source/common/verifyinghasher/CodeSignatureParserTest.mm
  • Source/common/verifyinghasher/HeaderParser.h
  • Source/common/verifyinghasher/HeaderParser.mm
  • Source/common/verifyinghasher/HeaderParserTest.mm
  • Source/common/verifyinghasher/VerifyingHasherCore.h
  • Source/common/verifyinghasher/VerifyingHasherCore.mm
  • Source/common/verifyinghasher/VerifyingHasherCoreTest.mm
  • Testing/Fuzzing/HeaderParserFuzzer.mm
  • Testing/Fuzzing/VerifyingHasherFuzzer.mm
  • Testing/Fuzzing/fuzzing.bzl
🚧 Files skipped from review as they are similar to previous changes (11)
  • Testing/Fuzzing/HeaderParserFuzzer.mm
  • Testing/Fuzzing/fuzzing.bzl
  • Source/common/verifyinghasher/VerifyingHasherCore.h
  • Source/common/verifyinghasher/CodeSignatureParser.h
  • Source/common/verifyinghasher/HeaderParser.h
  • Source/common/verifyinghasher/CodeSignatureParser.mm
  • Source/common/verifyinghasher/HeaderParser.mm
  • Source/common/verifyinghasher/VerifyingHasherCore.mm
  • Source/common/verifyinghasher/BUILD
  • Source/common/verifyinghasher/VerifyingHasherCoreTest.mm
  • Source/common/verifyinghasher/CodeSignatureParserTest.mm

* HeaderParserTest: switch testParsesFatBinary, testChunkSizeInvariance,
  and testSliceOffsetIfKnown from /usr/bin/file to the checked-in
  hw_universal fixture for determinism. The system binary is fat
  today, but Apple has been progressively shipping arm64-only system
  binaries; a future CI runner could silently lose fat coverage in
  these tests. The fixture carries arm64 + x86_64 (no arm64e), so a
  host-conditional kHwUniversalArch selector picks the slice actually
  present rather than reusing kHostArch (which resolves to arm64e on
  Apple Silicon).

* HeaderParserTest: rework testRejectsImplausibleCsBlobSize to feed
  just the header + LC prefix (~48 bytes) instead of allocating a
  256 MiB zero-filled buffer per test run. The parser hits the
  kMaxCsBlobSize cap on the LC_CODE_SIGNATURE entry before reading
  the declared CS region, so we tell it the logical file size via
  kTotal and only hand it the prefix.
@mlw mlw requested review from russellhancox and tburgin May 12, 2026 01:02
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@Source/common/verifyinghasher/CodeSignatureParser.mm`:
- Around line 130-150: The code currently sets saw_canonical as soon as
slot_type == CSSLOT_CODEDIRECTORY before the CD blob is validated, allowing a
malformed canonical CD to be ignored if an alternate parses; fix by deferring
saw_canonical and candidate insertion until after the blob is fully validated:
move the checks for blob_magic == CSMAGIC_CODEDIRECTORY, blob_len range, and
blob_len >= kMinCdBlobLen before touching saw_canonical or pushing into
candidates; additionally, if is_canonical and any of those validations fail, set
err to a descriptive message (e.g., "malformed code signature (CD blob ... )")
and return false so a malformed canonical CD rejects the whole superblob, while
keeping the existing skip behavior for malformed alternates (is_alternate).
- Around line 89-90: The parser must clear the ParsedCodeDirectory output before
attempting any parse to avoid carrying stale identifier/team_id/spans/hashes
into a new parse; at the start of ParseCodeSignature(std::span<const uint8_t>,
uint64_t, ParsedCodeDirectory&, std::string&) assign a fresh value to out (e.g.
out = ParsedCodeDirectory{} or explicitly clear identifier, team_id, span/hash
fields and offsets) so every early return leaves a clean state, and apply the
same reset to the other code-directory parsing function in this file that also
populates a ParsedCodeDirectory to prevent stale values from surviving between
calls.

In `@Source/common/verifyinghasher/HeaderParserTest.mm`:
- Around line 58-66: The Slurp helper currently calls pread once and treats any
short or interrupted read as a failure; change Slurp to perform a read loop
using pread on santa::ScopedFile::UnsafeFD() that accumulates bytes until the
requested st.st_size is read or a terminal error/EOF occurs: use a ssize_t total
= 0 and while (total < static_cast<ssize_t>(v.size())) { n = ::pread(fd,
v.data() + total, v.size() - total, total); if (n > 0) total += n; else if (n ==
0) return {}; else if (errno == EINTR) continue; else return {}; } and only
return the filled vector when total == v.size(); reference Slurp, ::pread,
santa::ScopedFile, and ::fstat to locate and update the function.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 0c5ac9bf-0fae-420e-91d8-bda02b7aab6c

📥 Commits

Reviewing files that changed from the base of the PR and between 1e9c6bc and 732d64f.

📒 Files selected for processing (13)
  • Source/common/verifyinghasher/BUILD
  • Source/common/verifyinghasher/CodeSignatureParser.h
  • Source/common/verifyinghasher/CodeSignatureParser.mm
  • Source/common/verifyinghasher/CodeSignatureParserTest.mm
  • Source/common/verifyinghasher/HeaderParser.h
  • Source/common/verifyinghasher/HeaderParser.mm
  • Source/common/verifyinghasher/HeaderParserTest.mm
  • Source/common/verifyinghasher/VerifyingHasherCore.h
  • Source/common/verifyinghasher/VerifyingHasherCore.mm
  • Source/common/verifyinghasher/VerifyingHasherCoreTest.mm
  • Testing/Fuzzing/HeaderParserFuzzer.mm
  • Testing/Fuzzing/VerifyingHasherFuzzer.mm
  • Testing/Fuzzing/fuzzing.bzl
🚧 Files skipped from review as they are similar to previous changes (10)
  • Testing/Fuzzing/HeaderParserFuzzer.mm
  • Source/common/verifyinghasher/CodeSignatureParser.h
  • Source/common/verifyinghasher/VerifyingHasherCore.h
  • Testing/Fuzzing/VerifyingHasherFuzzer.mm
  • Source/common/verifyinghasher/HeaderParser.h
  • Source/common/verifyinghasher/VerifyingHasherCore.mm
  • Source/common/verifyinghasher/BUILD
  • Source/common/verifyinghasher/HeaderParser.mm
  • Source/common/verifyinghasher/CodeSignatureParserTest.mm
  • Source/common/verifyinghasher/VerifyingHasherCoreTest.mm

Comment thread Source/common/verifyinghasher/CodeSignatureParser.mm
Comment thread Source/common/verifyinghasher/CodeSignatureParser.mm
Comment thread Source/common/verifyinghasher/HeaderParserTest.mm
mlw added 2 commits May 11, 2026 22:52
* CodeSignatureParser: reset the output ParsedCodeDirectory at the
  top of ParseCodeSignature so callers reusing it across parses
  don't leak stale identifier/team_id (conditionally-overwritten
  string fields) from a prior successful call into the next one.
  Latent today — VerifyingHasherCore's parsed_cd_ is effectively
  single-use — but a future caller that reuses the struct would
  silently feed stale team_id into the SID/TID drift path.

* HeaderParserTest: loop in Slurp's pread so a short read or EINTR
  doesn't truncate to {} and surface as an opaque assertion failure
  downstream. Test-only helper; rare in practice on local files
  but the loop is the correct shape.
…andidate

ParseCodeSignature's candidate-collection loop previously set
saw_canonical from the index entry's slot_type alone (before the CD
blob's own bytes were inspected) and silently skipped candidates with
the wrong magic or with length < kMinCdBlobLen. Two divergences from
xnu's cs_validate_csblob fell out:

* A slot-0 index entry pointing at non-CodeDirectory bytes still
  flipped saw_canonical. With a well-formed alternate present, the
  parser would succeed picking the alternate. xnu rejects the whole
  superblob via cs_validate_codedirectory on the canonical.

* A malformed high-rank alternate (wrong magic or undersized) would
  silently `continue`, letting the picker fall back to a lower-rank
  valid CD. An attacker could use this to steer which (cdhash,
  identifier, team_id) triple the parser publishes downstream — xnu
  rejects the whole superblob in this case too.

Move the magic + length floor checks before saw_canonical / push_back,
and promote every failure (canonical *and* alternate) to a hard reject
with a descriptive error. Matches the preamble of xnu's
cs_validate_codedirectory (length >= sizeof(*cd), magic ==
CSMAGIC_CODEDIRECTORY) for every CD candidate. The post-loop rationale
comment is rewritten to describe the new invariant: non-picked
candidates still skip the deeper per-CD structural pass xnu runs, but
the floor closes the picker-steering attack and we only hash the
picked CD's bytes anyway.

Tests:

* testRejectsTinyCdBlobLen — the existing assertion expected the old
  "no CodeDirectory blobs found" error from a silent skip; updated to
  expect the new "CD blob too small" rejection.

* testRejectsMalformedCanonicalWithValidAlternate (new) — slot-0
  payload has wrong magic, alternate is well-formed; must reject.

* testRejectsUndersizedCanonicalWithValidAlternate (new) — slot-0
  payload has CD magic but length < kMinCdBlobLen, alternate is
  well-formed; must reject.

* testRejectsMalformedAlternateWithValidCanonical (new) — the
  picker-steering case: well-formed SHA-1 canonical plus malformed
  SHA-256 alternate; must reject rather than fall back to the
  canonical.

Verified clean against the VerifyingHasherFuzzer corpus (60s, 128k
iterations, no crashes; new rejection branches covered).
@mlw mlw merged commit c967421 into main May 12, 2026
14 of 15 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

comp/common lang/objc++ PRs modifying files in ObjC++ size/xl Size: extra large

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants