FIFO (named-pipe) bait as the v1 read sensor (replaces fs_usage); eslogger deferred #100

Description

@LiorFink00

Decision (v1)

Replace fs_usage with FIFO (named-pipe) bait as the sole read sensor for v1. The agent plants each bait as a named pipe and holds the write end; any process that opens the bait to read it unblocks our open() — that open is the detection. Pure shell, unprivileged, cross-platform, and no kdebug — which dissolves the single-consumer collision that originally motivated this issue (#94/#95).
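The mechanism can be sketched in a few lines, assuming a POSIX shell with mkfifo. The bait path, the decoy string, and the background cat standing in for a credential reader are all illustrative, not Thumper's code.

```shell
#!/bin/sh
# Sketch: a blocking open of a FIFO's write end returns only once some process
# opens the read end, so getting past the open is the detection event.
dir=$(mktemp -d)
bait="$dir/credentials"          # hypothetical bait path
mkfifo "$bait"

# Stand-in for a stealer: opens the bait for reading after a short delay.
( sleep 1; cat "$bait" > "$dir/stolen" ) &
reader=$!

# The redirect blocks until the reader opens the pipe, then serves the decoy.
printf '%s' 'decoy-secret' > "$bait"
detected=yes                     # reached only after a reader showed up

wait "$reader"
stolen=$(cat "$dir/stolen")
rm -rf "$dir"
echo "detected=$detected stolen=$stolen"
```

Nothing here needs privileges or a kernel tracing facility, which is why two agents on one host would not contend with each other.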

eslogger is deferred to a documented follow-up (see bottom), to be added only if/when we want scan-path coverage or race-free attribution. fs_usage is removed.

Why FIFO is sufficient for the real threat (verified)

The headline threat (Shai-Hulud-class npm worms) reads canonical credential paths by name (~/.aws/credentials, ~/.npmrc, ~/.config/gcloud/..., azureProfile.json) via readFileSync and cloud-SDK config loaders. Verified locally:

  • A blind readFileSync on a FIFO drains our served content instantly.
  • existsSync returns true for a FIFO, and an existsSync-guarded read still drains it — the typical stealer pattern.
  • The serving loop + grep -F path prefilter is pure shell, no dependency.

The worm also runs trufflehog, which skips non-regular files (if !entry.Type().IsRegular() { continue }, confirmed in trufflehog source — scanDir, Chunks, scanSymlink), so it walks past a FIFO. This does not matter for canonical-path bait, because the worm reads those paths by name, independently of the trufflehog sweep. The trufflehog-skip only blinds bait discoverable solely by scanning a non-canonical path — which is not Thumper's placement strategy. (Symlink tricks don't help: trufflehog applies the regular-file test to the resolved target.)

Known blind spots (accepted for v1)

  1. A worm that calls statSync(path).isFile() before reading skips a FIFO (isFile() is false for a pipe; verified). Atypical for credential stealers.
  2. mmap-based readers can't read a FIFO (mmap fails with ENODEV). Most readers use read().
  3. Scan-only discovery of a non-canonical decoy path (trufflehog, above).

All three are covered by the deferred eslogger follow-up (a real ES sensor sees the open of a plain regular file regardless of type or discovery method, with inline attribution). Not in v1.
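Blind spot 1 can be reproduced from the shell, where the file-type tests mirror the Node checks: [ -e ] behaves like existsSync, and [ -f ] behaves like statSync(path).isFile(). This is a small illustration under that analogy, not a test of Node itself.

```shell
#!/bin/sh
# Sketch: a FIFO passes an existence check but fails a regular-file check.
dir=$(mktemp -d)
mkfifo "$dir/bait"

exists=no; regular=no; pipe=no
[ -e "$dir/bait" ] && exists=yes     # existence check: a FIFO passes
[ -f "$dir/bait" ] && regular=yes    # regular-file check: a FIFO fails
[ -p "$dir/bait" ] && pipe=yes       # the check verify_planted() relies on

rm -rf "$dir"
echo "exists=$exists regular=$regular pipe=$pipe"
# prints: exists=yes regular=no pipe=yes
```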

Serving model + integration

  • plant(): mkfifo "$path" instead of curl -o "$path"; fetch the bait content once and cache it; the watcher writes the cached content into the pipe on each read. Keep the existing traversal/symlink/overwrite guards.
  • verify_planted(): treat [ -p "$p" ] as planted. It already only stat-checks (never reads), so it does not self-trigger.
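  • Serving loop replaces watch_fs_usage(): while :; do exec 3>"$fifo"; <snapshot reader pid>; printf '%s' "$content" >&3; exec 3>&-; fire ...; done. Opening the write end on its own fd keeps the snapshot between the unblocked open and the write; with a plain > "$fifo" redirect the open and the write happen in one step, so a snapshot placed before it would run before any reader exists.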
  • Reuse fire / debounce / is_noise unchanged.
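One serving iteration might look like the following sketch. serve_once and snapshot_reader are hypothetical names introduced for illustration (the issue only names fire, debounce, is_noise, and user_of_pid), and the background cat stands in for a reader.

```shell
#!/bin/sh
# serve_once FIFO CONTENT: block until a reader opens FIFO, give the caller a
# window to attribute the reader, then write the cached content.
serve_once() {
    fifo=$1 content=$2
    exec 3> "$fifo"              # blocks until some process opens the read end
    snapshot_reader "$fifo"      # hypothetical hook: runs before any byte is served
    printf '%s' "$content" >&3
    exec 3>&-                    # close the write end so the reader sees EOF
}

# Placeholder hook that only counts calls; a real one would look up the reader pid.
snapshots=0
snapshot_reader() { snapshots=$((snapshots + 1)); }

dir=$(mktemp -d)
mkfifo "$dir/bait"
( cat "$dir/bait" > "$dir/out" ) &   # stand-in reader
serve_once "$dir/bait" 'cached-decoy'
wait
served=$(cat "$dir/out")
rm -rf "$dir"
echo "served=$served snapshots=$snapshots"
```

In the agent this would presumably sit inside while :; do ...; done, with fire called after each iteration. The content is fetched once and passed in, so no network call happens on the read path.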

PID/user attribution (best-effort)

After our open(O_WRONLY) unblocks, the reader is parked in read() with the pipe still open. Snapshot lsof "$path" (or fuser) before writing content to grab the reader's pid, then feed the existing user_of_pid. This is less racy than today's post-hoc lookup (the reader is guaranteed to still hold the pipe). Degrade gracefully to "reader unknown" when lsof misses.

Operational risks to handle in v1

Relationship to related sensor issues

Acceptance

  • Reading a planted FIFO bait fires the HMAC callback — pure shell, unprivileged — on macOS and Linux.
  • Two agents on one host both detect reads of their own bait simultaneously (no kdebug contention).
  • Agent exit removes its FIFOs (no orphaned, blocking pipes).
  • The callback includes the reader's pid/user when obtainable, degrading gracefully when not.
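The cleanup criterion could be met with an exit trap along these lines. plant_fifo and cleanup_fifos are hypothetical helper names, and the list format assumes bait paths contain no newlines.

```shell
#!/bin/sh
# Sketch: remember every planted FIFO and remove them all when the agent exits.
planted=""                         # newline-separated list of planted paths

plant_fifo() {
    mkfifo "$1" && planted="${planted}$1
"
}

cleanup_fifos() {
    printf '%s' "$planted" | while IFS= read -r p; do
        [ -p "$p" ] && rm -f "$p"  # only remove paths that are still pipes
    done
    planted=""
}

trap cleanup_fifos EXIT
trap 'exit 1' HUP INT TERM         # route signals through the EXIT trap

dir=$(mktemp -d)
plant_fifo "$dir/a"
plant_fifo "$dir/b"
cleanup_fifos                      # called directly here to show the effect
remaining=$(ls "$dir" | wc -l | tr -d ' ')
rmdir "$dir"
echo "remaining=$remaining"
```

A trap cannot run on SIGKILL, so a startup sweep for stale pipes would still be needed to cover that case.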

Deferred: eslogger follow-up (not v1)

For scan-path coverage + race-free attribution, add eslogger open as an optional macOS sensor: macOS 13+, root, JSON parsed dependency-free via plutil -extract <keypath> raw -o - -, multi-client (no kdebug). Verified feasible on macOS 26.5.1 (event type open, plutil extraction of event.open.file.path / process.executable.path / pid). TCC/Full Disk Access in the launchd/MDM deploy path is the open question. Tracked separately when prioritized.


Update (2026-06-26): validated layered attribution design

The "normal file" gap (a worm that statSync().isFile()-guards, mmaps, or only scan-discovers a path walks past a FIFO: blind spots #1 to #3 above) and the pid-attribution goal are now handled by a layered sensor rather than FIFO-alone. Each layer was empirically validated on macOS 26.5.1:

  1. Primary detection: atime tripwire on a REGULAR-FILE bait. Re-armable (touch -a -t <past>; APFS relatime bumps atime on read; poll stat). Catches every reader type, including the FIFO blind spots, and satisfies "normal file." Honors all sensor constraints (regular file, no kdebug, no mount, no privilege). No pid; detection only. Depends on #28 ("Agent atime fallback sensor is broken"), since atime is now a primary layer, not a last resort.
  2. Deterministic pid: FIFO companion. Planted alongside the regular-file bait. The agent polls the write end (open(O_WRONLY|O_NONBLOCK) fails with ENXIO until a reader is parked); on rendezvous the reader is parked in read() with its fd open, and an inode/realpath scan of full lsof output yields the exact reader pid+name. Validated 8/8 deterministic. Correctness bug found: lsof -t <path> does not match FIFOs (returns nothing), so attribution must scan full lsof output and match the bait by inode/realpath, excluding the agent's own pid.
  3. Best-effort pid shortlist: churn-ledger + bait constellation. A lightweight proc_listpids-delta ledger (no kdebug/ESF) records short-lived processes; when an atime bait trips, the processes newly spawned and alive across the read window form a suspect set, and intersecting the suspect sets of several constellation baits narrows it. Validated: 100% capture of the reader, but a ~9-process shortlist under realistic concurrent churn (NOT a unique pid); rank within it by stealer-shape heuristics (interpreter, open-file count, live outbound socket, parent lineage). Honest limit: a pre-existing long-lived reader emits no new-process signal and can't be attributed this way.
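The arm, poll, and re-arm cycle of layer 1 can be sketched as below. The helper names are hypothetical, and the stat fallback assumes GNU stat (-c %X) or BSD/macOS stat (-f %a); whether a plain read moves atime depends on the filesystem's mount options.

```shell
#!/bin/sh
# atime_of FILE: access time in epoch seconds (GNU stat first, BSD/macOS second).
atime_of() { stat -c %X "$1" 2>/dev/null || stat -f %a "$1"; }

# arm_bait FILE: push atime into the past and print the armed value.
arm_bait() { touch -a -t 200001010000 "$1" && atime_of "$1"; }

# is_tripped FILE ARMED: succeeds once atime no longer equals the armed value.
is_tripped() { [ "$(atime_of "$1")" -ne "$2" ]; }

dir=$(mktemp -d)
bait="$dir/credentials"            # hypothetical regular-file bait
printf '%s' 'decoy-secret' > "$bait"
armed=$(arm_bait "$bait")

cat "$bait" > /dev/null            # a read; atime moves only if the mount allows it
if is_tripped "$bait" "$armed"; then
    echo "tripped: re-arming"
    armed=$(arm_bait "$bait")      # re-arm so the next read is also visible
fi
rm -rf "$dir"
```

Arming to a fixed past timestamp keeps atime older than mtime, which is the condition under which relatime-style filesystems update atime on the next read.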

Net: detection on a normal file under all constraints (atime); a deterministic pid for the dominant raw-read worm (FIFO companion); a best-effort shortlist for the rest — with no FDA / kext / ESF / kdebug / mount. The "read + pid under all six constraints" grail is provably impossible on macOS (every read-with-pid kernel hook — ESF, kdebug, DTrace, OpenBSM, Kauth/MACF, FSE_ACCESS_GRANTED — is FDA/entitlement/kext-gated or dead; verified across ~44 empirical + literature investigations). This layered design is the validated maximum.

Update (2026-06-27): definitive mechanisms only

Product decision: only definitive mechanisms. Layer 3 (churn-ledger best-effort shortlist) and the bait-constellation intersection are dropped — they capture the reader but only within a ~9-candidate shortlist, never a definitive pid, and a tripwire must not guess. PR #161 closed.

The two definitive layers remain:

  1. atime tripwire (regular-file bait): definitive detection of a read on a normal file, re-armable (PR #160, "Re-armable atime sensor + --sensor selector (#28, #100)"). No pid.
  2. FIFO companion: definitive attribution. Parks the raw reader and reads its exact pid+name via the lsof inode scan, 8/8 (PR #123, "FIFO named-pipe bait sensor (replaces fs_usage)").
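The lsof inode scan might be shaped like the sketch below, which parses lsof's field output (-F with the p, c, and i fields for pid, command, and inode). readers_of_inode and inode_of are hypothetical names; matching on inode alone is a simplification, and a stricter version would also compare the device or the realpath. The sample input is canned, not captured output.

```shell
#!/bin/sh
# inode_of PATH: inode number of PATH.
inode_of() { ls -di "$1" | awk '{ print $1 }'; }

# readers_of_inode INODE SELF_PID: read `lsof -F pci` output on stdin and print
# "pid command" for each process other than SELF_PID that holds INODE open.
readers_of_inode() {
    awk -v ino="$1" -v self="$2" '
        /^p/ { pid = substr($0, 2); cmd = ""; seen = 0; next }
        /^c/ { cmd = substr($0, 2); next }
        /^i/ { if (substr($0, 2) == ino && pid != self && !seen) { print pid, cmd; seen = 1 } }
    '
}

# Canned sample: pid 100 is the agent itself, pid 200 holds the bait inode 4242.
sample='p100
cagent
f3
i4242
p200
cnode
f17
i4242
p300
cbash
f0
i7'
hits=$(printf '%s\n' "$sample" | readers_of_inode 4242 100)
echo "$hits"
# prints: 200 node
```

Against a live bait the call would be something like lsof -F pci 2>/dev/null | readers_of_inode "$(inode_of "$fifo")" "$$", run while the reader is still parked on the pipe.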

Labels: bug, enhancement