Soph Fix Checkpoint Path Traversal#1362
Closed
entire[bot] wants to merge 11 commits into
Closed
Conversation
…write Session IDs read from checkpoint metadata on the shared entire/checkpoints/v1 branch flowed into agent.ResolveSessionFile + WriteSession during `entire session resume` and `entire checkpoint rewind` with no validation. A crafted absolute or "../"-laden session ID escaped the agent session directory (and for Codex/Pi, which return absolute IDs verbatim, landed anywhere), letting attacker-controlled transcript bytes overwrite arbitrary files such as ~/.bashrc — RCE on the next resume, with no prompt shown to the victim. Validate the session ID with validation.ValidateSessionID at the two restore choke points before any path construction: - resolveTranscriptPath (covers resume-single, rewind restore, attach) - RestoreLogsOnly write + status loops (multi-session resume/rewind) This mirrors the invariant already enforced when checkpoints are written, so it cannot reject a legitimately-created checkpoint while closing every separator/absolute traversal. Unsafe IDs are rejected (resolveTranscriptPath) or skipped with a warning (RestoreLogsOnly). Adds end-to-end and choke-point regression tests proving a traversal session ID writes nothing outside the agent session directory. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Entire-Checkpoint: 6963a4a931b1
ValidateSessionID previously rejected only path separators and empty input. A bare "." or ".." is separator-free yet still traverses when an agent uses the ID as a path segment (e.g. Copilot CLI builds <dir>/<id>/events.jsonl), and the separator check can miss platform-specific absolute forms (Windows drive paths). Reject both. Because the same validator guards both checkpoint writes and the resume/rewind restore boundaries, this tightens the whole class without affecting UUID-style IDs. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Entire-Checkpoint: 388c3d8a3b31
When `entire checkpoint rewind` restores files from a checkpoint tree, the file write already used os.Root (kernel-enforced containment), but the preceding directory creation used a raw os.MkdirAll(filepath.Join(repoRoot, f.Name)). A crafted tree entry name containing ".." from an untrusted checkpoint could thus create empty directories outside the repo before the guarded write refused to escape. Add osroot.MkdirAll, which creates each level via os.Root.Mkdir so the kernel rejects any path that escapes the root, and use it in the restore loop. Both filesystem operations in that loop are now containment-safe. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Entire-Checkpoint: f52cf53648bf
The PostToolUse (TodoWrite) hook passed input.SessionID and the task tool-use ID straight into SessionMetadataDirFromSessionID / TaskMetadataDir and then os.ReadDir, with no validation — unlike every other hook entry point. A crafted "../.." could redirect the directory listing. Validate both IDs at this choke point; an invalid value simply starts the checkpoint sequence at 1. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Entire-Checkpoint: ddf2fe4a50fe
…tests Document on the Agent.ResolveSessionFile interface that some agents use agentSessionID as a directory component or (Codex/Pi) return it verbatim when absolute, so callers sourcing the ID from untrusted data MUST validate it first — the resume/rewind choke points do. Add contract-pinning tests for the two unusual agents: Codex (absolute returned verbatim) and Copilot (ID as directory component). Each asserts both the agent behavior and that ValidateSessionID rejects the dangerous shape, so the guard cannot regress out from under these agents. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Entire-Checkpoint: 383105db9469
The session-state writers key their filenames on the session ID. The ID
is already validated, so this is belt-and-suspenders: route the writes
through an os.Root scoped to the entire-sessions directory so the kernel
makes escaping the directory impossible even if a validation gap were
ever introduced.
- session/state.go: Save's atomic rename now goes through os.Root
(Load and Clear already did). Drops the now-unused stateFilePath.
- strategy/session_state.go: StoreModelHint, StoreAgentTypeHint,
ClaimSessionStartBanner, LoadModelHint, LoadAgentTypeHint and
ClearSessionState now operate via os.Root. Two small helpers
(openSessionStateRoot / ...ForRead) remove the repeated preamble.
Go 1.26's os.Root natively supports Rename/OpenFile(O_EXCL)/WriteFile,
so the atomic-rename and first-writer-wins semantics are preserved.
The sibling entire-session-locks/ dir is intentionally left as-is: its
flock semantics are inode-bound and it is a separate directory.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Entire-Checkpoint: f95ae94b9edc
Go 1.26's os.Root has a native MkdirAll, so the hand-rolled per-segment
Mkdir loop (with its ToSlash/Trim/path.Join handling and extra imports)
collapses to a one-line delegation — battle-tested and stats before
creating. Also:
- drop the dead `dir != ""` guard in the rewind restore loop
(path.Dir never returns "")
- drop TestMkdirAll_EmptyOrDotIsNoop, which exercised the removed
Trim/no-op logic; the call site never passes ""/"."
- correct the now-stale osroot package doc that listed MkdirAll as
unsupported
Pure cleanup from a /simplify review; no behavior change.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Entire-Checkpoint: 38a921fb7470
Pi's ParseHookEvent calls captureTranscript on agent_end, which builds the destination path from the hook-supplied session ID and writes the transcript — all before the lifecycle dispatcher validates the ID in DispatchLifecycleEvent. A "../"-laden ID could thus write outside the .entire/tmp/pi cache directory. Validate the session ID at the top of captureTranscript (the choke point where it becomes a path, covering both call sites); an unsafe ID returns "" — the function's existing "no capture" signal — so dispatch behavior is unchanged. Also corrects the gosec comment on the write that wrongly claimed the ID was already validated. Taint is the local Pi hook payload (same-privilege), so severity is low; this is defense-in-depth consistent with the other hook hardening in this branch. The sibling cacheSessionID writes a fixed filename, not an ID-derived path, so it needs no guard. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Entire-Checkpoint: ddfd40f73b22
handleLifecycleTurnEnd builds .entire/metadata/<session_id>/ via os.MkdirAll + os.WriteFile directly from the hook-supplied event.SessionID with no validation — unlike its siblings (SessionStart/TurnStart/ToolUse), which each validate. A "../"-laden ID could write the transcript outside the metadata directory. The ModelUpdate/Compaction/SubagentEnd handlers were also unguarded, surviving only because the strategy layer (MutateSessionState/StoreModelHint) validates internally. Per-handler validation is the wrong altitude: it is exactly the kind of check a new (or existing) handler forgets, which is what happened here. Validate non-empty event.SessionID once in DispatchLifecycleEvent, before routing, so every handler — current and future — is covered uniformly. Empty IDs still pass through to each handler's own empty-handling (e.g. TurnEnd's fallback to the "unknown" constant). The existing per-handler validations are now redundant but left in place as harmless defense-in-depth; they could be consolidated separately. Taint is local hook input (same-privilege), so severity is low; this completes the hook-path hardening theme on this branch. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Entire-Checkpoint: 0a614e131bab
CalculateTotalTokenUsage and ExtractAllModifiedFiles build agent-<id>.jsonl from subagent IDs and read that file, with no local validation of the ID. Today the IDs are path-safe only because their sole source, extractAgentIDFromText, happens to accept just [a-zA-Z0-9]. That makes the path-safety of a file read depend on the incidental character set of a parser two calls away — a fragile coupling: relaxing extractAgentIDFromText (e.g. to accept hyphenated UUIDs) would silently open a traversal-read of arbitrary agent-*.jsonl-shaped paths. Enforce the invariant at the choke point: ExtractSpawnedAgentIDs now drops any ID that fails validation.ValidateAgentID, so every downstream agent-<id>.jsonl consumer (claudecode + factoryaidroid, token usage and modified-files) is path-safe by construction. Not a live vulnerability — the current parser already constrains the ID, so the guard is a no-op for today's inputs and adds no behavior change. No test is added: the guarded branch is unreachable through the real extractor (it can't emit a non-path-safe ID), so a test would only exercise dead input. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Entire-Checkpoint: c56a01dc268e
…cleEvent handleLifecycleSubagentEnd builds a subagent transcript path from event.SubagentID (AgentTranscriptPath -> filepath.Join), then stats and reads it via ExtractModifiedFilesFromOffset — before the value is validated. A "../"-laden SubagentID escapes transcriptDir for a file read. The task-checkpoint *write* paths (CapturePreTaskState, WriteTemporaryTask) already validate ToolUseID/AgentID, but this read ran first and unguarded. Extend the central dispatcher guard (added for SessionID) to also reject path-unsafe event.ToolUseID (ValidateToolUseID) and event.SubagentID (ValidateAgentID) before routing, so every handler that turns these hook-supplied identifiers into a path is covered at one choke point. Empty values pass through to each handler's own empty-handling. Taint is local hook input (same-privilege) and the sink is a read, so severity is low; this closes the last identifier in the lifecycle path family. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Entire-Checkpoint: 92ddb0963aa9
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
https://entire.io/gh/entireio/cli/trails/512
This pull request was opened automatically by Entire to run CI on the linked trail. Feel free to edit the title or body — the link above is what keeps the trail and PR connected.