fix(paths): findMarkdownFilesRecursive follows symlinks#1503
Conversation
📝 WalkthroughWalkthroughfindMarkdownFilesRecursive was updated to resolve symlink targets via stat(), treating symlinked files and directories according to their targets and skipping broken symlinks. New Windows-skipped tests exercise file symlinks, mixed regular+symlinked files, symlinked directories, and broken symlinks. ChangesSymlink Resolution Implementation
Symlink Behavior Test Suite
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
packages/paths/src/archon-paths.ts (1)
231-263:⚠️ Potential issue | 🟠 Major | 🏗️ Heavy liftGuard against symlink cycles before recursing.
Following directory symlinks here can recurse forever if a link points back to an ancestor or otherwise forms a loop, which would hang discovery at startup. Please track visited directory targets before descending.
Suggested direction
+// Pass a visited-set through recursion and key it by target inode/dev (or realpath) +// so symlink cycles are skipped instead of recursed into repeatedly.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/paths/src/archon-paths.ts` around lines 231 - 263, The recursion can loop on directory symlink cycles; modify findMarkdownFilesRecursive to accept or use a Set<string> visitedRealPaths, and before recursing from the block that handles directories (where you currently call stat(join(...)) for symlinks and then call findMarkdownFilesRecursive), resolve the directory's real path (e.g., via realpath) and check the Set; if present, skip descending, otherwise add the real path to the Set, call findMarkdownFilesRecursive with the same Set, and keep it for the duration of discovery (do not rely on local stack-only removal so shared ancestors remain marked); reference symbols: findMarkdownFilesRecursive, entry.isSymbolicLink(), stat/join, currentDepth/maxDepth, and results.push to locate the insertion point.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Outside diff comments:
In `@packages/paths/src/archon-paths.ts`:
- Around line 231-263: The recursion can loop on directory symlink cycles;
modify findMarkdownFilesRecursive to accept or use a Set<string>
visitedRealPaths, and before recursing from the block that handles directories
(where you currently call stat(join(...)) for symlinks and then call
findMarkdownFilesRecursive), resolve the directory's real path (e.g., via
realpath) and check the Set; if present, skip descending, otherwise add the real
path to the Set, call findMarkdownFilesRecursive with the same Set, and keep it
for the duration of discovery (do not rely on local stack-only removal so shared
ancestors remain marked); reference symbols: findMarkdownFilesRecursive,
entry.isSymbolicLink(), stat/join, currentDepth/maxDepth, and results.push to
locate the insertion point.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 2e52455b-e3df-4272-9f47-e5cee41a4dfc
📒 Files selected for processing (2)
packages/paths/src/archon-paths.test.tspackages/paths/src/archon-paths.ts
1be0bb8 to
0f98fe9
Compare
…m00#1501) `Dirent.isFile()` returns false for symlinks, so symlinked `.md` files in the search root were silently skipped. This made team-shared command directories that get exposed to Archon via symlinks (a `~/.archon/commands/foo.md` → `/path/to/team-repo/...` pattern) unusable — workflow nodes referencing those commands failed with `Command prompt not found` even though the file was present and readable. Workflow discovery in `loadWorkflowsFromDir` already handles this correctly by stat()-ing each entry; this brings command discovery in line. Implementation: when a `Dirent` is a symlink, follow it with `stat()` and use the target's `isFile()` / `isDirectory()` for the branch decision. Broken symlinks are skipped silently (matches the existing tolerance for missing search dirs at the top of the function). Tests cover: symlinked file in the search root, mixed regular + symlinked entries in the same dir, symlinked subdirectory, broken symlink. All four exercise the new branch.
0f98fe9 to
6aaaeb9
Compare
There was a problem hiding this comment.
🧹 Nitpick comments (2)
packages/paths/src/archon-paths.ts (1)
246-260: 💤 Low valueConsider guarding against symlink cycles.
Following symlinks for directories opens the door to infinite recursion if a symlinked subdir points to an ancestor (e.g.,
tempRoot/loop → tempRoot). The previousDirent.isDirectory()path implicitly avoided this since symlinks were skipped. For the primary use case (team-shared command files), this is unlikely, but a visited-realpath set or a hard recursion-depth cap (independent ofmaxDepth, which defaults toInfinity) would make the walk robust against accidentally cyclic setups.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@packages/paths/src/archon-paths.ts` around lines 246 - 260, The current symlink handling (entry.isSymbolicLink() branch using stat(join(fullPath, entry.name)) and then recursing based on isDir/isFile) can follow cycles; add cycle protection by tracking resolved realpaths and/or enforcing a hard recursion cap independent of maxDepth: when you detect a symlink, resolve its real path (e.g., via realpath on join(fullPath, entry.name)), maintain a Set of visitedRealpaths and skip recursion if the realpath is already present, or increment a separate recursionDepth counter and abort further descent when it exceeds a safe constant; update the logic around entry.isSymbolicLink(), stat(...), isDir/isFile and any recursive call sites to consult the visitedRealpaths or recursionDepth before recursing.packages/paths/src/archon-paths.test.ts (1)
5-5: 💤 Low valueConsolidate the
fs/promisesimports.
symlink as fsSymlinkcan be added to the existingfs/promisesimport at line 5 rather than adding a second import statement for the same module.♻️ Proposed consolidation
-import { mkdir, rm, writeFile, lstat, readlink } from 'fs/promises'; +import { mkdir, rm, writeFile, lstat, readlink, symlink as fsSymlink } from 'fs/promises'; @@ -import { symlink as fsSymlink } from 'fs/promises';Also applies to: 41-41
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@packages/paths/src/archon-paths.test.ts` at line 5, Consolidate fs/promises imports by adding "symlink as fsSymlink" to the existing named import that already includes mkdir, rm, writeFile, lstat, readlink (replace the separate second import) so all fs/promises functions are imported from a single statement; update the import that currently lists mkdir, rm, writeFile, lstat, readlink to also include symlink as fsSymlink and remove the duplicate import.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Nitpick comments:
In `@packages/paths/src/archon-paths.test.ts`:
- Line 5: Consolidate fs/promises imports by adding "symlink as fsSymlink" to
the existing named import that already includes mkdir, rm, writeFile, lstat,
readlink (replace the separate second import) so all fs/promises functions are
imported from a single statement; update the import that currently lists mkdir,
rm, writeFile, lstat, readlink to also include symlink as fsSymlink and remove
the duplicate import.
In `@packages/paths/src/archon-paths.ts`:
- Around line 246-260: The current symlink handling (entry.isSymbolicLink()
branch using stat(join(fullPath, entry.name)) and then recursing based on
isDir/isFile) can follow cycles; add cycle protection by tracking resolved
realpaths and/or enforcing a hard recursion cap independent of maxDepth: when
you detect a symlink, resolve its real path (e.g., via realpath on
join(fullPath, entry.name)), maintain a Set of visitedRealpaths and skip
recursion if the realpath is already present, or increment a separate
recursionDepth counter and abort further descent when it exceeds a safe
constant; update the logic around entry.isSymbolicLink(), stat(...),
isDir/isFile and any recursive call sites to consult the visitedRealpaths or
recursionDepth before recursing.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: e4b38726-0dc0-48ef-ae84-ff9b1ae655cc
📒 Files selected for processing (2)
packages/paths/src/archon-paths.test.tspackages/paths/src/archon-paths.ts
|
Hi @blankse — checking in. It's been a while since the last activity here and the PR is currently UNSTABLE. Are you still planning to address the review feedback, or would you prefer to close in favor of a fresh attempt later? Happy to keep it open if you have a timeline — just want to keep the queue moving. If I don't hear back in ~7 days I'll close as inactive (always feel free to reopen). |
Summary
~/.archon/commands/(or<repo>/.archon/commands/) as symlinks are silently ignored. A workflow nodecommand: foofails withCommand prompt not found: foo.mdeven though the symlink target is a perfectly readable.mdfile.findMarkdownFilesRecursivein@archon/pathsnow follows symlinks viastat()per Dirent (matching whatloadWorkflowsFromDirinworkflow-discovery.tsalready does), so symlinked.mdfiles are discovered like regular files. Broken symlinks are skipped silently.UX Journey
Before
After
Architecture Diagram
Before
After
Connection inventory:
findMarkdownFilesRecursivefs/promises.statfindMarkdownFilesRecursiveLabel Snapshot
risk: lowsize: XSpathspaths:archon-pathsChange Metadata
bugpathsLinked Issue
Validation Evidence
The four new tests in
archon-paths.test.tsexercise:.mdfile in the search root (the canonical case from the bug)..mdfiles.All
describe.skipIf(isWindows)for portability of the symlink syscalls.Security Impact
readFile(); the discovery layer was simply missing them.Compatibility / Migration
Human Verification
~/.archon/commands/is now resolved by a workflowcommand: <name>reference (was the failing scenario from Symlinked commands silently ignored: findMarkdownFilesRecursive uses Dirent.isFile() which doesn't match symlinks #1501).Side Effects / Blast Radius
findMarkdownFilesRecursive— that's the command resolver in@archon/workflows/executor-shared.tsand a couple of bundled-defaults helpers. None of them care whether the source is a regular file or a symlink (they justreadFile()the result)..mdfiles won't include them (filter is filename-based, unchanged). A symlinked directory whose name happens to end in.mdwould now be entered as a directory rather than ignored — extremely unusual but worth noting.Rollback Plan
git revert <this commit>— single-package, two-file change.Risks and Mitigations
stat()per Dirent adds I/O overhead in directories with many entries.Dirent.is*()calls). For typical.archon/commands/directories this is at most a handful of extra syscalls per workflow run.Summary by CodeRabbit