
fix(git): reject SSH-URL-shaped codebase names in worktree path resolution #1583

Open
blankse wants to merge 1 commit into coleam00:dev from blankse:fix/resolveownerrepo-strict-validation

Conversation


@blankse blankse commented May 5, 2026

Summary

  • Problem: A codebase registered with an SSH-style remote URL as its name (e.g. git@host.example:org/repo) produces worktree paths like ~/.archon/workspaces/git@host.example:org/repo/worktrees/.... The : in that path silently corrupts any downstream tool that uses : as a separator — most visibly docker-compose short-form volume specs (HOST:CONTAINER:OPT) inside devcontainers, which then fail with invalid volume specification.
  • Why it matters: The bug only surfaces on Web-UI-triggered runs (and any other caller that passes codebaseName through), so users see the same workflow succeed via CLI and fail via Web UI for no obvious reason. The bash-node executor head-truncates failure output, so the actual invalid volume specification line gets dropped — diagnosis is painful.
  • What changed: resolveOwnerRepo() now routes the codebaseName branch through the existing strict parseOwnerRepo() validator from @archon/paths (SAFE_NAME = /^[a-zA-Z0-9._-]+$/). Invalid names log worktree.invalid_codebase_name_format and fall through to the path-derived heuristic — same fallback that already covered the 'invalid-no-slash' case.
  • What did not change (scope boundary): Valid owner/repo codebase names behave identically. The path-derived fallback (extractOwnerRepo(repoPath)) is unchanged. No public API or schema change. No change to codebase registration / persistence — that normalization is a separate, larger discussion (called out at the end of bug(git): SSH-URL codebase names produce worktree paths with ':' that break docker volume parsing #1582).
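The validation described above can be sketched as follows. This is a minimal illustration, not the actual `@archon/paths` source: the `SAFE_NAME` regex is taken from this description, while the two-segment check and the separate rejection of `.` / `..` segments are assumptions based on the edge cases listed in this PR.

```typescript
// Hypothetical mirror of the validator; the real parseOwnerRepo lives in @archon/paths.
const SAFE_NAME = /^[a-zA-Z0-9._-]+$/;

interface OwnerRepo {
  owner: string;
  repo: string;
}

function parseOwnerRepo(name: string): OwnerRepo | null {
  const parts = name.split("/");
  // Exactly two segments: rejects names with no slash and names like "a/b/c".
  if (parts.length !== 2) return null;
  const [owner, repo] = parts;
  for (const segment of [owner, repo]) {
    // Character allowlist: rejects ":", "@", whitespace, "+", "~" and empty segments.
    if (!SAFE_NAME.test(segment)) return null;
    // Assumption: dot-only segments are rejected explicitly, because the
    // regex on its own would accept them.
    if (segment === "." || segment === "..") return null;
  }
  return { owner, repo };
}

console.log(parseOwnerRepo("acme/widget-app")); // parsed owner/repo object
console.log(parseOwnerRepo("git@host.example:org/repo")); // null
console.log(parseOwnerRepo("acme/sub/repo")); // null
```

For the SSH-URL shape, splitting on `/` yields the owner segment `git@host.example:org`, which fails the allowlist because of `@` and `:`, so the whole name is rejected.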

UX Journey

Before

User                Archon                                   docker compose
────                ──────                                   ──────────────
register codebase
  name="git@host.example:org/repo" ──▶ stored verbatim
  default_cwd=/srv/projects/repo

trigger workflow (Web UI) ───────────▶ resolveOwnerRepo(codebaseName)
                                       split('/') → owner="git@host.example:org",
                                                    repo ="repo"
                                       worktree base:
                                       ~/.archon/workspaces/git@host.example:org/repo/...

bash node `start-preview` runs:
  docker compose up -d (in worktree) ──────────────────────▶ parses volumes
                                                              `<host>:/docker-entrypoint-initdb.d:ro`
                                                              host contains `:`
                                                              ✗ "invalid volume specification"
sees "Bash node failed [exit 1]"
(real reason head-truncated away)
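The failure in the journey above can be reproduced in isolation. The sketch below is illustrative only: `naiveOwnerRepo` imitates the pre-fix split as described in this PR, the home directory and worktree name are made up, and it assumes the relative host path in the volume spec is expanded against the worktree directory before the spec is split on `:`.

```typescript
// Pre-fix behaviour as described above: split on "/" with no character check.
function naiveOwnerRepo(codebaseName: string): { owner: string; repo: string } | null {
  const parts = codebaseName.split("/");
  if (parts.length !== 2) return null;
  return { owner: parts[0], repo: parts[1] };
}

// Builds a short-form volume spec (HOST:CONTAINER:OPT) rooted in the worktree.
// "/home/user" and "task-1" are hypothetical values for illustration.
function volumeSpecFor(owner: string, repo: string): string {
  const worktree = `/home/user/.archon/workspaces/${owner}/${repo}/worktrees/task-1`;
  return `${worktree}/postgres-init:/docker-entrypoint-initdb.d:ro`;
}

// Number of colon-separated fields a parser of HOST:CONTAINER[:OPT] would see.
function fieldCount(spec: string): number {
  return spec.split(":").length;
}

const corrupted = naiveOwnerRepo("git@host.example:org/repo");
if (corrupted) {
  console.log(corrupted.owner); // "git@host.example:org"
  console.log(fieldCount(volumeSpecFor(corrupted.owner, corrupted.repo))); // 4
}
console.log(fieldCount(volumeSpecFor("projects", "repo"))); // 3
```

A spec with four fields no longer fits the three-field short form, which is consistent with the `invalid volume specification` error reported here.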

After

User                Archon                                   docker compose
────                ──────                                   ──────────────
register codebase
  name="git@host.example:org/repo"   (unchanged — fix is purely on the resolver path)
  default_cwd=/srv/projects/repo

trigger workflow (Web UI) ───────────▶ resolveOwnerRepo(codebaseName)
                                       [parseOwnerRepo("git@host.example:org/repo")]
                                       [→ null (`:`/`@` rejected by SAFE_NAME)]
                                       [warn: worktree.invalid_codebase_name_format]
                                       [fallback → extractOwnerRepo(repoPath)]
                                       [owner="projects", repo="repo"]
                                       worktree base:
                                       ~/.archon/workspaces/projects/repo/...   ✓ no `:`

bash node `start-preview` runs:
  docker compose up -d ────────────────────────────────────▶ parses volumes
                                                              host has no `:`
                                                              ✓ container starts

Architecture Diagram

Before

packages/git/src/worktree.ts
  resolveOwnerRepo(repoPath, codebaseName?)
   ├─ if codebaseName
   │   └─ split('/') → naive owner/repo  ───── (no SAFE_NAME check)
   │      └─ returned even if owner contains `:` `@` etc.
   ├─ else if repoPath under workspaces/
   │   └─ derive from path segments
   └─ else extractOwnerRepo(repoPath)            (path-basename heuristic)
                              │
                              ▼
                    getProjectWorktreesPath(owner, repo)
                              │
                              ▼
                    consumed as filesystem path  ── docker compose volume parser breaks on `:`

After

packages/git/src/worktree.ts
  resolveOwnerRepo(repoPath, codebaseName?)
   ├─ if codebaseName
   │   └─ [parseOwnerRepo(codebaseName)]            === @archon/paths (existing)
   │       ├─ valid → returned  (unchanged behaviour for clean names)
   │       └─ invalid → [warn, fall through to path heuristic]
   ├─ else if repoPath under workspaces/
   │   └─ derive from path segments              (unchanged)
   └─ else extractOwnerRepo(repoPath)            (unchanged)
                              │
                              ▼
                    getProjectWorktreesPath(owner, repo)   (unchanged)
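The "After" control flow can be sketched roughly as below. This is not the code in `packages/git/src/worktree.ts`: the validator and the path heuristic are compact stand-ins with assumed behaviour (parent directory as owner, basename as repo, `"unknown"` as a made-up placeholder), the injectable `warn` parameter is hypothetical, and the `workspaces/` branch is left out for brevity.

```typescript
interface OwnerRepo {
  owner: string;
  repo: string;
}

const SAFE_NAME = /^[a-zA-Z0-9._-]+$/;

// Compact stand-in for parseOwnerRepo from @archon/paths (assumed shape).
function parseOwnerRepo(name: string): OwnerRepo | null {
  const parts = name.split("/");
  if (parts.length !== 2 || !parts.every((p) => SAFE_NAME.test(p))) return null;
  return { owner: parts[0], repo: parts[1] };
}

// Assumed path heuristic: parent directory as owner, basename as repo.
function extractOwnerRepo(repoPath: string): OwnerRepo {
  const segments = repoPath.split("/").filter((s) => s.length > 0);
  return {
    owner: segments[segments.length - 2] ?? "unknown",
    repo: segments[segments.length - 1] ?? "unknown",
  };
}

function resolveOwnerRepo(
  repoPath: string,
  codebaseName?: string,
  warn: (event: string, name: string) => void = (event, name) =>
    console.warn(event, { codebaseName: name }),
): OwnerRepo {
  if (codebaseName) {
    const parsed = parseOwnerRepo(codebaseName);
    if (parsed) return parsed; // clean names behave as before
    // Invalid name: make the fallback observable, then continue below.
    warn("worktree.invalid_codebase_name_format", codebaseName);
  }
  // The "repoPath under workspaces/" branch is omitted from this sketch.
  return extractOwnerRepo(repoPath);
}

console.log(resolveOwnerRepo("/srv/projects/repo", "acme/widget-app"));
console.log(resolveOwnerRepo("/srv/projects/repo", "git@host.example:org/repo"));
```

With the inputs from the UX journey, the second call logs the warning and resolves to owner `projects` and repo `repo`, so the resulting worktree base contains neither `:` nor `@`.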

Connection inventory:

| From | To | Status | Notes |
| --- | --- | --- | --- |
| worktree.ts | @archon/paths | modified | adds parseOwnerRepo to existing import |
| resolveOwnerRepo | parseOwnerRepo | new | replaces inline split('/') |
| resolveOwnerRepo | path-derived fallback | unchanged | now reachable for one additional input shape |
| getWorktreeBase callers | resolveOwnerRepo | unchanged | public surface unchanged |

Label Snapshot

  • Risk: risk: low
  • Size: size: S
  • Scope: git
  • Module: git:worktree

Change Metadata

  • Change type: bug
  • Primary scope: git

Linked Issue

Validation Evidence (required)

bun run check:bundled                          # bundled-defaults up to date
bun --filter @archon/git type-check            # exit 0
bun --filter @archon/git test                  # 143 pass / 0 fail (was 142, +1 new regression test)
bun --filter @archon/isolation type-check      # exit 0 (downstream consumer)
bun --filter @archon/isolation test            # all green
bun x eslint packages/git/src/worktree.ts --max-warnings 0    # clean
bun x prettier --check packages/git/src/{worktree,git.test}.ts  # clean
  • Evidence provided: New test 'ignores SSH-URL-shaped codebaseName ... and falls back to path' in packages/git/src/git.test.ts exercises the regression directly and asserts the resulting worktree base contains neither : nor @. Mirrors the structure of the existing 'invalid-no-slash' test. The test mock for @archon/paths now also exposes a parseOwnerRepo mirror kept aligned with the real SAFE_NAME regex.
  • Skipped: Top-level bun run lint was skipped because root-level ESLint OOMs on this machine (V8 heap exhaustion in the monorepo lint run — independent of this PR). Linted the two touched files directly instead.

Security Impact (required)

  • New permissions/capabilities? No.
  • New external network calls? No.
  • Secrets/tokens handling changed? No.
  • File system access scope changed? No — same getProjectWorktreesPath() output, only the input owner/repo segments are now constrained to [A-Za-z0-9._-]. If anything, this narrows the set of paths the resolver can produce (path-traversal-safe by construction, since SAFE_NAME already excludes ./..).

Compatibility / Migration

  • Backward compatible? Yes — purely additive validation. Codebases with valid owner/repo names see no change.
  • Config/env changes? No.
  • Database migration needed? No.
  • For users with already-existing worktrees under a corrupted git@host:org/repo/... path: those keep working (git doesn't care), but new worktrees for the same codebase will land under the path-derived layout instead. Old worktree dirs can be cleaned up manually if desired.

Human Verification (required)

  • Verified scenarios:
    • Local repro on a real WebUI-triggered workflow run against a codebase whose name contained an SSH URL — failure mode (invalid volume specification from docker-compose) confirmed against dev.
    • After the fix, resolveOwnerRepo() called with codebaseName "git@host.example:org/repo" returns the path-derived owner/repo, and the resulting worktree base contains neither : nor @.
    • All 142 existing @archon/git tests still pass; new regression test passes.
  • Edge cases checked:
    • codebaseName already valid (e.g. acme/widget-app) → unchanged.
    • codebaseName with multiple slashes (e.g. acme/sub/repo) → unchanged (still rejected by parseOwnerRepo length check, falls through).
    • codebaseName with leading/trailing dots → unchanged (rejected by SAFE_NAME, falls through).
    • codebaseName with whitespace, +, ~, etc. → now also rejected (was previously accepted as a path segment); this is the intended tightening.
  • What was not verified:

Side Effects / Blast Radius (required)

  • Affected subsystems: only getWorktreeBase() resolution (and any code that depends on the worktree base layout — @archon/isolation, workflow run dispatch, artifact paths derived from the worktree). All consumers ride the same path-derived fallback the CLI already used, so behaviour for new runs converges with the previously-working CLI behaviour.
  • Potential unintended effects: Users whose codebase name contains characters outside [A-Za-z0-9._-] without being an SSH URL, and for whom the resulting path was somehow load-bearing, will now get the path-derived layout instead. Mitigated by the warn-log (worktree.invalid_codebase_name_format) and by the fact that the corrupted layout was never functional (it's the bug being fixed).
  • Guardrails: existing test for 'invalid-no-slash' remains; new test pins the SSH-URL shape; warn-level log event makes the fallback observable in operations.

Rollback Plan (required)

  • Fast rollback: revert this single commit; the change is two files, additive validation only.
  • Feature flags: none.
  • Observable failure symptoms: a regression would re-surface as invalid volume specification in docker-compose volume parsing (or any other consumer that splits on :). Logged event worktree.invalid_codebase_name_format lets operators see when the fallback fires.

Risks and Mitigations

  • Risk: An existing codebase relies on the corrupted owner/repo path being preserved for some downstream lookup (e.g. cache key, DB FK).
    • Mitigation: searched the codebase — every consumer of the worktree base path treats it as an opaque filesystem location; no DB rows or external integrations key on the corrupted form. The CLI path has been emitting the path-derived layout all along, so any "stable across both triggers" expectation was already broken.
  • Risk: Hides a class of bugs by silently falling back instead of erroring.

Summary by CodeRabbit

  • Bug Fixes

    • Strengthened validation of repository identifiers so SSH-style or otherwise unsafe names are rejected; the system now reliably falls back to workspace-scoped alternatives and emits clearer warnings when formats are invalid.
  • Tests

    • Added a regression test verifying malformed or SSH-style repository names are rejected and fallback derivation produces safe base paths (no ":" or "@").

Review Change Stack


coderabbitai Bot commented May 5, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: a26522e4-9eb1-47c2-904b-0fc78c962e2f

📥 Commits

Reviewing files that changed from the base of the PR and between 2a87f72 and b1d5b28.

📒 Files selected for processing (2)
  • packages/git/src/git.test.ts
  • packages/git/src/worktree.ts

📝 Walkthrough

Replaces manual codebaseName splitting with strict parseOwnerRepo usage in owner/repo resolution; tests gain a mocked parseOwnerRepo and a regression verifying SSH-style names are rejected and the fallback derives a safe workspace owner/repo.

Changes

Owner/Repo Validation Enhancement

| Layer / File(s) | Summary |
| --- | --- |
| Resolver: parseOwnerRepo usage (packages/git/src/worktree.ts) | resolveOwnerRepo now calls parseOwnerRepo(codebaseName) and returns parsed {owner, repo} when valid; invalid inputs continue to the existing warning and workspace-path-derived fallback. Import formatting changed only. |
| Test mock and regression (packages/git/src/git.test.ts) | Adds SAFE_NAME regex and local parseOwnerRepo() to the @archon/paths test mock and a regression test that verifies SSH-URL-shaped codebaseName is rejected and the computed worktree base contains neither : nor @. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related issues

Poem

🐰 I hop the paths and check each name,
No colons, at-signs — nothing untame,
Parse first clean, else fall back true,
Workspace names tidy, good as new,
A little hop to see it through.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 66.67% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |
✅ Passed checks (4 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title accurately describes the main change: adding validation to reject SSH-URL-shaped codebase names in worktree path resolution, which is the core fix in the PR. |
| Description check | ✅ Passed | The description comprehensively covers all template sections including problem statement, UX journey before/after, architecture diagrams, validation evidence, security impact, compatibility, human verification, side effects, rollback plan, and risks/mitigations. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |


@blankse force-pushed the fix/resolveownerrepo-strict-validation branch 2 times, most recently from 55663ca to 2a87f72 (May 14, 2026 12:03)
…ution

Closes coleam00#1582.

Background

`resolveOwnerRepo()` (used by `getWorktreeBase()`) split the codebase
`name` field at the first `/` and accepted both halves verbatim as
owner/repo. A codebase registered with an SSH-style remote URL as its
name (e.g. `git@host:org/repo`) therefore produced
`{ owner: "git@host:org", repo: "repo" }` and worktrees ended up under

  ~/.archon/workspaces/git@host:org/repo/worktrees/...

The colon in that path silently breaks anything that uses `:` as a
separator. Most visible symptom: docker-compose short-form volume
specs in devcontainer setups. A spec like
`./postgres-init:/docker-entrypoint-initdb.d:ro` is parsed as
`HOST:CONTAINER:OPT`, and when HOST itself already contains a colon
the whole spec gets rejected with `invalid volume specification`. The
failure was hard to spot because the bash-node executor head-truncates
captured error output, dropping exactly the line with the root cause.

The same workspace concept reached via the CLI path-derived heuristic
correctly produced `<parent>/<repo>` (no colon, no `@`), so CLI runs
worked while WebUI runs (which pass `codebaseName` through) failed
deterministically for any codebase whose name was an SSH URL.

Change

Use the existing strict `parseOwnerRepo()` validator from
`@archon/paths` (which already enforces `[A-Za-z0-9._-]` per segment
for path-traversal safety) for the codebase-name branch. When the
name doesn't parse, log `worktree.invalid_codebase_name_format` and
fall through to the path-derived heuristic — same fallback as the
existing `'invalid-no-slash'` case, just covering one more shape.

Behaviour delta is strictly safer:

- Valid `owner/repo` codebase names — unchanged.
- Invalid names (no slash, multiple slashes, dot-segments) — unchanged.
- SSH-URL-shaped names with `:`/`@` — previously produced a corrupted
  workspace path; now logged and falls back to the path-derived layout.

Tests

New regression test in `git.test.ts` covering the SSH-URL shape;
asserts the resulting base contains neither `:` nor `@`. The test
mock for `@archon/paths` now also exposes `parseOwnerRepo` (mirrored
SAFE_NAME, kept aligned with the real implementation).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@blankse force-pushed the fix/resolveownerrepo-strict-validation branch from 2a87f72 to b1d5b28 (May 21, 2026 12:19)

Development

Successfully merging this pull request may close these issues.

bug(git): SSH-URL codebase names produce worktree paths with ':' that break docker volume parsing
