| spec | SPEC-0017 |
|---|---|
| title | RepoOps and GitHub integration |
| version | 0.1.1 |
| date | 2026-02-09 |
| owners | |
| status | Partially implemented |
| related_requirements | |
| related_adrs | |
| notes | Defines how the system connects to GitHub, manages branches/PRs, and indexes repos for implementation runs. |
Define a RepoOps subsystem that enables Implementation Runs to:
- connect or create a GitHub repo
- clone/check out code in sandbox
- apply patches and commit
- open PRs, monitor checks, and merge after approval
- index repo code for retrieval (bounded; FR-032)
Implemented in this repo:
- Repo connection persistence (non-secret GitHub repo metadata) via the `repos` table and DAL. See `src/db/schema.ts` and `src/lib/data/repos.server.ts`.
- GitHub API operations for PR open/fetch, checks/status polling, and merge. See `src/lib/repo/repo-ops.server.ts`.
- Implementation-run workflow steps that use RepoOps + sandbox jobs for checkout/patch/verify, and approval-gated merge. See `src/workflows/runs/project-run.workflow.ts` and `src/workflows/runs/steps/implementation/*`.
- Repo indexing (FR-032) foundation:
  - bounded file walk from a sandbox checkout
  - chunk + embed + upsert into Upstash Vector under `project:{projectId}:repo:{repoId}`
  - invoked as `impl.repo.index` in `src/workflows/runs/project-run.workflow.ts`

  See `src/lib/repo/repo-indexer.server.ts` and `src/workflows/runs/steps/repo-index.step.ts`.
Implementation Runs require safe, reviewable changes to a target repository with strong auditability and resumability. RepoOps provides a single, consistent interface for repo connection, patch application, verification, PR workflows, and repo indexing.
Goals:
- Support PR-based GitOps delivery (branches + PRs + required checks).
- Keep all repo execution inside sandboxed checkouts (no repo code runs in app runtime).
- Persist enough provenance to resume runs and generate audit bundles.

Non-goals:
- Bypassing branch protections or required checks.
- Supporting non-GitHub providers in the initial implementation.
Requirement IDs are defined in docs/specs/requirements.md.
- FR-022: Connect a target application repository to a project and persist repo metadata.
- FR-025: Apply code changes as patches/commits and create/manage PRs for review.
- FR-029: Monitor and report implementation run progress across external systems.
- FR-031: Enforce an approval gate for side-effectful operations (push/merge).
- FR-032: Index target repo source code for retrieval to support code-aware agents.
- NFR-013 (Least privilege): Provider credentials are scoped to minimum required permissions; unsafe tools are gated by explicit approvals.
- NFR-015 (Auditability): All side-effectful actions are logged with intent, parameters (redacted), and resulting external IDs.
- IR-011: Repo operations via GitHub (API + Git over HTTPS).
- Use least-privilege credentials; prefer fine-grained PATs initially.
- Do not persist repo secrets (tokens, deploy keys); redact logs.
- Respect GitHub API limitations: Checks API write requires GitHub Apps; PAT mode must primarily read check status and rely on provider CI.
| Criterion | Weight | Score | Weighted |
|---|---|---|---|
| Solution leverage | 0.35 | 9.2 | 3.22 |
| Application value | 0.30 | 9.2 | 2.76 |
| Maintenance & cognitive load | 0.25 | 9.0 | 2.25 |
| Architectural adaptability | 0.10 | 9.1 | 0.91 |
Total: 9.14 / 10.0
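The total is the sum of each criterion's weight multiplied by its score. A small sketch that reproduces the arithmetic from the table (the variable and function names are illustrative, not part of the codebase):

```typescript
// Criterion weights and scores copied from the decision table.
const criteria = [
  { label: "Solution leverage", weight: 0.35, score: 9.2 },
  { label: "Application value", weight: 0.30, score: 9.2 },
  { label: "Maintenance & cognitive load", weight: 0.25, score: 9.0 },
  { label: "Architectural adaptability", weight: 0.10, score: 9.1 },
];

// Weighted total = sum of weight * score; the weights are expected to sum to 1.
function weightedTotal(rows: { weight: number; score: number }[]): number {
  return rows.reduce((sum, row) => sum + row.weight * row.score, 0);
}

// Rounded to two decimals to match the table's precision.
const decisionTotal = weightedTotal(criteria).toFixed(2);
```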
- RepoOps exposes idempotent operations used by Implementation Runs:
  - connect repo, ensure branch, apply patch, run verification, open PR, poll checks
- All operations that touch the repo execute within Sandbox jobs.
- Repo metadata persisted per project:
  - provider, owner/name, default branch, URLs, last indexed SHA
- Indexing metadata persisted per chunk:
  - path, language, commit SHA, offsets, `type=code`
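The persisted shapes described above could be typed roughly as follows. This is a sketch only: the interface and field names are assumptions for illustration and may not match `src/db/schema.ts`, and expressing offsets as line numbers is one possible choice.

```typescript
// Non-secret repo metadata persisted per project (field names are illustrative).
interface RepoRecord {
  provider: "github";
  owner: string;
  repoName: string;
  defaultBranch: string;
  htmlUrl: string;
  cloneUrl: string;
  lastIndexedSha: string | null; // null until the first index completes
}

// Metadata stored alongside each indexed chunk.
interface CodeChunkMetadata {
  path: string;
  language: string;
  commitSha: string;
  startLine: number; // chunk offsets, expressed here as 1-based line numbers
  endLine: number;
  type: "code";
}

// Guard against accidentally persisting credentials: reject any record whose
// URLs embed userinfo (for example https://<token>@github.com/...).
function assertNoEmbeddedCredentials(record: RepoRecord): void {
  for (const url of [record.htmlUrl, record.cloneUrl]) {
    if (/^[a-z]+:\/\/[^/]*@/i.test(url)) {
      throw new Error("Repo URL must not embed credentials");
    }
  }
}
```

A check like `assertNoEmbeddedCredentials` would run before the record is written, so a token pasted into a clone URL fails loudly instead of reaching the database.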
- docs/architecture/spec/SPEC-0017-repo-ops-and-github-integration.md: canonical RepoOps behavior.
- docs/architecture/spec/SPEC-0019-sandbox-build-test-and-ci-execution.md: sandbox job building blocks.
- docs/architecture/adr/ADR-0024-gitops-repository-automation-pr-based-workflows.md: PR-based GitOps policy.
- GitHub credentials are feature-gated (see docs/ops/env.md):
  - `GITHUB_TOKEN` (optional `GITHUB_WEBHOOK_SECRET`)
- Use a fine-grained GitHub personal access token (PAT) stored in environment variables (`GITHUB_TOKEN`).
- Scope permissions minimally:
  - repository contents: read/write.
  - pull requests: read/write.
  - workflows/checks: read (for status polling).
  - metadata: read.
The system should keep a credential-provider interface to allow switching to a GitHub App without refactoring the run engine.
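One way such a credential-provider seam might look is sketched below. The interface name, the `kind` discriminator, and the `createPatProvider` helper are hypothetical; only the `GITHUB_TOKEN` variable name comes from this spec. The environment is passed in as a plain object so the sketch does not depend on a particular runtime.

```typescript
// Hypothetical credential-provider seam: the run engine asks for a token and
// does not care whether it comes from a PAT or (later) a GitHub App.
interface GitHubCredentialProvider {
  kind: "pat" | "github-app";
  getToken(): Promise<string>;
}

type EnvLike = Record<string, string | undefined>;

// PAT-backed provider. Returns null when GITHUB_TOKEN is unset so callers can
// treat GitHub features as disabled (feature-gated) rather than crash.
function createPatProvider(env: EnvLike): GitHubCredentialProvider | null {
  const token = env["GITHUB_TOKEN"]?.trim();
  if (!token) return null;
  return {
    kind: "pat",
    getToken: async () => token,
  };
}
```

A GitHub App implementation would satisfy the same interface by minting installation tokens inside `getToken`, leaving callers unchanged.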
A project can be in one of these states:
- No repo connected
- Repo connected (owner/name + default branch)
- Repo connected + repo indexed (vector namespace exists)
Persist non-secret repo metadata in DB.
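The three states can be derived from the persisted metadata rather than stored as a separate flag. The sketch below assumes that a non-null last indexed SHA implies the vector namespace exists, which is a simplification; the type and function names are illustrative.

```typescript
type ProjectRepoState = "no-repo" | "connected" | "connected-indexed";

// Hypothetical minimal shape of what is persisted for a project's repo.
interface ProjectRepoInfo {
  owner: string;
  repoName: string;
  defaultBranch: string;
  lastIndexedSha: string | null;
}

// Derive the state from persisted metadata instead of storing it separately,
// so the state and the metadata can never disagree.
function deriveRepoState(repo: ProjectRepoInfo | null): ProjectRepoState {
  if (repo === null) return "no-repo";
  return repo.lastIndexedSha ? "connected-indexed" : "connected";
}
```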
- Link existing:
  - validate access, fetch default branch, repo URL
- Create new (optional later):
  - create repo under configured owner
  - initialize with a standard scaffold
All working copies live inside Vercel Sandbox. The app runtime never executes repo code.
Sandbox job sequence:
- `git clone` (or fetch)
- `git checkout -b <run-branch>`
- apply patch set (see below)
- run verification commands
- commit
- push branch
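The sequence above can be expressed as an ordered command plan that a sandbox job executes step by step. This is a sketch under assumptions: `planSandboxJob`, `SandboxJobInput`, and the `repo` checkout directory are hypothetical names, the patch is assumed to be a unified diff at an absolute path, and authentication for the push is handled elsewhere.

```typescript
interface SandboxJobInput {
  cloneUrl: string;         // HTTPS URL without embedded credentials
  runBranch: string;
  patchFile: string;        // absolute path to a unified diff inside the sandbox
  verifyCommands: string[]; // trusted, pre-defined commands such as "npm test"
  commitMessage: string;
}

// Quote a value for POSIX shells by wrapping it in single quotes.
function shellQuote(value: string): string {
  return `'${value.replace(/'/g, `'\\''`)}'`;
}

// Returns the ordered commands for one job. Pushing is listed last because it
// is the only step with an external side effect.
function planSandboxJob(input: SandboxJobInput): string[] {
  return [
    `git clone ${shellQuote(input.cloneUrl)} repo`,
    `git -C repo checkout -b ${shellQuote(input.runBranch)}`,
    `git -C repo apply --index ${shellQuote(input.patchFile)}`,
    ...input.verifyCommands.map((cmd) => `(cd repo && ${cmd})`),
    `git -C repo commit -m ${shellQuote(input.commitMessage)}`,
    `git -C repo push origin ${shellQuote(input.runBranch)}`,
  ];
}
```

Keeping the plan as data makes it easy to log the exact commands for audit and to stop before the push step until approval is granted.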
Push uses HTTPS. Do not persist tokens into git config beyond the sandbox job lifespan; redact tokens from logs.
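Log redaction can be sketched as a pure function applied to every line before it is stored. The helper name `redactSecrets` is hypothetical, and the approach (exact-match replacement of known secrets plus stripping URL userinfo) is one possible strategy rather than the project's confirmed implementation.

```typescript
// Replace every occurrence of each known secret with a fixed marker, and also
// strip userinfo from HTTP(S) URLs (https://user:token@host -> https://***@host).
function redactSecrets(line: string, secrets: string[]): string {
  let result = line;
  for (const secret of secrets) {
    if (secret.length === 0) continue; // splitting on "" would corrupt the line
    result = result.split(secret).join("***");
  }
  return result.replace(/(https?:\/\/)[^/\s@]+@/g, "$1***@");
}
```

For example, `redactSecrets("token=abc123", ["abc123"])` returns `token=***`.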
Patch formats supported:
- unified diff
- file-level replace/create operations
Patch application must be:
- atomic per task (either commit cleanly or fail)
- logged (store diff and file list)
- replayable (patch ids in artifacts)
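For file-level replace/create operations, atomicity can be approximated by applying every operation to a copy of the tree and only handing the copy back if all operations succeed. The sketch below works on an in-memory map of paths to contents; the `FileOp` and `applyFileOps` names are hypothetical, and a real job would apply the same idea to a git working tree.

```typescript
type FileOp =
  | { kind: "create"; path: string; content: string }
  | { kind: "replace"; path: string; content: string };

interface PatchResult {
  files: Map<string, string>; // new working tree (path -> content)
  changedPaths: string[];     // file list to log alongside the diff
}

// Applies all operations to a copy of the tree. Any invalid operation throws
// before the copy is returned, so the caller's tree is never half-updated.
function applyFileOps(tree: ReadonlyMap<string, string>, ops: FileOp[]): PatchResult {
  const next = new Map(tree);
  const changedPaths: string[] = [];
  for (const op of ops) {
    const exists = next.has(op.path);
    if (op.kind === "create" && exists) {
      throw new Error(`create failed, file already exists: ${op.path}`);
    }
    if (op.kind === "replace" && !exists) {
      throw new Error(`replace failed, file not found: ${op.path}`);
    }
    next.set(op.path, op.content);
    changedPaths.push(op.path);
  }
  return { files: next, changedPaths };
}
```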
Use GitHub API for:
- create PR from run branch → default branch
- add labels, body template, checklists
- monitor checks/statuses
- merge PR (approval-gated)
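The PR calls map onto two GitHub REST endpoints: `POST /repos/{owner}/{repo}/pulls` to open a pull request and `PUT /repos/{owner}/{repo}/pulls/{pull_number}/merge` to merge it. The sketch below only builds request descriptors; the builder names and the `GitHubRequest` shape are hypothetical, and the choice of `squash` as merge method is an assumption.

```typescript
interface GitHubRequest {
  method: "POST" | "PUT";
  path: string;
  body: Record<string, unknown>;
}

interface RepoRef { owner: string; repo: string }

// POST /repos/{owner}/{repo}/pulls
function buildCreatePrRequest(
  ref: RepoRef,
  pr: { title: string; body: string; runBranch: string; defaultBranch: string },
): GitHubRequest {
  return {
    method: "POST",
    path: `/repos/${ref.owner}/${ref.repo}/pulls`,
    body: { title: pr.title, body: pr.body, head: pr.runBranch, base: pr.defaultBranch },
  };
}

// PUT /repos/{owner}/{repo}/pulls/{pull_number}/merge
// Refuses to build the request unless an approval has been recorded.
function buildMergeRequest(ref: RepoRef, pullNumber: number, approved: boolean): GitHubRequest {
  if (!approved) throw new Error("merge requires explicit approval");
  return {
    method: "PUT",
    path: `/repos/${ref.owner}/${ref.repo}/pulls/${pullNumber}/merge`,
    body: { merge_method: "squash" }, // assumption: "merge" and "rebase" are also valid
  };
}
```

Separating request construction from sending keeps the approval check testable without network access.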
Two mechanisms:
- poll GitHub checks/status at bounded intervals (default)
- optional: receive GitHub webhooks for check completion / PR updates
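Bounded polling can be sketched as a loop with a fixed attempt budget around a pure verdict function. The `status` and `conclusion` fields mirror GitHub check runs; everything else (`summarizeChecks`, `pollChecks`, the injected `fetchRuns` and `sleep` callbacks, and the set of non-blocking conclusions) is an assumption made for illustration.

```typescript
// Subset of a GitHub check run as returned by the REST API.
interface CheckRunSummary {
  status: "queued" | "in_progress" | "completed";
  conclusion: string | null; // for example "success", "failure", "skipped"
}

type ChecksVerdict = "pending" | "passed" | "failed";

// Conclusions that do not block a merge in this sketch (an assumption).
const NON_BLOCKING = new Set(["success", "neutral", "skipped"]);

function summarizeChecks(runs: CheckRunSummary[]): ChecksVerdict {
  // An empty list is treated as pending: checks may not be registered yet.
  if (runs.length === 0 || runs.some((r) => r.status !== "completed")) return "pending";
  return runs.every((r) => r.conclusion !== null && NON_BLOCKING.has(r.conclusion))
    ? "passed"
    : "failed";
}

// Bounded polling: stops after maxAttempts even if checks never finish.
async function pollChecks(
  fetchRuns: () => Promise<CheckRunSummary[]>,
  sleep: (ms: number) => Promise<void>,
  opts: { intervalMs: number; maxAttempts: number },
): Promise<ChecksVerdict> {
  for (let attempt = 1; attempt <= opts.maxAttempts; attempt++) {
    const verdict = summarizeChecks(await fetchRuns());
    if (verdict !== "pending") return verdict;
    if (attempt < opts.maxAttempts) await sleep(opts.intervalMs);
  }
  return "pending"; // caller decides whether to resume later or fail the run
}
```

A webhook handler could feed the same `summarizeChecks` function, so both mechanisms agree on what counts as passed.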
Current status: implemented (bounded) in the codebase as of 2026-02-09.
Implementation notes:
- Indexing is latest-only (prefix delete + deterministic IDs), so retries are safe.
- Obvious secret-like/binary paths are filtered (`.env`, key files, images/archives).
- Strict size/count budgets prevent runaway indexing.
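Path filtering and budgets of this kind might look like the sketch below. The specific patterns, the budget fields, and the function names are assumptions for illustration; the actual rules live in `src/lib/repo/repo-indexer.server.ts` and may differ.

```typescript
// Illustrative deny-lists; the real filter may cover more cases.
const SECRET_LIKE = [/(^|\/)\.env(\.|$)/, /\.(pem|key|p12|pfx)$/i, /(^|\/)id_(rsa|ed25519)$/];
const BINARY_LIKE = /\.(png|jpe?g|gif|webp|ico|pdf|zip|tar|gz|tgz|7z)$/i;

function isIndexablePath(path: string): boolean {
  if (SECRET_LIKE.some((re) => re.test(path))) return false;
  return !BINARY_LIKE.test(path);
}

interface IndexBudget { maxFiles: number; maxFileBytes: number; maxTotalBytes: number }

// Walks candidates in order, skipping files that are filtered or would exceed
// a byte budget, and stops entirely once the file-count budget is spent.
function selectWithinBudget(
  files: { path: string; bytes: number }[],
  budget: IndexBudget,
): string[] {
  const selected: string[] = [];
  let totalBytes = 0;
  for (const file of files) {
    if (selected.length >= budget.maxFiles) break;
    if (!isIndexablePath(file.path)) continue;
    if (file.bytes > budget.maxFileBytes) continue;
    if (totalBytes + file.bytes > budget.maxTotalBytes) continue;
    selected.push(file.path);
    totalBytes += file.bytes;
  }
  return selected;
}
```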
This section also describes the target design expansions (incremental indexing, richer filters).
Indexing pipeline:
- incremental file walk in sandbox checkout
- ignore patterns:
  - `.git/`, `node_modules/`, build artifacts
  - binary files
- chunk by file:
  - for small files, embed whole file
  - for large files, chunk by lines/sections
- store embeddings in Upstash Vector namespace:
  - `project:{projectId}:repo:{repoId}`
- metadata includes:
  - path
  - language
  - commit SHA
  - chunk offsets
  - `type=code`
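The chunking and metadata steps of the pipeline can be sketched as below. Line-based chunking with a fixed maximum, the `path#index` ID scheme, and all type and function names are assumptions; only the namespace format and the metadata fields come from this spec. Embedding and the vector upsert are left out.

```typescript
interface ChunkRecord {
  id: string;
  text: string;
  metadata: {
    path: string; language: string; commitSha: string;
    startLine: number; endLine: number; type: "code";
  };
}

function repoNamespace(projectId: string, repoId: string): string {
  return `project:${projectId}:repo:${repoId}`;
}

// Splits a file into chunks of at most maxLines lines. A file that fits in one
// chunk is emitted whole. IDs depend only on path and chunk index, so
// re-indexing the same file overwrites earlier vectors instead of duplicating.
function chunkFile(
  file: { path: string; language: string; content: string },
  commitSha: string,
  maxLines: number,
): ChunkRecord[] {
  if (maxLines < 1) throw new RangeError("maxLines must be at least 1");
  const lines = file.content.split("\n");
  const chunks: ChunkRecord[] = [];
  for (let start = 0; start < lines.length; start += maxLines) {
    const end = Math.min(start + maxLines, lines.length);
    chunks.push({
      id: `${file.path}#${chunks.length}`,
      text: lines.slice(start, end).join("\n"),
      metadata: {
        path: file.path, language: file.language, commitSha,
        startLine: start + 1, endLine: end, type: "code",
      },
    });
  }
  return chunks;
}
```

Deterministic IDs are what make retries safe: a repeated upsert of the same chunk replaces the earlier vector.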
Indexing triggers:
- after repo connect (full index)
- after merges (incremental index by changed files)
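The choice between a full and an incremental index could be made from the last indexed SHA and the list of changed files, for example as parsed from `git diff --name-status <old> <new>`. The sketch handles only added, modified, and deleted entries; renames and copies would need extra handling. All names are hypothetical.

```typescript
type IndexPlan =
  | { mode: "skip" }
  | { mode: "full" }
  | { mode: "incremental"; upsertPaths: string[]; deletePaths: string[] };

// One entry per changed file: A = added, M = modified, D = deleted.
interface ChangedFile { status: "A" | "M" | "D"; path: string }

function planIndexing(
  lastIndexedSha: string | null,
  headSha: string,
  changed: ChangedFile[],
): IndexPlan {
  if (lastIndexedSha === null) return { mode: "full" };    // first index after connect
  if (lastIndexedSha === headSha) return { mode: "skip" }; // nothing new to index
  return {
    mode: "incremental",
    upsertPaths: changed.filter((f) => f.status !== "D").map((f) => f.path),
    deletePaths: changed.filter((f) => f.status === "D").map((f) => f.path),
  };
}
```

For modified files, stale chunks beyond the new chunk count still need deleting, which is one reason a per-path prefix delete before upsert is a reasonable default.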
The following RepoOps operations are side-effectful and require approval:
- pushing to remote
- merging PRs
All operations persist:
- intent
- parameters (redacted)
- resulting IDs/URLs
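An approval gate combined with an audit entry might be sketched as follows. The `AuditEntry` shape, the key-based redaction rule, and the `runApproved` wrapper are assumptions for illustration, not the persisted schema.

```typescript
interface AuditEntry {
  operation: "push" | "merge";
  intent: string;
  params: Record<string, string>;    // already redacted
  resultIds: Record<string, string>; // for example PR number, commit SHA, URLs
  approvedBy: string;
}

// Parameter names that are never stored verbatim (illustrative list).
const SENSITIVE_KEY = /token|secret|password|authorization/i;

function redactParams(params: Record<string, string>): Record<string, string> {
  const out: Record<string, string> = {};
  for (const [key, value] of Object.entries(params)) {
    out[key] = SENSITIVE_KEY.test(key) ? "[REDACTED]" : value;
  }
  return out;
}

// Runs a side-effectful operation only when an approver is present, and
// returns the audit entry describing what happened.
function runApproved(
  req: { operation: "push" | "merge"; intent: string; params: Record<string, string> },
  approvedBy: string | null,
  execute: () => Record<string, string>,
): AuditEntry {
  if (!approvedBy) throw new Error(`${req.operation} requires approval`);
  const resultIds = execute();
  return {
    operation: req.operation,
    intent: req.intent,
    params: redactParams(req.params),
    resultIds,
    approvedBy,
  };
}
```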
- Repo connect validates access and persists non-secret metadata.
- Patch application is atomic per task and produces replayable artifacts.
- PR creation and merge are approval-gated and respect required checks.
- Repo indexing is project-scoped and ignores secrets/build artifacts.
- Unit tests: patch application validation and ignore patterns.
- Integration tests: RepoOps against a dedicated test repo with branch protections.
- Security tests: token redaction and “no secret persistence” invariants.
- Prefer polling checks with bounded intervals; use webhooks as an optimization.
- Treat merge operations as irreversible side effects; require explicit approval.
- Branch protection blocks merge → surface required checks and stop at merge gate.
- Token lacks scope → fail early with actionable permissions guidance.
- Indexing too slow on large repos → incremental indexing and bounded chunking.
- docs/architecture/spec/SPEC-0017-repo-ops-and-github-integration.md
- docs/architecture/adr/ADR-0024-gitops-repository-automation-pr-based-workflows.md
- docs/architecture/spec/SPEC-0019-sandbox-build-test-and-ci-execution.md
- 0.1 (2026-02-01): Initial draft.