All 9 phases (0–8) are implemented (verified 2026-05-18: 123 workspace tests pass, clippy/fmt clean). This file is the historical per-phase record with its deviation notes. For the forward-looking, prioritized backlog (every `[~]` partial and deferred item below, consolidated and ranked), see `TODO.md`. Keep the two in sync: when a TODO item lands, update its phase deviation note here.
The project is divided into 9 phases (0–8), each delivering a usable increment. Each phase has a clear "you can use it for X" milestone.
Goal: Cargo workspace compiles, core object store works, basic CLI scaffolding.
Status: ✅ Complete (commit a393efb).
- Cargo workspace with all crate stubs
- `gpp-core`: Content-addressed object store (BLAKE3, zstd compression)
  - Blob, Tree object types
  - Read/write objects to `.gpp/objects/`
  - Object validation (hash verification)
- `gpp-cli`: Binary scaffold with clap
  - `gpp init`: create `.gpp/` directory structure
  - `gpp status`: basic status output
  - `gpp config`: read/write TOML config
- `.gpp/` directory layout established
- CI pipeline: `cargo test`, `cargo clippy`, `cargo fmt`
`gpp init` creates a valid repository. Objects can be stored and retrieved.
- `blake3`: hashing
- `zstd`: compression
- `clap`: CLI argument parsing
- `serde`, `toml`: config serialization
- `anyhow`, `thiserror`: error handling
- `tracing`: logging
Goal: Continuous file change capture works. Developers can promote timeline entries to changesets.
Status: ✅ Complete (commit ea974a9).
- `gpp-timeline`:
  - File system watcher (notify crate)
  - Debouncing (100ms default)
  - SQLite timeline database (WAL mode)
  - Timeline entry creation (author, source, files, hashes)
  - `.gppignore` support (common `.gitignore` subset; see note)
  - Timeline pruning (configurable retention)
- `gpp-history`:
  - Changeset object type
  - Intent object type
  - Author (Human/Agent) enum
  - Promote timeline entries → changeset
  - Changeset DAG (parents, branching)
  - Branch refs
- `gpp-diff`:
  - Line-based diff (fallback)
  - Basic file diff display (unified format)
- CLI commands:
  - `gpp timeline`: view timeline entries
  - `gpp timeline watch`: live stream
  - `gpp promote`: promote to changeset
  - `gpp log`: view changeset history
  - `gpp diff`: show changes
  - `gpp branch`: create/switch/list branches
A developer can work on code, see continuous timeline capture, promote meaningful changes to history, and browse changeset history. This is the "better Git for solo developers" milestone.
- `notify`: file system events
- `rusqlite`: SQLite
- `similar`: diff algorithm
- `globset`, `walkdir`: added for `.gppignore` matching and tree walking (pure Rust)

- `.gppignore` implements the common `.gitignore` subset (negation, root vs. basename anchoring, `**`/`*`/`?`, directory patterns), not every edge case.
- Rename detection is recorded as delete + add for now; the `rename` change type exists in the schema for a later pass.
- `promote --interactive` / `--auto-summarize` / `--sign` are rejected with a clear message (they depend on AI/signing layers in later phases).
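The debouncing step listed for `gpp-timeline` (100ms default) can be illustrated with a small standard-library sketch. It is not the crate's actual code: the `RawEvent` alias, the `debounce` function, and the per-path sliding-window policy are assumptions made for the example, and events are assumed to arrive in timestamp order.

```rust
use std::collections::HashMap;
use std::time::Duration;

/// Hypothetical raw watcher event: (path, time since the watcher started).
pub type RawEvent = (String, Duration);

/// Coalesce bursts of events per path. An event that arrives within `window`
/// of the previous event for the same path extends that burst; the burst is
/// stamped with the time of its last event.
pub fn debounce(events: &[RawEvent], window: Duration) -> Vec<RawEvent> {
    // Maps each path to the index of its currently open burst in `out`.
    let mut open: HashMap<&str, usize> = HashMap::new();
    let mut out: Vec<RawEvent> = Vec::new();
    for (path, at) in events {
        let merged = match open.get(path.as_str()).copied() {
            Some(i) if at.saturating_sub(out[i].1) <= window => {
                out[i].1 = *at; // still inside the window: extend the burst
                true
            }
            _ => false,
        };
        if !merged {
            // First event for this path, or the quiet gap exceeded the window.
            open.insert(path.as_str(), out.len());
            out.push((path.clone(), *at));
        }
    }
    out
}
```

With a 100ms window, three rapid saves of one file collapse into a single timeline entry, while a save several hundred milliseconds later starts a new one.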
Goal: Semantic diffing works for Phase 1 languages. Git repos can be imported.
Status: ✅ Complete.
- `gpp-diff` (enhanced):
  - Tree-sitter integration
  - AST parsing for Rust, TypeScript, Python, Go
  - Declaration fingerprinting (full + name-blanked body fingerprint)
  - Cross-file move detection
  - Symbol rename detection
  - Semantic diff display format
  - Plugin interface (`LanguageParser` trait)
- `gpp-git-bridge`:
  - `gpp git-import`: import Git history to gpp
  - `gpp git-export`: export gpp history as Git
  - Hash mapping database (SQLite)
  - Bidirectional sync mode (`gpp git-bridge --watch`)
- CLI updates:
  - `gpp diff --semantic` (default for supported languages)
  - `gpp git-import`, `gpp git-export`, `gpp git-bridge`
Developers can import existing Git repos and immediately see better diffs. This is the "drop-in improvement over Git" milestone.
- `tree-sitter` + `tree-sitter-{rust,python,typescript,go}` grammars
- `streaming-iterator`: tree-sitter 0.24 query iteration
- `git2` (libgit2 bindings): for Git bridge
- Declaration extraction is query-driven per language; adding a language is a grammar + a declaration query. Nested items (e.g. impl methods) are captured too. Fingerprints normalize trailing whitespace and blank edges, so pure reformatting is reported as no semantic change.
- Rename/move detection is fingerprint-based: two declarations with an identical name-blanked body are treated as the same symbol. Trivial bodies (e.g. two empty functions) can therefore look like a rename — this is the expected similarity-heuristic trade-off, mirrored from Git's own heuristics.
- `git-import`/`git-export` traverse the first-parent chain; the hash map keys commits by their oid in the bridged repo. Import-from-A then export-to-a-different-repo-B reuses A's oids (correct for the single-remote bridge model; cross-repo migration would need a fresh map).
- `git-bridge --watch` is poll-based (HEAD-oid change detection on an interval); `--export` opts into pushing gpp changes back each cycle. Continuous operation-level bidirectional CRDT sync remains Phase 5.
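The fingerprint-based rename detection described in these notes can be sketched as below. This is a simplified illustration under stated assumptions, not the `gpp-diff` implementation: it works on raw text rather than a tree-sitter AST, uses the standard library's `DefaultHasher` as a stand-in for a real content hash, and the `Decl`, `Change`, and `classify` names are hypothetical.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical extracted declaration: its symbol name and source text.
pub struct Decl {
    pub name: String,
    pub body: String,
}

#[derive(Debug, PartialEq)]
pub enum Change {
    Unchanged,
    Renamed,
    Modified,
}

/// Trim trailing whitespace on each line and drop blank leading/trailing lines.
fn normalize(body: &str) -> String {
    let lines: Vec<&str> = body.lines().map(|l| l.trim_end()).collect();
    let start = lines.iter().position(|l| !l.is_empty()).unwrap_or(lines.len());
    let end = lines.iter().rposition(|l| !l.is_empty()).map_or(start, |i| i + 1);
    lines[start..end].join("\n")
}

/// Replace whole-word occurrences of `name` with a placeholder.
fn blank_name(body: &str, name: &str) -> String {
    let mut out = String::new();
    let mut word = String::new();
    // The trailing '\0' is a sentinel that flushes the last word.
    for ch in body.chars().chain(std::iter::once('\0')) {
        if ch.is_alphanumeric() || ch == '_' {
            word.push(ch);
        } else {
            if !word.is_empty() && word == name {
                out.push('_');
            } else {
                out.push_str(&word);
            }
            word.clear();
            if ch != '\0' {
                out.push(ch);
            }
        }
    }
    out
}

fn hash64(s: &str) -> u64 {
    let mut h = DefaultHasher::new();
    s.hash(&mut h);
    h.finish()
}

/// Full fingerprint: hash of the normalized body.
pub fn full_fp(d: &Decl) -> u64 {
    hash64(&normalize(&d.body))
}

/// Name-blanked fingerprint: hash of the normalized body with the name erased.
pub fn blanked_fp(d: &Decl) -> u64 {
    hash64(&blank_name(&normalize(&d.body), &d.name))
}

pub fn classify(old: &Decl, new: &Decl) -> Change {
    if old.name == new.name && full_fp(old) == full_fp(new) {
        Change::Unchanged // pure reformatting hashes identically
    } else if old.name != new.name && blanked_fp(old) == blanked_fp(new) {
        Change::Renamed // same body once the name is blanked
    } else {
        Change::Modified
    }
}
```

The trade-off noted above is visible here: two declarations with trivial bodies produce the same name-blanked fingerprint and would be classified as a rename.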
Goal: The encrypted knowledge graph works. Agents can query it via MCP.
Status: ✅ Complete.
- `gpp-graphex`:
  - GraphNode and GraphEdge object types
  - SQLite adjacency index
  - Envelope encryption (age master + per-tier AES-256-GCM)
  - Access tier system (public, agent-readable, agent-restricted, human-only)
  - Key hierarchy and key management
  - Node lifecycle (proposed → active → deprecated → archived)
  - Graph query engine (path pattern language)
  - Context projection engine
  - Subgraph selection
  - Tier filtering
  - Scrubbing (over-tier nodes never decrypted/shown)
  - Token budget truncation
  - Graph access audit log
  - Auto-inference from changed paths (propose nodes for new modules)
  - Manual node/edge CRUD via CLI
- `gpp-sdk` (initial):
  - Rust SDK for agent integration
  - `AgentSession` struct
  - `query_graphex()`, `propose_changeset()`, `propose_graph_update()`
- MCP server (initial):
  - `gpp mcp-server --stdio`
  - `graphex_query`, `graphex_status`, `graphex_glossary`, `graphex_conventions` tools
  - `propose_changeset`, `propose_graph_update` tools
- CLI commands:
  - `gpp graphex` (status/query/project/add/link/show/list/pending/accept/reject/audit/infer)
  - `gpp mcp-server`
  - `gpp keys` (generate, rotate, show)
AI tools (Claude Code, Cursor, etc.) can connect via MCP, query the knowledge graph, and propose changes. This is the "AI-native" milestone — the core differentiator.
- `age`: master-identity envelope encryption
- `aes-gcm`: per-tier symmetric node encryption
- `getrandom`: key/nonce generation
- custom MCP implementation (JSON-RPC 2.0 over newline-delimited stdio; no external MCP SDK, which keeps the pure-Rust, single-binary constraint)

- Encrypted nodes are stored as ordinary content-addressed `Blob`s (`wire(zstd(msgpack))` sealed with the tier key); `graph.db` indexes metadata + a pointer to the current blob. This avoided changing the gpp-core wire format / `ObjectType` set. Node identity is stable (`blake3("{type}:{name}")`) so edits re-encrypt the same logical node and keep its edges; old blobs remain in object history.
- Resolved 2026-05-18 (previously: `master.age` stores the X25519 identity directly and `human-only` is master-sealed). With `$GPP_GRAPHEX_PASSPHRASE` set (or `KeyStore::{generate,open}_with`), `master.age` is scrypt-passphrase-wrapped at rest and the `human-only` tier key is sealed directly to the passphrase, so the master identity alone can no longer decrypt human-only. With no passphrase the legacy unattended behaviour is unchanged (auto-detected on open), so existing repos keep working; `human-only` is still scrub-enforced in projection regardless.
- Query results are metadata-only (names/types/relations) and never decrypt content; decryption happens exclusively in the tier-gated projection path, which writes a `graph_access_log` entry (accessor, nodes, projection hash).
- Auto-inference keys off changed file paths of the HEAD changeset (`gpp graphex infer`), proposing `Module` nodes; richer semantic-diff-driven edge inference is a future enhancement. `AddEdge` proposals are applied directly (edges carry no secret content); `AddNode` requires human approval.
- Federation (publish/subscribe subgraphs) is intentionally deferred to Phase 5 alongside the CRDT sync protocol, per the roadmap's own ordering.
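The projection steps listed for this phase (tier filtering, scrubbing, token budget truncation) can be sketched as follows. This is an illustrative sketch, not the `gpp-graphex` code: the `Tier` ordering (taken from the order in which the tiers are listed), the `Node` fields, the per-node token counts, and the `project` function are all assumptions.

```rust
/// Access tiers, assumed to be ordered from least to most restricted.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum Tier {
    Public,
    AgentReadable,
    AgentRestricted,
    HumanOnly,
}

/// Hypothetical index entry: node name, tier, and estimated token cost.
pub struct Node {
    pub name: String,
    pub tier: Tier,
    pub tokens: usize,
}

/// Select the nodes an accessor with `clearance` may see, in order, stopping
/// once the next visible node would exceed `budget` tokens.
pub fn project<'a>(nodes: &'a [Node], clearance: Tier, budget: usize) -> Vec<&'a str> {
    let mut used = 0;
    let mut out = Vec::new();
    for n in nodes {
        if n.tier > clearance {
            continue; // scrubbed: an over-tier node is skipped, never decrypted
        }
        if used + n.tokens > budget {
            break; // token budget truncation
        }
        used += n.tokens;
        out.push(n.name.as_str());
    }
    out
}
```

Filtering happens before any content is touched, which matches the note above that decryption occurs only in the tier-gated projection path.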
Goal: Agent governance works. Compliance-as-code enforced.
Status: ✅ Complete.
- `gpp-trust`:
  - Agent score database (SQLite)
  - Score calculation (reviewed-outcome Bayesian model: survival vs. regression)
  - Trust policy configuration (thresholds)
  - Automatic status transitions (auto-merge, review-required, sandboxed, blocked)
  - Module-level trust overrides
  - Trust event logging
- `gpp-policy`:
  - Policy file parser (.policy TOML format)
  - Pattern-based rules (regex on file content)
  - Changeset-based rules (author, files, review requirements)
  - Enforcement points: promotion (block/warn/audit) wired into `gpp promote`
  - Built-in policy templates (secrets-scan, pci-dss, soc2)
  - Custom policy support
- `gpp-cost`:
  - Cost record database (SQLite)
  - Token tracking per changeset
  - Budget configuration and alerts
  - Cost analytics queries (summary + breakdown)
  - Efficiency metrics (cost per survived line)
- `gpp-anomaly`:
  - Detection rules (unusual-scope, burst-activity, large-changeset)
  - Event logging and alerting
  - Resolution workflow + tunable rule thresholds
- CLI commands:
  - `gpp trust` (show/history/policy/override/reset)
  - `gpp policy` (list/show/add/template/templates/remove/validate/check)
  - `gpp cost` (summary/breakdown/efficiency/budget/budget-alert)
  - `gpp anomaly` (list/history/resolve/rules/configure)
  - `gpp audit`: cross-layer audit report (trust + anomaly + cost + graphex)
Teams can govern AI agent contributions with trust scores, enforce compliance policies, track costs, and detect anomalies. This is the "enterprise-ready" milestone.
- Trust score is based on reviewed outcomes only (survived vs. regression with a Beta(1,1)-style prior at 50); merely promoting a not-yet-reviewed changeset is not penalized. Survived/regression signals are recorded by `record_event`; the review layer (Phase 6) will drive them automatically. For now `gpp promote` records `changeset_promoted` for agent authors.
- Policy enforcement is wired at the promotion point (block aborts before any changeset object is written; warn/audit are reported). The timeline-capture (warn) and sync (block) enforcement points reuse the same `PolicySet` API and attach in Phases 1-revisit / 5 respectively.
- Cost records are created at promote time with tokens/cost = 0 ("unknown" model) until a Tier-3 SDK reports real usage; `lines_changed`/`files` are computed from the changeset delta. Budget attribution is repo-wide (per-path attribution lands with the review layer).
- Anomaly `burst-activity` uses changesets by the author reachable from HEAD in the last 24h as the window count.
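A Beta(1,1)-style prior that starts at 50 is consistent with the posterior-mean formula `100 * (survived + 1) / (survived + regressed + 2)`. The sketch below uses that formula as an assumption; the source does not give the exact expression. The `Thresholds` values used in the usage note are hypothetical, and only the four status names come from the list above.

```rust
#[derive(Debug, PartialEq)]
pub enum Status {
    AutoMerge,
    ReviewRequired,
    Sandboxed,
    Blocked,
}

/// Posterior mean of a Beta(1,1) prior updated with reviewed outcomes,
/// scaled to 0..100. With no reviewed outcomes the score is exactly 50.
pub fn trust_score(survived: u32, regressed: u32) -> f64 {
    let s = f64::from(survived);
    let r = f64::from(regressed);
    100.0 * (s + 1.0) / (s + r + 2.0)
}

/// Hypothetical policy thresholds; real values would come from configuration.
pub struct Thresholds {
    pub auto_merge: f64,
    pub review: f64,
    pub sandbox: f64,
}

/// Map a score to a status by comparing against descending thresholds.
pub fn status(score: f64, t: &Thresholds) -> Status {
    if score >= t.auto_merge {
        Status::AutoMerge
    } else if score >= t.review {
        Status::ReviewRequired
    } else if score >= t.sandbox {
        Status::Sandboxed
    } else {
        Status::Blocked
    }
}
```

Under this formula an agent with 8 survived and 0 regressed changesets scores 90, and one with 0 survived and 8 regressed scores 10. Unreviewed promotions change neither count, so they leave the score untouched, in line with the note above.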
Goal: Peer-to-peer sync works. No GitHub dependency needed.
Status: ✅ Complete.
- `gpp-sync`:
  - Noise protocol handshake (Noise_XX via `snow`)
  - State vector exchange (object id set, branch tips, policy set)
  - Delta computation (set difference over state vectors)
  - Object transfer (raw verified frames, chunked over Noise)
  - History sync (changeset objects + ref reconcile)
  - Graphex sync (OR-Set add / LWW metadata, zero-knowledge)
  - Policy sync (add-only union by name)
  - Conflict detection (divergent branch → `name.fork.<peer>`)
  - Resume after connection loss (state exchange is idempotent/cheap)
  - Peer authentication (TOFU static-key pinning)
  - Peer permission model (known-peers gate; relay ACLs in Phase 7)
- `gpp-replay`:
  - Environment snapshot creation
  - Snapshot storage as objects
  - Replay re-materialization engine
  - Diff between replay and original (drift detection)
- Graphex federation:
  - Publish/subscribe subgraphs (federated sources config + graph-only sync)
  - Federated node lifecycle (rides OR-Set graphex sync)
  - Cross-project sync (`gpp sync --graph-only`)
- CLI commands:
  - `gpp sync` (add/remove/status/serve/peer/default-all)
  - `gpp replay` (dry-run/diff/output/env)
  - `gpp graphex federation` (add/list), plus `gpp merge`
Teams can sync without GitHub. Multiple projects can federate knowledge. This is the "decentralized" milestone.
- `snow`: Noise protocol

- State exchange uses an explicit object-id set rather than a bloom filter (correct and simple at current scale; a bloom filter is a drop-in optimization later). The transport chunks messages so payloads larger than a 64 KiB Noise message transfer transparently.
- Ref conflicts are fork-preserving rather than Lamport-LWW: a divergent same-name branch is kept as `name.fork.<peer>` (gpp ref names disallow `@`, so the doc's `name@peer` becomes `name.fork.peer`). `gpp merge` resolves a fork into the current branch via a two-parent merge changeset taking the fork's tree (explicit and human-reviewed, never a silent merge).
- Graphex sync is zero-knowledge: encrypted node blobs ride the object set; only index metadata merges (node upsert keeps higher `updated_at`; edges add-only). A backup peer without tier keys still cannot read content.
- Trust and timeline are never synced (per `docs/SYNC_PROTOCOL.md`).
- `gpp-replay` reproduces inputs deterministically/offline (tree + captured toolchain/env). Re-executing the original agent is out of scope.
- Federation is config + graph-only sync; richer publish-filter globs and one-way federated read-only enforcement are a later hardening pass.
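Delta computation over explicit object-id sets, and the fork-preserving ref name, can be sketched with the standard library alone. The `Delta` struct and the `delta` and `fork_ref` functions are hypothetical names for illustration; string ids stand in for real object hashes.

```rust
use std::collections::BTreeSet;

/// What each side is missing after exchanging object-id sets.
pub struct Delta {
    pub to_send: Vec<String>,  // objects we have that the peer lacks
    pub to_fetch: Vec<String>, // objects the peer has that we lack
}

/// Plain set difference in both directions over the exchanged id sets.
pub fn delta(local: &BTreeSet<String>, remote: &BTreeSet<String>) -> Delta {
    Delta {
        to_send: local.difference(remote).cloned().collect(),
        to_fetch: remote.difference(local).cloned().collect(),
    }
}

/// Name under which a divergent same-name branch from `peer` is kept.
pub fn fork_ref(branch: &str, peer: &str) -> String {
    format!("{}.fork.{}", branch, peer)
}
```

Because the delta is recomputed from the current sets, repeating the exchange after a dropped connection yields only what is still missing, which is why resuming is cheap.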
Goal: Collaboration workflow works. Teams can review, assign permissions, and get notified.
Status: ✅ Complete.
- `gpp-review`:
  - Review object type and SQLite schema
  - Review lifecycle (pending → approved/changes_requested/rejected → merged)
  - Reviewer suggestion (from RBAC owners/maintainers)
  - Review comments with file/line targeting
  - Review policy enforcement (RBAC merge-gate: reviewers/human/role/agent)
  - Comment threads attached to a changeset's review
- `gpp-rbac`:
  - Role system (owner/maintainer/contributor/reader, ordered)
  - Role assignment and revocation (with expiry)
  - Branch protection rules (glob → min reviewers/human/role/agent)
  - Enforcement at the CLI merge gate (`gpp review merge`)
  - Role change auditing (`role_history`)
- `gpp-notify`:
  - Event system with typed events
  - Notification database and inbox
  - Integration backends: webhook/slack/discord (HTTP POST)
  - HMAC-SHA256-signed outgoing webhooks (`X-Gpp-Signature`)
  - Configurable per-backend event subscriptions
- CLI commands:
  - `gpp review` (list/show/request/approve/request-changes/reject/merge/comment/comments)
  - `gpp rbac` (show/assign/revoke/whoami/protect/protections)
  - `gpp inbox` (list/unread/ack/ack --all)
  - `gpp notify` (integrations/add/remove/dispatch/events)
Teams can do code review inside gpp, manage permissions, and get notified via Slack/Discord/webhooks. This is the "team collaboration" milestone.
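The ordered role system and the branch-protection merge gate can be sketched as below. The field names on `Protection` and `Approval` and the exact gate semantics are assumptions for illustration; only the four role names and the idea of a minimum-reviewers/human/role rule come from the list above.

```rust
/// Roles in ascending order of privilege, so derived `Ord` compares them.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum Role {
    Reader,
    Contributor,
    Maintainer,
    Owner,
}

/// Hypothetical branch protection rule.
pub struct Protection {
    pub min_reviewers: usize,
    pub require_human: bool,
    pub min_role: Role,
}

/// Hypothetical approval record attached to a review.
pub struct Approval {
    pub role: Role,
    pub is_human: bool,
}

/// A merge passes when enough approvals come from reviewers at or above
/// `min_role` and, if required, at least one of those is a human.
pub fn can_merge(rule: &Protection, approvals: &[Approval]) -> bool {
    let qualified: Vec<&Approval> = approvals
        .iter()
        .filter(|a| a.role >= rule.min_role)
        .collect();
    let human_ok = !rule.require_human || qualified.iter().any(|a| a.is_human);
    qualified.len() >= rule.min_reviewers && human_ok
}
```

Deriving `Ord` on the enum keeps the "at least maintainer" comparison a single `>=`, at the cost of making variant order significant.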
- `reqwest` (blocking): webhook/chat delivery
- `hmac`, `sha2`: webhook signatures

- `gpp promote` auto-opens a review (config `[review].auto_create_on_promote`, default true) and emits a `changeset.promoted` event to suggested reviewers' inboxes (best-effort; it never fails the promote).
- Reviewer suggestion uses RBAC owners/maintainers. Graphex semantic ownership-based assignment is deferred (the `owned-by` edge exists; wiring it as the primary source is a later enhancement).
- Conversation threads are modelled as the review's comment list rather than a separate hashed `ConversationThread` object (no gpp-core wire change).
- Email/Jira/Linear backends are not delivered: email needs SMTP creds and `lettre` is a heavy dependency, Jira/Linear need live APIs. The backend table + dispatch path are generic, so they slot in without schema change; webhook/slack/discord (HMAC-signed HTTP POST) are implemented and tested via an injected `Sender` (offline-deterministic unit test).
- Outbound HTTP is abstracted behind a `Sender` trait so dispatch is unit-testable without a network; the real `HttpSender` uses blocking `reqwest`.
- `gpp review merge` marks the review merged after an RBAC `can_merge` check; it does not rewrite history (the changeset was already promoted), keeping history append-only.
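The injected-`Sender` pattern mentioned in these notes can be sketched as follows. The trait name comes from the source, but its method signature, the `RecordingSender` test double, the `Backend` struct, and the `dispatch` function are assumptions made for the example; signing and real HTTP are left out.

```rust
use std::cell::RefCell;

/// Outbound delivery abstraction (signature is an assumption).
pub trait Sender {
    fn send(&self, url: &str, body: &str) -> Result<(), String>;
}

/// Test double that records requests instead of touching the network.
#[derive(Default)]
pub struct RecordingSender {
    pub sent: RefCell<Vec<(String, String)>>,
}

impl Sender for RecordingSender {
    fn send(&self, url: &str, body: &str) -> Result<(), String> {
        self.sent.borrow_mut().push((url.to_string(), body.to_string()));
        Ok(())
    }
}

/// Hypothetical integration backend with its subscribed event names.
pub struct Backend {
    pub url: String,
    pub events: Vec<String>,
}

/// Deliver `body` to every backend subscribed to `event`.
/// Returns the number of successful deliveries.
pub fn dispatch(sender: &dyn Sender, backends: &[Backend], event: &str, body: &str) -> usize {
    backends
        .iter()
        .filter(|b| b.events.iter().any(|e| e.as_str() == event))
        .filter(|b| sender.send(&b.url, body).is_ok())
        .count()
}
```

A unit test passes `RecordingSender` and inspects `sent`; production code would pass an implementation backed by an HTTP client instead.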
Goal: gpp works seamlessly with GitHub, GitLab, and Bitbucket. The gh extension exists.
Status: ✅ Complete.
- `gpp-remote`:
  - Platform abstraction (`Platform` + injectable `HttpClient`)
  - GitHub create-PR (REST `POST /repos/:repo/pulls`)
  - GitLab create-MR (REST `/projects/:id/merge_requests`)
  - Bitbucket create-PR (REST `/repositories/:repo/pullrequests`)
  - `GenericGitRemote` (export + `git push`, no platform API)
  - PR creation with gpp metadata enrichment (intent, semantic diff, agent, policy, cost, trust)
  - [~] Review/comment sync: payload builders ready; live bidirectional polling deferred
  - [~] CI status import: config plumbed; live status fetch deferred
  - [~] Issue linking: PR id/url captured; deeper linking deferred
  - [~] Graphex-over-Git distribution deferred (covered by `gpp sync --graph-only`)
- `gh-gpp` extension:
  - `gh gpp promote`: promote + push + create enriched PR
  - `gh gpp review`: changeset + semantic diff + review context
  - `gh gpp trust`: trust scores as a PR comment
  - `gh gpp cost`: cost attribution as a PR comment
  - `gh gpp audit`: audit report (optionally a gist)
  - `gh gpp sync`: import the GitHub default branch into gpp
- `gpp-relay`:
  - Relay node binary (`gpp-relay`)
  - Object storage and forwarding (wraps `gpp-sync::serve`)
  - Peer authentication (Noise + repo-id gate + TOFU; auth-keys advisory)
  - Docker image (`deploy/relay/Dockerfile`)
  - Relay health endpoint (`GET /health` on `port+1`)
- CI/CD integration:
  - GitHub Action: `gpp-policy-check`
  - GitHub Action: `gpp-trust-gate`
  - GitHub Action: `gpp-audit-report`
  - GitLab CI template (`ci/gitlab/gpp.gitlab-ci.yml`)
- CLI commands:
  - `gpp remote` (setup/status/pr-create/push)
  - `gpp relay` (status/add/remove/push/pull)
Teams using GitHub/GitLab continue using their existing platform while getting gpp intelligence in PRs and CI. `gh gpp promote` is the easiest entry point. This is the "GitHub-compatible" milestone: the adoption unlocker.
- GitHub uses the REST API via blocking `reqwest` instead of `octocrab` (avoids pulling an async runtime; keeps the single-binary/pure-Rust posture). All three platforms share one request/response code path behind an injectable `HttpClient`, so PR creation is unit-tested fully offline with a mock (GitHub/GitLab/Bitbucket request shapes + result parsing).
- `gh-gpp` is a Bash `gh` extension (the `gh` convention runs any executable named `gh-<name>`); a Go rewrite is optional. It shells to `gpp`/`gh` and never pushes gpp metadata into the repo; it surfaces it into the PR as description/comments.
- The relay reuses `gpp-sync::serve` (Noise handshake, repo-id gate, TOFU). `--auth-keys` is honored as an advisory allowlist; pre-handshake key rejection needs a `gpp-sync` hook and is a later hardening pass.
- Bidirectional review/comment sync, live CI-status import and issue linking are scaffolded (config + payload builders) but not wired to live platform polling. They need authenticated network round-trips and are a follow-up; the milestone (enriched PRs + CI gating) is met.
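- `octocrab`: GitHub API (as planned; per the deviation note above, blocking `reqwest` is used instead)
- `go` toolchain: for gh extension (gh extension convention; optional, since `gh-gpp` shipped as a Bash extension)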
Goal: Production-ready. Rich client interfaces. Documentation. Community launch.
Status: ✅ Complete.
- `gpp-tui`:
  - Terminal UI with `ratatui` (`gpp ui`)
  - Panels: timeline, history, graphex, agents, reviews, anomalies, cost, inbox
  - Layout presets (default, minimal, review, monitoring)
  - Live auto-refresh (toggle with `--no-live`)
  - Panel navigation (focus by `--panel`, Tab/j/k)
  - Keyboard-driven (q quit, r refresh); pure `Dashboard` is unit-tested
- `vscode-gpp` extension:
  - Timeline / Graphex / Reviews tree views (over `gpp`)
  - Promote + semantic-diff commands; MCP via `gpp mcp-server --stdio`
- `neovim-gpp` plugin:
  - Lua plugin with Telescope pickers (fallback to `vim.ui.select`)
  - Timeline / log / Graphex-query / review pickers, inline virtual text
- Performance optimization:
  - Benchmark suite (criterion: `gpp-core` object store, `gpp-diff` semantic)
  - [~] Latency targets: baselines established; tuning is ongoing
- `gpp-deps`:
  - Dependency list from lockfiles (Cargo.lock, package-lock.json)
  - Heuristic offline risk score + notes
  - Newly-added-dependency assessment (`gpp deps --since`)
  - [~] Live registry/CVE/license APIs deferred (network + keys)
- SDK:
  - Rust `gpp-sdk` (`AgentSession`) shipped in Phase 3
  - [~] Python/JS bindings: the CLI `--json` surface + `gpp-sdk` are the integration path; native PyO3/napi wrappers deferred (build tooling)
- Plugin system:
  - Language-parser plugin interface (`LanguageParser`); see `docs/PLUGINS.md`
  - Policy template marketplace (`policies/`, `gpp policy template`)
  - Compliance report formatters (stable `gpp audit` output → CI actions)
- Documentation:
  - User guide (mdbook, `docs/book/`)
  - API reference (`cargo doc`; every crate has module docs)
  - All six tutorials (migrate / graphex / mcp / compliance / github / relay)
- Distribution:
  - `cargo install` (gpp-cli, gpp-relay)
  - Homebrew formula (`packaging/homebrew/gpp.rb`)
  - Docker images (`deploy/gpp`, `deploy/relay`)
  - Release workflow (binaries + GHCR images on tag)
  - [~] apt/dpkg packages deferred (tarball + Docker cover Linux)
- Community:
  - Contributing guide (`docs/CONTRIBUTING.md`)
  - Issue templates (bug / feature / good-first-issue)
  - [~] Discord / logo / public launch: operational, not code
Public launch. Developers can install, migrate from Git, connect AI agents, collaborate via GitHub, use rich TUI/editor interfaces, and contribute to the ecosystem. This is the "public launch" milestone.
- `ratatui`, `crossterm`: terminal UI
- `criterion`: benchmarks (dev-only)

- The TUI splits a pure `Dashboard` snapshot (aggregated from the stores, unit-tested without a TTY) from a thin `ratatui` event loop; promote/approve from the TUI is deferred, and the CLI remains the mutation surface.
- VS Code / Neovim extensions are thin shells over the `gpp` CLI (`--json` where available) rather than reimplementing logic. The CLI is the single source of truth; MCP context injection rides `gpp mcp-server --stdio`.
- `gpp-deps` is offline-only (lockfile parse + heuristic risk + newly-added diff). Live crates.io/npm/CVE/license APIs need network/keys and are a follow-up; the agent-dependency-assessment lens is implemented.
- Native Python/JS SDK bindings (PyO3/napi) are deferred: they need extra build toolchains. The `--json` CLI surface plus the Rust `gpp-sdk` are the supported integration paths today.
- Criterion benches establish baselines; the specific latency targets (timeline < 5ms, hot read < 1ms, 100k-object clone < 30s) are tracked as ongoing tuning, not a gate.
- Web UI for Graphex visualization (`gpp.dev` hosted platform, Mode 3)
- JetBrains plugin (IntelliJ, WebStorm, etc.)
- Agent orchestration layer (lead agent reviewing exploration branches)
- Agent-to-agent collaboration (agents reading each other's explorations)
- AI-powered changeset summarization (built-in, using local models)
- Multi-repo workspaces (monorepo support)
- Graphex schema validation (enforce graph structure rules)
- Time-travel debugging integration (link to production observability)
- REST/gRPC API on relay node (for web UIs and remote tools)
- Hosted relay service (managed relay for teams not wanting to self-host)
- Migration tools from other VCS (Mercurial, SVN, Perforce)
- Mobile app for review and inbox notifications
| Phase | Duration | Cumulative | Milestone |
|---|---|---|---|
| 0: Foundation | 3 weeks | Week 3 | Repository initializes |
| 1: Timeline + History | 4 weeks | Week 7 | Better Git for solo devs |
| 2: Semantic Diff + Git Bridge | 4 weeks | Week 11 | Drop-in Git improvement |
| 3: Graphex + Encryption | 6 weeks | Week 17 | AI-native core |
| 4: Trust + Policy + Cost | 5 weeks | Week 22 | Enterprise-ready |
| 5: Sync Protocol | 6 weeks | Week 28 | Decentralized |
| 6: Review + RBAC + Notifications | 6 weeks | Week 34 | Team collaboration |
| 7: Remote Platforms + Relay | 6 weeks | Week 40 | GitHub-compatible |
| 8: TUI + Editors + Polish | 8 weeks | Week 48 | Public launch |
Total: ~48 weeks (12 months) to public launch.
This assumes a single focused developer. With a small team (2-3), phases can overlap and the timeline compresses to 7-8 months. Key acceleration opportunities:
- Phases 6 and 7 can largely run in parallel (review/RBAC is internal, remote/relay is external)
- The `gh-gpp` extension can start as early as Phase 2 (once Git bridge works)
- TUI can start development alongside Phase 5 (sync) since it's UI over existing layers