gpp uses a CRDT-based, offline-first, peer-to-peer synchronization protocol. There is no central server requirement. Any gpp repository can sync with any other gpp repository directly.
- Offline-first. Work without network, sync when reconnected. All operations are local.
- No central authority. Any peer is equal. No single point of failure.
- Convergent. All peers that have seen the same set of operations will have the same state.
- Efficient. Only transfer what the other peer doesn't have.
- Encrypted in transit. All peer communication uses the Noise Protocol Framework (the same framework WireGuard builds on).
- Zero-knowledge graph sync. Graph structure can sync without decrypting node content.
| Layer | What Syncs | CRDT Type | Notes |
|---|---|---|---|
| Objects | Blobs, trees, changesets | Set (add-only) | Content-addressed, immutable |
| History | Changeset DAG | DAG CRDT | Parents define partial order |
| Refs | Branch pointers | LWW Register | Last-writer-wins per ref |
| Graphex | Graph nodes + edges | OR-Set | Add/remove with unique tags |
| Trust | Agent scores | Local only | Never synced; each peer computes independently |
| Timeline | File changes | Local only | Never synced; local safety net only |
| Policies | Policy rules | Set (add-only) | Shared policies sync; local overrides don't |
Peers are configured explicitly. For security, there is no automatic discovery unless the optional LAN multicast discovery is enabled:
# .gpp/config.toml
[sync]
peers = [
{ name = "office", address = "192.168.1.50:9473" },
{ name = "backup", address = "backup.example.com:9473" },
]

Initiator                              Responder
    │                                      │
    ├──── Noise_XX handshake ──────────────┤
    │     (ephemeral keys, then static)    │
    │                                      │
    ├──── Auth: repo ID + peer identity ───┤
    │                                      │
    ├──── Protocol version negotiation ────┤
    │                                      │
    ▼     Encrypted channel established    ▼
Transport: TCP + Noise_XX pattern. Both peers authenticate with their static keys. The repo ID ensures both sides are syncing the same repository.
Both peers exchange their "state vector", a compact summary of what they have:
use std::collections::BTreeMap;

struct StateVector {
    repo_id: Hash,
    // Object store: bloom filter of known object hashes
    object_bloom: BloomFilter, // False positive rate ~0.01%
    // History: branch name -> tip changeset hash, for every branch
    branch_tips: BTreeMap<String, Hash>,
    // Graphex: vector clock of graph operations
    graph_vector_clock: VectorClock,
    // Policies: hash of active policy set
    policy_set_hash: Hash,
    // Timestamp of last sync with this peer
    last_sync: i64,
}

Each peer computes what the other is missing:
Local state vector  ─┐
                     ├── Compute delta ─→ List of objects/operations to send
Remote state vector ─┘
For objects: use bloom filter to identify candidates, then send exact hash lists for the candidates to confirm what's actually missing.
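One way to read the object step is sketched below. It is a toy Bloom filter, not gpp's actual `BloomFilter`: the `Bloom` type, its sizing parameters, and the use of the standard library's `DefaultHasher` are all assumptions made for illustration. A negative answer from the filter is definitive, so those objects can be queued for sending straight away; a positive answer may be a false positive, which is why the exact hash lists are still needed to confirm.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Toy Bloom filter over object hashes (hypothetical; not gpp's real filter).
pub struct Bloom {
    bits: Vec<bool>,
    hashes: u64,
}

impl Bloom {
    pub fn new(num_bits: usize, hashes: u64) -> Self {
        Bloom { bits: vec![false; num_bits.max(1)], hashes: hashes.max(1) }
    }

    // Derive the i-th bit position by mixing the index i into the hasher.
    fn index(&self, item: &[u8], i: u64) -> usize {
        let mut h = DefaultHasher::new();
        i.hash(&mut h);
        item.hash(&mut h);
        (h.finish() % self.bits.len() as u64) as usize
    }

    pub fn insert(&mut self, item: &[u8]) {
        for i in 0..self.hashes {
            let idx = self.index(item, i);
            self.bits[idx] = true;
        }
    }

    /// false = definitely absent; true = possibly present (may be a false positive).
    pub fn might_contain(&self, item: &[u8]) -> bool {
        (0..self.hashes).all(|i| self.bits[self.index(item, i)])
    }
}

/// Local object hashes that the remote filter definitely does not contain.
pub fn definitely_missing<'a>(local: &'a [Vec<u8>], remote: &Bloom) -> Vec<&'a Vec<u8>> {
    local
        .iter()
        .filter(|h| !remote.might_contain(h.as_slice()))
        .collect()
}
```

Everything not returned by `definitely_missing` is only "possibly present" on the remote side and would go through the exact-hash confirmation round.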
For history: walk the changeset DAG from branch tips backward until reaching changesets the remote peer already has.
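The history walk can be sketched as a breadth-first traversal. The `CsHash` alias, the `parents` map, and the `remote_has` set are hypothetical stand-ins for gpp's real types, and the sketch assumes that a peer holding a changeset also holds all of its ancestors.

```rust
use std::collections::{HashMap, HashSet, VecDeque};

/// Hypothetical stand-in for a changeset hash.
pub type CsHash = String;

/// Walk backward from the local branch tips and collect every changeset the
/// remote peer lacks. The walk stops at any changeset the remote already has.
/// The result is in discovery order (tips first), not a topological order.
pub fn changesets_to_send(
    tips: &[CsHash],
    parents: &HashMap<CsHash, Vec<CsHash>>,
    remote_has: &HashSet<CsHash>,
) -> Vec<CsHash> {
    let mut seen: HashSet<CsHash> = HashSet::new();
    let mut queue: VecDeque<CsHash> = tips.iter().cloned().collect();
    let mut missing: Vec<CsHash> = Vec::new();

    while let Some(cs) = queue.pop_front() {
        // Skip anything the remote already has or that we already visited.
        if remote_has.contains(&cs) || !seen.insert(cs.clone()) {
            continue;
        }
        if let Some(ps) = parents.get(&cs) {
            queue.extend(ps.iter().cloned());
        }
        missing.push(cs);
    }
    missing
}
```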
For Graphex: use vector clock comparison to identify unseen operations.
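A vector clock comparison of this kind might look like the sketch below. It assumes the clock maps each peer ID to the count of operations seen from that peer; the `VectorClock` alias and `unseen_by_remote` are illustrative names, not gpp's API.

```rust
use std::collections::BTreeMap;

/// Hypothetical vector clock: peer ID -> number of operations seen from that peer.
pub type VectorClock = BTreeMap<String, u64>;

/// For each originating peer, the range of operation sequence numbers
/// (exclusive start, inclusive end) that `local` has seen and `remote` has not.
pub fn unseen_by_remote(
    local: &VectorClock,
    remote: &VectorClock,
) -> BTreeMap<String, (u64, u64)> {
    let mut out = BTreeMap::new();
    for (peer, &mine) in local {
        // A peer absent from the remote clock counts as zero operations seen.
        let theirs = remote.get(peer).copied().unwrap_or(0);
        if mine > theirs {
            out.insert(peer.clone(), (theirs, mine));
        }
    }
    out
}
```

Entries where the remote is ahead are simply ignored here; the remote peer runs the same comparison in the other direction.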
enum SyncMessage {
    // Object transfer
    ObjectBatch { objects: Vec<(Hash, Vec<u8>)> },
    ObjectRequest { hashes: Vec<Hash> },
    // History transfer
    ChangesetBatch { changesets: Vec<Changeset> },
    RefUpdate { name: String, hash: Hash, timestamp: i64 },
    // Graphex transfer
    GraphOperationBatch { operations: Vec<GraphOperation> },
    // Policy transfer
    PolicySet { policies: Vec<PolicyRule> },
    // Control
    SyncComplete,
    Error { code: u32, message: String },
}

Objects are transferred in batches of up to 1 MB. The receiving peer validates each object (hash check) before acknowledging.
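The batching rule could be implemented greedily, as in the sketch below. Two things here are assumptions: that 1 MB means 1024 * 1024 bytes of object payload, and that an object larger than the limit travels in a batch of its own (the text does not say how oversized objects are handled). Hashes are plain `String`s for illustration.

```rust
/// Assumed batch limit: 1 MB, taken here as 1024 * 1024 payload bytes.
pub const MAX_BATCH_BYTES: usize = 1024 * 1024;

/// Greedily group (hash, data) pairs into batches whose total data size stays
/// within `limit`. An object larger than `limit` ends up alone in its batch.
pub fn batch_objects(
    objects: Vec<(String, Vec<u8>)>,
    limit: usize,
) -> Vec<Vec<(String, Vec<u8>)>> {
    let mut batches: Vec<Vec<(String, Vec<u8>)>> = Vec::new();
    let mut current: Vec<(String, Vec<u8>)> = Vec::new();
    let mut current_bytes = 0usize;

    for (hash, data) in objects {
        // Close the current batch if adding this object would exceed the limit.
        if !current.is_empty() && current_bytes + data.len() > limit {
            batches.push(std::mem::take(&mut current));
            current_bytes = 0;
        }
        current_bytes += data.len();
        current.push((hash, data));
    }
    if !current.is_empty() {
        batches.push(current);
    }
    batches
}
```

A sender would call `batch_objects(objects, MAX_BATCH_BYTES)` and wait for an acknowledgment per batch, which is also what makes resuming from the last acknowledged batch possible.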
After transfer, both peers apply received operations:
- Objects: store in object store (idempotent: same hash = same content)
- History: add changesets to DAG, update refs using LWW
- Graphex: apply operations to OR-Set, resolve conflicts
- Policies: merge policy sets
When two peers have different branch tips for the same branch name, gpp does NOT auto-merge. Instead:
Peer A: main → cs:abc123
Peer B: main → cs:def456

After sync, both peers see:

main → cs:abc123 (Peer A's version, by LWW)
main@B → cs:def456 (Peer B's version, preserved as fork)

The developer then explicitly merges:

gpp merge main@B # Merge Peer B's divergent main into local main

Graph operations use an OR-Set CRDT (Observed-Remove Set):
- Add wins over remove when concurrent (if one peer adds a node while another removes it, the node stays)
- Property conflicts use LWW (last-writer-wins based on timestamp)
- Edge conflicts are resolved by keeping both; humans review and prune
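The add-wins rule falls out of how an OR-Set tracks tags. The sketch below is a minimal, generic OR-Set, not gpp's Graphex implementation: elements are plain strings, and the `(peer ID, counter)` tag shape is an assumption. A remove only tombstones the tags the removing replica has observed, so a concurrent add with a fresh tag survives the merge.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Unique tag for one add operation: (peer ID, per-peer counter). Hypothetical.
pub type Tag = (String, u64);

#[derive(Clone, Debug, Default, PartialEq)]
pub struct OrSet {
    adds: BTreeMap<String, BTreeSet<Tag>>, // element -> every tag it was added with
    tombstones: BTreeSet<Tag>,             // tags that were observed and removed
}

impl OrSet {
    pub fn add(&mut self, elem: &str, tag: Tag) {
        self.adds.entry(elem.to_string()).or_default().insert(tag);
    }

    /// Remove only the tags this replica has observed for `elem`.
    pub fn remove(&mut self, elem: &str) {
        if let Some(tags) = self.adds.get(elem) {
            self.tombstones.extend(tags.iter().cloned());
        }
    }

    /// An element is present if at least one of its tags is not tombstoned.
    pub fn contains(&self, elem: &str) -> bool {
        self.adds
            .get(elem)
            .map_or(false, |tags| tags.iter().any(|t| !self.tombstones.contains(t)))
    }

    /// Merge is a union of both components, so it is commutative and idempotent.
    pub fn merge(&mut self, other: &OrSet) {
        for (elem, tags) in &other.adds {
            self.adds
                .entry(elem.clone())
                .or_default()
                .extend(tags.iter().cloned());
        }
        self.tombstones.extend(other.tombstones.iter().cloned());
    }
}
```

This version keeps tombstones forever; a production design would need some form of garbage collection, which is outside the scope of the sketch.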
Branch refs (pointers to changesets) use Last-Writer-Wins Register:
- Each ref update carries a Lamport timestamp
- Higher timestamp wins
- Ties broken by peer ID (deterministic ordering)
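These three rules amount to comparing `(timestamp, peer ID)` pairs. The sketch below uses hypothetical names (`RefValue`, `lww_merge`); the direction of the peer-ID tie-break (higher ID wins) is an assumption, since the rule only requires that every peer orders ties the same way.

```rust
/// One ref value stamped with a Lamport timestamp and the writer's peer ID.
#[derive(Clone, Debug, PartialEq, Eq)]
pub struct RefValue {
    pub hash: String,
    pub lamport: u64,
    pub peer_id: String,
}

/// Pick the winner: higher Lamport timestamp first, then higher peer ID.
/// The tie-break direction is an assumption; it only has to be deterministic.
pub fn lww_merge(a: RefValue, b: RefValue) -> RefValue {
    if (b.lamport, &b.peer_id) > (a.lamport, &a.peer_id) {
        b
    } else {
        a
    }
}

/// Standard Lamport clock rule: on receive, move past the largest timestamp seen.
pub fn lamport_on_receive(local: u64, received: u64) -> u64 {
    local.max(received) + 1
}
```

Because the comparison is a total order on `(lamport, peer_id)`, merging in either order yields the same winner, which is what lets all peers converge on one value per ref.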
Content-addressed storage means identical files are never transferred twice, even across different changesets or branches.
For initial clone or large syncs, objects are packed into "thin packs": delta-compressed bundles where similar objects are stored as deltas against a base. This is similar to Git's pack files, but uses zstd dictionary compression for better ratios.
After the initial clone, syncs transfer only objects added since last_sync. The cost of the bloom filter exchange scales with the filter size, not with the object count.
For federated Graphex subgraphs, you can sync just the graph layer without syncing code:
gpp sync --graph-only --peer conventions-server

When syncing Graphex with a peer, the following information is shared:
- Graph structure (which nodes exist, which edges connect them)
- Node metadata (type, name, access tier, timestamps)
- Encrypted node content blobs (the peer gets the ciphertext but can't read it without the right tier key)
The peer CANNOT read:
- Node descriptions or properties (encrypted)
- Convention text (encrypted)
- Glossary definitions (encrypted)
This means a backup server can hold a full copy of the graph for redundancy without being able to read the project's knowledge. Only peers with the appropriate tier keys can decrypt.
Laptop ←──sync──→ NAS (backup)

Dev A ←────────────sync────────────→ Dev B
  │                                    │
  └──sync──→ Office Server ←──sync─────┘

Project A ←──sync──→ Project A Backup
    │
    └──federation──→ Shared Conventions ←──federation──┐
                                                       │
Project B ←──sync──→ Project B Backup                  │
    │                                                  │
    └──federation──→ Shared Conventions ───────────────┘
┌───────────────┐
│ Length: u32   │  Payload length (big-endian)
│ Type: u8      │  Message type enum
│ Payload: [u8] │  MessagePack-encoded body
│ MAC: [u8; 16] │  Noise protocol MAC
└───────────────┘
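The frame layout can be exercised with a small encoder and decoder. This is a sketch under assumptions: the MAC is treated as 16 opaque bytes supplied by the Noise layer, the payload is treated as already MessagePack-encoded, and the function names are hypothetical.

```rust
/// Size of the trailing MAC field, per the frame layout.
pub const MAC_LEN: usize = 16;

/// Serialize one frame: u32 big-endian payload length, u8 type, payload, MAC.
/// The MAC is passed in as opaque bytes; computing it is the Noise layer's job.
/// Payloads longer than u32::MAX bytes are not handled by this sketch.
pub fn encode_frame(msg_type: u8, payload: &[u8], mac: [u8; MAC_LEN]) -> Vec<u8> {
    let mut out = Vec::with_capacity(4 + 1 + payload.len() + MAC_LEN);
    out.extend_from_slice(&(payload.len() as u32).to_be_bytes());
    out.push(msg_type);
    out.extend_from_slice(payload);
    out.extend_from_slice(&mac);
    out
}

/// Parse one frame from the start of `buf`. Returns None if `buf` is too short.
pub fn decode_frame(buf: &[u8]) -> Option<(u8, Vec<u8>, [u8; MAC_LEN])> {
    if buf.len() < 5 {
        return None;
    }
    let len = u32::from_be_bytes([buf[0], buf[1], buf[2], buf[3]]) as usize;
    let msg_type = buf[4];
    let payload_end = 5usize.checked_add(len)?;
    let frame_end = payload_end.checked_add(MAC_LEN)?;
    if buf.len() < frame_end {
        return None;
    }
    let payload = buf[5..payload_end].to_vec();
    let mut mac = [0u8; MAC_LEN];
    mac.copy_from_slice(&buf[payload_end..frame_end]);
    Some((msg_type, payload, mac))
}
```

A real receiver would verify the MAC before trusting the length or type fields; the decoder above only checks that the buffer is long enough.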
| Type | Code | Direction | Description |
|---|---|---|---|
| Hello | 0x01 | Both | Initial handshake |
| StateVector | 0x02 | Both | State vector exchange |
| ObjectBatch | 0x10 | Both | Batch of objects |
| ObjectRequest | 0x11 | Both | Request missing objects |
| ChangesetBatch | 0x20 | Both | Batch of changesets |
| RefUpdate | 0x21 | Both | Branch ref update |
| GraphOpBatch | 0x30 | Both | Batch of graph operations |
| PolicySet | 0x40 | Both | Policy set |
| Ack | 0xF0 | Both | Acknowledgment |
| Error | 0xFE | Both | Error message |
| Done | 0xFF | Both | Sync complete |
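The type codes map naturally onto a `u8`-backed enum. The sketch below copies the names and codes from the message type table; the `MessageType` enum itself and its `from_code`/`code` helpers are illustrative, not gpp's actual definitions.

```rust
/// Wire codes for the frame's Type byte, as listed in the message type table.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
#[repr(u8)]
pub enum MessageType {
    Hello = 0x01,
    StateVector = 0x02,
    ObjectBatch = 0x10,
    ObjectRequest = 0x11,
    ChangesetBatch = 0x20,
    RefUpdate = 0x21,
    GraphOpBatch = 0x30,
    PolicySet = 0x40,
    Ack = 0xF0,
    Error = 0xFE,
    Done = 0xFF,
}

impl MessageType {
    /// Map a wire code back to a message type; unknown codes yield None.
    pub fn from_code(code: u8) -> Option<Self> {
        Some(match code {
            0x01 => Self::Hello,
            0x02 => Self::StateVector,
            0x10 => Self::ObjectBatch,
            0x11 => Self::ObjectRequest,
            0x20 => Self::ChangesetBatch,
            0x21 => Self::RefUpdate,
            0x30 => Self::GraphOpBatch,
            0x40 => Self::PolicySet,
            0xF0 => Self::Ack,
            0xFE => Self::Error,
            0xFF => Self::Done,
            _ => return None,
        })
    }

    /// The wire code for this message type.
    pub fn code(self) -> u8 {
        self as u8
    }
}
```

Returning `None` for unknown codes leaves the policy decision (reply with `Error`, or ignore for forward compatibility) to the caller.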
- Connection lost mid-sync: Resume from last acknowledged batch. State vector exchange is repeated but cheap.
- Corrupt object received: Reject (hash doesn't match), request retransmission.
- Clock skew: Lamport timestamps are logical, not wall-clock. No NTP dependency.
- Peer permanently unavailable: Other peers still work. No single point of failure.
- Storage full: Sync pauses with clear error. Resume when space available.
The relay node (gpp-relay) is a specialized peer optimized for always-on availability.
The relay differs from a regular peer in these ways:
- No working directory. The relay doesn't check out files. It stores objects and forwards sync operations only.
- No timeline. The relay never captures file changes (there are no files).
- No trust computation. Trust scores are local; the relay doesn't compute them.
- No policy enforcement. Policies are enforced by the receiving peer, not the relay.
- RBAC enforcement. The relay DOES check RBAC roles: it rejects pushes from peers without the required role.
- Multi-repo support. A single relay can host multiple repositories, each isolated.
# /etc/gpp-relay/config.toml
[relay]
port = 9473
storage = "/data/gpp"
max_repos = 100
max_storage_gb = 500
log_level = "info"
[auth]
authorized_keys = "/etc/gpp-relay/authorized_keys"
allow_anonymous_read = false # Public repos could enable this
[limits]
max_object_size_mb = 100
max_batch_size_mb = 50
rate_limit_per_peer = 1000 # Max sync operations per minute per peer

Developer A                   Relay                    Developer B
    │                           │                           │
    ├── push changeset ────────→│                           │
    │   (relay stores objects,  │                           │
    │    validates RBAC role)   │                           │
    │                           │                           │
    │                           │←─── pull ─────────────────┤
    │                           │  (relay sends delta)      │
    │                           ├── objects + changesets ──→│
    │                           │                           │
The relay is transparent: Developer B sees the same objects as if they had synced directly with Developer A. The relay just adds persistence and availability.
The sync protocol enforces RBAC roles at the protocol level.
| Operation | Required Role | Enforcement Point |
|---|---|---|
| Pull objects/history | Reader or above | Relay/peer |
| Push objects/history | Contributor or above | Relay/peer |
| Push ref updates (non-protected) | Contributor or above | Relay/peer |
| Push ref updates (protected branch) | Maintainer or above | Relay/peer |
| Push role changes | Owner | Relay/peer |
| Push policy changes | Maintainer or above | Relay/peer |
| Pull Graphex (encrypted) | Reader or above | Relay/peer (can't decrypt) |
| Push Graphex operations | Contributor or above | Relay/peer |
The sync protocol resolves peer identity from their Noise static key:
- Peer authenticates via Noise handshake (static key verified)
- Relay/peer maps static key to identity (email/fingerprint)
- Identity looked up in RBAC roles table
- Role checked against required permission for the operation
- If role insufficient, operation rejected with error code 7 (Permission denied)
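The last two steps can be sketched as a role comparison. The role ordering (Reader < Contributor < Maintainer < Owner) is inferred from the "or above" wording in the permissions table, and the `Role`, `Operation`, and `authorize` names are hypothetical. The key-to-identity lookup is left out; `authorize` takes the already-resolved role, or `None` when the static key maps to no known identity.

```rust
/// Roles in ascending order of privilege, so the derived `Ord` gives "or above".
#[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord)]
pub enum Role {
    Reader,
    Contributor,
    Maintainer,
    Owner,
}

/// Operations from the permissions table (names are illustrative).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Operation {
    Pull,
    PushObjects,
    PushRef { protected: bool },
    PushRoleChange,
    PushPolicyChange,
    PullGraphex,
    PushGraphex,
}

/// Error code given in the text for an insufficient role.
pub const ERR_PERMISSION_DENIED: u32 = 7;

/// Minimum role for each operation, following the permissions table.
pub fn required_role(op: Operation) -> Role {
    match op {
        Operation::Pull | Operation::PullGraphex => Role::Reader,
        Operation::PushObjects | Operation::PushGraphex => Role::Contributor,
        Operation::PushRef { protected: false } => Role::Contributor,
        Operation::PushRef { protected: true } => Role::Maintainer,
        Operation::PushPolicyChange => Role::Maintainer,
        Operation::PushRoleChange => Role::Owner,
    }
}

/// `role` is None when the peer's static key maps to no known identity.
pub fn authorize(role: Option<Role>, op: Operation) -> Result<(), u32> {
    match role {
        Some(r) if r >= required_role(op) => Ok(()),
        _ => Err(ERR_PERMISSION_DENIED),
    }
}
```

Treating an unknown identity the same as an insufficient role is a choice made for this sketch; a relay with `allow_anonymous_read` enabled would presumably map unknown keys to a read-only role instead.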
Reviews can sync between peers alongside changesets.
| Data | Syncs? | Notes |
|---|---|---|
| Review objects | Yes | Status, decisions, policy requirements |
| Review comments | Yes | File/line-targeted comments |
| ConversationThreads | Yes | Full thread history |
| Remote PR links | No | Platform-specific, local to each peer |
If two reviewers approve on different peers before syncing:
- Both approvals are kept (additive)
- The review status resolves to the most advanced state (approved > pending)
- If conflicting decisions exist (one approve, one reject), status stays pending and both decisions are visible for the maintainer to resolve
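The approval rules can be sketched as a set union followed by a status computation. The type names are hypothetical, and one case is an assumption the text does not cover: a review with only reject decisions is reported as `Rejected` here.

```rust
use std::collections::BTreeSet;

#[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord)]
pub enum Decision {
    Approve,
    Reject,
}

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Status {
    Pending,
    Approved,
    Rejected,
}

/// Decisions keyed by reviewer name, so the same decision is not double-counted.
pub type Decisions = BTreeSet<(String, Decision)>;

/// Merging is additive: keep every decision from both peers.
pub fn merge_decisions(a: &Decisions, b: &Decisions) -> Decisions {
    a.union(b).cloned().collect()
}

/// Approvals alone resolve to Approved; mixed decisions stay Pending.
/// Reject-only resolving to Rejected is an assumption of this sketch.
pub fn resolve_status(decisions: &Decisions) -> Status {
    let approved = decisions.iter().any(|(_, d)| *d == Decision::Approve);
    let rejected = decisions.iter().any(|(_, d)| *d == Decision::Reject);
    match (approved, rejected) {
        (true, false) => Status::Approved,
        (false, true) => Status::Rejected,
        _ => Status::Pending,
    }
}
```

Because the merge is a plain set union, it gives the same result regardless of which peer syncs first, and both conflicting decisions remain visible for the maintainer.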
Default port: 9473 (TCP)
Service name: gpp-sync
Multicast discovery (LAN only, optional): 239.73.73.73:9473