OT-RFC-38 LU-7/8/9/10 — catchup + verify + attestation + devnet harness#609
branarakic wants to merge 1 commit
…late-joiner devnet harness
Closes the curated-CG verification & late-joiner surface that LU-5 (edge
publish) opened, plus end-to-end devnet validation for the whole Phase A
slice. See `docs/specs/SPEC_CG_HOSTING_MEMBERSHIP.md` §7.1.1 for the
implementation-status table and the documented LU-6 gap.
Source surfaces
- LU-7 `POST /api/shared-memory/catchup` — caller-initiated
SWMCatchupRequest. Single-peer mode or parallel fan-out across all
connected peers. Public CGs accept anonymous catchup; curated CGs run
`authorizePrivateSyncRequest` against the requester's signed envelope.
- LU-8 `POST /api/shared-memory/{verify-batch,report-batch-rejection}` +
`packages/agent/src/swm/verify-batch.ts` — member post-decrypt root
recompute using V10's `computeFlatKCRootV10`/`computeFlatKCMerkleLeafCountV10`.
Mismatch → structured `BatchRejection` record gossiped via `agent.share()`
so other members can refetch from a different host.
- LU-9 `POST /api/attestation/{mint,verify}` +
`packages/agent/src/swm/member-attestation.ts` — member signs an
envelope binding (chainId, kavAddress, contextGraphId, batchId,
merkleRoot, plaintextLeafHash, attesterAddress, attestedAt) with
keccak256(abi.encodePacked(...)) + EIP-191 secp256k1, matching the V10
chain-side signature layout so outsiders can hand-verify. Verify route
runs signature recovery, signer-matches-attester, optional candidateLeaf
rehash, and an optional async membershipResolver chain hook.
- `packages/agent/src/swm/enumerate-cg-hosts.ts` — distinct from
`enumerate-cg-members`; returns dialable peer set for LU-7 catchup.
Phase A returns all connected peers minus self; Phase B will refine to
the sharding-table-eligible subset once shard count > 1.
- `packages/cli/src/daemon/routes/assertion.ts` — small read surface
additions that the new attestation flow leans on.
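The Phase A host enumeration described above (all connected peers minus self) can be sketched as a pure function. This is a minimal illustration only: `enumerateCgHostsPhaseA` and the `PeerId` alias are hypothetical names, not the actual exports of `enumerate-cg-hosts.ts`.

```typescript
// Hypothetical sketch of the Phase A rule; names are illustrative only.
type PeerId = string;

function enumerateCgHostsPhaseA(
  connectedPeers: readonly PeerId[],
  selfPeerId: PeerId,
): PeerId[] {
  const hosts = new Set<PeerId>();
  for (const peer of connectedPeers) {
    // Never dial ourselves; the Set drops duplicate connections to one peer.
    if (peer !== selfPeerId) hosts.add(peer);
  }
  // A Set iterates in insertion order, so the first-seen order is kept.
  return Array.from(hosts);
}
```

A Phase B variant would presumably intersect this set with the sharding-table-eligible peers; that refinement is not shown here.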
Devnet harness (`scripts/devnet-test-rfc38-*.sh`)
- 11 standalone end-to-end scenarios, all driven through the daemon
HTTP API (no custom libraries). `devnet-test-rfc38-all.sh` runs the
full suite end-to-end and prints a consolidated pass/fail summary.
- Covered: LU-5 (curated + public), LU-7, LU-8, LU-9, LU-10 (public-CG
regression sweep), `e2e` (LU-5→LU-7→LU-8→LU-9 composed in one
user-visible lifecycle), `cross-cg` (isolation: member of CG-A cannot
decrypt CG-B; outsider catchup denied), `multi-member` (3 distinct
member wallets cross-verify the same batch + cross-verify each
other's attestations), `scale` (50 triples / 25 KAs single batch),
`late-joiner` (member-from-curator + member-from-member with curator
offline; plus a documented LU-6 cores-only gap as a passing fail-soft
assertion).
- `scripts/devnet.sh restart-node N` op surface (restart a single node
without wiping state). The late-joiner scenario uses it to take the
curator offline mid-test and bring it back.
Documentation
- `docs/specs/SPEC_CG_HOSTING_MEMBERSHIP.md` §7.1.1 — implementation
status table for Phase A: LU-5/7/8/9/10 landed, LU-6 deferred. The
"deferred LU-6" subsection explains what still works on the current
branch (member-from-curator and member-from-member catchup) vs what
requires the substrate-subscription work (cores-only catchup when
every member is offline).
- `CHANGELOG.md` — Unreleased entry, scoped to OT-RFC-38 Phase A, with
one bullet per LU and a single "Deferred" callout.
Run instructions
./scripts/devnet.sh start 6 # 4 cores + 2 edges, fresh wallets
./scripts/devnet-test-rfc38-all.sh # ~10 min, 11 scenarios
Tested
- All 11 devnet scenarios PASS against a fresh 6-node devnet (4 cores
+ 2 edges, all wallets unique + funded, no on-chain identity for the
edges). Per-scenario logs land under `.devnet/integration-runs/<ts>/`.
Co-authored-by: Cursor <cursoragent@cursor.com>
const swmGraphUri = contextGraphSharedMemoryUri(contextGraphId, subGraphName);
const dataGraphUri = `did:dkg:context-graph:${contextGraphId}`;
try {
  const swmResult = await (agent as any).store.query(
🔴 Bug: when quads are omitted this reconstructs the candidate payload from the entire SWM/data graph, but expectedMerkleRoot is for a single KC/batch. As soon as a context graph contains more than one published batch, verify-batch will deterministically report root-mismatch for valid batches because unrelated triples are mixed in. Scope the local read to the requested batch/KC (or require callers to pass the exact batch quads) instead of querying every triple in the graph.
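To illustrate why an unscoped read breaks verification, here is a small sketch. `toyRoot` is a stand-in (SHA-256 over sorted triples), not V10's `computeFlatKCRootV10`, and `scopeToBatch` assumes a batch can be identified by a set of subjects, which is a hypothetical scoping rule rather than something this PR defines.

```typescript
import { createHash } from "node:crypto";

interface Quad {
  subject: string;
  predicate: string;
  object: string;
}

// Stand-in root: SHA-256 over the sorted triples. NOT the V10 algorithm.
function toyRoot(quads: readonly Quad[]): string {
  const lines = quads
    .map((q) => `${q.subject} ${q.predicate} ${q.object}`)
    .sort();
  return createHash("sha256").update(lines.join("\n")).digest("hex");
}

// Hypothetical scoping rule: keep only quads whose subject is in the batch.
function scopeToBatch(
  quads: readonly Quad[],
  batchSubjects: ReadonlySet<string>,
): Quad[] {
  return quads.filter((q) => batchSubjects.has(q.subject));
}
```

With two batches in one graph, the root over the whole graph differs from the first batch's expected root, while the root over the scoped subset matches it.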
try {
  const cgList = await (agent as any).listContextGraphs?.();
  const match = (cgList ?? []).find((cg: any) => cg.id === contextGraphId);
  onChainCgId = match?.onChainId ?? '0';
🔴 Bug: this silently signs attestations with contextGraphId = "0" whenever the local subscription metadata cannot resolve an on-chain id. That produces tokens bound to the wrong domain even though the KC already exists on-chain. Resolve the CG id from chain truth (getKCContextGraphId(BigInt(batchId)) / getContextGraphOnChainId) and fail if it cannot be determined instead of minting an attestation with a placeholder id.
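A fail-closed guard along the lines suggested above might look like the following sketch. `requireOnChainCgId` is a hypothetical helper, and it assumes `"0"` is never a valid on-chain context graph id; the chain lookup itself (for example via `getKCContextGraphId`) is left to the caller.

```typescript
// Hypothetical guard: refuse to mint when the on-chain id is unresolved.
// Assumption: "0" is a placeholder and never a real on-chain id.
function requireOnChainCgId(
  resolved: string | null | undefined,
  batchId: string,
): string {
  if (resolved === null || resolved === undefined || resolved === "" || resolved === "0") {
    throw new Error(
      `cannot resolve on-chain context graph id for batch ${batchId}; refusing to mint attestation`,
    );
  }
  return resolved;
}
```

The mint route would call this on the value returned by the chain lookup and surface the error to the caller instead of signing.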
{ subject, predicate: `${NS}rejectedByPeer`, object: `"${record.rejectedBy.peerId ?? ''}"`, graph: '' },
{ subject, predicate: `${NS}rejectionReportedAt`, object: `"${record.reportedAt}"`, graph: '' },
...(record.batchId !== undefined
  ? [{ subject, predicate: `${NS}rejectedBatchId`, object: `"${record.batchId}"`, graph: '' }]
🔴 Bug: batchId comes from the HTTP body and is interpolated into an RDF literal without escaping. A value containing ", newlines, or RDF syntax will either break the SWM write or let callers smuggle malformed triples through this endpoint. Escape the literal with the existing RDF helper (or reject unsafe input) before passing it to agent.share().
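For reference, a minimal literal escaper is sketched below. It covers the characters that N-Triples forbids unescaped inside a quoted literal (backslash, double quote, line feed, carriage return) plus tab. The names `escapeRdfLiteral` and `toPlainLiteral` are hypothetical; the project's existing RDF helper should be preferred if it already does this.

```typescript
// Hypothetical helper: escape a string for use inside an RDF "..." literal.
function escapeRdfLiteral(value: string): string {
  return value
    .replace(/\\/g, "\\\\") // backslash first, so later escapes are not doubled
    .replace(/"/g, '\\"')
    .replace(/\n/g, "\\n")
    .replace(/\r/g, "\\r")
    .replace(/\t/g, "\\t");
}

// Wrap an arbitrary string as a plain literal term.
function toPlainLiteral(value: string): string {
  return `"${escapeRdfLiteral(value)}"`;
}
```

With this in place the triple would use `object: toPlainLiteral(String(record.batchId))` rather than raw interpolation.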
input.verifyResult.actualRoot,
input.verifyResult.reason ?? 'unknown',
input.rejectedBy.agentAddress,
reportedAt,
🟡 Issue: reportedAt is part of the rejection digest, so retries of the same rejection from the same member always mint a new subject URI instead of deduping. That defeats the stated "hash-dedupe identical rejection reports" behavior and makes idempotent re-reporting impossible. Use a stable digest key derived from the batch/root/rejecter fields and keep reportedAt as metadata outside the digest.
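One way to get a stable key is sketched below. The field names in `RejectionKeyFields` are hypothetical, and SHA-256 is used only as a stand-in for whatever hash the real digest uses; the point is that `reportedAt` never enters the hashed input.

```typescript
import { createHash } from "node:crypto";

// Hypothetical shape: only the fields that identify a rejection.
interface RejectionKeyFields {
  contextGraphId: string;
  batchId?: string;
  expectedRoot: string;
  actualRoot: string;
  reason: string;
  rejectedByAddress: string;
}

function stableRejectionKey(f: RejectionKeyFields): string {
  const parts = [
    f.contextGraphId,
    f.batchId ?? "",
    f.expectedRoot,
    f.actualRoot,
    f.reason,
    f.rejectedByAddress.toLowerCase(), // address casing should not split keys
  ];
  // JSON encoding keeps field boundaries unambiguous (no delimiter collisions).
  return createHash("sha256").update(JSON.stringify(parts)).digest("hex");
}
```

Two reports of the same rejection then map to the same subject URI regardless of when they were sent, and `reportedAt` can still be written as a separate triple.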
log " OT-RFC-38 INTEGRATION RUN SUMMARY"
log "================================================================"
note "OT-RFC-38 INTEGRATION RUN SUMMARY"
note "Run started: $(date -u -r "$START_TS" +'%Y-%m-%dT%H:%M:%SZ')"
🔴 Bug: date -r <epoch-seconds> is BSD/macOS syntax; on GNU/Linux -r expects a file path, so the summary step fails on the project's primary dev/CI environment. Use a portable epoch conversion (date -u -d "@${START_TS}" ... on GNU, or a small POSIX-compatible helper) before relying on this runner in Linux devnets.
Summary
Closes the curated-CG verification & late-joiner surface that LU-5 (#608) opened, plus end-to-end devnet validation for the whole Phase A slice. See `docs/specs/SPEC_CG_HOSTING_MEMBERSHIP.md` §7.1.1 for the implementation-status table and the documented LU-6 gap.
Stacked on: #608 (LU-5 edge publish), which is itself stacked on #595 (SPEC_CG_MEMORY_MODEL LU-1..LU-4). Reviewing in order keeps each diff small.
Source surfaces
- `POST /api/shared-memory/catchup`: caller-initiated `SWMCatchupRequest`. Single-peer mode or parallel fan-out across all connected peers. Public CGs accept anonymous catchup; curated CGs run `authorizePrivateSyncRequest` against the requester's signed envelope.
- `POST /api/shared-memory/{verify-batch,report-batch-rejection}` + `packages/agent/src/swm/verify-batch.ts`: member post-decrypt root recompute using V10's `computeFlatKCRootV10`/`computeFlatKCMerkleLeafCountV10`. Mismatch → structured `BatchRejection` record gossiped via `agent.share()` so other members can refetch from a different host.
- `POST /api/attestation/{mint,verify}` + `packages/agent/src/swm/member-attestation.ts`: member signs an envelope binding `(chainId, kavAddress, contextGraphId, batchId, merkleRoot, plaintextLeafHash, attesterAddress, attestedAt)` with `keccak256(abi.encodePacked(...))` + EIP-191 secp256k1, matching V10 chain-side signature layout so outsiders can hand-verify. Verify route runs signature recovery, signer-matches-attester, optional `candidateLeaf` rehash, and an optional async `membershipResolver` chain hook.
- `enumerate-cg-hosts` (`packages/agent/src/swm/enumerate-cg-hosts.ts`): distinct from `enumerate-cg-members`; returns dialable peer set for LU-7 catchup. Phase A returns all connected peers minus self; Phase B will refine to the sharding-table-eligible subset once shard count > 1.
- `packages/cli/src/daemon/routes/assertion.ts`: small read surface additions the new attestation flow leans on.
Devnet harness (`scripts/devnet-test-rfc38-*.sh`)
- 11 standalone end-to-end scenarios, all driven through the daemon HTTP API (no custom libraries). `devnet-test-rfc38-all.sh` runs the full suite end-to-end and prints a consolidated pass/fail summary.
- Covered: LU-5 (curated + public), LU-7, LU-8, LU-9, LU-10 (public-CG regression sweep), `e2e`, `cross-cg`, `multi-member`, `scale`, `late-joiner`.
- `scripts/devnet.sh restart-node N` op surface (restart a single node without wiping state). The late-joiner scenario uses it to take the curator offline mid-test and bring it back.
Test plan
Unit:
- `packages/agent/test/verify-batch.test.ts`: pure recompute helper unit tests
- `packages/agent/test/member-attestation.test.ts`: mint+verify roundtrip + tamper detection + membership resolver paths
- `packages/agent/test/enumerate-cg-hosts.test.ts`: dialable-peer enumeration
Devnet: all 11 scenarios PASS against a fresh 6-node devnet (4 cores + 2 edges, all wallets unique + funded, no on-chain identity for the edges).
Run instructions
For UI manual testing — point Vite at any edge node:
DEVNET_UI_NODE=5 ./scripts/devnet.sh ui start # http://localhost:5173/ui/ now proxies /api/* to node 5 (edge curator)
Deferred (Phase A sub-task, tracked for follow-up)
LU-6 substrate hosting on cores: cores do not yet subscribe to the curated-CG SWM gossip topic via the sharding-table assignment (RFC §5.1 + §5.1.1 pre-registration staging). Today's catchup model works when the curator OR any other current member is online; if every member is offline, a late joiner's catchup against cores returns 0 triples cleanly (no crash). `devnet-test-rfc38-late-joiner.sh` SCENARIO C asserts this fail-soft shape. Full LU-6 lands the encrypted SWM substrate (the `SwmSenderKey` two-layer Sender Keys construction already in `packages/core/src/crypto/swm-sender-key.ts` but not yet wired to the workspace-gossip topic) plus the TTL + byte-cap staging policies in §5.1.1. Path forward documented in `docs/specs/SPEC_CG_HOSTING_MEMBERSHIP.md` §7.1.1.
Made with Cursor