OT-RFC-38 LU-7/8/9/10 — catchup + verify + attestation + devnet harness #609

Open
branarakic wants to merge 1 commit into feat/ot-rfc-38-lu5 from feat/ot-rfc-38-lu7-10
@branarakic
Contributor

Summary

Closes the curated-CG verification & late-joiner surface that LU-5 (#608) opened, plus end-to-end devnet validation for the whole Phase A slice. See docs/specs/SPEC_CG_HOSTING_MEMBERSHIP.md §7.1.1 for the implementation-status table and the documented LU-6 gap.

Stacked on: #608 (LU-5 edge publish), which is itself stacked on #595 (SPEC_CG_MEMORY_MODEL LU-1..LU-4). Reviewing in order keeps each diff small.

Source surfaces

  • LU-7 POST /api/shared-memory/catchup — caller-initiated SWMCatchupRequest. Single-peer mode or parallel fan-out across all connected peers. Public CGs accept anonymous catchup; curated CGs run authorizePrivateSyncRequest against the requester's signed envelope.
  • LU-8 POST /api/shared-memory/{verify-batch,report-batch-rejection} + packages/agent/src/swm/verify-batch.ts — member post-decrypt root recompute using V10's computeFlatKCRootV10 / computeFlatKCMerkleLeafCountV10. Mismatch → structured BatchRejection record gossiped via agent.share() so other members can refetch from a different host.
  • LU-9 POST /api/attestation/{mint,verify} + packages/agent/src/swm/member-attestation.ts — member signs an envelope binding (chainId, kavAddress, contextGraphId, batchId, merkleRoot, plaintextLeafHash, attesterAddress, attestedAt) with keccak256(abi.encodePacked(...)) + EIP-191 secp256k1, matching V10 chain-side signature layout so outsiders can hand-verify. Verify route runs signature recovery, signer-matches-attester, optional candidateLeaf rehash, and an optional async membershipResolver chain hook.
  • enumerate-cg-hosts (packages/agent/src/swm/enumerate-cg-hosts.ts) — distinct from enumerate-cg-members; returns dialable peer set for LU-7 catchup. Phase A returns all connected peers minus self; Phase B will refine to the sharding-table-eligible subset once shard count > 1.
  • packages/cli/src/daemon/routes/assertion.ts — small read surface additions the new attestation flow leans on.
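
The LU-7 parallel fan-out described above could look roughly like the sketch below. It is not the code in this PR: `fanOutCatchup`, `CatchupFn`, and the `Triple` shape are hypothetical names for illustration, and the per-peer request is injected so the sketch makes no assumption about the real SWMCatchupRequest transport. A peer that fails is recorded rather than aborting the whole catchup, and identical triples returned by several peers are merged.

```typescript
// Hypothetical shapes, for illustration only.
type Triple = { subject: string; predicate: string; object: string };
type CatchupFn = (peerId: string) => Promise<Triple[]>;

interface FanOutResult {
  triples: Triple[];      // union of all peers' answers, deduplicated
  failedPeers: string[];  // peers whose catchup request rejected
}

async function fanOutCatchup(peers: string[], catchup: CatchupFn): Promise<FanOutResult> {
  // Issue every request at once; convert a rejection into a null result
  // so that one unreachable peer does not fail the whole fan-out.
  const outcomes = await Promise.all(
    peers.map((peerId) =>
      catchup(peerId).then(
        (triples) => ({ peerId, triples: triples as Triple[] | null }),
        () => ({ peerId, triples: null as Triple[] | null }),
      ),
    ),
  );

  const merged = new Map<string, Triple>();
  const failedPeers: string[] = [];
  for (const outcome of outcomes) {
    if (outcome.triples === null) {
      failedPeers.push(outcome.peerId);
      continue;
    }
    for (const t of outcome.triples) {
      // JSON-encoding the tuple gives an unambiguous dedupe key.
      merged.set(JSON.stringify([t.subject, t.predicate, t.object]), t);
    }
  }
  return { triples: Array.from(merged.values()), failedPeers };
}
```

Single-peer mode is the same call with a one-element `peers` array.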

Devnet harness (scripts/devnet-test-rfc38-*.sh)

11 standalone end-to-end scenarios, all driven through the daemon HTTP API (no custom libraries). devnet-test-rfc38-all.sh runs the full suite end-to-end and prints a consolidated pass/fail summary. Covered:

| id | what it exercises |
| --- | --- |
| lu5-pub | LU-5 public CG regression (edge publishes plaintext to VM, no encryption) |
| lu5-cur | LU-5 curated CG edge publish (chain-key AEAD wrap + no-attribution VM publish) |
| lu7 | LU-7 SWMCatchupRequest (public anonymous + curated member-auth + outsider denial) |
| lu8 | LU-8 verify-batch + report-batch-rejection (member post-decrypt root recompute + gossip) |
| lu9 | LU-9 member-attestation mint+verify (roundtrip + 3 negative-path scenarios) |
| lu10 | LU-10 public-CG regression sweep (publish + anonymous catchup + verify-batch + attestation, all on a public CG) |
| e2e | end-to-end lifecycle (LU-5 → LU-7 → LU-8 → LU-9 composed in one user-visible scenario) |
| xcg | cross-CG isolation (member of CG-A cannot read CG-B; outsider catchup denied; curator can still decrypt its own CGs) |
| mm | multi-member CG (3 distinct member wallets; each verify-batches the same root; outsider cross-verifies all 3 attestations) |
| scale | scale probe (50 triples / 25 KAs in one curated batch; full verify + attestation roundtrip) |
| lj | late-joiner (member-from-curator + member-from-member-with-curator-offline; documented LU-6 cores-only gap as passing fail-soft assertion) |

Also adds a scripts/devnet.sh restart-node N op, which restarts a single node without wiping its state. The late-joiner scenario uses it to take the curator offline mid-test and bring it back.

Test plan

Unit:

  • packages/agent/test/verify-batch.test.ts — pure recompute helper unit tests
  • packages/agent/test/member-attestation.test.ts — mint+verify roundtrip + tamper detection + membership resolver paths
  • packages/agent/test/enumerate-cg-hosts.test.ts — dialable-peer enumeration

Devnet — all 11 scenarios PASS against a fresh 6-node devnet (4 cores + 2 edges, all wallets unique + funded, no on-chain identity for the edges):

[ok] lu5-pub  PASS
[ok] lu5-cur  PASS
[ok] lu7      PASS
[ok] lu8      PASS
[ok] lu9      PASS
[ok] lu10     PASS
[ok] e2e      PASS
[ok] xcg      PASS
[ok] mm       PASS
[ok] scale    PASS
[ok] lj       PASS

All 11 scenarios PASSED.

Run instructions

./scripts/devnet.sh start 6          # 4 cores + 2 edges, fresh wallets
./scripts/devnet-test-rfc38-all.sh   # ~10 min, 11 scenarios end-to-end

For UI manual testing — point Vite at any edge node:

DEVNET_UI_NODE=5 ./scripts/devnet.sh ui start
# http://localhost:5173/ui/ now proxies /api/* to node 5 (edge curator)

Deferred (Phase A sub-task, tracked for follow-up)

LU-6 substrate hosting on cores — cores do not yet subscribe to the curated-CG SWM gossip topic via the sharding-table assignment (RFC §5.1 + §5.1.1 pre-registration staging). Today's catchup model works when the curator OR any other current member is online; if every member is offline, a late joiner's catchup against cores returns 0 triples cleanly (no crash). devnet-test-rfc38-late-joiner.sh SCENARIO C asserts this fail-soft shape. Full LU-6 lands the encrypted SWM substrate (the SwmSenderKey two-layer Sender Keys construction already in packages/core/src/crypto/swm-sender-key.ts but not yet wired to the workspace-gossip topic) plus the TTL + byte-cap staging policies in §5.1.1. Path forward documented in docs/specs/SPEC_CG_HOSTING_MEMBERSHIP.md §7.1.1.

Made with Cursor

…late-joiner devnet harness

Closes the curated-CG verification & late-joiner surface that LU-5 (edge
publish) opened, plus end-to-end devnet validation for the whole Phase A
slice. See `docs/specs/SPEC_CG_HOSTING_MEMBERSHIP.md` §7.1.1 for the
implementation-status table and the documented LU-6 gap.

Source surfaces

  - LU-7 `POST /api/shared-memory/catchup` — caller-initiated
    SWMCatchupRequest. Single-peer mode or parallel fan-out across all
    connected peers. Public CGs accept anonymous catchup; curated CGs run
    `authorizePrivateSyncRequest` against the requester's signed envelope.
  - LU-8 `POST /api/shared-memory/{verify-batch,report-batch-rejection}` +
    `packages/agent/src/swm/verify-batch.ts` — member post-decrypt root
    recompute using V10's `computeFlatKCRootV10`/`computeFlatKCMerkleLeafCountV10`.
    Mismatch → structured `BatchRejection` record gossiped via `agent.share()`
    so other members can refetch from a different host.
  - LU-9 `POST /api/attestation/{mint,verify}` +
    `packages/agent/src/swm/member-attestation.ts` — member signs an
    envelope binding (chainId, kavAddress, contextGraphId, batchId,
    merkleRoot, plaintextLeafHash, attesterAddress, attestedAt) with
    keccak256(abi.encodePacked(...)) + EIP-191 secp256k1, matching the V10
    chain-side signature layout so outsiders can hand-verify. Verify route
    runs signature recovery, signer-matches-attester, optional candidateLeaf
    rehash, and an optional async membershipResolver chain hook.
  - `packages/agent/src/swm/enumerate-cg-hosts.ts` — distinct from
    `enumerate-cg-members`; returns dialable peer set for LU-7 catchup.
    Phase A returns all connected peers minus self; Phase B will refine to
    the sharding-table-eligible subset once shard count > 1.
  - `packages/cli/src/daemon/routes/assertion.ts` — small read surface
    additions that the new attestation flow leans on.

Devnet harness (`scripts/devnet-test-rfc38-*.sh`)

  - 11 standalone end-to-end scenarios, all driven through the daemon
    HTTP API (no custom libraries). `devnet-test-rfc38-all.sh` runs the
    full suite end-to-end and prints a consolidated pass/fail summary.
  - Covered: LU-5 (curated + public), LU-7, LU-8, LU-9, LU-10 (public-CG
    regression sweep), `e2e` (LU-5→LU-7→LU-8→LU-9 composed in one
    user-visible lifecycle), `cross-cg` (isolation: member of CG-A cannot
    decrypt CG-B; outsider catchup denied), `multi-member` (3 distinct
    member wallets cross-verify the same batch + cross-verify each
    other's attestations), `scale` (50 triples / 25 KAs single batch),
    `late-joiner` (member-from-curator + member-from-member with curator
    offline; plus a documented LU-6 cores-only gap as a passing fail-soft
    assertion).
  - `scripts/devnet.sh restart-node N` op surface (restart a single node
    without wiping state). The late-joiner scenario uses it to take the
    curator offline mid-test and bring it back.

Documentation

  - `docs/specs/SPEC_CG_HOSTING_MEMBERSHIP.md` §7.1.1 — implementation
    status table for Phase A: LU-5/7/8/9/10 landed, LU-6 deferred. The
    "deferred LU-6" subsection explains what still works on the current
    branch (member-from-curator and member-from-member catchup) vs what
    requires the substrate-subscription work (cores-only catchup when
    every member is offline).
  - `CHANGELOG.md` — Unreleased entry, scoped to OT-RFC-38 Phase A, with
    one bullet per LU and a single "Deferred" callout.

Run instructions

    ./scripts/devnet.sh start 6        # 4 cores + 2 edges, fresh wallets
    ./scripts/devnet-test-rfc38-all.sh # ~10 min, 11 scenarios

Tested

  - All 11 devnet scenarios PASS against a fresh 6-node devnet (4 cores
    + 2 edges, all wallets unique + funded, no on-chain identity for the
    edges). Per-scenario logs land under `.devnet/integration-runs/<ts>/`.

Co-authored-by: Cursor <cursoragent@cursor.com>
const swmGraphUri = contextGraphSharedMemoryUri(contextGraphId, subGraphName);
const dataGraphUri = `did:dkg:context-graph:${contextGraphId}`;
try {
const swmResult = await (agent as any).store.query(

🔴 Bug: when quads are omitted this reconstructs the candidate payload from the entire SWM/data graph, but expectedMerkleRoot is for a single KC/batch. As soon as a context graph contains more than one published batch, verify-batch will deterministically report root-mismatch for valid batches because unrelated triples are mixed in. Scope the local read to the requested batch/KC (or require callers to pass the exact batch quads) instead of querying every triple in the graph.

try {
const cgList = await (agent as any).listContextGraphs?.();
const match = (cgList ?? []).find((cg: any) => cg.id === contextGraphId);
onChainCgId = match?.onChainId ?? '0';

🔴 Bug: this silently signs attestations with contextGraphId = "0" whenever the local subscription metadata cannot resolve an on-chain id. That produces tokens bound to the wrong domain even though the KC already exists on-chain. Resolve the CG id from chain truth (getKCContextGraphId(BigInt(batchId)) / getContextGraphOnChainId) and fail if it cannot be determined instead of minting an attestation with a placeholder id.
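
A minimal sketch of the fail-closed behavior suggested above, assuming on-chain CG ids are decimal strings and that "0" is never a valid id (the diff uses it only as a placeholder). `requireOnChainCgId` is a hypothetical helper, not something in this PR; the caller would pass whatever candidates it has, for example the local subscription metadata first and the result of a chain lookup second.

```typescript
// Hypothetical helper: return the first usable on-chain CG id, or throw.
// Assumption: ids are non-zero decimal strings; "0" is only a placeholder.
function requireOnChainCgId(...candidates: Array<string | undefined | null>): string {
  for (const candidate of candidates) {
    if (candidate === undefined || candidate === null) continue;
    const isDecimal = /^[0-9]+$/.test(candidate);
    const isZero = /^0+$/.test(candidate);
    if (isDecimal && !isZero) return candidate;
  }
  // Refuse to mint rather than bind the attestation to a placeholder domain.
  throw new Error("cannot resolve on-chain context graph id; refusing to mint attestation");
}
```

The point of throwing is that a missing id becomes a visible mint failure the caller can retry, instead of a validly signed token bound to the wrong context graph.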

{ subject, predicate: `${NS}rejectedByPeer`, object: `"${record.rejectedBy.peerId ?? ''}"`, graph: '' },
{ subject, predicate: `${NS}rejectionReportedAt`, object: `"${record.reportedAt}"`, graph: '' },
...(record.batchId !== undefined
? [{ subject, predicate: `${NS}rejectedBatchId`, object: `"${record.batchId}"`, graph: '' }]

🔴 Bug: batchId comes from the HTTP body and is interpolated into an RDF literal without escaping. A value containing ", newlines, or RDF syntax will either break the SWM write or let callers smuggle malformed triples through this endpoint. Escape the literal with the existing RDF helper (or reject unsafe input) before passing it to agent.share().
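
For illustration, both options from the comment above can be sketched as follows. Neither function is the project's existing RDF helper; `escapeRdfLiteral`, `assertNumericBatchId`, and `batchIdLiteral` are hypothetical names, and the numeric check assumes batch ids are plain decimal strings. The escapes cover the characters that cannot appear raw inside a double-quoted N-Triples/N-Quads literal, plus tab.

```typescript
// Hypothetical helper: escape a string for use inside a double-quoted
// N-Triples / N-Quads literal.
function escapeRdfLiteral(value: string): string {
  return value
    .replace(/\\/g, "\\\\") // backslash first, so later escapes are not doubled
    .replace(/"/g, '\\"')
    .replace(/\n/g, "\\n")
    .replace(/\r/g, "\\r")
    .replace(/\t/g, "\\t");
}

// Hypothetical stricter alternative: reject anything that is not a decimal id.
// Assumption: batch ids are decimal strings.
function assertNumericBatchId(batchId: string): string {
  if (!/^[0-9]+$/.test(batchId)) {
    throw new Error(`invalid batchId: ${JSON.stringify(batchId)}`);
  }
  return batchId;
}

// Builds the quoted literal that would go in the `object` position.
function batchIdLiteral(batchId: string): string {
  return `"${escapeRdfLiteral(batchId)}"`;
}
```

Rejecting at the HTTP boundary and escaping at the write site are complementary: the first gives callers a clear 4xx, the second keeps the SWM write safe even if another code path supplies the value.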

input.verifyResult.actualRoot,
input.verifyResult.reason ?? 'unknown',
input.rejectedBy.agentAddress,
reportedAt,

🟡 Issue: reportedAt is part of the rejection digest, so every retry of the same rejection from the same member mints a new subject URI instead of deduping. That defeats the stated "hash-dedupe identical rejection reports" behavior and makes idempotent re-reporting impossible. Derive the digest from stable fields only (batch, roots, rejecter) and keep reportedAt as metadata outside the digest.
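
One way to get a stable key is sketched below. This is not the PR's digest: SHA-256 from Node's `crypto` module stands in for whatever hash the real code uses, and `RejectionKey` with its field names (`contextGraphId`, `expectedRoot`, and so on) is a hypothetical shape loosely modeled on the fields visible in the diff. The timestamp is deliberately absent from the hashed fields.

```typescript
import { createHash } from "node:crypto";

// Hypothetical shape: only fields that identify the rejection itself.
// reportedAt is intentionally not part of it.
interface RejectionKey {
  contextGraphId: string;
  batchId?: string;
  expectedRoot: string;
  actualRoot: string;
  reason: string;
  rejecterAddress: string;
}

function rejectionDigest(key: RejectionKey): string {
  const parts = [
    key.contextGraphId,
    key.batchId ?? "",
    key.expectedRoot.toLowerCase(), // hex casing must not change the key
    key.actualRoot.toLowerCase(),
    key.reason,
    key.rejecterAddress.toLowerCase(),
  ];
  // JSON-encoding the array keeps field boundaries unambiguous.
  return createHash("sha256").update(JSON.stringify(parts)).digest("hex");
}
```

A retry from the same member then maps to the same subject URI, so the write becomes an upsert of the reportedAt metadata rather than a new record.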

log " OT-RFC-38 INTEGRATION RUN SUMMARY"
log "================================================================"
note "OT-RFC-38 INTEGRATION RUN SUMMARY"
note "Run started: $(date -u -r "$START_TS" +'%Y-%m-%dT%H:%M:%SZ')"

🔴 Bug: date -r <epoch-seconds> is BSD/macOS syntax; on GNU/Linux -r expects a file path, so the summary step fails on the project's primary dev/CI environment. Use a portable epoch conversion (date -u -d "@${START_TS}" ... on GNU, or a small POSIX-compatible helper) before relying on this runner in Linux devnets.
