Commit d1347d7

test/scripts: harden rc.12 devnet test suite from comprehensive RC validation
Productionising the test-suite findings from a full rc.12 comprehensive devnet sweep so the same harness reliably guards rc.13+. All changes are in scripts/; no production code touched.

Fixes uncovered while testing release/rc.12:

v10-rc-validation.sh — migrate to rc.12 API shapes
* publish path: /api/publish (removed) → /api/shared-memory/write + /api/shared-memory/publish, with selection.rootEntities so re-runs against a CG that already has SWM content don't trip the rc.12 rootEntity-uniqueness rule
* private quads: rewritten to /api/update + privateMerkleRoot receipt (rc.12 stores private quads encrypted-at-rest; they are intentionally not served via /api/query, so the old "publisher sees private back" assertion is replaced with the storage-receipt check)
* CAS: send a real non-empty conditions array (rc.12 rejects empty)
* chat: { to, text } (was recipientPeerId/text)
* identity: /api/profile (removed) → /api/identity + /api/status
* exit non-zero on any FAIL so the orchestrator picks it up
* timestamp-suffixed entity URIs, sub-graph names, assertion names so every re-run is collision-free even on a long-lived devnet
* helper rewritten so the JSON-extraction pipeline is heredoc-safe (was: stdin shadowed by heredoc; expressions with semicolons silently SyntaxError'd and every parse returned EMPTY)

swm-soak-test.sh — python heredoc bug
* Final-summary block used `<<PYTHON` (unquoted), so bash expanded every backtick-quoted code reference in Python comments (`m = re.match(...)`, `s`, `by_cg_final[cg]`, etc.) as command substitutions, producing a flood of "command not found" lines. Switched to `<<'PYTHON'` + env-var pass-through.

devnet-test-rfc38-curator-offline-midbatch.sh — wrong endpoint
* wait_for_m1_onchain_id polled `GET /api/context-graph/<id>`, which is a 404 (no such route) — the test reliably false-failed on every run. Fixed to use `GET /api/context-graph/list` + client-side lookup, and made the timeout configurable via RFC38_M1_ONCHAIN_WAIT_S=60.
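The heredoc bug fixed in swm-soak-test.sh is easy to reproduce in isolation. The sketch below is not the script's actual summary block; the function names and the `SUMMARY_TAG` variable are made up for illustration, and `sh` stands in for the Python interpreter. It shows that an unquoted delimiter lets bash run backtick spans as command substitutions, while a quoted delimiter passes the body through verbatim, so any value the body needs has to arrive through the environment.

```bash
#!/usr/bin/env bash

# Unquoted delimiter: bash performs command substitution inside the body,
# so the backtick span below is executed as a command.
unquoted_body() {
  cat <<PYTHON
# result: `echo expanded`
PYTHON
}

# Quoted delimiter: the body reaches the reader unchanged.
quoted_body() {
  cat <<'PYTHON'
# result: `echo expanded`
PYTHON
}

# Env-var pass-through: with a quoted delimiter the parent shell no longer
# interpolates values, so they are handed to the child via its environment.
# SUMMARY_TAG is a hypothetical name; `sh` stands in for python3 here.
env_passthrough() {
  SUMMARY_TAG="$1" sh <<'CHILD'
echo "tag=$SUMMARY_TAG"
CHILD
}
```

In the unquoted case a backtick span that names no real command is what produces the "command not found" noise described above.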

devnet-test-rfc38-unclean-restart.sh — too-fast catchup + curl ARG_MAX
* Bumped WRITES_COUNT 20 → 200 and WRITE_PAYLOAD_BYTES 4096 → 16384, and dropped the partial-catchup poll from 1 s → 100 ms, so the test reliably observes M1 mid-batch at rc.12 catchup speeds (verified: M1_pre-kill = 159 / 200 on the reference devnet).
* api_call streams large bodies through stdin (`-d @-`) instead of argv; pre-fix the 3.2 MiB stress body tripped macOS ARG_MAX with "Argument list too long".

devnet.sh — `stop` is now actually idempotent
* After stopping by pidfile, sweep every devnet port (HARDHAT_PORT + API_PORT_BASE..+N + LIBP2P_PORT_BASE..+N) with `lsof` and SIGTERM/SIGKILL any process still LISTENing. Catches stale processes from prior rc.X devnets that were killed at the worker layer while the supervisor respawned them. Opt out with DEVNET_STOP_PORT_SWEEP=0 when running multiple devnets on one host.

Promoted from .rc12-test/ scratch dir to scripts/ for rc.13+:
* devnet-probe-hub-rotation.sh — PR OriginTrail#689 (chain hub rotation)
* devnet-probe-multi-rpc-failover.sh — PR OriginTrail#684 (multi-RPC failover)
* devnet-probe-libp2p-tunables.sh — PR OriginTrail#698 (libp2p tunables)
* devnet-probe-cg-phonebook.sh — PR OriginTrail#700 (agents CG)
* devnet-probe-ack-rejection-reasons.sh — PR OriginTrail#711 (ACK gate diagnostics)
* devnet-test-node-ui-smoke.sh — Vite dev-server smoke
* devnet-comprehensive.sh — orchestrator wiring all of the above + v10-rc-validation + _devnet-full-sweep + rc11 recovery + rfc38-all + soak suite; bash-3.2 safe (no `declare -A`); SKIP_SOAK / SOAK_ONLY / SKIP_PROBES / SKIP_RFC38_EXTRAS / SKIP_UI / FAIL_FAST knobs.
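The ARG_MAX fix in `api_call` comes down to keeping the request body out of curl's argument vector. A minimal sketch, assuming a simplified helper: `post_json` and the `CURL_BIN` override are hypothetical names, not the harness's real `api_call` signature. `printf` is a shell builtin, so the body never passes through an exec call, and curl reads it from stdin via `-d @-`.

```bash
#!/usr/bin/env bash

# CURL_BIN is a hypothetical override so the helper can be exercised offline.
CURL_BIN="${CURL_BIN:-curl}"

# post_json URL BODY
# Streams BODY to curl on stdin. Passing a multi-megabyte BODY as
# `-d "$BODY"` would place it in argv, where it can exceed ARG_MAX and
# fail with "Argument list too long".
post_json() {
  local url="$1" body="$2"
  printf '%s' "$body" \
    | "$CURL_BIN" -sS -X POST -H 'Content-Type: application/json' -d @- "$url"
}
```

Note that curl's `-d @file` form strips newlines from the data it reads; `--data-binary @-` is the variant to reach for if the body must arrive byte-for-byte.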

Verified on the rc.12 reference devnet:
- v10-rc-validation: 34 PASS / 0 FAIL / 2 WARN (exit 0)
- rfc38-unclean-restart: PASS (mid-batch window: 159/200)
- rfc38-curator-offline-midbatch: PASS
- 5 promoted probes (new layout): all PASS
- swm-soak (2 cycles, quick): 100.00% on both CGs, no errors
- devnet-comprehensive orchestrator: pre-flight + suite registration OK

Co-authored-by: Cursor <cursoragent@cursor.com>
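The port sweep that `devnet.sh stop` now performs can be pictured roughly as follows. This is a sketch under stated assumptions, not the code in devnet.sh: the function names are invented, the node count is passed in by the caller, and the libp2p base port of 10001 is a placeholder (8545 and 9201 are the defaults the orchestrator uses for Hardhat and the first API port).

```bash
#!/usr/bin/env bash

# Print every port a devnet of $1 nodes would occupy, one per line.
# The libp2p base (10001) is an assumed placeholder, not taken from devnet.sh.
devnet_ports() {
  local nodes="$1" i=0
  echo "${HARDHAT_PORT:-8545}"
  while [ "$i" -lt "$nodes" ]; do
    echo $(( ${API_PORT_BASE:-9201} + i ))
    echo $(( ${LIBP2P_PORT_BASE:-10001} + i ))
    i=$((i + 1))
  done
}

# SIGTERM every process still listening on a devnet port, then SIGKILL
# any that survive a one-second grace period. Honors the opt-out knob.
sweep_ports() {
  [ "${DEVNET_STOP_PORT_SWEEP:-1}" = "0" ] && return 0
  local port pid
  for port in $(devnet_ports "$1"); do
    # -t prints bare PIDs; -sTCP:LISTEN limits the match to listening sockets.
    for pid in $(lsof -nP -t -iTCP:"$port" -sTCP:LISTEN 2>/dev/null); do
      kill -TERM "$pid" 2>/dev/null || continue
      sleep 1
      kill -0 "$pid" 2>/dev/null && kill -KILL "$pid" 2>/dev/null
    done
  done
  return 0
}
```

Because the sweep matches on port rather than pidfile, it would also kill a second devnet sharing the host, which is why the opt-out exists.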
1 parent f75b171 commit d1347d7

12 files changed

Lines changed: 1574 additions & 172 deletions

scripts/devnet-comprehensive.sh

Lines changed: 331 additions & 0 deletions
@@ -0,0 +1,331 @@
#!/usr/bin/env bash
#
# Comprehensive devnet test orchestrator.
#
# Runs (in order, on top of an already-started 6-node devnet):
#   1. v10-rc-validation.sh     (15-section API smoke)
#   2. _devnet-full-sweep.sh    (baseline harnesses)
#   3. rc11 recovery tests      (promote-crash + shutdown-mid-publish)
#   4. rfc38-all aggregator     (lu5/lu5-pub/lu7/lu8/lu9/lu10/e2e/xcg/mm/scale/lj)
#   5. rc.12 feature probes     (hub-rotation, multi-RPC failover, libp2p tunables,
#                                CG-phonebook agent discovery, structured ACK rejection reasons)
#   6. node-ui smoke
#   7. soak suite               (libp2p / SWM / RS)
#
# Bash 3.2 compatible (uses parallel indexed arrays instead of
# `declare -A`).
#
# IMPORTANT: do NOT edit this file while an instance is running — bash 3.2
# re-reads the script as it executes, and a mid-run byte shift will derail
# the parser (we've observed double-emitted PASS/FAIL log lines under that
# race). Wait for the run to finish, or duplicate the script before
# editing.
#
# Env knobs:
#   RESULTS_DIR            override the output directory
#                          (default: $REPO_ROOT/.devnet/comprehensive-results/<ts>)
#   SKIP_SOAK=1            skip the long soak suite
#   SOAK_ONLY=1            run only the soak suite
#   SKIP_PROBES=1          skip the rc.12-specific probes
#   SKIP_RFC38_EXTRAS=1    skip the rfc38-all suite
#   SKIP_UI=1              skip the node-ui smoke
#   FAIL_FAST=1            stop on first FAIL
#   SOAK_RS_SECONDS        length of the devnet-soak-rs run (default 1800)
#   SOAK_LIBP2P_CYCLES     libp2p-soak cycle count (default 5; each cycle ~60s)
#   SOAK_SWM_CYCLES        swm-soak cycle count (default 10)

set -u

REPO_ROOT="$(cd "$(dirname "$0")/.." && pwd)"
TS=$(date -u +'%Y%m%dT%H%M%SZ')
RESULTS="${RESULTS_DIR:-$REPO_ROOT/.devnet/comprehensive-results/$TS}"
mkdir -p "$RESULTS"
# `latest` symlink for convenience. Use -sfn so re-runs in the same dir
# atomically replace any prior link without leaving a "latest/latest" trail.
# Link to the basename of $RESULTS (equal to $TS by default) so the link
# still resolves when RESULTS_DIR overrides the directory name.
ln -sfn "$(basename "$RESULTS")" "$(dirname "$RESULTS")/latest" 2>/dev/null || true

log() { echo "[orch $(date -u +'%H:%M:%S')] $*" | tee -a "$RESULTS/orchestrator.log"; }

# ── Pre-flight ───────────────────────────────────────────────────
log "Pre-flight: devnet status"
HARDHAT_PORT="${HARDHAT_PORT:-8545}"
API_PORT_BASE="${API_PORT_BASE:-9201}"
DEVNET_DIR="${DEVNET_DIR:-$REPO_ROOT/.devnet}"

if ! curl -sf "http://127.0.0.1:$HARDHAT_PORT" -X POST -H "Content-Type: application/json" \
    -d '{"jsonrpc":"2.0","method":"eth_chainId","params":[],"id":1}' > /dev/null 2>&1; then
  log "FATAL: Hardhat not responding on :$HARDHAT_PORT — start the devnet first ($REPO_ROOT/scripts/devnet.sh start)"
  exit 2
fi
AUTH=$(grep -v '^#' "$DEVNET_DIR/node1/auth.token" 2>/dev/null | head -1 || echo "")
if [ -z "$AUTH" ]; then
  log "FATAL: no auth token at $DEVNET_DIR/node1/auth.token"
  exit 2
fi
if ! curl -sf -H "Authorization: Bearer $AUTH" "http://127.0.0.1:$API_PORT_BASE/api/status" > /dev/null 2>&1; then
  log "FATAL: node 1 not responding on :$API_PORT_BASE"
  exit 2
fi
export DKG_AUTH="$AUTH"
log "Devnet is up. Results dir: $RESULTS"

# ── Suite registry (parallel arrays; bash 3.2 compatible) ───────
SUITE_IDS=()
SUITE_CMDS=()
SUITE_GROUPS=()
SUITE_RESULTS=()
SUITE_LOGS=()
SUITE_ELAPSEDS=()

register() {
  SUITE_IDS+=("$1")
  SUITE_GROUPS+=("$2")
  SUITE_CMDS+=("$3")
  SUITE_RESULTS+=("PENDING")
  SUITE_LOGS+=("")
  SUITE_ELAPSEDS+=("0")
}

# Group: smoke
register "v10-rc-validation" "smoke" "$REPO_ROOT/scripts/v10-rc-validation.sh"

# Group: sweep
register "devnet-full-sweep" "sweep" "$REPO_ROOT/scripts/_devnet-full-sweep.sh"

# Group: rc11 recovery
register "rc11-promote-crash" "rc11-recovery" "$REPO_ROOT/scripts/devnet-test-rc11-promote-crash-recovery.sh"
register "rc11-shutdown-mid" "rc11-recovery" "$REPO_ROOT/scripts/devnet-test-rc11-shutdown-mid-publish.sh"

# Group: rfc38 extras (the all-aggregator runs lu5/lu5-pub/lu7/lu8/lu9/lu10/e2e/xcg/mm/scale/lj)
if [ "${SKIP_RFC38_EXTRAS:-0}" != "1" ]; then
  register "rfc38-all" "rfc38-extras" "$REPO_ROOT/scripts/devnet-test-rfc38-all.sh"
fi

# Group: rc.12 feature probes
if [ "${SKIP_PROBES:-0}" != "1" ]; then
  for p in hub-rotation multi-rpc-failover libp2p-tunables cg-phonebook ack-rejection-reasons; do
    register "probe-${p}" "rc12-probes" "$REPO_ROOT/scripts/devnet-probe-${p}.sh"
  done
fi

# Group: node-ui smoke
if [ "${SKIP_UI:-0}" != "1" ]; then
  register "node-ui-smoke" "node-ui" "$REPO_ROOT/scripts/devnet-test-node-ui-smoke.sh"
fi

# Group: soak (LONG)
if [ "${SKIP_SOAK:-0}" != "1" ]; then
  SOAK_RECIPIENT_PEER=$(curl -sf -H "Authorization: Bearer $AUTH" \
    "http://127.0.0.1:$((API_PORT_BASE + 1))/api/status" 2>/dev/null \
    | python3 -c "import sys,json;print(json.load(sys.stdin).get('peerId',''))" 2>/dev/null || echo "")
  SOAK_LIBP2P_CYCLES="${SOAK_LIBP2P_CYCLES:-5}"
  SOAK_SWM_CYCLES="${SOAK_SWM_CYCLES:-10}"
  SOAK_RS_SECONDS="${SOAK_RS_SECONDS:-1800}"

  register "libp2p-soak-short" "soak" \
    "env DKG_HOME=$DEVNET_DIR/node1 DKG_AUTH=$AUTH API=http://127.0.0.1:$API_PORT_BASE RECIPIENT_PEER_ID=$SOAK_RECIPIENT_PEER RECIPIENT=devnet-node-2 SENDER_TAG=rc12 TOTAL_CYCLES=$SOAK_LIBP2P_CYCLES INTERVAL_S=60 $REPO_ROOT/scripts/libp2p-soak-test.sh"

  # SWM soak — solo mode (PEERS_EXPECTED unset). Confirms write-tag
  # rate on local SWM survives N × 30s cycles; cross-peer delivery
  # is already exercised by rfc38-multi-member + rfc38-cross-cg.
  # PEERS_EXPECTED is a comma-separated TAG list (not a count); set
  # only if running concurrent operators sharing a SOAK_COHORT_ID.
  register "swm-soak-short" "soak" \
    "env DKG_HOME=$DEVNET_DIR/node1 DKG_AUTH=$AUTH API=http://127.0.0.1:$API_PORT_BASE SWM_CG_PUBLIC=devnet-test SWM_CG_CURATED=devnet-isolation SWM_INTERVAL_S=30 SWM_TOTAL_CYCLES=$SOAK_SWM_CYCLES SENDER_TAG=rc12 $REPO_ROOT/scripts/swm-soak-test.sh"

  register "devnet-soak-rs" "soak" \
    "$REPO_ROOT/scripts/devnet-soak-rs.sh 1 $SOAK_RS_SECONDS"
fi

# Apply SOAK_ONLY filter
if [ "${SOAK_ONLY:-0}" = "1" ]; then
  NEW_IDS=()
  NEW_CMDS=()
  NEW_GROUPS=()
  NEW_RESULTS=()
  NEW_LOGS=()
  NEW_ELAPSEDS=()
  i=0
  while [ "$i" -lt "${#SUITE_IDS[@]}" ]; do
    if [ "${SUITE_GROUPS[$i]}" = "soak" ]; then
      NEW_IDS+=("${SUITE_IDS[$i]}")
      NEW_CMDS+=("${SUITE_CMDS[$i]}")
      NEW_GROUPS+=("${SUITE_GROUPS[$i]}")
      NEW_RESULTS+=("PENDING")
      NEW_LOGS+=("")
      NEW_ELAPSEDS+=("0")
    fi
    i=$((i + 1))
  done
  SUITE_IDS=("${NEW_IDS[@]}")
  SUITE_CMDS=("${NEW_CMDS[@]}")
  SUITE_GROUPS=("${NEW_GROUPS[@]}")
  SUITE_RESULTS=("${NEW_RESULTS[@]}")
  SUITE_LOGS=("${NEW_LOGS[@]}")
  SUITE_ELAPSEDS=("${NEW_ELAPSEDS[@]}")
fi

log "Registered ${#SUITE_IDS[@]} suite(s):"
i=0
while [ "$i" -lt "${#SUITE_IDS[@]}" ]; do
  log " - ${SUITE_IDS[$i]} [${SUITE_GROUPS[$i]}]"
  i=$((i + 1))
done

# ── Run loop ────────────────────────────────────────────────────
START=$(date +%s)
TOTAL_PASS=0
TOTAL_FAIL=0
TOTAL_MISSING=0

i=0
while [ "$i" -lt "${#SUITE_IDS[@]}" ]; do
  id="${SUITE_IDS[$i]}"
  cmd="${SUITE_CMDS[$i]}"
  group="${SUITE_GROUPS[$i]}"
  logfile="$RESULTS/${id}.log"
  SUITE_LOGS[$i]="$logfile"

  # Extract the script path: the last *.sh token in the command (env-var
  # assignments may precede it, arguments may follow). If no token ends
  # in .sh, fall back to the last token.
  bare_path=""
  for tok in $cmd; do
    case "$tok" in
      *.sh) bare_path="$tok" ;;
    esac
  done
  [ -z "$bare_path" ] && bare_path=$(echo "$cmd" | awk '{print $NF}')

  if [ ! -e "$bare_path" ]; then
    log "MISSING: $id ($bare_path)"
    SUITE_RESULTS[$i]="MISSING"
    TOTAL_MISSING=$((TOTAL_MISSING + 1))
    i=$((i + 1))
    if [ "${FAIL_FAST:-0}" = "1" ]; then
      log "FAIL_FAST=1 — aborting"
      break
    fi
    continue
  fi

  log "============================================================"
  log "RUN $id [$group]"
  log "============================================================"
  suite_start=$(date +%s)
  # Run in a subshell so the suite's cwd change cannot leak into the loop;
  # stdout and stderr both go to the per-suite log.
  ( cd "$REPO_ROOT" && bash -c "$cmd" ) > "$logfile" 2>&1
  ec=$?
  suite_end=$(date +%s)
  elapsed=$((suite_end - suite_start))
  SUITE_ELAPSEDS[$i]="$elapsed"

  if [ "$ec" -eq 0 ]; then
    SUITE_RESULTS[$i]="PASS"
    TOTAL_PASS=$((TOTAL_PASS + 1))
    log "PASS $id (${elapsed}s)"
  else
    SUITE_RESULTS[$i]="FAIL:$ec"
    TOTAL_FAIL=$((TOTAL_FAIL + 1))
    log "FAIL $id (exit=$ec, ${elapsed}s)"
    log " last 12 lines of $logfile:"
    tail -n 12 "$logfile" 2>/dev/null | sed 's/^/ /' | tee -a "$RESULTS/orchestrator.log"
    if [ "${FAIL_FAST:-0}" = "1" ]; then
      log "FAIL_FAST=1 — aborting"
      break
    fi
  fi
  i=$((i + 1))
done

END=$(date +%s)
WALL=$((END - START))

# ── Reports ─────────────────────────────────────────────────────
log ""
log "============================================================"
log "DONE — ${WALL}s wall (~$((WALL/60))m)"
log "PASS=$TOTAL_PASS FAIL=$TOTAL_FAIL MISSING=$TOTAL_MISSING TOTAL=${#SUITE_IDS[@]}"
log "============================================================"

# Markdown report
# `date -r <epoch>` is the BSD/macOS form; GNU date treats -r as a reference
# file, so fall back to `-d @<epoch>` there.
MD="$RESULTS/REPORT.md"
{
  echo "# Comprehensive devnet test report"
  echo
  echo "- **Started**: $(date -u -r "$START" +'%Y-%m-%dT%H:%M:%SZ' 2>/dev/null || date -u -d "@$START" +'%Y-%m-%dT%H:%M:%SZ')"
  echo "- **Ended**: $(date -u -r "$END" +'%Y-%m-%dT%H:%M:%SZ' 2>/dev/null || date -u -d "@$END" +'%Y-%m-%dT%H:%M:%SZ')"
  echo "- **Wall**: ${WALL}s (~$((WALL/60))m)"
  echo "- **Branch**: $(cd "$REPO_ROOT" && git rev-parse --abbrev-ref HEAD) @ $(cd "$REPO_ROOT" && git rev-parse --short HEAD)"
  echo "- **Results dir**: \`$RESULTS\`"
  echo
  echo "## Summary"
  echo
  echo "| | count |"
  echo "|---|---|"
  echo "| PASS | $TOTAL_PASS |"
  echo "| FAIL | $TOTAL_FAIL |"
  echo "| MISSING | $TOTAL_MISSING |"
  echo "| Total registered | ${#SUITE_IDS[@]} |"
  echo
  echo "## Suites"
  echo
  echo "| id | group | result | elapsed | log |"
  echo "|---|---|---|---:|---|"
  i=0
  while [ "$i" -lt "${#SUITE_IDS[@]}" ]; do
    logf=$(basename "${SUITE_LOGS[$i]}")
    echo "| \`${SUITE_IDS[$i]}\` | ${SUITE_GROUPS[$i]} | ${SUITE_RESULTS[$i]} | ${SUITE_ELAPSEDS[$i]}s | \`$logf\` |"
    i=$((i + 1))
  done
  echo
  if [ "$TOTAL_FAIL" -gt 0 ]; then
    echo "## Failures — last 25 lines of each failing log"
    echo
    i=0
    while [ "$i" -lt "${#SUITE_IDS[@]}" ]; do
      case "${SUITE_RESULTS[$i]}" in
        FAIL:*)
          echo "### ${SUITE_IDS[$i]}"
          echo
          echo '```'
          tail -n 25 "${SUITE_LOGS[$i]}" 2>/dev/null || echo "(no log)"
          echo '```'
          echo
          ;;
      esac
      i=$((i + 1))
    done
  fi
} > "$MD"

# JSON report
# `date -r <epoch>` is the BSD/macOS form; GNU date treats -r as a reference
# file, so fall back to `-d @<epoch>` there.
JSON="$RESULTS/REPORT.json"
{
  echo "{"
  echo "  \"startedAt\": \"$(date -u -r "$START" +'%Y-%m-%dT%H:%M:%SZ' 2>/dev/null || date -u -d "@$START" +'%Y-%m-%dT%H:%M:%SZ')\","
  echo "  \"endedAt\": \"$(date -u -r "$END" +'%Y-%m-%dT%H:%M:%SZ' 2>/dev/null || date -u -d "@$END" +'%Y-%m-%dT%H:%M:%SZ')\","
  echo "  \"wallSeconds\": $WALL,"
  echo "  \"branch\": \"$(cd "$REPO_ROOT" && git rev-parse --abbrev-ref HEAD)\","
  echo "  \"commit\": \"$(cd "$REPO_ROOT" && git rev-parse HEAD)\","
  echo "  \"totals\": { \"pass\": $TOTAL_PASS, \"fail\": $TOTAL_FAIL, \"missing\": $TOTAL_MISSING, \"registered\": ${#SUITE_IDS[@]} },"
  echo "  \"suites\": ["
  # Emit a comma before every element except the first, so the array
  # never ends with a trailing comma.
  first=1
  i=0
  while [ "$i" -lt "${#SUITE_IDS[@]}" ]; do
    [ "$first" -eq 0 ] && echo ","
    first=0
    printf '    { "id": "%s", "group": "%s", "result": "%s", "elapsedSeconds": %s, "log": "%s" }' \
      "${SUITE_IDS[$i]}" "${SUITE_GROUPS[$i]}" "${SUITE_RESULTS[$i]}" "${SUITE_ELAPSEDS[$i]}" \
      "$(basename "${SUITE_LOGS[$i]}")"
    i=$((i + 1))
  done
  echo
  echo "  ]"
  echo "}"
} > "$JSON"

log "Report: $MD"
log "JSON: $JSON"

if [ "$TOTAL_FAIL" -gt 0 ]; then
  exit 1
fi
exit 0
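
The registry pattern the orchestrator relies on can be shown in isolation. A minimal sketch, assuming hypothetical helper names (`reg`, `index_of`, `set_result`, `get_result`) that are not part of the script: two indexed arrays kept in lockstep stand in for the associative array that bash 3.2 lacks, at the cost of a linear scan per lookup.

```bash
#!/usr/bin/env bash

IDS=()
RESULTS=()

# Append a new id; both arrays grow together, so index i always
# refers to the same suite in each of them.
reg() {
  IDS+=("$1")
  RESULTS+=("PENDING")
}

# Print the index of id $1, or return 1 if it was never registered.
index_of() {
  local i=0
  while [ "$i" -lt "${#IDS[@]}" ]; do
    if [ "${IDS[$i]}" = "$1" ]; then
      echo "$i"
      return 0
    fi
    i=$((i + 1))
  done
  return 1
}

# Overwrite the result stored for id $1 with $2.
set_result() {
  local idx
  idx=$(index_of "$1") || return 1
  RESULTS[$idx]="$2"
}

# Print the result stored for id $1.
get_result() {
  local idx
  idx=$(index_of "$1") || return 1
  echo "${RESULTS[$idx]}"
}
```

The orchestrator avoids even the lookup by walking every array with one shared index, which works because it only ever iterates in registration order.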
