fix(deps): update adder to fix silent reconnect #54
Conversation
Updates blinklabs-io/adder to commit 460d03e, which preserves event channels during auto-reconnect, preventing silent block delivery stalls.

Fixes: blinklabs-io/adder#611

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
📝 Walkthrough

Bumped the adder dependency; added a Helm value and ConfigMap conditional for ntcQueryTimeout; made NodeQueryClient accept a per-instance queryTimeout and threaded it through initialization; propagated block_hash through DB/store row scanning; and updated nonce evolution from BLAKE2b to XOR across nonce logic, tests, and epoch handling.

Changes
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Config as "Helm / Config"
    participant Main as "main.go"
    participant NtC as "NodeQueryClient"
    participant Store as "Store / DB"
    participant Nonce as "Nonce logic"
    Config->>Main: provide ntcQueryTimeout
    Main->>NtC: NewNodeQueryClient(host, magic, ntcQueryTimeout)
    NtC->>NtC: use queryTimeout for node queries
    Main->>Store: StreamBlockNonces() (select epoch, slot, nonce_value, block_hash)
    Store->>Main: rows (stream of epoch, slot, nonceValue, blockHash)
    Main->>Nonce: evolveNonce(currentEta, nonceValue) (xorBytes)
    Nonce->>Nonce: track lastBlockHash -> prevHashNonce
    Note right of Nonce: On epoch boundary<br/>epochNonce = etaC XOR prevHashNonce
```
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
feat(config): make NtC query timeout configurable

Add leaderlog.ntcQueryTimeout config option (Go duration string). Defaults to 10m if not set. Configurable via helm values.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
fix(nonce): use XOR for nonce evolution and epoch transitions per Cardano spec

The Cardano Nonce semigroup defines: Nonce a <> Nonce b = Nonce (xor a b)

The code was incorrectly using BLAKE2b-256(a || b) instead of XOR for:
1. Per-block nonce evolution (evolveNonce)
2. Epoch nonce transition (TICKN rule)

Additionally, the epoch transition was using the previous epoch nonce as the second operand instead of η_ph (prev block hash nonce from TICKN state).

Correct formula per cardano-ledger TICKN rule: η(new) = η_c ⊕ η_ph, where η_c = candidate nonce, η_ph = last block hash of prior epoch boundary.

Changes:
- evolveNonce: BLAKE2b-256(a||b) → XOR(a,b)
- Epoch transition: hash(etaC||eta0) → XOR(etaC, prevHashNonce)
- StreamBlockNonces now returns block_hash for η_ph tracking
- Updated both SQLite and PostgreSQL store implementations
- Updated all tests to verify XOR behavior

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
nonce.go (1)
334-344: ⚠️ Potential issue | 🟠 Major

Post-loop final nonce has the same silent `hex.DecodeString` error discard. Same issue as line 301. If the last block hash in the DB is corrupted, this silently produces a wrong nonce rather than erroring.
🤖 Fix all issues with AI agents
In `@nonce.go`:
- Around line 299-302: The code silently ignores errors from hex.DecodeString
when decoding lastBlockHash before calling xorBytes, which can yield a nil/short
prevHashNonce and cause panics or wrong nonces; update each occurrence (the
blocks setting prevHashNonce before calling xorBytes at the xorBytes(etaC,
prevHashNonce) call sites and inside BackfillNonces) to check the error returned
by hex.DecodeString(lastBlockHash), handle it (return or propagate an error) and
only call xorBytes when prevHashNonce was successfully decoded; ensure any error
includes context (e.g., lastBlockHash value or a descriptive message) so callers
can handle corrupted/invalid hex data.
🧹 Nitpick comments (2)
nonce.go (1)
80-88: `xorBytes` will panic if either input is shorter than 32 bytes.

If a caller passes a slice shorter than 32 (e.g., from a failed `hex.DecodeString`), this will panic with an index-out-of-range. A defensive length check would prevent a hard crash.

🛡️ Proposed fix

```diff
 func xorBytes(a, b []byte) []byte {
+	if len(a) < 32 || len(b) < 32 {
+		panic(fmt.Sprintf("xorBytes: expected 32-byte inputs, got %d and %d", len(a), len(b)))
+	}
 	result := make([]byte, 32)
 	for i := 0; i < 32; i++ {
 		result[i] = a[i] ^ b[i]
 	}
 	return result
 }
```

epoch612_integration_test.go (1)
75-84: Same silent `hex.DecodeString` error discard as in nonce.go.

In test code this is lower risk, but if a block hash in the DB is corrupted, this would silently produce a wrong nonce and a confusing test failure rather than a clear error message. Consider at least `t.Fatalf` on decode error for debuggability.

Also applies to: 115-118
```go
eta0 = xorBytes(etaC, prevHashNonce)
if lastBlockHash != "" {
	prevHashNonce, _ = hex.DecodeString(lastBlockHash)
}
```
Silent discard of hex.DecodeString error could cause a downstream panic in xorBytes.
If lastBlockHash contains invalid hex (e.g., corrupted DB data or odd-length string), hex.DecodeString returns a short/nil slice and the discarded error hides the root cause. The next call to xorBytes(etaC, prevHashNonce) would then panic or produce a wrong nonce.
The same pattern appears at lines 339–341 and in BackfillNonces at lines 394–396.
🛡️ Proposed fix (apply to all three occurrences)

```diff
 if lastBlockHash != "" {
-	prevHashNonce, _ = hex.DecodeString(lastBlockHash)
+	decoded, decErr := hex.DecodeString(lastBlockHash)
+	if decErr != nil || len(decoded) != 32 {
+		return nil, fmt.Errorf("invalid block hash %q for η_ph: %w", lastBlockHash, decErr)
+	}
+	prevHashNonce = decoded
 }
```
docs: fix nonce evolution description (XOR not BLAKE2b), 60% stability window

- README: nonce evolution uses XOR per Cardano Nonce semigroup
- README: epoch transition uses TICKN rule (η_c XOR η_ph)
- CLAUDE.md: fix stability window references from 70% to 60%
- CLAUDE.md: update nonce evolution description to XOR
- Update adder dependency version reference

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
README.md (1)
245-245: ⚠️ Potential issue | 🟡 Minor

Stability window math is internally inconsistent. This line says "60% into the current epoch" but then equates it to 4k/f = 172,800 slots, which is 40% of 432,000. Elsewhere you cite 259,200 slots for 60%. Please pick one and make the math consistent across docs.

✏️ Suggested edit (if 60% is correct)

```diff
-The next epoch's nonce becomes available after the stability window — 60% into the current epoch (Conway era, `4k/f` = 172,800 slots). duckBot automatically triggers leader schedule calculation at this point.
+The next epoch's nonce becomes available after the stability window — 60% into the current epoch (Conway era, 259,200 slots on mainnet). duckBot automatically triggers leader schedule calculation at this point.
```
🤖 Fix all issues with AI agents
In `@README.md`:
- Line 13: Update the "Epoch Nonces" blurb to correct the backfill timing:
replace the phrase "~400 epochs in under 2 minutes" with a figure that aligns
with the later README statement (e.g., reflect the measured full Shelley-to-tip
sync of ~43 minutes or remove the numeric claim), and ensure the revised
sentence mentions the same sync context as "full Shelley-to-tip" to avoid
inconsistency with the rest of the README.
```diff
 **Leader Schedule** — Pure Go CPRAOS implementation checking every slot per epoch against your VRF key. Calculates next epoch schedule automatically at the stability window (60% into epoch). On-demand via `/leaderlog`.

-**Epoch Nonces** — In full mode, streams every block from Shelley genesis extracting VRF outputs per era, evolving the nonce via BLAKE2b-256, and freezing at the stability window. Backfills ~400 epochs in under 2 minutes.
+**Epoch Nonces** — In full mode, streams every block from Shelley genesis extracting VRF outputs per era, evolving the nonce via XOR (Cardano Nonce semigroup), and freezing at the stability window. Backfills ~400 epochs in under 2 minutes.
```
Fix the backfill timing claim to match actual sync performance.
Line 13 says “~400 epochs in under 2 minutes,” but later in this README the full Shelley-to-tip sync is ~43 minutes. Please align these statements to avoid misleading expectations.
✏️ Suggested edit

```diff
-**Epoch Nonces** — In full mode, streams every block from Shelley genesis extracting VRF outputs per era, evolving the nonce via XOR (Cardano Nonce semigroup), and freezing at the stability window. Backfills ~400 epochs in under 2 minutes.
+**Epoch Nonces** — In full mode, streams every block from Shelley genesis extracting VRF outputs per era, evolving the nonce via XOR (Cardano Nonce semigroup), and freezing at the stability window. Full Shelley‑to‑tip backfill completes in ~43 minutes.
```
Squashed commits:

* fix(deps): update adder to fix silent reconnect channel orphaning
* feat(config): make NtC query timeout configurable
* fix(nonce): use XOR for nonce evolution and epoch transitions per Cardano spec
* docs: fix nonce evolution description (XOR not BLAKE2b), 60% stability window

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Summary

- Updates blinklabs-io/adder to commit 460d03e (PR blinklabs-io/adder#611: fix(chainsync): preserve event channels during auto-reconnect)
- Fixes the bug where the output_embedded callback becomes disconnected from the new pipeline's event channel

Context

The bug caused incomplete block data in the nonce tracking DB, leading to wrong epoch nonce computation and incorrect leader schedules. Epoch 612 nonce was 52f585... instead of the correct 8fcd93....

Test plan

- go build passes

🤖 Generated with Claude Code