
docs(protocol): proposer liveness design (dense blocks, ETH bond, fast takeover)#21757

Open
dantaik wants to merge 6 commits into main from claude/upbeat-lamport-9s9wM


@dantaik dantaik commented Jun 5, 2026

Member

Summary

Design doc for closing the silent-no-show gap in Taiko's proposer permissioning, so the chain can credibly aim for "Ethereum validators only" (Q1) while removing the subjective Blacklist overseer role (Q2).

The four components:

  • Dense-derivation rule — every L2 second must be filled by a block. Makes "missing" structurally detectable; pairs with an RLE empty-block manifest to keep DA cost flat at 1s block time.
  • LivenessSlasher — new URC ISlasher peer to PreconfSlasherL1 and LookaheadSlasher, routed via UnifiedSlasher. Bond denominated in ETH, shared with existing safety/lookahead slashers in the same URC collateral pool. No new TAIKO coupling.
  • GapSlash fault — objective on-chain proof of an L2 timestamp gap inside an attributed window. Commitment is derived from the on-chain lookahead rather than signed per-window, so a silent operator cannot escape accountability by withholding evidence. EIP-4788 beacon-root oracle excises any L1 missed-slot intervals (reuses the existing pattern from PreconfSlasherL1.sol:88-92).
  • Fast takeover in Inbox.propose — two-stage trigger on head staleness (TAKEOVER_DELAY = 24s → next assignee may step in; TAKEOVER_OPEN_DELAY = 48s → fully permissionless). The takeover transaction may optionally carry GapEvidence for an atomic URC slash on handoff, eliminating any off-chain challenger race. Shrinks the existing 25.6-hour permissionlessInclusionMultiplier backstop into a sub-minute liveness mechanism.
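
The two-stage trigger in the last bullet can be sketched as a small classifier. This is a minimal illustration, not the `Inbox.propose` implementation: the library name, the `Stage` enum, the function names, and the exact boundary comparisons (`>` versus `>=`) are assumptions; only the two delay values and the `stale`/`dead` stage names come from this PR.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.24;

/// @notice Illustrative model of the two-stage takeover trigger (hypothetical names).
library TakeoverStageSketch {
    uint256 internal constant TAKEOVER_DELAY = 24; // seconds, from the PR summary
    uint256 internal constant TAKEOVER_OPEN_DELAY = 48; // seconds, from the PR summary

    enum Stage {
        Live, // only the current assignee may propose
        Stale, // the next assignee may step in
        Dead // fully permissionless
    }

    /// @dev Classifies head staleness. Boundary handling is an assumption.
    function stageOf(uint256 lastProposalTs, uint256 nowTs) internal pure returns (Stage) {
        // A head timestamp at or after `nowTs` is treated as fresh instead of underflowing.
        uint256 sinceLast = nowTs > lastProposalTs ? nowTs - lastProposalTs : 0;
        if (sinceLast >= TAKEOVER_OPEN_DELAY) return Stage.Dead;
        if (sinceLast > TAKEOVER_DELAY) return Stage.Stale;
        return Stage.Live;
    }

    /// @dev Who may propose in each stage. Letting the original assignee
    ///      re-enter while stale follows the re-entry rule described in §7.3.
    function mayPropose(Stage stage, address caller, address assignee, address nextAssignee)
        internal
        pure
        returns (bool)
    {
        if (stage == Stage.Dead) return true;
        if (stage == Stage.Stale) return caller == assignee || caller == nextAssignee;
        return caller == assignee;
    }
}
```

Under these assumptions, a head that is 30 seconds old is `Stale`, so only the assignee or the next assignee passes `mayPropose`; at 48 seconds any caller does.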

The doc covers terminology, the dense-derivation timestamp rule and RLE encoding, the LivenessSlasher interface and UnifiedSlasher routing, the full GapSlash verification flow, parameter calibration, and a 4-milestone roll-out (M1 substrate → M2 dense blocks → M3 fast takeover + atomic slash → M4 retire Blacklist).

Style and depth match packages/protocol/docs/preconfirmation_lookahead.md. File-by-file contract change scaffold included; no code changes in this PR.

Test plan

  • Protocol team reviews design choices and parameter calibration (§11).
  • Confirm anchor cadence interaction with RLE empty-runs (§4.5, open question §14.2).
  • Confirm URC challenger-payout split semantics for LivenessSlasher (§14.3).
  • Confirm MAX_GAP value against historical L1 missed-slot distribution (§14.1).
  • Confirm cutover plan for BLOCK_TIME_TARGET 2s → 1s (§14.4).
  • Sign-off on the M4 sequencing for retiring the Blacklist overseer role (§13).

https://claude.ai/code/session_01TBxdfr5aeUEWq4MsfAmHYd


Generated by Claude Code

…t takeover)

Specifies an objective, ETH-bonded mechanism to close the silent-no-show gap
that PreconfSlasher leaves open: dense-derivation rule (forced empty blocks),
LivenessSlasher routed via UnifiedSlasher against URC collateral, GapSlash
fault with EIP-4788 censorship exception, and a short-fuse takeover trigger
in Inbox.propose with optional atomic slash on handoff. Outlines a 4-milestone
roll-out that ends with deprecating the subjective Blacklist overseer role.

https://claude.ai/code/session_01TBxdfr5aeUEWq4MsfAmHYd
@dantaik dantaik marked this pull request as ready for review June 5, 2026 03:26

github-actions Bot commented Jun 5, 2026

Contributor

🐋 DeepSeek Code Review

🔴 Critical Issues

  • Unresolved URC commitment model breaks the entire slashing path. The design relies on a liveness commitment derived from the on-chain lookahead hash, not a per‑window signed attestation from the operator. URC’s slashCommitment entrypoint currently expects a signed commitment, so the proposed LivenessSlasher has no way to trigger a valid slash (open question 14.4). Until URC is confirmed to accept a standing opt‑in or per‑window pre‑signatures, the atomic slash in Inbox.propose will always fail (and the try/catch will silently degrade liveness restoration to a slash‑free takeover). This is a showstopper—resolve the authentication gap before any implementation.

  • Ambiguous takeover edge case can stall the chain. In the stale takeover path, the rescuer inherits the delinquent’s endOfSubmissionWindowTimestamp. If the delinquent’s window has already fully elapsed when TAKEOVER_DELAY fires, that deadline is in the past, making the proposal permanently invalid. The document acknowledges the problem (“the stale branch as drafted would return endOfSubmissionWindowTimestamp = windowEnd (in the past)”) and offers two possible fixes, but leaves the final choice to the implementation. An incomplete fix here would block any handoff and extend the chain stall indefinitely. This edge case must be specified exhaustively and tested before M3.

  • Liveness gap from sparse manifests still lacks a slashing deterrent. The design collapses all faults into NoProposal and punishes sparse intra‑window submissions only with a soft penalty (default‑manifest replacement, lost MEV). An operator who posts a sparse manifest that skips seconds can still earn MEV from the blocks they do include while letting the rest be default‑filled. If MEV_per_second > 0 for the blocks they produce, the soft penalty alone may not deter the behavior—and the slasher provides no bond loss. The document’s invariant PER_SECOND_PENALTY > MEV/s applies only to full silence, not to selective sparseness. This could allow operators to systematically degrade liveness for users without losing URC collateral.

🟡 Warnings

  • try/catch on the atomic slash masks deep URC non‑conformance. If URC’s slashCommitment reverts because the commitment format is fundamentally unsupported (not just a transient “collateral drained”), the slash is silently discarded forever—no retry can work. The AtomicSlashFailed event should emit a clear distinguishable error code so that operators and watchtowers know whether the fault is permanently un‑slashable.

  • Full lookahead array in GapEvidence may discourage rescuers. Carrying the entire LookaheadSlot[] (up to 32 entries) in calldata adds significant cost to the takeover transaction. Until the Merkle‑proof optimization (open question 14.6) is implemented, atomic slashes will be expensive, weakening the incentive for rescuers to include evidence. The Merkle upgrade should be prioritized or, at minimum, a calldata‑cost simulation included in the M3 readiness criteria.

  • RLE decoding complexity introduces new attack surface. A malicious or buggy EmptyBlockRun could be crafted to produce an incorrect number of empty blocks while still passing the dense‑derivation check, potentially corrupting L2 state or spiking prover work. Fuzz the RLE decoder against edge cases (e.g., overflow in count, mismatched startTimestamp) before deploying M2.

  • Collateral‑insufficient operators can still operate for one window. After a slashing event that pushes an operator’s URC collateral below MIN_URC_COLLATERAL, they fall out of future lookaheads but remain eligible for the remainder of their current posted lookahead window. A malicious actor could therefore burn their remaining collateral in one final, undeterred window (e.g., a safety fault with no bond left to slash). While inherent to URC’s delayed eligibility check, this deserves explicit acknowledgement in the security considerations.
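
The RLE warning above names two concrete edge cases, an overflowing `count` and a mismatched `startTimestamp`. A decoder-side check for both might look like the sketch below. The `EmptyBlockRun` field layout, the error names, and the `checkRun` signature are assumptions made for illustration; the design doc's actual encoding may differ.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.24;

/// @notice Illustrative validation of one RLE empty-block run (hypothetical layout).
library EmptyRunSketch {
    // Assumed fields; only the names `count` and `startTimestamp` appear in the review.
    struct EmptyBlockRun {
        uint48 startTimestamp; // timestamp of the first empty block in the run
        uint32 count; // number of consecutive empty blocks
    }

    error ZeroLengthRun();
    error RunNotContiguous(uint256 expectedStart, uint256 actualStart);
    error RunExceedsWindow(uint256 runEnd, uint256 windowEnd);

    /// @param expectedStart Timestamp the dense rule requires for the next block.
    /// @param windowEnd Exclusive upper bound the run may not cross.
    /// @param blockTime Seconds per block (BLOCK_TIME_TARGET in the doc).
    /// @return nextTs Timestamp the block following this run must carry.
    function checkRun(
        EmptyBlockRun memory run,
        uint256 expectedStart,
        uint256 windowEnd,
        uint256 blockTime
    ) internal pure returns (uint256 nextTs) {
        if (run.count == 0) revert ZeroLengthRun();
        if (run.startTimestamp != expectedStart) {
            revert RunNotContiguous(expectedStart, run.startTimestamp);
        }
        // Widened to uint256 and checked by the 0.8 compiler, so a huge
        // `count` fails the window bound below instead of wrapping around.
        nextTs = uint256(run.startTimestamp) + uint256(run.count) * blockTime;
        if (nextTs > windowEnd) revert RunExceedsWindow(nextTs, windowEnd);
    }
}
```

A fuzz harness could feed arbitrary `(startTimestamp, count)` pairs into `checkRun` and assert that every accepted run expands to exactly `count` blocks inside the window.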

🔵 Suggestions

  • Use bytes32 instead of bytes26 for the lookahead hash. bytes26 offers lower collision resistance without a clear benefit; using a full bytes32 simplifies on‑chain reasoning and matches every other hash in the system.

  • Cap the EIP-4788 slot loop to the actual window size, not the epoch. The gap interval's L1-slot walk is bounded by MAX_GAP plus the window length, but in practice the maximum penalizable gap is capped by TAKEOVER_OPEN_DELAY (48s). Restricting the loop to windowEnd - windowStart (≤ 144s) would save gas and avoid paying for slots that cannot affect the penalty.

  • Raise TAKEOVER_DELAY slightly during the soft‑launch phase. Blob propagation can occasionally exceed 2 L1 slots under congestion; a 36s (3‑slot) delay during the monitor‑only M1/M2 period would generate fewer false‑positive takeovers while the system collects real‑world data. The value can be tuned down later if safe.

  • Make the slashingEnabled flag immutable after M2. The monitor‑only mode is critical for safe calibration; ensure the flag cannot be accidentally toggled by an operator or early governance.
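
For the EIP-4788 suggestion above, a bounded slot walk could be sketched as follows. The beacon-roots address and the 32-byte timestamp query come from EIP-4788 itself; the library name, function names, and the `maxSlots` cap are assumptions, and this is not the code in PreconfSlasherL1.sol.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.24;

/// @notice Illustrative bounded walk over L1 slots using the EIP-4788 oracle.
library MissedSlotSketch {
    // EIP-4788 beacon roots contract address, as given in the EIP.
    address internal constant BEACON_ROOTS = 0x000F3df6D732807Ef1319fB7B8bB8522d0Beac02;
    uint256 internal constant SECONDS_IN_SLOT = 12;

    /// @dev The oracle reverts when no L1 block carries exactly this timestamp.
    ///      It also reverts for timestamps that have aged out of its ring
    ///      buffer (8191 entries), so old ranges cannot be judged this way.
    function hasL1Block(uint256 slotTimestamp) internal view returns (bool) {
        (bool ok, bytes memory ret) = BEACON_ROOTS.staticcall(abi.encode(slotTimestamp));
        return ok && ret.length == 32;
    }

    /// @param firstSlotTs Timestamp of the first L1 slot to inspect.
    /// @param endTs Exclusive end of the range, e.g. the window end.
    /// @param maxSlots Hard cap on iterations so gas stays bounded.
    /// @return missed Number of inspected slots with no L1 block.
    function missedSlots(uint256 firstSlotTs, uint256 endTs, uint256 maxSlots)
        internal
        view
        returns (uint256 missed)
    {
        uint256 ts = firstSlotTs;
        for (uint256 i; i < maxSlots && ts < endTs; ++i) {
            if (!hasL1Block(ts)) ++missed;
            ts += SECONDS_IN_SLOT;
        }
    }
}
```

With `endTs - firstSlotTs` limited to one 144s window, the loop runs at most 12 times regardless of `maxSlots`, which is the saving the suggestion is after.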


Automatically triggered on PR update • model: deepseek-v4-pro

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 149b07d1e8


Comment on lines +361 to +364
6. **Raw gap.** Compute
`rawGap = afterGap.firstBlockTs - beforeGap.lastBlockTs - BLOCK_TIME_TARGET`
(or `windowEnd - beforeGap.lastBlockTs` for `ShortTrailingGap`, or
`windowEnd - windowStart` for `NoProposal`).


P1: Add an authenticated source for L2 block timestamps

The GapSlash verifier is specified to compute gaps from afterGap.firstBlockTs / beforeGap.lastBlockTs, but IInbox.Proposal only stores the L1 proposal timestamp plus blob references; the L2 block timestamps live in derivation manifests/blobs and are not authenticated by the proposal hash in a way this verifier describes. In any implementation following this flow, the slasher either cannot compile against Proposal or must trust caller-supplied timestamps, so sparse/silent proposers remain unverifiable unless the design also stores first/last L2 timestamps or verifies the relevant manifest/blob evidence on-chain.


Comment on lines +299 to +306
Unlike preconfirmation slashing, the commitment for `LivenessSlasher` is **not
signed per-window**. It is derived from on-chain state:

```solidity
struct LivenessCommitment {
    uint48 epochTimestamp;
    bytes26 lookaheadHash; // matches LookaheadStore's stored hash for the epoch
}
```


P1: Do not rely on unsigned commitments for URC slashing

This makes the liveness commitment explicitly not signed, but the URC path used elsewhere in this repo debits collateral through IRegistry.slashCommitment(registrationRoot, ISlasher.SignedCommitment, evidence) (for example PreconfSlasherL1.onMessageInvocation), so a silent operator has no signed commitment that the registry can authenticate before calling the slasher. As written, the atomic slash in §7.2 would fail or require a different URC entry point/delegation model; deriving a hash from the lookahead alone is not enough to make URC slash the operator's collateral.


claude added 5 commits June 5, 2026 03:33
Tractable fixes applied:
- §5.1: clarify URC slashCommitment provides single-frame atomicity (no two-step
  settlement window for front-running).
- §6.2/§6.3: NoProposal gap now measured to the closing proposal's first block
  (or to windowEnd if window fully elapsed), not the full window. Prevents
  strategic-delay inflation of the rescuer payout.
- §7.3 (new): explicit rule for original assignee re-entering after `stale` —
  re-entry allowed, dense backfill required, retroactive GapSlash still applies.
  Closes the "spike-and-lull" loophole.
- §9: document URC single-pot collateral ordering and natural decay of bad
  operators from posted lookaheads.
- §11.1: soft-launch / monitor-only mode for LivenessSlasher gated by a
  governance flag, so false-positive rate is measured before any collateral
  is at risk.
- §13 M1: realigned to monitor-only deployment, matching the soft-launch.

Architectural concerns surfaced (not resolved here):
- §14.4 (new): URC's slashCommitment authenticates a *signed* commitment; the
  unsigned-from-lookahead model needs reconciliation. Four candidate paths
  documented, with standing opt-in (A) preferred subject to URC API review.
- §14.5 (new): IInbox.Proposal does not commit L2 block timestamps; the
  verifier as drafted cannot read them on-chain. Three candidates documented,
  with adding firstL2BlockTs/lastL2BlockTs to Proposal (A) preferred.

https://claude.ai/code/session_01TBxdfr5aeUEWq4MsfAmHYd
…en questions

- §6.2 / §6.3: add L1InclusionGap fault type. Uses only on-chain
  Proposal.timestamp (L1) so it works without §14.5's L2-timestamp
  commitment. Closes the spike-and-lull loophole where a late re-entering
  assignee dense-backfills empty L2 blocks: InternalGap collapses to zero
  under backfill, but the L1-time gap between their successive proposals
  remains slashable.
- §7.3: fix the contradiction in the original prose. Self-rescuing with
  dense backfill DOES erase the L2-gap fault — but L1InclusionGap catches
  the L1-cadence violation regardless of L2 density.
- §14.6 (new): note lookahead-opening calldata cost in GapEvidence and
  the Merkle-proof mitigation against a LookaheadStore-stored root.
- §14.7 (new): scope per-block proving load under dense empty-block
  cadence; RLE addresses DA, not proving. Frame this as a measurement +
  potential compressed-empty-run proof, not a slashing-design blocker.

Renumber §14.6 (BLOCK_TIME_TARGET migration) -> §14.8.

https://claude.ai/code/session_01TBxdfr5aeUEWq4MsfAmHYd
…try/catch atomic slash

Revert L1InclusionGap (introduced in ede6e54):
- §6.2 / §6.3 / §7.3: the added fault would slash honest operators
  whose proposals span L1 blocks, because MAX_GAP=4s < SECONDS_IN_SLOT=12s.
  It is structurally subsumed by TAKEOVER_DELAY (any L1 silence ≥24s already
  triggers takeover), and any threshold calibrated below 24s
  punishes legitimate batching. Spike-and-lull under 24s is now
  acknowledged as a bounded liveness degradation in §7.3, with three
  forces capping its damage (takeover threshold, MEV-parity, market
  routing).

Censorship excision becomes design, not open question:
- §6.3 step 7: the slasher itself walks every L1 slot in the gap range
  and queries EIP-4788 beacon roots. The challenger no longer supplies
  a missed-slot list.
- §6.2: drop the challenger-supplied missedL1Slots[] field, with an
  inline note explaining why.
- §6.4: rewrite to spell out the automatic enumeration, including the
  per-iteration gas cost table (worst-case ~320k for a 32-slot epoch
  boundary). Acceptable given how rarely the slasher fires.
- §14.1: promote from "mitigations to consider" to "resolved in §6.4".
  What remains in §14.1 is calibration of MAX_GAP, still covered by
  the §11.1 soft-launch.

Other refinements:
- §7.2: wrap the atomic URC slash in try/catch. Closes a griefing
  vector where a malicious challenger drains the operator's collateral
  in the same block to make URC's atomic call revert and block the
  takeover. Liveness is now strictly prioritized over slash success.
- §11.3: explicit blob-propagation tradeoff. Blob-bearing proposals
  take 1–2 L1 slots to land; honest operators in flight can appear
  silent past TAKEOVER_DELAY. Acknowledged tradeoff favoring liveness.
- §14.8: TIMESTAMP_MAX_OFFSET must be narrowed alongside the 2s→1s
  cutover; today's value (up to 102 min on mainnet) is permissive in
  a way the dense rule cannot catch.

https://claude.ai/code/session_01TBxdfr5aeUEWq4MsfAmHYd
DeepSeek surfaced a structural inconsistency in the previous fault model:
the dense-derivation rule + default-manifest replacement (§4.2, §4.3)
guarantee on-chain L2 history is always dense after derivation. There
is therefore no on-chain "internal gap" or "trailing gap" for the
slasher to point at — those failure modes are absorbed by the soft
penalty (loss of MEV from the replaced source).

Restructured §6 to recognize a single fault: NoProposal. The verifier
proves the assigned operator submitted zero proposals in their window
using only Proposal.timestamp (L1) and Proposal.proposer, both already
on-chain. No L2 block timestamps are needed anywhere in the slasher.

Concrete changes:
- §6.2: collapse GapEvidence to a non-existence proof keyed on
  lastProposalBefore / firstProposalAfter, adjacent via parentProposalHash.
  Drop the GapKind enum, the closingProposal field, and all L2-timestamp
  fields.
- §6.3: rewrite verification flow as ring-buffer membership + adjacency
  + non-membership of expectedProposer + L1-time gap calculation.
- §6.5: update the fault table to show how each failure mode is handled
  (PreconfSlasher for signed-promise faults, soft penalty for sparse
  submission, GapSlash only for fully-silent operators).
- §7.1: clarify the rescuer's window assignment. Rescuer inherits the
  remainder of the delinquent's window (not a fresh window), so bond
  exposure is bounded by what the original assignee faced.
- §7.2: clarify that try/catch'd atomic-slash failures do not lose the
  fault — anyone can retry via standalone LivenessSlasher.slash later.
  The AtomicSlashFailed event signals retry is available.
- §14.5: marked RESOLVED. The simplification removes the need for any
  L2 block timestamp authentication. No Proposal struct change required.
- §15 summary: reflect both shifts.
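
The adjacency and L1-time-gap parts of the NoProposal check listed above can be sketched like this. It is a rough model under stated assumptions: the `ProposalView` struct, the error names, the boundary comparisons, and the rule for where the gap ends are all hypothetical, and ring-buffer membership of both proposals is assumed to be verified elsewhere.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.24;

/// @notice Illustrative NoProposal gap computation (hypothetical types and names).
library NoProposalSketch {
    // Trimmed-down view holding only the fields the commit message names.
    struct ProposalView {
        address proposer;
        uint48 timestamp; // L1 timestamp of the proposal
        bytes32 parentProposalHash;
    }

    error NotAdjacent();
    error BeforeNotPriorToWindow();
    error AssigneeProposedInWindow();

    /// @param lastProposalBeforeHash Hash of `lastProposalBefore`; assumed to
    ///        have been checked against the Inbox ring buffer by the caller.
    /// @return Seconds of the window that passed with no proposal at all.
    function silentSeconds(
        ProposalView memory lastProposalBefore,
        bytes32 lastProposalBeforeHash,
        ProposalView memory firstProposalAfter,
        address expectedProposer,
        uint48 windowStart,
        uint48 windowEnd
    ) internal pure returns (uint256) {
        // Adjacency: nothing can sit between the two proposals.
        if (firstProposalAfter.parentProposalHash != lastProposalBeforeHash) {
            revert NotAdjacent();
        }
        if (lastProposalBefore.timestamp >= windowStart) revert BeforeNotPriorToWindow();
        // If the proposal that closes the gap landed inside the window,
        // it must not come from the assignee (assumed rule).
        bool insideWindow = firstProposalAfter.timestamp < windowEnd;
        if (insideWindow && firstProposalAfter.proposer == expectedProposer) {
            revert AssigneeProposedInWindow();
        }
        uint256 gapEnd = insideWindow ? firstProposalAfter.timestamp : windowEnd;
        return gapEnd > windowStart ? gapEnd - windowStart : 0;
    }
}
```

The returned value is the raw L1-time gap only; excising missed L1 slots (§6.4) and comparing against MAX_GAP would happen after this step.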

The only architecturally significant open question remaining is §14.4
(URC signed-commitment authentication model). Everything else is
implementation-ready.

https://claude.ai/code/session_01TBxdfr5aeUEWq4MsfAmHYd
…nd-only note

Three targeted refinements from automated review on commit 88167c9:

- §7.1: address the stale-past-windowEnd edge case. If a delinquent's
  window fully elapses while sinceLast > TAKEOVER_DELAY but < TAKEOVER_OPEN_DELAY,
  the prior text would have set endOfSubmissionWindowTimestamp to a past
  windowEnd, adding a redundant ~24s stall before `dead` opens. Document
  the design intent (never block the chain on a takeover permission gap
  when the assignee's window has expired) and give two acceptable
  implementations.

- §7.2: add a gas-stipend note for the try/catch around the atomic URC
  slash. Forwarding all remaining gas means an unexpectedly heavy slash
  verification can OOG before the catch fires, reverting the entire
  takeover. The implementation should cap the slash call at e.g. 500k
  gas based on the §6.4 worst case.

- §6.3 step 5: state explicitly that the NoProposal adjacency proof
  relies on the Inbox ring buffer being strictly append-only within the
  slash window (which it is — 3-day buffer on mainnet, beyond URC's
  slash-window expiry).
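
The gas-stipend note for §7.2 can be illustrated with a capped `try`/`catch` call. This is a sketch under assumptions: `ILivenessSlasherSketch`, its `slash(bytes)` signature, and the `AtomicSlashFailed(bytes)` payload are hypothetical, since the real URC entry point is still open in §14.4. Only the 500k figure and the event name come from this PR.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.24;

// Hypothetical interface; the actual LivenessSlasher entry point may differ.
interface ILivenessSlasherSketch {
    function slash(bytes calldata gapEvidence) external;
}

/// @notice Illustrative capped atomic slash; liveness is never blocked by a failed slash.
contract TakeoverSlashSketch {
    // "e.g. 500k gas" from the commit message, sized from the §6.4 worst case.
    uint256 internal constant SLASH_GAS_STIPEND = 500_000;

    ILivenessSlasherSketch public immutable slasher;

    // Raw revert data is emitted so watchtowers can tell failure modes apart.
    event AtomicSlashFailed(bytes reason);

    constructor(ILivenessSlasherSketch _slasher) {
        slasher = _slasher;
    }

    /// @dev Called from the takeover path. Returns false instead of reverting
    ///      when the slash fails, so the proposal itself still goes through.
    function _tryAtomicSlash(bytes calldata gapEvidence) internal returns (bool slashed) {
        if (gapEvidence.length == 0) return false; // evidence is optional
        try slasher.slash{gas: SLASH_GAS_STIPEND}(gapEvidence) {
            slashed = true;
        } catch (bytes memory reason) {
            // Out-of-gas inside the callee lands here with empty `reason`.
            emit AtomicSlashFailed(reason);
        }
    }
}
```

Two caveats apply to this pattern in general: a `try` on an address with no deployed code reverts in the caller and is not caught, and the callee receives less than the stipend if the takeover transaction itself is nearly out of gas.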

Holding bot-driven iteration after this commit; §14.4 (URC signed-
commitment authentication) remains the only architecturally significant
open question requiring user input.

https://claude.ai/code/session_01TBxdfr5aeUEWq4MsfAmHYd
3 participants