Slow rollout feature #190

cody-wang-cb · 2025-04-28T12:52:44Z

This PR adds a slow rollout feature: rollup-boost only tries to fetch the builder's block x% of the time, so the rollout can happen incrementally rather than all or nothing.
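
To make the idea concrete, here is a minimal sketch of a percentage gate. It is not the code from this PR: `use_builder`, `XorShift64`, and `next_roll` are hypothetical names, and the tiny generator only exists so the example has no external dependencies (a real implementation would more likely use the `rand` crate).

```rust
/// Hypothetical helper: decide whether to fetch the builder's block for one
/// payload request. `roll` is expected to be a sample from 0..100.
pub fn use_builder(rollout_pct: u16, roll: u16) -> bool {
    // Values above 100 are clamped so they behave like 100.
    roll % 100 < rollout_pct.min(100)
}

/// Small xorshift generator, used here only to keep the sketch dependency-free.
pub struct XorShift64(u64);

impl XorShift64 {
    pub fn new(seed: u64) -> Self {
        // xorshift must not start from an all-zero state.
        Self(if seed == 0 { 0x9E37_79B9_7F4A_7C15 } else { seed })
    }

    /// Returns a value in 0..100 (roughly uniform; modulo bias is ignored).
    pub fn next_roll(&mut self) -> u16 {
        self.0 ^= self.0 << 13;
        self.0 ^= self.0 >> 7;
        self.0 ^= self.0 << 17;
        (self.0 % 100) as u16
    }
}
```

With this shape, `rollout_pct = 0` never uses the builder and `rollout_pct = 100` always does; anything in between picks blocks at random.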

vercel · 2025-04-28T12:52:52Z

The latest updates on your projects.

1 Skipped Deployment

Name	Status	Preview	Comments	Updated (UTC)
rollup-boost	⬜️ Ignored (Inspect)	Visit Preview		Apr 28, 2025 0:52am

cody-wang-cb · 2025-04-28T12:58:35Z

@ferranbt @avalonche wanna give a review on this PR?

0xKitsune · 2025-04-28T16:54:16Z

src/cli.rs

+
+    /// Percentage of blocks built by the builder
+    #[arg(long, env, default_value = "100")]
+    pub rollout_pct: u16,


With this approach, it seems rollup-boost would need to be restarted every time the rollout percentage is updated.

IIUC, the problem this PR is solving is to verify that the builder is healthy and producing blocks correctly before fully enabling it.

It seems like another approach to solve this problem could be using ExecutionMode::DryRun. DryRun forwards payload building jobs to the builder without sending get_payload requests, allowing operators to evaluate builder health/metrics before fully enabling the builder.

This allows us to validate the builder’s readiness without needing rollup-boost restarts/config changes. Curious to hear your thoughts.

cody-wang-cb · 2025-04-28

> IIUC, the problem this PR is solving is to verify that the builder is healthy and producing blocks correctly before fully enabling it.

Not really. This PR aims to have the builder build real blocks only X% of the time, which is not the same as dry run.
You could argue that rollout_pct = 0 is the same as dry run, but otherwise it's not. Once you turn off dry run, 100% of blocks are built by the builder, which might not be ideal if you first want to evaluate the builder building real blocks into DB state with some production traffic.

avalonche · 2025-04-28

Can this be part of the debug API so the % can be configured without restarts?

cody-wang-cb · 2025-04-28

Good point, I can make it part of the debug API.
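
One way to make the percentage adjustable at runtime is to keep it in a shared atomic that a debug endpoint writes and the block-building path reads. The sketch below is an assumption about how that could look, not rollup-boost's actual debug API: `RolloutSetting`, `set`, and `get` are hypothetical names.

```rust
use std::sync::atomic::{AtomicU16, Ordering};
use std::sync::Arc;

/// Hypothetical shared setting; not rollup-boost's actual debug API.
#[derive(Debug)]
pub struct RolloutSetting {
    pct: AtomicU16,
}

impl RolloutSetting {
    /// Creates the setting from the startup value (e.g. the CLI flag).
    pub fn new(pct: u16) -> Arc<Self> {
        Arc::new(Self { pct: AtomicU16::new(pct.min(100)) })
    }

    /// Would be called by a (hypothetical) debug endpoint handler.
    pub fn set(&self, pct: u16) -> Result<(), String> {
        if pct > 100 {
            return Err(format!("rollout_pct must be in 0..=100, got {pct}"));
        }
        self.pct.store(pct, Ordering::Relaxed);
        Ok(())
    }

    /// Would be called on the block-building path for every payload request.
    pub fn get(&self) -> u16 {
        self.pct.load(Ordering::Relaxed)
    }
}
```

Because the value is a single integer read independently on each request, relaxed ordering is enough here, and changing it does not require a restart.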

0xOsiris · 2025-04-28

I don't really see an issue with this, but why is this useful functionality to have within rollup-boost? Fallback execution mode would actually allow you to send FCUs and receive payloads from the builder without propagating those payloads to the CL. This would allow you to fully evaluate the health of the builder (WebSocket streams, etc.) while building 100% of blocks without fully enabling block production on the network, which seems most useful.

Curious why an operator would want the builder to only build x% of blocks at random points in time?

0xKitsune · 2025-04-28

> Not really, this PR is aimed at only using the builder to build real blocks X% amount of time, which is not the same as dry run.

Right, I'm not suggesting that DryRun does the same thing as a partial rollout. I'm asking whether they seek to solve the same problem (i.e. how to verify that the builder is healthy and producing blocks correctly before fully enabling the builder). If partial rollout fits your deployment approach better than something like DryRun, agreed that we could make it part of the debug API to avoid restarts.

> if you want to evaluate the builder building real blocks into DB state with some production traffic first.

Just noting that if the ultimate goal of partial rollouts is to validate builder payload correctness before fully enabling the builder and propagating these blocks throughout the network, this could be achieved via DryRun (or other execution modes like Fallback) and inspecting traces/logs to evaluate builder-produced blocks without publishing them to the network.

Within the Debug API there is also Fallback mode which sends FCUs with payload attributes to the builder and validates payloads with the default execution client, but ultimately falls back on the default execution client's block. This approach could also be used to derisk deployments, allowing you to not only inspect the block via logs but also validate builder blocks via new_payload calls to the local execution client. It's worth noting that @ferranbt and I had been discussing simplifying DryRun and Fallback into a single execution mode, since they are quite similar. We could incorporate the problem that partial rollout is trying to solve into those changes as well.
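
The differences between the modes described in this thread can be summarized in a small model. This is a sketch based only on the descriptions in the comments above; the `Behavior` struct and `behavior` function are hypothetical, and the row for `Enabled` is an assumption about normal operation rather than something stated in this thread.

```rust
/// Mode names follow the comments in this thread; everything else is illustrative.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ExecutionMode {
    Enabled,
    DryRun,
    Fallback,
}

#[derive(Debug, PartialEq, Eq)]
pub struct Behavior {
    /// FCUs with payload attributes are forwarded to the builder.
    pub sends_build_jobs: bool,
    /// get_payload is sent to the builder.
    pub requests_payload: bool,
    /// The builder payload is checked via new_payload on the default client.
    pub validates_locally: bool,
    /// The builder's block is the one returned to the CL.
    pub publishes_builder_block: bool,
}

pub fn behavior(mode: ExecutionMode) -> Behavior {
    match mode {
        // Assumption: normal operation does everything and uses the builder's block.
        ExecutionMode::Enabled => Behavior {
            sends_build_jobs: true,
            requests_payload: true,
            validates_locally: true,
            publishes_builder_block: true,
        },
        // Per the thread: jobs are forwarded, but no get_payload requests are sent.
        ExecutionMode::DryRun => Behavior {
            sends_build_jobs: true,
            requests_payload: false,
            validates_locally: false,
            publishes_builder_block: false,
        },
        // Per the thread: payloads are fetched and validated, then discarded
        // in favour of the default execution client's block.
        ExecutionMode::Fallback => Behavior {
            sends_build_jobs: true,
            requests_payload: true,
            validates_locally: true,
            publishes_builder_block: false,
        },
    }
}
```

Seen this way, a partial rollout differs from both DryRun and Fallback only in the last field: for some fraction of blocks, the builder's block is actually published.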

Let me know if I'm overlooking something here. I won't block on this, just pointing out that we could potentially use or extend existing execution modes to handle builder block correctness and health validation during incremental rollout.

cody-wang-cb · 2025-04-28

> Curious why an operator would want the builder to only build x% of blocks at random points in time?

Yeah, this is the key question here. Internally we are going to revisit this tomorrow to see if this assumption really makes sense.
What I was thinking here was unrelated to correctness. It was whether this partial rollout could allow us to observe new user behaviours from the new blocks (e.g. users might start to increase fees, or there might be more or less spam), because the external builder inherently builds different blocks than the sequencer. But if it's just random blocks, maybe it doesn't really help; perhaps some kind of switchback experiment makes more sense here.
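
As a rough illustration of what a switchback could mean here, the sketch below alternates between the sequencer and the builder in contiguous windows of blocks instead of picking random blocks. Nothing like this is proposed in the PR; `builder_window` and `window_len` are hypothetical, and which windows go to the builder is an arbitrary choice.

```rust
/// Hypothetical switchback schedule: even-numbered windows use the default
/// sequencer, odd-numbered windows use the builder.
pub fn builder_window(block_number: u64, window_len: u64) -> bool {
    // Guard against a zero-length window, which would divide by zero.
    let window_len = window_len.max(1);
    (block_number / window_len) % 2 == 1
}
```

With `window_len = 100`, blocks 0 to 99 come from the sequencer, blocks 100 to 199 from the builder, and so on, so user behaviour can be compared across sustained periods rather than isolated blocks.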

0xKitsune · 2025-05-08

Hey, just bumping this to see if there are any updates after revisiting the assumptions around partial rollout, or if we should close this PR.

slow rollout

9237b3b

0xKitsune reviewed Apr 28, 2025

SozinM force-pushed the flashblocks branch from 2a3ba5d to 73e7f02 Compare May 19, 2025 09:49

ferranbt closed this May 22, 2025
