Skip to content

[runtime] Create a trait for briding async and parallel code#3636

Open
cronokirby wants to merge 2 commits into
mainfrom
ck/strategist
Open

[runtime] Create a trait for briding async and parallel code#3636
cronokirby wants to merge 2 commits into
mainfrom
ck/strategist

Conversation

@cronokirby
Copy link
Copy Markdown
Collaborator

When calling into code doing heavy computation using a Strategy, you want to avoid blocking the tokio thread on waiting for the computation. Instead, you want the waiting to be async, so that other tasks can use the thread while you wait for the computation to finish in the rayon thread pool.

To facilitate this pattern, a new capability, Strategist, is added to to the runtime Context. This takes in a synchronous closure, which needs a strategy to do parallel computation. Runtimes implement this correctly, by using, e.g. a oneshot channel to asynchronously wait on the rayon result, or using the tokio blocking pool for a sequential strategy.

To use this change throughout the code base, places where a Strategy was stored have been replaced with using the context instead.

@cloudflare-workers-and-pages
Copy link
Copy Markdown

cloudflare-workers-and-pages Bot commented Apr 20, 2026

Deploying monorepo with  Cloudflare Pages  Cloudflare Pages

Latest commit: 1a04122
Status: ✅  Deploy successful!
Preview URL: https://8fcdea30.monorepo-eu0.pages.dev
Branch Preview URL: https://ck-strategist.monorepo-eu0.pages.dev

View logs

@cloudflare-workers-and-pages
Copy link
Copy Markdown

cloudflare-workers-and-pages Bot commented Apr 20, 2026

Deploying with  Cloudflare Workers  Cloudflare Workers

The latest updates on your project. Learn more about integrating Git with Workers.

Status Name Latest Commit Updated (UTC)
✅ Deployment successful!
View logs
commonware-mcp 1a04122 Apr 22 2026, 09:16 PM

@cronokirby
Copy link
Copy Markdown
Collaborator Author

We could also get rid of the ThreadPooler API entirely, but that would require modifying some code in Storage to be over strategy instead. Probably worth doing if we merge this, but subsequently.

Comment thread consensus/src/simplex/actors/batcher/actor.rs Outdated
Copy link
Copy Markdown

@cursor cursor Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes and found 2 potential issues.

Fix All in Cursor

❌ Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.

Reviewed by Cursor Bugbot for commit dba50ae. Configure here.

.await;
}

work.insert(updated_view, round);
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Round removed from work but leader_nullified still needs it

Medium Severity

When updated_view equals current.view, the round is removed from work at line 666. The forward_proposal check at line 648 runs before the remove and works fine. However, if a subsequent iteration's vote arm (line 632) fires for the same view while a different updated_view is being processed in on_end, Self::leader_nullified(&current, &work) will not find the current view's round (since it was removed but not yet reinserted). In the old code, rounds were borrowed in-place via get_mut and never removed from work, so leader_nullified always saw the current round. The remove-process-reinsert pattern can cause a missed leader-nullify detection if the timing aligns.

Additional Locations (1)
Fix in Cursor Fix in Web

Reviewed by Cursor Bugbot for commit dba50ae. Configure here.

timer.cancel();
}
result
};
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Redundant with_strategy calls for likely-no-op certificate construction

Low Severity

Three sequential with_strategy calls (construct_notarization, construct_nullification, construct_finalization) are made unconditionally for every updated view. Each with_strategy call may involve channel communication or thread pool dispatch depending on the runtime. In practice, at most one of these will return Some (and they each have fast-path early returns checking has_*() and quorum counts), but the overhead of three round-trips through the strategy bridge is incurred regardless. Combining these three into a single with_strategy call that attempts all three constructions at once would reduce the overhead.

Fix in Cursor Fix in Web

Reviewed by Cursor Bugbot for commit dba50ae. Configure here.

@patrick-ogrady patrick-ogrady added this to the v2026.4.1 milestone Apr 21, 2026
@patrick-ogrady patrick-ogrady moved this to In Progress in Tracker Apr 21, 2026
if filtered.len() >= quorum as usize {
if let Some(certificate) = Certificate::from_acks(&*scheme, filtered, &self.strategy) {
let certificate = self
.context
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should keep strategy less tightly bound to context? We could accept strategy with it?

When calling into code doing heavy computation using a Strategy, you want
to avoid blocking the tokio thread on waiting for the computation. Instead,
you want the waiting to be async, so that other tasks can use the thread
while you wait for the computation to finish in the rayon thread pool.

To facilitate this pattern, a new capability, Strategist, is added to
to the runtime Context. This takes in a synchronous closure, which needs
a strategy to do parallel computation. Runtimes implement this correctly,
by using, e.g. a oneshot channel to asynchronously wait on the rayon result,
or using the tokio blocking pool for a sequential strategy.

To use this change throughout the code base, places where a Strategy was
stored have been replaced with using the context instead.
Comment on lines +741 to +757
RuntimeStrategy::Sequential(strategy) => {
let strategy = RuntimeStrategy::Sequential(strategy);
let handle = self
.clone()
.shared(true)
.spawn(move |_| async move { f(&strategy) });
Either::Left(async move { handle.await.expect("strategy task failed") })
}
RuntimeStrategy::Rayon(strategy) => {
let (sender, receiver) = oneshot::channel();
let pool = strategy.thread_pool().clone();
let strategy = RuntimeStrategy::Rayon(strategy);
pool.spawn(move || {
let _ = sender.send(f(&strategy));
});
Either::Right(async move { receiver.await.expect("strategy task failed") })
}
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
RuntimeStrategy::Sequential(strategy) => {
let strategy = RuntimeStrategy::Sequential(strategy);
let handle = self
.clone()
.shared(true)
.spawn(move |_| async move { f(&strategy) });
Either::Left(async move { handle.await.expect("strategy task failed") })
}
RuntimeStrategy::Rayon(strategy) => {
let (sender, receiver) = oneshot::channel();
let pool = strategy.thread_pool().clone();
let strategy = RuntimeStrategy::Rayon(strategy);
pool.spawn(move || {
let _ = sender.send(f(&strategy));
});
Either::Right(async move { receiver.await.expect("strategy task failed") })
}
RuntimeStrategy::Sequential(_) => {
let handle = self
.clone()
.shared(true)
.spawn(move |_| async move { f(&strategy) });
Either::Left(async move { handle.await.expect("strategy task failed") })
}
RuntimeStrategy::Rayon(ref rayon) => {
let (sender, receiver) = oneshot::channel();
let pool = rayon.thread_pool().clone();
pool.spawn(move || {
let _ = sender.send(f(&strategy));
});
Either::Right(async move { receiver.await.expect("strategy task failed") })
}

I was looking at this block for a while trying to internalize exactly what's happening here, and in playing around with it I thought this made it easier to read. Feel free to ignore if this is not as idiomatic though.

(I did not know the ref part would be needed / had never seen the "use of partially moved value" compiler error that occurs with out.)

let filtered = acks
.values()
.filter(|a| a.item.digest == ack.item.digest)
.cloned()
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wonder if there is a way to avoid this clone with some more careful lifetime management?

@patrick-ogrady patrick-ogrady modified the milestones: v2026.5.0, v2026.6.0 May 22, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

Status: In Progress

Development

Successfully merging this pull request may close these issues.

3 participants