[runtime/tokio] support for core pinned dedicated tasks by andresilva · Pull Request #3549 · commonwarexyz/monorepo

andresilva · 2026-04-07T09:42:31Z

This PR adds a pinned(core) builder method to the Spawner trait that runs tasks on a dedicated thread pinned to a specific CPU core. It implies dedicated(), since you can't be pinned without being dedicated: context.dedicated().spawn(f) for an unpinned dedicated thread, context.pinned(3).spawn(f) for a pinned one. A separate method was preferred over dedicated(Option<usize>) to avoid the awkward dedicated(None) for the common unpinned case. Users who want to co-locate two dedicated threads on the same core simply pass the same value.

Core pinning uses sched_setaffinity on Linux and is a no-op on other platforms, macOS only offers affinity hints (not true pinning), and Windows is currently not a target for production environments. A public commonware_runtime::available_cores() helper is also exposed for users who want to query the CPU count for their own assignment strategies.

Related #3537.

cloudflare-workers-and-pages · 2026-04-07T09:42:42Z

Deploying with Cloudflare Workers

The latest updates on your project. Learn more about integrating Git with Workers.

Status	Name	Latest Commit	Updated (UTC)
✅ Deployment successful! View logs	commonware-mcp	`1fb2e83`	Apr 22 2026, 10:42 PM

cloudflare-workers-and-pages · 2026-04-07T09:49:21Z

Deploying monorepo with Cloudflare Pages

Latest commit:	`1fb2e83`
Status:	✅ Deploy successful!
Preview URL:	https://8c08cd96.monorepo-eu0.pages.dev
Branch Preview URL:	https://andre-runtime-spawn-pinned.monorepo-eu0.pages.dev

View logs

andresilva · 2026-04-07T10:01:56Z

+}
+
+/// Best-effort attempt to pin the current thread to the given core.
+/// The `core` value wraps around the number of available CPUs.


Undecided whether wrapping is a good idea or not. Do we want to panic if the user tries to do pinned(16) on a 16-core CPU? Or just silently ignore it and don't pin at all? The latter doesn't seem like a good idea.

in this case, I think we could panic (clearly this is an advanced API and I think we should be transparent)

I updated this to panic on an invalid core number, but still silently fail if the actual syscalls to get the number of cores and to pin fail (e.g. on a restricted environment like a cgroup with CPU limits). Let me know if you think we should still panic there.

patrick-ogrady · 2026-04-15T03:51:52Z

+        });
+        return;
+    };
+    assert!(


Do we need to assert in two locations?

If we don't assert in Context::pinned(...) then the failure only happens at spawn time. That may be fine, but may also be a bit weird? Also the panic will happen in the new spawned thread, and won't necessarily take down the main thread (we can workaround that though if you think panicking at spawn time is better).

patrick-ogrady · 2026-04-15T03:52:44Z

+    static WARN_AFFINITY: Once = Once::new();
+
+    let Some(num_cores) = available_cores() else {
+        WARN_CPUS.call_once(|| {


Is the idea that warning each time would become very noisy/preference would just be to continue?

Yeah, if this fails once, it will very likely just keep failing forever (i.e. the syscall just isn't allowed for some reason). I can remove and just warn every time though.

patrick-ogrady · 2026-04-15T03:55:18Z

+///
+/// Panics if `core` is greater than or equal to the number of available CPUs.
+#[cfg(target_os = "linux")]
+pub(crate) fn pin_to_core(core: usize) {


If you are designing your application with core pinning in mind, I wonder if there is some credence to just panicking if core pining fails?

For example, I suspect glommio just dies if it can't pin?

I think in a lot of cases core pinning is mostly a performance optimization but not necessarily something that MUST happen, i.e. a best-effort attempt, since it doesn't incur any correctness issue if it doesn't happen. Maybe here we can just return a bool and that let's the user decide what to do? The issue is that in Context::pined(...) we have no way to expose this 🫤. I'm thinking mostly of a scenario where someone distributes a node binary, that tries to pin threads as a performance optimization, and then it suddenly panics because someone is running that (general) binary in a container.

My concern (maybe too pedantic) is that correct/smooth behavior does actually rely on pinning in the applications that want it (or else there is such a big performance hit it doesn't work well). 🤔

Maybe the behavior could be controlled by a runtime configuration option? IMO it should default to "best-effort", but can be set to strict in which case we panic if pinning fails.

# Conflicts: # runtime/src/lib.rs

cursor · 2026-04-22T14:32:06Z

+            panic!("failed to start dedicated task");
+        }
+    }
+}


Unnecessary synchronous blocking for non-pinned dedicated spawns

Medium Severity

spawn_dedicated performs a synchronous mpsc::sync_channel handshake for all dedicated spawns, including Execution::Dedicated(None) where no CPU pinning is requested. Previously, plain dedicated() spawns just called thread::spawn and returned immediately. Now they block the calling thread (potentially a Tokio worker thread) waiting for the new thread to start and send an Ok(()) through the channel. This sync wait is only needed when cpu.is_some() to surface pinning failures at spawn time; for non-pinned dedicated tasks, the handshake result is always Ok(()) and the blocking adds unnecessary latency.

Additional Locations (1)

runtime/src/tokio/runtime.rs#L624-L626

^{Reviewed by Cursor Bugbot for commit 2ee66b0. Configure here.}

If cpu is None we might need to reset the affinity mask to the default (if the runtime supports pinning), so we still need to make sure that doesn't fail. We could avoid this if cpu == None and available_cpus == None but IMO it's not worth the code complexity.

cursor

Cursor Bugbot has reviewed your changes and found 1 potential issue.

There are 3 total unresolved issues (including 2 from previous reviews).

^{❌ Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.}

^{Reviewed by Cursor Bugbot for commit 380231c. Configure here.}

patrick-ogrady · 2026-04-22T17:26:18Z

+/// This queries the calling thread's affinity mask via `sched_getaffinity` and
+/// returns `None` if that query fails.
+#[cfg(target_os = "linux")]
+pub(crate) fn available_cpus() -> Option<NonEmptyVec<usize>> {


I wonder if we should hide this complexity and layout cores as a contiguous 0,1,2,3 and then map to physical cores?

I've never done pinning before, so that probably is more confusing than its worth (and we should let the application make this call)?

I don't love how it complicates the spawner trait but it isn't "wrong"?

I'd say we shouldn't hide it for now, another thing I want to do in the future is to also have an helper to expose CPU topology, so that you can pin on the same "core" / "CCD" / "NUMA-node" etc. (and it might just replace this helper altogether). If you're going to pin threads you probably care about this as well, e.g. cross-CCD cacheline invalidation is like 4x slower depending on the CPU.

I don't love how it complicates the spawner trait but it isn't "wrong"?

Also it's not obvious to me how making this linear would change anything in the spawner trait, seems ortoghonal? The reason I decided to expose Spawner::available_cpus is to avoid the footgun of using this helper from inside tasks, I can revert that if you want, we only need to cache the baseline available CPUs in tokio runtime so that we can restore the affinity mask on dedicated(), but it doesn't need to be exposed.

let cpus = utils::thread::available_cpus(); // vec![0, 1, 2, 3]; context.pinned(1).spawn(|_| { assert!(utils::thread::available_cpus() == vec![1]); // true assert!(ctx.available_cpus() == vec![0, 1, 2, 3]); // true })

Removed Spawner::available_cpus, and utils::thread::available_cpus should always return the baseline value now (since it checks affinity of process PID).

codecov · 2026-04-22T23:41:56Z

Codecov Report

❌ Patch coverage is 91.51943% with 24 lines in your changes missing coverage. Please review.
✅ Project coverage is 95.81%. Comparing base (2a7dd42) to head (1fb2e83).
⚠️ Report is 4 commits behind head on main.

Files with missing lines	Patch %	Lines
runtime/src/utils/thread.rs	88.77%	10 Missing and 1 partial ⚠️
runtime/src/tokio/runtime.rs	87.23%	5 Missing and 1 partial ⚠️
runtime/src/lib.rs	96.87%	2 Missing and 2 partials ⚠️
runtime/src/utils/cell.rs	0.00%	3 Missing ⚠️

@@            Coverage Diff             @@
##             main    #3549      +/-   ##
==========================================
- Coverage   95.87%   95.81%   -0.06%     
==========================================
  Files         442      442              
  Lines      172121   173696    +1575     
  Branches     4010     4083      +73     
==========================================
+ Hits       165017   166425    +1408     
- Misses       5840     5969     +129     
- Partials     1264     1302      +38

Files with missing lines	Coverage Δ
runtime/src/deterministic.rs	`96.23% <100.00%> (+0.01%)`	⬆️
runtime/src/telemetry/metrics/task.rs	`100.00% <100.00%> (ø)`
runtime/src/utils/mod.rs	`97.58% <100.00%> (ø)`
runtime/src/utils/cell.rs	`76.92% <0.00%> (-1.51%)`	⬇️
runtime/src/lib.rs	`97.58% <96.87%> (-0.05%)`	⬇️
runtime/src/tokio/runtime.rs	`84.58% <87.23%> (-0.28%)`	⬇️
runtime/src/utils/thread.rs	`88.09% <88.77%> (+0.21%)`	⬆️

... and 20 files with indirect coverage changes

Continue to review full report in Codecov by Sentry.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 2a7dd42...1fb2e83. Read the comment docs.

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

[runtime] support pinning dedicated tasks to a core

1b43392

github-actions Bot assigned andresilva Apr 7, 2026

andresilva added this to Tracker Apr 7, 2026

andresilva moved this to Ready for Review in Tracker Apr 7, 2026

andresilva requested a review from patrick-ogrady April 7, 2026 09:42

[runtime] add helper for available cores

fd20460

pepyakin reviewed Apr 7, 2026

View reviewed changes

Comment thread runtime/src/utils/thread.rs Outdated

andresilva commented Apr 7, 2026

View reviewed changes

andresilva added 4 commits April 7, 2026 11:02

[runtime] clippy

8c076c1

[runtime] warn once on pin_to_core failure

df22aa9

[runtime] clippy

cc2321b

[runtime] tests

53bff09

andresilva force-pushed the andre/runtime-spawn-pinned branch from 4f8b1a0 to 53bff09 Compare April 7, 2026 10:36

[runtime] panic on invalid core number

f00c82e

cursor Bot reviewed Apr 7, 2026

View reviewed changes

Comment thread runtime/src/tokio/runtime.rs

Comment thread runtime/src/deterministic.rs Outdated

andresilva added 2 commits April 7, 2026 16:44

[runtime] test pinned panic

d3877f5

[runtime] panic eagerly on invalid pinned

595bbcc

cursor Bot reviewed Apr 7, 2026

View reviewed changes

Comment thread runtime/src/lib.rs

andresilva added 3 commits April 7, 2026 16:53

[runtime] fix test

a3b5f97

[runtime] available_cores works on any unix

a0d9699

[runtime] test available_cores

cd17564

patrick-ogrady reviewed Apr 15, 2026

View reviewed changes

patrick-ogrady and others added 4 commits April 21, 2026 15:55

Merge branch 'main' into pr/3549

e6077ad

# Conflicts: # runtime/src/lib.rs

[runtime] remove stale instrumented spawner method

ab37233

Merge branch 'main' into andre/runtime-spawn-pinned

bb4238b

[runtime] pin cpus not cores

cc4a3b7

andresilva added 4 commits April 22, 2026 12:23

[runtime] panic on failed cpu pinning

017c4c7

[runtime] clippy

64ffa2d

[runtime] cleanup

eb5ac73

[runtime] fix large cpu oom

2ee66b0

cursor Bot reviewed Apr 22, 2026

View reviewed changes

andresilva added 3 commits April 22, 2026 15:37

[runtime] cleanup available_cpus

1d33fe6

[runtime] cleanup

d6bdfdf

[runtime] cleanup

6ef8a13

andresilva force-pushed the andre/runtime-spawn-pinned branch from af27fc2 to 6ef8a13 Compare April 22, 2026 15:20

cursor Bot reviewed Apr 22, 2026

View reviewed changes

Comment thread runtime/src/utils/thread.rs Outdated

Comment thread runtime/src/tokio/runtime.rs

[runtime] cleanup

380231c

andresilva force-pushed the andre/runtime-spawn-pinned branch from 4802f6a to 380231c Compare April 22, 2026 15:35

cursor Bot reviewed Apr 22, 2026

View reviewed changes

Comment thread runtime/src/lib.rs Outdated

andresilva added 7 commits April 22, 2026 17:00

[runtime] cleanup tests

61be589

[runtime] avoid clone

aff0464

[runtime] cleanup

d478b2a

[runtime] nit

2885b19

[runtime] nit

6a9261e

[runtime] fix

72bc5ea

[runtime] fix test

201683d

patrick-ogrady reviewed Apr 22, 2026

View reviewed changes

andresilva added 6 commits April 22, 2026 18:43

[runtime] clippy

412967a

[runtime] remove Spawner::available_cpus

8df7099

[runtime] export available_cpus

87f22f0

[runtime] clippy

34c0cdc

[runtime] clippy

2e18622

[runtime] fix wasm

1fb2e83

Conversation

andresilva commented Apr 7, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

cloudflare-workers-and-pages Bot commented Apr 7, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Deploying with Cloudflare Workers

Uh oh!

cloudflare-workers-and-pages Bot commented Apr 7, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Deploying monorepo with Cloudflare Pages

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

andresilva Apr 7, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

andresilva Apr 15, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

andresilva Apr 15, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

andresilva Apr 16, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

cursor Bot Apr 22, 2026

Choose a reason for hiding this comment

Unnecessary synchronous blocking for non-pinned dedicated spawns

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

cursor Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

andresilva Apr 22, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

andresilva Apr 22, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

codecov Bot commented Apr 22, 2026

Codecov Report

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

andresilva commented Apr 7, 2026 •

edited

Loading

cloudflare-workers-and-pages Bot commented Apr 7, 2026 •

edited

Loading

cloudflare-workers-and-pages Bot commented Apr 7, 2026 •

edited

Loading

andresilva Apr 7, 2026 •

edited

Loading

andresilva Apr 15, 2026 •

edited

Loading

andresilva Apr 15, 2026 •

edited

Loading

andresilva Apr 16, 2026 •

edited

Loading

andresilva Apr 22, 2026 •

edited

Loading

andresilva Apr 22, 2026 •

edited

Loading