
Conversation


@t-nelson t-nelson commented Nov 18, 2025

Problem

we do the xdp init steps that require capabilities quite late in validator process start up. by then we've already spawned many threads, which will have inherited caps that they do not need, posing a security risk

Summary of Changes

  • do a bit of refactoring to get the steps that require caps isolated
  • move xdp init as early as possible in validator start up
  • revoke caps from the process when we're done with them

Fixes: #6087

i know the new type names suck. suggestions welcome

codecov-commenter commented Nov 18, 2025

Codecov Report

❌ Patch coverage is 1.51229% with 521 lines in your changes missing coverage. Please review.
✅ Project coverage is 82.6%. Comparing base (252709c) to head (8df9c74).
⚠️ Report is 15 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff            @@
##           master    #9133     +/-   ##
=========================================
- Coverage    82.6%    82.6%   -0.1%     
=========================================
  Files         852      852             
  Lines      318162   318341    +179     
=========================================
+ Hits       262963   262991     +28     
- Misses      55199    55350    +151     

@t-nelson t-nelson force-pushed the xdpinit branch 2 times, most recently from b5b9f5d to af2baa0 Compare January 12, 2026 20:05
@t-nelson t-nelson marked this pull request as ready for review January 12, 2026 20:05
@t-nelson (Author)

ok this is ready so far as i'm concerned. definitely recommend per-commit review. feel free to suggest a different batching for PRing the commits


t-nelson commented Jan 12, 2026

fixing the platform stuff

done

@t-nelson (Author)

please stop with the irrelevant pedantic shit. look at the pr title. that's what i'm trying to do here. if you can't stay focused i don't want your review

@puhtaytow

> please stop with the irrelevant pedantic shit. look at the pr title. that's what i'm trying to do here. if you can't stay focused i don't want your review

there is no more pedantic shit to point out so i'll focus on the code now, but it shouldn't hurt to fix typos etc. at the end 👍

@gregcusack previously approved these changes Jan 15, 2026

@gregcusack left a comment

lgtm from a technical point of view. a few nits. but looks good. alessandro may have some comments on keeping the xdp/ generic enough for aya. thanks for doing this!

@t-nelson (Author)

> Just a note - if we merge this we would be unable to backport any xdp PRs anymore unless this one is also backported.

this is what we get for merging features without cleaning up our mess first

@t-nelson (Author)

i'll give @alessandrod 'til eod tomorrow to take a look, then merge


Copilot AI left a comment


Pull request overview

This pull request moves XDP initialization steps that require elevated Linux capabilities to happen early in the validator process startup, before any threads are spawned. This prevents capability inheritance by child threads, reducing the attack surface and security risk.

Changes:

  • Refactored XDP initialization into a builder pattern that separates capability-requiring operations from runtime operations
  • Moved XDP initialization to happen immediately after node creation and before validator thread spawning
  • Added proper capability management including clearing inheritable and permitted capability sets after initialization

Reviewed changes

Copilot reviewed 11 out of 14 changed files in this pull request and generated 6 comments.

Summary per file:

  • xdp/src/umem.rs: Added OwnedUmem type for memory ownership and moved methods to traits for better abstraction
  • xdp/src/tx_loop.rs: Refactored tx_loop into builder pattern (TxLoopBuilder) and runtime execution (TxLoop)
  • xdp/src/device.rs: Added Send trait implementations for thread-safe types with safety documentation
  • xdp/Cargo.toml: Removed caps dependency (moved to turbine)
  • validator/src/commands/run/execute.rs: Moved XDP initialization to early startup before thread spawning
  • validator/src/admin_rpc_service.rs: Updated test to pass new None parameter to Validator::new
  • turbine/src/xdp.rs: Split XdpRetransmitter into XdpRetransmitBuilder (caps phase) and XdpRetransmitter (runtime)
  • turbine/Cargo.toml: Added aya dependency
  • test-validator/src/lib.rs: Updated test to pass new None parameter
  • local-cluster/src/local_cluster.rs: Updated tests to pass new None parameter
  • core/src/validator.rs: Added maybe_partial_xdp_retransmitter parameter to Validator::new
  • Cargo.lock, dev-bins/Cargo.lock, programs/sbf/Cargo.lock: Updated dependency graphs to reflect the caps move



@alessandrod alessandrod left a comment


Thanks for doing this work!

It was SO EASY to review since the changes are so nicely split into atomic commits. We really need to start doing this more; it saves so much time.


@alexpyattaev alexpyattaev left a comment


We probably want to drop the caps whether or not XDP retransmit is enabled. Operators are likely to have the binary marked with extra capabilities (build/startup scripts tend to set caps unconditionally), and if they don't enable XDP retransmit for whatever reason, the capabilities will never get dropped.

As a secondary layer of protection, we may also set PR_SET_NO_NEW_PRIVS to be certain (should probably be another PR though).

@t-nelson (Author)

ok addressed all the relevant review comments. ci should be good to go

Comment on lines 148 to 151
// since we aren't necessarily allocating from the thread that we intend to run on,
// temporarily switch to the target cpu to ensure that the region is allocated to the
// correct numa node
set_cpu_affinity([cpu_id]).unwrap();
@t-nelson (Author)

we can possibly numa(7) our way out of this later. i only skimmed, but i think you can induce migration after the fact


@alessandrod alessandrod left a comment


oops, one last round trip

.cloned()
.collect::<HashSet<_>>();
if !reserved.is_empty() {
let available = core_affinity::get_core_ids()


it turns out... this returns the current affinity mask, so we have to call it before the block above, or this ends up doing set_difference([i], [..., i]) = [] and the validator doesn't start

I've confirmed that if I move this call above the block then the validator
starts and XDP seems to work correctly


@t-nelson t-nelson left a comment


ok took care of moving the main thread back to its initial cpu


// try to allocate huge pages first, then fall back to regular pages
const HUGE_2MB: usize = 2 * 1024 * 1024;
set_cpu_affinity([cpu_id]).unwrap();
@t-nelson (Author)

for the first refactor, we're still constructing the TxLoopBuilder on the target thread. set cpu affinity here, even though we'll do it again below in TxLoop::new

Comment on lines 210 to 213
// since we aren't necessarily allocating from the thread that we intend to run on,
// temporarily switch to the target cpu for each TxLoop to ensure that the Umem region
// is allocated to the correct numa node
let this_cpu = get_cpu().expect("linux implements sched_getcpu");
@t-nelson (Author)

next we pull TxLoopBuilder init out to main thread. stash initial main thread cpu and restore it once all of the tx loops have been started

Comment on lines +170 to +183
// since we aren't necessarily allocating from the thread that we intend to run on,
// temporarily switch to the target cpu for each TxLoop to ensure that the Umem region
// is allocated to the correct numa node
let this_cpu = get_cpu().expect("linux implements sched_getcpu");
let tx_loop_builders = cpus
.into_iter()
.zip(std::iter::repeat_with(|| tx_loop_config.clone()))
.enumerate()
.map(|(i, (cpu_id, config))| {
TxLoopBuilder::new(cpu_id, QueueId(i as u64), config, &dev)
})
.collect::<Vec<_>>();
// migrate main thread back off of the last xdp reserved cpu
set_cpu_affinity([this_cpu]).unwrap();
@t-nelson (Author)

retain the main thread cpu restoration after isolating the logic that requires caps being raised


set_cpu_affinity([this_cpu]) seems incorrect? this cpu != all the possible cpus in the original mask

@t-nelson (Author)

we haven't applied the mask yet. i moved that back



Development

Successfully merging this pull request may close these issues.

XDP: drop CAP_NET_ADMIN when possible

6 participants