perf(mavlink): remove mutex from forward_message hot path by mrpollo · Pull Request #26927 · PX4/PX4-Autopilot

mrpollo · 2026-04-01T03:14:57Z

Remove mavlink_module_mutex from forward_message() to eliminate per-message lock overhead in the MAVLink forwarding hot path.

Spun off from discussion in #26894 with @dakejahl.

The mutex is unnecessary because instances can only be destroyed via destroy_all_instances(), which stops all receiver threads before modifying the array. Pointer reads from mavlink_module_instances[] are naturally atomic on aligned architectures. pass_message() has its own per-instance _message_buffer_mutex for the ring buffer.

Benchmarked on three platforms (200 Hz, 30-60s, clock_gettime(CLOCK_MONOTONIC)):

Platform	With Mutex	Without	Improvement
SITL (macOS arm64)	648 ns/msg	563 ns/msg	-13%
CubeOrange+ (STM32H7)	4,521 ns/msg	2,948 ns/msg	-35%
Pixhawk 6X-RT (i.MX RT1176)	791 ns/msg	162 ns/msg	-80%

Full methodology and raw data: https://gist.github.com/mrpollo/167d4ec7436fb773bf193be6c2800639

Also includes a MAVSDK C++ benchmark tool for reproducing the measurements.

Remove mavlink_module_mutex from forward_message() to eliminate per-message lock overhead in the forwarding path. The mutex is unnecessary because: - Instances can only be destroyed via destroy_all_instances(), which stops all threads before modifying the array - Pointer reads from mavlink_module_instances[] are naturally atomic on aligned architectures (ARM, x86) - pass_message() has its own per-instance _message_buffer_mutex Benchmark results (SITL SIH, 2 connections, wall-clock timing): 200 Hz (400 msg/s): avg 648 ns -> 563 ns (-13%) 500 Hz (1000 msg/s): avg 469 ns -> 407 ns (-13%) Mutex lock section: avg 327 ns -> 262 ns (-20%) Benchmark gist: https://gist.github.com/mrpollo/167d4ec7436fb773bf193be6c2800639 Also includes a MAVSDK C++ traffic generator tool for reproducing the benchmark. Signed-off-by: Ramon Roche <mrpollo@gmail.com>

Copilot

Pull request overview

This PR aims to improve MAVLink forwarding performance by removing the global mavlink_module_mutex from the Mavlink::forward_message() hot path, and adds a standalone MAVSDK-based benchmark tool + runner script to measure forwarding-path performance and related perf counters.

Changes:

Remove the mavlink_module_mutex lock in Mavlink::forward_message() to reduce per-message overhead.
Add a MAVSDK C++ benchmark that floods broadcast MAVLink messages across multiple UDP connections.
Add a helper shell script and standalone CMake file to build and run the benchmark against PX4 SITL.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 5 comments.

File	Description
`src/modules/mavlink/mavlink_main.cpp`	Removes the forwarding-path mutex and updates inline rationale.
`test/mavsdk_tests/test_mavlink_forwarding_bench.cpp`	Adds a multi-connection sender benchmark for stressing `forward_message()` contention/perf.
`test/mavsdk_tests/run_forwarding_bench.sh`	Adds an end-to-end script to build SITL + benchmark and gather logs/results.
`test/mavsdk_tests/CMakeLists_bench.txt`	Adds a standalone CMake definition for building the benchmark binary.

Comments suppressed due to low confidence (1)

src/modules/mavlink/mavlink_main.cpp:525

Removing the global lock here introduces a data race and potential use-after-free: other code paths still modify and delete entries in mavlink_module_instances[] while receiver threads can be running (e.g. Mavlink::stop_command() removes an instance and deletes it under mavlink_module_mutex). Iterating the raw pointer array without synchronization can concurrently observe a pointer that is being cleared/freed, and calling methods on it is undefined behavior.

If the goal is to eliminate per-message lock overhead, consider switching mavlink_module_instances[] to an atomic pointer array (with acquire/release semantics) and using a safe reclamation strategy (or restricting deletion to destroy_all_instances() by changing stop_command() accordingly). Otherwise this needs to keep synchronization around the iteration.

	// No mutex needed: instances can't be destroyed while receiver threads run
	// (destroy_all_instances() stops all threads first). Pointer reads are
	// naturally atomic on aligned architectures. pass_message() has its own
	// per-instance _message_buffer_mutex for the ring buffer.

	for (Mavlink *inst : mavlink_module_instances) {
		if (inst && (inst != self) && (inst->get_forwarding_on())) {
			// Pass message only if target component was seen before
			if (inst->_receiver.component_was_seen(target_system_id, target_component_id)) {
				inst->pass_message(msg);

Copilot · 2026-04-01T03:20:37Z

test/mavsdk_tests/test_mavlink_forwarding_bench.cpp

+int main(int argc, char *argv[])
+{
+	int duration_sec = 60;
+	int rate_hz = 200;
+	int num_connections = 2;
+	int gcs_port = 14550;
+	int api_port = 14540;
+	std::string report_file;
+
+	for (int i = 1; i < argc; ++i) {
+		std::string arg(argv[i]);
+
+		if (arg == "--duration" && i + 1 < argc) {
+			duration_sec = std::atoi(argv[++i]);
+
+		} else if (arg == "--rate" && i + 1 < argc) {
+			rate_hz = std::atoi(argv[++i]);
+
+		} else if (arg == "--connections" && i + 1 < argc) {
+			num_connections = std::atoi(argv[++i]);


rate_hz is parsed from CLI but never validated. If the user passes --rate 0 (or a negative value), 1'000'000 / rate_hz will divide by zero (or produce a negative interval), which will break the sender timing loop.

Please validate rate_hz > 0 (and similarly validate duration_sec > 0, num_connections > 0) and fail with a clear error message before starting threads.

Copilot · 2026-04-01T03:20:38Z

test/mavsdk_tests/test_mavlink_forwarding_bench.cpp

+	for (size_t i = 0; i < ports.size(); ++i) {
+		auto config = Mavsdk::Configuration(
+				      static_cast<uint8_t>(245 + i),  // unique sysid per connection
+				      MAV_COMP_ID_MISSIONPLANNER,
+				      false                           // not autopilot
+			      );
+		auto mavsdk = std::make_unique<Mavsdk>(config);
+
+		std::string url = "udp://:" + std::to_string(ports[i]);
+		std::cout << "Connecting to " << url << " (sysid=" << (245 + i) << ")..." << std::endl;


system_id is derived as static_cast<uint8_t>(245 + i). For larger --connections this silently wraps past 255, producing duplicate sysids and potentially confusing routing/forwarding behavior.

Consider bounding num_connections to a safe range (e.g. so sysid stays within 1..255 and unique), or choose sysids from a configurable base with validation.

Suggested change

for (size_t i = 0; i < ports.size(); ++i) {

auto config = Mavsdk::Configuration(

static_cast<uint8_t>(245 + i), // unique sysid per connection

MAV_COMP_ID_MISSIONPLANNER,

false // not autopilot

);

auto mavsdk = std::make_unique<Mavsdk>(config);

std::string url = "udp://:" + std::to_string(ports[i]);

std::cout << "Connecting to " << url << " (sysid=" << (245 + i) << ")..." << std::endl;

constexpr uint8_t sysid_base = 245;

const size_t max_unique_sysids = static_cast<size_t>(255u - sysid_base + 1u);

if (ports.size() > max_unique_sysids) {

std::cerr << "Requested " << ports.size()

<< " connections, but only " << max_unique_sysids

<< " unique system IDs are available starting at "

<< static_cast<unsigned>(sysid_base) << "." << std::endl;

return 1;

}

for (size_t i = 0; i < ports.size(); ++i) {

const uint8_t system_id = static_cast<uint8_t>(sysid_base + static_cast<uint8_t>(i));

auto config = Mavsdk::Configuration(

system_id, // unique sysid per connection

MAV_COMP_ID_MISSIONPLANNER,

false // not autopilot

);

auto mavsdk = std::make_unique<Mavsdk>(config);

std::string url = "udp://:" + std::to_string(ports[i]);

std::cout << "Connecting to " << url << " (sysid="

<< static_cast<unsigned>(system_id) << ")..." << std::endl;

Copilot · 2026-04-01T03:20:38Z

test/mavsdk_tests/test_mavlink_forwarding_bench.cpp

+		std::cout << " OK" << std::endl;
+
+		auto system = mavsdk->systems().front();
+		auto *passthrough = new MavlinkPassthrough(system);
+		passthroughs.push_back(passthrough);
+		mavsdks.push_back(std::move(mavsdk));


passthrough is allocated with new and stored as a raw pointer. There are multiple early return 1; paths above, so this can leak on error, and it makes ownership harder to reason about.

Prefer storing MavlinkPassthrough in a std::unique_ptr vector (or even directly in a vector) so cleanup is automatic on all exit paths.

Copilot · 2026-04-01T03:20:38Z

test/mavsdk_tests/run_forwarding_bench.sh

+echo ""
+echo "=== Step 5: Collect perf counters ==="
+
+# Send 'perf' command to PX4 via mavlink shell (or direct if available)
+sleep 2
+
+echo ""
+echo "=== Step 6: Stop PX4 and extract perf data ==="
+kill -INT "$PX4_PID" 2>/dev/null || true


The script claims to “collect perf counter data” and has a “Step 5: Collect perf counters”, but it never actually issues the perf command to the PX4 console/MAVLink shell. As written, ${RESULT_PREFIX}_perf.txt will usually be missing unless PX4 happens to dump perf counters on exit.

Either implement the perf collection step (e.g. via mavlink shell/pxh) or update the step text and header comment so expectations match what the script really does.

Copilot · 2026-04-01T03:20:38Z

test/mavsdk_tests/run_forwarding_bench.sh

+echo "=== Step 6: Stop PX4 and extract perf data ==="
+kill -INT "$PX4_PID" 2>/dev/null || true
+sleep 3
+
+# PX4 dumps perf counters on exit in some configurations.
+# Extract perf lines from the log.
+if grep -q "mavlink: fwd_msg" "${RESULT_PREFIX}_px4.log"; then
+    echo "Perf counters from PX4 log:"
+    grep -E "mavlink: (fwd_|comp_seen|unsigned_cb)" "${RESULT_PREFIX}_px4.log" | tee "${RESULT_PREFIX}_perf.txt"


After sending SIGINT to PX4, the script sleeps for a fixed 3 seconds but never waits for the process to exit. If shutdown takes longer, the log may be incomplete when you grep for perf lines, and the PX4 process can be left running in the background.

Consider wait "$PX4_PID" (optionally with a timeout/fallback SIGKILL) before extracting results.

- Validate rate, duration, and connections args (reject <= 0) - Cap connections at 10 to prevent sysid overflow past 255 - Use unique_ptr for MavlinkPassthrough (no manual delete) - Fix runner script: document that perf is read via MAVLink shell, wait for PX4 process on shutdown Signed-off-by: Ramon Roche <mrpollo@gmail.com>

github-actions · 2026-04-02T05:47:26Z

💡 Commit messages could be improved

Not blocking, but these commit messages could use some cleanup.

Commit	Message	Suggestion
`ed70e1d91c`	fix(mavlink): address review feedback on benchmark tool	Review-response commit

See the commit message convention for details.

This comment will be automatically removed once the issues are resolved.

mrpollo requested a review from Copilot April 1, 2026 03:16

Copilot started reviewing on behalf of mrpollo April 1, 2026 03:16 View session

Copilot AI reviewed Apr 1, 2026

View reviewed changes

mrpollo requested a review from dakejahl April 2, 2026 05:44

mrpollo marked this pull request as ready for review April 2, 2026 05:44

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

perf(mavlink): remove mutex from forward_message hot path#26927

perf(mavlink): remove mutex from forward_message hot path#26927
mrpollo wants to merge 2 commits intomainfrom
mrpollo/mavlink-forwarding-bench

mrpollo commented Apr 1, 2026 •

edited

Loading

Uh oh!

Copilot AI left a comment

Uh oh!

Copilot AI Apr 1, 2026

Uh oh!

Copilot AI Apr 1, 2026

Uh oh!

Copilot AI Apr 1, 2026

Uh oh!

Copilot AI Apr 1, 2026

Uh oh!

Copilot AI Apr 1, 2026

Uh oh!

github-actions bot commented Apr 2, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

mrpollo commented Apr 1, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Copilot AI Apr 1, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 1, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 1, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 1, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 1, 2026

Choose a reason for hiding this comment

Uh oh!

github-actions bot commented Apr 2, 2026

💡 Commit messages could be improved

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

mrpollo commented Apr 1, 2026 •

edited

Loading