Skip to content

perf(mavlink): remove mutex from forward_message hot path#26927

Open
mrpollo wants to merge 2 commits intomainfrom
mrpollo/mavlink-forwarding-bench
Open

perf(mavlink): remove mutex from forward_message hot path#26927
mrpollo wants to merge 2 commits intomainfrom
mrpollo/mavlink-forwarding-bench

Conversation

@mrpollo
Copy link
Copy Markdown
Contributor

@mrpollo mrpollo commented Apr 1, 2026

Remove mavlink_module_mutex from forward_message() to eliminate per-message lock overhead in the MAVLink forwarding hot path.

Spun off from discussion in #26894 with @dakejahl.

The mutex is unnecessary because instances can only be destroyed via destroy_all_instances(), which stops all receiver threads before modifying the array. Pointer reads from mavlink_module_instances[] are naturally atomic on aligned architectures. pass_message() has its own per-instance _message_buffer_mutex for the ring buffer.

Benchmarked on three platforms (200 Hz, 30-60s, clock_gettime(CLOCK_MONOTONIC)):

Platform With Mutex Without Improvement
SITL (macOS arm64) 648 ns/msg 563 ns/msg -13%
CubeOrange+ (STM32H7) 4,521 ns/msg 2,948 ns/msg -35%
Pixhawk 6X-RT (i.MX RT1176) 791 ns/msg 162 ns/msg -80%

Full methodology and raw data: https://gist.github.com/mrpollo/167d4ec7436fb773bf193be6c2800639

Also includes a MAVSDK C++ benchmark tool for reproducing the measurements.

Remove mavlink_module_mutex from forward_message() to eliminate
per-message lock overhead in the forwarding path.

The mutex is unnecessary because:
- Instances can only be destroyed via destroy_all_instances(), which
  stops all threads before modifying the array
- Pointer reads from mavlink_module_instances[] are naturally atomic
  on aligned architectures (ARM, x86)
- pass_message() has its own per-instance _message_buffer_mutex

Benchmark results (SITL SIH, 2 connections, wall-clock timing):

  200 Hz (400 msg/s): avg 648 ns -> 563 ns (-13%)
  500 Hz (1000 msg/s): avg 469 ns -> 407 ns (-13%)
  Mutex lock section: avg 327 ns -> 262 ns (-20%)

Benchmark gist: https://gist.github.com/mrpollo/167d4ec7436fb773bf193be6c2800639

Also includes a MAVSDK C++ traffic generator tool for reproducing
the benchmark.

Signed-off-by: Ramon Roche <mrpollo@gmail.com>
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR aims to improve MAVLink forwarding performance by removing the global mavlink_module_mutex from the Mavlink::forward_message() hot path, and adds a standalone MAVSDK-based benchmark tool + runner script to measure forwarding-path performance and related perf counters.

Changes:

  • Remove the mavlink_module_mutex lock in Mavlink::forward_message() to reduce per-message overhead.
  • Add a MAVSDK C++ benchmark that floods broadcast MAVLink messages across multiple UDP connections.
  • Add a helper shell script and standalone CMake file to build and run the benchmark against PX4 SITL.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 5 comments.

File Description
src/modules/mavlink/mavlink_main.cpp Removes the forwarding-path mutex and updates inline rationale.
test/mavsdk_tests/test_mavlink_forwarding_bench.cpp Adds a multi-connection sender benchmark for stressing forward_message() contention/perf.
test/mavsdk_tests/run_forwarding_bench.sh Adds an end-to-end script to build SITL + benchmark and gather logs/results.
test/mavsdk_tests/CMakeLists_bench.txt Adds a standalone CMake definition for building the benchmark binary.
Comments suppressed due to low confidence (1)

src/modules/mavlink/mavlink_main.cpp:525

  • Removing the global lock here introduces a data race and potential use-after-free: other code paths still modify and delete entries in mavlink_module_instances[] while receiver threads can be running (e.g. Mavlink::stop_command() removes an instance and deletes it under mavlink_module_mutex). Iterating the raw pointer array without synchronization can concurrently observe a pointer that is being cleared/freed, and calling methods on it is undefined behavior.

If the goal is to eliminate per-message lock overhead, consider switching mavlink_module_instances[] to an atomic pointer array (with acquire/release semantics) and using a safe reclamation strategy (or restricting deletion to destroy_all_instances() by changing stop_command() accordingly). Otherwise this needs to keep synchronization around the iteration.

	// No mutex needed: instances can't be destroyed while receiver threads run
	// (destroy_all_instances() stops all threads first). Pointer reads are
	// naturally atomic on aligned architectures. pass_message() has its own
	// per-instance _message_buffer_mutex for the ring buffer.

	for (Mavlink *inst : mavlink_module_instances) {
		if (inst && (inst != self) && (inst->get_forwarding_on())) {
			// Pass message only if target component was seen before
			if (inst->_receiver.component_was_seen(target_system_id, target_component_id)) {
				inst->pass_message(msg);

Comment on lines +136 to +155
int main(int argc, char *argv[])
{
int duration_sec = 60;
int rate_hz = 200;
int num_connections = 2;
int gcs_port = 14550;
int api_port = 14540;
std::string report_file;

for (int i = 1; i < argc; ++i) {
std::string arg(argv[i]);

if (arg == "--duration" && i + 1 < argc) {
duration_sec = std::atoi(argv[++i]);

} else if (arg == "--rate" && i + 1 < argc) {
rate_hz = std::atoi(argv[++i]);

} else if (arg == "--connections" && i + 1 < argc) {
num_connections = std::atoi(argv[++i]);
Copy link

Copilot AI Apr 1, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rate_hz is parsed from CLI but never validated. If the user passes --rate 0 (or a negative value), 1'000'000 / rate_hz will divide by zero (or produce a negative interval), which will break the sender timing loop.

Please validate rate_hz > 0 (and similarly validate duration_sec > 0, num_connections > 0) and fail with a clear error message before starting threads.

Copilot uses AI. Check for mistakes.
Comment on lines +206 to +215
for (size_t i = 0; i < ports.size(); ++i) {
auto config = Mavsdk::Configuration(
static_cast<uint8_t>(245 + i), // unique sysid per connection
MAV_COMP_ID_MISSIONPLANNER,
false // not autopilot
);
auto mavsdk = std::make_unique<Mavsdk>(config);

std::string url = "udp://:" + std::to_string(ports[i]);
std::cout << "Connecting to " << url << " (sysid=" << (245 + i) << ")..." << std::endl;
Copy link

Copilot AI Apr 1, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

system_id is derived as static_cast<uint8_t>(245 + i). For larger --connections this silently wraps past 255, producing duplicate sysids and potentially confusing routing/forwarding behavior.

Consider bounding num_connections to a safe range (e.g. so sysid stays within 1..255 and unique), or choose sysids from a configurable base with validation.

Suggested change
for (size_t i = 0; i < ports.size(); ++i) {
auto config = Mavsdk::Configuration(
static_cast<uint8_t>(245 + i), // unique sysid per connection
MAV_COMP_ID_MISSIONPLANNER,
false // not autopilot
);
auto mavsdk = std::make_unique<Mavsdk>(config);
std::string url = "udp://:" + std::to_string(ports[i]);
std::cout << "Connecting to " << url << " (sysid=" << (245 + i) << ")..." << std::endl;
constexpr uint8_t sysid_base = 245;
const size_t max_unique_sysids = static_cast<size_t>(255u - sysid_base + 1u);
if (ports.size() > max_unique_sysids) {
std::cerr << "Requested " << ports.size()
<< " connections, but only " << max_unique_sysids
<< " unique system IDs are available starting at "
<< static_cast<unsigned>(sysid_base) << "." << std::endl;
return 1;
}
for (size_t i = 0; i < ports.size(); ++i) {
const uint8_t system_id = static_cast<uint8_t>(sysid_base + static_cast<uint8_t>(i));
auto config = Mavsdk::Configuration(
system_id, // unique sysid per connection
MAV_COMP_ID_MISSIONPLANNER,
false // not autopilot
);
auto mavsdk = std::make_unique<Mavsdk>(config);
std::string url = "udp://:" + std::to_string(ports[i]);
std::cout << "Connecting to " << url << " (sysid="
<< static_cast<unsigned>(system_id) << ")..." << std::endl;

Copilot uses AI. Check for mistakes.
Comment on lines +242 to +247
std::cout << " OK" << std::endl;

auto system = mavsdk->systems().front();
auto *passthrough = new MavlinkPassthrough(system);
passthroughs.push_back(passthrough);
mavsdks.push_back(std::move(mavsdk));
Copy link

Copilot AI Apr 1, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

passthrough is allocated with new and stored as a raw pointer. There are multiple early return 1; paths above, so this can leak on error, and it makes ownership harder to reason about.

Prefer storing MavlinkPassthrough in a std::unique_ptr vector (or even directly in a vector) so cleanup is automatic on all exit paths.

Copilot uses AI. Check for mistakes.
Comment on lines +93 to +101
echo ""
echo "=== Step 5: Collect perf counters ==="

# Send 'perf' command to PX4 via mavlink shell (or direct if available)
sleep 2

echo ""
echo "=== Step 6: Stop PX4 and extract perf data ==="
kill -INT "$PX4_PID" 2>/dev/null || true
Copy link

Copilot AI Apr 1, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The script claims to “collect perf counter data” and has a “Step 5: Collect perf counters”, but it never actually issues the perf command to the PX4 console/MAVLink shell. As written, ${RESULT_PREFIX}_perf.txt will usually be missing unless PX4 happens to dump perf counters on exit.

Either implement the perf collection step (e.g. via mavlink shell/pxh) or update the step text and header comment so expectations match what the script really does.

Copilot uses AI. Check for mistakes.
Comment on lines +100 to +108
echo "=== Step 6: Stop PX4 and extract perf data ==="
kill -INT "$PX4_PID" 2>/dev/null || true
sleep 3

# PX4 dumps perf counters on exit in some configurations.
# Extract perf lines from the log.
if grep -q "mavlink: fwd_msg" "${RESULT_PREFIX}_px4.log"; then
echo "Perf counters from PX4 log:"
grep -E "mavlink: (fwd_|comp_seen|unsigned_cb)" "${RESULT_PREFIX}_px4.log" | tee "${RESULT_PREFIX}_perf.txt"
Copy link

Copilot AI Apr 1, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After sending SIGINT to PX4, the script sleeps for a fixed 3 seconds but never waits for the process to exit. If shutdown takes longer, the log may be incomplete when you grep for perf lines, and the PX4 process can be left running in the background.

Consider wait "$PX4_PID" (optionally with a timeout/fallback SIGKILL) before extracting results.

Copilot uses AI. Check for mistakes.
@mrpollo mrpollo requested a review from dakejahl April 2, 2026 05:44
@mrpollo mrpollo marked this pull request as ready for review April 2, 2026 05:44
- Validate rate, duration, and connections args (reject <= 0)
- Cap connections at 10 to prevent sysid overflow past 255
- Use unique_ptr for MavlinkPassthrough (no manual delete)
- Fix runner script: document that perf is read via MAVLink shell,
  wait for PX4 process on shutdown

Signed-off-by: Ramon Roche <mrpollo@gmail.com>
@github-actions
Copy link
Copy Markdown

github-actions bot commented Apr 2, 2026

💡 Commit messages could be improved

Not blocking, but these commit messages could use some cleanup.

Commit Message Suggestion
ed70e1d91c fix(mavlink): address review feedback on benchmark tool Review-response commit

See the commit message convention for details.


This comment will be automatically removed once the issues are resolved.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants