Skip to content

Segfault in crosswalk module due to non-strict std::sort comparator #12517

@Ching-Hsiang-18

Description

@Ching-Hsiang-18

Checklist

  • I've read the contribution guidelines.
  • I've searched other issues and no duplicate issues were found.
  • I'm convinced that this is not my fault but a bug.

Description

sortCrosswalksByDistance in autoware_behavior_velocity_crosswalk_module uses a comparator lambda whose (!l1_ep || !l2_ep) -> return true branch violates the strict-weak-ordering precondition required by std::sort. For N≥17, some valid input arrangements with no-endpoint entries trigger libstdc++'s introsort path, where __unguarded_partition can read past the end of the vector and the behavior_planning_container process terminates with SIGSEGV. In closed-loop operation, this kills the entire planning pipeline mid-drive and the vehicle stalls on the road with no automatic recovery.

Expected behavior

sortCrosswalksByDistance should provide a strict weak ordering for every pair of crosswalk lanelets passed to std::sort, including lanelets whose polygons conflict with the ego road lanelet but do not intersect the ego path.

In the closed-loop planning scenario, behavior_planning_container should not crash when the ego vehicle approaches the active crosswalk. The planner should continue publishing trajectories, handle the crossing pedestrian normally, and complete the route instead of dropping out of autonomous operation because of a process-level SIGSEGV.

Actual behavior

We reproduced the failure in planning_simulator using the synthetic Lanelet2 map shown below. The map contains one long ego road lanelet, one active ego-path-crossing crosswalk (c_active), and 16 additional pedestrian crosswalk polygons that conflict with the same road lanelet but do not intersect the ego path.

Image

On the buggy build, behavior_planning_container exits with SIGSEGV (exit code -11) the moment ego's planning horizon first reaches the active crosswalk, on the very first invocation of the sort. The process never recovers, no further trajectories are published, and ego coasts to a halt roughly 50 m short of the crosswalk while /api/operation_mode drops out of AUTONOMOUS. With only the comparator patched (no other changes), the same scenario completes the route normally.

The end-to-end behavior is shown below: before the patch, the planner process dies and the vehicle stalls before the crosswalk; after the comparator fix, the same scenario yields to the crossing pedestrian and completes the route.

Image

Steps to reproduce

  1. Build/run Autoware Universe with the current implementation of autoware_behavior_velocity_crosswalk_module::sortCrosswalksByDistance.
  2. Create a situation where clampAttentionRangeByNeighborCrosswalks receives at least 17 pedestrian crosswalk lanelets from conflictingInGraph(road_lanelet, PEDESTRIAN_GRAPH_ID).
  3. Ensure that the active crosswalk intersects the ego path, but at least one of the returned conflicting crosswalks does not have a path-intersection endpoint on the ego path.
  4. Let the planning horizon reach the active crosswalk so that sortCrosswalksByDistance is invoked.
  5. On libstdc++ 11, the invalid comparator can trigger an out-of-bounds read in std::sort's introsort path, causing behavior_planning_container to exit with SIGSEGV.

A minimal standalone C++ reproducer is included below. It reproduces the same
strict-weak-ordering violation without requiring a full Autoware launch.

Versions

  • Autoware Universe: bug reproduced on 14ae8fd36cf50158ed6153b29154d9ba0b320543; buggy code unchanged in current main d40e111069d247db426fd3d1266b996515862f98 (lines 144-146).
  • Autoware Core: 1.7.2 (per repositories/autoware.repos).
  • ROS 2: Humble (ghcr.io/autowarefoundation/autoware:universe-devel, image sha256:eaa0c1919ade009fdd7f32a7509ad9d508197eb6ab16e0ba3517f3a3baf80f59).
  • Compiler: g++ 11.4.0 (Ubuntu 22.04 / 11.4.0-1ubuntu1~22.04.3), libstdc++-11.
  • Architecture: x86_64.

Possible causes

Affected code

planning/behavior_velocity_planner/autoware_behavior_velocity_crosswalk_module/src/scene_crosswalk.cpp#L135-L167
(sortCrosswalksByDistance, comparator at lines 139-156, offending branch at lines 144-146):

const auto compare = [&](const lanelet::ConstLanelet & l1, const lanelet::ConstLanelet & l2) {
    const auto l1_end_points_on_crosswalk =
      getPathEndPointsOnCrosswalk(ego_path, l1.polygon2d().basicPolygon(), ego_pos);
    const auto l2_end_points_on_crosswalk =
      getPathEndPointsOnCrosswalk(ego_path, l2.polygon2d().basicPolygon(), ego_pos);

    if (!l1_end_points_on_crosswalk || !l2_end_points_on_crosswalk) {
        return true;                                                // <-- buggy branch
    }
    // ... arc-length comparison
};

For any pair where either operand lacks a path-intersection endpoint, the comparator returns true regardless of operand order, so compare(a, b) == compare(b, a) == true. This breaks antisymmetry; for two no-endpoint operands it additionally breaks irreflexivity (compare(x, x) == true).

Trigger conditions

The vector passed into std::sort is built by clampAttentionRangeByNeighborCrosswalks from conflictingInGraph(road_lanelet, PEDESTRIAN_GRAPH_ID). SIGSEGV reproduces deterministically when all of these hold:

  1. The active CrosswalkModule's ego path actually crosses its own polygon (otherwise the module early-exits in modifyPathVelocity before reaching the sort).
  2. conflictingInGraph returns ≥17 crosswalks for the ego lanelet — libstdc++'s std::sort switches from insertion sort to introsort at N=17. In our tests, N≤16 stays on the insertion-sort path and does not trigger the observed OOB, while N≥17 reaches the introsort path where the UB manifests.
  3. At least one returned crosswalk has no endpoint on the ego path (its polygon overlaps the ego lanelet polygon but the path polyline does not enter it).

These are realistic in production: any road lanelet long enough to overlap a dense set of pedestrian polygons satisfies (2) and (3). The Nishishinjuku sample map ships road lanelets up to 331 m (p99 = 225 m); whether 17+ conflicting pedestrian polygons land on a single lanelet depends on the mapping vendor's segmentation choices for long urban arterials.

A 2000-trial Monte-Carlo scan over (has-ep, no-ep) splits at fixed N=17 shows the trigger is sharply deterministic at extreme splits and probabilistic in between (numbers are crash rate over 2000 random shuffle seeds per split, libstdc++ from g++ 11.4):

has-ep no-ep crash rate
0 17 100.00%
1 16 100.00%
7 10 88.05%
13 4 47.85%
16 1 14.30%
17 0 0.00%

The "1 has-ep + 16 no-ep" case is interesting because it is the smallest configuration that combines deterministic crashing (every shuffle seed crashes) with the natural runtime semantic of "ego is approaching exactly one active crosswalk and 16 others happen to be polygon-conflicting".

Standalone reproducer

The following program reproduces the SIGSEGV deterministically.

// Minimal standalone reproducer for the SWO violation in
// sortCrosswalksByDistance.
//
// The buggy comparator returns true whenever EITHER operand lacks a
// path-intersection endpoint. For all-no-endpoint arrays, this makes
// compare(x, y) = true for every pair, violating irreflexivity, which
// libstdc++'s __unguarded_partition treats as UB and reads past array
// bounds.
//
// Build (AddressSanitizer):
//   g++ -std=c++17 -O1 -g -fsanitize=address -fno-omit-frame-pointer \
//       -o test_swo_oob_asan test_swo_oob.cpp
// Build (plain):
//   g++ -std=c++17 -O1 -g -o test_swo_oob test_swo_oob.cpp
//
// Usage:
//   ./test_swo_oob_asan <n> <buggy|patched>
//
// Expected:
//   buggy + n>=17 -> ASan heap-buffer-overflow or SIGSEGV
//   patched + any n -> clean exit
//   buggy  + n<=16 -> clean exit (insertion-sort path, no OOB)

#include <algorithm>
#include <cstdio>
#include <cstdlib>
#include <optional>
#include <string>
#include <vector>

struct Crosswalk {
  int id;
  std::optional<double> endpoint_dist;  // nullopt == no-ep on ego path
};

static bool buggy_compare(const Crosswalk& a, const Crosswalk& b) {
  if (!a.endpoint_dist.has_value() || !b.endpoint_dist.has_value()) {
    return true;
  }
  return *a.endpoint_dist < *b.endpoint_dist;
}

static bool patched_compare(const Crosswalk& a, const Crosswalk& b) {
  if (!a.endpoint_dist.has_value() && !b.endpoint_dist.has_value()) {
    return a.id < b.id;
  }
  if (!a.endpoint_dist.has_value()) return false;
  if (!b.endpoint_dist.has_value()) return true;
  return *a.endpoint_dist < *b.endpoint_dist;
}

static std::vector<Crosswalk> make_all_noep(int n) {
  std::vector<Crosswalk> v(n);
  for (int i = 0; i < n; ++i) {
    v[i].id = 3000000 + i;
    v[i].endpoint_dist = std::nullopt;
  }
  return v;
}

int main(int argc, char** argv) {
  int n = (argc > 1) ? std::atoi(argv[1]) : 20;
  std::string which = (argc > 2) ? argv[2] : "buggy";

  auto xs = make_all_noep(n);
  std::fprintf(stderr, "[test] n=%d comparator=%s\n", n, which.c_str());

  if (which == "buggy") {
    std::sort(xs.begin(), xs.end(), buggy_compare);
  } else if (which == "patched") {
    std::sort(xs.begin(), xs.end(), patched_compare);
  } else {
    std::fprintf(stderr, "[test] unknown comparator: %s\n", which.c_str());
    return 2;
  }

  std::fprintf(stderr, "[test] sort returned; first.id=%d last.id=%d\n",
               xs.front().id, xs.back().id);
  return 0;
}

Build with AddressSanitizer for a clear heap-buffer-overflow diagnostic, or plain for a raw SIGSEGV (exit 139):

g++ -std=c++17 -O1 -g -fsanitize=address -fno-omit-frame-pointer \
    -o test_swo_oob_asan test_swo_oob.cpp
./test_swo_oob_asan 17 buggy        # crashes
./test_swo_oob_asan 17 patched      # clean exit
./test_swo_oob_asan 16 buggy        # clean exit (insertion-sort path)

Additional context

Proposed fix

Restore strict weak ordering by replacing the || branch with explicit no-endpoint handling:

if (!l1_end_points_on_crosswalk && !l2_end_points_on_crosswalk) {
    return l1.id() < l2.id();    // both no-ep: order by lanelet id
}
if (!l1_end_points_on_crosswalk) {
    return false;                 // single no-ep sorts after has-ep
}
if (!l2_end_points_on_crosswalk) {
    return true;
}
const auto dist_l1 = calcSignedArcLength(...);
const auto dist_l2 = calcSignedArcLength(...);
return dist_l1 < dist_l2;

This restores irreflexivity (compare(x, x) = false for all x), antisymmetry (at most one of compare(a, b) / compare(b, a) returns true), and transitivity, so introsort's unguarded partition stays in bounds for any input. Validated against the standalone test (clean exit
across 2000 shuffle seeds for every (has-ep, no-ep) split) and in the closed-loop simulator (planner survives, ego yields to a crossing pedestrian, completes the route).

Optional follow-up: API-level concern

sortCrosswalksByDistance's only caller uses the result solely to find c_active's immediate prev / next neighbor. Sorting an entire O(n) set including no-endpoint entries (whose ordering is geometrically ill-defined) is more work than necessary; replacing the sort with a linear prev / next scan over has-endpoint entries only would eliminate the SWO question entirely and avoid invoking the comparator on ill-defined pairs in the first place. Happy to open a separate discussion if that direction is of interest, but the minimal comparator fix above is sufficient on its own and is what we recommend merging first.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    Status

    To Triage

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions