Commit ed7f125

ui-smoke: only use a readable core from this run for the native backtrace
Testing the crash path (segfault injected into a GUI) surfaced two things in the crash dump helper. First, it globbed /tmp/core* and picked up a stale, root-owned core from an unrelated run, so gdb printed only 'Permission denied'. Second, the non-root case (CI, and local runtests -u) never produces a core at all: we will not sudo to point kernel.core_pattern at a writable dir, so nothing lands.

Restrict the core search to a core the kernel wrote into our own fresh CORE_DIR, or a relative 'core' in the cwd that postdates arming, and require it to be readable. When there is no such core, say so and point at the Python faulthandler traceback in linuxcnc.err, which names the crash site and is the reliable signal in every environment. The native backtrace stays a best-effort extra for the root case.

Verified: an injected GUI segfault now fails the test in ~20s (no hang), logs the Python traceback, and prints a clear 'no readable core dump' note instead of a misleading permission error.
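To see why the non-root case produces no core, it can help to inspect where the kernel would send one. The sketch below is not part of crashdump.sh; `classify_core_pattern` is a hypothetical helper name. It assumes Linux, where a `kernel.core_pattern` value that starts with `|` is piped to a handler program, one that starts with `/` names a fixed path, and anything else is written relative to the crashing process's working directory.

```shell
#!/bin/bash
# Hypothetical helper (not in crashdump.sh): classify a core_pattern value.
classify_core_pattern() {
    case "$1" in
        '')   echo empty ;;     # no filename template set
        '|'*) echo pipe ;;      # piped to a handler program
        /*)   echo absolute ;;  # written to a fixed absolute path
        *)    echo relative ;;  # written relative to the crashing process's cwd
    esac
}

# Report the live settings when the proc file is readable (Linux only).
if [ -r /proc/sys/kernel/core_pattern ]; then
    pattern=$(cat /proc/sys/kernel/core_pattern)
    echo "core_pattern: $pattern ($(classify_core_pattern "$pattern"))"
    echo "core size limit: $(ulimit -c)"
fi
```

A `pipe` result means the core goes to the handler rather than to a file the test can find, and changing the pattern needs root, which matches the non-root behaviour described above.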
1 parent f318204 commit ed7f125

1 file changed

Lines changed: 38 additions & 24 deletions

tests/ui-smoke/_lib/crashdump.sh

@@ -1,11 +1,15 @@
 #!/bin/bash
-# Native crash capture for the UI smoke launchers. A GUI segfault is the
-# failure these tests most need to explain, and it lands in C/C++ (Qt,
-# dbus, GL) where PYTHONFAULTHANDLER stops at the event-loop frame. Arm a
-# core dump before launch; after the run, if the GUI left a core, print a
-# native backtrace into the log so CI shows the faulting frame directly.
-# Source with LIB_DIR set; runs only on the failure path, so green runs
-# pay nothing.
+# Native crash capture for the UI smoke launchers. A GUI segfault lands in
+# C/C++ (Qt, dbus, GL); PYTHONFAULTHANDLER (set in launch-env.sh) prints a
+# Python traceback to linuxcnc.err naming the frame that called in, which
+# is the reliable, environment-independent crash signal and is surfaced in
+# every failure log. This helper adds a best-effort native backtrace on
+# top: arm a core dump before launch, and after the run, if a readable
+# core from this run is present, gdb-print its backtrace. The core only
+# materialises when we can point kernel.core_pattern at a writable dir,
+# which needs root; non-root runs (CI, local -u) keep the Python traceback
+# and skip the native one. Source with LIB_DIR set; runs only on the
+# failure path, so green runs pay nothing.
 
 crashdump_arm() {
     CORE_DIR="$(mktemp -d -t ui-smoke-cores.XXXXXX)"
@@ -22,24 +26,34 @@ crashdump_arm() {
 
 crashdump_report() {
     [ -n "${CORE_DIR:-}" ] || return 0
-    local core
-    # shellcheck disable=SC2012 # mktemp dir, no odd filenames
-    core=$(ls -t "$CORE_DIR"/core* ./core* /tmp/core* 2>/dev/null | head -1)
-    if [ -n "$core" ]; then
+    local c core=""
+    # Only trust a core we know is from this run and can actually read:
+    # one the kernel wrote into our fresh CORE_DIR (root path, where we set
+    # core_pattern), or a relative "core" in the cwd that postdates arming.
+    # A broad /tmp glob would pick up a stale or foreign core (often root-
+    # owned), and gdb would just print "Permission denied".
+    for c in "$CORE_DIR"/core*; do
+        [ -e "$c" ] && [ -r "$c" ] && { core="$c"; break; }
+    done
+    if [ -z "$core" ]; then
+        for c in ./core*; do
+            [ -e "$c" ] && [ -r "$c" ] && [ "$c" -nt "$CORE_DIR" ] && { core="$c"; break; }
+        done
+    fi
+    if [ -n "$core" ] && command -v gdb >/dev/null 2>&1; then
         echo "=== crash: native backtrace ($core) ==="
-        # gdb is expected to be installed by .github/scripts/install-deps.sh
-        # on CI and by the developer locally; the suite does not apt-get.
-        if command -v gdb >/dev/null 2>&1; then
-            # "bt" first: gdb auto-selects the faulting thread on a SIGSEGV
-            # core. "thread apply all bt" after gives the rest.
-            gdb -batch -nx \
-                -ex "bt" \
-                -ex "echo \n=== all threads ===\n" \
-                -ex "thread apply all bt" \
-                "$(command -v python3)" "$core" 2>&1 | head -400
-        else
-            echo "(gdb unavailable; core left at $core)"
-        fi
+        # "bt" first: gdb auto-selects the faulting thread on a SIGSEGV
+        # core. "thread apply all bt" after gives the rest.
+        gdb -batch -nx \
+            -ex "bt" \
+            -ex "echo \n=== all threads ===\n" \
+            -ex "thread apply all bt" \
+            "$(command -v python3)" "$core" 2>&1 | head -400
+    else
+        # No readable core (the common non-root case). The Python
+        # faulthandler traceback in linuxcnc.err already names the crash
+        # site; the native backtrace is only a best-effort extra.
+        echo "=== crash: no readable core dump; see the Python traceback in linuxcnc.err above ==="
     fi
     rm -rf "$CORE_DIR"
 }
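The selection rule in this change can be tried in isolation. The following is a minimal sketch, not code from the commit: `pick_core` is a hypothetical function name, and it takes the core directory as an argument instead of reading the CORE_DIR global. It prints the first readable core in that directory, otherwise the first readable ./core* in the current directory that is newer than the directory itself.

```shell
#!/bin/bash
# Sketch only: pick_core is a hypothetical name, not a function in crashdump.sh.
# $1 is the per-run core directory. Prints the chosen core path and returns 0,
# or prints nothing and returns 1 when no trustworthy core exists.
pick_core() {
    local dir="$1" c
    for c in "$dir"/core*; do
        # An unmatched glob stays literal, so -e filters it out.
        [ -e "$c" ] && [ -r "$c" ] && { echo "$c"; return 0; }
    done
    for c in ./core*; do
        # -nt compares modification times: accept only a core newer than the dir.
        [ -e "$c" ] && [ -r "$c" ] && [ "$c" -nt "$dir" ] && { echo "$c"; return 0; }
    done
    return 1
}
```

For example, `core=$(pick_core "$CORE_DIR") || echo "no readable core"` would take the fallback branch when nothing qualifies. Note that the `-nt` comparison uses the directory's modification time as the arming timestamp, which is only accurate while nothing else is created in or removed from that directory after arming.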
