Commit 5198c30

scripts: inspect_state: add coredump inspection
Added ability to inspect coredumps with the inspect_state.py script for post-mortem inspection of the system state at the time of a crash.

Signed-off-by: Trond F. Christiansen <trond.christiansen@nordicsemi.no>
1 parent b0c75a1 commit 5198c30


2 files changed: +201 -11 lines changed


docs/common/tooling_troubleshooting.md

Lines changed: 8 additions & 3 deletions
@@ -382,10 +382,12 @@ When enabling immediate logging, it might be necessary to increase the stack siz
382382
383383
### State Inspection Script
384384
385-
The `inspect_state.py` script allows you to inspect the current state of the application's state machines and internal data structures on a running device.
386-
It connects to the device via J-Link, parses the ELF file to find symbol locations and types, and reads the memory to display the current state.
385+
The `inspect_state.py` script allows you to inspect the current state of the application's state machines and internal data structures on a running device. It supports two modes of operation:
387386
388-
This is particularly useful for debugging when the application is stuck or behaving unexpectedly, and you want to see the exact state of each module without halting the CPU or adding extensive logging.
387+
- **Live Inspection**: It connects to the device via J-Link, parses the ELF file to find symbol locations and types, and reads the memory to display the current state.
388+
- **Coredump Analysis**: It can also analyze a coredump file generated by the device, allowing you to inspect the state at the time of the crash.
389+
390+
This is particularly useful when the application is stuck or behaving unexpectedly and you want to see the exact state of each module without halting the CPU or adding extensive logging, or when you need to analyze a crash after the fact.
389391
390392
**Prerequisites:**
391393
@@ -402,7 +404,10 @@ This is particularly useful for debugging when the application is stuck or behav
402404
Run the script from the `scripts` directory (or adjust the path), providing the path to your ELF file and optionally the J-Link device name:
403405
404406
```bash
407+
# Live inspection
405408
python3 Asset-Tracker-Template/scripts/inspect_state.py --elf build/app/zephyr/zephyr.elf
409+
# Coredump analysis
410+
python3 Asset-Tracker-Template/scripts/inspect_state.py --elf path/to/symbols.elf --coredump path/to/coredump.elf
406411
```
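The two invocations above differ only in whether `--coredump` is given. A minimal sketch of that mode selection follows, assuming the same option names as the script. The helper names `build_parser` and `select_mode` are hypothetical and do not appear in `inspect_state.py`.

```python
import argparse
from typing import List


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with the same options the inspector accepts (sketch only)."""
    parser = argparse.ArgumentParser(
        description="State inspector mode selection (illustrative sketch)",
        allow_abbrev=False,
    )
    parser.add_argument("--elf", required=True, help="ELF file with debug symbols")
    parser.add_argument("--coredump", help="Coredump ELF file for offline analysis")
    parser.add_argument("--device", default="Cortex-M33", help="J-Link device name")
    parser.add_argument("--snr", help="J-Link serial number")
    return parser


def select_mode(argv: List[str]) -> str:
    """Return 'coredump' when --coredump is given, otherwise 'jlink'."""
    args = build_parser().parse_args(argv)
    return "coredump" if args.coredump else "jlink"


print(select_mode(["--elf", "build/app/zephyr/zephyr.elf"]))
print(select_mode(["--elf", "symbols.elf", "--coredump", "coredump.elf"]))
```

In this sketch `--device` and `--snr` are still accepted in coredump mode and simply go unused, which matches the help text in the commit ("Only used without --coredump").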
407412
408413
**Example Output:**

scripts/inspect_state.py

Lines changed: 193 additions & 8 deletions
@@ -2,18 +2,42 @@
22
"""
33
Asset Tracker Template State Inspector
44
5-
This script connects to an nRF91 device via J-Link, inspects the ELF file for
6-
symbol locations, and reads the current state of various state machines in RAM.
7-
It also allows interactive inspection of the full state structure of each module.
5+
This script inspects the current state of modules in the Asset Tracker application.
6+
It supports two modes of operation:
7+
8+
1. LIVE DEBUGGING (J-Link mode - default):
9+
Connects to a running nRF91 device via J-Link and reads state information directly
10+
from the device's RAM in real-time. This mode allows you to monitor state changes
11+
as the application runs.
12+
13+
Usage:
14+
python3 inspect_state.py --elf path/to/zephyr.elf
15+
python3 inspect_state.py --elf path/to/zephyr.elf --device Cortex-M33 --snr <serial_number>
16+
17+
2. COREDUMP ANALYSIS (offline mode):
18+
Analyzes a captured coredump ELF file without requiring a physical device or J-Link
19+
connection. This mode is useful for post-mortem debugging of crashes or for analyzing
20+
states from field devices.
21+
22+
Usage:
23+
python3 inspect_state.py --elf path/to/zephyr.elf --coredump path/to/coredump.elf
24+
25+
Both modes provide:
26+
- Summary table showing the current SMF state of all modules
27+
- Interactive menu to inspect detailed structure contents of individual modules
28+
- Full DWARF-based type resolution for accurate memory interpretation
829
930
Prerequisites:
1031
pip install "pyelftools>=0.30" pylink-square
32+
33+
Note: pylink-square is only required for J-Link mode, not for coredump analysis.
1134
"""
1235

1336
import sys
1437
import argparse
1538
import logging
1639
import traceback
40+
import struct
1741
from dataclasses import dataclass, field
1842
from pathlib import Path
1943
from typing import Dict, List, Optional, Any, Union
@@ -620,6 +644,134 @@ def interactive_loop(lookup, device_name: str, serial_number: Optional[str]):
620644
jlink.close()
621645

622646

647+
def interactive_coredump_loop(lookup, mem):
648+
"""Interactive loop for coredump-backed inspection (no J-Link)."""
649+
650+
while True:
651+
print_summary(mem, lookup)
652+
653+
print("\nOptions:")
654+
print(" q: Quit")
655+
656+
modules = list(lookup.keys())
657+
658+
for i, name in enumerate(modules):
659+
print(f" {i+1}: Inspect {name}")
660+
661+
choice = input("\nSelect option: ").strip().lower()
662+
663+
if choice == 'q':
664+
break
665+
666+
try:
667+
idx = int(choice) - 1
668+
669+
if 0 <= idx < len(modules):
670+
inspect_module_detail(mem, lookup[modules[idx]], modules[idx])
671+
input("\nPress Enter to continue...")
672+
else:
673+
print("Invalid selection.")
674+
except ValueError:
675+
pass
676+
677+
678+
# --- Minimal coredump ELF reader (PT_LOAD only) ---
679+
680+
class SegmentMemory:
681+
"""Minimal memory reader over PT_LOAD segments for state inspection."""
682+
683+
def __init__(self, memory_segments: List[tuple]):
684+
# memory_segments: List[(name, vaddr, data)]
685+
self.segments = []
686+
for _, start, data in memory_segments:
687+
end = start + len(data)
688+
self.segments.append((start, end, data))
689+
self.segments.sort(key=lambda s: s[0])
690+
691+
def _slice(self, addr: int, size: int) -> bytes:
692+
for start, end, data in self.segments:
693+
if start <= addr and addr + size <= end:
694+
offset = addr - start
695+
return data[offset:offset + size]
696+
raise ValueError(f"Address 0x{addr:x} (size {size}) not in coredump segments")
697+
698+
def memory_read(self, addr: int, size: int) -> bytes:
699+
return self._slice(addr, size)
700+
701+
def memory_read32(self, addr: int, count: int = 1) -> List[int]:
702+
raw = self._slice(addr, 4 * count)
703+
return [int.from_bytes(raw[i * 4:(i + 1) * 4], 'little') for i in range(count)]
704+
705+
706+
def load_coredump_segments(coredump_path: Path) -> List[tuple]:
707+
"""Load PT_LOAD segments from an ELF coredump (32/64-bit, little/big-endian)."""
708+
709+
with open(coredump_path, 'rb') as f:
710+
data = f.read()
711+
712+
if len(data) < 0x34 or data[0:4] != b"\x7fELF":
713+
raise ValueError("Not an ELF file")
714+
715+
ei_class = data[4]
716+
ei_data = data[5]
717+
is_64 = ei_class == 2
718+
endian = '<' if ei_data == 1 else '>'
719+
720+
def u16(off):
721+
return struct.unpack(f"{endian}H", data[off:off+2])[0]
722+
723+
def u32(off):
724+
return struct.unpack(f"{endian}I", data[off:off+4])[0]
725+
726+
def u64(off):
727+
return struct.unpack(f"{endian}Q", data[off:off+8])[0]
728+
729+
if is_64:
730+
e_phoff = u64(32)
731+
e_phentsize = u16(54)
732+
e_phnum = u16(56)
733+
else:
734+
e_phoff = u32(28)
735+
e_phentsize = u16(42)
736+
e_phnum = u16(44)
737+
738+
if e_phentsize == 0 or e_phnum == 0:
739+
return []
740+
741+
segments = []
742+
for i in range(e_phnum):
743+
off = e_phoff + i * e_phentsize
744+
if off + e_phentsize > len(data):
745+
break
746+
747+
p_type = u32(off)
748+
749+
if is_64:
750+
p_flags = u32(off + 4)
751+
p_offset = u64(off + 8)
752+
p_vaddr = u64(off + 16)
753+
p_filesz = u64(off + 32)
754+
else:
755+
p_offset = u32(off + 4)
756+
p_vaddr = u32(off + 8)
757+
p_filesz = u32(off + 16)
758+
p_flags = u32(off + 24)
759+
760+
PT_LOAD = 1
761+
if p_type != PT_LOAD or p_filesz == 0:
762+
continue
763+
764+
end = p_offset + p_filesz
765+
if end > len(data):
766+
continue
767+
768+
seg_data = data[p_offset:end]
769+
name = f"Segment_{len(segments)}"
770+
segments.append((name, p_vaddr, seg_data))
771+
772+
return segments
773+
774+
623775
def main():
624776
try:
625777
# pylint: disable=import-outside-toplevel
@@ -633,15 +785,31 @@ def main():
633785
except ImportError:
634786
pass
635787

636-
parser = argparse.ArgumentParser(description='Asset Tracker State Inspector',
637-
allow_abbrev=False)
638-
parser.add_argument('--elf', required=True, help='Path to zephyr.elf file')
788+
parser = argparse.ArgumentParser(
789+
description='Asset Tracker State Inspector: Inspect module states via J-Link or coredump',
790+
allow_abbrev=False,
791+
epilog='''
792+
Examples:
793+
Live debugging via J-Link:
794+
%(prog)s --elf build/zephyr/zephyr.elf
795+
%(prog)s --elf build/zephyr/zephyr.elf --device Cortex-M33 --snr 123456789
796+
797+
Offline coredump analysis:
798+
%(prog)s --elf build/zephyr/zephyr.elf --coredump coredump-12345678.elf
799+
''',
800+
formatter_class=argparse.RawDescriptionHelpFormatter
801+
)
802+
parser.add_argument('--elf', required=True,
803+
help='Path to zephyr.elf file with debug symbols')
804+
parser.add_argument('--coredump',
805+
help='Path to coredump ELF file for offline analysis (mutually exclusive with J-Link)')
639806
parser.add_argument(
640807
'--device',
641808
default='Cortex-M33',
642-
help='J-Link Device Name (default: Cortex-M33).'
809+
help='J-Link device name for live debugging (default: Cortex-M33). Only used without --coredump.'
643810
)
644-
parser.add_argument('--snr', help='J-Link Serial Number')
811+
parser.add_argument('--snr',
812+
help='J-Link serial number for live debugging. Only used without --coredump.')
645813
args = parser.parse_args()
646814

647815
elf_path = Path(args.elf)
@@ -659,6 +827,23 @@ def main():
659827
)
660828
sys.exit(1)
661829

830+
# If a coredump is provided, read SMF states from PT_LOAD segments instead of J-Link
831+
if args.coredump:
832+
try:
833+
memory_segments = load_coredump_segments(Path(args.coredump))
834+
except Exception as exc: # pylint: disable=broad-except
835+
logger.error("Failed to load coredump: %s", exc)
836+
sys.exit(1)
837+
838+
if not memory_segments:
839+
logger.error("Coredump does not contain ELF PT_LOAD segments to read state from.")
840+
sys.exit(1)
841+
842+
mem = SegmentMemory(memory_segments)
843+
interactive_coredump_loop(lookup, mem)
844+
sys.exit(0)
845+
846+
# Default: live device via J-Link
662847
interactive_loop(lookup, args.device, args.snr)
663848

664849
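The coredump path in this commit comes down to two steps: collect the `PT_LOAD` segments from the ELF program headers, then resolve reads by virtual address against those segments. The sketch below tries both steps on a synthetic image so it can run without a real coredump. It assumes a 32-bit little-endian ELF only, and the names `build_elf32`, `load_segments`, and `read32` are hypothetical stand-ins, not the script's own functions.

```python
import struct
from typing import List, Tuple

PT_LOAD = 1
EHDR_SIZE = 52  # size of an ELF32 file header
PHDR_SIZE = 32  # size of an ELF32 program header


def build_elf32(vaddr: int, payload: bytes) -> bytes:
    """Build a tiny little-endian ELF32 image with one PT_LOAD segment."""
    # e_ident: magic, ELFCLASS32, ELFDATA2LSB, EV_CURRENT, then padding to 16 bytes
    ident = b"\x7fELF" + bytes([1, 1, 1]) + bytes(9)
    ehdr = ident + struct.pack(
        "<HHIIIIIHHHHHH",
        4, 40, 1,       # e_type (ET_CORE), e_machine (EM_ARM), e_version
        0,              # e_entry
        EHDR_SIZE,      # e_phoff: program headers follow the file header
        0, 0,           # e_shoff, e_flags
        EHDR_SIZE,      # e_ehsize
        PHDR_SIZE, 1,   # e_phentsize, e_phnum
        0, 0, 0,        # e_shentsize, e_shnum, e_shstrndx
    )
    phdr = struct.pack(
        "<IIIIIIII",
        PT_LOAD, EHDR_SIZE + PHDR_SIZE,  # p_type, p_offset
        vaddr, vaddr,                    # p_vaddr, p_paddr
        len(payload), len(payload),      # p_filesz, p_memsz
        6, 4,                            # p_flags (R+W), p_align
    )
    return ehdr + phdr + payload


def load_segments(image: bytes) -> List[Tuple[int, bytes]]:
    """Return (vaddr, data) for every PT_LOAD segment that has file content."""
    if len(image) < EHDR_SIZE or image[:4] != b"\x7fELF":
        raise ValueError("Not an ELF file")
    if image[4] != 1 or image[5] != 1:
        raise ValueError("This sketch handles 32-bit little-endian ELF only")
    (e_phoff,) = struct.unpack_from("<I", image, 28)
    e_phentsize, e_phnum = struct.unpack_from("<HH", image, 42)
    segments = []
    for i in range(e_phnum):
        off = e_phoff + i * e_phentsize
        if off + PHDR_SIZE > len(image):
            break  # truncated program header table
        p_type, p_offset, p_vaddr, _p_paddr, p_filesz = struct.unpack_from(
            "<IIIII", image, off
        )
        if p_type == PT_LOAD and p_filesz and p_offset + p_filesz <= len(image):
            segments.append((p_vaddr, image[p_offset:p_offset + p_filesz]))
    return segments


def read32(segments: List[Tuple[int, bytes]], addr: int) -> int:
    """Read one little-endian 32-bit word at a virtual address."""
    for start, data in segments:
        if start <= addr and addr + 4 <= start + len(data):
            offset = addr - start
            return int.from_bytes(data[offset:offset + 4], "little")
    raise ValueError(f"Address 0x{addr:x} not in any loaded segment")


# Two words placed at a hypothetical RAM address
image = build_elf32(0x20000000, struct.pack("<II", 0xDEADBEEF, 7))
segments = load_segments(image)
print(hex(read32(segments, 0x20000000)), read32(segments, 0x20000004))
```

A read only succeeds when the whole range fits inside one segment, so a word that straddles a segment boundary raises `ValueError` instead of returning partial data. The `_slice` method in the commit applies the same containment check.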
