| 1 | +<!-- |
| 2 | +Copyright (C) 2026 Intel Corporation |
| 3 | + |
| 4 | +SPDX-License-Identifier: Apache-2.0 |
| 5 | +--> |
| 6 | + |
| 7 | +# Command Reference |
| 8 | + |
| 9 | +## Monitoring Modes |
| 10 | + |
| 11 | +| Mode | Tracks | Overhead | Use when | |
| 12 | +|------|--------|----------|----------| |
| 13 | +| **Thread** (default) | Individual threads (TIDs) | ~5–10% | Debugging, optimization | |
| 14 | +| **PID** (`--pid-only`) | Processes only | ~2–3% | Production, long-term runs | |
| 15 | + |
| 16 | +## Quick Reference |
| 17 | + |
| 18 | +| Task | Command | Duration | |
| 19 | +|------|---------|----------| |
| 20 | +| Quick check | `make quick-check` | 30 s | |
| 21 | +| Full monitor | `make monitor` | 60 s | |
| 22 | +| Full monitor (PID mode) | `make monitor-pid` | 60 s | |
| 23 | +| Monitor specific node | `make monitor NODE=/my_node` | 60 s | |
| 24 | +| Extended session | `make monitor-long` | 5 min | |
| 25 | +| Graph only | `make graph-only` | 60 s | |
| 26 | +| Resources only (threads) | `make resources-threads` | 60 s | |
| 27 | +| Resources only (PIDs) | `make resources-pid` | 60 s | |
| 28 | +| Remote system | `make monitor-remote REMOTE_IP=<ip>` | 60 s | |
| 29 | +| Remote system (PID mode) | `make monitor-remote-pid REMOTE_IP=<ip>` | 60 s | |
| 30 | +| Pipeline graph (PNG) | `make pipeline-graph` | — | |
| 31 | +| Pipeline graph (session) | `make pipeline-graph SESSION=<name>` | — | |
| 32 | +| List sessions | `make list-sessions` | — | |
| 33 | +| Re-visualize last session | `make visualize-last` | — | |
| 34 | +| Clean all data | `make clean` | — | |
| 35 | + |
| 36 | +All `make` targets accept optional variables: `NODE=`, `DURATION=`, `INTERVAL=`, |
| 37 | +`SESSION=`, `REMOTE_IP=`, and `REMOTE_USER=`. |
| 38 | + |
| 39 | +```bash |
| 40 | +make monitor NODE=/slam_toolbox DURATION=120 INTERVAL=2 |
| 41 | +make monitor-remote REMOTE_IP=192.168.1.100 NODE=/slam_toolbox REMOTE_USER=ros |
| 42 | +``` |
| 43 | + |
| 44 | +## monitor_stack.py |
| 45 | + |
| 46 | +```bash |
| 47 | +uv run python src/monitor_stack.py [OPTIONS] |
| 48 | +``` |
| 49 | + |
| 50 | +| Option | Description | |
| 51 | +|--------|-------------| |
| 52 | +| `--node NAME` | Narrow graph discovery to one node (processing delay is still measured for all nodes) | |
| 53 | +| `--session NAME` | Name for this session (default: timestamp) | |
| 54 | +| `--duration SECS` | Auto-stop after N seconds | |
| 55 | +| `--interval SECS` | Update interval (default: 5) | |
| 56 | +| `--output-dir PATH` | Where to save results | |
| 57 | +| `--graph-only` | Skip resource monitoring | |
| 58 | +| `--resources-only` | Skip graph monitoring | |
| 59 | +| `--pid-only` | Process-level only, no thread details | |
| 60 | +| `--no-visualize` | Skip auto-visualization on exit | |
| 61 | +| `--remote-ip IP` | Monitor a remote machine | |
| 62 | +| `--remote-user USER` | SSH user for remote machine (default: ubuntu) | |
| 63 | +| `--list-sessions` | List previous sessions and exit | |
| 64 | + |
| 65 | +```bash |
| 66 | +uv run python src/monitor_stack.py --node /slam_toolbox --session my_test --duration 120 |
| 67 | +uv run python src/monitor_stack.py --remote-ip 192.168.1.100 --node /slam_toolbox |
| 68 | +uv run python src/monitor_stack.py --resources-only --pid-only --duration 60 |
| 69 | +``` |
| 70 | + |
| 71 | +## ros2_graph_monitor.py |
| 72 | + |
| 73 | +```bash |
| 74 | +uv run python src/ros2_graph_monitor.py # All nodes |
| 75 | +uv run python src/ros2_graph_monitor.py --node /slam_toolbox # Scope to one node |
| 76 | +uv run python src/ros2_graph_monitor.py --node /ctrl --log t.csv # With CSV logging |
| 77 | +uv run python src/ros2_graph_monitor.py --interval 2 # Custom interval |
| 78 | +uv run python src/ros2_graph_monitor.py --remote-ip 192.168.1.100 |
| 79 | +``` |
| 80 | + |
| 81 | +## monitor_resources.py |
| 82 | + |
| 83 | +```bash |
| 84 | +uv run python src/monitor_resources.py # CPU only |
| 85 | +uv run python src/monitor_resources.py --memory --threads # CPU + memory + threads |
| 86 | +uv run python src/monitor_resources.py --memory --log out.log # With logging |
| 87 | +uv run python src/monitor_resources.py --list # List ROS2 processes |
| 88 | +uv run python src/monitor_resources.py --remote-ip 192.168.1.100 --memory |
| 89 | +``` |
| 90 | + |
| 91 | +## visualize_timing.py |
| 92 | + |
| 93 | +```bash |
| 94 | +uv run python src/visualize_timing.py timing.csv --delays --frequencies --output-dir ./plots/ |
| 95 | +``` |
| 96 | + |
| 97 | +| Option | Description | |
| 98 | +|--------|-------------| |
| 99 | +| `--timestamps` | Message arrival scatter plot | |
| 100 | +| `--frequencies` | Topic message rates over time | |
| 101 | +| `--delays` | Processing delay over time | |
| 102 | +| `--inter-arrival` | Inter-message timing / jitter | |
| 103 | +| `--output-dir DIR` | Save plots as PNG (omit to display interactively) | |
| 104 | +| `--summary` | Print statistics only, no plots | |
| 105 | + |
| 106 | +## visualize_resources.py |
| 107 | + |
| 108 | +```bash |
| 109 | +uv run python src/visualize_resources.py resource.log --cores --heatmap --top 10 --output-dir ./plots/ |
| 110 | +uv run python src/visualize_resources.py resource.log --summary |
| 111 | +``` |
| 112 | + |
| 113 | +| Option | Description | |
| 114 | +|--------|-------------| |
| 115 | +| `--cores` | CPU utilization per core over time | |
| 116 | +| `--pids` | CPU utilization per PID/thread (top N) | |
| 117 | +| `--heatmap` | Core utilization heatmap | |
| 118 | +| `--mapping` | Thread-to-core scatter plot | |
| 119 | +| `--top N` | Number of top threads to show (default: 10) | |
| 120 | +| `--output-dir DIR` | Save plots as PNG | |
| 121 | +| `--summary` | Print statistics only, no plots | |
| 122 | + |
| 123 | +> **Note:** `pidstat` reports CPU% where 100% equals one full core, so on a |
| 124 | +> 20-core system the maximum is 2000%. Use the **Avg Cores** column in |
| 125 | +> `--summary` output for a figure that is easier to interpret. |
| 126 | + |
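The CPU% convention in the note above can be made concrete with a small sketch. The helper names `cpu_percent_to_cores` and `max_cpu_percent` are hypothetical and not part of `visualize_resources.py`. How the tool actually derives its **Avg Cores** column is not shown in this reference, so treat this as an illustration of the arithmetic only.

```python
def cpu_percent_to_cores(cpu_percent: float) -> float:
    """Convert a pidstat-style CPU% (100% = one full core) to a core count."""
    return cpu_percent / 100.0


def max_cpu_percent(core_count: int) -> float:
    """Upper bound of the CPU% scale on a machine with `core_count` cores."""
    return core_count * 100.0


if __name__ == "__main__":
    # Ceiling of the scale on the 20-core system mentioned in the note.
    print(max_cpu_percent(20))
    # A reading of "563%" expressed as a number of busy cores.
    print(cpu_percent_to_cores(563.0))
```

Read this way, a value above 100% simply means the process or thread group kept more than one core busy on average.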
| 127 | +## visualize_graph.py |
| 128 | + |
| 129 | +Renders the ROS2 computation graph as a directed topology diagram. |
| 130 | + |
| 131 | +```bash |
| 132 | +# Headless PNG |
| 133 | +uv run python src/visualize_graph.py monitoring_sessions/<name> --no-show --output graph.png |
| 134 | + |
| 135 | +# Interactive (click nodes to see topic detail popups) |
| 136 | +uv run python src/visualize_graph.py monitoring_sessions/<name> --show |
| 137 | +``` |
| 138 | + |
| 139 | +Or via make: |
| 140 | + |
| 141 | +```bash |
| 142 | +make pipeline-graph |
| 143 | +make pipeline-graph SESSION=20260306_154140 |
| 144 | +``` |
| 145 | + |
| 146 | +## Grafana Dashboard Commands |
| 147 | + |
| 148 | +| Command | Description | |
| 149 | +|---------|-------------| |
| 150 | +| `make grafana-start` | Start Grafana + Prometheus (Docker) | |
| 151 | +| `make grafana-stop` | Stop the stack | |
| 152 | +| `make grafana-status` | Check service status; shows the URL http://localhost:30000 | |
| 153 | +| `make grafana-export SESSION=<name>` | Export session metrics to Prometheus | |
| 154 | +| `make grafana-export-live` | Continuously export live monitoring data | |
| 155 | +| `make grafana-open` | Open dashboard in browser | |
| 156 | + |
| 157 | +Metrics are exposed on **port 9092** (Prometheus occupies 9090 in |
| 158 | +host-network mode). Prometheus is pre-configured to scrape `localhost:9092`. |
| 159 | + |
| 160 | +## Remote Monitoring |
| 161 | + |
| 162 | +| Component | How it works | |
| 163 | +|-----------|-------------| |
| 164 | +| Graph monitor | DDS peer discovery via `CYCLONEDDS_URI` / `ROS_STATIC_PEERS` | |
| 165 | +| Resource monitor | Runs `ps` and `pidstat` over SSH | |
| 166 | + |
| 167 | +Results are stored and visualized **locally** on the monitoring machine. |
| 168 | + |
| 169 | +```bash |
| 170 | +make monitor-remote REMOTE_IP=192.168.1.100 |
| 171 | +make monitor-remote REMOTE_IP=192.168.1.100 REMOTE_USER=ros NODE=/slam_toolbox |
| 172 | +uv run python src/monitor_stack.py --remote-ip 192.168.1.100 --pid-only --duration 120 |
| 173 | +``` |
| 174 | + |
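As a rough illustration of the "runs `ps` and `pidstat` over SSH" row above, the sketch below builds the kind of command line such a wrapper might execute. The function name `build_remote_command` and the specific `pidstat` arguments are assumptions made for illustration; the actual invocation used by `monitor_resources.py` may differ. The defaults mirror the documented ones (`ubuntu` as the SSH user, a 5-second interval).

```python
import shlex


def build_remote_command(remote_ip: str, remote_user: str = "ubuntu",
                         interval: int = 5, per_thread: bool = True) -> list[str]:
    """Build an ssh argv that runs pidstat on the remote host (hypothetical helper)."""
    # -u: report CPU utilization; -t: include threads (omitted for PID-only sampling).
    pidstat = ["pidstat", "-u"]
    if per_thread:
        pidstat.append("-t")
    pidstat.append(str(interval))
    # The remote command is passed to ssh as a single, safely quoted string.
    return ["ssh", f"{remote_user}@{remote_ip}", shlex.join(pidstat)]


if __name__ == "__main__":
    print(build_remote_command("192.168.1.100", remote_user="ros", interval=2))
```

The returned list could be handed to `subprocess.run(..., capture_output=True, text=True)` on the monitoring machine. Non-interactive (key-based) SSH authentication is assumed here.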
| 175 | +## Troubleshooting |
| 176 | + |
| 177 | +| Problem | Fix | |
| 178 | +|---------|-----| |
| 179 | +| No ROS2 processes found | Run `ros2 node list` to verify nodes are up | |
| 180 | +| Monitor exits immediately | Source ROS2: `source /opt/ros/humble/setup.bash` | |
| 181 | +| Visualizations not generated | Run `make visualize-last` manually | |
| 182 | +| Permission denied | Run `uv sync` if modules are missing | |
| 183 | +| Remote: no data | Check SSH auth and matching `ROS_DOMAIN_ID` | |
| 184 | +| CPU shows e.g. "563%" | Normal: 100% = 1 core. Check the **Avg Cores** column. | |
| 185 | +| `grafana-export` port in use | `fuser -k 9092/tcp && make grafana-export SESSION=<name>` | |
| 186 | +| Graph click does nothing | Use `--show` flag to enable TkAgg interactive mode | |