Troubleshooting Guide

This document covers common issues, gotchas, and solutions for the Unified IIoT Monitoring Platform.

Table of Contents

  1. LoRaWAN Issues
  2. Modbus Issues
  3. Display Issues
  4. Docker/Services Issues
  5. Grafana Issues
  6. Firmware Flashing
  7. Quick Diagnostic Commands

LoRaWAN Issues

No LoRa Data in Grafana After Gateway Restart

Symptom: Gateway was restarted, nodes appear to be transmitting (TX count incrementing), but no data appears in Grafana.

Cause: When the gateway restarts, existing LoRaWAN sessions become invalid. The nodes cannot detect this, because unconfirmed uplinks are never acknowledged; they keep transmitting while the gateway silently drops the frames.

Solution: The firmware includes automatic stale session detection:

  • Every 5th uplink is sent as a confirmed uplink (requires ACK)
  • If 3 consecutive confirmed uplinks fail, the node automatically rejoins
  • Detection time: ~7.5 minutes worst case
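The detection rule above can be sketched in a few lines of Python. This is an illustration only, not the firmware's implementation: the class and function names are hypothetical, and the 30-second uplink interval is an assumption (the guide does not state it; it is simply the interval at which 15 uplinks add up to the quoted 7.5 minutes).

```python
from dataclasses import dataclass

CONFIRM_EVERY = 5         # every 5th uplink is confirmed (from this guide)
MAX_FAILED_CONFIRMS = 3   # rejoin after 3 consecutive failures (from this guide)
UPLINK_INTERVAL_S = 30    # ASSUMPTION: interval is not stated in this guide


@dataclass
class SessionMonitor:
    """Hypothetical model of the stale-session check."""
    uplink_count: int = 0
    failed_confirms: int = 0

    def record_uplink(self, ack_received: bool) -> bool:
        """Record one uplink; return True when the node should rejoin."""
        self.uplink_count += 1
        confirmed = self.uplink_count % CONFIRM_EVERY == 0
        if not confirmed:
            return False              # unconfirmed uplinks carry no information
        if ack_received:
            self.failed_confirms = 0  # session is alive, reset the counter
            return False
        self.failed_confirms += 1
        return self.failed_confirms >= MAX_FAILED_CONFIRMS


def worst_case_detection_s(interval_s: int = UPLINK_INTERVAL_S) -> int:
    """Worst case: the session dies right after a successful confirmed uplink."""
    return CONFIRM_EVERY * MAX_FAILED_CONFIRMS * interval_s
```

With these numbers, a node that never receives an ACK decides to rejoin on its 15th uplink, and `worst_case_detection_s()` gives 450 seconds, i.e. 7.5 minutes.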

Quick fix: Power cycle the LoRa node to force an immediate rejoin.

Node Stuck on "Connecting..."

Symptom: Node display shows Join:X and Connecting... indefinitely.

Possible causes:

  1. Gateway is offline or out of range
  2. LoRaWAN credentials mismatch (DevEUI/AppEUI/AppKey)
  3. Gateway not configured for the correct sub-band (AU915 sub-band 1)

Solutions:

  • Check gateway is powered and connected to network
  • Verify credentials match between node firmware and gateway application
  • Check gateway is configured for AU915 sub-band 1 (channels 0-7, 915.2-916.6 MHz)
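As a cross-check when comparing node and gateway channel settings, the sub-band 1 frequencies can be derived from the AU915 channel plan, where the 125 kHz uplink channels start at 915.2 MHz and are spaced 200 kHz apart. The helper names below are hypothetical and exist only for this sketch.

```python
def au915_uplink_mhz(channel: int) -> float:
    """Centre frequency in MHz of an AU915 125 kHz uplink channel (0-63)."""
    if not 0 <= channel <= 63:
        raise ValueError("125 kHz uplink channels are numbered 0-63")
    # Work in kHz to avoid floating-point drift.
    return (915_200 + 200 * channel) / 1000


def sub_band_channels(sub_band: int) -> range:
    """125 kHz channel numbers in a sub-band, with sub-bands numbered from 1."""
    if not 1 <= sub_band <= 8:
        raise ValueError("sub-bands are numbered 1-8")
    start = (sub_band - 1) * 8
    return range(start, start + 8)


sub_band_1 = [au915_uplink_mhz(ch) for ch in sub_band_channels(1)]
```

`sub_band_1` runs from 915.2 to 916.6 MHz, matching the range quoted above. A gateway listening on any other sub-band never hears the node's join requests.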

MQTT Bridge Connection Refused

Symptom: mqtt-bridge logs show Error: [Errno 111] Connection refused

Cause: The MQTT broker on the gateway is not reachable.

Solutions:

# Check gateway is reachable
ping 10.10.10.254

# Check MQTT port is open
nc -zv 10.10.10.254 1883

# Check gateway MQTT broker is running (raw messages)
mosquitto_sub -h 10.10.10.254 -t "#" -v

# View decoded sensor readings
python3 mqtt_subscriber.py
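If nc is not installed on the host, the same port check can be approximated with the Python standard library. This is a minimal sketch; `tcp_port_open` is a hypothetical helper name, and the address and port are the ones used in the commands above.

```python
import socket


def tcp_port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused (Errno 111), timeouts and unreachable hosts.
        return False


# Example: tcp_port_open("10.10.10.254", 1883)
```

A False result with a reachable gateway (ping succeeds) points at the broker itself rather than at the network.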

Node Display Shows TX Count But No Data Received

Symptom: Node OLED shows increasing TX count, but mqtt-bridge logs show no incoming messages.

Possible causes:

  1. Gateway application not forwarding to MQTT
  2. Wrong MQTT topic subscription
  3. Payload codec misconfigured on gateway

Solutions:

  • Check gateway application settings - ensure MQTT integration is enabled
  • Verify topic: should be application/+/device/+/event/up
  • Check gateway codec matches the payload format
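To confirm that a topic seen in the raw mosquitto_sub output would actually be delivered to the subscription above, the MQTT wildcard rules can be sketched as follows. `topic_matches` is a hypothetical helper written for this guide, not part of the bridge, and the device identifiers in the usage lines are made up. It ignores the special handling of topics that start with `$`.

```python
UPLINK_FILTER = "application/+/device/+/event/up"


def topic_matches(topic_filter: str, topic: str) -> bool:
    """Check a topic against an MQTT filter with + and # wildcards."""
    filter_levels = topic_filter.split("/")
    topic_levels = topic.split("/")
    for i, level in enumerate(filter_levels):
        if level == "#":
            return True           # multi-level wildcard: matches the rest
        if i >= len(topic_levels):
            return False          # filter is longer than the topic
        if level != "+" and level != topic_levels[i]:
            return False          # literal level differs
    return len(filter_levels) == len(topic_levels)


# Hypothetical topics for illustration:
up_ok = topic_matches(UPLINK_FILTER, "application/1/device/0011223344556677/event/up")
join_ok = topic_matches(UPLINK_FILTER, "application/1/device/0011223344556677/event/join")
```

Here `up_ok` is True and `join_ok` is False: `+` matches exactly one level, so only uplink events reach the bridge.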

Modbus Issues

Modbus Data Missing

Symptom: No Modbus sensor data in Grafana.

Diagnostic steps:

# 1. Check network connectivity
ping 10.10.10.100
ping 10.10.10.200

# 2. Check bridge logs
docker compose logs modbus-bridge --tail 20

# 3. Test Modbus connection directly
python3 -c "
from pymodbus.client import ModbusTcpClient
client = ModbusTcpClient('10.10.10.100', port=502)
if client.connect():
    result = client.read_holding_registers(0, count=4)  # count passed by keyword; newer pymodbus releases reject it as a positional argument
    print('Registers:', result.registers if not result.isError() else 'Error')
    client.close()
else:
    print('Connection failed')
"

Connection Timeout to Modbus Devices

Symptom: modbus-bridge logs show connection timeouts.

Cause: The Modbus devices are on the OT subnet (10.10.10.x). A container on Docker's default bridge network may have no route to that subnet, so the bridge needs to use the host's network stack.

Solution: Ensure modbus-bridge container uses host network mode in docker-compose.yml:

modbus-bridge:
  network_mode: host

Display Issues

OLED Display Flickering

Symptom: The OLED display flickers/blinks every time it updates.

Cause: Calling display.init() or display.clear() on every update cycle resets the display hardware, causing visible flicker.

Solution (implemented):

  • Call display.init() only once on first boot
  • Use buffered graphics mode - clear the buffer (not the display), draw all content, then flush once
  • The single flush() sends the complete frame, eliminating flicker
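The buffered pattern above can be illustrated with a small mock. This is a Python stand-in, not the firmware's display driver: `MockDisplay`, its method names and `render` are all hypothetical and only model the order of operations.

```python
class MockDisplay:
    """Hypothetical stand-in for an OLED driver; counts hardware operations."""

    def __init__(self) -> None:
        self.init_calls = 0
        self.flush_calls = 0
        self.buffer: list[str] = []   # off-screen frame being composed
        self.visible: list[str] = []  # what the panel currently shows

    def init(self) -> None:
        self.init_calls += 1          # hardware reset: do this once at boot

    def clear_buffer(self) -> None:
        self.buffer = []              # touches memory only, not the panel

    def draw_text(self, line: str) -> None:
        self.buffer.append(line)

    def flush(self) -> None:
        self.visible = list(self.buffer)  # one transfer of the whole frame
        self.flush_calls += 1


def render(display: MockDisplay, lines: list[str]) -> None:
    """Compose a full frame off-screen, then send it in a single flush."""
    display.clear_buffer()
    for line in lines:
        display.draw_text(line)
    display.flush()


oled = MockDisplay()
oled.init()                           # once, at boot
render(oled, ["TX: 1", "Temp: 21.5"])
render(oled, ["TX: 2", "Temp: 21.6"])
```

After two updates the mock has seen one `init()` and two `flush()` calls, and the visible frame never passes through a blank state between updates.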

Display Not Initializing

Symptom: OLED stays blank or shows garbage.

Possible causes:

  1. I2C wiring issue
  2. Wrong I2C address
  3. Display not powered

Diagnostic:

# Check firmware logs via probe-rs for I2C scan results
# Should show devices at 0x3C (display) and 0x44/0x76/0x77 (sensor)

Docker/Services Issues

Services Won't Start

Port already in use:

# Check what's using port 3000/8086/1883
lsof -i :3000
lsof -i :8086
lsof -i :1883

# Kill the process or change ports in docker-compose.yml

Permission issues:

# Fix ownership of data directories
sudo chown -R $USER:$USER ./grafana ./influxdb

Container Name Conflicts

Symptom: docker compose up fails with "container name already in use"

Solution:

# Remove old containers
docker compose down
docker rm unified-grafana unified-influxdb unified-mqtt-bridge unified-modbus-bridge unified-mosquitto

# Start fresh
docker compose up -d

Bridge Can't Reach InfluxDB

Symptom: Bridge logs show connection errors to InfluxDB.

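Solution: A bridge running with network_mode: host (as recommended above for modbus-bridge) does not use Docker's internal DNS, so it has to reach InfluxDB through the port published on the host, for example http://localhost:8086 (assuming port 8086 is published). Bridges attached to the Compose network should use the Docker service hostname:

INFLUXDB_URL = "http://influxdb:8086"  # Not localhost, for containers on the Compose network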

Grafana Issues

Dashboard Not Loading

Diagnostic steps:

# 1. Check provisioning files exist
docker exec unified-grafana ls -la /var/lib/grafana/dashboards/

# 2. Check datasource config
docker exec unified-grafana cat /etc/grafana/provisioning/datasources/influxdb.yml

# 3. Check Grafana logs
docker compose logs grafana --tail 50

# 4. Restart Grafana
docker compose restart grafana

"No Data" in Panels

Possible causes:

  1. Time range doesn't include recent data
  2. Query syntax error
  3. Datasource misconfigured

Solutions:

  • Set time range to "Last 5 minutes"
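  • Check that data is arriving: docker compose logs mqtt-bridge --tail 10 shows what the bridge receives; to confirm it reached InfluxDB, run the InfluxDB query listed under Quick Diagnostic Commands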
  • Verify datasource URL is http://influxdb:8086

Dashboard Changes Not Persisting

Cause: Dashboards are provisioned from files - manual changes are overwritten on restart.

Solution: Export your modified dashboard JSON and save it to grafana/dashboards/unified-dashboard.json.


Firmware Flashing

Probe Not Found

Symptom: probe-rs reports "No connected probes were found"

Solutions:

  1. Check USB connection
  2. Verify probe ID with probe-rs list
  3. Check udev rules (Linux):
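    # Add udev rules for both ST-Link product IDs used in this project
    # (374e: lora probes, 374b: modbus probes; see the probe ID table below)
    printf '%s\n' \
      'ATTRS{idVendor}=="0483", ATTRS{idProduct}=="374e", MODE="666"' \
      'ATTRS{idVendor}=="0483", ATTRS{idProduct}=="374b", MODE="666"' \
      | sudo tee /etc/udev/rules.d/99-stlink.rules
    sudo udevadm control --reload-rules
    sudo udevadm trigger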

Wrong Probe Selected

Symptom: Firmware flashes to wrong board.

Solution: Always specify the probe ID explicitly:

# List available probes
probe-rs list

# Flash with specific probe
cargo run --release -- --probe 0483:374e:003E00463234510A33353533

Probe IDs for this project:

Node       Probe ID
lora-1     0483:374e:003E00463234510A33353533
lora-2     0483:374e:0026003A3234510A33353533
modbus-1   0483:374b:0671FF3833554B3043164817
modbus-2   0483:374b:066DFF3833584B3043115433

Firmware Running in Debug Mode

Symptom: After flashing, device behaves strangely or firmware doesn't persist after USB disconnect.

Solution: Power cycle the device after flashing. The probe-rs debug session can leave the device in an inconsistent state.


Quick Diagnostic Commands

# Check all services are running
docker compose ps

# View all logs (follow mode)
docker compose logs -f

# Check LoRaWAN data flow
docker compose logs mqtt-bridge --tail 30

# Check Modbus data flow
docker compose logs modbus-bridge --tail 30

# Query InfluxDB for recent data
docker exec unified-influxdb influx query '
  from(bucket: "sensors")
  |> range(start: -5m)
  |> filter(fn: (r) => r._field == "temperature")
  |> last()
' --org my-org --token my-super-secret-auth-token

# Test gateway MQTT (raw JSON)
mosquitto_sub -h 10.10.10.254 -t "application/#" -v

# View decoded LoRaWAN sensor readings
python3 mqtt_subscriber.py

# List connected probes
probe-rs list

Document Version: 1.0
Last Updated: 2026-01-22