Moved from: victronenergy/dbus-ble-sensors#12
(filed there initially but this is a Venus OS platform-level issue affecting all BLE services)
Summary
dbus-ble-sensors uses raw HCI sockets (hci_open_dev / hci_le_set_scan_enable) to perform LE passive scanning, bypassing BlueZ's D-Bus management API entirely. This causes BlueZ's internal discovery session tracking to become corrupted, making StartDiscovery via the D-Bus API return org.bluez.Error.InProgress even when Discovering=False on the adapter. Any other service on the system that uses BlueZ's D-Bus API for BLE scanning (e.g. custom BLE battery monitors, BLE switches) is permanently blocked from discovering devices.
Root Cause
In src/ble-scan.c, the scan lifecycle bypasses BlueZ completely:
- ble_scan_setup() disables any existing LE scan, then re-enables scanning with its own parameters:
hci_le_set_scan_enable(dev->sock, 0, 1, 1000); // disable existing scan
hci_le_set_scan_parameters(dev->sock, 0, ...); // set own params
hci_le_set_scan_enable(dev->sock, 1, 0, 1000); // enable own scan
- ble_scan_tick() re-enables the scan every 10 seconds (10 * TICKS_PER_SEC, where TICKS_PER_SEC=20 in task.h):
hci_le_set_scan_enable(devices[i].sock, 1, 0, 1000);
- ble_scan_open() iterates HCIGETDEVLIST and opens every HCI adapter on the system, running raw scans on all of them simultaneously.
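The 10-second figure follows from the tick arithmetic: 10 * 20 = 200 ticks at 20 ticks per second. A minimal sketch, assuming a simple counter that fires once it reaches that value; SCAN_REENABLE_TICKS and scan_tick_due() are hypothetical names for illustration and are not taken from ble-scan.c:

```c
#include <stdbool.h>

#define TICKS_PER_SEC 20                         /* value cited from task.h */
#define SCAN_REENABLE_TICKS (10 * TICKS_PER_SEC) /* 200 ticks, i.e. 10 seconds */

/* Hypothetical helper: returns true once every SCAN_REENABLE_TICKS calls,
 * which is the point where the scan-enable command would be re-issued. */
static bool scan_tick_due(unsigned *counter)
{
    if (++*counter < SCAN_REENABLE_TICKS)
        return false;
    *counter = 0;
    return true;
}
```

Called once per tick, this returns true on the 200th, 400th, 600th call and so on, which matches the observed re-enable cadence.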
These raw HCI commands go directly to the controller, but BlueZ's bluetoothd is unaware they happened. When ble_scan_setup() calls hci_le_set_scan_enable(sock, 0, ...) to disable scanning, it kills any BlueZ-managed discovery session at the controller level without notifying bluetoothd. This leaves bluetoothd's internal state inconsistent — it may still believe a discovery session is active (or in a partially torn-down state), causing subsequent StartDiscovery calls via D-Bus to fail with InProgress.
Observed Behavior on Cerbo GX
Environment: Cerbo GX running Venus OS with kernel 6.12.23-venus-5, BlueZ 5.72, two HCI adapters (hci0 internal, hci1 external USB).
dbus-ble-sensors opens both adapters at startup (from its log):
opening hci1
opening hci0
While dbus-ble-sensors is running alongside other BLE services, both adapters become stuck:
hci0 (Discovering=True variant):
$ dbus -y org.bluez /org/bluez/hci0 org.freedesktop.DBus.Properties.Get org.bluez.Adapter1 Discovering
value = True
$ btmgmt --index 0 find -l
Unable to start discovery. status 0x0a (Busy)
hci1 (Discovering=False variant — more insidious):
$ dbus -y org.bluez /org/bluez/hci1 org.freedesktop.DBus.Properties.Get org.bluez.Adapter1 Discovering
value = False
$ btmgmt --index 1 find -l
Unable to start discovery. status 0x0a (Busy)
Both adapters report Busy (0x0a) at the management layer despite different Discovering states. This state persists indefinitely. The only ways to clear it are to power-cycle the adapter (Powered=False then Powered=True) or to restart bluetoothd, both of which drop all active BLE connections on that adapter. Because dbus-ble-sensors re-corrupts the state every 10 seconds via ble_scan_tick(), either fix is only temporary.
We verified via btmon and raw HCI commands that the state corruption originates at the kernel HCI management layer — sending a raw LE_Set_Scan_Enable 0x00 clears the hardware scan but does not fix BlueZ's daemon-level state.
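For reference, the raw command used in that check is small. A minimal sketch of its H4 encoding as defined in the Bluetooth Core Specification (OGF 0x08, OCF 0x000C); the helper name build_le_set_scan_enable() is hypothetical, and actually sending the bytes requires a raw HCI socket, which is not shown:

```c
#include <stddef.h>
#include <stdint.h>

#define HCI_COMMAND_PKT        0x01   /* H4 packet indicator for commands */
#define OGF_LE_CTL             0x08   /* LE controller command group */
#define OCF_LE_SET_SCAN_ENABLE 0x000C

/* Hypothetical helper: writes the 6-byte command into buf and returns its length. */
static size_t build_le_set_scan_enable(uint8_t buf[6], uint8_t enable, uint8_t filter_dup)
{
    /* Opcode = OGF in the upper 6 bits, OCF in the lower 10 bits: 0x200C */
    uint16_t opcode = (uint16_t)((OGF_LE_CTL << 10) | OCF_LE_SET_SCAN_ENABLE);

    buf[0] = HCI_COMMAND_PKT;
    buf[1] = (uint8_t)(opcode & 0xff);  /* opcode, little-endian */
    buf[2] = (uint8_t)(opcode >> 8);
    buf[3] = 2;                         /* parameter length */
    buf[4] = enable;                    /* 0x00 = disable, 0x01 = enable */
    buf[5] = filter_dup;                /* 0x00 = report duplicates */
    return 6;
}
```

With enable set to 0x00 this produces the bytes 01 0C 20 02 00 00. The command only reaches the controller, which is consistent with it clearing the hardware scan while leaving the state held above it untouched.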
Impact
Any Venus OS add-on or third-party service that uses BlueZ's standard D-Bus API for BLE operations (scanning, discovery) is effectively blocked while dbus-ble-sensors is running. This includes:
- Custom BLE battery monitors (using bleak/BleakScanner)
- BLE switch controllers
- Any service calling org.bluez.Adapter1.StartDiscovery
The only workaround is for these services to avoid StartDiscovery entirely and rely on the BlueZ D-Bus cache being passively populated by the raw scan's advertising events (which does work, since the kernel's management event path processes all advertising reports regardless of who started the scan). However, this is fragile and means new/previously-unseen devices cannot be actively discovered.
Relationship to #1587
This issue is closely related to #1587 (vesmart-server disconnects ALL BLE devices). Together, these two bugs make it effectively impossible for third-party BLE services to operate reliably on Venus OS:
- #1587: vesmart-server disconnects all BLE devices on all adapters every 60 seconds, when its keepalive timer fires
- This issue: dbus-ble-sensors corrupts BlueZ discovery state every 10 seconds, blocking reconnection
A third-party BLE service gets disconnected by vesmart-server, then cannot rediscover or reconnect because dbus-ble-sensors has corrupted the BlueZ adapter state.
Suggested Fix
Replace the raw HCI socket approach with BlueZ's D-Bus API for scan management:
- Use org.bluez.Adapter1.SetDiscoveryFilter to configure passive scanning with the desired parameters
- Use org.bluez.Adapter1.StartDiscovery / org.bluez.Adapter1.StopDiscovery to control the scan lifecycle
- Read advertisement data via BlueZ's org.bluez.Device1 properties or the org.bluez.AdvertisementMonitor1 interface
This would allow dbus-ble-sensors to coexist with other BlueZ clients. BlueZ is designed to multiplex discovery sessions from multiple D-Bus clients — each client gets its own session, and the adapter scans as long as any session is active.
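The multiplexing described here amounts to reference counting. The following is a toy model only, not bluetoothd's actual implementation; struct adapter_model and both functions are hypothetical names for illustration:

```c
#include <stdbool.h>

/* Toy model of an adapter that scans while at least one client session exists. */
struct adapter_model {
    unsigned sessions;   /* number of active discovery sessions */
    bool scanning;       /* whether the controller scan is believed enabled */
};

static void session_start(struct adapter_model *a)
{
    if (a->sessions++ == 0)
        a->scanning = true;    /* first session turns the scan on */
}

static void session_stop(struct adapter_model *a)
{
    if (a->sessions > 0 && --a->sessions == 0)
        a->scanning = false;   /* last session turns the scan off */
}
```

In this model, a raw hci_le_set_scan_enable(sock, 0, ...) issued from outside changes the controller without touching the counter, which is the kind of mismatch described under Root Cause.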
Alternatively, if raw HCI access is needed for performance or compatibility reasons, coordinate with bluetoothd by:
- Not disabling scans that dbus-ble-sensors didn't start (the initial hci_le_set_scan_enable(sock, 0, ...) in ble_scan_setup)
- Using the BlueZ management API (MGMT_OP_START_DISCOVERY) instead of direct HCI commands
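As a rough illustration of the management API option, the Start Discovery command is a 6-byte header (opcode, controller index, parameter length, all little-endian) followed by a one-byte address-type mask, per BlueZ's mgmt-api documentation. The helper build_start_discovery() is hypothetical, and sending the packet requires a socket bound to HCI_CHANNEL_CONTROL, which is omitted. This sketches the packet layout only; whether management-level discovery satisfies the passive-scan requirement of dbus-ble-sensors has not been verified here.

```c
#include <stddef.h>
#include <stdint.h>

#define MGMT_OP_START_DISCOVERY 0x0023
#define ADDR_TYPE_LE_PUBLIC     (1u << 1)  /* bit 1 of Address_Type */
#define ADDR_TYPE_LE_RANDOM     (1u << 2)  /* bit 2 of Address_Type */

/* Store a 16-bit value in little-endian byte order. */
static void put_le16(uint8_t *p, uint16_t v)
{
    p[0] = (uint8_t)(v & 0xff);
    p[1] = (uint8_t)(v >> 8);
}

/* Hypothetical helper: writes the 7-byte command into buf and returns its length. */
static size_t build_start_discovery(uint8_t buf[7], uint16_t index)
{
    put_le16(buf + 0, MGMT_OP_START_DISCOVERY);  /* opcode */
    put_le16(buf + 2, index);                    /* controller index, e.g. 0 for hci0 */
    put_le16(buf + 4, 1);                        /* parameter length */
    buf[6] = (uint8_t)(ADDR_TYPE_LE_PUBLIC | ADDR_TYPE_LE_RANDOM); /* LE only: 0x06 */
    return 7;
}
```

Because the request goes through the kernel's management interface rather than straight to the controller, the kernel and bluetoothd can keep track of it, which is the coordination the raw HCI path lacks.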
Reproduction
- Use a Cerbo GX that has two HCI adapters and dbus-ble-sensors running (it opens all adapters by default)
- From another process, attempt: dbus -y org.bluez /org/bluez/hci0 org.bluez.Adapter1.StartDiscovery
- Observe org.bluez.Error.InProgress even though the Discovering property may be False
- Confirm with: btmgmt --index 0 find -l → Busy (0x0a)