dbus-ble-sensors raw HCI socket scanning corrupts BlueZ D-Bus discovery state, causing org.bluez.Error.InProgress for other BLE services #1597

@cgoudie

Moved from: victronenergy/dbus-ble-sensors#12
(filed there initially but this is a Venus OS platform-level issue affecting all BLE services)

Summary

dbus-ble-sensors uses raw HCI sockets (hci_open_dev / hci_le_set_scan_enable) to perform LE passive scanning, bypassing BlueZ's D-Bus management API entirely. This causes BlueZ's internal discovery session tracking to become corrupted, making StartDiscovery via the D-Bus API return org.bluez.Error.InProgress even when Discovering=False on the adapter. Any other service on the system that uses BlueZ's D-Bus API for BLE scanning (e.g. custom BLE battery monitors, BLE switches) is permanently blocked from discovering devices.

Root Cause

In src/ble-scan.c, the scan lifecycle bypasses BlueZ completely:

  1. ble_scan_setup() disables any existing LE scan, then re-enables with its own parameters:
       hci_le_set_scan_enable(dev->sock, 0, 1, 1000);   // disable existing scan
       hci_le_set_scan_parameters(dev->sock, 0, ...);   // set own params
       hci_le_set_scan_enable(dev->sock, 1, 0, 1000);   // enable own scan
  2. ble_scan_tick() re-enables the scan every 10 seconds (10 * TICKS_PER_SEC, where TICKS_PER_SEC=20 in task.h):
       hci_le_set_scan_enable(devices[i].sock, 1, 0, 1000);
  3. ble_scan_open() iterates HCIGETDEVLIST and opens every HCI adapter on the system, running raw scans on all of them simultaneously.

These raw HCI commands go directly to the controller, but BlueZ's bluetoothd is unaware they happened. When ble_scan_setup() calls hci_le_set_scan_enable(sock, 0, ...) to disable scanning, it kills any BlueZ-managed discovery session at the controller level without notifying bluetoothd. This leaves bluetoothd's internal state inconsistent — it may still believe a discovery session is active (or in a partially torn-down state), causing subsequent StartDiscovery calls via D-Bus to fail with InProgress.

Observed Behavior on Cerbo GX

Environment: Cerbo GX running Venus OS with kernel 6.12.23-venus-5, BlueZ 5.72, two HCI adapters (hci0 internal, hci1 external USB).

dbus-ble-sensors opens both adapters at startup (from its log):

opening hci1
opening hci0

While dbus-ble-sensors is running alongside other BLE services, both adapters become stuck:

hci0 (Discovering=True variant):

$ dbus -y org.bluez /org/bluez/hci0 org.freedesktop.DBus.Properties.Get org.bluez.Adapter1 Discovering
value = True
$ btmgmt --index 0 find -l
Unable to start discovery. status 0x0a (Busy)

hci1 (Discovering=False variant — more insidious):

$ dbus -y org.bluez /org/bluez/hci1 org.freedesktop.DBus.Properties.Get org.bluez.Adapter1 Discovering
value = False
$ btmgmt --index 1 find -l
Unable to start discovery. status 0x0a (Busy)

Both adapters report Busy (0x0a) at the management layer despite different Discovering states. This state persists indefinitely, and the only way to clear it is to power-cycle the adapter (Powered=False then Powered=True) or restart bluetoothd. Both options drop all active BLE connections on that adapter, and because dbus-ble-sensors re-corrupts the state every 10 seconds via ble_scan_tick(), either fix is only temporary.
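For reference, the power-cycle workaround can be scripted roughly as follows. This is a minimal sketch, not something taken from Venus OS: power_cycle and settle_seconds are hypothetical names, and the one-second pause is a guess rather than a measured value. The function accepts any object with a Set(interface, name, value) method, so it runs without a D-Bus library installed.

```python
import time

ADAPTER_IFACE = "org.bluez.Adapter1"


def power_cycle(props, settle_seconds=1.0, sleep=time.sleep):
    """Toggle Powered off and on through org.freedesktop.DBus.Properties.

    `props` is any object exposing Set(interface, name, value), for example
    a Properties interface proxy bound to /org/bluez/hci0 (an assumption;
    no D-Bus library is imported here).
    """
    props.Set(ADAPTER_IFACE, "Powered", False)
    sleep(settle_seconds)  # hypothetical settle time, not a measured value
    props.Set(ADAPTER_IFACE, "Powered", True)
```

With dbus-python, props would typically be dbus.Interface(dbus.SystemBus().get_object("org.bluez", "/org/bluez/hci0"), "org.freedesktop.DBus.Properties"). As noted above, this drops every active connection on the adapter, so it is a diagnostic aid rather than a fix.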

We verified via btmon and raw HCI commands that the state corruption originates at the kernel HCI management layer — sending a raw LE_Set_Scan_Enable 0x00 clears the hardware scan but does not fix BlueZ's daemon-level state.

Impact

Any Venus OS add-on or third-party service that uses BlueZ's standard D-Bus API for BLE operations (scanning, discovery) is effectively blocked while dbus-ble-sensors is running. This includes:

  • Custom BLE battery monitors (using bleak / BleakScanner)
  • BLE switch controllers
  • Any service calling org.bluez.Adapter1.StartDiscovery

The only workaround is for these services to avoid StartDiscovery entirely and rely on the BlueZ D-Bus cache being passively populated by the raw scan's advertising events (which does work, since the kernel's management event path processes all advertising reports regardless of who started the scan). However, this is fragile and means new/previously-unseen devices cannot be actively discovered.
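A minimal sketch of that cache-only workaround, assuming the client has already fetched the result of org.freedesktop.DBus.ObjectManager.GetManagedObjects from org.bluez at "/". The helper name cached_devices and the shape of its return value are hypothetical; the property names (Adapter, Address, RSSI, ManufacturerData) are standard org.bluez.Device1 properties.

```python
DEVICE_IFACE = "org.bluez.Device1"


def cached_devices(managed_objects, adapter_path):
    """Return {address: info} for Device1 objects that belong to one adapter.

    `managed_objects` has the shape returned by GetManagedObjects:
    {object_path: {interface_name: {property_name: value}}}.
    """
    devices = {}
    for path, interfaces in managed_objects.items():
        props = interfaces.get(DEVICE_IFACE)
        if props is None:
            continue  # adapters, GATT services and other object types
        if props.get("Adapter") != adapter_path:
            continue  # device is cached under a different adapter
        devices[str(props.get("Address"))] = {
            "path": path,
            "rssi": props.get("RSSI"),  # None when BlueZ omits the property
            "manufacturer_data": props.get("ManufacturerData", {}),
        }
    return devices
```

This only reads what BlueZ has already cached, so it inherits the limitation described above: a device that has never been seen will not appear.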

Relationship to #1587

This issue is closely related to #1587 (vesmart-server disconnects ALL BLE devices). Together, these two bugs make it effectively impossible for third-party BLE services to operate reliably on Venus OS:

A third-party BLE service gets disconnected by vesmart-server, then cannot rediscover or reconnect because dbus-ble-sensors has corrupted the BlueZ adapter state.

Suggested Fix

Replace the raw HCI socket approach with BlueZ's D-Bus API for scan management:

  • Use org.bluez.Adapter1.SetDiscoveryFilter to configure passive scanning with the desired parameters
  • Use org.bluez.Adapter1.StartDiscovery / org.bluez.Adapter1.StopDiscovery to control the scan lifecycle
  • Read advertisement data via BlueZ's org.bluez.Device1 properties or the advertisement monitor API (org.bluez.AdvertisementMonitor1, registered through org.bluez.AdvertisementMonitorManager1)

This would allow dbus-ble-sensors to coexist with other BlueZ clients. BlueZ is designed to multiplex discovery sessions from multiple D-Bus clients — each client gets its own session, and the adapter scans as long as any session is active.
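To illustrate the session model, here is a rough sketch of a client holding one discovery session through the D-Bus API. It is not a proposed patch for dbus-ble-sensors (which is written in C); discovery_session is a hypothetical helper, and adapter stands for any proxy exposing the org.bluez.Adapter1 methods. The filter keys Transport and DuplicateData are standard SetDiscoveryFilter keys. The sketch does not show how passive scanning would be requested.

```python
from contextlib import contextmanager


@contextmanager
def discovery_session(adapter, transport="le", duplicate_data=True):
    """Hold one BlueZ discovery session for the duration of a `with` block.

    `adapter` is any object exposing SetDiscoveryFilter, StartDiscovery and
    StopDiscovery, for example a D-Bus proxy for org.bluez.Adapter1.
    """
    adapter.SetDiscoveryFilter({
        "Transport": transport,           # "le" limits discovery to LE
        "DuplicateData": duplicate_data,  # report repeated advertisements
    })
    adapter.StartDiscovery()
    try:
        yield adapter
    finally:
        adapter.StopDiscovery()  # ends this client's session only
```

A caller would wrap its scan window in `with discovery_session(adapter): ...`. Because each D-Bus client owns its own session, stopping here should not affect other clients that are still discovering.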

Alternatively, if raw HCI access is needed for performance or compatibility reasons, coordinate with bluetoothd by:

  • Not disabling scans that dbus-ble-sensors didn't start (the initial hci_le_set_scan_enable(sock, 0, ...) in ble_scan_setup)
  • Using the BlueZ management API (MGMT_OP_START_DISCOVERY) instead of direct HCI commands

Reproduction

  1. Use a Cerbo GX with two HCI adapters and dbus-ble-sensors running (it opens all adapters by default)
  2. From another process, attempt: dbus -y org.bluez /org/bluez/hci0 org.bluez.Adapter1.StartDiscovery
  3. Observe org.bluez.Error.InProgress even though the Discovering property may be False
  4. Confirm with: btmgmt --index 0 find -l, which reports Busy (0x0a)
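Steps 2 and 3 can also be automated from a D-Bus client. The sketch below is an assumption-laden helper, not an existing tool: probe_start_discovery and its return labels are made up for illustration. It relies on the exception exposing get_dbus_name(), as dbus-python's DBusException does, and takes the Discovering value read just before the call.

```python
IN_PROGRESS = "org.bluez.Error.InProgress"


def probe_start_discovery(adapter, discovering):
    """Classify one StartDiscovery attempt against the Discovering property.

    `adapter` exposes StartDiscovery() and StopDiscovery(); `discovering` is
    the adapter's Discovering property as read before the attempt.
    """
    try:
        adapter.StartDiscovery()
    except Exception as exc:
        # dbus-python errors expose get_dbus_name(); anything else is re-raised
        get_name = getattr(exc, "get_dbus_name", None)
        name = get_name() if callable(get_name) else None
        if name != IN_PROGRESS:
            raise
        if discovering:
            return "in-progress-while-discovering"  # the hci0 case above
        return "in-progress-while-idle"             # the hci1 case above
    adapter.StopDiscovery()  # do not leave the probe's own session running
    return "ok"
```

A result of "in-progress-while-idle" corresponds to the Discovering=False variant reported in this issue.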
