Description
Problem
On systems running multiple BLE services, connection attempts from different processes collide on the same adapter. BlueZ allows only one LE Create Connection per adapter at a time. When two processes call Device1.Connect() simultaneously, one gets org.bluez.Error.InProgress. With 3-5 services sharing 2 adapters, ~40% of connection attempts fail this way -- not because of stale state, but because another process is genuinely using the adapter.
establish_connection() has no mechanism to serialize operations across processes. Each process retries independently, creating a thundering-herd pattern where retries collide again.
Environment
- Victron Cerbo GX, Venus OS v3.67, BlueZ 5.x
- 2 USB BLE adapters (hci0, hci1)
- 3-5 concurrent BLE services: BMS batteries (2 instances), power monitor, relay switches (7 devices), advertisement scanner
Production Evidence
[device] hci1 busy (InProgress, direct)
[device] hci0 busy (InProgress, scan)
[device] hci1 busy (InProgress, scan)
[device] InProgress dominated (3), clearing BlueZ
Both adapters return InProgress simultaneously because a battery service is connecting on hci1 and an advertisement scanner is scanning on hci0. The service exhausts its retry budget without getting a free adapter timeslot. On a single-service system this never happens.
Proposed Approach
An opt-in per-adapter file lock using fcntl.flock, configured through a LockConfig dataclass:
    @dataclass
    class LockConfig:
        enabled: bool = False
        lock_dir: str = "/run"
        lock_template: str = "bleak-retry-connector-{adapter}.lock"
        lock_timeout: float = 15.0

    async def establish_connection(
        ...,
        lock_config: LockConfig | None = None,
        in_process_semaphore: asyncio.Semaphore | None = None,
        **kwargs,
    ) -> AnyBleakClient:

Key design decisions:
- fcntl.flock is released automatically when the fd is closed, including on process crash. No stale locks, no cleanup.
- Non-blocking with async retry: Uses LOCK_NB so the event loop isn't blocked. Retries with asyncio.sleep(0.25).
- Graceful degradation: If the lock isn't acquired within lock_timeout, the attempt proceeds without it. Prevents deadlocks.
- Per-adapter granularity: Each adapter gets its own lock file (e.g., /run/bleak-retry-connector-hci0.lock). Services on different adapters don't block each other.
- Released between retries: Lock held only during a single connection attempt, then released so other processes can interleave.
- Optional in_process_semaphore: For asyncio tasks sharing an adapter within one process. Can combine with the file lock.
- Linux-only: No-ops when fcntl is absent.
This only covers connect() operations via establish_connection(). External services not using this library (e.g., Victron's advertisement scanner) would need their own participation for full system-wide serialization.
No behavior change when lock_config is not provided.
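To make the design decisions above concrete, here is a minimal sketch of how the per-adapter lock could be acquired without blocking the event loop. It is not the code from the reference branch: the adapter_lock helper and its poll_interval parameter are hypothetical names introduced for illustration, and only LockConfig mirrors the dataclass proposed above.

```python
import asyncio
import contextlib
import os
import time
from dataclasses import dataclass

try:
    import fcntl
except ImportError:  # fcntl is absent on non-POSIX platforms
    fcntl = None


@dataclass
class LockConfig:
    enabled: bool = False
    lock_dir: str = "/run"
    lock_template: str = "bleak-retry-connector-{adapter}.lock"
    lock_timeout: float = 15.0


@contextlib.asynccontextmanager
async def adapter_lock(adapter: str, config: LockConfig, poll_interval: float = 0.25):
    """Hypothetical helper: yields True if the lock was acquired, else False."""
    if fcntl is None or not config.enabled:
        yield False  # no-op when locking is unavailable or not opted in
        return
    path = os.path.join(config.lock_dir, config.lock_template.format(adapter=adapter))
    fd = os.open(path, os.O_CREAT | os.O_RDWR, 0o644)
    acquired = False
    try:
        deadline = time.monotonic() + config.lock_timeout
        while True:
            try:
                # LOCK_NB raises BlockingIOError instead of blocking the loop.
                fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
                acquired = True
                break
            except BlockingIOError:
                if time.monotonic() >= deadline:
                    break  # graceful degradation: proceed without the lock
                await asyncio.sleep(poll_interval)
        yield acquired
    finally:
        os.close(fd)  # closing the fd releases the flock, also on crash
```

A caller would wrap a single connection attempt in `async with adapter_lock("hci0", config):` and leave the block before sleeping between retries, so other processes can take the adapter in the meantime.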
What This Fixes
- Adapter saturation on multi-service systems: Per-adapter lock ensures only one process uses a given adapter at a time, eliminating InProgress collisions between services using this library.
- Thundering-herd retries: Forces sequential access instead of competing retries that keep colliding.
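The same "hold only for one attempt, release between retries" pattern also applies inside one process through the optional semaphore. The sketch below is illustrative only: connect_with_retries, AdapterBusy, and the attempt callable are hypothetical stand-ins, not part of the library or the reference branch.

```python
import asyncio


class AdapterBusy(Exception):
    """Hypothetical stand-in for an org.bluez.Error.InProgress failure."""


async def connect_with_retries(attempt, semaphore, max_attempts=3, backoff=0.25):
    """Run attempt() under the semaphore, releasing it between retries."""
    for n in range(1, max_attempts + 1):
        async with semaphore:  # held for a single connection attempt only
            try:
                return await attempt()
            except AdapterBusy:
                if n == max_attempts:
                    raise
        # The semaphore is released here, so other tasks can interleave
        # while this one waits before its next attempt.
        await asyncio.sleep(backoff)
```

With an asyncio.Semaphore(1) shared by all tasks that target the same adapter, at most one attempt is in flight at a time, and a failed task queues behind the others instead of retrying against them.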
Reference Implementation
Branch with code and tests: feat/cross-process-lock
Related Upstream Issues
- #65 — HA will retry the connection to all esp32 bluetooth proxies at once: Reported connection slot exhaustion when multiple proxies retry simultaneously, creating a thundering-herd loop. A per-adapter file lock serializes connection attempts across processes, preventing the simultaneous retry storm.