Commit 42d1940

Add documentation for PCM Audio WebSocket API
Refs openhab/openhab-core#4032. Signed-off-by: Florian Hotze <[email protected]>
1 parent b0db020 commit 42d1940

1 file changed

+80
-14
lines changed

configuration/websocket.md

Lines changed: 80 additions & 14 deletions
@@ -3,16 +3,17 @@ layout: documentation
33
title: openHAB WebSocket API
44
---
55

6-
# openHAB WebSocket API
6+
# openHAB WebSocket APIs
77

8-
openHAB provides direct access to the [Event Bus](../developer/utils/events.html) through WebSockets.
9-
The WebSocket API allows subscribing to all events (by default) or a sub-set of events which are configurable at runtime via messages.
10-
All messages on the WebSocket connection are JSON encoded text-messages.
8+
openHAB provides access to a variety of functionality via WebSockets.
9+
This page describes the WebSocket APIs that are currently available and how to use them.
10+
11+
[[toc]]
1112

1213
## Establishing a connection
1314

1415
WebSockets are available on the same ports as the REST API, usually port 8080 for unsecured (ws-protocol) and port 8443 for secured (wss-protocol) connections.
15-
The connection is established by connecting to `ws[s]://{URL}:{PORT}/ws`.
16+
The connection is established by connecting to `ws[s]://{URL}:{PORT}/ws/{ADAPTER_ID}`.
1617

1718
To prevent unauthorized use of the connection, an access token has to be sent with the initial request.
1819
There are two options to send the access token:
@@ -24,20 +25,27 @@ There are two options to send the access token:
2425

2526
1. Through the `accessToken` query parameter: `ws[s]://{URL}:{PORT}/ws?accessToken={TOKEN}`.
2627

27-
`${TOKEN}` can be one of these two:
28+
`{TOKEN}` can be one of these two:
2829

2930
1. An API token: `oh.ohwstest.tz1IDPniKLxc0VU4t9tz4GiAiKmc0ZDdMKxhlD5tfviQStM4oNsywrcrUTktPbBE9YQ3wnMBrCqVEIhg7Q`
3031

3132
1. Basic Auth with base64 encoded `{USER}:{PASSWORD}`: `dXNlcjpwYXNzd29yZA==`
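
A minimal Python sketch of how a client might assemble the connection URL with the `accessToken` query parameter. The helper names (`basic_auth_token`, `websocket_url`) are hypothetical, and combining the `/ws/{ADAPTER_ID}` path with the query parameter is an assumption based on the two URL forms shown above:

```python
import base64
from urllib.parse import urlencode


def basic_auth_token(user: str, password: str) -> str:
    """Base64-encode `{USER}:{PASSWORD}` for use as the access token."""
    raw = f"{user}:{password}".encode("utf-8")
    return base64.b64encode(raw).decode("ascii")


def websocket_url(host: str, port: int, adapter_id: str, token: str,
                  secure: bool = False) -> str:
    """Build ws[s]://{URL}:{PORT}/ws/{ADAPTER_ID}?accessToken={TOKEN}.

    Assumption: the query parameter is accepted on the adapter path.
    """
    scheme = "wss" if secure else "ws"
    # urlencode percent-encodes characters such as "=" in the token.
    query = urlencode({"accessToken": token})
    return f"{scheme}://{host}:{port}/ws/{adapter_id}?{query}"


token = basic_auth_token("user", "password")
print(token)  # dXNlcjpwYXNzd29yZA== (matches the example above)
print(websocket_url("localhost", 8080, "events", token))
```
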
3233

33-
## Using the WebSocket connection
34+
## Event WebSocket API (`ADAPTER_ID` = `events`)
35+
36+
openHAB provides direct access to the [Event Bus](../developer/utils/events.html) through WebSockets.
37+
The WebSocket API allows subscribing to all events (by default) or to a subset of events, which is configurable at runtime via messages.
38+
All messages on the WebSocket connection are JSON-encoded text messages.
39+
40+
The Event WebSocket is available at `ws[s]://{URL}:{PORT}/ws/events`.
41+
Authentication is handled as [described above](#establishing-a-connection).
3442

3543
### Receiving events (openHAB -> client)
3644

3745
By default, all events on the event bus will be sent as individual messages.
3846
An overview of the most common events can be found [here](../developer/utils/events.html#the-core-events).
3947
The JSON representation of the event always contains the type, topic and payload of the event.
40-
Optionally a source is present.
48+
Optionally, a source is present.
4149

4250
```json
4351
{
@@ -80,11 +88,11 @@ If a message can't be understood by openHAB (e.g. because of a wrong payload enc
8088
}
8189
```
8290

83-
## WebSocket Management
91+
### WebSocket Management
8492

8593
The WebSocket connection is managed through messages with the type `WebSocketEvent`.
8694

87-
### Keeping the connection alive (`openhab/websocket/heartbeat`)
95+
#### Keeping the connection alive (`openhab/websocket/heartbeat`)
8896

8997
All connections have an idle timeout of 10s to prevent dead connections.
9098
It is recommended that clients send a heartbeat message at a shorter interval (e.g. 5s).
@@ -108,7 +116,7 @@ The reception of a heartbeat message is acknowledged with a `PONG`:
108116
}
109117
```
110118

111-
### Applying filters to events
119+
#### Applying filters to events
112120

113121
To prevent unnecessary traffic, filters can be applied to the connection.
114122
Filters only work in the direction from openHAB to the client, i.e. even if events of type `ItemCommandEvent` are not subscribed, they can still be sent by the client.
@@ -119,7 +127,7 @@ A new filter message always overrides the settings before.
119127
The default setting is no filter, i.e. all events from all sources.
120128
It is recommended to at least set a source filter for the client itself to prevent event reflection.
121129

122-
#### Filter by topic (`openhab/websocket/filter/topic`)
130+
##### Filter by topic (`openhab/websocket/filter/topic`)
123131

124132
Topic filters can be used to include and/or exclude events of a specific topic from the event stream.
125133
They can be applied both inclusive and exclusive, and provide API compatibility with the existing topic filter functionality of the SSE event stream.
@@ -160,7 +168,7 @@ The reception is acknowledged with the filter that is applied:
160168
}
161169
```
162170

163-
#### Filter by source (`openhab/websocket/filter/source`)
171+
##### Filter by source (`openhab/websocket/filter/source`)
164172

165173
Source filters can be used to remove events from a specific source from the event stream.
166174
They are exclusive, meaning that events from the sources listed in the filter are removed.
@@ -189,7 +197,7 @@ The reception is acknowledged with the filter that is applied:
189197
}
190198
```
191199

192-
#### Filter by type (`openhab/websocket/filter/type`)
200+
##### Filter by type (`openhab/websocket/filter/type`)
193201

194202
Type filters are used to select a specific subset of all available events.
195203
They are inclusive, meaning that only the event types listed in the filter message are sent.
@@ -216,3 +224,61 @@ The reception is acknowledged with the filter that is applied:
216224
"eventId": "5"
217225
}
218226
```
227+
228+
## Audio PCM WebSocket API (`ADAPTER_ID` = `audio-pcm`)
229+
230+
The Audio PCM WebSocket API allows for low-latency bidirectional transmission of raw PCM audio data between openHAB and a client.
231+
The WebSocket API allows connecting to openHAB's dialog processor remotely; refer to [Voice Assistant]({{base}}/configuration/multimedia.html#voice-assistant) for setup.
232+
233+
The Audio PCM WebSocket is available at `ws[s]://{URL}:{PORT}/ws/audio-pcm`.
234+
Authentication is handled as [described above](#establishing-a-connection).
235+
236+
The Audio PCM WebSocket protocol uses JSON messages for control signaling and binary messages for audio data.
237+
238+
### Control Signaling (JSON)
239+
240+
Control messages use a JSON structure with a `cmd` field and an optional `args` object.
241+
242+
#### Client to Server Commands
243+
244+
- **`INITIALIZE`**: Sent by the client to initialize the audio session.
245+
- `args`:
246+
- `id`: (String) The ID of the audio sink/source (e.g., speaker ID).
247+
- `forceSampleRate`: (Number, optional) Request a specific sample rate.
248+
- `startDialog`: (Boolean, optional) Whether to start a voice dialog immediately.
249+
- `listeningItem`: (String, optional) Name of an item to reflect the listening state.
250+
- `locationItem`: (String, optional) Name of an item representing the location.
251+
- **`ON_SPOT`**: Sent by the client to signal an "on spot" event (e.g., a physical button press to start listening).
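
As an illustration of the `cmd`/`args` structure, the following Python sketch serializes the two client commands. The helper name `control_message` and the example ID `kitchen-speaker` are hypothetical; only the field names come from the list above:

```python
import json
from typing import Any, Optional


def control_message(cmd: str, args: Optional[dict[str, Any]] = None) -> str:
    """Serialize a control message with a `cmd` field and optional `args`."""
    message: dict[str, Any] = {"cmd": cmd}
    if args:
        # `args` is optional, so it is only included when provided.
        message["args"] = args
    return json.dumps(message)


# INITIALIZE with the required `id` and one optional argument.
init = control_message("INITIALIZE", {"id": "kitchen-speaker", "startDialog": False})
# ON_SPOT carries no arguments.
on_spot = control_message("ON_SPOT")

print(init)
print(on_spot)
```
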
252+
253+
#### Server to Client Commands
254+
255+
- **`INITIALIZED`**: Sent by the server to confirm successful session initialization.
256+
- **`START_LISTENING`**: Sent by the server to instruct the client to start sending audio data.
257+
- **`STOP_LISTENING`**: Sent by the server to instruct the client to stop sending audio data.
258+
- **`SINK_VOLUME`**: Sent by the server to update the playback volume.
259+
- `args`: `{ "value": Number }` (0-100).
260+
- **`SOURCE_VOLUME`**: Sent by the server to update the recording volume.
261+
- `args`: `{ "value": Number }` (0-100).
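
A client could track the effect of these server commands roughly as sketched below. The `ClientState` class and `handle_control_message` function are hypothetical names for illustration, and rejecting unknown commands or out-of-range volumes is a design assumption rather than documented behavior:

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class ClientState:
    initialized: bool = False
    listening: bool = False
    sink_volume: Optional[int] = None
    source_volume: Optional[int] = None


def handle_control_message(state: ClientState, text: str) -> None:
    """Apply one server-to-client control message to the client state."""
    message = json.loads(text)
    cmd = message.get("cmd")
    args = message.get("args") or {}

    if cmd == "INITIALIZED":
        state.initialized = True
    elif cmd == "START_LISTENING":
        state.listening = True   # client should start sending audio
    elif cmd == "STOP_LISTENING":
        state.listening = False  # client should stop sending audio
    elif cmd in ("SINK_VOLUME", "SOURCE_VOLUME"):
        value = args["value"]
        if not 0 <= value <= 100:
            raise ValueError(f"volume out of range: {value}")
        if cmd == "SINK_VOLUME":
            state.sink_volume = value
        else:
            state.source_volume = value
    else:
        raise ValueError(f"unknown command: {cmd!r}")


state = ClientState()
handle_control_message(state, '{"cmd": "INITIALIZED"}')
handle_control_message(state, '{"cmd": "SINK_VOLUME", "args": {"value": 40}}')
print(state)
```
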
262+
263+
### Audio Data (Binary)
264+
265+
Audio data is transmitted as binary messages.
266+
Each message consists of an 8-byte header using little-endian byte ordering followed by raw PCM audio samples.
267+
268+
#### Binary Header
269+
270+
The header is structured as follows:
271+
272+
| Offset | Size | Type | Description |
273+
|--------|------|-------|----------------------------------|
274+
| 0 | 2 | Bytes | Stream ID (Randomly generated) |
275+
| 2 | 4 | Int32 | Sample Rate (e.g., 16000, 44100) |
276+
| 6 | 1 | Uint8 | Bit Depth (e.g., 16) |
277+
| 7 | 1 | Uint8 | Number of Channels (e.g., 1, 2) |
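
The header layout in the table can be expressed with Python's `struct` module. This is a sketch under the assumption that the sample rate is a signed 32-bit integer, as the `Int32` type suggests; `PcmHeader`, `pack_header`, and `unpack_header` are hypothetical names:

```python
import os
import struct
from typing import NamedTuple

# "<" = little-endian without padding; 2s = 2 raw bytes, i = Int32, B = Uint8.
HEADER = struct.Struct("<2siBB")


class PcmHeader(NamedTuple):
    stream_id: bytes   # offset 0, 2 bytes
    sample_rate: int   # offset 2, 4 bytes
    bit_depth: int     # offset 6, 1 byte
    channels: int      # offset 7, 1 byte


def pack_header(header: PcmHeader) -> bytes:
    """Encode the 8-byte header that precedes the PCM payload."""
    return HEADER.pack(*header)


def unpack_header(frame: bytes) -> PcmHeader:
    """Decode the header from the first 8 bytes of a binary message."""
    if len(frame) < HEADER.size:
        raise ValueError("frame is shorter than the 8-byte header")
    return PcmHeader(*HEADER.unpack_from(frame))


# A randomly generated stream ID, 16 kHz, 16-bit, mono.
header = PcmHeader(os.urandom(2), 16000, 16, 1)
frame = pack_header(header) + b"\x00\x00"  # header plus one silent 16-bit sample
print(unpack_header(frame))
```
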
278+
279+
#### Audio Payload
280+
281+
- **Source Audio (Client -> Server)**: The payload contains raw PCM data.
282+
- **Sink Audio (Server -> Client)**: The payload contains raw PCM data. A single-byte payload with value `254` (0xFE) immediately following the 8-byte header indicates the end of the audio stream.
283+
284+
The raw PCM data is encoded according to the format specified in the header.
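
A receiving client needs to separate the header from the payload and recognize the end-of-stream marker. The sketch below shows one way to do that; the function names are hypothetical, and the duration helper relies only on the standard relationship between PCM byte count, sample rate, bit depth, and channel count:

```python
HEADER_SIZE = 8
END_OF_STREAM = b"\xfe"  # single-byte payload with value 254


def split_frame(frame: bytes) -> tuple[bytes, bytes]:
    """Split a binary message into its 8-byte header and its payload."""
    if len(frame) < HEADER_SIZE:
        raise ValueError("frame is shorter than the 8-byte header")
    return frame[:HEADER_SIZE], frame[HEADER_SIZE:]


def is_end_of_stream(frame: bytes) -> bool:
    """True if the payload is exactly the single end-of-stream byte."""
    _, payload = split_frame(frame)
    return payload == END_OF_STREAM


def payload_duration_seconds(payload: bytes, sample_rate: int,
                             bit_depth: int, channels: int) -> float:
    """Playback duration of a raw PCM payload in seconds."""
    bytes_per_frame = (bit_depth // 8) * channels
    return len(payload) / (sample_rate * bytes_per_frame)


header = bytes(HEADER_SIZE)  # placeholder header for illustration
print(is_end_of_stream(header + b"\xfe"))      # True
print(is_end_of_stream(header + b"\xfe\x00"))  # False
print(payload_duration_seconds(bytes(32000), 16000, 16, 1))  # 1.0
```
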
