feat: add MQTT remote control with Home Assistant auto-discovery by oleost · Pull Request #712 · damongolding/immich-kiosk

oleost · 2026-03-30T08:45:32Z

Add MQTT remote control with Home Assistant auto-discovery

Summary

This PR adds MQTT-based remote control to Immich Kiosk, enabling integration with Home Assistant and other MQTT-capable systems. It also fixes a PWA bug where iOS strips query parameters when adding to the home screen.

How MQTT control works

The kiosk connects to a configured MQTT broker and subscribes to command topics. Commands are forwarded to connected browsers via Server-Sent Events (SSE), which trigger the existing HTMX navigation — no polling, no page reload.

Supported commands: next / previous

Topics:

<prefix>/command              → all connected clients
<prefix>/<client>/command     → a specific screen only

Per-screen targeting: Open the kiosk with ?client=living-room in the URL. Each named client gets its own MQTT topic and its own device in Home Assistant.

Home Assistant auto-discovery

On connect, the kiosk publishes MQTT Discovery messages so it appears automatically in HA as a device with Next and Previous buttons — no manual HA configuration needed. Availability (online/offline) is handled via MQTT Last Will and Testament.

Configuration

kiosk:
  mqtt_enabled: true
  mqtt_broker: "192.168.1.x"
  mqtt_port: 1883
  mqtt_username: ""        # optional
  mqtt_password: ""        # optional
  mqtt_client_id: "immich-kiosk"
  mqtt_topic_prefix: "immich-kiosk"

All settings are also available as environment variables (KIOSK_MQTT_ENABLED, KIOSK_MQTT_BROKER etc).

PWA fix

start_url in the web app manifest now includes query parameters, so ?client=living-room is preserved when the page is added to the iOS home screen.

Changes

internal/mqtt/ — new package: MQTT client, HA discovery, LWT
internal/routes/routes_sse.go — SSE hub grouped by client name
frontend/src/ts/mqtt-sse.ts — EventSource listener → HTMX triggers
internal/config/config.go + config.schema.json — new MQTT settings
internal/routes/routes_manifest.go — PWA manifest fix
config.example.yaml — documented MQTT options

Dependency added: github.com/eclipse/paho.mqtt.golang

Summary by CodeRabbit

New Features
- MQTT-based remote control for kiosk navigation (next/previous) with Home Assistant discovery.
- Real-time server-sent events (SSE) endpoint and hub for broadcasting and queuing commands.
- Frontend SSE integration and optional commented MQTT kiosk settings plus schema validation.
Bug Fixes
- Manifest start URL now preserves relevant client query parameter for correct start behaviour.
Chores
- Automated multi-architecture Docker build & publish workflow for main branch.

Adds MQTT-based remote control so the kiosk can be controlled from Home Assistant or any other MQTT-capable system. How it works: - The kiosk connects to a configured MQTT broker on startup - It subscribes to <topic_prefix>/command and accepts "next" / "previous" - Commands are forwarded to connected browsers via a Server-Sent Events (SSE) endpoint at /events, which triggers the existing HTMX navigation - On connect, the kiosk publishes Home Assistant MQTT Discovery messages so it appears automatically as a device in HA with Next and Previous buttons, without any manual HA configuration - Availability (online/offline) is handled via MQTT Last Will and Testament New config options (under kiosk:): mqtt_enabled: true mqtt_broker: "192.168.1.x" mqtt_port: 1883 mqtt_username: "" mqtt_password: "" mqtt_client_id: "immich-kiosk" mqtt_topic_prefix: "immich-kiosk" Also adds a GitHub Actions workflow to build and push a multi-platform Docker image (amd64 + arm64) to ghcr.io on push to main or version tags. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

iOS strips query parameters when adding a page to the home screen, causing ?client= to be lost on subsequent launches. Using RequestURI() instead of Path in the manifest start_url ensures the full URL including query string is saved, so the client name persists. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

coderabbitai · 2026-03-30T08:45:49Z

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

@coderabbitai resume to resume automatic reviews.
@coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

▶️ Resume reviews
🔍 Trigger review

Walkthrough

Adds MQTT-based remote control and Home Assistant discovery, an SSE hub to relay MQTT commands to connected kiosks, frontend SSE handling mapping commands to htmx events, config/schema and env bindings, backend wiring (endpoint, middleware), and a CI workflow to build/publish multi‑arch Docker images.

Changes

Cohort / File(s)	Summary
CI / Docker publish `/.github/workflows/docker-publish.yml`	New GitHub Actions workflow to build and push multi‑arch Docker images to `ghcr.io` on pushes to `main` (Buildx, QEMU, metadata tags, cache, VERSION build-arg).
Configuration & Schema `config.example.yaml`, `config.schema.json`, `internal/config/config.go`	Added commented example MQTT kiosk settings, JSON Schema entries with conditional validation (require `mqtt_broker` when enabled), and environment bindings for all new kiosk MQTT fields.
MQTT Backend `internal/mqtt/mqtt.go`	New MQTT client type: validates broker config, connects with LWT and auto-reconnect, subscribes to global and per‑client command topics, maps payloads to `next`/`previous`, dispatches to registered handlers, and publishes Home Assistant discovery for devices.
SSE & Routing `internal/routes/routes_sse.go`, `internal/routes/routes_manifest.go`	Added SSEHub for per‑client subscriptions, pending event queue and TTL, non‑blocking broadcast, `/events` SSE handler, MQTT→SSE handler, and manifest StartURL generation preserving filtered `client` query param.
Frontend Integration `frontend/src/ts/mqtt-sse.ts`, `frontend/src/ts/kiosk.ts`	New `initMqttSSE()` opens EventSource to `/events` (preserves URL query), listens for `kiosk-command` SSE events (`next`/`previous`) and triggers htmx events; kiosk init now calls it.
Application Wiring `main.go`	Conditional MQTT initialisation when enabled, registers MQTT→SSE handler and new-client discovery hook, exposes `GET /events`, and exempts `/events` from Gzip compression.
Dependencies `go.mod`	Added direct dependency `github.com/eclipse/paho.mqtt.golang v1.5.0` (and an indirect `github.com/gorilla/websocket v1.5.3`).

Sequence Diagram

sequenceDiagram
    participant HA as Home Assistant
    participant MQTT as MQTT Broker
    participant Backend as Backend Server
    participant SSE as SSE Hub
    participant FE as Frontend Client
    participant DOM as DOM

    HA->>MQTT: Publish command (topic: <prefix>/command or <prefix>/<client>/command)
    MQTT->>Backend: Deliver message to subscribed backend client
    Backend->>Backend: Parse topic & payload → determine Command + target
    Backend->>SSE: Broadcast "kiosk-command" event (target or all)
    SSE->>FE: EventSource sends "kiosk-command" SSE event
    FE->>FE: Parse event data ("next"/"previous")
    FE->>DOM: Trigger htmx event (kiosk-new-asset / kiosk-prev-asset)
    DOM->>FE: Frontend reacts (navigate assets)

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

Task/release #615: Modifies frontend/src/ts/kiosk.ts initialization flow — touches the same frontend init sequence where initMqttSSE() is inserted.
Fix/config validation #520: Changes to configuration schema and internal/config/config.go — overlaps with the added MQTT schema and env bindings.

Suggested labels

enhancement

Poem

🐰🥕 I tuned my whiskers to the line,
MQTT whispers, SSE hums in time.
"Next" or "previous" — a tiny hop, so neat,
Kiosks leap forward with every beat. ✨

🚥 Pre-merge checks | ✅ 3

✅ Passed checks (3 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title accurately describes the main feature added: MQTT remote control with Home Assistant auto-discovery, which aligns with the primary objectives and substantial code changes across configuration, backend MQTT handling, SSE integration, and frontend implementation.
Docstring Coverage	✅ Passed	Docstring coverage is 92.31% which is sufficient. The required threshold is 80.00%.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

damongolding · 2026-03-30T08:53:48Z

Correct me if I am mistaken but this PR appears to be completely AI generated. This that the case?

coderabbitai

Actionable comments posted: 10

🧹 Nitpick comments (3)

internal/routes/routes_manifest.go (1)

52-52: Restrict persisted query parameters in start_url.

Line 52 currently persists the full referer query string. Consider allow-listing only stable params (for example client) so transient/sensitive query keys are not baked into the installed PWA launch URL.

Suggested hardening patch

 func Manifest(c *echo.Context) error {
 	refererURL := c.Request().Referer()
 	if refererURL == "" {
 		refererURL = "/"
 	}

 	referer, err := url.Parse(refererURL)
 	if err != nil {
 		log.Error("parsing URL", "url", refererURL, "err", err)
 		return errors.New("could not read URL. Is it formatted correctly?")
 	}
+
+	startURL := referer.EscapedPath()
+	if startURL == "" {
+		startURL = "/"
+	}
+	q := referer.Query()
+	filtered := url.Values{}
+	if client := q.Get("client"); client != "" {
+		filtered.Set("client", client)
+	}
+	if encoded := filtered.Encode(); encoded != "" {
+		startURL += "?" + encoded
+	}

 	manifest := &ManifestJSON{
 		Name:            "Immich Kiosk",
 		ShortName:       "Kiosk",
 		Description:     "Immich Kiosk is a lightweight slideshow for running on kiosk devices and browsers that uses Immich as a data source.",
-		StartURL:        referer.RequestURI(),
+		StartURL:        startURL,
 		Scope:           "/",
 		Display:         "fullscreen",
 		BackgroundColor: "#000000",
 		ThemeColor:      "#1f262f",

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@internal/routes/routes_manifest.go` at line 52, The manifest currently sets
StartURL to the full referer.RequestURI(), which can persist transient or
sensitive query params; change the code that constructs StartURL (where
StartURL: referer.RequestURI() is set) to parse the referer URL, filter its
query parameters to an allow-list (e.g. only keep "client" or other stable
keys), rebuild the request URI using the original path plus the filtered query
string, and assign that sanitized URI to StartURL so transient/sensitive params
are not persisted.

.github/workflows/docker-publish.yml (2)

59-60: Consider an immutable VERSION value for branch builds.

On main, github.ref_name resolves to main, which makes the embedded version less traceable than a commit SHA. A tag-or-SHA fallback is usually easier to debug in production.
Suggested tweak
           build-args: |
-            VERSION=${{ github.ref_name }}
+            VERSION=${{ startsWith(github.ref, 'refs/tags/') && github.ref_name || github.sha }}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/workflows/docker-publish.yml around lines 59 - 60, Replace the
mutable VERSION=${{ github.ref_name }} with a conditional that uses tag names
when building a tag but falls back to an immutable commit SHA for branch builds
(e.g. use github.ref_name for tags, otherwise github.sha); update the build-args
line so VERSION is set from github.ref_name when the ref is a tag and from
github.sha for branch/main builds (reference VERSION, github.ref_name, and
github.sha when making the change).
23-50: Pin GitHub Actions to immutable commit SHAs.

All actions in this workflow use mutable version tags (@v3, @v4, etc.) rather than full commit SHA hashes. This is a supply-chain security hardening gap. Replace the following tags with their corresponding commit SHAs:

Line 23: actions/checkout@v4

Line 26: docker/login-action@v3

Line 34: docker/metadata-action@v5

Line 44: docker/setup-qemu-action@v3

Line 47: docker/setup-buildx-action@v3

Line 50: docker/build-push-action@v6

For example: actions/checkout@c85c95e3d7251135ab7dc9cb2c7369c640f9abf1
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/workflows/docker-publish.yml around lines 23 - 50, The workflow is
using mutable tags for GitHub Actions (actions/checkout@v4,
docker/login-action@v3, docker/metadata-action@v5, docker/setup-qemu-action@v3,
docker/setup-buildx-action@v3, docker/build-push-action@v6); replace each
referenced action tag with the corresponding immutable commit SHA (e.g.,
actions/checkout@<sha>) throughout the workflow file so the steps Log in to
GitHub Container Registry, Extract metadata (id: meta), Set up QEMU, Set up
Docker Buildx, and Build and push use pinned SHAs; verify the SHAs are from the
official repos/releases and update the uses lines for those actions to the full
commit SHAs.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In @.github/workflows/docker-publish.yml:
- Around line 3-9: The workflow trigger includes on.push.tags: "v*" which
duplicates the same tag trigger in ghcr-release.yaml causing double GHCR pushes;
fix by consolidating or gating one publisher: either remove the tags: - "v*"
trigger from docker-publish.yml (keep branch main) and let ghcr-release.yaml
handle tagged releases, or add a conditional guard in docker-publish.yml (e.g.,
check an explicit environment variable or use if: github.event_name == 'push' &&
!contains(github.ref, 'refs/tags/')) so only one workflow publishes on tag
pushes; update the on.push.tags entry or add the conditional in
docker-publish.yml to prevent duplicate GHCR pushes.

In `@config.example.yaml`:
- Around line 170-177: Update the MQTT docs to mention the per-client topic
format in addition to the global topic: document that commands can be sent to
both <mqtt_topic_prefix>/command and <mqtt_topic_prefix>/<client>/command and
that the kiosk accepts a ?client=<name> query parameter to map a URL to a
specific screen; add a short example using mqtt_topic_prefix (and mqtt_client_id
if relevant) showing a sample per-client topic and the corresponding kiosk URL
with ?client for clarity.

In `@config.schema.json`:
- Around line 568-588: Add JSON Schema validation so MQTT configs are rejected
when unusable: add "minimum":1 and "maximum":65535 to "mqtt_port" and add an
"if"/"then" rule that when "mqtt_enabled" is true requires non-empty
"mqtt_broker" and "mqtt_topic_prefix" (and optionally "mqtt_client_id" if you
want client id enforced). Use the JSON Schema "if": { "properties": {
"mqtt_enabled": { "const": true } } } with a "then": { "required": [...],
"properties": { "mqtt_broker": { "type":"string", "minLength":1 },
"mqtt_topic_prefix": { "type":"string", "minLength":1 } } } so validation fails
early instead of at runtime.

In `@frontend/src/ts/mqtt-sse.ts`:
- Around line 32-37: The handler for the "previous" MQTT/SSE message triggers a
non-existent event ("kiosk-prev-asset") on the element with id
"navigation-interaction-area--previous-asset", so nothing happens; update the
code in mqtt-sse.ts to trigger the actual click event (use htmx.trigger(prevEl,
"click")) or alternatively add "kiosk-prev-asset" to the hx-trigger list in the
template for `#navigation-interaction-area--previous-asset`
(internal/templates/partials/menu.templ) so the element listens for that custom
event; modify whichever approach you choose and ensure the "previous" case in
the switch uses the matching event name.
- Around line 16-19: The SSE URL construction drops existing auth query params;
modify the logic that builds url (params, client, url, evtSource) so you reuse
the full URLSearchParams from window.location.search instead of rebuilding from
client alone—set or update the "client" param on params (e.g.,
params.set("client", client)) and then use `/events?${params.toString()}` (or
`/events` when params is empty) so password/authsecret and any other query keys
are preserved for the EventSource request.

In `@internal/config/config.go`:
- Around line 159-162: The MqttClientID field currently uses a fixed default
"immich-kiosk" which causes client-ID collisions; remove the hardcoded default
in the MqttClientID struct tag (or make it empty) and ensure the code that
initializes or validates configuration (where the config is constructed/loaded)
sets a unique client ID when none is provided by calling the existing
GenerateUUID() utility or deriving one from the hostname; update/mention the
MqttClientID field and the config initialization/validation path so a unique ID
is generated at runtime (or alternatively require an explicit non-empty override
when MQTT is enabled).

In `@internal/mqtt/mqtt.go`:
- Around line 154-160: Validate and normalize clientName before using it to
build MQTT/HA identifiers in PublishClientDiscovery and any other places calling
publishDiscovery (e.g., where uniqueID/displayName are constructed);
specifically, reject empty strings and names containing path separators ("/") or
other MQTT/HA-invalid characters, or normalize them to a safe single-segment
token (e.g., return early with no publish or replace illegal characters with
"_"), and then use the validated value when creating uniqueID
("immich-kiosk-"+clientName), displayName, and when calling c.publishDiscovery
to ensure discovery topics and unique_id are well-formed.
- Around line 154-160: PublishClientDiscovery is advertising per-client devices
but using the server-wide availability topic in publishDiscovery, causing stale
"available" state; either stop auto-discovering per-client devices or change
discovery to use a per-screen availability topic driven by the SSE lifecycle.
Locate Client.PublishClientDiscovery and the publishDiscovery call and: 1) if
opting to defer discovery, skip calling publishDiscovery for per-client names
(e.g., clientName != "_global") until you can represent lifecycle; or 2) if
keeping discovery, modify publishDiscovery to accept and publish an availability
topic scoped to the specific client (e.g., include clientName or uniqueID in the
availability topic) and ensure the SSE connect/disconnect handlers update that
per-client availability (remove retained discovery or publish "offline" on
disconnect) so device availability follows the screen lifecycle.
- Around line 132-137: The constructor currently calls token := client.Connect()
followed by token.Wait(), which blocks and ignores the provided ctx; change this
to an interruptible wait that respects ctx cancellation: use token.WaitTimeout
in a short loop (or equivalent) to periodically check if the token completed,
and inside the loop select on ctx.Done() to abort the connect, call
client.Disconnect(0) (or appropriate cleanup), and return ctx.Err(); after the
loop check token.Error() as before. Update the code around client.Connect,
token, token.Wait/token.WaitTimeout and client.Disconnect to implement this
behavior.

In `@internal/routes/routes_sse.go`:
- Around line 59-67: The unsubscribe function currently removes the last channel
and client entry from h.clients but never prunes h.knownClients, allowing
unbounded growth from bogus ?client values; update SSEHub.unsubscribe to also
delete h.knownClients[clientName] when you remove the final channel (i.e.,
inside the block where you delete(h.clients, clientName) check and delete the
same key from h.knownClients), or alternatively implement a bounded cache/LRU or
TTL for h.knownClients to prevent unbounded growth if you prefer eviction over
immediate deletion; reference SSEHub.unsubscribe, SSEHub.subscribe, clients and
knownClients to locate the changes.

---

Nitpick comments:
In @.github/workflows/docker-publish.yml:
- Around line 59-60: Replace the mutable VERSION=${{ github.ref_name }} with a
conditional that uses tag names when building a tag but falls back to an
immutable commit SHA for branch builds (e.g. use github.ref_name for tags,
otherwise github.sha); update the build-args line so VERSION is set from
github.ref_name when the ref is a tag and from github.sha for branch/main builds
(reference VERSION, github.ref_name, and github.sha when making the change).
- Around line 23-50: The workflow is using mutable tags for GitHub Actions
(actions/checkout@v4, docker/login-action@v3, docker/metadata-action@v5,
docker/setup-qemu-action@v3, docker/setup-buildx-action@v3,
docker/build-push-action@v6); replace each referenced action tag with the
corresponding immutable commit SHA (e.g., actions/checkout@<sha>) throughout the
workflow file so the steps Log in to GitHub Container Registry, Extract metadata
(id: meta), Set up QEMU, Set up Docker Buildx, and Build and push use pinned
SHAs; verify the SHAs are from the official repos/releases and update the uses
lines for those actions to the full commit SHAs.

In `@internal/routes/routes_manifest.go`:
- Line 52: The manifest currently sets StartURL to the full
referer.RequestURI(), which can persist transient or sensitive query params;
change the code that constructs StartURL (where StartURL: referer.RequestURI()
is set) to parse the referer URL, filter its query parameters to an allow-list
(e.g. only keep "client" or other stable keys), rebuild the request URI using
the original path plus the filtered query string, and assign that sanitized URI
to StartURL so transient/sensitive params are not persisted.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 11797c60-f248-4269-b464-ba5d368d8975

📥 Commits

Reviewing files that changed from the base of the PR and between 5827b8f and 02a6e41.

⛔ Files ignored due to path filters (1)

go.sum is excluded by !**/*.sum

📒 Files selected for processing (11)

.github/workflows/docker-publish.yml
config.example.yaml
config.schema.json
frontend/src/ts/kiosk.ts
frontend/src/ts/mqtt-sse.ts
go.mod
internal/config/config.go
internal/mqtt/mqtt.go
internal/routes/routes_manifest.go
internal/routes/routes_sse.go
main.go

coderabbitai · 2026-03-30T09:00:50Z

+            case "previous": {
+                const prevEl = document.getElementById(
+                    "navigation-interaction-area--previous-asset",
+                );
+                if (prevEl) {
+                    htmx.trigger(prevEl, "kiosk-prev-asset");


⚠️ Potential issue | 🟠 Major

previous is wired to an event the button does not listen for.

internal/templates/partials/menu.templ:52-60 only declares hx-trigger="click ..." on #navigation-interaction-area--previous-asset, so kiosk-prev-asset is a no-op here. Trigger click instead, or add kiosk-prev-asset to the template trigger list.

↩️ Minimal fix

- htmx.trigger(prevEl, "kiosk-prev-asset"); + htmx.trigger(prevEl, "click");

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

case "previous": {

const prevEl = document.getElementById(

"navigation-interaction-area--previous-asset",

);

if (prevEl) {

htmx.trigger(prevEl, "kiosk-prev-asset");

case "previous": {

const prevEl = document.getElementById(

"navigation-interaction-area--previous-asset",

);

if (prevEl) {

htmx.trigger(prevEl, "click");

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@frontend/src/ts/mqtt-sse.ts` around lines 32 - 37, The handler for the "previous" MQTT/SSE message triggers a non-existent event ("kiosk-prev-asset") on the element with id "navigation-interaction-area--previous-asset", so nothing happens; update the code in mqtt-sse.ts to trigger the actual click event (use htmx.trigger(prevEl, "click")) or alternatively add "kiosk-prev-asset" to the hx-trigger list in the template for `#navigation-interaction-area--previous-asset` (internal/templates/partials/menu.templ) so the element listens for that custom event; modify whichever approach you choose and ensure the "previous" case in the switch uses the matching event name.

coderabbitai · 2026-03-30T09:00:50Z

+	// MqttClientID is the MQTT client identifier
+	MqttClientID string `json:"mqttClientId" yaml:"mqtt_client_id" mapstructure:"mqtt_client_id" default:"immich-kiosk"`
+	// MqttTopicPrefix is the MQTT topic prefix. Commands are received on <prefix>/command
+	MqttTopicPrefix string `json:"mqttTopicPrefix" yaml:"mqtt_topic_prefix" mapstructure:"mqtt_topic_prefix" default:"immich-kiosk"`


⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🌐 Web query:

In MQTT, what happens when two clients connect to the same broker using the same client ID? Is the earlier connection dropped?

💡 Result:

In MQTT, client IDs must be unique per broker. When two clients attempt to connect using the same client ID, the broker drops the earlier (existing) connection and accepts the new one. This behavior is mandated by the MQTT specifications for both versions 3.1.1 and 5.0 to ensure session integrity and allow session recovery after network failures. For MQTT 3.1.1: If the ClientId represents a Client already connected to the Server then the Server MUST disconnect the existing Client [MQTT-3.1.4-2] (referenced across multiple sources including broker logs and discussions). For MQTT 5.0: If the ClientID represents a Client already connected to the Server, the Server sends a DISCONNECT packet to the existing Client with Reason Code 0x8E (Session taken over) and MUST close the Network Connection of the existing Client [MQTT-3.1.4-3]. This can lead to connection flapping if both clients auto-reconnect. Some brokers offer configuration options (e.g., Mosquitto's allow_duplicate_clients) to reject duplicates instead, but standard compliant behavior is to drop the older connection. Always use unique client IDs, such as by appending random suffixes or UUIDs.

Citations:

1: https://infoneva.com/en/knowledge/mqtt-client-id-issues-resolution

2: https://docs.aws.amazon.com/iot-device-defender/latest/devguide/audit-chk-conflicting-client-ids.html

3: https://docs.oasis-open.org/mqtt/mqtt/v3.1.1/mqtt-v3.1.1.pdf

4: https://docs.oasis-open.org/mqtt/mqtt/v5.0/csprd02/mqtt-v5.0-csprd02.html

5: https://stackoverflow.com/questions/59625907/how-to-handle-multiple-clients-with-same-id-in-mosquitto-mqtt

6: https://groups.google.com/g/mqtt/c/J-_iBkefJo8

7: https://docs.oasis-open.org/mqtt/mqtt/v3.1.1/mqtt-v3.1.1.html

8: https://stackoverflow.com/questions/38140892/cant-connect-multiple-nodemcu-to-same-broker

9: https://stackoverflow.com/questions/69921167/why-does-a-second-mqtt-connect-result-in-an-mqtt-disconnect

10: https://stackoverflow.com/questions/40745547/mqtt-broker-closing-old-connection

11: https://docs.oasis-open.org/mqtt/mqtt/v5.0/os/mqtt-v5.0-os.html

🏁 Script executed:

# Search for MqttClientID usage across the codebase rg "MqttClientID" -A 3 -B 3 # Also check for any UUID or unique ID generation logic related to MQTT rg -i "mqtt.*id|clientid" --type go -A 2 | head -100

Repository: damongolding/immich-kiosk

Length of output: 3637

🏁 Script executed:

# Check for initialization/startup code that might modify MqttClientID rg -i "hostname|uuid|machine.*id" --type go internal/ | head -50 # Also check main.go or any startup sequence find . -name "main.go" -o -name "init.go" | xargs ls -la 2>/dev/null # Look for any config post-processing that might add uniqueness rg "MqttClientID\s*=" --type go -A 5 -B 5

Repository: damongolding/immich-kiosk

Length of output: 4557

🏁 Script executed:

# Check main.go to see the startup sequence head -100 main.go # Look for where config is loaded and if it's post-processed rg "NewConfig|LoadConfig" --type go -A 10 -B 2 # Check if KioskSettings or MqttClientID is modified after config load rg "MqttClientID\s*=" --type go -B 5 -A 5 | grep -v "default:"

Repository: damongolding/immich-kiosk

Length of output: 2732

🏁 Script executed:

# Check the config Load method to see if it modifies MqttClientID rg "func.*Load" internal/config/ --type go -A 15 | head -80 # Check mqtt.New() function rg "func.*New.*Client|type Client struct" internal/mqtt/ --type go -A 10

Repository: damongolding/immich-kiosk

Length of output: 3240

Use a unique identifier for the MQTT client ID instead of a fixed default.

Multiple kiosk instances using the default "immich-kiosk" will disconnect each other on the same MQTT broker—the MQTT specification requires the broker to drop the earlier connection when a duplicate client ID connects. Generate a unique ID at runtime (e.g., using the existing GenerateUUID() utility or hostname), or require an explicit override when MQTT is enabled.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@internal/config/config.go` around lines 159 - 162, The MqttClientID field currently uses a fixed default "immich-kiosk" which causes client-ID collisions; remove the hardcoded default in the MqttClientID struct tag (or make it empty) and ensure the code that initializes or validates configuration (where the config is constructed/loaded) sets a unique client ID when none is provided by calling the existing GenerateUUID() utility or deriving one from the hostname; update/mention the MqttClientID field and the config initialization/validation path so a unique ID is generated at runtime (or alternatively require an explicit non-empty override when MQTT is enabled).

coderabbitai · 2026-03-30T09:00:50Z

+	client := pahomqtt.NewClient(opts)
+	token := client.Connect()
+	token.Wait()
+	if err := token.Error(); err != nil {
+		return nil, fmt.Errorf("mqtt connect: %w", err)
+	}


⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

cd /root && find . -type f -name "*.go" | head -20

Repository: damongolding/immich-kiosk

Length of output: 124

🏁 Script executed:

find . -type f -name "mqtt.go" -o -name "mqtt_test.go"

Repository: damongolding/immich-kiosk

Length of output: 92

🏁 Script executed:

if [ -f "internal/mqtt/mqtt.go" ]; then wc -l internal/mqtt/mqtt.go fi

Repository: damongolding/immich-kiosk

Length of output: 94

🏁 Script executed:

if [ -f "internal/mqtt/mqtt.go" ]; then sed -n '120,150p' internal/mqtt/mqtt.go fi

Repository: damongolding/immich-kiosk

Length of output: 776

🏁 Script executed:

if [ -f "internal/mqtt/mqtt.go" ]; then cat -n internal/mqtt/mqtt.go | head -50 fi

Repository: damongolding/immich-kiosk

Length of output: 1943

🏁 Script executed:

sed -n '1,100p' internal/mqtt/mqtt.go | cat -n

Repository: damongolding/immich-kiosk

Length of output: 3875

🏁 Script executed:

sed -n '100,160p' internal/mqtt/mqtt.go | cat -n

Repository: damongolding/immich-kiosk

Length of output: 2477

🏁 Script executed:

grep -n "^func New" internal/mqtt/mqtt.go

Repository: damongolding/immich-kiosk

Length of output: 151

🏁 Script executed:

grep -r "WaitTimeout" --include="*.go"

Repository: damongolding/immich-kiosk

Length of output: 51

🏁 Script executed:

rg "github.com/eclipse/paho.mqtt.golang" -A 5 -B 5 | head -30

Repository: damongolding/immich-kiosk

Length of output: 1919

🏁 Script executed:

grep -n "go func()" internal/mqtt/mqtt.go

Repository: damongolding/immich-kiosk

Length of output: 85

🌐 Web query:

eclipse paho mqtt golang token WaitTimeout API

💡 Result:

The Eclipse Paho MQTT Golang client library (github.com/eclipse/paho.mqtt.golang, latest v1.5.1 as of 2025-09-16) provides the Token interface for tracking asynchronous MQTT operations like Connect, Publish, Subscribe, etc. The WaitTimeout method is part of this interface. Signature: func (t *baseToken) WaitTimeout(d time.Duration) bool Behavior: Waits up to duration d for the associated operation (token flow) to complete. Returns true if completed before timeout (success or error), false if timeout occurs. Importantly, on timeout (false), token.Error remains nil intentionally, as the operation may still be ongoing and complete later. Users must handle the false case separately (e.g., cancel or retry). From source (token.go): func (b baseToken) WaitTimeout(d time.Duration) bool { timer := time.NewTimer(d) select { case <-b.complete: if !timer.Stop { <-timer.C } return true case <-timer.C: return false } } Documentation comment: "WaitTimeout takes a time.Duration to wait for the flow associated with the Token to complete, returns true if it returned before the timeout or returns false if the timeout occurred. In the case of a timeout the Token does not have an error set in case the caller wishes to wait again." Common misuse (warned in issues): if token.WaitTimeout(d) && token.Error != nil { ... } – this skips timeout handling since false short-circuits. Recommended usage (from library, added in recent updates): Use WaitTokenTimeout(t Token, d time.Duration) error for convenience: if err := WaitTokenTimeout(token, 5time.Second); err != nil { if errors.Is(err, TimedOut) { // handle timeout } else { // handle other error } } Where TimedOut = errors.New("context canceled") – but note it says operation may still succeed later. Token also provides: - Wait bool: Waits indefinitely, always returns true. - Done <-chan struct{}: For select{} with contexts/timeouts. - Error error: Error if operation failed (not on timeout). Examples: token := client.Connect if !token.WaitTimeout(5 * time.Second) { // timeout - decide to return error, retry, etc. return fmt.Errorf("connect timeout") } if token.Error != nil { return token.Error } Official docs: pkg.go.dev/github.com/eclipse/paho.mqtt.golang (links to godoc with full API). Repo: github.com/eclipse/paho.mqtt.golang. Note: Separate MQTT v5 client at github.com/eclipse/paho.golang (no tokens, uses contexts).

Citations:

1: https://github.com/eclipse-paho/paho.mqtt.golang/blob/master/token.go

2: token.WaitTimeout There is an error missing. eclipse-paho/paho.mqtt.golang#656

3: token.WaitTimeout There is an error missing. eclipse-paho/paho.mqtt.golang#656

4: https://github.com/eclipse-paho/paho.mqtt.golang

5: https://pkg.go.dev/github.com/eclipse/paho.mqtt.golang

6: https://github.com/eclipse/paho.mqtt.golang/blob/master/client.go

Make the initial connect path respect context cancellation.

New accepts context.Context but token.Wait() never checks it. If the broker is slow or unavailable, the constructor can block indefinitely even after ctx is cancelled, preventing graceful shutdown or timeout enforcement.

Suggested fix

client := pahomqtt.NewClient(opts) token := client.Connect() -token.Wait() -if err := token.Error(); err != nil { - return nil, fmt.Errorf("mqtt connect: %w", err) +for { + if token.WaitTimeout(500 * time.Millisecond) { + if err := token.Error(); err != nil { + return nil, fmt.Errorf("mqtt connect: %w", err) + } + break + } + select { + case <-ctx.Done(): + client.Disconnect(0) + return nil, ctx.Err() + default: + } }

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@internal/mqtt/mqtt.go` around lines 132 - 137, The constructor currently calls token := client.Connect() followed by token.Wait(), which blocks and ignores the provided ctx; change this to an interruptible wait that respects ctx cancellation: use token.WaitTimeout in a short loop (or equivalent) to periodically check if the token completed, and inside the loop select on ctx.Done() to abort the connect, call client.Disconnect(0) (or appropriate cleanup), and return ctx.Err(); after the loop check token.Error() as before. Update the code around client.Connect, token, token.Wait/token.WaitTimeout and client.Disconnect to implement this behavior.

coderabbitai · 2026-03-30T09:00:50Z

+func (c *Client) PublishClientDiscovery(clientName string) {
+	if clientName == "_global" {
+		return
+	}
+	uniqueID := "immich-kiosk-" + clientName
+	displayName := "Immich Kiosk - " + clientName
+	c.publishDiscovery(c.client, uniqueID, clientName, displayName)


⚠️ Potential issue | 🟠 Major

Validate clientName before generating MQTT and HA identifiers.

This code interpolates the raw screen name into prefix/<client>/command, the discovery topic, and unique_id. "" produces broken discovery identifiers, and foo/bar produces a different topic shape from the single-segment form the rest of this package expects, so the discovered controls silently stop working. Reject or normalise invalid names here instead of relying on upstream callers.

Possible fix

func (c *Client) PublishClientDiscovery(clientName string) { - if clientName == "_global" { + clientName = strings.TrimSpace(clientName) + if clientName == "" || clientName == "_global" { return } + if strings.ContainsAny(clientName, "/+#") { + log.Error("MQTT client name must be a single safe topic segment", "client", clientName) + return + } uniqueID := "immich-kiosk-" + clientName displayName := "Immich Kiosk - " + clientName c.publishDiscovery(c.client, uniqueID, clientName, displayName)

Also applies to: 171-172, 190-193

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@internal/mqtt/mqtt.go` around lines 154 - 160, Validate and normalize clientName before using it to build MQTT/HA identifiers in PublishClientDiscovery and any other places calling publishDiscovery (e.g., where uniqueID/displayName are constructed); specifically, reject empty strings and names containing path separators ("/") or other MQTT/HA-invalid characters, or normalize them to a safe single-segment token (e.g., return early with no publish or replace illegal characters with "_"), and then use the validated value when creating uniqueID ("immich-kiosk-"+clientName), displayName, and when calling c.publishDiscovery to ensure discovery topics and unique_id are well-formed.

⚠️ Potential issue | 🟠 Major

Per-client devices are advertised as server-wide, not screen-wide.

PublishClientDiscovery adds a retained device for each named client, but publishDiscovery points every one of them at the global availability topic. After a screen disconnects, Home Assistant will still show that device as available whenever the server is online, which leaves stale controls in the UI. Please drive per-client availability/removal from the SSE lifecycle, or avoid auto-discovering per-client devices until that lifecycle can be represented.

Also applies to: 173-175

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@internal/mqtt/mqtt.go` around lines 154 - 160, PublishClientDiscovery is advertising per-client devices but using the server-wide availability topic in publishDiscovery, causing stale "available" state; either stop auto-discovering per-client devices or change discovery to use a per-screen availability topic driven by the SSE lifecycle. Locate Client.PublishClientDiscovery and the publishDiscovery call and: 1) if opting to defer discovery, skip calling publishDiscovery for per-client names (e.g., clientName != "_global") until you can represent lifecycle; or 2) if keeping discovery, modify publishDiscovery to accept and publish an availability topic scoped to the specific client (e.g., include clientName or uniqueID in the availability topic) and ensure the SSE connect/disconnect handlers update that per-client availability (remove retained discovery or publish "offline" on disconnect) so device availability follows the screen lifecycle.

oleost · 2026-03-30T09:19:41Z

Correct me if I am mistaken but this PR appears to be completely AI generated. This that the case?

Hi, this is correct. But I have it already in use in my home. Please feel free to discard it if wanted.

- docker-publish.yml: remove tag trigger to avoid double-publish with existing ghcr-release.yaml; use commit SHA as VERSION for branch builds - mqtt-sse.ts: forward all existing query params (including auth) to the SSE endpoint instead of rebuilding the URL from client only - routes_manifest.go: filter start_url to only keep the "client" param so transient/sensitive query params are not baked into the PWA manifest - routes_sse.go: delete from knownClients on last disconnect to prevent unbounded memory growth - mqtt.go: validate clientName before using it as an MQTT topic segment; reject names containing '/', '#', or '+' - config.schema.json: add port range (1-65535) and if/then rule requiring mqtt_broker when mqtt_enabled is true - config.example.yaml: document per-client topic format and ?client= usage Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

coderabbitai

Actionable comments posted: 2

🧹 Nitpick comments (3)

internal/routes/routes_sse.go (1)
115-115: Unused parameter _ *config.Config.

The config.Config parameter is not used in this handler. If it's required for interface consistency with other route handlers, this is fine. Otherwise, consider removing it to clarify the function signature.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/routes/routes_sse.go` at line 115, The SSEEvents handler declares an
unused parameter `_ *config.Config`; either remove the parameter from the
SSEEvents signature if it's not required for interface consistency, or if you
must keep the signature, rename the parameter to a named but unused identifier
(e.g., cfg *config.Config) and use the blank identifier only if deliberate;
locate the SSEEvents function declaration and adjust its signature accordingly
to eliminate the unused parameter warning while preserving any required
interface compatibility.
internal/mqtt/mqtt.go (2)
222-222: Publish errors are not checked.

client.Publish() returns a token that could indicate a failure. Although the operation is fire-and-forget for discovery, consider at least debug-logging if the publish fails to aid troubleshooting.
Optional: log publish failures
-		client.Publish(discoveryTopic, 1, true, payload)
+		if t := client.Publish(discoveryTopic, 1, true, payload); t.Wait() && t.Error() != nil {
+			log.Debug("MQTT discovery publish failed", "topic", discoveryTopic, "err", t.Error())
+		}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/mqtt/mqtt.go` at line 222, client.Publish(...) return token is
ignored; capture the token from client.Publish(discoveryTopic, 1, true,
payload), wait (or Check) for completion and if token.Error() != nil, log the
error (at debug or warn level) including discoveryTopic and payload/context to
aid troubleshooting; update the publish call site that currently uses
client.Publish(discoveryTopic, 1, true, payload) to store the returned token,
call token.Wait() (or WaitTimeout) and then log token.Error() when non-nil.
100-121: Consider using Info level for successful operations.

log.Warn is typically reserved for concerning conditions. Successful connection and subscription events at lines 101, 112, and 120 would be more appropriate at Info level, whilst Warn remains suitable for connection lost (line 125) and reconnecting (line 129) events.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/mqtt/mqtt.go` around lines 100 - 121, The success logs inside the
MQTT on-connect handler use log.Warn but should be Info: update the log level
calls in the opts.SetOnConnectHandler block—specifically the "MQTT connected"
message (currently using log.Warn), the "MQTT subscribed" message after
subscribing to c.topic, and the "MQTT subscribed" message after subscribing to
wildcardTopic—to use log.Info instead; leave other warning logs (e.g.,
connection lost/reconnecting) as log.Warn. Ensure you update the calls near
client.Publish, c.publishDiscovery, the Subscribe result checks for c.topic and
wildcardTopic that reference c.messageHandler.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In @.github/workflows/docker-publish.yml:
- Around line 12-14: The workflow lacks concurrency control for the
build-and-push job so overlapping runs on main can finish out of order; add a
concurrency block (either at the workflow or job level) for the build-and-push
job using a deterministic group like "build-and-push-${{ github.ref }}" or "${{
github.workflow }}-${{ github.ref }}" and set cancel-in-progress: true so any
in-flight build-and-push for the same ref is canceled when a new run starts,
preventing stale latest/main image tags.

In `@internal/mqtt/mqtt.go`:
- Around line 226-229: The handlers slice (Client.handlers) is accessed
concurrently by AddHandler and messageHandler causing a data race; add a
sync.RWMutex (e.g., handlersMu) to the Client struct and use it to protect
access: acquire handlersMu.Lock() when modifying the slice in AddHandler (and
append), and acquire handlersMu.RLock()/RUnlock() when iterating or reading the
slice in messageHandler; ensure you initialize the mutex via the zero value and
avoid holding locks while calling user handlers (copy the slice under the lock
then iterate/call handlers outside the lock).

---

Nitpick comments:
In `@internal/mqtt/mqtt.go`:
- Line 222: client.Publish(...) return token is ignored; capture the token from
client.Publish(discoveryTopic, 1, true, payload), wait (or Check) for completion
and if token.Error() != nil, log the error (at debug or warn level) including
discoveryTopic and payload/context to aid troubleshooting; update the publish
call site that currently uses client.Publish(discoveryTopic, 1, true, payload)
to store the returned token, call token.Wait() (or WaitTimeout) and then log
token.Error() when non-nil.
- Around line 100-121: The success logs inside the MQTT on-connect handler use
log.Warn but should be Info: update the log level calls in the
opts.SetOnConnectHandler block—specifically the "MQTT connected" message
(currently using log.Warn), the "MQTT subscribed" message after subscribing to
c.topic, and the "MQTT subscribed" message after subscribing to wildcardTopic—to
use log.Info instead; leave other warning logs (e.g., connection
lost/reconnecting) as log.Warn. Ensure you update the calls near client.Publish,
c.publishDiscovery, the Subscribe result checks for c.topic and wildcardTopic
that reference c.messageHandler.

In `@internal/routes/routes_sse.go`:
- Line 115: The SSEEvents handler declares an unused parameter `_
*config.Config`; either remove the parameter from the SSEEvents signature if
it's not required for interface consistency, or if you must keep the signature,
rename the parameter to a named but unused identifier (e.g., cfg *config.Config)
and use the blank identifier only if deliberate; locate the SSEEvents function
declaration and adjust its signature accordingly to eliminate the unused
parameter warning while preserving any required interface compatibility.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 03876182-c7ba-4f5f-b55d-de16e63d0446

📥 Commits

Reviewing files that changed from the base of the PR and between 02a6e41 and cfe9498.

📒 Files selected for processing (7)

.github/workflows/docker-publish.yml
config.example.yaml
config.schema.json
frontend/src/ts/mqtt-sse.ts
internal/mqtt/mqtt.go
internal/routes/routes_manifest.go
internal/routes/routes_sse.go

✅ Files skipped from review due to trivial changes (1)

config.example.yaml

🚧 Files skipped from review as they are similar to previous changes (3)

internal/routes/routes_manifest.go
frontend/src/ts/mqtt-sse.ts
config.schema.json

- docker-publish.yml: add concurrency group so rapid pushes to main cancel the previous in-progress build, preventing stale image tags - mqtt.go: protect the handlers slice with a RWMutex to eliminate the data race between AddHandler() and messageHandler() which can be called from different goroutines after Connect() is established Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

damongolding · 2026-03-30T09:33:46Z

Correct me if I am mistaken but this PR appears to be completely AI generated. This that the case?

Hi, this is correct. But I have it already in use in my home. Please feel free to discard it if wanted.

Does it work well for you?

oleost · 2026-03-30T09:34:58Z

Correct me if I am mistaken but this PR appears to be completely AI generated. This that the case?

Hi, this is correct. But I have it already in use in my home. Please feel free to discard it if wanted.

Does it work well for you?

Only had it for short while, no issue so far. Can report back in a couple of days. Currently making it work through some of coderabbitai complaints.

But if you see something that obviously need changing please let me know.

coderabbitai

🧹 Nitpick comments (1)

internal/mqtt/mqtt.go (1)
85-86: Consider adding TLS support for secure MQTT connections.

The broker URL is hardcoded to use tcp://. For production deployments, especially with authentication credentials, TLS (ssl:// or tls://) is recommended. Consider adding a configuration option like mqtt_tls to enable secure connections.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/mqtt/mqtt.go` around lines 85 - 86, The broker URL is hardcoded to
"tcp://" — add TLS support by introducing a configuration flag (e.g.
settings.MqttTLS or mqtt_tls) and use it to choose the scheme ("tcp://" vs
"ssl://" or "tls://") when building the broker URL where opts :=
pahomqtt.NewClientOptions() and opts.AddBroker(...) is called; if TLS is enabled
also set a TLS configuration on the client options (opts.SetTLSConfig(...)) and
consider defaulting the port to 8883 when settings.MqttTLS is true to ensure
secure connections.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@internal/mqtt/mqtt.go`:
- Around line 85-86: The broker URL is hardcoded to "tcp://" — add TLS support
by introducing a configuration flag (e.g. settings.MqttTLS or mqtt_tls) and use
it to choose the scheme ("tcp://" vs "ssl://" or "tls://") when building the
broker URL where opts := pahomqtt.NewClientOptions() and opts.AddBroker(...) is
called; if TLS is enabled also set a TLS configuration on the client options
(opts.SetTLSConfig(...)) and consider defaulting the port to 8883 when
settings.MqttTLS is true to ensure secure connections.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: ce6fcb6b-9442-4b7a-ab31-e289c2505d23

📥 Commits

Reviewing files that changed from the base of the PR and between cfe9498 and 727e0e0.

📒 Files selected for processing (2)

.github/workflows/docker-publish.yml
internal/mqtt/mqtt.go

✅ Files skipped from review due to trivial changes (1)

.github/workflows/docker-publish.yml

Two problems caused MQTT commands to be silently lost: 1. Proxies and iOS suspend idle SSE connections after inactivity. Fix: send a keepalive comment every 30 s to keep the connection alive. 2. If a command arrived while the browser was reconnecting (no active SSE client), it was dropped with no retry. Fix: queue the last command with a 10 s TTL. When a client reconnects the queued command is delivered immediately, then discarded. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

coderabbitai

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@internal/routes/routes_sse.go`:
- Around line 68-82: The current replay logic in the SSE handler uses a single
shared pending entry map h.pending with keys clientName and "" and deletes the
matched slot immediately, which causes a global broadcast (key "") to be
consumed by the first reconnecting client and hides valid per-client entries;
change this to either maintain per-client replay state or use a sequence-based
catch-up so broadcasts are replayable to every client. Specifically, update the
logic around h.pending, the pending entry type (p.expires, p.event) and the
replay loop (which currently iterates over []string{clientName, ""} and
deletes+breaks) so that global entries under "" are not removed when replayed to
one client (instead mark per-client delivery or track last-seen sequence IDs per
client), and ensure per-client slots do not short-circuit visibility of a
still-valid global slot; keep the TTL check (p.expires) but do not delete the
global entry until all clients have acknowledged or its TTL expires.
- Around line 112-143: The code marks delivered=true even when send drops the
event; change send to return bool (true if the channel send succeeded) and
update both loops to set delivered |= send(ch) instead of unconditionally
setting delivered=true; ensure the fallback that writes h.pending[""] (for
target=="" case) or h.pending[target] only runs when delivered is false; keep
the pendingCmd construction and expiry logic (time.Now().Add(ssePendingTTL)) and
use h.clients, pending, pendingCmd, send and target to locate and implement the
changes.
- Around line 176-190: Validate and bound the client name before using it as a
hub key or writing it into the stream: when reading clientName from
c.QueryParam("client") enforce a length cap (e.g., max 32) and allow-list
characters (letters, digits, underscore, hyphen) via a helper like
validSSEClientName; if validation fails or empty, set clientName = "_global";
then use that sanitized clientName in sseHub.subscribe/unsubscribe and in the
fmt.Fprintf call so no newlines or arbitrary long/junk names can create buckets
or inject extra SSE lines.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 638a4ce3-8f57-4bee-aefb-89f0165cacec

📥 Commits

Reviewing files that changed from the base of the PR and between 727e0e0 and 79951a7.

📒 Files selected for processing (1)

internal/routes/routes_sse.go

oleost · 2026-03-30T10:33:57Z

SSE reliability — known limitation

During testing we observed that MQTT commands occasionally fail to reach the kiosk even after the heartbeat + command-queue fix landed in the latest commit.

Root cause: zombie SSE connections

The sequence that causes a lost command:

The SSE connection drops silently (no TCP FIN reaches the server — common with iOS or reverse proxies).
The browser's EventSource detects the drop quickly (~3 s) and reconnects — a new channel is registered in the hub.
The server has not yet detected that the old connection is dead, so the old ("zombie") channel is still in the hub alongside the new live one.
An MQTT command arrives while both channels exist for the same clientName. Go map iteration is random, so the command may be dispatched to the zombie channel.
The zombie goroutine attempts to write to the dead TCP connection — the write fails silently. The command was already consumed from the channel, so it is never placed in the pending queue, and is lost.

The 30 s heartbeat eventually forces context cancellation and cleans up the zombie, but a command arriving in the window between drop and cleanup will be lost.

Why SSE instead of WebSockets?

SSE was chosen for simplicity: no extra library needed, one-way push is all we require, and proxies handle it well. The downside is that SSE has no built-in ping/pong mechanism, so dead connections can only be detected indirectly (via write errors or context cancellation), creating this zombie window.

WebSockets would solve the problem cleanly — the server can actively ping clients and detect dead connections within seconds — but would require additional reconnect logic in the frontend since EventSource's automatic reconnection is not available for WebSockets.

Current status

No action taken for now. The heartbeat + command-queue fix (added in the latest commit) reduces the frequency significantly. A switch to WebSockets remains a possible future improvement if reliability becomes a bigger concern.

- Global pending entry is no longer deleted on first client reconnect; each reconnecting client can now receive a broadcast command within its TTL - send() returns bool so delivered is only set when the channel accept succeeds, ensuring a fallback pending entry is created when all channels are full - Client names read from ?client= are validated (letters, digits, hyphens, underscores, max 32 chars) before use as hub keys or written into the SSE stream; invalid names fall back to _global Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

coderabbitai

🧹 Nitpick comments (3)

internal/routes/routes_sse.go (3)
74-75: Consider documenting the channel buffer size rationale.

The buffer size of 4 appears reasonable but lacks documentation explaining the choice. A brief comment would help future maintainers understand whether this value was chosen for specific reasons (e.g., expected burst size, memory trade-offs).
📝 Suggested documentation
 func (h *SSEHub) subscribe(clientName string) chan string {
+	// Buffer allows a small burst of commands to queue without blocking the
+	// sender; 4 is sufficient for typical navigation bursts.
 	ch := make(chan string, 4)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/routes/routes_sse.go` around lines 74 - 75, The channel buffer size
in subscribe (func (h *SSEHub) subscribe(clientName string) chan string) is
undocumented; add a concise comment above the ch := make(chan string, 4) line
explaining why buffer size 4 was chosen (e.g., expected event burst size,
trade-off between latency and memory, or avoidance of blocking on short bursts)
so future maintainers understand the rationale and can adjust it confidently.
230-235: Consider escaping event data for SSE format safety.

The event data is written directly into the SSE stream. Currently safe because mqtt.Command only defines "next" and "previous", but if future commands contain newlines or other special characters, this could corrupt the SSE format or allow injection of additional events.
🛡️ Defensive encoding suggestion
 			case event, ok := <-ch:
 				if !ok {
 					return nil
 				}
-				fmt.Fprintf(w, "event: kiosk-command\ndata: %s\n\n", event)
+				// SSE data lines cannot contain raw newlines; replace if present.
+				safeEvent := strings.ReplaceAll(event, "\n", "\\n")
+				fmt.Fprintf(w, "event: kiosk-command\ndata: %s\n\n", safeEvent)
 				flusher.Flush()
Add "strings" to imports if applying this change.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/routes/routes_sse.go` around lines 230 - 235, The SSE payload writes
event directly via fmt.Fprintf(w, "event: kiosk-command\ndata: %s\n\n", event),
which can break the SSE stream if event contains newlines or special characters;
update the send logic in the goroutine that reads from ch to escape or sanitize
event payloads before writing — e.g., replace actual newline characters with
literal "\n" or split the string and write each line prefixed with "data: " so
multi-line events remain valid SSE (use strings.ReplaceAll or strings.Split and
loop), then call flusher.Flush() as before.
217-218: Write errors are silently ignored.

fmt.Fprintf can return errors if the underlying connection is broken, but these are not checked. For SSE, this is generally acceptable since the context cancellation will eventually trigger cleanup, and the next write would also fail. However, checking write errors could enable faster detection of dead connections, reducing the zombie window mentioned in the PR discussion.
💡 Optional: check write errors for faster cleanup
-		fmt.Fprintf(w, ": connected client=%s\n\n", clientName)
-		flusher.Flush()
+		if _, err := fmt.Fprintf(w, ": connected client=%s\n\n", clientName); err != nil {
+			return nil
+		}
+		flusher.Flush()
Apply similar pattern to heartbeat and event writes in the loop.
Also applies to: 227-229
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/routes/routes_sse.go` around lines 217 - 218, The SSE write calls
(e.g., fmt.Fprintf(w, ": connected client=%s\n\n", clientName)) currently ignore
returned errors; update these to capture and handle the error (check the error
from fmt.Fprintf and from flusher.Flush), log the error and break/return to
terminate the SSE handler promptly so the connection is cleaned up, and apply
the same pattern to the heartbeat and event write sites in the SSE loop (check
write errors, log context including clientName, and exit the handler on error).

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@internal/routes/routes_sse.go`:
- Around line 74-75: The channel buffer size in subscribe (func (h *SSEHub)
subscribe(clientName string) chan string) is undocumented; add a concise comment
above the ch := make(chan string, 4) line explaining why buffer size 4 was
chosen (e.g., expected event burst size, trade-off between latency and memory,
or avoidance of blocking on short bursts) so future maintainers understand the
rationale and can adjust it confidently.
- Around line 230-235: The SSE payload writes event directly via fmt.Fprintf(w,
"event: kiosk-command\ndata: %s\n\n", event), which can break the SSE stream if
event contains newlines or special characters; update the send logic in the
goroutine that reads from ch to escape or sanitize event payloads before writing
— e.g., replace actual newline characters with literal "\n" or split the string
and write each line prefixed with "data: " so multi-line events remain valid SSE
(use strings.ReplaceAll or strings.Split and loop), then call flusher.Flush() as
before.
- Around line 217-218: The SSE write calls (e.g., fmt.Fprintf(w, ": connected
client=%s\n\n", clientName)) currently ignore returned errors; update these to
capture and handle the error (check the error from fmt.Fprintf and from
flusher.Flush), log the error and break/return to terminate the SSE handler
promptly so the connection is cleaned up, and apply the same pattern to the
heartbeat and event write sites in the SSE loop (check write errors, log context
including clientName, and exit the handler on error).

ℹ️ Review info

⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: a1b2cbb9-6552-4878-a877-5e97b2ea3d41

📥 Commits

Reviewing files that changed from the base of the PR and between 79951a7 and c41d671.

📒 Files selected for processing (1)

internal/routes/routes_sse.go

oleost · 2026-03-30T16:53:20Z

Been testing today and have no issues.

Nice to be able to control from home assistant and even setup an automation when having family and we are going though the pictures to make them change more often.

If you want to have a look and maybe merge?

oleost and others added 2 commits March 30, 2026 10:43

coderabbitai Bot reviewed Mar 30, 2026

View reviewed changes

Comment thread .github/workflows/docker-publish.yml

Comment thread internal/mqtt/mqtt.go

coderabbitai Bot reviewed Mar 30, 2026

View reviewed changes

Comment thread internal/routes/routes_sse.go

Comment thread internal/routes/routes_sse.go Outdated

Comment thread internal/routes/routes_sse.go Outdated

coderabbitai Bot reviewed Mar 30, 2026

View reviewed changes

Uh oh!

Conversation

oleost commented Mar 30, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Add MQTT remote control with Home Assistant auto-discovery

Summary

How MQTT control works

Home Assistant auto-discovery

Configuration

PWA fix

Changes

Summary by CodeRabbit

Uh oh!

coderabbitai Bot commented Mar 30, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Reviews paused

Walkthrough

Changes

Sequence Diagram

Estimated code review effort

Possibly related PRs

Suggested labels

Poem

Uh oh!

damongolding commented Mar 30, 2026

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

coderabbitai Bot Mar 30, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Mar 30, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Mar 30, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Mar 30, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

oleost commented Mar 30, 2026

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

damongolding commented Mar 30, 2026

Uh oh!

oleost commented Mar 30, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

oleost commented Mar 30, 2026

SSE reliability — known limitation

Root cause: zombie SSE connections

Why SSE instead of WebSockets?

Current status

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

oleost commented Mar 30, 2026

Uh oh!

Reviewers

Assignees

oleost commented Mar 30, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented Mar 30, 2026 •

edited

Loading

oleost commented Mar 30, 2026 •

edited

Loading