Skip to content

feat: add MQTT remote control with Home Assistant auto-discovery#712

Open
oleost wants to merge 6 commits into
damongolding:mainfrom
oleost:main
Open

feat: add MQTT remote control with Home Assistant auto-discovery#712
oleost wants to merge 6 commits into
damongolding:mainfrom
oleost:main

Conversation

@oleost
Copy link
Copy Markdown

@oleost oleost commented Mar 30, 2026

Add MQTT remote control with Home Assistant auto-discovery

Summary

This PR adds MQTT-based remote control to Immich Kiosk, enabling integration with Home Assistant and other MQTT-capable systems. It also fixes a PWA bug where iOS strips query parameters when adding to the home screen.

How MQTT control works

The kiosk connects to a configured MQTT broker and subscribes to command topics. Commands are forwarded to connected browsers via Server-Sent Events (SSE), which trigger the existing HTMX navigation — no polling, no page reload.

Supported commands: next / previous

Topics:

<prefix>/command              → all connected clients
<prefix>/<client>/command     → a specific screen only

Per-screen targeting: Open the kiosk with ?client=living-room in the URL. Each named client gets its own MQTT topic and its own device in Home Assistant.

Home Assistant auto-discovery

On connect, the kiosk publishes MQTT Discovery messages so it appears automatically in HA as a device with Next and Previous buttons — no manual HA configuration needed. Availability (online/offline) is handled via MQTT Last Will and Testament.

Configuration

kiosk:
  mqtt_enabled: true
  mqtt_broker: "192.168.1.x"
  mqtt_port: 1883
  mqtt_username: ""        # optional
  mqtt_password: ""        # optional
  mqtt_client_id: "immich-kiosk"
  mqtt_topic_prefix: "immich-kiosk"

All settings are also available as environment variables (KIOSK_MQTT_ENABLED, KIOSK_MQTT_BROKER etc).

PWA fix

start_url in the web app manifest now includes query parameters, so ?client=living-room is preserved when the page is added to the iOS home screen.

Changes

  • internal/mqtt/ — new package: MQTT client, HA discovery, LWT
  • internal/routes/routes_sse.go — SSE hub grouped by client name
  • frontend/src/ts/mqtt-sse.ts — EventSource listener → HTMX triggers
  • internal/config/config.go + config.schema.json — new MQTT settings
  • internal/routes/routes_manifest.go — PWA manifest fix
  • config.example.yaml — documented MQTT options

Dependency added: github.com/eclipse/paho.mqtt.golang

Summary by CodeRabbit

  • New Features

    • MQTT-based remote control for kiosk navigation (next/previous) with Home Assistant discovery.
    • Real-time server-sent events (SSE) endpoint and hub for broadcasting and queuing commands.
    • Frontend SSE integration and optional commented MQTT kiosk settings plus schema validation.
  • Bug Fixes

    • Manifest start URL now preserves relevant client query parameter for correct start behaviour.
  • Chores

    • Automated multi-architecture Docker build & publish workflow for main branch.

oleost and others added 2 commits March 30, 2026 10:43
Adds MQTT-based remote control so the kiosk can be controlled from
Home Assistant or any other MQTT-capable system.

How it works:
- The kiosk connects to a configured MQTT broker on startup
- It subscribes to <topic_prefix>/command and accepts "next" / "previous"
- Commands are forwarded to connected browsers via a Server-Sent Events
  (SSE) endpoint at /events, which triggers the existing HTMX navigation
- On connect, the kiosk publishes Home Assistant MQTT Discovery messages
  so it appears automatically as a device in HA with Next and Previous
  buttons, without any manual HA configuration
- Availability (online/offline) is handled via MQTT Last Will and Testament

New config options (under kiosk:):
  mqtt_enabled: true
  mqtt_broker: "192.168.1.x"
  mqtt_port: 1883
  mqtt_username: ""
  mqtt_password: ""
  mqtt_client_id: "immich-kiosk"
  mqtt_topic_prefix: "immich-kiosk"

Also adds a GitHub Actions workflow to build and push a multi-platform
Docker image (amd64 + arm64) to ghcr.io on push to main or version tags.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
iOS strips query parameters when adding a page to the home screen,
causing ?client= to be lost on subsequent launches. Using RequestURI()
instead of Path in the manifest start_url ensures the full URL
including query string is saved, so the client name persists.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented Mar 30, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review

Walkthrough

Adds MQTT-based remote control and Home Assistant discovery, an SSE hub to relay MQTT commands to connected kiosks, frontend SSE handling mapping commands to htmx events, config/schema and env bindings, backend wiring (endpoint, middleware), and a CI workflow to build/publish multi‑arch Docker images.

Changes

Cohort / File(s) Summary
CI / Docker publish
/.github/workflows/docker-publish.yml
New GitHub Actions workflow to build and push multi‑arch Docker images to ghcr.io on pushes to main (Buildx, QEMU, metadata tags, cache, VERSION build-arg).
Configuration & Schema
config.example.yaml, config.schema.json, internal/config/config.go
Added commented example MQTT kiosk settings, JSON Schema entries with conditional validation (require mqtt_broker when enabled), and environment bindings for all new kiosk MQTT fields.
MQTT Backend
internal/mqtt/mqtt.go
New MQTT client type: validates broker config, connects with LWT and auto-reconnect, subscribes to global and per‑client command topics, maps payloads to next/previous, dispatches to registered handlers, and publishes Home Assistant discovery for devices.
SSE & Routing
internal/routes/routes_sse.go, internal/routes/routes_manifest.go
Added SSEHub for per‑client subscriptions, pending event queue and TTL, non‑blocking broadcast, /events SSE handler, MQTT→SSE handler, and manifest StartURL generation preserving filtered client query param.
Frontend Integration
frontend/src/ts/mqtt-sse.ts, frontend/src/ts/kiosk.ts
New initMqttSSE() opens EventSource to /events (preserves URL query), listens for kiosk-command SSE events (next/previous) and triggers htmx events; kiosk init now calls it.
Application Wiring
main.go
Conditional MQTT initialisation when enabled, registers MQTT→SSE handler and new-client discovery hook, exposes GET /events, and exempts /events from Gzip compression.
Dependencies
go.mod
Added direct dependency github.com/eclipse/paho.mqtt.golang v1.5.0 (and an indirect github.com/gorilla/websocket v1.5.3).

Sequence Diagram

sequenceDiagram
    participant HA as Home Assistant
    participant MQTT as MQTT Broker
    participant Backend as Backend Server
    participant SSE as SSE Hub
    participant FE as Frontend Client
    participant DOM as DOM

    HA->>MQTT: Publish command (topic: <prefix>/command or <prefix>/<client>/command)
    MQTT->>Backend: Deliver message to subscribed backend client
    Backend->>Backend: Parse topic & payload → determine Command + target
    Backend->>SSE: Broadcast "kiosk-command" event (target or all)
    SSE->>FE: EventSource sends "kiosk-command" SSE event
    FE->>FE: Parse event data ("next"/"previous")
    FE->>DOM: Trigger htmx event (kiosk-new-asset / kiosk-prev-asset)
    DOM->>FE: Frontend reacts (navigate assets)
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

  • Task/release #615: Modifies frontend/src/ts/kiosk.ts initialization flow — touches the same frontend init sequence where initMqttSSE() is inserted.
  • Fix/config validation #520: Changes to configuration schema and internal/config/config.go — overlaps with the added MQTT schema and env bindings.

Suggested labels

enhancement

Poem

🐰🥕 I tuned my whiskers to the line,
MQTT whispers, SSE hums in time.
"Next" or "previous" — a tiny hop, so neat,
Kiosks leap forward with every beat. ✨

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately describes the main feature added: MQTT remote control with Home Assistant auto-discovery, which aligns with the primary objectives and substantial code changes across configuration, backend MQTT handling, SSE integration, and frontend implementation.
Docstring Coverage ✅ Passed Docstring coverage is 92.31% which is sufficient. The required threshold is 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@damongolding
Copy link
Copy Markdown
Owner

Correct me if I am mistaken but this PR appears to be completely AI generated. This that the case?

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 10

🧹 Nitpick comments (3)
internal/routes/routes_manifest.go (1)

52-52: Restrict persisted query parameters in start_url.

Line 52 currently persists the full referer query string. Consider allow-listing only stable params (for example client) so transient/sensitive query keys are not baked into the installed PWA launch URL.

Suggested hardening patch
 func Manifest(c *echo.Context) error {
 	refererURL := c.Request().Referer()
 	if refererURL == "" {
 		refererURL = "/"
 	}

 	referer, err := url.Parse(refererURL)
 	if err != nil {
 		log.Error("parsing URL", "url", refererURL, "err", err)
 		return errors.New("could not read URL. Is it formatted correctly?")
 	}
+
+	startURL := referer.EscapedPath()
+	if startURL == "" {
+		startURL = "/"
+	}
+	q := referer.Query()
+	filtered := url.Values{}
+	if client := q.Get("client"); client != "" {
+		filtered.Set("client", client)
+	}
+	if encoded := filtered.Encode(); encoded != "" {
+		startURL += "?" + encoded
+	}

 	manifest := &ManifestJSON{
 		Name:            "Immich Kiosk",
 		ShortName:       "Kiosk",
 		Description:     "Immich Kiosk is a lightweight slideshow for running on kiosk devices and browsers that uses Immich as a data source.",
-		StartURL:        referer.RequestURI(),
+		StartURL:        startURL,
 		Scope:           "/",
 		Display:         "fullscreen",
 		BackgroundColor: "#000000",
 		ThemeColor:      "#1f262f",
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/routes/routes_manifest.go` at line 52, The manifest currently sets
StartURL to the full referer.RequestURI(), which can persist transient or
sensitive query params; change the code that constructs StartURL (where
StartURL: referer.RequestURI() is set) to parse the referer URL, filter its
query parameters to an allow-list (e.g. only keep "client" or other stable
keys), rebuild the request URI using the original path plus the filtered query
string, and assign that sanitized URI to StartURL so transient/sensitive params
are not persisted.
.github/workflows/docker-publish.yml (2)

59-60: Consider an immutable VERSION value for branch builds.

On main, github.ref_name resolves to main, which makes the embedded version less traceable than a commit SHA. A tag-or-SHA fallback is usually easier to debug in production.

Suggested tweak
           build-args: |
-            VERSION=${{ github.ref_name }}
+            VERSION=${{ startsWith(github.ref, 'refs/tags/') && github.ref_name || github.sha }}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/workflows/docker-publish.yml around lines 59 - 60, Replace the
mutable VERSION=${{ github.ref_name }} with a conditional that uses tag names
when building a tag but falls back to an immutable commit SHA for branch builds
(e.g. use github.ref_name for tags, otherwise github.sha); update the build-args
line so VERSION is set from github.ref_name when the ref is a tag and from
github.sha for branch/main builds (reference VERSION, github.ref_name, and
github.sha when making the change).

23-50: Pin GitHub Actions to immutable commit SHAs.

All actions in this workflow use mutable version tags (@v3, @v4, etc.) rather than full commit SHA hashes. This is a supply-chain security hardening gap. Replace the following tags with their corresponding commit SHAs:

  • Line 23: actions/checkout@v4
  • Line 26: docker/login-action@v3
  • Line 34: docker/metadata-action@v5
  • Line 44: docker/setup-qemu-action@v3
  • Line 47: docker/setup-buildx-action@v3
  • Line 50: docker/build-push-action@v6

For example: actions/checkout@c85c95e3d7251135ab7dc9cb2c7369c640f9abf1

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/workflows/docker-publish.yml around lines 23 - 50, The workflow is
using mutable tags for GitHub Actions (actions/checkout@v4,
docker/login-action@v3, docker/metadata-action@v5, docker/setup-qemu-action@v3,
docker/setup-buildx-action@v3, docker/build-push-action@v6); replace each
referenced action tag with the corresponding immutable commit SHA (e.g.,
actions/checkout@<sha>) throughout the workflow file so the steps Log in to
GitHub Container Registry, Extract metadata (id: meta), Set up QEMU, Set up
Docker Buildx, and Build and push use pinned SHAs; verify the SHAs are from the
official repos/releases and update the uses lines for those actions to the full
commit SHAs.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In @.github/workflows/docker-publish.yml:
- Around line 3-9: The workflow trigger includes on.push.tags: "v*" which
duplicates the same tag trigger in ghcr-release.yaml causing double GHCR pushes;
fix by consolidating or gating one publisher: either remove the tags: - "v*"
trigger from docker-publish.yml (keep branch main) and let ghcr-release.yaml
handle tagged releases, or add a conditional guard in docker-publish.yml (e.g.,
check an explicit environment variable or use if: github.event_name == 'push' &&
!contains(github.ref, 'refs/tags/')) so only one workflow publishes on tag
pushes; update the on.push.tags entry or add the conditional in
docker-publish.yml to prevent duplicate GHCR pushes.

In `@config.example.yaml`:
- Around line 170-177: Update the MQTT docs to mention the per-client topic
format in addition to the global topic: document that commands can be sent to
both <mqtt_topic_prefix>/command and <mqtt_topic_prefix>/<client>/command and
that the kiosk accepts a ?client=<name> query parameter to map a URL to a
specific screen; add a short example using mqtt_topic_prefix (and mqtt_client_id
if relevant) showing a sample per-client topic and the corresponding kiosk URL
with ?client for clarity.

In `@config.schema.json`:
- Around line 568-588: Add JSON Schema validation so MQTT configs are rejected
when unusable: add "minimum":1 and "maximum":65535 to "mqtt_port" and add an
"if"/"then" rule that when "mqtt_enabled" is true requires non-empty
"mqtt_broker" and "mqtt_topic_prefix" (and optionally "mqtt_client_id" if you
want client id enforced). Use the JSON Schema "if": { "properties": {
"mqtt_enabled": { "const": true } } } with a "then": { "required": [...],
"properties": { "mqtt_broker": { "type":"string", "minLength":1 },
"mqtt_topic_prefix": { "type":"string", "minLength":1 } } } so validation fails
early instead of at runtime.

In `@frontend/src/ts/mqtt-sse.ts`:
- Around line 32-37: The handler for the "previous" MQTT/SSE message triggers a
non-existent event ("kiosk-prev-asset") on the element with id
"navigation-interaction-area--previous-asset", so nothing happens; update the
code in mqtt-sse.ts to trigger the actual click event (use htmx.trigger(prevEl,
"click")) or alternatively add "kiosk-prev-asset" to the hx-trigger list in the
template for `#navigation-interaction-area--previous-asset`
(internal/templates/partials/menu.templ) so the element listens for that custom
event; modify whichever approach you choose and ensure the "previous" case in
the switch uses the matching event name.
- Around line 16-19: The SSE URL construction drops existing auth query params;
modify the logic that builds url (params, client, url, evtSource) so you reuse
the full URLSearchParams from window.location.search instead of rebuilding from
client alone—set or update the "client" param on params (e.g.,
params.set("client", client)) and then use `/events?${params.toString()}` (or
`/events` when params is empty) so password/authsecret and any other query keys
are preserved for the EventSource request.

In `@internal/config/config.go`:
- Around line 159-162: The MqttClientID field currently uses a fixed default
"immich-kiosk" which causes client-ID collisions; remove the hardcoded default
in the MqttClientID struct tag (or make it empty) and ensure the code that
initializes or validates configuration (where the config is constructed/loaded)
sets a unique client ID when none is provided by calling the existing
GenerateUUID() utility or deriving one from the hostname; update/mention the
MqttClientID field and the config initialization/validation path so a unique ID
is generated at runtime (or alternatively require an explicit non-empty override
when MQTT is enabled).

In `@internal/mqtt/mqtt.go`:
- Around line 154-160: Validate and normalize clientName before using it to
build MQTT/HA identifiers in PublishClientDiscovery and any other places calling
publishDiscovery (e.g., where uniqueID/displayName are constructed);
specifically, reject empty strings and names containing path separators ("/") or
other MQTT/HA-invalid characters, or normalize them to a safe single-segment
token (e.g., return early with no publish or replace illegal characters with
"_"), and then use the validated value when creating uniqueID
("immich-kiosk-"+clientName), displayName, and when calling c.publishDiscovery
to ensure discovery topics and unique_id are well-formed.
- Around line 154-160: PublishClientDiscovery is advertising per-client devices
but using the server-wide availability topic in publishDiscovery, causing stale
"available" state; either stop auto-discovering per-client devices or change
discovery to use a per-screen availability topic driven by the SSE lifecycle.
Locate Client.PublishClientDiscovery and the publishDiscovery call and: 1) if
opting to defer discovery, skip calling publishDiscovery for per-client names
(e.g., clientName != "_global") until you can represent lifecycle; or 2) if
keeping discovery, modify publishDiscovery to accept and publish an availability
topic scoped to the specific client (e.g., include clientName or uniqueID in the
availability topic) and ensure the SSE connect/disconnect handlers update that
per-client availability (remove retained discovery or publish "offline" on
disconnect) so device availability follows the screen lifecycle.
- Around line 132-137: The constructor currently calls token := client.Connect()
followed by token.Wait(), which blocks and ignores the provided ctx; change this
to an interruptible wait that respects ctx cancellation: use token.WaitTimeout
in a short loop (or equivalent) to periodically check if the token completed,
and inside the loop select on ctx.Done() to abort the connect, call
client.Disconnect(0) (or appropriate cleanup), and return ctx.Err(); after the
loop check token.Error() as before. Update the code around client.Connect,
token, token.Wait/token.WaitTimeout and client.Disconnect to implement this
behavior.

In `@internal/routes/routes_sse.go`:
- Around line 59-67: The unsubscribe function currently removes the last channel
and client entry from h.clients but never prunes h.knownClients, allowing
unbounded growth from bogus ?client values; update SSEHub.unsubscribe to also
delete h.knownClients[clientName] when you remove the final channel (i.e.,
inside the block where you delete(h.clients, clientName) check and delete the
same key from h.knownClients), or alternatively implement a bounded cache/LRU or
TTL for h.knownClients to prevent unbounded growth if you prefer eviction over
immediate deletion; reference SSEHub.unsubscribe, SSEHub.subscribe, clients and
knownClients to locate the changes.

---

Nitpick comments:
In @.github/workflows/docker-publish.yml:
- Around line 59-60: Replace the mutable VERSION=${{ github.ref_name }} with a
conditional that uses tag names when building a tag but falls back to an
immutable commit SHA for branch builds (e.g. use github.ref_name for tags,
otherwise github.sha); update the build-args line so VERSION is set from
github.ref_name when the ref is a tag and from github.sha for branch/main builds
(reference VERSION, github.ref_name, and github.sha when making the change).
- Around line 23-50: The workflow is using mutable tags for GitHub Actions
(actions/checkout@v4, docker/login-action@v3, docker/metadata-action@v5,
docker/setup-qemu-action@v3, docker/setup-buildx-action@v3,
docker/build-push-action@v6); replace each referenced action tag with the
corresponding immutable commit SHA (e.g., actions/checkout@<sha>) throughout the
workflow file so the steps Log in to GitHub Container Registry, Extract metadata
(id: meta), Set up QEMU, Set up Docker Buildx, and Build and push use pinned
SHAs; verify the SHAs are from the official repos/releases and update the uses
lines for those actions to the full commit SHAs.

In `@internal/routes/routes_manifest.go`:
- Line 52: The manifest currently sets StartURL to the full
referer.RequestURI(), which can persist transient or sensitive query params;
change the code that constructs StartURL (where StartURL: referer.RequestURI()
is set) to parse the referer URL, filter its query parameters to an allow-list
(e.g. only keep "client" or other stable keys), rebuild the request URI using
the original path plus the filtered query string, and assign that sanitized URI
to StartURL so transient/sensitive params are not persisted.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 11797c60-f248-4269-b464-ba5d368d8975

📥 Commits

Reviewing files that changed from the base of the PR and between 5827b8f and 02a6e41.

⛔ Files ignored due to path filters (1)
  • go.sum is excluded by !**/*.sum
📒 Files selected for processing (11)
  • .github/workflows/docker-publish.yml
  • config.example.yaml
  • config.schema.json
  • frontend/src/ts/kiosk.ts
  • frontend/src/ts/mqtt-sse.ts
  • go.mod
  • internal/config/config.go
  • internal/mqtt/mqtt.go
  • internal/routes/routes_manifest.go
  • internal/routes/routes_sse.go
  • main.go

Comment thread .github/workflows/docker-publish.yml
Comment thread config.example.yaml Outdated
Comment thread config.schema.json
Comment thread frontend/src/ts/mqtt-sse.ts
Comment on lines +32 to +37
case "previous": {
const prevEl = document.getElementById(
"navigation-interaction-area--previous-asset",
);
if (prevEl) {
htmx.trigger(prevEl, "kiosk-prev-asset");
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

previous is wired to an event the button does not listen for.

internal/templates/partials/menu.templ:52-60 only declares hx-trigger="click ..." on #navigation-interaction-area--previous-asset, so kiosk-prev-asset is a no-op here. Trigger click instead, or add kiosk-prev-asset to the template trigger list.

↩️ Minimal fix
-                    htmx.trigger(prevEl, "kiosk-prev-asset");
+                    htmx.trigger(prevEl, "click");
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
case "previous": {
const prevEl = document.getElementById(
"navigation-interaction-area--previous-asset",
);
if (prevEl) {
htmx.trigger(prevEl, "kiosk-prev-asset");
case "previous": {
const prevEl = document.getElementById(
"navigation-interaction-area--previous-asset",
);
if (prevEl) {
htmx.trigger(prevEl, "click");
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@frontend/src/ts/mqtt-sse.ts` around lines 32 - 37, The handler for the
"previous" MQTT/SSE message triggers a non-existent event ("kiosk-prev-asset")
on the element with id "navigation-interaction-area--previous-asset", so nothing
happens; update the code in mqtt-sse.ts to trigger the actual click event (use
htmx.trigger(prevEl, "click")) or alternatively add "kiosk-prev-asset" to the
hx-trigger list in the template for `#navigation-interaction-area--previous-asset`
(internal/templates/partials/menu.templ) so the element listens for that custom
event; modify whichever approach you choose and ensure the "previous" case in
the switch uses the matching event name.

Comment thread internal/config/config.go
Comment on lines +159 to +162
// MqttClientID is the MQTT client identifier
MqttClientID string `json:"mqttClientId" yaml:"mqtt_client_id" mapstructure:"mqtt_client_id" default:"immich-kiosk"`
// MqttTopicPrefix is the MQTT topic prefix. Commands are received on <prefix>/command
MqttTopicPrefix string `json:"mqttTopicPrefix" yaml:"mqtt_topic_prefix" mapstructure:"mqtt_topic_prefix" default:"immich-kiosk"`
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🌐 Web query:

In MQTT, what happens when two clients connect to the same broker using the same client ID? Is the earlier connection dropped?

💡 Result:

In MQTT, client IDs must be unique per broker. When two clients attempt to connect using the same client ID, the broker drops the earlier (existing) connection and accepts the new one. This behavior is mandated by the MQTT specifications for both versions 3.1.1 and 5.0 to ensure session integrity and allow session recovery after network failures. For MQTT 3.1.1: If the ClientId represents a Client already connected to the Server then the Server MUST disconnect the existing Client [MQTT-3.1.4-2] (referenced across multiple sources including broker logs and discussions). For MQTT 5.0: If the ClientID represents a Client already connected to the Server, the Server sends a DISCONNECT packet to the existing Client with Reason Code 0x8E (Session taken over) and MUST close the Network Connection of the existing Client [MQTT-3.1.4-3]. This can lead to connection flapping if both clients auto-reconnect. Some brokers offer configuration options (e.g., Mosquitto's allow_duplicate_clients) to reject duplicates instead, but standard compliant behavior is to drop the older connection. Always use unique client IDs, such as by appending random suffixes or UUIDs.

Citations:


🏁 Script executed:

# Search for MqttClientID usage across the codebase
rg "MqttClientID" -A 3 -B 3

# Also check for any UUID or unique ID generation logic related to MQTT
rg -i "mqtt.*id|clientid" --type go -A 2 | head -100

Repository: damongolding/immich-kiosk

Length of output: 3637


🏁 Script executed:

# Check for initialization/startup code that might modify MqttClientID
rg -i "hostname|uuid|machine.*id" --type go internal/ | head -50

# Also check main.go or any startup sequence
find . -name "main.go" -o -name "init.go" | xargs ls -la 2>/dev/null

# Look for any config post-processing that might add uniqueness
rg "MqttClientID\s*=" --type go -A 5 -B 5

Repository: damongolding/immich-kiosk

Length of output: 4557


🏁 Script executed:

# Check main.go to see the startup sequence
head -100 main.go

# Look for where config is loaded and if it's post-processed
rg "NewConfig|LoadConfig" --type go -A 10 -B 2

# Check if KioskSettings or MqttClientID is modified after config load
rg "MqttClientID\s*=" --type go -B 5 -A 5 | grep -v "default:"

Repository: damongolding/immich-kiosk

Length of output: 2732


🏁 Script executed:

# Check the config Load method to see if it modifies MqttClientID
rg "func.*Load" internal/config/ --type go -A 15 | head -80

# Check mqtt.New() function
rg "func.*New.*Client|type Client struct" internal/mqtt/ --type go -A 10

Repository: damongolding/immich-kiosk

Length of output: 3240


Use a unique identifier for the MQTT client ID instead of a fixed default.

Multiple kiosk instances using the default "immich-kiosk" will disconnect each other on the same MQTT broker—the MQTT specification requires the broker to drop the earlier connection when a duplicate client ID connects. Generate a unique ID at runtime (e.g., using the existing GenerateUUID() utility or hostname), or require an explicit override when MQTT is enabled.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/config/config.go` around lines 159 - 162, The MqttClientID field
currently uses a fixed default "immich-kiosk" which causes client-ID collisions;
remove the hardcoded default in the MqttClientID struct tag (or make it empty)
and ensure the code that initializes or validates configuration (where the
config is constructed/loaded) sets a unique client ID when none is provided by
calling the existing GenerateUUID() utility or deriving one from the hostname;
update/mention the MqttClientID field and the config initialization/validation
path so a unique ID is generated at runtime (or alternatively require an
explicit non-empty override when MQTT is enabled).

Comment thread internal/mqtt/mqtt.go
Comment on lines +132 to +137
client := pahomqtt.NewClient(opts)
token := client.Connect()
token.Wait()
if err := token.Error(); err != nil {
return nil, fmt.Errorf("mqtt connect: %w", err)
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

cd /root && find . -type f -name "*.go" | head -20

Repository: damongolding/immich-kiosk

Length of output: 124


🏁 Script executed:

find . -type f -name "mqtt.go" -o -name "mqtt_test.go"

Repository: damongolding/immich-kiosk

Length of output: 92


🏁 Script executed:

if [ -f "internal/mqtt/mqtt.go" ]; then
  wc -l internal/mqtt/mqtt.go
fi

Repository: damongolding/immich-kiosk

Length of output: 94


🏁 Script executed:

if [ -f "internal/mqtt/mqtt.go" ]; then
  sed -n '120,150p' internal/mqtt/mqtt.go
fi

Repository: damongolding/immich-kiosk

Length of output: 776


🏁 Script executed:

if [ -f "internal/mqtt/mqtt.go" ]; then
  cat -n internal/mqtt/mqtt.go | head -50
fi

Repository: damongolding/immich-kiosk

Length of output: 1943


🏁 Script executed:

sed -n '1,100p' internal/mqtt/mqtt.go | cat -n

Repository: damongolding/immich-kiosk

Length of output: 3875


🏁 Script executed:

sed -n '100,160p' internal/mqtt/mqtt.go | cat -n

Repository: damongolding/immich-kiosk

Length of output: 2477


🏁 Script executed:

grep -n "^func New" internal/mqtt/mqtt.go

Repository: damongolding/immich-kiosk

Length of output: 151


🏁 Script executed:

grep -r "WaitTimeout" --include="*.go"

Repository: damongolding/immich-kiosk

Length of output: 51


🏁 Script executed:

rg "github.com/eclipse/paho.mqtt.golang" -A 5 -B 5 | head -30

Repository: damongolding/immich-kiosk

Length of output: 1919


🏁 Script executed:

grep -n "go func()" internal/mqtt/mqtt.go

Repository: damongolding/immich-kiosk

Length of output: 85


🌐 Web query:

eclipse paho mqtt golang token WaitTimeout API

💡 Result:

The Eclipse Paho MQTT Golang client library (github.com/eclipse/paho.mqtt.golang, latest v1.5.1 as of 2025-09-16) provides the Token interface for tracking asynchronous MQTT operations like Connect, Publish, Subscribe, etc. The WaitTimeout method is part of this interface. Signature: func (t *baseToken) WaitTimeout(d time.Duration) bool Behavior: Waits up to duration d for the associated operation (token flow) to complete. Returns true if completed before timeout (success or error), false if timeout occurs. Importantly, on timeout (false), token.Error remains nil intentionally, as the operation may still be ongoing and complete later. Users must handle the false case separately (e.g., cancel or retry). From source (token.go): func (b baseToken) WaitTimeout(d time.Duration) bool { timer := time.NewTimer(d) select { case <-b.complete: if !timer.Stop { <-timer.C } return true case <-timer.C: return false } } Documentation comment: "WaitTimeout takes a time.Duration to wait for the flow associated with the Token to complete, returns true if it returned before the timeout or returns false if the timeout occurred. In the case of a timeout the Token does not have an error set in case the caller wishes to wait again." Common misuse (warned in issues): if token.WaitTimeout(d) && token.Error != nil { ... } – this skips timeout handling since false short-circuits. Recommended usage (from library, added in recent updates): Use WaitTokenTimeout(t Token, d time.Duration) error for convenience: if err := WaitTokenTimeout(token, 5time.Second); err != nil { if errors.Is(err, TimedOut) { // handle timeout } else { // handle other error } } Where TimedOut = errors.New("context canceled") – but note it says operation may still succeed later. Token also provides: - Wait bool: Waits indefinitely, always returns true. - Done <-chan struct{}: For select{} with contexts/timeouts. - Error error: Error if operation failed (not on timeout). Examples: token := client.Connect if !token.WaitTimeout(5 * time.Second) { // timeout - decide to return error, retry, etc. return fmt.Errorf("connect timeout") } if token.Error != nil { return token.Error } Official docs: pkg.go.dev/github.com/eclipse/paho.mqtt.golang (links to godoc with full API). Repo: github.com/eclipse/paho.mqtt.golang. Note: Separate MQTT v5 client at github.com/eclipse/paho.golang (no tokens, uses contexts).

Citations:


Make the initial connect path respect context cancellation.

New accepts context.Context but token.Wait() never checks it. If the broker is slow or unavailable, the constructor can block indefinitely even after ctx is cancelled, preventing graceful shutdown or timeout enforcement.

Suggested fix
 client := pahomqtt.NewClient(opts)
 token := client.Connect()
-token.Wait()
-if err := token.Error(); err != nil {
-	return nil, fmt.Errorf("mqtt connect: %w", err)
+for {
+	if token.WaitTimeout(500 * time.Millisecond) {
+		if err := token.Error(); err != nil {
+			return nil, fmt.Errorf("mqtt connect: %w", err)
+		}
+		break
+	}
+	select {
+	case <-ctx.Done():
+		client.Disconnect(0)
+		return nil, ctx.Err()
+	default:
+	}
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/mqtt/mqtt.go` around lines 132 - 137, The constructor currently
calls token := client.Connect() followed by token.Wait(), which blocks and
ignores the provided ctx; change this to an interruptible wait that respects ctx
cancellation: use token.WaitTimeout in a short loop (or equivalent) to
periodically check if the token completed, and inside the loop select on
ctx.Done() to abort the connect, call client.Disconnect(0) (or appropriate
cleanup), and return ctx.Err(); after the loop check token.Error() as before.
Update the code around client.Connect, token, token.Wait/token.WaitTimeout and
client.Disconnect to implement this behavior.

Comment thread internal/mqtt/mqtt.go
Comment on lines +154 to +160
func (c *Client) PublishClientDiscovery(clientName string) {
if clientName == "_global" {
return
}
uniqueID := "immich-kiosk-" + clientName
displayName := "Immich Kiosk - " + clientName
c.publishDiscovery(c.client, uniqueID, clientName, displayName)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Validate clientName before generating MQTT and HA identifiers.

This code interpolates the raw screen name into prefix/<client>/command, the discovery topic, and unique_id. "" produces broken discovery identifiers, and foo/bar produces a different topic shape from the single-segment form the rest of this package expects, so the discovered controls silently stop working. Reject or normalise invalid names here instead of relying on upstream callers.

Possible fix
 func (c *Client) PublishClientDiscovery(clientName string) {
-	if clientName == "_global" {
+	clientName = strings.TrimSpace(clientName)
+	if clientName == "" || clientName == "_global" {
 		return
 	}
+	if strings.ContainsAny(clientName, "/+#") {
+		log.Error("MQTT client name must be a single safe topic segment", "client", clientName)
+		return
+	}
 	uniqueID := "immich-kiosk-" + clientName
 	displayName := "Immich Kiosk - " + clientName
 	c.publishDiscovery(c.client, uniqueID, clientName, displayName)

Also applies to: 171-172, 190-193

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/mqtt/mqtt.go` around lines 154 - 160, Validate and normalize
clientName before using it to build MQTT/HA identifiers in
PublishClientDiscovery and any other places calling publishDiscovery (e.g.,
where uniqueID/displayName are constructed); specifically, reject empty strings
and names containing path separators ("/") or other MQTT/HA-invalid characters,
or normalize them to a safe single-segment token (e.g., return early with no
publish or replace illegal characters with "_"), and then use the validated
value when creating uniqueID ("immich-kiosk-"+clientName), displayName, and when
calling c.publishDiscovery to ensure discovery topics and unique_id are
well-formed.

⚠️ Potential issue | 🟠 Major

Per-client devices are advertised as server-wide, not screen-wide.

PublishClientDiscovery adds a retained device for each named client, but publishDiscovery points every one of them at the global availability topic. After a screen disconnects, Home Assistant will still show that device as available whenever the server is online, which leaves stale controls in the UI. Please drive per-client availability/removal from the SSE lifecycle, or avoid auto-discovering per-client devices until that lifecycle can be represented.

Also applies to: 173-175

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/mqtt/mqtt.go` around lines 154 - 160, PublishClientDiscovery is
advertising per-client devices but using the server-wide availability topic in
publishDiscovery, causing stale "available" state; either stop auto-discovering
per-client devices or change discovery to use a per-screen availability topic
driven by the SSE lifecycle. Locate Client.PublishClientDiscovery and the
publishDiscovery call and: 1) if opting to defer discovery, skip calling
publishDiscovery for per-client names (e.g., clientName != "_global") until you
can represent lifecycle; or 2) if keeping discovery, modify publishDiscovery to
accept and publish an availability topic scoped to the specific client (e.g.,
include clientName or uniqueID in the availability topic) and ensure the SSE
connect/disconnect handlers update that per-client availability (remove retained
discovery or publish "offline" on disconnect) so device availability follows the
screen lifecycle.

Comment thread internal/routes/routes_sse.go
@oleost
Copy link
Copy Markdown
Author

oleost commented Mar 30, 2026

Correct me if I am mistaken but this PR appears to be completely AI generated. This that the case?

Hi, this is correct. But I have it already in use in my home. Please feel free to discard it if wanted.

- docker-publish.yml: remove tag trigger to avoid double-publish with
  existing ghcr-release.yaml; use commit SHA as VERSION for branch builds
- mqtt-sse.ts: forward all existing query params (including auth) to the
  SSE endpoint instead of rebuilding the URL from client only
- routes_manifest.go: filter start_url to only keep the "client" param
  so transient/sensitive query params are not baked into the PWA manifest
- routes_sse.go: delete from knownClients on last disconnect to prevent
  unbounded memory growth
- mqtt.go: validate clientName before using it as an MQTT topic segment;
  reject names containing '/', '#', or '+'
- config.schema.json: add port range (1-65535) and if/then rule requiring
  mqtt_broker when mqtt_enabled is true
- config.example.yaml: document per-client topic format and ?client= usage

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (3)
internal/routes/routes_sse.go (1)

115-115: Unused parameter _ *config.Config.

The config.Config parameter is not used in this handler. If it's required for interface consistency with other route handlers, this is fine. Otherwise, consider removing it to clarify the function signature.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/routes/routes_sse.go` at line 115, The SSEEvents handler declares an
unused parameter `_ *config.Config`; either remove the parameter from the
SSEEvents signature if it's not required for interface consistency, or if you
must keep the signature, rename the parameter to a named but unused identifier
(e.g., cfg *config.Config) and use the blank identifier only if deliberate;
locate the SSEEvents function declaration and adjust its signature accordingly
to eliminate the unused parameter warning while preserving any required
interface compatibility.
internal/mqtt/mqtt.go (2)

222-222: Publish errors are not checked.

client.Publish() returns a token that could indicate a failure. Although the operation is fire-and-forget for discovery, consider at least debug-logging if the publish fails to aid troubleshooting.

Optional: log publish failures
-		client.Publish(discoveryTopic, 1, true, payload)
+		if t := client.Publish(discoveryTopic, 1, true, payload); t.Wait() && t.Error() != nil {
+			log.Debug("MQTT discovery publish failed", "topic", discoveryTopic, "err", t.Error())
+		}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/mqtt/mqtt.go` at line 222, client.Publish(...) return token is
ignored; capture the token from client.Publish(discoveryTopic, 1, true,
payload), wait (or Check) for completion and if token.Error() != nil, log the
error (at debug or warn level) including discoveryTopic and payload/context to
aid troubleshooting; update the publish call site that currently uses
client.Publish(discoveryTopic, 1, true, payload) to store the returned token,
call token.Wait() (or WaitTimeout) and then log token.Error() when non-nil.

100-121: Consider using Info level for successful operations.

log.Warn is typically reserved for concerning conditions. Successful connection and subscription events at lines 101, 112, and 120 would be more appropriate at Info level, whilst Warn remains suitable for connection lost (line 125) and reconnecting (line 129) events.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/mqtt/mqtt.go` around lines 100 - 121, The success logs inside the
MQTT on-connect handler use log.Warn but should be Info: update the log level
calls in the opts.SetOnConnectHandler block—specifically the "MQTT connected"
message (currently using log.Warn), the "MQTT subscribed" message after
subscribing to c.topic, and the "MQTT subscribed" message after subscribing to
wildcardTopic—to use log.Info instead; leave other warning logs (e.g.,
connection lost/reconnecting) as log.Warn. Ensure you update the calls near
client.Publish, c.publishDiscovery, the Subscribe result checks for c.topic and
wildcardTopic that reference c.messageHandler.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In @.github/workflows/docker-publish.yml:
- Around line 12-14: The workflow lacks concurrency control for the
build-and-push job so overlapping runs on main can finish out of order; add a
concurrency block (either at the workflow or job level) for the build-and-push
job using a deterministic group like "build-and-push-${{ github.ref }}" or "${{
github.workflow }}-${{ github.ref }}" and set cancel-in-progress: true so any
in-flight build-and-push for the same ref is canceled when a new run starts,
preventing stale latest/main image tags.

In `@internal/mqtt/mqtt.go`:
- Around line 226-229: The handlers slice (Client.handlers) is accessed
concurrently by AddHandler and messageHandler causing a data race; add a
sync.RWMutex (e.g., handlersMu) to the Client struct and use it to protect
access: acquire handlersMu.Lock() when modifying the slice in AddHandler (and
append), and acquire handlersMu.RLock()/RUnlock() when iterating or reading the
slice in messageHandler; ensure you initialize the mutex via the zero value and
avoid holding locks while calling user handlers (copy the slice under the lock
then iterate/call handlers outside the lock).

---

Nitpick comments:
In `@internal/mqtt/mqtt.go`:
- Line 222: client.Publish(...) return token is ignored; capture the token from
client.Publish(discoveryTopic, 1, true, payload), wait (or Check) for completion
and if token.Error() != nil, log the error (at debug or warn level) including
discoveryTopic and payload/context to aid troubleshooting; update the publish
call site that currently uses client.Publish(discoveryTopic, 1, true, payload)
to store the returned token, call token.Wait() (or WaitTimeout) and then log
token.Error() when non-nil.
- Around line 100-121: The success logs inside the MQTT on-connect handler use
log.Warn but should be Info: update the log level calls in the
opts.SetOnConnectHandler block—specifically the "MQTT connected" message
(currently using log.Warn), the "MQTT subscribed" message after subscribing to
c.topic, and the "MQTT subscribed" message after subscribing to wildcardTopic—to
use log.Info instead; leave other warning logs (e.g., connection
lost/reconnecting) as log.Warn. Ensure you update the calls near client.Publish,
c.publishDiscovery, the Subscribe result checks for c.topic and wildcardTopic
that reference c.messageHandler.

In `@internal/routes/routes_sse.go`:
- Line 115: The SSEEvents handler declares an unused parameter `_
*config.Config`; either remove the parameter from the SSEEvents signature if
it's not required for interface consistency, or if you must keep the signature,
rename the parameter to a named but unused identifier (e.g., cfg *config.Config)
and use the blank identifier only if deliberate; locate the SSEEvents function
declaration and adjust its signature accordingly to eliminate the unused
parameter warning while preserving any required interface compatibility.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 03876182-c7ba-4f5f-b55d-de16e63d0446

📥 Commits

Reviewing files that changed from the base of the PR and between 02a6e41 and cfe9498.

📒 Files selected for processing (7)
  • .github/workflows/docker-publish.yml
  • config.example.yaml
  • config.schema.json
  • frontend/src/ts/mqtt-sse.ts
  • internal/mqtt/mqtt.go
  • internal/routes/routes_manifest.go
  • internal/routes/routes_sse.go
✅ Files skipped from review due to trivial changes (1)
  • config.example.yaml
🚧 Files skipped from review as they are similar to previous changes (3)
  • internal/routes/routes_manifest.go
  • frontend/src/ts/mqtt-sse.ts
  • config.schema.json

Comment thread .github/workflows/docker-publish.yml
Comment thread internal/mqtt/mqtt.go
- docker-publish.yml: add concurrency group so rapid pushes to main
  cancel the previous in-progress build, preventing stale image tags
- mqtt.go: protect the handlers slice with a RWMutex to eliminate the
  data race between AddHandler() and messageHandler() which can be
  called from different goroutines after Connect() is established

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@damongolding
Copy link
Copy Markdown
Owner

Correct me if I am mistaken but this PR appears to be completely AI generated. This that the case?

Hi, this is correct. But I have it already in use in my home. Please feel free to discard it if wanted.

Does it work well for you?

@oleost
Copy link
Copy Markdown
Author

oleost commented Mar 30, 2026

Correct me if I am mistaken but this PR appears to be completely AI generated. This that the case?

Hi, this is correct. But I have it already in use in my home. Please feel free to discard it if wanted.

Does it work well for you?

Only had it for short while, no issue so far. Can report back in a couple of days. Currently making it work through some of coderabbitai complaints.

But if you see something that obviously need changing please let me know.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (1)
internal/mqtt/mqtt.go (1)

85-86: Consider adding TLS support for secure MQTT connections.

The broker URL is hardcoded to use tcp://. For production deployments, especially with authentication credentials, TLS (ssl:// or tls://) is recommended. Consider adding a configuration option like mqtt_tls to enable secure connections.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/mqtt/mqtt.go` around lines 85 - 86, The broker URL is hardcoded to
"tcp://" — add TLS support by introducing a configuration flag (e.g.
settings.MqttTLS or mqtt_tls) and use it to choose the scheme ("tcp://" vs
"ssl://" or "tls://") when building the broker URL where opts :=
pahomqtt.NewClientOptions() and opts.AddBroker(...) is called; if TLS is enabled
also set a TLS configuration on the client options (opts.SetTLSConfig(...)) and
consider defaulting the port to 8883 when settings.MqttTLS is true to ensure
secure connections.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@internal/mqtt/mqtt.go`:
- Around line 85-86: The broker URL is hardcoded to "tcp://" — add TLS support
by introducing a configuration flag (e.g. settings.MqttTLS or mqtt_tls) and use
it to choose the scheme ("tcp://" vs "ssl://" or "tls://") when building the
broker URL where opts := pahomqtt.NewClientOptions() and opts.AddBroker(...) is
called; if TLS is enabled also set a TLS configuration on the client options
(opts.SetTLSConfig(...)) and consider defaulting the port to 8883 when
settings.MqttTLS is true to ensure secure connections.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: ce6fcb6b-9442-4b7a-ab31-e289c2505d23

📥 Commits

Reviewing files that changed from the base of the PR and between cfe9498 and 727e0e0.

📒 Files selected for processing (2)
  • .github/workflows/docker-publish.yml
  • internal/mqtt/mqtt.go
✅ Files skipped from review due to trivial changes (1)
  • .github/workflows/docker-publish.yml

Two problems caused MQTT commands to be silently lost:

1. Proxies and iOS suspend idle SSE connections after inactivity.
   Fix: send a keepalive comment every 30 s to keep the connection alive.

2. If a command arrived while the browser was reconnecting (no active SSE
   client), it was dropped with no retry.
   Fix: queue the last command with a 10 s TTL. When a client reconnects
   the queued command is delivered immediately, then discarded.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@internal/routes/routes_sse.go`:
- Around line 68-82: The current replay logic in the SSE handler uses a single
shared pending entry map h.pending with keys clientName and "" and deletes the
matched slot immediately, which causes a global broadcast (key "") to be
consumed by the first reconnecting client and hides valid per-client entries;
change this to either maintain per-client replay state or use a sequence-based
catch-up so broadcasts are replayable to every client. Specifically, update the
logic around h.pending, the pending entry type (p.expires, p.event) and the
replay loop (which currently iterates over []string{clientName, ""} and
deletes+breaks) so that global entries under "" are not removed when replayed to
one client (instead mark per-client delivery or track last-seen sequence IDs per
client), and ensure per-client slots do not short-circuit visibility of a
still-valid global slot; keep the TTL check (p.expires) but do not delete the
global entry until all clients have acknowledged or its TTL expires.
- Around line 112-143: The code marks delivered=true even when send drops the
event; change send to return bool (true if the channel send succeeded) and
update both loops to set delivered |= send(ch) instead of unconditionally
setting delivered=true; ensure the fallback that writes h.pending[""] (for
target=="" case) or h.pending[target] only runs when delivered is false; keep
the pendingCmd construction and expiry logic (time.Now().Add(ssePendingTTL)) and
use h.clients, pending, pendingCmd, send and target to locate and implement the
changes.
- Around line 176-190: Validate and bound the client name before using it as a
hub key or writing it into the stream: when reading clientName from
c.QueryParam("client") enforce a length cap (e.g., max 32) and allow-list
characters (letters, digits, underscore, hyphen) via a helper like
validSSEClientName; if validation fails or empty, set clientName = "_global";
then use that sanitized clientName in sseHub.subscribe/unsubscribe and in the
fmt.Fprintf call so no newlines or arbitrary long/junk names can create buckets
or inject extra SSE lines.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 638a4ce3-8f57-4bee-aefb-89f0165cacec

📥 Commits

Reviewing files that changed from the base of the PR and between 727e0e0 and 79951a7.

📒 Files selected for processing (1)
  • internal/routes/routes_sse.go

Comment thread internal/routes/routes_sse.go
Comment thread internal/routes/routes_sse.go Outdated
Comment thread internal/routes/routes_sse.go Outdated
@oleost
Copy link
Copy Markdown
Author

oleost commented Mar 30, 2026

SSE reliability — known limitation

During testing we observed that MQTT commands occasionally fail to reach the kiosk even after the heartbeat + command-queue fix landed in the latest commit.

Root cause: zombie SSE connections

The sequence that causes a lost command:

  1. The SSE connection drops silently (no TCP FIN reaches the server — common with iOS or reverse proxies).
  2. The browser's EventSource detects the drop quickly (~3 s) and reconnects — a new channel is registered in the hub.
  3. The server has not yet detected that the old connection is dead, so the old ("zombie") channel is still in the hub alongside the new live one.
  4. An MQTT command arrives while both channels exist for the same clientName. Go map iteration is random, so the command may be dispatched to the zombie channel.
  5. The zombie goroutine attempts to write to the dead TCP connection — the write fails silently. The command was already consumed from the channel, so it is never placed in the pending queue, and is lost.

The 30 s heartbeat eventually forces context cancellation and cleans up the zombie, but a command arriving in the window between drop and cleanup will be lost.

Why SSE instead of WebSockets?

SSE was chosen for simplicity: no extra library needed, one-way push is all we require, and proxies handle it well. The downside is that SSE has no built-in ping/pong mechanism, so dead connections can only be detected indirectly (via write errors or context cancellation), creating this zombie window.

WebSockets would solve the problem cleanly — the server can actively ping clients and detect dead connections within seconds — but would require additional reconnect logic in the frontend since EventSource's automatic reconnection is not available for WebSockets.

Current status

No action taken for now. The heartbeat + command-queue fix (added in the latest commit) reduces the frequency significantly. A switch to WebSockets remains a possible future improvement if reliability becomes a bigger concern.

- Global pending entry is no longer deleted on first client reconnect;
  each reconnecting client can now receive a broadcast command within its TTL
- send() returns bool so delivered is only set when the channel accept
  succeeds, ensuring a fallback pending entry is created when all channels
  are full
- Client names read from ?client= are validated (letters, digits, hyphens,
  underscores, max 32 chars) before use as hub keys or written into the
  SSE stream; invalid names fall back to _global

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (3)
internal/routes/routes_sse.go (3)

74-75: Consider documenting the channel buffer size rationale.

The buffer size of 4 appears reasonable but lacks documentation explaining the choice. A brief comment would help future maintainers understand whether this value was chosen for specific reasons (e.g., expected burst size, memory trade-offs).

📝 Suggested documentation
 func (h *SSEHub) subscribe(clientName string) chan string {
+	// Buffer allows a small burst of commands to queue without blocking the
+	// sender; 4 is sufficient for typical navigation bursts.
 	ch := make(chan string, 4)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/routes/routes_sse.go` around lines 74 - 75, The channel buffer size
in subscribe (func (h *SSEHub) subscribe(clientName string) chan string) is
undocumented; add a concise comment above the ch := make(chan string, 4) line
explaining why buffer size 4 was chosen (e.g., expected event burst size,
trade-off between latency and memory, or avoidance of blocking on short bursts)
so future maintainers understand the rationale and can adjust it confidently.

230-235: Consider escaping event data for SSE format safety.

The event data is written directly into the SSE stream. Currently safe because mqtt.Command only defines "next" and "previous", but if future commands contain newlines or other special characters, this could corrupt the SSE format or allow injection of additional events.

🛡️ Defensive encoding suggestion
 			case event, ok := <-ch:
 				if !ok {
 					return nil
 				}
-				fmt.Fprintf(w, "event: kiosk-command\ndata: %s\n\n", event)
+				// SSE data lines cannot contain raw newlines; replace if present.
+				safeEvent := strings.ReplaceAll(event, "\n", "\\n")
+				fmt.Fprintf(w, "event: kiosk-command\ndata: %s\n\n", safeEvent)
 				flusher.Flush()

Add "strings" to imports if applying this change.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/routes/routes_sse.go` around lines 230 - 235, The SSE payload writes
event directly via fmt.Fprintf(w, "event: kiosk-command\ndata: %s\n\n", event),
which can break the SSE stream if event contains newlines or special characters;
update the send logic in the goroutine that reads from ch to escape or sanitize
event payloads before writing — e.g., replace actual newline characters with
literal "\n" or split the string and write each line prefixed with "data: " so
multi-line events remain valid SSE (use strings.ReplaceAll or strings.Split and
loop), then call flusher.Flush() as before.

217-218: Write errors are silently ignored.

fmt.Fprintf can return errors if the underlying connection is broken, but these are not checked. For SSE, this is generally acceptable since the context cancellation will eventually trigger cleanup, and the next write would also fail. However, checking write errors could enable faster detection of dead connections, reducing the zombie window mentioned in the PR discussion.

💡 Optional: check write errors for faster cleanup
-		fmt.Fprintf(w, ": connected client=%s\n\n", clientName)
-		flusher.Flush()
+		if _, err := fmt.Fprintf(w, ": connected client=%s\n\n", clientName); err != nil {
+			return nil
+		}
+		flusher.Flush()

Apply similar pattern to heartbeat and event writes in the loop.

Also applies to: 227-229

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/routes/routes_sse.go` around lines 217 - 218, The SSE write calls
(e.g., fmt.Fprintf(w, ": connected client=%s\n\n", clientName)) currently ignore
returned errors; update these to capture and handle the error (check the error
from fmt.Fprintf and from flusher.Flush), log the error and break/return to
terminate the SSE handler promptly so the connection is cleaned up, and apply
the same pattern to the heartbeat and event write sites in the SSE loop (check
write errors, log context including clientName, and exit the handler on error).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@internal/routes/routes_sse.go`:
- Around line 74-75: The channel buffer size in subscribe (func (h *SSEHub)
subscribe(clientName string) chan string) is undocumented; add a concise comment
above the ch := make(chan string, 4) line explaining why buffer size 4 was
chosen (e.g., expected event burst size, trade-off between latency and memory,
or avoidance of blocking on short bursts) so future maintainers understand the
rationale and can adjust it confidently.
- Around line 230-235: The SSE payload writes event directly via fmt.Fprintf(w,
"event: kiosk-command\ndata: %s\n\n", event), which can break the SSE stream if
event contains newlines or special characters; update the send logic in the
goroutine that reads from ch to escape or sanitize event payloads before writing
— e.g., replace actual newline characters with literal "\n" or split the string
and write each line prefixed with "data: " so multi-line events remain valid SSE
(use strings.ReplaceAll or strings.Split and loop), then call flusher.Flush() as
before.
- Around line 217-218: The SSE write calls (e.g., fmt.Fprintf(w, ": connected
client=%s\n\n", clientName)) currently ignore returned errors; update these to
capture and handle the error (check the error from fmt.Fprintf and from
flusher.Flush), log the error and break/return to terminate the SSE handler
promptly so the connection is cleaned up, and apply the same pattern to the
heartbeat and event write sites in the SSE loop (check write errors, log context
including clientName, and exit the handler on error).

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: a1b2cbb9-6552-4878-a877-5e97b2ea3d41

📥 Commits

Reviewing files that changed from the base of the PR and between 79951a7 and c41d671.

📒 Files selected for processing (1)
  • internal/routes/routes_sse.go

@oleost
Copy link
Copy Markdown
Author

oleost commented Mar 30, 2026

Been testing today and have no issues.

Nice to be able to control from home assistant and even setup an automation when having family and we are going though the pictures to make them change more often.

If you want to have a look and maybe merge?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants