Switch mDNS publisher to brutella/dnssd, pre-probe, and use a distinct service name #198

Open
agners wants to merge 5 commits into main from use-brutella-dnssd
Conversation

agners commented Jun 9, 2026

Member

Follow-up to #195. Four small commits, each independently reviewable.

Commits

da5a3de — Switch from libp2p/zeroconf/v2 to brutella/dnssd

brutella/dnssd is a more active and DNS-SD-focused library:

  • Last tagged release Oct 2024 vs libp2p/zeroconf/v2's Aug 2022. Last commit Feb 2026 vs Aug 2025.
  • Purpose-built for RFC 6762/6763 — full §8 probing, §9 conflict detection with rename, two-stage announce per §8.3, §10.1 goodbye on context cancel. libp2p/zeroconf inherited a partial implementation from grandcat and still has the // TODO: implement a proper probing & conflict resolution comment (server.go:552).
  • Subscribes to Linux netlink LinkUpdate and re-announces on interface changes, which covers the late-cable-plug-in / network-flip case TimoPtr flagged in the review of #195 (Reintroduce mDNS using libp2p/zeroconf/v2) at no extra cost.
  • Takes interface names rather than *net.Interface pointers, which is exactly what Supervisor reports — no net.InterfaceByName step needed.
  • TXT records as map[string]string instead of []string{"k=v"}.
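To make the TXT-record difference concrete, here is a minimal stdlib-only sketch that flattens a map into the "k=v" slice form. The helper name txtPairs and the "version" key are hypothetical, chosen for illustration; neither library is involved.

```go
package main

import (
	"fmt"
	"sort"
)

// txtPairs flattens a TXT map into "key=value" strings, the slice form
// that a []string{"k=v"} API expects. Hypothetical helper, not library code.
func txtPairs(txt map[string]string) []string {
	pairs := make([]string, 0, len(txt))
	for k, v := range txt {
		pairs = append(pairs, k+"="+v)
	}
	// Map iteration order is random in Go, so sort for stable output.
	sort.Strings(pairs)
	return pairs
}

func main() {
	// location_name comes from the PR text; "version" is a made-up key.
	txt := map[string]string{
		"location_name": "Home (preparing setup)",
		"version":       "0.0.0",
	}
	fmt.Println(txtPairs(txt))
}
```

With a map-based API the keys stay addressable (txt["location_name"]) and no string splitting is needed to read a value back.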

The lifecycle now hangs off a context.Context rather than a separate Shutdown() call: cancelling the context makes Respond() send the goodbye and return. That removes the atomic.Pointer dance from main.go that the previous library required.

publishHomeAssistant(ctx) is still launched in a goroutine because the Supervisor-info call can block for several seconds while Supervisor comes up; that behavior is preserved.

d28f4aa — Pre-probe the service instance name to surface conflict-renames

brutella's Respond() already does RFC 6762 §8 probing and renames on conflict via incrementServiceName ("Home" → "Home (2)"). But the rename happens inside Respond(), so the log line we emit just before shows the intended name, not the post-probing one.

Probe explicitly with dnssd.ProbeService() up-front, then add the probed service to the responder and log the final ServiceInstanceName (which includes any rename). brutella re-probes inside Respond(), but the second probe sees no conflict (the name we just picked is free) and completes immediately.

Note: brutella's rename style is "Name (N)", which differs from Core's "Name-N". An earlier draft drove its own loop to match Core's style exactly; reverted because the additional probe rounds it required were not worth the cosmetic match — see the next commit.
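The two rename styles can be compared with a small simulation. This is a sketch only: the set of taken names stands in for real probing, nextName and pickFree are hypothetical helpers rather than brutella's incrementServiceName or Core's code, and starting the counter at 2 for the "Name-N" style is an assumption.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
)

var (
	parenSuffix = regexp.MustCompile(`^(.*) \((\d+)\)$`) // "Name (N)"
	dashSuffix  = regexp.MustCompile(`^(.*)-(\d+)$`)     // "Name-N"
)

// nextName returns the next candidate after a conflict. re must capture the
// base name and the counter; format rebuilds the name from both.
func nextName(name string, re *regexp.Regexp, format string) string {
	if m := re.FindStringSubmatch(name); m != nil {
		if n, err := strconv.Atoi(m[2]); err == nil {
			return fmt.Sprintf(format, m[1], n+1)
		}
	}
	return fmt.Sprintf(format, name, 2)
}

// pickFree simulates probing: it bumps the name until it is not in taken.
func pickFree(name string, taken map[string]bool, re *regexp.Regexp, format string) string {
	for taken[name] {
		name = nextName(name, re, format)
	}
	return name
}

func main() {
	taken := map[string]bool{"Home": true, "Home (2)": true, "Home-2": true}
	fmt.Println(pickFree("Home", taken, parenSuffix, "%s (%d)"))
	fmt.Println(pickFree("Home", taken, dashSuffix, "%s-%d"))
	// A name nobody else uses passes through unchanged.
	fmt.Println(pickFree("Home (preparing setup)", taken, parenSuffix, "%s (%d)"))
}
```

Logging the value returned by the probe step, rather than the name that was requested, is what makes a rename visible in the journal.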

806b4b4 — Change service instance name from "Home" to "Home (preparing setup)"

"Home" is python-zeroconf's default when no location_name is set, so on a LAN with one or more configured Core instances the landing page collides on this name and brutella renames it to "Home (2)"-style.

The rename works, but the user sees an entry that's hard to tell apart from a real Core install. Renaming our default to "Home (preparing setup)" achieves two things:

  • The instance is immediately distinguishable in onboarding lists as "the install you're configuring right now, not one of the already-configured ones."
  • The name is unlikely to collide with anything else on the LAN, so the rename path is rarely taken at all.

location_name in TXT is derived from the same constant, so once Core takes over it replaces the friendly name with whatever the user configures (defaulting back to "Home").

91c4ae5 — Silence brutella/dnssd's chatty INFO logger

brutella ships its own logger in github.com/brutella/dnssd/log with Info enabled to stdout by default. Almost everything it logs at INFO level is either an observation about other hosts' RFC-6762 violations or a transient network event:

  • dnssd: invalid source address — packets with weird source IPs
  • dnssd: ... MUST be ... (RFC6762 18.x) — peer policy nags
  • route ip+net: no such network interface — netlink delivered a link-update for an iface that was already torn down (e.g. a docker veth coming and going, which happens constantly on HA OS as add-on containers start/stop)

None of those are actionable for the landing page; they just clutter the journal. Disable the Info logger so only our own log.Printf lines end up in container output. Debug stays off by default.

What we tested

  • go build ./... and go vet ./... pass.
  • On a LAN with 12+ existing Core installs publishing as Home*, the landing page now announces as Home (preparing setup)._home-assistant._tcp.local. (no conflict, no rename) and shows up in Android NSD's onboarding list. This fixes the symptom reported in the discussion following #195 (Reintroduce mDNS using libp2p/zeroconf/v2), where Core's "Home" was winning the cache-flush race against the landing page's "Home".
  • Container restarts produce a clean RFC 6762 §10.1 goodbye (TTL=0 PTR) via context cancellation in main().
  • The route ip+net: no such network interface INFO log line is gone.
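The announced name in the list above follows the DNS-SD form <Instance>.<Service>.<Domain>. A minimal sketch of how such a name is assembled is given below, assuming the RFC 6763 §4.3 rule that dots and backslashes inside the instance label are backslash-escaped. instanceFQDN is a hypothetical helper, not a function from brutella/dnssd.

```go
package main

import (
	"fmt"
	"strings"
)

// instanceFQDN joins instance, service type and domain into a DNS-SD
// service instance name, escaping "." and "\" in the instance label.
// Hypothetical helper for illustration only.
func instanceFQDN(instance, service, domain string) string {
	esc := strings.NewReplacer(`\`, `\\`, `.`, `\.`).Replace(instance)
	return esc + "." + strings.Trim(service, ".") + "." + strings.Trim(domain, ".") + "."
}

func main() {
	fmt.Println(instanceFQDN("Home (preparing setup)", "_home-assistant._tcp", "local"))
	// A dot inside the instance label must not be read as a label separator.
	fmt.Println(instanceFQDN("My.Home", "_home-assistant._tcp", "local"))
}
```

Spaces and parentheses need no escaping under this rule, which is why "Home (preparing setup)" appears verbatim in the announced name.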

agners added 4 commits June 9, 2026 15:12
brutella's Respond() already implements RFC 6762 §8 probing and renames
on conflict via incrementServiceName ("Home" → "Home (2)"). But the
rename happens inside Respond(), so the log line we emit just before
shows the *intended* name, not the post-probing one.

Probe explicitly with dnssd.ProbeService() up-front, then add the
probed service to the responder and log the final ServiceInstanceName
(which includes any rename). brutella re-probes inside Respond(), but
the second probe sees no conflict (the name we just picked is free)
and completes immediately.

Note: brutella's rename style is "Name (N)", which differs from
Core's "Name-N". An earlier draft of this commit drove its own loop
to match Core's style exactly; reverted because the additional probe
rounds it required were not worth the cosmetic match — the service
instance name we use by default makes real conflicts vanishingly rare
anyway (see the serviceInstance constant).
@agners agners requested review from Copilot and sairon and removed request for sairon June 9, 2026 13:52

Copilot AI left a comment

Pull request overview

This PR updates the landing page’s mDNS/DNS-SD publisher implementation to use brutella/dnssd, adds an explicit pre-probe step so the final (possibly conflict-renamed) instance name is known before announcing, and changes the default service instance name to be distinct from Home Assistant Core’s typical "Home" entry.

Changes:

  • Replace libp2p/zeroconf/v2 publishing with brutella/dnssd using a context.Context lifecycle.
  • Pre-probe the service instance name before responder start and log the final instance name.
  • Update the default service instance name and silence brutella/dnssd INFO-level logging.

Reviewed changes

Copilot reviewed 3 out of 4 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| mdns.go | Switches publisher to brutella/dnssd, adds pre-probing, changes default instance name, and disables dnssd INFO logging. |
| main.go | Reworks mDNS lifecycle to be driven by context cancellation rather than explicit server shutdown. |
| go.mod | Replaces libp2p/zeroconf/v2 with github.com/brutella/dnssd and its indirect dependencies. |
| go.sum | Updates checksums to reflect the new dependency graph. |

Comment thread mdns.go (Outdated)
Comment on lines 70 to 71

            time.Sleep(supervisorRetryInterval)
        }

Comment thread main.go (Outdated)
Comment on lines +66 to +69

      log.Print("Start mDNS broadcast")
    - go publishHomeAssistant()
    - defer func() {
    -     if s := mdns.Load(); s != nil {
    -         s.Shutdown()
    -     }
    - }()
    + mdnsCtx, cancelMDNS := context.WithCancel(context.Background())
    + defer cancelMDNS()
    + go publishHomeAssistant(mdnsCtx)

main() previously returned immediately after cancelling the mDNS
context on SIGTERM/SIGINT, racing the kernel: the goroutine running
rp.Respond(ctx) needs time to wake up from ctx.Done(), call
r.unannounce() which does a synchronous SendResponse() to write the
TTL=0 goodbye to the UDP socket, and return. If main() returns first
the process exits and the goroutine is killed mid-write, so the
goodbye never leaves the host. Clients (iOS, Android NSD, …) keep
showing the stale landing-page entry until their own cached TTL
expires — visible as the symptom on iPad after the landing page
hands off to Core.

Three changes to fix this and surface what's happening:

1. main() now starts the mDNS goroutine with an explicit done-channel
   and defers cancel-then-wait. The wait ensures Respond() has
   finished unannouncing before main returns.

2. The retry loop in publishHomeAssistant() uses a ctx-aware sleep
   instead of plain time.Sleep so a SIGTERM during the supervisor-
   wait window unblocks immediately, rather than holding the
   shutdown-wait until the next 5 s tick.

3. New log lines mark each transition:
     "Shutting down mDNS broadcast"           (main, on signal)
     "Sent mDNS goodbye for <name>"           (mdns, after Respond)
     "mDNS broadcast stopped"                 (main, after wait)
     "mDNS broadcast cancelled before Supervisor was ready"
                                              (mdns, if cancelled
                                               during retry loop)