Skip to content

Refactor slow activities from nm-dispatcher script into separate listener service#680

Open
lmagyar wants to merge 4 commits into
hassio-addons:mainfrom
lmagyar:pr-nm-dispatcher
Open

Refactor slow activities from nm-dispatcher script into separate listener service#680
lmagyar wants to merge 4 commits into
hassio-addons:mainfrom
lmagyar:pr-nm-dispatcher

Conversation

@lmagyar

@lmagyar lmagyar commented May 3, 2026

Copy link
Copy Markdown
Collaborator

Proposed Changes

There is a plain linux pipe between the 2 services, each nm-dispatcher action is 1 line in it, then all the slow thing happen in the listener service sequentially - as currently, but returning to nm-dispatcher quickly.

Related Issues

fixes #678
closes #679

Summary by CodeRabbit

  • Refactor
    • Reorganized subnet protection service architecture to use a listener queue for improved reliability and separation of concerns.
    • Decoupled NetworkManager dispatcher from direct subnet protection handling.
    • Updated service lifecycle management for better initialization and cleanup.

@coderabbitai

coderabbitai Bot commented May 3, 2026

Copy link
Copy Markdown

Warning

Review limit reached

@lmagyar, we couldn't start this review because you've reached your PR review rate limit.

More reviews will be available in 24 minutes and 40 seconds. Learn how PR review limits work.

Your organization has used up its prepaid credits, and credit purchases are no longer available. Enable the review add-on in the billing tab to keep reviews running — you're only billed for reviews past your plan's rate limits ($0.25/file).

⌛ How to resolve this issue?

After more reviews become available, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans include higher PR review limits than trial, open-source, and free plans. In all cases, reviews become available again over time. During sustained high-volume PR review activity, CodeRabbit may temporarily slow when the next review becomes available.

Please see our Fair Usage Limits Policy for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 354339f8-966b-40bf-99b2-62b88c4be89a

📥 Commits

Reviewing files that changed from the base of the PR and between 666141d and 76935e7.

📒 Files selected for processing (5)
  • tailscale/rootfs/etc/NetworkManager/dispatcher.d/protect-subnets
  • tailscale/rootfs/etc/s6-overlay/s6-rc.d/nm-dispatcher-listener/finish
  • tailscale/rootfs/etc/s6-overlay/s6-rc.d/nm-dispatcher-listener/notification-fd
  • tailscale/rootfs/etc/s6-overlay/s6-rc.d/nm-dispatcher-listener/run
  • tailscale/rootfs/etc/s6-overlay/s6-rc.d/nm-dispatcher/finish

Walkthrough

The dispatcher script refactors from direct synchronous subnet protection to queue-based enqueuing. New s6-overlay services (nm-dispatcher-listener, init-protect-subnets) and a FIFO queue decouple NetworkManager dispatcher events from synchronous Supervisor API calls, enabling non-blocking dispatcher execution while a dedicated listener processes subnet route changes asynchronously.

Changes

Subnet Protection Async Queue Architecture

Layer / File(s) Summary
Service Type Definitions
tailscale/rootfs/etc/s6-overlay/s6-rc.d/protect-subnets/type, tailscale/rootfs/etc/s6-overlay/s6-rc.d/nm-dispatcher/type, tailscale/rootfs/etc/s6-overlay/s6-rc.d/nm-dispatcher-listener/type, tailscale/rootfs/etc/s6-overlay/s6-rc.d/init-protect-subnets/type
protect-subnets transitions from longrun to bundle (service container); nm-dispatcher and nm-dispatcher-listener are declared as longrun; init-protect-subnets is declared as oneshot.
Queue-Based Listener Implementation
tailscale/rootfs/etc/s6-overlay/s6-rc.d/nm-dispatcher-listener/run
FIFO queue at /run/nm-dispatcher-listener-queue is created and kept open on file descriptor 4. Main loop reads NetworkManager actions; for `*up
Dispatcher Enqueue Handler
tailscale/rootfs/etc/NetworkManager/dispatcher.d/protect-subnets
Dispatcher defines NM_DISPATCHER_LISTENER_QUEUE queue path. For `up
Init Service Startup/Shutdown
tailscale/rootfs/etc/s6-overlay/s6-rc.d/init-protect-subnets/run, tailscale/rootfs/etc/s6-overlay/s6-rc.d/init-protect-subnets/down, tailscale/rootfs/etc/s6-overlay/s6-rc.d/init-protect-subnets/finish, tailscale/rootfs/etc/s6-overlay/s6-rc.d/init-protect-subnets/up
run script calls protect-subnet-routes at startup; finish script calls unprotect-subnet-routes at shutdown; up points to run; down triggers finish.
Service Teardown & Exit Propagation
tailscale/rootfs/etc/s6-overlay/s6-rc.d/nm-dispatcher/finish, tailscale/rootfs/etc/s6-overlay/s6-rc.d/nm-dispatcher-listener/finish
nm-dispatcher/finish updated to manage nm-dispatcher service exit state. nm-dispatcher-listener/finish reads shared container exit code, cleans up queue directory, logs termination, and conditionally halts s6 based on listener service exit code and signal.
Dispatcher Service Refactoring
tailscale/rootfs/etc/s6-overlay/s6-rc.d/nm-dispatcher/run
Removes inline protect-subnet-routes call; focuses solely on starting nm-dispatcher daemon via /usr/libexec/nm-dispatcher --persist --debug.

Sequence Diagram

sequenceDiagram
    participant NM as NetworkManager
    participant Disp as Dispatcher Script
    participant Queue as FIFO Queue
    participant Listener as nm-dispatcher-listener
    participant Routes as Route Protection

    rect rgb(0, 100, 200, 0.5)
    Note over NM,Routes: Old Flow (Synchronous, Blocking)
    NM->>Disp: Interface up/down event
    Disp->>Routes: protect-subnet-routes (blocks)
    Routes->>Listener: Wait for Supervisor API
    Listener-->>Routes: Timeout/Delay
    Routes-->>Disp: Complete (slow)
    Disp-->>NM: Dispatcher done (10+ min delay)
    end

    rect rgb(0, 150, 100, 0.5)
    Note over NM,Routes: New Flow (Asynchronous, Non-blocking)
    NM->>Disp: Interface up/down event
    Disp->>Queue: Enqueue action (fast return)
    Disp-->>NM: Dispatcher done (immediate)
    Queue->>Listener: Read action from queue
    Listener->>Routes: protect-subnet-routes (async)
    Routes-->>Listener: Complete
    Listener->>Listener: Update exit status
    end
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

Suggested labels

refactor, no-stale

Suggested reviewers

  • frenck

Poem

🐰 A queue takes the stage, no blocking delays,
Events enqueue fast, the dispatcher now plays,
While listeners heed each action's call,
Subnets stay guarded through it all! 🛡️

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately summarizes the main change: refactoring slow activities from the nm-dispatcher script into a separate listener service via a queue-based communication mechanism.
Linked Issues check ✅ Passed The PR successfully addresses both linked issues by decoupling subnet protection work from synchronous nm-dispatcher execution through a separate listener service and queue mechanism.
Out of Scope Changes check ✅ Passed All changes are directly related to the refactoring objective of moving subnet protection work out of the synchronous dispatcher into a separate listener service.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@lmagyar lmagyar added the refactor Improvement of existing code, not introducing new features. label May 3, 2026

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@tailscale/rootfs/etc/NetworkManager/dispatcher.d/protect-subnets`:
- Around line 17-21: Before writing to the listener queue, verify
NM_DISPATCHER_LISTENER_QUEUE exists and is a FIFO (use [ -p
"$NM_DISPATCHER_LISTENER_QUEUE" ]) and wait with a short timeout (e.g., loop up
to a few seconds) for it to appear as a FIFO; if it never becomes a FIFO, log
fatal and exit (same behavior as current error path). Do not create a regular
file by redirecting into a non-existent path; instead fail fast when the FIFO is
absent or wrong type. Use the existing variables DEVICE_IP_IFACE and
NM_DISPATCHER_ACTION for the payload and preserve the current exit behavior that
writes to /run/s6-linux-init-container-results/exitcode and calls halt if the
check/write fails.

In `@tailscale/rootfs/etc/s6-overlay/s6-rc.d/nm-dispatcher-listener/run`:
- Around line 22-25: The script must fail fast if mkfifo fails or the path is
not a FIFO: before executing "exec 4<>\"${NM_DISPATCHER_LISTENER_QUEUE}\"",
attempt to create the FIFO with mkfifo and then validate the path is a FIFO
using a POSIX test (e.g., [ -p "$NM_DISPATCHER_LISTENER_QUEUE" ]); if mkfifo
fails and the path is absent or exists but is not a FIFO, write an error and
exit non‑zero so the listener doesn't proceed to "exec 4<>" on a regular file.
Update the run script around the mkfifo/exec 4<> sequence to perform this
validation and early exit when the queue is not a proper FIFO.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 44d4ff22-5e51-49a1-9aa9-a5143aac6027

📥 Commits

Reviewing files that changed from the base of the PR and between 7a10013 and 666141d.

📒 Files selected for processing (19)
  • tailscale/rootfs/etc/NetworkManager/dispatcher.d/protect-subnets
  • tailscale/rootfs/etc/s6-overlay/s6-rc.d/init-protect-subnets/dependencies.d/local-network
  • tailscale/rootfs/etc/s6-overlay/s6-rc.d/init-protect-subnets/down
  • tailscale/rootfs/etc/s6-overlay/s6-rc.d/init-protect-subnets/finish
  • tailscale/rootfs/etc/s6-overlay/s6-rc.d/init-protect-subnets/run
  • tailscale/rootfs/etc/s6-overlay/s6-rc.d/init-protect-subnets/type
  • tailscale/rootfs/etc/s6-overlay/s6-rc.d/init-protect-subnets/up
  • tailscale/rootfs/etc/s6-overlay/s6-rc.d/nm-dispatcher-listener/dependencies.d/init-protect-subnets
  • tailscale/rootfs/etc/s6-overlay/s6-rc.d/nm-dispatcher-listener/finish
  • tailscale/rootfs/etc/s6-overlay/s6-rc.d/nm-dispatcher-listener/run
  • tailscale/rootfs/etc/s6-overlay/s6-rc.d/nm-dispatcher-listener/type
  • tailscale/rootfs/etc/s6-overlay/s6-rc.d/nm-dispatcher/dependencies.d/nm-dispatcher-listener
  • tailscale/rootfs/etc/s6-overlay/s6-rc.d/nm-dispatcher/finish
  • tailscale/rootfs/etc/s6-overlay/s6-rc.d/nm-dispatcher/run
  • tailscale/rootfs/etc/s6-overlay/s6-rc.d/nm-dispatcher/type
  • tailscale/rootfs/etc/s6-overlay/s6-rc.d/protect-subnets/contents.d/init-protect-subnets
  • tailscale/rootfs/etc/s6-overlay/s6-rc.d/protect-subnets/contents.d/nm-dispatcher
  • tailscale/rootfs/etc/s6-overlay/s6-rc.d/protect-subnets/contents.d/nm-dispatcher-listener
  • tailscale/rootfs/etc/s6-overlay/s6-rc.d/protect-subnets/type

Comment thread tailscale/rootfs/etc/NetworkManager/dispatcher.d/protect-subnets
Comment thread tailscale/rootfs/etc/s6-overlay/s6-rc.d/nm-dispatcher-listener/run
@agners

agners commented May 6, 2026

Copy link
Copy Markdown

Looks quite neat, and I've tested it on my end, Tailscale no longer blocks Supervisor restart with this change (even with 90-nm-cloud-setup.sh present).

✔️ LGTM!

@lmagyar

lmagyar commented May 6, 2026

Copy link
Copy Markdown
Collaborator Author

Thank you for the feedback!

@lmagyar lmagyar added the no-stale This issue or PR is exempted from the stable bot. label May 8, 2026
@lmagyar lmagyar added bugfix Inconsistencies or issues which will cause a problem for users or implementors. priority-high After critical issues are fixed, these should be dealt with before any further issues. and removed refactor Improvement of existing code, not introducing new features. labels May 17, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bugfix Inconsistencies or issues which will cause a problem for users or implementors. no-stale This issue or PR is exempted from the stable bot. priority-high After critical issues are fixed, these should be dealt with before any further issues.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Tailscale add-on causes Supervisor restart to block 10 minutes

2 participants