Detect concurrent tmux servers via sockets, not ps argv #160

Open
CosminRadu wants to merge 1 commit into tmux-plugins:master from CosminRadu:fix/server-detection-via-sockets
@CosminRadu

The "another tmux server running" gate in helpers.sh decides whether continuum injects its save interpolation into status-right. The current check is a ps/grep heuristic that fails in two ways:

  • False negative. ps -u $UID -o "command pid" | grep "^tmux" matches only rows whose argv[0] starts with the literal four characters tmux. tmux invoked via absolute path (e.g. /opt/homebrew/bin/tmux ... on macOS Homebrew) never matches.
  • False positive. Every client process whose argv[0] starts with tmux is counted as a server. Any single tmux server with two or more clients attached trips the gate, and continuum silently skips injecting the save interpolation. Auto-save then stops without any indication that anything is wrong.
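
Both failure modes can be reproduced without a running tmux by feeding the old filter some canned ps rows. This is a minimal sketch: the rows and PIDs below are made up, and sample_ps_rows and count_rows_matching_old_filter are hypothetical names, not functions from helpers.sh. The scenario is one server started via an absolute path with two clients attached.

```shell
# Hypothetical rows in "command pid" order, shaped like the output of
# ps -u $UID -o "command pid". The PIDs are invented for illustration.
sample_ps_rows() {
	cat <<'EOF'
/opt/homebrew/bin/tmux new-session -d 4101
tmux attach-session 4102
tmux 4103
EOF
}

# The old filter: count rows whose command starts with the literal "tmux".
count_rows_matching_old_filter() {
	grep -c "^tmux"
}

# One server, two clients: the server row is skipped (false negative) and
# both client rows are counted (false positive).
sample_ps_rows | count_rows_matching_old_filter   # prints 2
```

The filter reports two tmux processes, neither of which is the server.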

The post-startup branch in continuum.tmux tries to compensate by subtracting tmux list-clients, but that only lists clients of the current server and inherits the same ps-matching weaknesses.

This was reproducible on a Homebrew tmux install: a normal tmux attach-session plus a bare tmux invocation (PATH-resolved) on a single server caused continuum to disable auto-save for over a week without notice.

The fix

Detect tmux servers by looking at Unix-domain sockets, not process argv. For a tmux server, lsof shows the socket's filesystem path in the NAME column of the descriptor that holds the socket; a client holds only a connection to its peer, which lsof shows as a peer pointer with no path. The unique path-shaped NAME values across the current user's tmux processes therefore identify the set of running servers.

Each candidate path is then probed with tmux -S <path> list-sessions to confirm a live server is listening (this excludes stale socket files left over from crashed servers).
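
The two steps above (collect path-shaped NAMEs, then probe each one) could look roughly like the sketch below. It is not the parser from this PR: socket_paths_from_lsof, socket_has_live_server and count_other_live_servers are hypothetical names, and the parser takes the first whitespace-separated field that starts with "/" rather than strictly the last column, so socket paths containing spaces are not handled.

```shell
# Read lsof output on stdin and print each unique absolute path once.
# Assumption: client rows carry a peer pointer (for example "->0x...")
# instead of a path, so they produce no output.
socket_paths_from_lsof() {
	awk 'NR > 1 {
		for (i = 1; i <= NF; i++) {
			if ($i ~ /^\//) { print $i; break }
		}
	}' | sort -u
}

# Succeed only if a live server answers on the given socket path.
# A stale socket file makes tmux exit non-zero.
socket_has_live_server() {
	tmux -S "$1" list-sessions >/dev/null 2>&1
}

# Read candidate paths on stdin and count the live ones, skipping the
# current server's own socket (passed as the first argument).
count_other_live_servers() {
	local current="$1" path count=0
	while IFS= read -r path; do
		[ "$path" = "$current" ] && continue
		if socket_has_live_server "$path"; then
			count=$((count + 1))
		fi
	done
	echo "$count"
}
```

One way to wire these together, assuming lsof's -U (Unix sockets), -a (AND the filters), -u (user) and -c (command prefix) options and the socket path stored before the first comma in $TMUX: `lsof -U -a -u "$(id -u)" -c tmux 2>/dev/null | socket_paths_from_lsof | count_other_live_servers "${TMUX%%,*}"`.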

If lsof is unavailable, the check falls back to scanning the current socket's directory. The fallback covers the default-directory cases (tmux, tmux -L name) but misses servers started with -S /elsewhere/path in a different directory. Either way, the result is strictly better than the original heuristic.
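
A minimal sketch of such a directory scan, assuming the current socket path is passed in as the first argument. count_other_servers_by_dir_scan is a hypothetical name, not the function added by this PR.

```shell
# Count live tmux servers whose sockets sit in the same directory as the
# current server's socket. Servers started with -S in another directory
# are invisible to this scan.
count_other_servers_by_dir_scan() {
	local current="$1" entry count=0
	for entry in "$(dirname "$current")"/*; do
		[ -S "$entry" ] || continue             # skip anything that is not a socket
		[ "$entry" = "$current" ] && continue   # skip our own server
		# Probe: only a live server answers list-sessions.
		if tmux -S "$entry" list-sessions >/dev/null 2>&1; then
			count=$((count + 1))
		fi
	done
	echo "$count"
}
```

The `-S` test keeps regular files out of the probe loop, and the probe itself excludes stale socket files.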

Effects

  • The dual-mode wrapper in continuum.tmux collapses: the same socket-based check is correct on startup and afterward.
  • Public function names are preserved (another_tmux_server_running, another_tmux_server_running_on_startup, current_tmux_server_pid) so continuum_restore.sh and any external callers keep working.
  • Removed: all_tmux_processes, number_tmux_processes_except_current_server, number_current_server_client_processes.
  • Added: current_tmux_socket_path, number_other_live_tmux_servers.
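
To illustrate how the preserved names can share one check, here is a hypothetical sketch. The function bodies are assumptions, not the code in this PR, and number_other_live_tmux_servers is replaced by a stand-in that reads a made-up FAKE_OTHER_SERVERS variable so the snippet runs on its own.

```shell
# Stand-in for the PR's number_other_live_tmux_servers; the real function
# counts live sockets. FAKE_OTHER_SERVERS is a made-up knob for this demo.
number_other_live_tmux_servers() {
	echo "${FAKE_OTHER_SERVERS:-0}"
}

# The gate: true when at least one other live server exists.
another_tmux_server_running() {
	[ "$(number_other_live_tmux_servers)" -gt 0 ]
}

# The startup variant keeps its name for existing callers but needs no
# separate logic once the check is socket-based.
another_tmux_server_running_on_startup() {
	another_tmux_server_running
}
```

With this shape, callers such as continuum_restore.sh can keep calling either name without caring which phase they run in.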

Verified

On macOS Homebrew tmux 3.6a:

  • Single server, multiple clients (the bug case): old heuristic returned > 1 (gate trips), new code returns 0 (gate passes). ✓
  • Synthetic second server started with -S /tmp/continuum-test-XXX/decoy in a directory unrelated to the default socket dir: new code returns 1 (gate trips). ✓
  • After killing the synthetic server: returns 0. ✓
  • Stale socket file with no listening process: not counted (probe fails). ✓

Notes

  • New runtime dependency on lsof for the optimal path. macOS and most Linux distributions ship it; the directory-scan fallback handles the rest.
  • The lsof output parser relies on NAME being the last column and being an absolute path for server sockets, which has been stable in lsof for decades and matches both the macOS and Linux man pages.

The previous heuristic in `all_tmux_processes` grepped `ps` output for
lines starting with the literal string "tmux". This had two failure
modes that both broke auto-save in practice:

  - False negative: tmux invoked via absolute path
    (e.g. `/opt/homebrew/bin/tmux ...` on macOS Homebrew installs)
    never matched `^tmux` and was invisible to the count.

  - False positive: every client process whose argv[0] started with
    "tmux" was counted as a server. Any single tmux server with two
    or more clients attached would trip the "another server running"
    gate, causing continuum to silently skip injecting its save
    interpolation into status-right and disable auto-save until the
    extra clients went away.

The post-startup branch of `another_tmux_server_running` tried to
compensate by subtracting `tmux list-clients` from the ps count, but
that only listed clients of the current server and inherited the same
ps-matching weaknesses.

Replace with socket-based counting. Each tmux server owns exactly one
Unix domain socket in its socket directory; a live server responds to
`tmux -S <socket> list-sessions`. Stale socket files left over from
crashed servers do not respond and are correctly excluded.

This collapses the dual-mode wrapper in continuum.tmux: the same check
is correct on startup and afterward, so the post-startup branch goes
away.

Removes: all_tmux_processes, number_tmux_processes_except_current_server,
number_current_server_client_processes.
Adds: current_tmux_socket_path, number_other_live_tmux_servers.
Public function names (another_tmux_server_running,
another_tmux_server_running_on_startup, current_tmux_server_pid) are
preserved so external callers like continuum_restore.sh keep working.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>