
Commit 99fdae8

ops: add startup-cleanup.sh for stale socket cleanup + Slack bridge restart
- New script: startup-cleanup.sh removes stale .sock files by comparing against live session UUIDs, cleans dangling .alias symlinks, and restarts the slack-bridge tmux session with the correct control-agent UUID
- Updated SKILL.md: replaced vague pseudo-code in Step 0 with concrete instructions to run the script, simplified the startup checklist, and condensed the Slack bridge section since it's now automated
1 parent 74d7160 commit 99fdae8

2 files changed

Lines changed: 117 additions & 33 deletions


pi/skills/control-agent/SKILL.md

Lines changed: 26 additions & 33 deletions
@@ -141,31 +141,33 @@ Extract the **Channel** and **Thread** values from the metadata. Use the Thread
 
 ## Startup
 
-### Step 0: Clean stale sockets
+### Step 0: Clean stale sockets + restart Slack bridge
 
-Dead pi sessions leave behind `.sock` files in `~/.pi/session-control/`. These cause problems:
-- The Slack bridge may pick the wrong socket or fail with "multiple sessions found"
-- `list_sessions` may show ghost entries
+Dead pi sessions leave behind `.sock` files in `~/.pi/session-control/`. These cause:
+- The Slack bridge connecting to a dead socket → "Socket error: connect ENOENT"
+- `list_sessions` showing ghost entries
+- Bridge auto-detect failing with "multiple sessions found"
 
-On every startup, clean them by comparing against live sessions:
+**Run the startup-cleanup script** immediately after confirming your session is live:
+
+1. Call `list_sessions` to get live session UUIDs
+2. Run the cleanup script, passing all live UUIDs as arguments:
 ```bash
-# Get live session IDs from list_sessions
-LIVE_IDS=$(list_sessions output) # use the list_sessions tool, not bash
-
-# Then remove any .sock file whose UUID is NOT in the live set
-for sock in ~/.pi/session-control/*.sock; do
-  [ -e "$sock" ] || continue
-  uuid=$(basename "$sock" .sock)
-  # If this UUID is not a live session, remove it
-done
+bash ~/.pi/agent/skills/control-agent/startup-cleanup.sh UUID1 UUID2 UUID3
 ```
 
+The script:
+- Removes any `.sock` file whose UUID is NOT in the live set
+- Cleans stale `.alias` symlinks pointing to removed sockets
+- Kills and restarts the `slack-bridge` tmux session with the current `control-agent` UUID
+- Verifies the bridge is responsive (HTTP 400 from the API = healthy)
+
 **WARNING**: Do NOT use `socat` or any socket-connect test to check liveness. Pi sockets don't respond to raw connections, and deleting a live socket is **unrecoverable** (the socket is only created at session start). Only remove sockets for sessions that are confirmed dead via `list_sessions`.
 
 ### Checklist
 
-- [ ] Clean stale sockets (Step 0 above)
-- [ ] Verify session name shows as `control-agent` in `list_sessions`
+- [ ] Run `list_sessions`: note live UUIDs, confirm `control-agent` is listed
+- [ ] Run `startup-cleanup.sh` with live UUIDs (cleans sockets + restarts Slack bridge)
 - [ ] Verify `HORNET_SECRET` env var is set
 - [ ] Create/verify `hornet@agentmail.to` inbox exists
 - [ ] Start email monitor (inline mode, **300s / 5 min**)
@@ -181,7 +183,6 @@ done
 3. If not found, launch with tmux (see below)
 4. Wait ~8 seconds, then send role assignment
 - [ ] Send role assignment to the `sentry-agent` session
-- [ ] Start Slack bridge (see below)
 
 ### Spawning sentry-agent
 

@@ -197,27 +198,19 @@ The sentry-agent operates in **on-demand mode** — it does NOT poll. Sentry ale
 
 ### Starting the Slack Bridge
 
-The Slack bridge (Socket Mode) receives real-time Slack events and forwards them to this session. It also provides an outbound HTTP API on port 7890.
+The Slack bridge (Socket Mode) receives real-time Slack events and forwards them to this session. Its outbound HTTP API listens on port 7890.
 
-**Always run the bridge in its own tmux session**, never inline. Running inline blocks your session while waiting for agent responses.
+**The `startup-cleanup.sh` script handles bridge (re)start automatically**: it reads the control-agent UUID from the `.alias` symlink and launches the bridge in a `slack-bridge` tmux session.
 
+If you need to restart the bridge manually:
 ```bash
-# Get your own session UUID for PI_SESSION_ID
-MY_UUID=$(ls ~/.pi/session-control/by-name/control-agent.sock 2>/dev/null | xargs readlink -f | xargs basename | sed 's/.sock//')
-
-# If by-name symlink doesn't exist, find it from list_sessions output
-# and set MY_UUID manually
-
-tmux new-session -d -s slack-bridge "set -a && source ~/.config/.env && set +a && export PATH=\$HOME/opt/node-v22.14.0-linux-x64/bin:\$PATH && export PI_SESSION_ID=$MY_UUID && cd ~/hornet/slack-bridge && exec node bridge.mjs"
+MY_UUID=$(readlink ~/.pi/session-control/control-agent.alias | sed 's/\.sock$//')
+tmux kill-session -t slack-bridge 2>/dev/null || true
+tmux new-session -d -s slack-bridge \
+  "set -a && source ~/.config/.env && set +a && export PATH=\$HOME/opt/node-v22.14.0-linux-x64/bin:\$PATH && export PI_SESSION_ID=$MY_UUID && cd ~/hornet/slack-bridge && exec node bridge.mjs"
 ```
 
-**Important**: Always set `PI_SESSION_ID` explicitly to your control-agent UUID. Without it, the bridge tries to auto-detect from socket files and will fail if multiple sessions exist.
-
-Verify the bridge is up:
-```bash
-curl -s -o /dev/null -w '%{http_code}' -X POST http://127.0.0.1:7890/send -H 'Content-Type: application/json' -d '{}'
-```
-(Should return 400, meaning the API is responding.)
+Verify: `curl -s -o /dev/null -w '%{http_code}' -X POST http://127.0.0.1:7890/send -H 'Content-Type: application/json' -d '{}'` → should return `400`.
 
 The bridge forwards:
 - **Human @mentions and DMs** from allowed users → delivered to you with security boundaries for handling
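
Reviewer note on Step 0: because deleting a live socket is described as unrecoverable, it may help to preview what would be removed before running the real cleanup. The following is a minimal dry-run sketch, not part of this commit. The function name `list_stale_sockets` and its argument order are hypothetical; it only assumes the `<uuid>.sock` naming shown in the diff above and prints candidates without deleting anything.

```bash
# Hypothetical helper (not in the commit): print the UUID of every .sock file
# in a directory whose UUID is not among the live UUIDs passed as arguments.
# Nothing is deleted; this is a read-only preview.
list_stale_sockets() {
  local dir sock uuid live stale
  dir=$1
  shift
  for sock in "$dir"/*.sock; do
    [ -e "$sock" ] || continue          # glob matched nothing
    uuid=$(basename "$sock" .sock)
    stale=1
    for live in "$@"; do
      if [ "$uuid" = "$live" ]; then
        stale=0
        break
      fi
    done
    if [ "$stale" -eq 1 ]; then
      printf '%s\n' "$uuid"
    fi
  done
}
```

A possible invocation would be `list_stale_sockets ~/.pi/session-control UUID1 UUID2 UUID3`, using the live UUIDs reported by `list_sessions`; an empty result means the real cleanup would remove no sockets.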
Lines changed: 91 additions & 0 deletions
@@ -0,0 +1,91 @@
+#!/usr/bin/env bash
+# startup-cleanup.sh: Clean stale sockets and restart the Slack bridge.
+# Run this at the start of every control-agent session.
+#
+# Usage: bash ~/.pi/agent/skills/control-agent/startup-cleanup.sh <live-session-ids...>
+#
+# Pass the live session UUIDs (from list_sessions) as arguments.
+# Any .sock file whose UUID is NOT in the live set gets removed.
+# Stale .alias symlinks pointing to removed sockets also get cleaned.
+# Then restarts the slack-bridge tmux session with the current control-agent UUID.
+
+set -euo pipefail
+
+SOCKET_DIR="$HOME/.pi/session-control"
+
+if [ $# -eq 0 ]; then
+  echo "Usage: $0 <live-uuid-1> [live-uuid-2] ..."
+  echo "Pass the live session UUIDs from list_sessions as arguments."
+  exit 1
+fi
+
+# Build a set of live UUIDs
+declare -A LIVE
+for uuid in "$@"; do
+  LIVE["$uuid"]=1
+done
+
+echo "=== Stale Socket Cleanup ==="
+echo "Live sessions: ${!LIVE[*]}"
+
+# Remove stale .sock files
+cleaned=0
+for sock in "$SOCKET_DIR"/*.sock; do
+  [ -e "$sock" ] || continue
+  uuid=$(basename "$sock" .sock)
+  if [ -z "${LIVE[$uuid]:-}" ]; then
+    echo "Removing stale socket: $uuid"
+    rm -f "$sock"
+    cleaned=$((cleaned + 1))  # not ((cleaned++)): that returns status 1 when cleaned is 0 and aborts under set -e
+  fi
+done
+
+# Remove stale .alias symlinks (pointing to non-existent targets)
+for alias in "$SOCKET_DIR"/*.alias; do
+  [ -L "$alias" ] || continue
+  target=$(readlink "$alias")
+  if [ ! -e "$SOCKET_DIR/$target" ]; then
+    echo "Removing stale alias: $(basename "$alias") -> $target"
+    rm -f "$alias"
+  fi
+done
+
+echo "Cleaned $cleaned stale socket(s)."
+
+# Restart Slack bridge with current control-agent UUID
+echo ""
+echo "=== Slack Bridge Restart ==="
+
+# Find control-agent UUID from alias
+CONTROL_ALIAS="$SOCKET_DIR/control-agent.alias"
+if [ -L "$CONTROL_ALIAS" ]; then
+  MY_UUID=$(readlink "$CONTROL_ALIAS" | sed 's/\.sock$//')
+  echo "Control-agent UUID: $MY_UUID"
+else
+  echo "ERROR: control-agent.alias not found. Cannot start Slack bridge."
+  exit 1
+fi
+
+# Kill existing slack-bridge tmux session if running
+if tmux has-session -t slack-bridge 2>/dev/null; then
+  echo "Killing existing slack-bridge session..."
+  tmux kill-session -t slack-bridge
+  sleep 1
+fi
+
+# Start fresh slack-bridge
+echo "Starting slack-bridge with PI_SESSION_ID=$MY_UUID..."
+tmux new-session -d -s slack-bridge \
+  "set -a && source ~/.config/.env && set +a && export PATH=\$HOME/opt/node-v22.14.0-linux-x64/bin:\$PATH && export PI_SESSION_ID=$MY_UUID && cd ~/hornet/slack-bridge && exec node bridge.mjs"
+
+# Wait for bridge to come up (curl already prints 000 on connection failure, so only ignore its exit status)
+sleep 3
+HTTP_CODE=$(curl -s -o /dev/null -w '%{http_code}' -X POST http://127.0.0.1:7890/send -H 'Content-Type: application/json' -d '{}' 2>/dev/null || true)
+if [ "$HTTP_CODE" = "400" ]; then
+  echo "✅ Slack bridge is up (HTTP $HTTP_CODE)"
+else
+  echo "⚠️ Slack bridge may not be ready yet (HTTP $HTTP_CODE). Check manually."
+fi
+
+echo ""
+echo "=== Cleanup Complete ==="
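
Reviewer note on the health check: the script waits a fixed 3 seconds before probing the bridge once, so a slow start produces the "may not be ready yet" warning. One possible alternative, sketched here under assumptions and not part of this commit, is to poll a few times before giving up. The names `wait_for_status` and `probe_bridge` are hypothetical; the URL, port, and expected `400` response are taken from the script above.

```bash
# Hypothetical helper (not in the commit): run a probe command up to
# <attempts> times, one second apart, until it prints <expected>.
# Returns 0 on a match, 1 if every attempt failed.
wait_for_status() {
  local expected attempts i code
  expected=$1
  attempts=$2
  shift 2
  i=0
  while [ "$i" -lt "$attempts" ]; do
    # Ignore the probe's exit status; only its printed status code matters.
    code=$("$@" 2>/dev/null) || true
    if [ "$code" = "$expected" ]; then
      return 0
    fi
    i=$((i + 1))
    sleep 1
  done
  return 1
}

# Probe based on the curl call in startup-cleanup.sh: prints the HTTP status code.
probe_bridge() {
  curl -s -o /dev/null -w '%{http_code}' -X POST http://127.0.0.1:7890/send \
    -H 'Content-Type: application/json' -d '{}'
}
```

With these definitions, `wait_for_status 400 10 probe_bridge && echo "bridge is up"` would allow the bridge up to roughly ten seconds to start responding.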
