Commit 35bd31a

fix: Verify all compose services survive deploy recreate cycle (#2135)
Docker compose can lose sidecar containers during image recreate. When the MCP server receives SIGTERM during compose up, it exits cleanly (code 0) but Docker marks it as "manually stopped" and the unless-stopped restart policy won't bring it back.

Add a post-deploy verification step that checks for exited services and retries them before proceeding.

Co-authored-by: Ben Coombs <bjcoombs@users.noreply.github.com>
1 parent 679a385 commit 35bd31a

2 files changed

Lines changed: 35 additions & 0 deletions

.github/workflows/deploy-demo.yml

Lines changed: 19 additions & 0 deletions
@@ -209,6 +209,25 @@ jobs:
 docker pull ghcr.io/meridianhub/meridian@${IMAGE_DIGEST}
 docker tag ghcr.io/meridianhub/meridian@${IMAGE_DIGEST} ghcr.io/meridianhub/meridian:demo
 docker compose up -d --remove-orphans
+
+# Verify all services came up. Docker compose can lose containers
+# during recreate due to depends_on race conditions (the MCP server
+# receives SIGTERM, exits cleanly, and isn't restarted because Docker
+# marks it as "manually stopped").
+sleep 5
+failed=$(docker compose ps --status exited --format '{{.Service}}')
+if [ -n "$failed" ]; then
+  echo "Services failed to start: $failed - retrying"
+  docker compose up -d $failed
+  sleep 3
+  still_failed=$(docker compose ps --status exited --format '{{.Service}}')
+  if [ -n "$still_failed" ]; then
+    echo "Services still down after retry: $still_failed"
+    exit 1
+  fi
+fi
+echo "All services running"
+
 docker compose exec -T caddy caddy reload --config /etc/caddy/Caddyfile
 docker image prune -f
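One caveat with the `--status exited` check: it cannot tell a sidecar that was lost during recreate from a one-shot service that is meant to exit. The MCP server described in this commit exits with code 0, so the exit code does not separate the two cases either. Below is a minimal sketch of filtering expected exits out by name before the retry. The function `drop_expected_exits` and the service name `migrate` are hypothetical, and nothing in this commit indicates that these stacks contain one-shot services.

```shell
# Hypothetical helper (not part of this commit).
# $1: newline-separated service names, as printed by
#     docker compose ps --status exited --format '{{.Service}}'
# $2: space-separated names of services that are expected to exit
drop_expected_exits() {
  local svc expected skip
  # Unquoted on purpose: split the lists into one word per service name.
  for svc in $1; do
    skip=0
    for expected in $2; do
      if [ "$svc" = "$expected" ]; then
        skip=1
      fi
    done
    if [ "$skip" -eq 0 ]; then
      echo "$svc"
    fi
  done
  return 0
}
```

Under those assumptions, the workflow step could narrow the list with `failed=$(drop_expected_exits "$failed" "migrate")` before testing whether it is empty.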

.github/workflows/deploy-develop.yml

Lines changed: 16 additions & 0 deletions
@@ -220,6 +220,22 @@ jobs:
 docker pull ghcr.io/meridianhub/meridian@${IMAGE_DIGEST}
 docker tag ghcr.io/meridianhub/meridian@${IMAGE_DIGEST} ghcr.io/meridianhub/meridian:develop
 docker compose -f docker-compose.develop.yml up -d --remove-orphans
+
+# Verify all services came up (see deploy-demo.yml for rationale)
+sleep 5
+failed=$(docker compose -f docker-compose.develop.yml ps --status exited --format '{{.Service}}')
+if [ -n "$failed" ]; then
+  echo "Services failed to start: $failed - retrying"
+  docker compose -f docker-compose.develop.yml up -d $failed
+  sleep 3
+  still_failed=$(docker compose -f docker-compose.develop.yml ps --status exited --format '{{.Service}}')
+  if [ -n "$still_failed" ]; then
+    echo "Services still down after retry: $still_failed"
+    exit 1
+  fi
+fi
+echo "All services running"
+
 # Reload Caddy from the demo stack (Caddy runs there, serves both environments)
 docker compose -f /opt/meridian/docker-compose.yml exec -T caddy caddy reload --config /etc/caddy/Caddyfile
 docker image prune -f
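Both workflows repeat the same verify-and-retry steps and differ only in the compose file they pass. Below is a minimal sketch of how that logic could be shared. The function name `verify_compose_services`, its argument convention, and the `SETTLE_SECONDS` and `RETRY_SECONDS` variables are hypothetical and not part of this commit. The `docker compose` invocations are the same ones the commit uses.

```shell
# Hypothetical shared helper (not part of this commit).
# Usage: verify_compose_services [docker compose global args...]
# e.g.   verify_compose_services -f docker-compose.develop.yml
verify_compose_services() {
  local failed still_failed

  # Give recreated containers a moment to settle before checking.
  sleep "${SETTLE_SECONDS:-5}"
  failed=$(docker compose "$@" ps --status exited --format '{{.Service}}')
  if [ -z "$failed" ]; then
    echo "All services running"
    return 0
  fi

  echo "Services failed to start: $failed - retrying"
  # Unquoted on purpose: pass one argument per service name.
  docker compose "$@" up -d $failed

  sleep "${RETRY_SECONDS:-3}"
  still_failed=$(docker compose "$@" ps --status exited --format '{{.Service}}')
  if [ -n "$still_failed" ]; then
    echo "Services still down after retry: $still_failed"
    return 1
  fi
  echo "All services running"
}
```

With such a helper sourced on the deploy host, the demo step would reduce to `verify_compose_services || exit 1` and the develop step to `verify_compose_services -f docker-compose.develop.yml || exit 1`. Returning a status instead of calling `exit` leaves the decision to abort with the caller.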
