Commit b577875

baijum and claude committed
docs: add server migration runbook
13-step runbook covering full server migration: inventory, backup, bootstrap, restore, deploy, DNS switch, TLS verification, GitHub secrets update, and decommission with rollback instructions.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
1 parent c764c5d commit b577875

1 file changed

Lines changed: 297 additions & 0 deletions

docs/runbooks/migrate-server.md

@@ -0,0 +1,297 @@
# Runbook: Migrate to a New Server

## When to Use

- Server hardware failure or end-of-life
- Cloud provider migration
- Upgrading to a larger instance

## Prerequisites

- SSH access to both old and new servers as the `deploy` user
- DNS control for all app domains and the ops domain
- Backup encryption key (if encrypted backups are enabled)
- GitHub deploy key or SSH key for repo access on the new server

## Steps

### 1. Inventory the old server

SSH into the old server and record what is running:

```bash
ssh deploy@<old-server-ip>
```

List running apps and their deploy slots:

```bash
for dir in /opt/apps/*/; do
  app=$(basename "$dir")
  slot=$(cat "$dir/.deploy-slot" 2>/dev/null || echo "none")
  echo "$app slot=$slot"
done
```

List app domains from the Caddyfile:

```bash
cat /opt/platform/Caddyfile
```

Record platform environment variables:

```bash
cat /opt/platform/.env
```

List per-app credential files:

```bash
ls /opt/platform/credentials/
```

List cron jobs:

```bash
crontab -l
```

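To compare the old and new servers later, it can help to save the slot listing to a file. The following is a minimal sketch that wraps the loop above in a function; the name `list_app_slots` and the output file name in the usage note are illustrative, not part of the platform scripts.

```bash
#!/usr/bin/env bash
# Sketch: print "<app> slot=<slot>" for every app directory under a root.
# The root defaults to /opt/apps as in the runbook; the function name is hypothetical.
list_app_slots() {
  local root="${1:-/opt/apps}" dir app slot
  for dir in "$root"/*/; do
    [ -d "$dir" ] || continue   # the glob matched nothing
    app=$(basename "$dir")
    slot=$(cat "$dir/.deploy-slot" 2>/dev/null || echo "none")
    echo "$app slot=$slot"
  done
}
```

For example, `list_app_slots > inventory-slots.txt` on the old server, then the same command on the new server after step 8, gives two files that can be compared with `diff`.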
### 2. Create fresh backups

Run the backup script for every app database:

```bash
bash /opt/platform/infrastructure/backup-postgres.sh
```

Verify backups were created:

```bash
ls -lh /data/backups/postgres/
```

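Reading `ls` output by eye can miss a stale or empty file. As a rough sketch, the check below succeeds only if the directory contains at least one non-empty file modified within the last N minutes. The function name `recent_backups` and the 60-minute default are assumptions for illustration; the backup script's file naming is not assumed.

```bash
#!/usr/bin/env bash
# Sketch: list non-empty files under a directory that were modified within
# the last N minutes; return non-zero if there are none.
recent_backups() {
  local dir="$1" minutes="${2:-60}" found
  found=$(find "$dir" -type f -size +0 -mmin "-$minutes" 2>/dev/null)
  [ -n "$found" ] || return 1
  printf '%s\n' "$found"
}
```

For example, `recent_backups /data/backups/postgres 30 || echo "no fresh backups found"`.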
### 3. Transfer backups and credentials to your local machine

```bash
# From your local machine:
scp -r deploy@<old-server-ip>:/data/backups/postgres/ ./migration-backups/
# .env is a single file, so -r is not needed
scp deploy@<old-server-ip>:/opt/platform/.env ./migration-platform.env
scp -r deploy@<old-server-ip>:/opt/platform/credentials/ ./migration-credentials/
```

If encrypted backups are enabled, also copy the encryption key:

```bash
scp deploy@<old-server-ip>:<path-to-encryption-key> ./migration-backup-key
```

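Because the backups pass through two copies (old server to local machine, then local machine to new server), a checksum manifest is one way to confirm nothing was truncated along the way. This is a sketch only: the function names are hypothetical, and it assumes GNU `sha256sum`, `sort -z`, and `xargs -r` (on macOS, `shasum -a 256` would be the usual substitute for `sha256sum`).

```bash
#!/usr/bin/env bash
# Sketch: write SHA-256 checksums for every file under a directory,
# using paths relative to that directory so the manifest is portable.
write_manifest() {
  local dir="$1" manifest="$2"
  ( cd "$dir" && find . -type f -print0 | sort -z | xargs -0 -r sha256sum ) > "$manifest"
}

# Sketch: verify a directory against a manifest written by write_manifest.
# The manifest is fed on stdin, so a relative manifest path still works.
verify_manifest() {
  local dir="$1" manifest="$2"
  ( cd "$dir" && sha256sum --check --quiet ) < "$manifest"
}
```

For example, run `write_manifest ./migration-backups backups.sha256` after the download, copy `backups.sha256` along with the backups, and run `verify_manifest` against the restored directory on the new server.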
### 4. Bootstrap the new server

On the new server, run the bootstrap script with the same env vars used for the old server:

```bash
sudo ACME_EMAIL=<your-email> OPS_DOMAIN=<ops.example.com> ALERT_REPO=<org/repo> \
  bash infrastructure/bootstrap-server.sh
```

Wait for all platform containers to become healthy:

```bash
docker ps --format "table {{.Names}}\t{{.Status}}"
```

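Instead of rerunning `docker ps` by hand, the wait can be scripted. The sketch below polls until no container reports a status other than `(healthy)`. The function names, the 5-second interval, and the 300-second default timeout are assumptions. Containers that define no health check never show `(healthy)`, so they would be reported as pending and may need to be filtered out.

```bash
#!/usr/bin/env bash
# Sketch: read "<name><TAB><status>" lines on stdin and print the names of
# containers whose status does not contain "(healthy)".
not_healthy() {
  awk -F '\t' '$2 !~ /\(healthy\)/ { print $1 }'
}

# Sketch: poll docker ps until every container is healthy or the timeout
# (in seconds) expires.
wait_for_healthy() {
  local timeout="${1:-300}" waited=0 pending
  while :; do
    pending=$(docker ps --format '{{.Names}}\t{{.Status}}' | not_healthy)
    if [ -z "$pending" ]; then
      return 0
    fi
    if [ "$waited" -ge "$timeout" ]; then
      echo "still not healthy: $pending" >&2
      return 1
    fi
    sleep 5
    waited=$((waited + 5))
  done
}
```

For example, `wait_for_healthy 600 && echo "platform ready"`.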
### 5. Copy credentials to the new server

```bash
# From your local machine:
scp ./migration-platform.env deploy@<new-server-ip>:/opt/platform/.env
scp -r ./migration-credentials/ deploy@<new-server-ip>:/opt/platform/credentials/
```

If `/opt/platform/credentials/` already exists on the new server, `scp -r` copies `migration-credentials/` into it as a subdirectory rather than merging the files. Check the result with `ls /opt/platform/credentials/` and move the files up one level if needed.

If using backup encryption, copy the key:

```bash
scp ./migration-backup-key deploy@<new-server-ip>:<path-to-encryption-key>
```

Restart platform containers so they pick up the restored credentials:

```bash
ssh deploy@<new-server-ip>
cd /opt/platform
docker compose down && docker compose up -d
```

### 6. Restore databases

Copy backup files to the new server:

```bash
# From your local machine:
scp -r ./migration-backups/ deploy@<new-server-ip>:/data/backups/postgres/
```

If `/data/backups/postgres/` already exists on the new server, the files land in `/data/backups/postgres/migration-backups/`. Confirm the location with `ls` before restoring.

On the new server, restore each app database:

```bash
ssh deploy@<new-server-ip>
bash /opt/platform/infrastructure/restore-postgres.sh --yes <backup-file>
```

Verify each restored database:

```bash
bash /opt/platform/infrastructure/verify-backup.sh <database-name>
```

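With several databases, the restore command has to be repeated once per backup file. A minimal sketch of that loop follows. It assumes every regular file in the backup directory is a backup that `restore-postgres.sh` accepts as a path, and that the directory holds exactly one backup per database; neither is confirmed by this runbook. The function name `restore_all` is hypothetical.

```bash
#!/usr/bin/env bash
# Sketch: run a restore command once per regular file in a directory and
# stop at the first failure. The command is passed as arguments so the
# loop can be dry-run with `echo` before doing a real restore.
restore_all() {
  local dir="$1"; shift
  local file
  for file in "$dir"/*; do
    [ -f "$file" ] || continue
    "$@" "$file" || { echo "restore failed: $file" >&2; return 1; }
  done
}
```

A dry run such as `restore_all /data/backups/postgres echo` prints the files that would be restored. The real run would then be `restore_all /data/backups/postgres bash /opt/platform/infrastructure/restore-postgres.sh --yes`.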
### 7. Clone and configure apps

For each app, clone the repo and set up the deploy directory:

```bash
cd /opt/apps
git clone git@github.com:towlion/<app-name>.git <app-name>
cd <app-name>
```

Write the app's `deploy/.env` using credentials from `/opt/platform/credentials/<app-name>`:

```bash
cp deploy/env.template deploy/.env
# Edit deploy/.env with the correct DATABASE_URL, S3 credentials, JWT_SECRET, etc.
```

Set the initial deploy slot:

```bash
echo "blue" > .deploy-slot
```

### 8. Deploy apps

Run the blue-green deploy script for each app:

```bash
bash /opt/platform/infrastructure/deploy-blue-green.sh \
  <app-name> /opt/apps/<app-name> <app-domain> "<caddyfile-content>"
```

Alternatively, trigger deploys via GitHub Actions once GitHub secrets are updated (step 12).

### 9. Verify on the new server

Check that all platform containers are healthy:

```bash
docker ps --format "table {{.Names}}\t{{.Status}}"
```

Check health endpoints for each app. `--resolve` pins the domain to the new server's IP because DNS still points to the old server, and `-k` skips certificate verification because Caddy usually cannot obtain certificates until DNS points at the new server:

```bash
curl -sk --resolve <app-domain>:443:<new-server-ip> https://<app-domain>/health
```

Verify Grafana is accessible:

```bash
curl -sk --resolve <ops-domain>:443:<new-server-ip> https://<ops-domain>/
```

Verify cron jobs are in place:

```bash
crontab -l
```

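The `curl` commands above print the response body, which does not show whether the endpoint returned 200. The sketch below reports the HTTP status for several domains against one IP. The function names and the 10-second timeout are assumptions; `/health` is the path used in this runbook.

```bash
#!/usr/bin/env bash
# Sketch: print the HTTP status code of https://<domain><path> as served by
# a specific IP. curl prints 000 when the connection fails.
health_status() {
  local domain="$1" ip="$2" path="${3:-/health}"
  curl -sk -o /dev/null -w '%{http_code}' --max-time 10 \
    --resolve "$domain:443:$ip" "https://$domain$path"
}

# Sketch: check several domains against one IP; return non-zero if any
# of them answers with something other than 200.
check_all() {
  local ip="$1"; shift
  local domain code failed=0
  for domain in "$@"; do
    code=$(health_status "$domain" "$ip") || true
    echo "$domain $code"
    [ "$code" = "200" ] || failed=1
  done
  return "$failed"
}
```

For example, `check_all <new-server-ip> <app-domain> <app2-domain>` prints one line per domain and exits non-zero if any check fails.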
### 10. Switch DNS

Update A records for all domains to point to the new server IP:

- Each app domain (e.g., `app.example.com`, `app2.example.com`)
- The ops domain (e.g., `ops.example.com`)
- Preview wildcard record (e.g., `*.preview.example.com`)

DNS propagation typically takes minutes but can take up to 48 hours depending on TTL. Consider lowering TTL values a day before the migration.

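One way to tell when a record has switched is to query it and compare the answer with the new IP. This is a small sketch that assumes `dig` is installed; the function name `dns_points_to` is hypothetical. The result reflects whichever resolver the machine uses, so other clients may still see the old address for a while.

```bash
#!/usr/bin/env bash
# Sketch: succeed if one of the domain's A records equals the expected IP.
dns_points_to() {
  local domain="$1" expected="$2"
  dig +short A "$domain" | grep -qxF "$expected"
}
```

For example, `dns_points_to <app-domain> <new-server-ip> && echo switched`. For the wildcard record, query any name beneath it, such as `test.preview.example.com`.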
### 11. Verify TLS

After DNS propagates, Caddy will automatically provision TLS certificates. Monitor the Caddy logs:

```bash
docker logs -f platform-caddy-1
```

Test HTTPS on all domains:

```bash
curl -s https://<app-domain>/health
curl -s https://<ops-domain>/
```

Verify certificates are valid:

```bash
echo | openssl s_client -connect <app-domain>:443 -servername <app-domain> 2>/dev/null | openssl x509 -noout -dates
```

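The `-dates` output has to be read by eye. As a sketch, `openssl x509 -checkend` turns the expiry check into an exit code that a script can act on. The function names and the 7-day default are assumptions. This checks expiry only; the `curl` commands above, run without `-k`, are what confirm the chain is trusted.

```bash
#!/usr/bin/env bash
# Sketch: read a PEM certificate on stdin and succeed if it remains valid
# for at least the given number of days.
pem_valid_for() {
  local days="${1:-7}"
  openssl x509 -noout -checkend "$((days * 86400))" >/dev/null
}

# Sketch: fetch the certificate a domain serves on port 443 and apply the
# same check. Fails if the connection fails or no certificate is returned.
served_cert_valid_for() {
  local domain="$1" days="${2:-7}"
  echo | openssl s_client -connect "$domain:443" -servername "$domain" 2>/dev/null \
    | pem_valid_for "$days"
}
```

For example, `served_cert_valid_for <app-domain> 14 || echo "certificate missing or expiring soon"`.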
### 12. Update GitHub secrets

In each app repository, update the following secrets to point to the new server:

- `SERVER_HOST`: new server IP
- `SERVER_SSH_KEY`: SSH private key for the new server's `deploy` user

```bash
# Using the GitHub CLI:
gh secret set SERVER_HOST --repo towlion/<app-name> --body "<new-server-ip>"
gh secret set SERVER_SSH_KEY --repo towlion/<app-name> < ~/.ssh/<new-server-key>
```

Trigger a test deploy on one app to confirm the pipeline works end-to-end.

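With more than a couple of apps, the two `gh secret set` commands can be looped over the repositories. The sketch below assumes an authenticated `gh` CLI and uses the same two secret names as above; the function name `update_deploy_secrets` and its argument order are hypothetical.

```bash
#!/usr/bin/env bash
# Sketch: set SERVER_HOST and SERVER_SSH_KEY on each listed repository.
# Arguments: <org> <new-server-ip> <private-key-file> <app-name>...
update_deploy_secrets() {
  local org="$1" host="$2" key_file="$3"; shift 3
  local app
  [ -r "$key_file" ] || { echo "cannot read $key_file" >&2; return 1; }
  for app in "$@"; do
    gh secret set SERVER_HOST --repo "$org/$app" --body "$host" || return 1
    # With no --body, gh reads the secret value from stdin.
    gh secret set SERVER_SSH_KEY --repo "$org/$app" < "$key_file" || return 1
  done
}
```

For example, `update_deploy_secrets towlion <new-server-ip> ~/.ssh/<new-server-key> <app-name> <app2-name>`.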
### 13. Decommission the old server

Keep the old server running for 48-72 hours as a safety net. During this period:

- Monitor the new server for errors
- Confirm all deploys go to the new server
- Verify backups run successfully on the new server

Once satisfied, tear down the old server:

```bash
ssh deploy@<old-server-ip>
# Stop all containers
cd /opt/platform && docker compose down
for dir in /opt/apps/*/; do
  app=$(basename "$dir")
  docker compose -p "$app" -f "$dir/deploy/docker-compose.yml" down
done
```

Then delete or destroy the old server instance through your cloud provider.

## Rollback

If issues arise after the DNS switch:

- **Revert DNS**: Point A records back to the old server IP. The old server remains fully functional until explicitly decommissioned. Data written to the new server after the switch will not exist on the old server, so back it up first if it must be preserved.
- **Investigate**: SSH into the new server and check logs, health endpoints, and container status.

## Verification Checklist

- [ ] All platform containers healthy (`docker ps`)
- [ ] All app health endpoints return 200
- [ ] Grafana accessible at ops domain
- [ ] Backup cron running (`crontab -l`)
- [ ] GitHub Actions deploys targeting new server
- [ ] TLS certificates provisioned for all domains
- [ ] Preview environment DNS (wildcard record) updated

## Notes

- Plan the migration during a low-traffic window to minimize impact.
- If you lower DNS TTL before migration, remember to restore it afterward.
- The old server's backups remain available as an additional safety net during the transition period.
