| 1 | +# Longhorn Backup Configuration Summary |
| 2 | + |
| 3 | +Generated on: September 5, 2025 |
| 4 | + |
| 5 | +## Overview |
| 6 | + |
| 7 | +The k3s-argocd-proxmox cluster backs up its Longhorn volumes to a MinIO S3 bucket hosted on TrueNAS. Every PVC using `storageClassName: longhorn` is assigned to a backup tier with recurring snapshots and backups. |
| 8 | + |
| 9 | +## Backup Infrastructure |
| 10 | + |
| 11 | +### S3 Backend Configuration |
| 12 | +- **Target**: `s3://longhorn-backups@us-east-1/` |
| 13 | +- **Storage**: MinIO on TrueNAS at `192.168.10.133` |
| 14 | +- **Credentials**: Managed via External Secrets (1Password) |
| 15 | +- **Compression**: gzip |
| 16 | +- **Concurrent Limit**: 2 backups at once |
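Longhorn S3 targets follow the pattern `s3://<bucket>@<region>/<path>`. With MinIO the region is largely nominal, and the endpoint itself is normally supplied through the credential secret rather than the URL. As a rough illustration of how the target string above breaks down, here is a minimal sketch; the function name `parse_backup_target` is hypothetical and the code only does string handling, without contacting MinIO.

```shell
#!/usr/bin/env bash
# Split a Longhorn S3 backup target (s3://<bucket>@<region>/<path>) into parts.
# Hypothetical helper for illustration; pure string handling, no network access.
parse_backup_target() {
  local target="$1"
  local rest="${target#s3://}"      # drop the scheme
  local bucket="${rest%%@*}"        # text before '@'
  local after_at="${rest#*@}"       # text after '@'
  local region="${after_at%%/*}"    # text before the first '/'
  local path="${after_at#*/}"       # text after the first '/', may be empty
  printf 'bucket=%s region=%s path=%s\n' "$bucket" "$region" "$path"
}

parse_backup_target 's3://longhorn-backups@us-east-1/'
# prints: bucket=longhorn-backups region=us-east-1 path=
```

An empty `path` means backups are written at the root of the `longhorn-backups` bucket.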
| 17 | + |
| 18 | +### Backup Tiers & Schedules |
| 19 | + |
| 20 | +#### 🔴 Critical Tier (Databases, Core Infrastructure) |
| 21 | +- **Snapshots**: Every hour (24 retained = 1 day) |
| 22 | +- **Backups**: Daily at 2 AM (30 retained = 1 month) |
| 23 | +- **Applications**: Redis, PostgreSQL, Container Registry |
| 24 | + |
| 25 | +#### 🟡 Important Tier (User Data, Configurations) |
| 26 | +- **Snapshots**: Every 4 hours (12 retained = 2 days) |
| 27 | +- **Backups**: Daily at 3 AM (14 retained = 2 weeks) |
| 28 | +- **Applications**: Immich, Home Assistant, Paperless-NGX, AI workloads, Monitoring |
| 29 | + |
| 30 | +#### 🔵 Standard Tier (Cache, Logs, Development) |
| 31 | +- **Snapshots**: Daily at 4 AM (7 retained = 1 week) |
| 32 | +- **Backups**: Weekly on Sunday at 5 AM (4 retained = 1 month) |
| 33 | +- **Applications**: Jellyfin config, Development tools, Cache storage |
| 34 | + |
| 35 | +#### 🌐 Weekly Full System Backup |
| 36 | +- **Schedule**: Sunday at 1 AM |
| 37 | +- **Scope**: ALL volumes (critical + important + standard) |
| 38 | +- **Retention**: 8 weeks (2 months) |
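The schedules above map naturally onto cron expressions, and each retention window is the run interval multiplied by the number of copies kept. The sketch below tabulates that arithmetic. The job labels and cron strings are assumptions derived from the descriptions in this section; the actual contents of `recurring-jobs.yaml` are not shown here and may differ.

```shell
#!/usr/bin/env bash
# Tabulate assumed cron expressions and the retention window each schedule implies.
# Columns: label | cron (assumed) | hours between runs | copies retained
schedules='critical-snapshot|0 * * * *|1|24
critical-backup|0 2 * * *|24|30
important-snapshot|0 */4 * * *|4|12
important-backup|0 3 * * *|24|14
standard-snapshot|0 4 * * *|24|7
standard-backup|0 5 * * 0|168|4
weekly-full|0 1 * * 0|168|8'

# Retention window in whole days: (hours between runs * copies retained) / 24
retention_days() {
  echo $(( $1 * $2 / 24 ))
}

printf '%s\n' "$schedules" | while IFS='|' read -r label cron interval retain; do
  printf '%-20s %-14s %3s days\n' "$label" "$cron" "$(retention_days "$interval" "$retain")"
done
```

For example, the Important tier's 4-hour snapshots with 12 copies retained cover 48 hours, matching the "2 days" stated above.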
| 39 | + |
| 40 | +## PVC Backup Configuration by Application |
| 41 | + |
| 42 | +### Infrastructure Components |
| 43 | + |
| 44 | +| PVC Name | Namespace | Storage | Backup Tier | Application | |
| 45 | +|----------|-----------|---------|-------------|-------------| |
| 46 | +| `registry-pvc` | kube-system | 10Gi | 🔴 Critical | Container Registry | |
| 47 | +| `redis-data-redis-master-0` | redis-instance | 10Gi | 🔴 Critical | Redis Database | |
| 48 | + |
| 49 | +### Monitoring Stack |
| 50 | + |
| 51 | +| PVC Name | Namespace | Storage | Backup Tier | Application | |
| 52 | +|----------|-----------|---------|-------------|-------------| |
| 53 | +| Prometheus PVC | monitoring | 20Gi | 🟡 Important | Prometheus Metrics | |
| 54 | +| Alertmanager PVC | monitoring | 2Gi | 🟡 Important | Alert Management | |
| 55 | +| Grafana PVC | monitoring | 5Gi | 🟡 Important | Dashboard Data | |
| 56 | + |
| 57 | +### AI Applications |
| 58 | + |
| 59 | +| PVC Name | Namespace | Storage | Backup Tier | Application | |
| 60 | +|----------|-----------|---------|-------------|-------------| |
| 61 | +| `khoj-data` | khoj | 10Gi | 🟡 Important | AI Assistant Data | |
| 62 | +| `khoj-postgres-data` | khoj | 5Gi | 🟡 Important | AI Assistant DB | |
| 63 | +| `ollama-webui-data` | ollama-webui | 5Gi | 🟡 Important | Chat Interface | |
| 64 | +| `ollama-webui-storage-pvc` | ollama-webui | 5Gi | 🟡 Important | Chat Storage | |
| 65 | + |
| 66 | +### Home Automation |
| 67 | + |
| 68 | +| PVC Name | Namespace | Storage | Backup Tier | Application | |
| 69 | +|----------|-----------|---------|-------------|-------------| |
| 70 | +| `home-assistant-config` | home-assistant | 10Gi | 🟡 Important | Smart Home Config | |
| 71 | +| `frigate-config-pvc` | frigate | 5Gi | 🟡 Important | Video Surveillance | |
| 72 | +| `mqtt-data-pvc` | frigate | 1Gi | 🟡 Important | MQTT Broker | |
| 73 | +| `paperless-data-pvc` | paperless-ngx | 10Gi | 🟡 Important | Document Data | |
| 74 | +| `paperless-media-pvc` | paperless-ngx | 20Gi | 🟡 Important | Document Media | |
| 75 | +| `paperless-consume-pvc` | paperless-ngx | 5Gi | 🟡 Important | Document Intake | |
| 76 | +| `paperless-export-pvc` | paperless-ngx | 5Gi | 🟡 Important | Document Export | |
| 77 | + |
| 78 | +### Media Applications |
| 79 | + |
| 80 | +| PVC Name | Namespace | Storage | Backup Tier | Application | |
| 81 | +|----------|-----------|---------|-------------|-------------| |
| 82 | +| `immich-data` | immich | 20Gi | 🟡 Important | Photo Management | |
| 83 | +| `immich-library` | immich | 100Gi | 🟡 Important | Photo Library | |
| 84 | +| `immich-cache` | immich | 10Gi | 🟡 Important | ML Cache | |
| 85 | +| `plex-config` | plex | 10Gi | 🟡 Important | Media Server Config | |
| 86 | +| `plex-transcode` | plex | 10Gi | 🔵 Standard | Transcode Cache | |
| 87 | +| `plex-logs` | plex | 1Gi | 🔵 Standard | Application Logs | |
| 88 | +| `jellyfin-config-pvc` | jellyfin | 1Gi | 🔵 Standard | Media Server Config | |
| 89 | +| `data-pvc` | hoarder | 10Gi | 🟡 Important | Bookmark Data | |
| 90 | +| `meilisearch-pvc` | hoarder | 10Gi | 🔵 Standard | Search Index | |
| 91 | +| `homepage-config-pvc` | homepage-dashboard | 1Gi | 🔵 Standard | Dashboard Config | |
| 92 | +| `tubearchivist-cache-pvc` | tubearchivist | 50Gi | 🔵 Standard | Video Cache | |
| 93 | +| `tubearchivist-redis-pvc` | tubearchivist | 1Gi | 🔵 Standard | Cache Database | |
| 94 | +| `es-data` | tubearchivist | 50Gi | 🟡 Important | Search Data | |
| 95 | +| `nestmtx-storage-pvc` | nestmtx | 10Gi | 🔵 Standard | Streaming Cache | |
| 96 | + |
| 97 | +### Privacy Applications |
| 98 | + |
| 99 | +| PVC Name | Namespace | Storage | Backup Tier | Application | |
| 100 | +|----------|-----------|---------|-------------|-------------| |
| 101 | +| `proxitok-cache-pvc` | proxitok | 10Gi | 🔵 Standard | TikTok Proxy Cache | |
| 102 | +| `redis-data-pvc` | searxng | 1Gi | 🔵 Standard | Search Cache | |
| 103 | + |
| 104 | +### Development Tools |
| 105 | + |
| 106 | +| PVC Name | Namespace | Storage | Backup Tier | Application | |
| 107 | +|----------|-----------|---------|-------------|-------------| |
| 108 | +| `nginx-storage` | nginx | 1Gi | 🔵 Standard | Web Server Config | |
| 109 | + |
| 110 | +## MinIO S3 Bucket Structure |
| 111 | + |
| 112 | +``` |
| 113 | +longhorn-backups/ |
| 114 | +├── backupstore/ |
| 115 | +│ ├── volumes/ |
| 116 | +│ │ ├── <volume-name>/ |
| 117 | +│ │ │ ├── backups/ |
| 118 | +│ │ │ │ ├── backup-<timestamp>/ |
| 119 | +│ │ │ │ └── backup-<timestamp>/ |
| 120 | +│ │ │ └── volume.cfg |
| 121 | +│ │ └── ... |
| 122 | +│ └── backup_volumes.cfg |
| 123 | +``` |
| 124 | + |
| 125 | +## Monitoring & Verification |
| 126 | + |
| 127 | +### Key Commands |
| 128 | + |
| 129 | +```bash |
| 130 | +# Check backup system health |
| 131 | +./scripts/verify-longhorn-backups.sh |
| 132 | + |
| 133 | +# View all backups |
| 134 | +kubectl get backups.longhorn.io -n longhorn-system |
| 135 | + |
| 136 | +# Check backup target status |
| 137 | +kubectl get backuptargets.longhorn.io -n longhorn-system |
| 138 | + |
| 139 | +# List recurring jobs |
| 140 | +kubectl get recurringjobs.longhorn.io -n longhorn-system |
| 141 | + |
| 142 | +# View volume backup status |
| 143 | +kubectl get volumes.longhorn.io -n longhorn-system |
| 144 | +``` |
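To turn the backup listing into a pass/fail check, the output can be filtered for backups that did not finish. The following is a minimal sketch, not the contents of `verify-longhorn-backups.sh`: the function name `report_unfinished_backups` is hypothetical, and it assumes the Longhorn `Backup` resource reports its state in `.status.state` with `Completed` as the success value.

```shell
#!/usr/bin/env bash
# Read "NAME STATE" lines on stdin and print every backup whose state is not
# Completed. Exits non-zero if at least one such backup is found.
# Hypothetical helper; assumes "Completed" is the success state.
report_unfinished_backups() {
  awk '$2 != "Completed" { print $1 " (" $2 ")"; bad = 1 } END { exit bad }'
}

# Intended use against a live cluster (not executed here):
#   kubectl get backups.longhorn.io -n longhorn-system --no-headers \
#     -o custom-columns=NAME:.metadata.name,STATE:.status.state \
#     | report_unfinished_backups
```

Because the function only reads standard input, it can be exercised with sample lines before pointing it at the cluster.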
| 145 | + |
| 146 | +### Web Interfaces |
| 147 | + |
| 148 | +- **Longhorn UI**: Access via HTTPRoute at your cluster domain |
| 149 | +- **MinIO Console**: `http://192.168.10.133:9002` |
| 150 | +- **TrueNAS**: `http://192.168.10.133` (main interface) |
| 151 | + |
| 152 | +## Backup Verification Checklist |
| 153 | + |
| 154 | +- [ ] All Longhorn PVCs have `longhorn.io/recurring-job-*` annotations |
| 155 | +- [ ] BackupTarget shows as "Available" |
| 156 | +- [ ] External Secrets are syncing credentials |
| 157 | +- [ ] MinIO bucket `longhorn-backups` exists and is accessible |
| 158 | +- [ ] Recent backups appear in `kubectl get backups -n longhorn-system` |
| 159 | +- [ ] No backup job failures in recurring job status |
| 160 | + |
| 161 | +## Emergency Procedures |
| 162 | + |
| 163 | +For disaster recovery procedures, see: |
| 164 | +- `docs/runbooks/longhorn-emergency-procedures.md` |
| 165 | +- `docs/longhorn-backup-guide.md` |
| 166 | + |
| 167 | +## Next Steps |
| 168 | + |
| 169 | +1. **Verify Configuration**: Run `./scripts/verify-longhorn-backups.sh` |
| 170 | +2. **Test Restore**: Practice restoring a volume from backup |
| 171 | +3. **Monitor**: Set up alerts for backup failures |
| 172 | +4. **Document**: Update any application-specific restore procedures |
| 173 | + |
| 174 | +--- |
| 175 | + |
| 176 | +**Last Updated**: September 5, 2025 |
| 177 | +**Configuration Files**: |
| 178 | +- `infrastructure/storage/longhorn/backup-settings.yaml` |
| 179 | +- `infrastructure/storage/longhorn/recurring-jobs.yaml` |
| 180 | +- All PVC files with `storageClassName: longhorn` |