| 1 | +# Testing k8s-admin Unified Authentication |
| 2 | + |
| 3 | +## Prerequisites |
| 4 | + |
| 5 | +### 1. Update 1Password Item |
| 6 | +Ensure the `rustfs` item in 1Password has these exact fields: |
| 7 | + |
| 8 | +``` |
| 9 | +Item: rustfs |
| 10 | +├─ k8s-admin-access-key: "k8s-admin" |
| 11 | +├─ k8s-admin-secret-key: "<secret from RustFS console>" |
| 12 | +├─ restic_password: "<restic encryption password>" |
| 13 | +└─ restic_repository: "s3:http://192.168.10.133:30292/volsync-backup/" |
| 14 | +``` |
| 15 | + |
| 16 | +**Action Required:** |
| 17 | +1. Open 1Password |
| 18 | +2. Find the `rustfs` item |
| 19 | +3. Add/rename fields: |
| 20 | +   - `k8s-admin-access-key` → value should be `k8s-admin` |
| 21 | +   - `k8s-admin-secret-key` → copy the secret key from RustFS console |
| 22 | +4. Save the item |
| 23 | + |
| 24 | +--- |
| 25 | + |
| 26 | +## Testing Plan |
| 27 | + |
| 28 | +### Phase 1: Verify ExternalSecrets Sync |
| 29 | + |
| 30 | +After you commit the changes and ArgoCD syncs (or after a manual apply): |
| 31 | + |
| 32 | +```bash |
| 33 | +# Check ExternalSecret status in volsync-system |
| 34 | +kubectl get externalsecret -n volsync-system |
| 35 | +kubectl describe externalsecret volsync-s3-credentials -n volsync-system |
| 36 | + |
| 37 | +# Expected: condition Ready=True with reason "SecretSynced" |
| 38 | +# If error: Check that 1Password has the k8s-admin-access-key field |
| 39 | + |
| 40 | +# Verify the generated secret has correct keys |
| 41 | +kubectl get secret volsync-s3-credentials -n volsync-system -o yaml |
| 42 | + |
| 43 | +# Should contain: |
| 44 | +# AWS_ACCESS_KEY_ID: <base64 of "k8s-admin"> |
| 45 | +# AWS_SECRET_ACCESS_KEY: <base64 of secret> |
| 46 | +``` |
| 47 | + |
| 48 | +```bash |
| 49 | +# Check ClusterExternalSecret for volsync-rustfs-base |
| 50 | +kubectl get clusterexternalsecret volsync-rustfs-base |
| 51 | +kubectl describe clusterexternalsecret volsync-rustfs-base |
| 52 | + |
| 53 | +# Check generated secrets in labeled namespaces |
| 54 | +kubectl get secret volsync-rustfs-base -n karakeep -o yaml |
| 55 | +kubectl get secret volsync-rustfs-base -n open-webui -o yaml |
| 56 | + |
| 57 | +# Decode and verify AWS_ACCESS_KEY_ID = "k8s-admin" |
| 58 | +kubectl get secret volsync-rustfs-base -n karakeep -o jsonpath='{.data.AWS_ACCESS_KEY_ID}' | base64 -d |
| 59 | +# Expected output: k8s-admin |
| 60 | +``` |
| 61 | + |
| 62 | +```bash |
| 63 | +# Check Longhorn credentials |
| 64 | +kubectl get externalsecret -n longhorn-system |
| 65 | +kubectl describe externalsecret longhorn-backup-credentials -n longhorn-system |
| 66 | + |
| 67 | +kubectl get secret longhorn-backup-credentials -n longhorn-system -o yaml |
| 68 | +# Should contain AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY |
| 69 | +``` |
| 70 | + |
| 71 | +```bash |
| 72 | +# Check monitoring stack credentials |
| 73 | +kubectl get externalsecret -n loki-stack |
| 74 | +kubectl get externalsecret -n monitoring # for tempo |
| 75 | + |
| 76 | +kubectl get secret loki-s3-credentials -n loki-stack -o jsonpath='{.data.AWS_ACCESS_KEY_ID}' | base64 -d |
| 77 | +# Expected: k8s-admin |
| 78 | + |
| 79 | +kubectl get secret tempo-s3-credentials -n monitoring -o jsonpath='{.data.AWS_ACCESS_KEY_ID}' | base64 -d |
| 80 | +# Expected: k8s-admin |
| 81 | +``` |
| 82 | + |
| 83 | +```bash |
| 84 | +# Check database backup credentials |
| 85 | +kubectl get externalsecret -n cloudnative-pg |
| 86 | +kubectl get secret cnpg-s3-credentials -n cloudnative-pg -o jsonpath='{.data.AWS_ACCESS_KEY_ID}' | base64 -d |
| 87 | +# Expected: k8s-admin |
| 88 | + |
| 89 | +kubectl get externalsecret -n crunchy-postgres |
| 90 | +kubectl get secret pgo-s3-credentials -n crunchy-postgres -o jsonpath='{.data}' | jq |
| 91 | +``` |
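
The Phase 1 checks repeat the same decode step for several secrets, so they can be folded into one loop. The sketch below assumes the namespace/secret pairs listed above and that each secret exposes `AWS_ACCESS_KEY_ID`; `check_access_key` and `verify_all` are hypothetical helper names, not part of kubectl or External Secrets.

```bash
# Sketch: confirm every synced secret carries the expected access key ID.
# check_access_key and verify_all are illustrative helper names.

EXPECTED_KEY="k8s-admin"

# Namespace/secret pairs taken from the Phase 1 checks.
SECRETS="
volsync-system/volsync-s3-credentials
karakeep/volsync-rustfs-base
open-webui/volsync-rustfs-base
longhorn-system/longhorn-backup-credentials
loki-stack/loki-s3-credentials
monitoring/tempo-s3-credentials
cloudnative-pg/cnpg-s3-credentials
"

# Print OK or MISMATCH for one namespace/secret pair; return non-zero on mismatch.
check_access_key() {
  ns="${1%%/*}"
  name="${1##*/}"
  actual="$(kubectl get secret "$name" -n "$ns" \
    -o jsonpath='{.data.AWS_ACCESS_KEY_ID}' 2>/dev/null | base64 -d 2>/dev/null)"
  if [ "$actual" = "$EXPECTED_KEY" ]; then
    echo "OK $ns/$name"
  else
    echo "MISMATCH $ns/$name (got '${actual:-<empty>}')"
    return 1
  fi
}

# Check every pair; return non-zero if any of them failed.
verify_all() {
  failed=0
  for pair in $SECRETS; do
    check_access_key "$pair" || failed=1
  done
  return "$failed"
}
```

Calling `verify_all` prints one line per secret and returns non-zero if any decoded key differs from `k8s-admin`. `pgo-s3-credentials` is left out because the checks above inspect its full key set rather than a single field.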
| 92 | + |
| 93 | +--- |
| 94 | + |
| 95 | +### Phase 2: Test VolSync Backups |
| 96 | + |
| 97 | +```bash |
| 98 | +# Check existing ReplicationSources |
| 99 | +kubectl get replicationsource -A |
| 100 | + |
| 101 | +# Pick one that already exists (e.g., karakeep/meilisearch-pvc-backup) |
| 102 | +kubectl describe replicationsource meilisearch-pvc-backup -n karakeep |
| 103 | + |
| 104 | +# Look for: |
| 105 | +# - Status.Conditions: Type=Synchronizing (True while a sync is running, False while waiting for the next trigger) |
| 106 | +# - Status.LastSyncTime: should be recent |
| 107 | +# - Events: should show successful sync |
| 108 | + |
| 109 | +# Trigger manual backup (if schedule hasn't run yet) |
| 110 | +kubectl patch replicationsource meilisearch-pvc-backup -n karakeep \ |
| 111 | + --type=merge -p '{"spec":{"trigger":{"manual":"test-'$(date +%s)'"}}}' |
| 112 | + |
| 113 | +# Watch for backup job to start |
| 114 | +kubectl get jobs -n karakeep -w |
| 115 | + |
| 116 | +# Check logs of backup job |
| 117 | +kubectl logs -n karakeep -l volsync.backube/replication-source=meilisearch-pvc-backup --tail=50 |
| 118 | + |
| 119 | +# Expected: Should show Restic connecting to S3, uploading data, no auth errors |
| 120 | +``` |
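
Instead of watching jobs by hand, the manual trigger can be paired with a poll on `status.lastManualSync`, which VolSync sets to the trigger string once that sync finishes. This is a minimal sketch: `manual_trigger_patch` and `trigger_and_wait` are hypothetical helper names, and the 600-second timeout and 10-second poll interval are arbitrary defaults.

```bash
# Sketch: trigger a manual VolSync sync and wait for it to finish.
# Helper names, timeout, and poll interval are illustrative assumptions.

# Build the merge patch that sets spec.trigger.manual to the given tag.
manual_trigger_patch() {
  printf '{"spec":{"trigger":{"manual":"%s"}}}' "$1"
}

# Usage: trigger_and_wait <replicationsource> <namespace> [timeout-seconds]
trigger_and_wait() {
  name="$1"
  ns="$2"
  timeout="${3:-600}"
  interval="${POLL_INTERVAL:-10}"
  tag="test-$(date +%s)"

  kubectl patch replicationsource "$name" -n "$ns" \
    --type=merge -p "$(manual_trigger_patch "$tag")" || return 1

  waited=0
  while [ "$waited" -lt "$timeout" ]; do
    # status.lastManualSync echoes the trigger string once the sync is done.
    last="$(kubectl get replicationsource "$name" -n "$ns" \
      -o jsonpath='{.status.lastManualSync}')"
    if [ "$last" = "$tag" ]; then
      echo "sync $tag completed"
      return 0
    fi
    sleep "$interval"
    waited=$((waited + interval))
  done
  echo "timed out waiting for sync $tag" >&2
  return 1
}
```

For example, `trigger_and_wait meilisearch-pvc-backup karakeep` patches the ReplicationSource and returns once the matching sync is recorded, or fails after the timeout so the job logs can be inspected.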
| 121 | + |
| 122 | +#### Create Test PVC and Backup |
| 123 | + |
| 124 | +```bash |
| 125 | +# Create test namespace |
| 126 | +kubectl create namespace volsync-test |
| 127 | +kubectl label namespace volsync-test volsync.backube/privileged-movers=true |
| 128 | + |
| 129 | +# Wait for ExternalSecret to sync |
| 130 | +sleep 10 |
| 131 | +kubectl get secret volsync-rustfs-base -n volsync-test |
| 132 | + |
| 133 | +# Create test PVC with backup label |
| 134 | +cat <<EOF | kubectl apply -f - |
| 135 | +apiVersion: v1 |
| 136 | +kind: PersistentVolumeClaim |
| 137 | +metadata: |
| 138 | +  name: test-backup-pvc |
| 139 | +  namespace: volsync-test |
| 140 | +  labels: |
| 141 | +    backup: "hourly" |
| 142 | +spec: |
| 143 | +  accessModes: |
| 144 | +    - ReadWriteOnce |
| 145 | +  storageClassName: longhorn |
| 146 | +  resources: |
| 147 | +    requests: |
| 148 | +      storage: 1Gi |
| 149 | +EOF |
| 150 | + |
| 151 | +# Write some test data |
| 152 | +kubectl run test-writer --image=busybox --restart=Never -n volsync-test \ |
| 153 | + --overrides='{"spec":{"containers":[{"name":"test","image":"busybox","command":["sh","-c","echo \"Test data at $(date)\" > /data/test.txt && sleep 3600"],"volumeMounts":[{"name":"data","mountPath":"/data"}]}],"volumes":[{"name":"data","persistentVolumeClaim":{"claimName":"test-backup-pvc"}}]}}' |
| 154 | + |
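| 155 | +# Wait for the pod to start, then read the file back (the pod writes to /data, not stdout, so its logs stay empty) |
| 156 | +kubectl wait --for=condition=Ready pod/test-writer -n volsync-test --timeout=120s |
| 157 | +kubectl exec test-writer -n volsync-test -- cat /data/test.txt |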
| 158 | + |
| 159 | +# Check if Kyverno generated ReplicationSource |
| 160 | +kubectl get replicationsource -n volsync-test |
| 161 | +# Expected: test-backup-pvc-backup should exist |
| 162 | + |
| 163 | +# Describe it to see status |
| 164 | +kubectl describe replicationsource test-backup-pvc-backup -n volsync-test |
| 165 | + |
| 166 | +# Check Kyverno events on PVC |
| 167 | +kubectl describe pvc test-backup-pvc -n volsync-test |
| 168 | +# Should show events from kyverno about generating ReplicationSource |
| 169 | + |
| 170 | +# Wait for next hour boundary or trigger manual backup |
| 171 | +kubectl patch replicationsource test-backup-pvc-backup -n volsync-test \ |
| 172 | + --type=merge -p '{"spec":{"trigger":{"manual":"test-'$(date +%s)'"}}}' |
| 173 | + |
| 174 | +# Watch backup job |
| 175 | +kubectl get jobs -n volsync-test -w |
| 176 | + |
| 177 | +# Check job logs |
| 178 | +kubectl logs -n volsync-test -l volsync.backube/replication-source=test-backup-pvc-backup -f |
| 179 | +``` |
| 180 | + |
| 181 | +**Expected Success Indicators:** |
| 182 | +- ReplicationSource shows `Synchronizing: True` while the backup runs, then a fresh `Last Sync Time` |
| 183 | +- Job completes successfully |
| 184 | +- Logs show Restic uploading to S3 without auth errors |
| 185 | +- Check RustFS console: `volsync-backup/volsync-test/test-backup-pvc/` should have data |
| 186 | + |
| 187 | +**Common Errors:** |
| 188 | +- `Access Denied` → k8s-admin-secret-key is wrong in 1Password |
| 189 | +- `Secret not found` → ExternalSecret hasn't synced yet |
| 190 | +- `Repository does not exist` → First backup will init repo (normal) |
| 191 | + |
| 192 | +--- |
| 193 | + |
| 194 | +### Phase 3: Test Longhorn Backups |
| 195 | + |
| 196 | +```bash |
| 197 | +# Check Longhorn backup target configuration |
| 198 | +kubectl get setting -n longhorn-system backup-target -o yaml |
| 199 | + |
| 200 | +# Should show: s3://longhorn@... with credentials from secret |
| 201 | + |
| 202 | +# Trigger test backup of a Longhorn volume |
| 203 | +# Find a volume to test with |
| 204 | +kubectl get volumes -n longhorn-system |
| 205 | + |
| 206 | +# Create backup via Longhorn UI or kubectl |
| 207 | +# (Longhorn backups are typically done via UI or custom scripts) |
| 208 | + |
| 209 | +# Alternative: Check if existing backups are accessible |
| 210 | +# Login to Longhorn UI and go to Backup tab |
| 211 | +# Expected: Should be able to see existing backups without auth errors |
| 212 | +``` |
| 213 | + |
| 214 | +--- |
| 215 | + |
| 216 | +### Phase 4: Test Monitoring Stack S3 Access |
| 217 | + |
| 218 | +```bash |
| 219 | +# Check Loki is writing to S3 (chunks storage) |
| 220 | +kubectl logs -n loki-stack -l app.kubernetes.io/name=loki --tail=50 | grep -i s3 |
| 221 | + |
| 222 | +# Should NOT see auth errors like: |
| 223 | +# - "Access Denied" |
| 224 | +# - "InvalidAccessKeyId" |
| 225 | + |
| 226 | +# Check Tempo is writing to S3 (traces storage) |
| 227 | +kubectl logs -n monitoring -l app.kubernetes.io/name=tempo --tail=50 | grep -i s3 |
| 228 | +``` |
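
Scanning log output by eye is easy to get wrong, so the error strings above can be turned into a filter. This is a minimal sketch: `s3_auth_errors` and `check_component` are hypothetical helper names, and `SignatureDoesNotMatch` (the S3 error code for a wrong secret key) is an addition beyond the two strings listed above.

```bash
# Sketch: flag S3 authentication failures in component logs.
# Helper names and the 200-line tail are illustrative assumptions.

# Filter stdin for auth errors; exit status follows grep (0 = match found).
s3_auth_errors() {
  grep -E -i 'access ?denied|InvalidAccessKeyId|SignatureDoesNotMatch'
}

# Usage: check_component <namespace> <label-selector>
check_component() {
  ns="$1"
  selector="$2"
  if kubectl logs -n "$ns" -l "$selector" --tail=200 2>&1 | s3_auth_errors; then
    echo "FAIL $ns ($selector): S3 auth errors found"
    return 1
  fi
  echo "OK $ns ($selector): no S3 auth errors in the last 200 lines"
}
```

With the selectors used above, the two checks become `check_component loki-stack app.kubernetes.io/name=loki` and `check_component monitoring app.kubernetes.io/name=tempo`. Matching lines are printed before the FAIL summary so the offending error stays visible.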
| 229 | + |
| 230 | +--- |
| 231 | + |
| 232 | +### Phase 5: Test Database Backups |
| 233 | + |
| 234 | +```bash |
| 235 | +# Check CloudNativePG backups |
| 236 | +kubectl get backups -n cloudnative-pg |
| 237 | +kubectl describe backup <backup-name> -n cloudnative-pg |
| 238 | + |
| 239 | +# Check Crunchy Postgres backups |
| 240 | +kubectl get postgrescluster -n crunchy-postgres immich -o yaml | grep -A 10 backup |
| 241 | + |
| 242 | +# Trigger manual backup (--overwrite lets the annotation be updated on repeat runs) |
| 243 | +kubectl annotate postgrescluster immich -n crunchy-postgres --overwrite \ |
| 244 | +  postgres-operator.crunchydata.com/pgbackrest-backup="$(date +%Y-%m-%d-%H-%M-%S)" |
| 245 | + |
| 246 | +# Check backup job logs |
| 247 | +kubectl logs -n crunchy-postgres -l postgres-operator.crunchydata.com/pgbackrest-backup --tail=100 |
| 248 | +``` |
| 249 | + |
| 250 | +--- |
| 251 | + |
| 252 | +## Cleanup Test Resources |
| 253 | + |
| 254 | +```bash |
| 255 | +# Remove test PVC and namespace |
| 256 | +kubectl delete pod test-writer -n volsync-test |
| 257 | +kubectl delete pvc test-backup-pvc -n volsync-test |
| 258 | +kubectl delete replicationsource test-backup-pvc-backup -n volsync-test |
| 259 | +kubectl delete namespace volsync-test |
| 260 | +``` |
| 261 | + |
| 262 | +--- |
| 263 | + |
| 264 | +## Rollback Plan (If Something Breaks) |
| 265 | + |
| 266 | +If authentication fails, you can quickly roll back: |
| 267 | + |
| 268 | +### Option 1: Revert 1Password (Quick Fix) |
| 269 | +1. Rename fields back to old names temporarily: |
| 270 | + - `k8s-admin-access-key` → `access_key` (or `loki_access_key` for monitoring) |
| 271 | + - `k8s-admin-secret-key` → `secret_key` (or `loki` for monitoring) |
| 272 | +2. Wait 1 hour for ExternalSecrets to refresh, or force sync: |
| 273 | + ```bash |
| 274 | + kubectl annotate externalsecret volsync-s3-credentials -n volsync-system \ |
| 275 | + force-sync=$(date +%s) --overwrite |
| 276 | + ``` |
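
The force-sync annotation above targets a single ExternalSecret; the same annotation can be applied to every ExternalSecret in the affected namespaces. This is a minimal sketch, assuming the namespace list from Phase 1; `force_sync_namespace` and `force_sync_all` are hypothetical helper names.

```bash
# Sketch: force-refresh every ExternalSecret in the namespaces touched by this change.
# Helper names are illustrative; the namespace list comes from the Phase 1 checks.

NAMESPACES="volsync-system longhorn-system loki-stack monitoring cloudnative-pg crunchy-postgres"

# Annotate each ExternalSecret in one namespace with a fresh force-sync value.
force_sync_namespace() {
  ns="$1"
  stamp="$(date +%s)"
  # "-o name" yields resource/name pairs that kubectl annotate accepts directly.
  for es in $(kubectl get externalsecret -n "$ns" -o name); do
    kubectl annotate "$es" -n "$ns" force-sync="$stamp" --overwrite
  done
}

# Run the refresh across all listed namespaces.
force_sync_all() {
  for target in $NAMESPACES; do
    force_sync_namespace "$target"
  done
}
```

Running `force_sync_all` after renaming the 1Password fields avoids waiting for each refresh interval. Namespaces served by the ClusterExternalSecret (such as `karakeep` and `open-webui`) can be appended to `NAMESPACES` if they also need an immediate refresh.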
| 277 | + |
| 278 | +### Option 2: Revert Git Changes |
| 279 | +```bash |
| 280 | +git revert HEAD |
| 281 | +git push |
| 282 | +# ArgoCD will auto-sync back to old config |
| 283 | +``` |
| 284 | + |
| 285 | +### Option 3: Manual Secret Override (Emergency) |
| 286 | +```bash |
| 287 | +# Manually create secret with correct credentials (an ExternalSecret that still targets it may overwrite it on its next refresh) |
| 288 | +kubectl create secret generic volsync-s3-credentials \ |
| 289 | + -n volsync-system \ |
| 290 | + --from-literal=AWS_ACCESS_KEY_ID=k8s-admin \ |
| 291 | + --from-literal=AWS_SECRET_ACCESS_KEY='<actual-secret>' \ |
| 292 | + --dry-run=client -o yaml | kubectl apply -f - |
| 293 | +``` |
| 294 | + |
| 295 | +--- |
| 296 | + |
| 297 | +## Success Criteria Checklist |
| 298 | + |
| 299 | +- [ ] All ExternalSecrets report `Ready: True` (reason `SecretSynced`) |
| 300 | +- [ ] Secrets contain `AWS_ACCESS_KEY_ID=k8s-admin` (decoded) |
| 301 | +- [ ] VolSync test backup completes successfully |
| 302 | +- [ ] No S3 authentication errors in any component logs |
| 303 | +- [ ] RustFS console shows new backup data in test namespace |
| 304 | +- [ ] Longhorn backup target accessible in UI |
| 305 | +- [ ] Loki and Tempo logs show no S3 errors |
| 306 | +- [ ] Database backup jobs complete successfully |
| 307 | + |
| 308 | +Once all checkboxes are ✅, the k8s-admin unified authentication is working! |