| 1 | +# Longhorn 1.10 Upgrade Runbook |
| 2 | + |
| 3 | +This runbook captures the exact prechecks and commands required to safely upgrade Longhorn to v1.10.x in this cluster. |
| 4 | + |
| 5 | +Important highlights from the v1.10.0 release notes: |
| 6 | +- Kubernetes must be >= 1.25 |
| 7 | +- The Longhorn v1beta1 API was removed. If your cluster ever stored Longhorn CRs in v1beta1 (likely if you originally installed < v1.3.0), you MUST migrate CR storage to v1beta2 before upgrading. |
| 8 | +- Some defaultSettings accept per-data-engine JSON values. A known 1.10.0 bug can affect boolean DataEngineSpecific values when they are set through Helm values, so we keep these settings as simple scalars unless we actively use the V2 data engine. |
| 9 | + |
| 10 | +## Pre-checks |
| 11 | +- Ensure Kubernetes >= 1.25 across the cluster. |
| 12 | +- Confirm ArgoCD will install CRDs during the Helm upgrade (the kustomization's Helm chart entry sets `includeCRDs: true`). |
| 13 | +- Optionally snapshot/backup critical workloads. |
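
The Kubernetes version pre-check can be scripted. The following is a minimal sketch, not part of this repository: the function names `k8s_version_ok` and `check_nodes` are hypothetical, and it assumes `kubectl` already points at the target cluster. It compares each node's kubelet version against the 1.25 minimum.

```shell
#!/usr/bin/env bash
# Hypothetical helper: returns 0 when a kubelet version string
# (for example "v1.27.4+k3s1") is at least 1.25.
k8s_version_ok() {
  local v="${1#v}"            # drop the leading "v"
  local major="${v%%.*}"      # text before the first dot
  local rest="${v#*.}"        # text after the first dot
  local minor="${rest%%.*}"   # text before the next dot
  minor="${minor%%[!0-9]*}"   # strip any non-numeric suffix
  [ "$major" -gt 1 ] || { [ "$major" -eq 1 ] && [ "$minor" -ge 25 ]; }
}

# Hypothetical helper: prints every node whose kubelet is older than 1.25.
# Prints nothing when all nodes pass.
check_nodes() {
  kubectl get nodes \
    -o jsonpath='{range .items[*]}{.metadata.name}{" "}{.status.nodeInfo.kubeletVersion}{"\n"}{end}' |
  while read -r name version; do
    k8s_version_ok "$version" || echo "too old: $name ($version)"
  done
}
```

After sourcing the snippet, run `check_nodes`; any output means at least one node must be upgraded first.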
| 14 | + |
| 15 | +## Mandatory: CRD storage version migration (before upgrading) |
| 16 | +Before the migration, fix legacy CRD conversion blocks that can break CRD applies during upgrade. |
| 17 | + |
| 18 | +1) Fix CRD conversion blocks (older installs sometimes leave webhookClientConfig while strategy isn't Webhook): |
| 19 | + |
| 20 | +``` |
| 21 | +./scripts/longhorn-fix-crd-conversion.sh |
| 22 | +``` |
| 23 | + |
| 24 | +2) Migrate any Longhorn CRDs that still list v1beta1 in storedVersions to v1beta2 using the helper script. |
| 25 | + |
| 26 | +Sub-steps (requires kubectl + jq): |
| 27 | +2a) Pause Longhorn syncs in ArgoCD (optional but recommended during the migration window). |
| 28 | +2b) Run the script: |
| 29 | + |
| 30 | +``` |
| 31 | +./scripts/longhorn-v110-crd-migration.sh |
| 32 | +``` |
| 33 | + |
| 34 | +2c) Verify that all Longhorn CRDs show only v1beta2 in storedVersions: |
| 35 | + |
| 36 | +``` |
| 37 | +kubectl get crd -l app.kubernetes.io/name=longhorn -o=jsonpath='{range .items[*]}{.metadata.name}{": "}{.status.storedVersions}{"\n"}{end}' |
| 38 | +``` |
| 39 | +Expected: every line shows ["v1beta2"]. If any show v1beta1, re-run the script or investigate. |
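
To avoid checking the output by eye, the verification above can be wrapped in a small filter. This is a sketch under assumptions: `unmigrated_crds` and `verify_stored_versions` are hypothetical names, and the filter relies on kubectl printing storedVersions in the `["v1beta2"]` form shown in the expected output.

```shell
#!/usr/bin/env bash
# Hypothetical filter: reads "<crd-name>: <storedVersions>" lines on stdin
# and prints only those whose storedVersions is not exactly ["v1beta2"].
unmigrated_crds() {
  awk 'NF && $0 !~ /: \["v1beta2"\]$/'
}

# Hypothetical wrapper: runs the kubectl query from the runbook and fails
# if the query errors, matches no CRDs, or finds leftover stored versions.
verify_stored_versions() {
  local out bad
  out="$(kubectl get crd -l app.kubernetes.io/name=longhorn \
    -o=jsonpath='{range .items[*]}{.metadata.name}{": "}{.status.storedVersions}{"\n"}{end}')" || return 1
  if [ -z "$out" ]; then
    echo "no Longhorn CRDs matched the label selector" >&2
    return 1
  fi
  bad="$(printf '%s\n' "$out" | unmigrated_crds)"
  if [ -n "$bad" ]; then
    echo "CRDs with stored versions other than v1beta2:" >&2
    echo "$bad" >&2
    return 1
  fi
  echo "all Longhorn CRDs store only v1beta2"
}
```

A non-zero exit from `verify_stored_versions` means the migration script should be re-run or the listed CRDs investigated before syncing the upgrade.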
| 40 | + |
| 41 | +## Upgrade via ArgoCD |
| 42 | +- Chart version is pinned to 1.10.0 in `infrastructure/storage/longhorn/kustomization.yaml`. |
| 43 | +- Values are managed in `infrastructure/storage/longhorn/values.yaml`. |
| 44 | +- Pre-upgrade checker job is disabled to avoid GitOps drift (`preUpgradeChecker.jobEnabled: false`). |
| 45 | + |
| 46 | +Sync the Longhorn app in ArgoCD. Wait for all pods in `longhorn-system` to become Ready. |
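
Waiting for readiness can be done from the command line as well as from the ArgoCD UI. The sketch below is an assumption-laden example: `not_ready_pods` and `wait_for_longhorn` are hypothetical names, and the parser assumes the default column layout of `kubectl get pods --no-headers` (NAME, READY, STATUS, ...).

```shell
#!/usr/bin/env bash
# Hypothetical filter: reads `kubectl get pods --no-headers` output on stdin
# and prints pods whose READY count is below the total. Pods in the
# Completed state (finished Jobs) are ignored.
not_ready_pods() {
  awk '$3 != "Completed" { split($2, r, "/"); if (r[1] != r[2]) print $1, $2, $3 }'
}

# Hypothetical wrapper: polls longhorn-system every 10 seconds, up to the
# given number of tries (default 60), until no pod is reported as not ready.
wait_for_longhorn() {
  local tries="${1:-60}" out pending=""
  while [ "$tries" -gt 0 ]; do
    if out="$(kubectl -n longhorn-system get pods --no-headers)" && [ -n "$out" ]; then
      pending="$(printf '%s\n' "$out" | not_ready_pods)"
      [ -z "$pending" ] && return 0
    fi
    sleep 10
    tries=$((tries - 1))
  done
  echo "still not ready:" >&2
  echo "$pending" >&2
  return 1
}
```

Calling `wait_for_longhorn 90` would poll for roughly 15 minutes before giving up and listing the pods that are still not Ready.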
| 47 | + |
| 48 | +## Post-upgrade checks |
| 49 | +- Pods healthy: |
| 50 | + - longhorn-manager, longhorn-ui, longhorn-csi-plugin, csi-* sidecars, instance-manager, engine-image, share-manager |
| 51 | +- Longhorn UI reachable (via existing Gateway/HTTPRoute) |
| 52 | +- Create a test PVC, attach to a test pod, write small data, and verify persistence. |
| 53 | +- Recurring jobs present (from `recurring-jobs.yaml`). |
| 54 | +- Backup target is detected (S3/MinIO) and can list/create a small backup. |
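
The test PVC check above could look like the following sketch. Everything named here is an assumption rather than something defined in this repository: the resource name `longhorn-upgrade-test`, the `busybox:1.36` image, and the StorageClass name `longhorn` (adjust it if this cluster uses a different class). The function only prints the manifest, so nothing is applied until it is piped to kubectl.

```shell
#!/usr/bin/env bash
# Hypothetical helper: prints a throwaway PVC and a pod that writes one file
# to it. Assumes a StorageClass named "longhorn" exists in the cluster.
test_pvc_manifest() {
  cat <<'EOF'
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: longhorn-upgrade-test
spec:
  accessModes: ["ReadWriteOnce"]
  storageClassName: longhorn
  resources:
    requests:
      storage: 1Gi
---
apiVersion: v1
kind: Pod
metadata:
  name: longhorn-upgrade-test
spec:
  restartPolicy: Never
  containers:
    - name: writer
      image: busybox:1.36
      command: ["sh", "-c", "echo ok > /data/probe && sleep 3600"]
      volumeMounts:
        - name: data
          mountPath: /data
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: longhorn-upgrade-test
EOF
}
```

A possible flow: `test_pvc_manifest | kubectl apply -f -`, then `kubectl exec longhorn-upgrade-test -- cat /data/probe` once the pod is running. To check persistence, delete only the pod, re-apply the manifest, and read the file again. Finish with `test_pvc_manifest | kubectl delete -f -`.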
| 55 | + |
| 56 | +## Rollback guidance (only if upgrade fails early) |
| 57 | +If the upgrade was applied without the CRD migration, longhorn-manager pods may fail with errors about CRD storedVersions. Follow the v1.10 release notes to temporarily patch the webhook, downgrade to the exact 1.9.x version that was running before, run the migration script above, and then retry the upgrade. |
| 58 | + |
| 59 | +Reference: |
| 60 | +- Release notes: https://github.com/longhorn/longhorn/releases/tag/v1.10.0 |
| 61 | +- Install with Helm Controller (context for K3s/RKE2): https://longhorn.io/docs/1.10.0/deploy/install/install-with-helm-controller/ |