Releases: longhorn/longhorn
Longhorn v1.11.0
Longhorn v1.11.0 Release Notes
The Longhorn team is excited to announce the release of Longhorn v1.11.0. This release marks a major milestone, with the V2 Data Engine officially entering the Technical Preview stage following significant stability improvements.
This version also improves overall system stability and introduces critical improvements in resource observability, scheduling, and utilization.
For terminology and background on Longhorn releases, see Releases.
Warning
Hotfix
longhorn-instance-manager Image
The longhorn-instance-manager:v1.11.0 image is affected by a regression introduced by the new longhorn-instance-manager Proxy service APIs. The bug leaks Proxy connections in the longhorn-instance-manager pods, which increases memory usage. To mitigate this issue, replace longhornio/longhorn-instance-manager:v1.11.0 with the hotfixed image longhornio/longhorn-instance-manager:v1.11.0-hotfix-1.
You can apply the update by following these steps:
- Update the longhorn-instance-manager image
  - Change the longhorn-instance-manager image tag from v1.11.0 to v1.11.0-hotfix-1 in the appropriate file:
    - For Helm: Update values.yaml.
    - For manifests: Update the deployment manifest directly.
- Proceed with the installation or upgrade
  - Apply the changes using your standard Helm install/upgrade command or reapply the updated manifest.
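For Helm users, the tag change might look like the following values.yaml fragment. This is a sketch only: it assumes the chart exposes the image under image.longhorn.instanceManager, so check the values.yaml shipped with your chart version before applying it.

```yaml
# values.yaml (fragment). The key layout is an assumption based on the
# Longhorn Helm chart; verify it against your chart version.
image:
  longhorn:
    instanceManager:
      repository: longhornio/longhorn-instance-manager
      tag: v1.11.0-hotfix-1   # previously v1.11.0
```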
longhorn-manager Image
The longhorn-manager:v1.11.0 image is affected by a regression introduced by the new Kubernetes Node validator. The bug blocks setting Kubernetes node CNI labels because it waits for the Longhorn webhook server to be running, while the Longhorn webhook server waits for the CNI network to be ready, creating a circular dependency. To mitigate this issue, replace longhornio/longhorn-manager:v1.11.0 with the hotfixed image longhornio/longhorn-manager:v1.11.0-hotfix-1.
You can apply the update by following these steps:
- Disable the upgrade version check
  - Helm users: Set upgradeVersionCheck to false in the values.yaml file.
  - Manifest users: Remove the --upgrade-version-check flag from the deployment manifest.
- Update the longhorn-manager image
  - Change the longhorn-manager image tag from v1.11.0 to v1.11.0-hotfix-1 in the appropriate file:
    - For Helm: Update values.yaml.
    - For manifests: Update the deployment manifest directly.
- Proceed with the installation or upgrade
  - Apply the changes using your standard Helm install/upgrade command or reapply the updated manifest.
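For Helm users, the two values.yaml changes above could be combined roughly as follows. This is a sketch under assumptions: the nesting of upgradeVersionCheck under preUpgradeChecker and of the image under image.longhorn.manager follows the Longhorn Helm chart as commonly shipped, and should be confirmed against the values.yaml of your chart version.

```yaml
# values.yaml (fragment). The key layout is an assumption based on the
# Longhorn Helm chart; verify it against your chart version.
preUpgradeChecker:
  upgradeVersionCheck: false   # disable the upgrade version check
image:
  longhorn:
    manager:
      repository: longhornio/longhorn-manager
      tag: v1.11.0-hotfix-1    # previously v1.11.0
```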
Deprecation
V2 Backing Image Deprecation
The Backing Image feature for the V2 Data Engine is now deprecated in v1.11.0 and is scheduled for removal in v1.12.0.
Users who run V2 volumes for virtual machines are encouraged to adopt the Containerized Data Importer (CDI) for volume population instead.
Primary Highlights
V2 Data Engine
Now in Technical Preview Stage
We are pleased to announce that the V2 Data Engine has officially graduated to the Technical Preview stage. This indicates increased stability and feature maturity as we move toward General Availability.
Limitation: While the engine is in Technical Preview, live upgrade is not supported yet. V2 volumes must be detached (offline) before engine upgrade.
Support for ublk Frontend
Longhorn supports configuring UBLK performance parameters globally, per volume, or via StorageClass to improve I/O performance.
V1 Data Engine
Faster Replica Rebuilding from Multiple Sources
The V1 Data Engine now supports parallel rebuilding. When a replica needs to be rebuilt, the engine can stream data from multiple healthy replicas simultaneously rather than from a single source. This significantly reduces the time required to restore redundancy for volumes containing many scattered data chunks.
General
Balance-Aware Algorithm Disk Selection For Replica Scheduling
Longhorn improves disk selection for replica scheduling with a balance-aware scheduling algorithm that reduces uneven storage usage across nodes and disks.
Node Disk Health Monitoring
Longhorn now actively monitors the physical health of the underlying storage disks by using S.M.A.R.T. data. This allows administrators to identify issues and raise alerts when abnormal S.M.A.R.T. metrics are detected, helping prevent volume failures.
Share Manager Networking
Users can now configure an extra network interface for the Share Manager to support complex network segmentation requirements.
ReadWriteOncePod (RWOP) Support
Full support for the Kubernetes ReadWriteOncePod access mode has been added.
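As an illustration, a PersistentVolumeClaim that requests this access mode might look like the sketch below. The claim name and size are hypothetical, and the StorageClass name assumes the default class created by a standard Longhorn installation.

```yaml
# PersistentVolumeClaim requesting single-pod access (illustrative values)
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: rwop-example            # hypothetical name
spec:
  accessModes:
    - ReadWriteOncePod          # only one pod in the cluster may use the volume
  storageClassName: longhorn    # assumes the default Longhorn StorageClass name
  resources:
    requests:
      storage: 1Gi              # hypothetical size
```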
StorageClass allowedTopologies Support
Administrators can now use the allowedTopologies field in Longhorn StorageClasses to restrict volume provisioning to specific zones, regions, or nodes within the cluster.
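A StorageClass restricted to a single zone could be sketched as follows. The class name, zone value, and replica count are hypothetical, and the topology key is an assumption: use whichever topology label your nodes and the Longhorn CSI driver actually report.

```yaml
# StorageClass limiting provisioning to one zone (illustrative values)
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: longhorn-zone-a                    # hypothetical name
provisioner: driver.longhorn.io
parameters:
  numberOfReplicas: "3"                    # hypothetical replica count
allowedTopologies:
  - matchLabelExpressions:
      - key: topology.kubernetes.io/zone   # assumed topology key
        values:
          - zone-a                         # hypothetical zone label value
```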
Installation
Important
Ensure that your cluster is running Kubernetes v1.25 or later before installing Longhorn v1.11.0.
You can install Longhorn using a variety of tools, including Rancher, kubectl, and Helm. For more information about installation methods and requirements, see Quick Installation in the Longhorn documentation.
Upgrade
Important
Ensure that your cluster is running Kubernetes v1.25 or later before upgrading from Longhorn v1.10.x to v1.11.0.
Longhorn only allows upgrades from supported versions. For more information about upgrade paths and procedures, see Upgrade in the Longhorn documentation.
Post-Release Known Issues
For information about issues identified after this release, see Release-Known-Issues.
Resolved Issues in this release
Highlight
- [FEATURE] Add support for ReadWriteOncePod access mode 9727 - @derekbit @shikanime @chriscchien @Copilot
- [FEATURE] Scale replica rebuilding speed from multiple healthy replicas 11331 - @derekbit @shuo-wu @roger-ryao @Copilot
- [FEATURE] Support StorageClass allowedTopologies for Longhorn volumes 12261 - @yangchiu @derekbit @hookak @Copilot
- [FEATURE] Support extra network interface (not only storage network) on the share manager pod 10269 - @yangchiu @c3y1huang
- [FEATURE] Monitor Node Disk Health 12016 - @c3y1huang @roger-ryao
- [FEATURE] Replica Auto Balance Across Nodes based on Node Disk Space Consumption 10512 - @davidcheng0922 @chriscchien
Feature
- [FEATURE] Guess Linux distro from the package manager 12153 - @yangchiu @derekbit @NamrathShetty @Copilot
- [FEATURE] Provide a helm chart setting to define the managerUrl 10583 - @lexfrei @yangchiu
- [FEATURE] Add metric for last backup of a volume 6049 - @c3y1huang @roger-ryao
- [FEATURE] Real-time volume performance monitoring 368 - @derekbit @hookak
- [UI][FEATURE] Monitor Node Disk Health 12263 - @houhoucoop @roger-ryao
- [FEATURE] custom annotation/label of UI's k8s service on value.yaml of helm chart 11754 - @yangchiu @lucasl0st
- [FEATURE] Make longhornctl load ublk_drv module when kernel version is 6 or newer 11803 - @chriscchien @bachmanity1
- [BUG] Inherit namespace for longhorn-share-manager in FastFailover mode 12244 - @yangchiu @semenas
- [FEATURE] Enable CSI pod anti-affinity preset update 12100 - @yangchiu @yulken
- [FEATURE] [Dependency] aws-sdk-go v1.55.7 is EOL as of 2025-07-31 — plan to migrate to v2? 12098 - @mantissahz @roger-ryao
- [FEATURE] Change volume operation menu button behaviour from hover to click. 11408 - @yangchiu @houhoucoop
- [FEATURE] "hard" podAntiAffinity for csi-attacher/csi-provisioner/csi-resizer/csi-snapshotter 11617 - @yangchiu @yulken
- [FEATURE] node storage scheduled metrics 11949 - @yangchiu @AoRuiAC
Impro...
Longhorn v1.10.2
Longhorn v1.10.2 Release Notes
Longhorn 1.10.2 introduces several improvements and bug fixes that are intended to improve system quality, resilience, stability and security.
We welcome feedback and contributions to help continuously improve Longhorn.
For terminology and context on Longhorn releases, see Releases.
Important Fixes
This release includes several critical stability fixes.
RWX Volume Unavailable After Node Drain
Fixed a race condition where ReadWriteMany (RWX) volumes could remain in the attaching state after node drains, causing workloads to become unavailable.
For more details, see Issue #12231.
Encrypted Volume Cannot Be Expanded Online
Fixed an issue where online expansion of encrypted volumes did not propagate the new size to the dm-crypt device.
For more details, see Issue #12368.
Cloned Volume Cannot Be Attached to Workload
Fixed a bug where cloned volumes could fail to reach a healthy state, preventing attachment to workloads.
For more details, see Issue #12208.
Block Mode Volume Migration Stuck
Fixed a regression in block-mode volume migrations where newly created replicas could incorrectly inherit the lastFailedAt timestamp from source replicas, causing repeated deletion and blocking migration completion.
For more details, see Issue #12312.
Replica Auto Balance Disk Pressure Threshold Stalled
Fixed an issue where replica auto-balance under disk pressure could be blocked if stopped volumes were present on the disk.
For more details, see Issue #12334.
Replicas Accumulate During Engine Upgrade
Fixed a bug where temporary replicas could accumulate during engine upgrade. High etcd latency could cause new replicas to fail verification, leading to accumulation over multiple reconciliation cycles.
For more details, see Issue #12115.
Potential Client Connection and Context Leak
Fixed potential context leaks in the instance manager client and backing image manager client, improving stability and preventing resource exhaustion.
For more details, see Issue #12200 and Issue #12195.
Replica Node Level Soft Anti-Affinity Ignored
Fixed a bug in the replica scheduling loop where replicas could be scheduled onto nodes that already host a replica, even when Replica Node-Level Soft Anti-Affinity was disabled.
For more details, see Issue #12251.
Installation
Important
Ensure that your cluster is running Kubernetes v1.25 or later before installing Longhorn v1.10.2.
You can install Longhorn using a variety of tools, including Rancher, kubectl, and Helm. For more information about installation methods and requirements, see Quick Installation in the Longhorn documentation.
Upgrade
Important
Ensure that your cluster is running Kubernetes v1.25 or later before upgrading from Longhorn v1.9.x to v1.10.2.
Longhorn only allows upgrades from supported versions. For more information about upgrade paths and procedures, see Upgrade in the Longhorn documentation.
Post-Release Known Issues
For information about issues identified after this release, see Release-Known-Issues.
Resolved Issues
Feature
- [BACKPORT][v1.10.2][FEATURE] Inherit namespace for longhorn-share-manager in FastFailover mode 12245 - @yangchiu
- [BACKPORT][v1.10.2][FEATURE] [Dependency] aws-sdk-go v1.55.7 is EOL as of 2025-07-31 — plan to migrate to v2? 12181 - @mantissahz @roger-ryao
Improvement
- [BACKPORT][v1.10.2][IMPROVEMENT] Fix V2 Volume CSI Clone Slowness Caused by VolumeAttachment Webhook Blocking 12329 - @PhanLe1010 @roger-ryao
Bug
- [BACKPORT][v1.10.2][BUG] instance-manager on nodes that don't have hard or solid state disk DDOSing cluster DNS server with TXT query _grpc_config.localhost 12536 - @COLDTURNIP @chriscchien
- [BACKPORT] Replica rebuild, clone and restore fail, traffic being sent to HTTP proxy 12518 - @yangchiu @derekbit
- [BACKPORT][v1.10.2][BUG] Healthy replica could be deleted unexpectedly after reducing volume's number of replicas 12512 - @yangchiu @shuo-wu
- [BACKPORT][v1.10.2][BUG] Data locality enabled volume fails to remove an existing running replica after numberOfReplicas reduced 12509 - @derekbit @chriscchien
- [BACKPORT][v1.10.2][BUG] System backup may fail to be created or deleted 12479 - @yangchiu @mantissahz
- [BACKPORT][v1.10.2][BUG] Some default settings in questions.yaml are placed incorrectly. 12222 - @derekbit @roger-ryao
- [BACKPORT][v1.10.2][BUG] Auto balance feature may lead to volumes falling into a replica deletion-recreation loop 12482 - @shuo-wu @roger-ryao
- [BACKPORT][v1.10.2][BUG] Single replica volume could get stuck in attaching/detaching loop after the replica node rebooted 12494 - @COLDTURNIP @yangchiu
- [BACKPORT][v1.10.2][BUG] Potential Instance Manager Client Context Leak 12200 - @derekbit @chriscchien
- [BACKPORT][v1.10.2][BUG] SnapshotBack proxy request might be sent to incorrect instance-manager pod 12476 - @derekbit @chriscchien
- [BACKPORT][v1.10.2][BUG] unknown OS condition in node CR is not properly removed during upgrade 12451 - @COLDTURNIP @roger-ryao
- [BACKPORT][v1.10.2][BUG] RWX volume becomes unavailable after drain node 12231 - @yangchiu @mantissahz
- [BACKPORT][v1.10.2][BUG] mounting error is not properly handled during CSI node publish volume 12382 - @COLDTURNIP @yangchiu
- [BACKPORT][v1.10.2][BUG] Encrypted Volume Cannot Be Expanded Online 12368 - @yangchiu @mantissahz
- [BACKPORT][v1.10.2][BUG] The auto generated backing image pod name is complained by kubelet 12357 - @COLDTURNIP @yangchiu
- [BACKPORT][v1.10.2][BUG] tests.test_cloning.test_cloning_basic fails at master-head 12342 - @c3y1huang
- [BACKPORT][v1.10.2][Bug] A cloned volume cannot be attached to a workload 12208 - @yangchiu @PhanLe1010
- [BACKPORT][v1.10.2][BUG] Block Mode Volume Migration Stuck 12312 - @COLDTURNIP @yangchiu @shuo-wu
- [BACKPORT][v1.10.2][BUG] Replica auto balance disk pressure threshold stalled with stopped volumes 12334 - @c3y1huang @chriscchien
- [BACKPORT][v1.10.2][BUG] short name mode is enforcing, but image name longhornio/longhorn-manager:v1.10.0 returns ambiguous list 12270 - @yangchiu
- [BACKPORT][v1.10.2][BUG] Replicas accumulate during engine upgrade 12115 - @c3y1huang @chriscchien
- [BACKPORT][v1.10.2][BUG] Potential BackingImageManagerClient Connection and Context Leak 12195 - @derekbit @chriscchien
- [BACKPORT][v1.10.2][BUG] Longhorn ignores Replica Node Level Soft Anti-Affinity when auto balance is set to best-effort 12251 - @c3y1huang @chriscchien
- [BACKPORT][v1.10.2][BUG] invalid memory address or nil pointer dereference (again) 12234 - @chriscchien @bachmanity1
- [BACKPORT][v1.10.2][BUG] Request Header Or Cookie Too Large in Web UI with OIDC auth 12213 - @chriscchien @houhoucoop
Longhorn v1.11.0-rc3
DON'T UPGRADE from/to any RC/Preview/Sprint releases because the operation is not supported.
Resolved Issues in this release
Highlight
- [FEATURE] Add support for ReadWriteOncePod access mode 9727 - @derekbit @shikanime @chriscchien @Copilot
- [FEATURE] Scale replica rebuilding speed from multiple healthy replicas 11331 - @derekbit @shuo-wu @roger-ryao @Copilot
- [FEATURE] Support StorageClass allowedTopologies for Longhorn volumes 12261 - @yangchiu @derekbit @hookak @Copilot
- [FEATURE] Support extra network interface (not only storage network) on the share manager pod 10269 - @yangchiu @c3y1huang
- [FEATURE] Monitor Node Disk Health 12016 - @c3y1huang @roger-ryao
- [FEATURE] Replica Auto Balance Across Nodes based on Node Disk Space Consumption 10512 - @davidcheng0922 @chriscchien
Feature
- [FEATURE] Guess Linux distro from the package manager 12153 - @yangchiu @derekbit @NamrathShetty @Copilot
- [FEATURE] Provide a helm chart setting to define the managerUrl 10583 - @lexfrei @yangchiu
- [FEATURE] Add metric for last backup of a volume 6049 - @c3y1huang @roger-ryao
- [FEATURE] Real-time volume performance monitoring 368 - @derekbit @hookak
- [UI][FEATURE] Monitor Node Disk Health 12263 - @houhoucoop @roger-ryao
- [FEATURE] custom annotation/label of UI's k8s service on value.yaml of helm chart 11754 - @yangchiu @lucasl0st
- [FEATURE] Make longhornctl load ublk_drv module when kernel version is 6 or newer 11803 - @chriscchien @bachmanity1
- [BUG] Inherit namespace for longhorn-share-manager in FastFailover mode 12244 - @yangchiu @semenas
- [FEATURE] Enable CSI pod anti-affinity preset update 12100 - @yangchiu @yulken
- [FEATURE] [Dependency] aws-sdk-go v1.55.7 is EOL as of 2025-07-31 — plan to migrate to v2? 12098 - @mantissahz @roger-ryao
- [FEATURE] Change volume operation menu button behaviour from hover to click. 11408 - @yangchiu @houhoucoop
- [FEATURE] "hard" podAntiAffinity for csi-attacher/csi-provisioner/csi-resizer/csi-snapshotter 11617 - @yangchiu @yulken
- [FEATURE] node storage scheduled metrics 11949 - @yangchiu @AoRuiAC
Improvement
- [IMPROVEMENT] Generalize the offline rebuilding setting for both data engines 12484 - @mantissahz @chriscchien
- [IMPROVEMENT] Introduce Concurrent Job Limit for Snapshot Operations 11635 - @yangchiu @derekbit @davidcheng0922 @Copilot
- [IMPROVEMENT] Improve disk error logging to retain errors from newDiskServiceClients() 12446 - @yangchiu @davidcheng0922
- [IMPROVEMENT] Propagate longhorn-manager's timezone to instance-manager and CSI pods 12448 - @hookak @roger-ryao
- [UI][FEATURE] Scale replica rebuilding speed from multiple healthy replicas 12461 - @houhoucoop @roger-ryao
- [IMPROVEMENT] Configure rolling update strategy for longhorn-manager and CSI deployments 12240 - @hookak @chriscchien
- [IMPROVEMENT] Improve log messages for rebuildNewReplica() in longhorn-manager 12426 - @derekbit @chriscchien
- [IMPROVEMENT] misleading message when instance manager tries to create the pod 11759 - @mantissahz @chriscchien
- [IMPROVEMENT] To improve the debugging process and UX, it would be nice that the error is recorded in the instancemanager.status.conditions. 6732 - @mantissahz @chriscchien
- [IMPROVEMENT] Add setting to disable node disk health monitoring 12300 - @derekbit @roger-ryao @Copilot
- [IMPROVEMENT] Avoid repeat engine restart when there are replica unavailable during migration 11397 - @yangchiu @shuo-wu
- [IMPROVEMENT] [Script] Minor script adjustments from PR #12177 12187 - @rauldsl @yangchiu
- [IMPROVEMENT] Check toolchain versions before generate k8s codes 12164 - @derekbit @roger-ryao
- [IMPROVEMENT] Create Volume UI improvement, Automatically Filter Data Source Based on v1 or v2 Selection 11846 - @yangchiu @houhoucoop
- [IMPROVEMENT] Disable the snapshot of v1 volume hashing while it is being deleted 10294 - @davidcheng0922 @chriscchien
- [IMPROVEMENT] Expose SPDK UBLK Parameters 11039 - @derekbit @PhanLe1010 @roger-ryao @Copilot
- [IMPROVEMENT] Check that block device is not in use before creating disk 12078 - @chriscchien @bachmanity1
- [UI][IMPROVEMENT] Awareness of when an offline replica rebuilding is triggered for an individual volume 11247 - @houhoucoop @roger-ryao
- [IMPROVEMENT] Ensure synchronized upgrades between longhorn-manager and instance-manager 12309 - @hookak @chriscchien
- [IMPROVEMENT] Add Resource Limits Configuration for Longhorn manager/instance-manager 12225 - @hookak @chriscchien
- [IMPROVEMENT] Add Validation Webhook to Volume Expansion When Node Disk Is Full 12134 - @yangchiu @davidcheng0922
- [UI][IMPROVEMENT] Expose SPDK UBLK Parameters 12166 - @houhoucoop @roger-ryao
- [IMPROVEMENT] Fix V2 Volume CSI Clone Slowness Caused by VolumeAttachment Webhook Blocking 12328 - @PhanLe1010 @roger-ryao
- [IMPROVEMENT] Use label-based state in metrics instead of numeric values 10723 - @hookak @roger-ryao
- [IMPROVEMENT] Add Resource Limits Configuration for CSI Components 12224 - @yangchiu @hookak @Copilot
- [IMPROVEMENT] Awareness of when an offline replica rebuilding is triggered for an individual volume 11246 - @yangchiu @mantissahz
- [IMPROVEMENT] Add loadBalancerClass value inside a helm chart for ui service 12273 - @ehpc @chriscchien
- [IMPROVEMENT] Add DNS round-robin load balancing to the pool of S3 addresses 12296 - @yangchiu
- [UI][IMPROVEMENT] Should Not Hide the Deleted Snapshots on UI 11620 - @yangchiu @houhoucoop
- [IMPROVEMENT] Helm chart Multiple TLS FQDNs 12127 - @yangchiu @hrabalvojta
- [IMPROVEMENT] Removing executables from mirrored-longhornio-longhorn-engine image 11254 - @derekbit @chriscchien
- [IMPROVEMENT] [DOC] Clarify replica auto-balance behavior for unhealthy and detached volumes 12002 - @roger-ryao @sushant-suse
- [IMPROVEMENT] CRD enum values 9718 - @roger-ryao @nzhan126
- [DOC] Troubleshooting KB Articles Fix Typos 12199 - @jmeza-xyz
- [IMPROVEMENT] Remove backupstore related settings 11026 - @nzhan126
- [IMPROVEMENT] Reject Trim Operation on Block Volume 12048 - @yangchiu @derekbit
- [IMPROVEMENT] Replace github.com/pkg/errors with github.com/cockroachdb/errors 11413 - @derekbit @chriscchien
- [UI][IMPROVEMENT] UI shows the backing image virtual size 11674 - @chriscchien @houhoucoop
- [IMPROVEMENT] Simplify locking in unsub and stream methods 12057 - @derekbit @NamrathShetty
- [UI][IMPROVEMENT] Show Error Message for Unschedulable Disks 11449 - @yangchiu @houhoucoop
- [IMPROVEMENT] The auto-delete-pod-when-volume-detached-unexpectedly should only focus on the Kubernetes builtin workload. 12120 - @derekbit @chriscchien @sushant-suse
- [IMPROVEMENT] CSIStorageCapacity objects must show schedulable (allocatable) capacity 12014 ...
Longhorn v1.10.2-rc1
DON'T UPGRADE from/to any RC/Preview/Sprint releases because the operation is not supported.
Resolved Issues in this release
Feature
- [BACKPORT][v1.10.2][FEATURE] Inherit namespace for longhorn-share-manager in FastFailover mode 12245 - @yangchiu
- [BACKPORT][v1.10.2][FEATURE] [Dependency] aws-sdk-go v1.55.7 is EOL as of 2025-07-31 — plan to migrate to v2? 12181 - @mantissahz @roger-ryao
Improvement
- [BACKPORT][v1.10.2][IMPROVEMENT] Fix V2 Volume CSI Clone Slowness Caused by VolumeAttachment Webhook Blocking 12329 - @PhanLe1010 @roger-ryao
Bug
- [BACKPORT] Replica rebuild, clone and restore fail, traffic being sent to HTTP proxy 12518 - @yangchiu @derekbit
- [BACKPORT][v1.10.2][BUG] Healthy replica could be deleted unexpectedly after reducing volume's number of replicas 12512 - @yangchiu @shuo-wu
- [BACKPORT][v1.10.2][BUG] Data locality enabled volume fails to remove an existing running replica after numberOfReplicas reduced 12509 - @derekbit @chriscchien
- [BACKPORT][v1.10.2][BUG] System backup may fail to be created or deleted 12479 - @yangchiu @mantissahz
- [BACKPORT][v1.10.2][BUG] Some default settings in questions.yaml are placed incorrectly. 12222 - @derekbit @roger-ryao
- [BACKPORT][v1.10.2][BUG] Auto balance feature may lead to volumes falling into a replica deletion-recreation loop 12482 - @shuo-wu @roger-ryao
- [BACKPORT][v1.10.2][BUG] Single replica volume could get stuck in attaching/detaching loop after the replica node rebooted 12494 - @COLDTURNIP @yangchiu
- [BACKPORT][v1.10.2][BUG] Potential Instance Manager Client Context Leak 12200 - @derekbit @chriscchien
- [BACKPORT][v1.10.2][BUG] SnapshotBack proxy request might be sent to incorrect instance-manager pod 12476 - @derekbit @chriscchien
- [BACKPORT][v1.10.2][BUG] unknown OS condition in node CR is not properly removed during upgrade 12451 - @COLDTURNIP @roger-ryao
- [BACKPORT][v1.10.2][BUG] RWX volume becomes unavailable after drain node 12231 - @yangchiu @mantissahz
- [BACKPORT][v1.10.2][BUG] Should not rebuild the v2 volume if this volume never be used. 12239 - @mantissahz
- [BACKPORT][v1.10.2][BUG] Test case test_recurring_jobs_when_volume_detached_unexpectedly failed: backup completed but progress did not reach 100% 12156 - @yangchiu @mantissahz
- [BACKPORT][v1.10.2][BUG] mounting error is not properly handled during CSI node publish volume 12382 - @COLDTURNIP @yangchiu
- [BACKPORT][v1.10.2][BUG] Encrypted Volume Cannot Be Expanded Online 12368 - @yangchiu @mantissahz
- [BACKPORT][v1.10.2][BUG] The auto generated backing image pod name is complained by kubelet 12357 - @COLDTURNIP @yangchiu
- [BACKPORT][v1.10.2][BUG] tests.test_cloning.test_cloning_basic fails at master-head 12342 - @c3y1huang
- [BACKPORT][v1.10.2][Bug] A cloned volume cannot be attached to a workload 12208 - @yangchiu @PhanLe1010
- [BACKPORT][v1.10.2][BUG] Block Mode Volume Migration Stuck 12312 - @COLDTURNIP @yangchiu @shuo-wu
- [BACKPORT][v1.10.2][BUG] Replica auto balance disk pressure threshold stalled with stopped volumes 12334 - @c3y1huang @chriscchien
- [BACKPORT][v1.10.2][BUG] short name mode is enforcing, but image name longhornio/longhorn-manager:v1.10.0 returns ambiguous list 12270 - @yangchiu
- [BACKPORT][v1.10.2][BUG] Replicas accumulate during engine upgrade 12115 - @c3y1huang @chriscchien
- [BACKPORT][v1.10.2][BUG] Potential BackingImageManagerClient Connection and Context Leak 12195 - @derekbit @chriscchien
- [BACKPORT][v1.10.2][BUG] Longhorn ignores Replica Node Level Soft Anti-Affinity when auto balance is set to best-effort 12251 - @c3y1huang @chriscchien
- [BACKPORT][v1.10.2][BUG] invalid memory address or nil pointer dereference (again) 12234 - @chriscchien @bachmanity1
- [BACKPORT][v1.10.2][BUG] Request Header Or Cookie Too Large in Web UI with OIDC auth 12213 - @chriscchien @houhoucoop
Longhorn v1.11.0-rc2
DON'T UPGRADE from/to any RC/Preview/Sprint releases because the operation is not supported.
Resolved Issues in this release
Highlight
- [FEATURE] Scale replica rebuilding speed from multiple healthy replicas 11331 - @derekbit @shuo-wu @roger-ryao @Copilot
- [FEATURE] Support StorageClass allowedTopologies for Longhorn volumes 12261 - @yangchiu @derekbit @hookak @Copilot
- [FEATURE] Add support for ReadWriteOncePod access mode 9727 - @shikanime @chriscchien
- [FEATURE] Support extra network interface (not only storage network) on the share manager pod 10269 - @yangchiu @c3y1huang
- [FEATURE] Monitor Node Disk Health 12016 - @c3y1huang @roger-ryao
- [FEATURE] Replica Auto Balance Across Nodes based on Node Disk Space Consumption 10512 - @davidcheng0922 @chriscchien
Feature
- [FEATURE] Guess Linux distro from the package manager 12153 - @yangchiu @NamrathShetty
- [FEATURE] Provide a helm chart setting to define the managerUrl 10583 - @lexfrei @yangchiu
- [FEATURE] Add metric for last backup of a volume 6049 - @c3y1huang @roger-ryao
- [FEATURE] Real-time volume performance monitoring 368 - @derekbit @hookak
- [UI][FEATURE] Monitor Node Disk Health 12263 - @houhoucoop @roger-ryao
- [FEATURE] custom annotation/label of UI's k8s service on value.yaml of helm chart 11754 - @yangchiu @lucasl0st
- [FEATURE] Make longhornctl load ublk_drv module when kernel version is 6 or newer 11803 - @chriscchien @bachmanity1
- [BUG] Inherit namespace for longhorn-share-manager in FastFailover mode 12244 - @yangchiu @semenas
- [FEATURE] Enable CSI pod anti-affinity preset update 12100 - @yangchiu @yulken
- [FEATURE] [Dependency] aws-sdk-go v1.55.7 is EOL as of 2025-07-31 — plan to migrate to v2? 12098 - @mantissahz @roger-ryao
- [FEATURE] Change volume operation menu button behaviour from hover to click. 11408 - @yangchiu @houhoucoop
- [FEATURE] "hard" podAntiAffinity for csi-attacher/csi-provisioner/csi-resizer/csi-snapshotter 11617 - @yangchiu @yulken
- [FEATURE] node storage scheduled metrics 11949 - @yangchiu @AoRuiAC
Improvement
- [IMPROVEMENT] generalize the offline rebuilding setting for both data engines. 12484 - @mantissahz
- [IMPROVEMENT] Improve disk error logging to retain errors from newDiskServiceClients() 12446 - @yangchiu @davidcheng0922
- [IMPROVEMENT] Propagate longhorn-manager's timezone to instance-manager and CSI pods 12448 - @hookak @roger-ryao
- [UI][FEATURE] Scale replica rebuilding speed from multiple healthy replicas 12461 - @houhoucoop @roger-ryao
- [IMPROVEMENT] Configure rolling update strategy for longhorn-manager and CSI deployments 12240 - @hookak @chriscchien
- [IMPROVEMENT] Improve log messages for rebuildNewReplica() in longhorn-manager 12426 - @derekbit @chriscchien
- [IMPROVEMENT] misleading message when instance manager tries to create the pod 11759 - @mantissahz @chriscchien
- [IMPROVEMENT] To improve the debugging process and UX, it would be nice that the error is recorded in the instancemanager.status.conditions. 6732 - @mantissahz @chriscchien
- [IMPROVEMENT] Add setting to disable node disk health monitoring 12300 - @derekbit @roger-ryao @Copilot
- [IMPROVEMENT] Avoid repeat engine restart when there are replica unavailable during migration 11397 - @yangchiu @shuo-wu
- [IMPROVEMENT] [Script] Minor script adjustments from PR #12177 12187 - @rauldsl @yangchiu
- [IMPROVEMENT] Check toolchain versions before generate k8s codes 12164 - @derekbit @roger-ryao
- [IMPROVEMENT] Create Volume UI improvement, Automatically Filter Data Source Based on v1 or v2 Selection 11846 - @yangchiu @houhoucoop
- [IMPROVEMENT] Disable the snapshot of v1 volume hashing while it is being deleted 10294 - @davidcheng0922 @chriscchien
- [IMPROVEMENT] Expose SPDK UBLK Parameters 11039 - @PhanLe1010 @roger-ryao
- [IMPROVEMENT] Check that block device is not in use before creating disk 12078 - @chriscchien @bachmanity1
- [UI][IMPROVEMENT] Awareness of when an offline replica rebuilding is triggered for an individual volume 11247 - @houhoucoop @roger-ryao
- [IMPROVEMENT] Ensure synchronized upgrades between longhorn-manager and instance-manager 12309 - @hookak @chriscchien
- [IMPROVEMENT] Add Resource Limits Configuration for Longhorn manager/instance-manager 12225 - @hookak @chriscchien
- [IMPROVEMENT] Add Validation Webhook to Volume Expansion When Node Disk Is Full 12134 - @yangchiu @davidcheng0922
- [UI][IMPROVEMENT] Expose SPDK UBLK Parameters 12166 - @houhoucoop @roger-ryao
- [IMPROVEMENT] Fix V2 Volume CSI Clone Slowness Caused by VolumeAttachment Webhook Blocking 12328 - @PhanLe1010 @roger-ryao
- [IMPROVEMENT] Use label-based state in metrics instead of numeric values 10723 - @hookak @roger-ryao
- [IMPROVEMENT] Add Resource Limits Configuration for CSI Components 12224 - @yangchiu @hookak @Copilot
- [IMPROVEMENT] Awareness of when an offline replica rebuilding is triggered for an individual volume 11246 - @yangchiu @mantissahz
- [IMPROVEMENT] Introduce Concurrent Job Limit for Snapshot Operations 11635 - @yangchiu @davidcheng0922
- [IMPROVEMENT] Add loadBalancerClass value inside a helm chart for ui service 12273 - @ehpc @chriscchien
- [IMPROVEMENT] Add DNS round-robin load balancing to the pool of S3 addresses 12296 - @yangchiu
- [UI][IMPROVEMENT] Should Not Hide the Deleted Snapshots on UI 11620 - @yangchiu @houhoucoop
- [IMPROVEMENT] Helm chart Multiple TLS FQDNs 12127 - @yangchiu @hrabalvojta
- [IMPROVEMENT] Removing executables from mirrored-longhornio-longhorn-engine image 11254 - @derekbit @chriscchien
- [IMPROVEMENT] [DOC] Clarify replica auto-balance behavior for unhealthy and detached volumes 12002 - @roger-ryao @sushant-suse
- [IMPROVEMENT] CRD enum values 9718 - @roger-ryao @nzhan126
- [DOC] Troubleshooting KB Articles Fix Typos 12199 - @jmeza-xyz
- [IMPROVEMENT] Remove backupstore related settings 11026 - @nzhan126
- [IMPROVEMENT] Reject Trim Operation on Block Volume 12048 - @yangchiu @derekbit
- [IMPROVEMENT] Replace github.com/pkg/errors with github.com/cockroachdb/errors 11413 - @derekbit @chriscchien
- [UI][IMPROVEMENT] UI shows the backing image virtual size 11674 - @chriscchien @houhoucoop
- [IMPROVEMENT] Simplify locking in unsub and stream methods 12057 - @derekbit @NamrathShetty
- [UI][IMPROVEMENT] Show Error Message for Unschedulable Disks 11449 - @yangchiu @houhoucoop
- [IMPROVEMENT] The auto-delete-pod-when-volume-detached-unexpectedly should only focus on the Kubernetes builtin workload. 12120 - @derekbit @chriscchien @sushant-suse
- [IMPROVEMENT] CSIStorageCapacity objects must show schedulable (allocatable) capacity 12014 - @chriscchien @bachmanity1
- [IMPROVEMENT] improve error logging for failed mounting d...
Longhorn v1.10.1
Longhorn v1.10.1 Release Notes
Longhorn 1.10.1 introduces several improvements and bug fixes that are intended to improve system quality, resilience, stability and security.
We welcome feedback and contributions to help continuously improve Longhorn.
For terminology and context on Longhorn releases, see Releases.
Warning
HotFix
The longhorn-manager:v1.10.1 image is affected by the following issues:
- Regression:
  - [BUG] invalid memory address or nil pointer dereference: can trigger a nil-pointer dereference under certain conditions, potentially causing unexpected crashes.
  - [BUG] Block Mode Volume Migration Stuck: can cause block mode volume migration to get stuck indefinitely.
- Day-one issues:
  - V2 volume clone:
    - [BUG] V2 Volume CSI Clone Slowness Caused by VolumeAttachment Webhook: can lead to significant delays during V2 volume cloning operations.
    - [BUG] A cloned volume cannot be attached to a workload: prevents cloned volumes from being attached to workloads.
  - Replica auto-balance:
    - [BUG] Replica auto balance disk pressure threshold stalled with stopped volumes: can cause the replica auto-balance feature to stall when stopped volumes are present under disk pressure conditions.
To mitigate the issues, replace longhorn-manager:v1.10.1 with the hotfixed image longhorn-manager:v1.10.1-hotfix-2.
Follow these steps to apply the update:
- Disable the upgrade version check
  - Helm users: Set upgradeVersionCheck to false in the values.yaml file.
  - Manifest users: Remove the --upgrade-version-check flag from the deployment manifest.
- Update the longhorn-manager image
  - Change the image tag from v1.10.1 to v1.10.1-hotfix-2 in the appropriate file:
    - For Helm: Update values.yaml
    - For manifests: Update the deployment manifest directly.
- Proceed with the upgrade
  - Apply the changes using your standard Helm upgrade command or reapply the updated manifest.
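For Helm users, the steps above can be captured in a small values override file. This is only a sketch: the keys image.longhorn.manager.tag and preUpgradeChecker.upgradeVersionCheck are assumed to match the values.yaml of your Longhorn chart, and the file name hotfix-values.yaml is hypothetical. Check both against your chart before relying on them.

```shell
#!/bin/sh
# Write a Helm values override for the longhorn-manager hotfix.
# Assumption: the chart exposes image.longhorn.manager.tag and
# preUpgradeChecker.upgradeVersionCheck; verify against your values.yaml.
write_hotfix_values() {
  # $1: output file (hypothetical name), $2: hotfixed image tag
  cat > "$1" <<EOF
image:
  longhorn:
    manager:
      tag: $2
preUpgradeChecker:
  upgradeVersionCheck: false
EOF
}

write_hotfix_values hotfix-values.yaml v1.10.1-hotfix-2
# Then pass the file to your usual upgrade command, for example
# (release name and chart reference are assumptions):
#   helm upgrade longhorn longhorn/longhorn -n longhorn-system -f hotfix-values.yaml
```

Keeping the override in its own file leaves the main values.yaml untouched, so the hotfix can be dropped once a fixed release is available.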
Upgrade
If your Longhorn cluster was initially deployed with a version earlier than v1.3.0, the Custom Resources (CRs) were created using the v1beta1 APIs. While the upgrade from Longhorn v1.8 to v1.9 automatically migrates all CRs to the new v1beta2 version, a manual CR migration is strongly advised before upgrading from Longhorn v1.9 to v1.10.
Certain operations, such as an etcd or CRD restore, may leave behind v1beta1 data. Manually migrating your CRs ensures that all Longhorn data is properly updated to the v1beta2 API, preventing potential compatibility issues and unexpected behavior with the new Longhorn version.
Following the manual migration, verify that v1beta1 has been removed from the CRD stored versions to ensure completion and a successful upgrade.
For more details, see Kubernetes official document for CRD storage version, and Issue #11886.
Migration Requirement Before Longhorn v1.10 Upgrade
Before upgrading from Longhorn v1.9 to v1.10, perform the following manual CRD storage version migration.
Note: If your Longhorn installation uses a namespace other than longhorn-system, replace longhorn-system with your custom namespace throughout the commands.
# Temporarily disable the CR validation webhook to allow updating read-only settings CRs.
kubectl patch validatingwebhookconfiguration longhorn-webhook-validator \
--type=merge \
-p "$(kubectl get validatingwebhookconfiguration longhorn-webhook-validator -o json | \
jq '.webhooks[0].rules |= map(if .apiGroups == ["longhorn.io"] and .resources == ["settings"] then
.operations |= map(select(. != "UPDATE")) else . end)')"
# Migrate CRDs that ever stored v1beta1 resources
migration_time="$(date +%Y-%m-%dT%H:%M:%S)"
crds=($(kubectl get crd -l app.kubernetes.io/name=longhorn -o json | jq -r '.items[] | select(.status.storedVersions | index("v1beta1")) | .metadata.name'))
for crd in "${crds[@]}"; do
  echo "Migrating ${crd} ..."
  for name in $(kubectl -n longhorn-system get "$crd" -o jsonpath='{.items[*].metadata.name}'); do
    # Attach additional annotations to trigger v1beta1 resource updating in the latest storage version.
    kubectl patch "${crd}" "${name}" -n longhorn-system --type=merge -p='{"metadata":{"annotations":{"migration-time":"'"${migration_time}"'"}}}'
  done
  # Clean up the stored version in CRD status
  kubectl patch crd "${crd}" --type=merge -p '{"status":{"storedVersions":["v1beta2"]}}' --subresource=status
done
# Re-enable the CR validation webhook.
kubectl patch validatingwebhookconfiguration longhorn-webhook-validator \
--type=merge \
-p "$(kubectl get validatingwebhookconfiguration longhorn-webhook-validator -o json | \
jq '.webhooks[0].rules |= map(if .apiGroups == ["longhorn.io"] and .resources == ["settings"] then
.operations |= (. + ["UPDATE"] | unique) else . end)')"
Migration Verification
After running the script, verify the CRD stored versions using this command:
kubectl get crd -l app.kubernetes.io/name=longhorn -o=jsonpath='{range .items[*]}{.metadata.name}{": "}{.status.storedVersions}{"\n"}{end}'
Crucially, all Longhorn CRDs MUST have only "v1beta2" listed in storedVersions (i.e., "v1beta1" must be completely absent) before proceeding to the v1.10 upgrade.
Example of successful output:
backingimagedatasources.longhorn.io: ["v1beta2"]
backingimagemanagers.longhorn.io: ["v1beta2"]
backingimages.longhorn.io: ["v1beta2"]
backupbackingimages.longhorn.io: ["v1beta2"]
backups.longhorn.io: ["v1beta2"]
backuptargets.longhorn.io: ["v1beta2"]
backupvolumes.longhorn.io: ["v1beta2"]
engineimages.longhorn.io: ["v1beta2"]
engines.longhorn.io: ["v1beta2"]
instancemanagers.longhorn.io: ["v1beta2"]
nodes.longhorn.io: ["v1beta2"]
orphans.longhorn.io: ["v1beta2"]
recurringjobs.longhorn.io: ["v1beta2"]
replicas.longhorn.io: ["v1beta2"]
settings.longhorn.io: ["v1beta2"]
sharemanagers.longhorn.io: ["v1beta2"]
snapshots.longhorn.io: ["v1beta2"]
supportbundles.longhorn.io: ["v1beta2"]
systembackups.longhorn.io: ["v1beta2"]
systemrestores.longhorn.io: ["v1beta2"]
volumeattachments.longhorn.io: ["v1beta2"]
volumes.longhorn.io: ["v1beta2"]
With these steps completed, the Longhorn upgrade to v1.10 should now proceed without issues.
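The verification can also be automated instead of read by eye. The helper below is a minimal sketch, not part of Longhorn: check_stored_versions is a hypothetical name, and it assumes the input lines have the `<crd-name>: ["v1beta2"]` shape shown in the example output.

```shell
#!/bin/sh
# Read "<crd-name>: <storedVersions>" lines on stdin and return non-zero
# if any CRD lists something other than exactly ["v1beta2"].
check_stored_versions() {
  bad=0
  # The "|| [ -n ... ]" part keeps a final line that lacks a trailing newline.
  while IFS= read -r line || [ -n "$line" ]; do
    [ -z "$line" ] && continue
    case "$line" in
      *': ["v1beta2"]') ;;                          # fully migrated
      *) echo "not migrated: $line" >&2; bad=1 ;;   # v1beta1 or unexpected value
    esac
  done
  return "$bad"
}

# Example usage (requires cluster access):
#   kubectl get crd -l app.kubernetes.io/name=longhorn \
#     -o=jsonpath='{range .items[*]}{.metadata.name}{": "}{.status.storedVersions}{"\n"}{end}' \
#     | check_stored_versions && echo "safe to upgrade"
```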
Troubleshooting CRD Upgrade Failures During Upgrade to Longhorn v1.10
If you did not apply the required pre-upgrade migration steps and the CRs are not fully migrated to v1beta2, the longhorn-manager Pods may fail to operate correctly. A common error message for this issue is:
Upgrade failed: cannot patch "backingimagedatasources.longhorn.io" with kind CustomResourceDefinition: CustomResourceDefinition.apiextensions.k8s.io "backingimagedatasources.longhorn.io" is invalid: status.storedVersions[0]: Invalid value: "v1beta1": missing from spec.versions; v1beta1 was previously a storage version, and must remain in spec.versions until a storage migration ensures no data remains persisted in v1beta1 and removes v1beta1 from status.storedVersions
To fix this issue, you must perform a forced downgrade back to the exact Longhorn v1.9.x version that was running before the failed upgrade attempt.
Downgrade Procedure (kubectl Installation)
If Longhorn was installed using kubectl, you must patch the current-longhorn-version setting before downgrading. Replace v1.9.x with the original version before upgrade in the following commands.
# Attaching annotation to allow patching current-longhorn-version.
kubectl patch settings.longhorn.io current-longhorn-version -n longhorn-system --type=merge -p='{"metadata":{"annotations":{"longhorn.io/update-setting-from-longhorn":""}}}'
# Temporarily override current version to allow old version installation
# Replace the value "v1.9.x" with the original version before the upgrade.
kubectl patch settings.longhorn.io current-longhorn-version -n longhorn-system --type=merge -p='{"value":"v1.9.x"}'
After modifying current-longhorn-version, you can proceed to downgrade to the original Longhorn v1.9.x deployment.
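Because the commands above contain the literal placeholder v1.9.x, it is easy to apply them without substituting a real version. The following sketch guards against that. The function name version_patch_body is hypothetical, and the accepted pattern (v1.9 followed by a numeric patch level) is an assumption about how your original version is written.

```shell
#!/bin/sh
# Print the merge-patch body for current-longhorn-version, refusing the
# placeholder "v1.9.x" and anything that is not of the form v1.9.<number>.
version_patch_body() {
  if printf '%s\n' "$1" | grep -Eq '^v1\.9\.[0-9]+$'; then
    printf '{"value":"%s"}\n' "$1"
  else
    echo "invalid version: $1" >&2
    return 1
  fi
}

# Example usage (requires cluster access):
#   body="$(version_patch_body v1.9.2)" && \
#     kubectl patch settings.longhorn.io current-longhorn-version \
#       -n longhorn-system --type=merge -p="$body"
```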
Downgrade Procedure (Helm Installation)
If Longhorn was installed using Helm, the downgrade is allowed by disabling the preUpgradeChecker.upgradeVersionCheck flag.
Post-Downgrade
Once the downgrade is complete and the Longhorn system is stable on the v1.9.x version, you must immediately follow the steps outlined in the Migration Requirement Before Longhorn v1.10 Upgrade. This step is crucial to migrate all remaining v1beta1 CRs to v1beta2 before attempting the Longhorn v1.10 upgrade again.
Important Fixes
This release includes several critical stability and performance improvements:
Goroutine Leak in Instance Manager (V2 Data Engine)
Fixed a goroutine leak in the instance manager when using the V2 data engine. This issue could lead to increased memory usage and potential stability problems over time.
For ...
Longhorn v1.10.1-rc1
DON'T UPGRADE from/to any RC/Preview/Sprint releases because the operation is not supported.
Resolved Issues in this release
Feature
Improvement
- [BACKPORT][v1.10.1][IMPROVEMENT] CSIStorageCapacity objects must show schedulable (allocatable) capacity 12036 - @chriscchien @bachmanity1
- [BACKPORT][v1.10.1][IMPROVEMENT] improve error logging for failed mounting during node publish volume 12033 - @COLDTURNIP @roger-ryao
- [BACKPORT][v1.10.1][IMPROVEMENT] Improve Helm Chart defaultSettings handling with automatic quoting and multi-type support 12020 - @derekbit @chriscchien
- [BACKPORT][v1.10.1][IMPROVEMENT] Avoid repeat engine restart when there are replica unavailable during migration 11945 - @yangchiu @shuo-wu
- [BACKPORT][v1.10.1][IMPROVEMENT] Adjust maximum of GuaranteedInstanceManagerCPU to a big value 11968 - @mantissahz
- [BACKPORT][v1.10.1][IMPROVEMENT] Add usage metrics for Longhorn installation variant 11795 - @derekbit
Bug
- [BACKPORT][v1.10.1][BUG] Unable to complete uninstallation due to the remaining backuptarget 11964 - @mantissahz @roger-ryao
- [BACKPORT][v1.10.1][BUG] share-manager excessive memory usage 12043 - @derekbit @chriscchien
- [BACKPORT][v1.10.1][BUG] NVME disk not found in v2 data engine (failed to find device for BDF) 12029 - @derekbit @roger-ryao
- [BACKPORT][v1.10.1][BUG] NPE error during recurring job execution 11926 - @yangchiu @shuo-wu
- [BACKPORT][v1.10.1][BUG] v2 volume creation failed on talos nodes 12026 - @c3y1huang @chriscchien
- [BACKPORT][v1.10.1][BUG] mounting error is not properly handled during CSI node publish volume 12008 - @COLDTURNIP
- [BACKPORT][v1.10.1][BUG] Adding multiple disks to the same node concurrently may occasionally fail 12018 - @davidcheng0922 @roger-ryao
- [BUG] upgrading from 1.9.1 to 1.10.0 fails due to old resources still being in v1beta1 11886 - @COLDTURNIP @roger-ryao
- [BACKPORT][v1.10.1][BUG] DR volume gets stuck in unknown state if engine image is deleted from the attached node 11998 - @yangchiu @shuo-wu
- [BACKPORT][v1.10.1][BUG] Volume gets stuck in attaching state if engine image is not deployed on one of nodes 11996 - @yangchiu @shuo-wu
- [BACKPORT][v1.10.1][BUG] Unable to re-add block-type disks by BDF after re-enable v2 data engine 12000 - @yangchiu @davidcheng0922
- [BACKPORT][v1.10.1][BUG] test_system_backup_and_restore test case failed on master-head 12005 - @derekbit @chriscchien
- [BACKPORT][v1.10.1][BUG] Fix SPDK v25.05 CVE issue 11970 - @derekbit @roger-ryao
- [BACKPORT][v1.10.1][BUG] V2 volume stuck in volume attachment (V2 interrupt mode) 11976 - @c3y1huang @chriscchien
- [BACKPORT][v1.10.1][BUG] RWX volume causes process uninterruptible sleep 11958 - @COLDTURNIP @chriscchien
- [BACKPORT][v1.10.1][BUG] longhorn-manager fails to start after upgrading from 1.9.2 to 1.10.0 11865 - @derekbit @roger-ryao
- [BACKPORT][v1.10.1][BUG] Block disk deletion fails without error message 11954 - @davidcheng0922 @roger-ryao
- [BACKPORT][v1.10.1][BUG] Goroutine leak in instance-manager when using v2 data engine 11962 - @PhanLe1010 @chriscchien
- [BACKPORT][v1.10.1][BUG] invalid memory address or nil pointer dereference 11942 - @bachmanity1 @roger-ryao
- [BACKPORT][v1.10.1][BUG] csi-provisioner silently fails to create CSIStorageCapacity if dataEngine parameter is missing 11918 - @yangchiu @bachmanity1
- [BACKPORT][v1.10.1][BUG] longhorn-engine's UI panics 11901 - @derekbit @chriscchien
- [BACKPORT][v1.10.1][BUG] Volume is unable to upgrade if the number of active replicas is larger than volume.spec.numberOfReplicas 11895 - @yangchiu @derekbit
- [BACKPORT][v1.10.1][BUG] UI fails to deploy when only IPv4 is enabled on nodes with v1.10.0 version 11875 - @yangchiu @c3y1huang
- [BACKPORT][v1.10.1][BUG] Unable to detach a v2 volume after labeling disable-v2-data-engine=true 11801 - @mantissahz
Misc
- [BACKPORT][v1.10.1][REFACTOR] SAST checks for UI component 11992 - @chriscchien
- [HOTFIX] Create hotfixed image for longhorn-manager:v1.10.0 11951 - @c3y1huang @roger-ryao
Longhorn v1.10.0
Longhorn v1.10.0 Release Notes
Longhorn v1.10.0 is a major release focused on improving stability, performance, and the overall user experience. This version introduces significant enhancements to our core features, including the V2 Data Engine, and streamlines configuration for easier management.
The key highlights include improvements to the V2 Data Engine, enhanced resilience, simplified configuration, and better observability.
We welcome feedback and contributions to help continuously improve Longhorn.
For terminology and context on Longhorn releases, see Releases.
Warning
HotFix
The longhorn-manager:v1.10.0 image is affected by a regression issue introduced by the new share-manager pod backoff logic. This bug may cause a nil pointer dereference panic in the longhorn-manager, leading to repeated crashes and failure to deploy new share-manager pods after an upgrade. To mitigate this issue, replace longhorn-manager:v1.10.0 with the hotfixed image longhorn-manager:v1.10.0-hotfix-1.
You can apply the update by following these steps:
- Disable the upgrade version check
  - Helm users: Set upgradeVersionCheck to false in the values.yaml file.
  - Manifest users: Remove the --upgrade-version-check flag from the deployment manifest.
- Update the longhorn-manager image
  - Change the image tag from v1.10.0 to v1.10.0-hotfix-1 in the appropriate file:
    - For Helm: Update values.yaml
    - For manifests: Update the deployment manifest directly.
- Proceed with the upgrade
  - Apply the changes using your standard Helm upgrade command or reapply the updated manifest.
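For manifest users, both edits can be applied with a stream filter rather than by hand. The sketch below assumes that the image reference ends with longhorn-manager:v1.10.0 at the end of its line and that the --upgrade-version-check flag sits on its own line in the container arguments. The function name apply_manager_hotfix and the file names in the usage comment are hypothetical.

```shell
#!/bin/sh
# Rewrite a Longhorn manifest read from stdin: pin longhorn-manager to the
# hotfixed tag and drop the upgrade version check flag. Output goes to stdout.
# Assumption: the flag occupies its own line in the container args.
apply_manager_hotfix() {
  sed -E \
    -e 's#(longhorn-manager:v1\.10\.0)("?[[:space:]]*)$#\1-hotfix-1\2#' \
    -e '/--upgrade-version-check/d'
}

# Example usage:
#   apply_manager_hotfix < longhorn.yaml > longhorn-hotfix.yaml
#   kubectl apply -f longhorn-hotfix.yaml
```

The tag pattern is anchored to the end of the line, so running the filter on an already patched manifest leaves it unchanged.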
Upgrade
If your Longhorn cluster was initially deployed with a version earlier than v1.3.0, the Custom Resources (CRs) were created using the v1beta1 APIs. While the upgrade from Longhorn v1.8 to v1.9 automatically migrates all CRs to the new v1beta2 version, a manual CR migration is strongly advised before upgrading from Longhorn v1.9 to v1.10.
Certain operations, such as an etcd or CRD restore, may leave behind v1beta1 data. Manually migrating your CRs ensures that all Longhorn data is properly updated to the v1beta2 API, preventing potential compatibility issues and unexpected behavior with the new Longhorn version.
Following the manual migration, verify that v1beta1 has been removed from the CRD stored versions to ensure completion and a successful upgrade.
For more details, see Kubernetes official document for CRD storage version, and Issue #11886.
Migration Requirement Before Longhorn v1.10 Upgrade
Before upgrading from Longhorn v1.9 to v1.10, perform the following manual CRD storage version migration.
Note: If your Longhorn installation uses a namespace other than longhorn-system, replace longhorn-system with your custom namespace throughout the commands.
# Temporarily disable the CR validation webhook to allow updating read-only settings CRs.
kubectl patch validatingwebhookconfiguration longhorn-webhook-validator \
--type=merge \
-p "$(kubectl get validatingwebhookconfiguration longhorn-webhook-validator -o json | \
jq '.webhooks[0].rules |= map(if .apiGroups == ["longhorn.io"] and .resources == ["settings"] then
.operations |= map(select(. != "UPDATE")) else . end)')"
# Migrate CRDs that ever stored v1beta1 resources
migration_time="$(date +%Y-%m-%dT%H:%M:%S)"
crds=($(kubectl get crd -l app.kubernetes.io/name=longhorn -o json | jq -r '.items[] | select(.status.storedVersions | index("v1beta1")) | .metadata.name'))
for crd in "${crds[@]}"; do
  echo "Migrating ${crd} ..."
  for name in $(kubectl -n longhorn-system get "$crd" -o jsonpath='{.items[*].metadata.name}'); do
    # Attach additional annotations to trigger v1beta1 resource updating in the latest storage version.
    kubectl patch "${crd}" "${name}" -n longhorn-system --type=merge -p='{"metadata":{"annotations":{"migration-time":"'"${migration_time}"'"}}}'
  done
  # Clean up the stored version in CRD status
  kubectl patch crd "${crd}" --type=merge -p '{"status":{"storedVersions":["v1beta2"]}}' --subresource=status
done
# Re-enable the CR validation webhook.
kubectl patch validatingwebhookconfiguration longhorn-webhook-validator \
--type=merge \
-p "$(kubectl get validatingwebhookconfiguration longhorn-webhook-validator -o json | \
jq '.webhooks[0].rules |= map(if .apiGroups == ["longhorn.io"] and .resources == ["settings"] then
.operations |= (. + ["UPDATE"] | unique) else . end)')"
Migration Verification
After running the script, verify the CRD stored versions using this command:
kubectl get crd -l app.kubernetes.io/name=longhorn -o=jsonpath='{range .items[*]}{.metadata.name}{": "}{.status.storedVersions}{"\n"}{end}'
Crucially, all Longhorn CRDs MUST have only "v1beta2" listed in storedVersions (i.e., "v1beta1" must be completely absent) before proceeding to the v1.10 upgrade.
Example of successful output:
backingimagedatasources.longhorn.io: ["v1beta2"]
backingimagemanagers.longhorn.io: ["v1beta2"]
backingimages.longhorn.io: ["v1beta2"]
backupbackingimages.longhorn.io: ["v1beta2"]
backups.longhorn.io: ["v1beta2"]
backuptargets.longhorn.io: ["v1beta2"]
backupvolumes.longhorn.io: ["v1beta2"]
engineimages.longhorn.io: ["v1beta2"]
engines.longhorn.io: ["v1beta2"]
instancemanagers.longhorn.io: ["v1beta2"]
nodes.longhorn.io: ["v1beta2"]
orphans.longhorn.io: ["v1beta2"]
recurringjobs.longhorn.io: ["v1beta2"]
replicas.longhorn.io: ["v1beta2"]
settings.longhorn.io: ["v1beta2"]
sharemanagers.longhorn.io: ["v1beta2"]
snapshots.longhorn.io: ["v1beta2"]
supportbundles.longhorn.io: ["v1beta2"]
systembackups.longhorn.io: ["v1beta2"]
systemrestores.longhorn.io: ["v1beta2"]
volumeattachments.longhorn.io: ["v1beta2"]
volumes.longhorn.io: ["v1beta2"]
With these steps completed, the Longhorn upgrade to v1.10 should now proceed without issues.
Troubleshooting CRD Upgrade Failures During Upgrade to Longhorn v1.10
If you did not apply the required pre-upgrade migration steps and the CRs are not fully migrated to v1beta2, the longhorn-manager Pods may fail to operate correctly. A common error message for this issue is:
Upgrade failed: cannot patch "backingimagedatasources.longhorn.io" with kind CustomResourceDefinition: CustomResourceDefinition.apiextensions.k8s.io "backingimagedatasources.longhorn.io" is invalid: status.storedVersions[0]: Invalid value: "v1beta1": missing from spec.versions; v1beta1 was previously a storage version, and must remain in spec.versions until a storage migration ensures no data remains persisted in v1beta1 and removes v1beta1 from status.storedVersions
To fix this issue, you must perform a forced downgrade back to the exact Longhorn v1.9.x version that was running before the failed upgrade attempt.
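Before starting a downgrade, it can help to confirm that a failed upgrade actually hit this error. The helper below is a rough sketch that searches saved upgrade output for the storedVersions message quoted above. The name has_stored_version_error and the log file name are hypothetical.

```shell
#!/bin/sh
# Return 0 if the text on stdin contains the storedVersions validation error,
# meaning v1beta1 is still recorded as a stored version for some CRD.
has_stored_version_error() {
  grep -Eq 'status\.storedVersions\[[0-9]+\]: Invalid value: "v1beta1"'
}

# Example usage, assuming the upgrade output was saved to upgrade.log:
#   has_stored_version_error < upgrade.log && echo "CRD migration was skipped"
```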
Downgrade Procedure (kubectl Installation)
If Longhorn was installed using kubectl, you must patch the current-longhorn-version setting before downgrading. Replace v1.9.x with the original version before upgrade in the following commands.
# Attaching annotation to allow patching current-longhorn-version.
kubectl patch settings.longhorn.io current-longhorn-version -n longhorn-system --type=merge -p='{"metadata":{"annotations":{"longhorn.io/update-setting-from-longhorn":""}}}'
# Temporarily override current version to allow old version installation
# Replace the value "v1.9.x" with the original version before the upgrade.
kubectl patch settings.longhorn.io current-longhorn-version -n longhorn-system --type=merge -p='{"value":"v1.9.x"}'
After modifying current-longhorn-version, you can proceed to downgrade to the original Longhorn v1.9.x deployment.
Downgrade Procedure (Helm Installation)
If Longhorn was installed using Helm, the downgrade is allowed by disabling the preUpgradeChecker.upgradeVersionCheck flag.
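As an illustration, a Helm downgrade with the check disabled might look like the command printed by the sketch below. The release name longhorn, the chart reference longhorn/longhorn, and the namespace longhorn-system are assumptions about your installation, and print_downgrade_cmd is a hypothetical helper. The function only prints the command so that it can be reviewed before it is run.

```shell
#!/bin/sh
# Print (do not run) a Helm command that reinstalls the original v1.9.x chart
# with the upgrade version check disabled.
# Assumptions: release "longhorn", chart "longhorn/longhorn",
# namespace "longhorn-system"; adjust all three to your installation.
print_downgrade_cmd() {
  # $1: original chart version, for example 1.9.2
  printf 'helm upgrade longhorn longhorn/longhorn --namespace longhorn-system --version %s --set preUpgradeChecker.upgradeVersionCheck=false\n' "$1"
}

# Example usage:
#   print_downgrade_cmd 1.9.2
```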
Post-Downgrade
Once the downgrade is complete and the Longhorn system is stable on the v1.9.x version, you must immediately follow the steps outlined in the Migration Requirement Before Longhorn v1.10 Upgrade. This step is crucial to migrate all remaining v1beta1 CRs to v1beta2 before attempting the Longhorn v1.10 upgrade again.
Removal
longhorn.io/v1beta1 API
The v1beta1 Longhorn API version has been removed.
See GitHub Issue #10249 for details.
replica.status.evictionRequested Field
The deprecated replica.status.evictionRequested field has been removed.
See GitHub Issue #7022 for details.
Primary Highlights
New V2 Data Engine Features
Interrupt Mode Support
Interrupt mode has been added to the V2 Data Engine to help reduce CPU usage. This feature is especially beneficial for clusters with idle or low I/O workloads, where conserving CPU resources is more important than minimizing latency.
While interrupt mode lowers CPU consumption, it may introduce slightly higher I/O latency compared to polling mode. In addition, the current implementation uses a hybrid approach, which still ...
Longhorn v1.9.2
Longhorn v1.9.2 Release Notes
Longhorn 1.9.2 introduces several improvements and bug fixes that are intended to improve system quality, resilience, stability and security.
The Longhorn team appreciates your contributions and welcomes feedback on this release.
Note
For more information about release-related terminology, see Releases.
Installation
Important
Ensure that your cluster is running Kubernetes v1.25 or later before installing Longhorn v1.9.2.
You can install Longhorn using a variety of tools, including Rancher, Kubectl, and Helm. For more information about installation methods and requirements, see Quick Installation in the Longhorn documentation.
Upgrade
Important
Ensure that your cluster is running Kubernetes v1.25 or later before upgrading from Longhorn v1.8.x or v1.9.x (< v1.9.2) to v1.9.2.
Longhorn only allows upgrades from supported versions. For more information about upgrade paths and procedures, see Upgrade in the Longhorn documentation.
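The Kubernetes version requirement can be checked in a script before installing or upgrading. The helper below is a minimal sketch that compares a version string such as v1.27.3 against the v1.25 minimum stated above. The function name k8s_version_ok is hypothetical, and how you obtain the server version is up to you.

```shell
#!/bin/sh
# Return 0 if a Kubernetes version string such as "v1.27.3" is v1.25 or later.
k8s_version_ok() {
  v="${1#v}"                 # strip a leading "v"
  major="${v%%.*}"           # text before the first dot
  rest="${v#*.}"             # text after the first dot
  minor="${rest%%[!0-9]*}"   # leading digits of the minor component
  [ -n "$major" ] && [ -n "$minor" ] || return 1
  [ "$major" -gt 1 ] || { [ "$major" -eq 1 ] && [ "$minor" -ge 25 ]; }
}

# Example usage (requires cluster access and jq):
#   k8s_version_ok "$(kubectl version -o json | jq -r .serverVersion.gitVersion)" \
#     || echo "Kubernetes v1.25 or later is required"
```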
Post-Release Known Issues
For information about issues identified after this release, see Release-Known-Issues.
Resolved Issues
Improvement
- [BACKPORT][v1.9.2][IMPROVEMENT] Add usage metrics for Longhorn installation variant 11805 - @derekbit
- [BACKPORT][v1.9.2][IMPROVEMENT] SAST Potential dereference of the null pointer in controller/volume_controller.go in longhorn-manager 11782 - @c3y1huang
- [BACKPORT][v1.9.2][IMPROVEMENT] Collect mount table, process status and process table in support bundle 11726 - @mantissahz @chriscchien
- [BACKPORT][v1.9.2][IMPROVEMENT] rename the backing image manager to reduce the probability of CR name collision 11567 - @COLDTURNIP @chriscchien
- [BACKPORT][v1.9.2][IMPROVEMENT] Improve log messages of longhorn-engine, tgt and liblonghorn for troubleshooting 11604 - @yangchiu @derekbit
- [BACKPORT][v1.9.2][IMPROVEMENT] Misleading log message Deleting orphans on evicted node ... 11501 - @yangchiu @derekbit
- [BACKPORT][v1.9.2][IMPROVEMENT] Check if the backup target is available before creating a backup, backup backing image, and system backup 11324 - @yangchiu @nzhan126
- [BACKPORT][v1.9.2][IMPROVEMENT] adjust the hardcoded timeout limitation for backing image downloading 11310 - @COLDTURNIP @chriscchien
- [BACKPORT][v1.9.2][IMPROVEMENT] Improve longhorn-engine controller log messages 11508 - @derekbit @chriscchien
- [BACKPORT][v1.9.2][IMPROVEMENT] Make liveness probe parameters of instance-manager pod configurable 11506 - @derekbit @chriscchien
- [BACKPORT][v1.9.2][IMPROVEMENT] backing image handle node disk deleting events 11488 - @COLDTURNIP @chriscchien
- [BACKPORT][v1.9.2][IMPROVEMENT] Handle credential secret containing mixed invalid conditions 11327 - @yangchiu @nzhan126
- [BACKPORT][v1.9.2][IMPROVEMENT] Improve the condition message of engine image check 11193 - @derekbit @chriscchien
Bug
- [BACKPORT][v1.9.2][BUG] Potential Data Corruption During Volume Resizing When Created from Snapshot 11788 - @yangchiu @PhanLe1010
- [BUG] [v1.9.x] support bundle stuck at 33% 11744 - @mantissahz @chriscchien
- [BACKPORT][v1.9.2][BUG] Unable to disable v2-data-engine even though there is no v2 volumes, backing images or orphaned data 11639 - @shuo-wu @chriscchien
- [BACKPORT][v1.9.2][BUG] Longhorn pvcs are in pending state. 11722 - @yangchiu @derekbit
- [BUG] Broken link in documentation 11729 - @consideRatio
- [BACKPORT][v1.9.2][BUG] longhornctl preflight install should load and check iscsi_tcp kernel module. 11710 - @mantissahz @chriscchien
- [BACKPORT][v1.9.2][BUG] Backing image download gets stuck after network disconnection 11624 - @COLDTURNIP
- [BACKPORT][v1.9.2][BUG] Volume becomes faulted when its replica node disks run out of space during a write operation 11341 - @mantissahz @chriscchien
- [BACKPORT][v1.9.2][BUG] Engine process continues running after rapid volume detachment 11606 - @COLDTURNIP @yangchiu @chriscchien
- [BACKPORT][v1.9.2][BUG] Creating a 2 Gi volume with a 200 Mi backing image is rejected with “volume size should be larger than the backing image size” 11648 - @COLDTURNIP @yangchiu @chriscchien
- [BACKPORT][v1.9.2][BUG] longhorn-manager repeatedly emits No instance manager for node xxx for update instance state of orphan instance orphan-xxx.. 11599 - @COLDTURNIP @chriscchien
- [BACKPORT][v1.9.2][BUG] BackupBackingImage may be created from an unready BackingImageManager 11692 - @WebberHuang1118 @roger-ryao
- [BACKPORT][v1.9.2][BUG] Longhorn fails to create Backing Image Backup on ARM platform 11570 - @COLDTURNIP
- [BACKPORT][v1.9.2][BUG] remaining unknown OS condition in node CR 11614 - @COLDTURNIP @roger-ryao
- [BACKPORT][v1.9.2][BUG] Volumes fail to remount when they go read-only 11584 - @derekbit @chriscchien
- [BACKPORT][v1.9.2][BUG] Dangling Volume State When Live Migration Terminates Unexpectedly 11590 - @PhanLe1010 @chriscchien
- [BACKPORT][v1.9.2][BUG] Unable to setup backup target in storage network environment: cannot find a running instance manager for node 11482 - @derekbit @chriscchien
- [BACKPORT][v1.9.2][BUG] Test case test_recurring_jobs_when_volume_detached_unexpectedly failed: backup completed but progress did not reach 100% 11476 - @yangchiu @mantissahz
- [BACKPORT][v1.9.2][BUG] Recurring Job with 'default' group causes goroutine deadlock on v1.9.1 (Regression of #11020) 11494 - @c3y1huang
- [BACKPORT][v1.9.2][BUG] Test Case test_replica_auto_balance_node_least_effort Is Sometimes Failed 11391 - @derekbit @chriscchien
- [BACKPORT][v1.9.2][BUG] Unable to set up S3 backup target if backups already exist 11344 - @mantissahz @chriscchien
- [BACKPORT][v1.9.2][BUG] longhorn-manager is crashed due to SIGSEGV: segmentation violation 11422 - @derekbit @roger-ryao
- [BACKPORT][v1.9.2][BUG] Typo in configuration parameter: "offlineRelicaRebuilding" should be "offlineReplicaRebuilding" 11382 - @yangchiu
- [BUG][UI][v1.9.2-rc2] Unable to Retrieve Volume's Backup List in the Operation 11841 - @houhouhoucoop @roger-ryao
Longhorn v1.10.0-rc4
DON'T UPGRADE from/to any RC/Preview/Sprint releases because the operation is not supported.
Resolved Issues in this release
Highlight
- [FEATURE] V2 Volume Supports Cloning 7794 - @yangchiu @PhanLe1010
- [FEATURE] v2 supports volume expansion 8022 - @davidcheng0922 @chriscchien
- [UI][FEATURE] V2 Volume Supports Cloning 11736 - @yangchiu @houhoucoop
- [FEATURE] V2 volumes support interrupt mode 9834 - @yangchiu @c3y1huang
- [FEATURE] Support v2 volume without hugepage 7066 - @derekbit @chriscchien
- [FEATURE] Configurable Backup Block Size 5215 - @COLDTURNIP @yangchiu
- [UI][FEATURE] Configurable Backup Block Size 11586 -
- [FEATURE] Add QoS support to limit replica rebuilding load 10770 - @hookak @roger-ryao
- [FEATURE] Volume granular setting parity for V2 to match V1 data engine 10926 - @derekbit @chriscchien
- [IMPROVEMENT] Support CSIStorageCapacity in Longhorn CSI driver to enable capacity-aware pod scheduling 10685 - @bachmanity1 @roger-ryao
- [FEATURE] IPV6 for V1 Data Engine 2259 - @yangchiu @c3y1huang
- [FEATURE] Delta Replica Rebuilding using Delta Snapshot: Control and Data Planes 10037 - @shuo-wu @roger-ryao
- [FEATURE] Remove v1beta1 API CRD in Longhorn v1.10 10249 - @derekbit @roger-ryao
Feature
- [FEATURE] Add option to restart kubelet through longhornctl after huge page update 11241 - @chriscchien @bachmanity1
- [UI][FEATURE] Configurable Backup Block Size 11351 - @yangchiu @houhoucoop
- [UI][FEATURE] Display a summary of the attachment tickets in an individual volume's overview page 11401 - @yangchiu @houhoucoop
- [UI][FEATURE] Add QoS support to limit replica rebuilding load 11306 - @davidcheng0922 @houhoucoop @roger-ryao
- [UI][FEATURE] Volume granular setting parity for V2 to match V1 data engine 11354 - @chriscchien @houhoucoop
- [FEATURE] Display a summary of the attachment tickets in an individual volume's overview page 11400 - @yangchiu @davidcheng0922
- [FEATURE] Allow longhorn to restart pods with custom controllers, while the Automatically Delete Workload Pod when The Volume Is Detached Unexpectedly feature is enabled 8353 - @derekbit @roger-ryao
- [FEATURE] Standardized way to override container image registry 11064 - @marcosbc @yangchiu @roger-ryao
- [FEATURE] Standardized way to specify image pull secrets 11062 - @marcosbc @chriscchien
Improvement
- [IMPROVEMENT] Add usage metrics for Longhorn installation variant 11792 - @derekbit
- [IMPROVEMENT] Allow applying different values of snapshot checksum related settings for v1 and v2 data engine 11537 - @chriscchien @nzhan126
- [IMPROVEMENT] Make longhornctl usable in air-gapped environments 11291 - @chriscchien @bachmanity1
- [IMPROVEMENT] SAST Potential dereference of the null pointer in controller/volume_controller.go in longhorn-manager 11780 - @c3y1huang
- [IMPROVEMENT] Collect Logs from the Host Directory Defined by the Setting log-path 11522 - @c3y1huang @roger-ryao
- [IMPROVEMENT] Enhance Offline Rebuilding with Resource Awareness and Retry Backoff 11270 - @mantissahz @chriscchien
- [IMPROVEMENT] Collect mount table, process status and process table in support bundle 8397 - @mantissahz @chriscchien
- [IMPROVEMENT] Volume attachment should automatically exclude nodes with disable-v2-data-engine="true" 11695 - @derekbit @chriscchien
- [IMPROVEMENT] Introduce System Info Category for Settings 11656 - @derekbit @roger-ryao
- [IMPROVEMENT] RBAC permissions 11345 - @davidcheng0922 @chriscchien
- [IMPROVEMENT] Improve Longhorn Pods Logging Precision to Nanoseconds 11596 - @derekbit @roger-ryao
- [IMPROVEMENT] Update validation logics for v2 data engine 11600 - @derekbit @chriscchien
- [IMPROVEMENT] Improve log messages of longhorn-engine, tgt and liblonghorn for troubleshooting 11545 - @yangchiu @derekbit
- [IMPROVEMENT] rename the backing image manager to reduce the probability of CR name collision 11455 - @COLDTURNIP @chriscchien
- [IMPROVEMENT] Remove outdated prerequisite installation scripts in longhorn/longhorn 11430 - @yangchiu @roger-ryao @sushant-suse
- [UI][IMPROVEMENT] Add UI Warning for Force-Detach Actions to Prevent Out-of-Sync Kubernetes and Longhorn VolumeAttachments 9944 - @yangchiu @houhoucoop
- [IMPROVEMENT] Add node-selector option to longhornctl to select nodes on which to run DaemonSet 11213 - @yangchiu @bachmanity1
- [IMPROVEMENT] Improve volume Scheduled condition message 11460 - @yangchiu @derekbit @chriscchien
- [IMPROVEMENT] Launching a new mechanism to collect instance manager logs 5948 - @yangchiu @derekbit
- [IMPROVEMENT] adjust the hardcoded timeout limitation for backing image downloading 11309 - @COLDTURNIP @roger-ryao
- [IMPROVEMENT] Make liveness probe parameters of instance-manager pod configurable 10788 - @yangchiu @derekbit
- [IMPROVEMENT] Enhance menu descriptions for Longhorn CLI 8998 - @roger-ryao @sushant-suse
- [IMPROVEMENT] Improve longhorn-engine controller log messages 11507 - @derekbit @chriscchien
- [IMPROVEMENT] Add a comment to explain what isSettingDataEngineSynced does in the instance manager controller. 11321 - @mantissahz
- [IMPROVEMENT] Flooding and misleading log message Deleting orphans on evicted node ... 11500 - @yangchiu @derekbit
- [IMPROVEMENT] Reject volume.spec.replicaRebuildingBandwidthLimit update for V1 Data Engine 11497 - @derekbit @roger-ryao
- [IMPROVEMENT] Detach an offline rebuilding volume if rebuilding can not start 11274 - @mantissahz
- [IMPROVEMENT] backing image handle node disk deleting events 10983 - @COLDTURNIP @chriscchien
- [IMPROVEMENT] Rename RebuildingMbytesPerSecond to ReplicaRebuildBandwidthLimit 11403 - @derekbit @roger-ryao
- [IMPROVEMENT] Make the sync agent profilable 11386 - @COLDTURNIP @yangchiu
- [IMPROVEMENT] Add performance metrics for Longhorn disk I/O 11223 - @hookak @DamiaSan
- [IMPROVEMENT] Make CLI preflight check non-blocking for subsequent checkups 9877 - @davidcheng0922 @DamiaSan
- [IMPROVEMENT] Add namespace argument/parameter to cli pre-flight check 9749 - @davidcheng0922 @DamiaSan
- [IMPROVEMENT] Orphaned Data should not be placed under Settings 10383 - @houhoucoop @DamiaSan @sushant-suse
- [IMPROVEMENT] Upgrade Node v20 in longhorn-ui 11315 - @chriscchien @houhoucoop
- [IMPROVEMENT] useful error message from /v1/backuptargets is not displayed in UI 10428 - @houhoucoop @DamiaSan
- [IMPROVEMENT] Check if the backup target is available before creating a backup, backup backing image, and system backup 10085 - @yangchiu @nzhan126
- [IMPROVEMENT] Backoff Retry Interval for Instance Manager Pod Re-creation in Resource Constraint Scenarios 10263 - @yangchiu @bachmanity1
- [IMPROVEMENT] record the detail while webhook rejecting migration attachment tickets [11150](https://github.com/longhorn/longh...