Add upgrade documentation for XOSTOR #319

Merged 1 commit on Jun 16, 2025.

4 changes: 4 additions & 0 deletions docs/installation/upgrade.md
@@ -17,6 +17,10 @@ There are 3 upgrade methods, detailed below:
* [From command line a.k.a. yum-style upgrade](#-from-command-line). ⚠️ Only for some point version upgrades.
:::

:::warning
For upgrading XCP-ng machines with an XOSTOR SR, please refer to this [additional information](../../xostor#upgrade) before taking any action.
:::

## ☢️ Release Notes & Known Issues

Read the [Release Notes and Known Issues](../../releases#xcp-ng-release-history) for every release that is higher than your current release. They may provide additional instructions for specific situations. Also **please read the following warnings**:
194 changes: 194 additions & 0 deletions docs/xostor/xostor.md
@@ -323,6 +323,94 @@ Only XCP-ng 8.2.1 is currently supported, with a maximum of 7 machines per pool

See this documentation: [RPU](/management/updates/#rolling-pool-update-rpu).

## Upgrade

If you are reading this documentation, we assume that you want to upgrade a pool on which XOSTOR is deployed, i.e. change the version of XCP-ng, for example from 8.2 to 8.3.
For updates that don't change the version number of XCP-ng (bugfixes, security fixes), see [the update section](#update).

### 1. Prerequisites

- All hosts must be up to date on the version of XCP-ng you are currently using. For this refer to [the update section](#update).
- HA must be disabled on your pool. A quick way to check and disable it from the CLI is shown below.
- Ensure all nodes are reachable and resources are in "OK" state via XO's XOSTOR view. Alternatively, you can use the CLI:
```
linstor n l    # short for: linstor node list
linstor r l    # short for: linstor resource list
linstor adv r  # short for: linstor advise resource
linstor sp l   # short for: linstor storage-pool list
```

**Review discussion on the short commands:**

> **Contributor:** Maybe not give the short command versions? They can be prone to errors.
>
> **Member:** I agree. I think we should provide the full commands in the docs in general. That also better documents what the commands do, as the names have a meaning that is easier to understand than single letters.
>
> **Member Author:** I don't see a problem here. Users generally just copy-paste the commands.
>
> **Member:** It's not a problem, but it would be better pedagogically speaking. For example, I don't know what they mean without checking the docs.
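
Regarding the HA prerequisite above: you can check and, if needed, disable HA from the CLI. A minimal sketch using standard `xe` commands:
```
# Check whether HA is enabled on the pool:
xe pool-list params=ha-enabled

# Disable HA if it is enabled:
xe pool-ha-disable
```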

### 2. XCP-ng ISO with LINSTOR support

To upgrade your XCP-ng LINSTOR pool without issues, you must use a dedicated ISO with `linstor-upgradeonly` in its name. Don't try to upgrade using the CLI or the network ISO installer.
The dedicated upgrade ISO can be downloaded from [https://repo.vates.tech/xcp-ng/isos/](https://repo.vates.tech/xcp-ng/isos/).

LINSTOR has several prerequisites to work correctly. If you don't use the dedicated upgrade image:
- LINSTOR's controller and satellite packages would be removed.
- Specific LINSTOR services would be removed by a generic XCP-ng ISO.
- DRBD/LINSTOR ports would not be open on the resulting upgraded host.

### 3. Upgrade steps

From this point we can proceed to upgrade your XOSTOR-enabled pool.

An upgrade can take quite a long time, so we recommend disabling the auto-evict mechanism during this procedure to avoid unwanted node evictions.
On the host where the controller is running:
```
linstor controller set-property DrbdOptions/AutoEvictAllowEviction False
```
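
If you are unsure which host currently runs the controller, one way to find it is to query the service state on each host (a sketch relying on the `linstor-controller` service used elsewhere in this document):
```
# Run on each host; the one hosting the controller reports "active".
systemctl is-active linstor-controller
```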

For each host of the pool (starting with the master), follow the instructions given in [this guide](../installation/upgrade/#-upgrade-via-installation-iso-recommended).

:::warning
If you get this error during the upgrade, you must download the right ISO version as documented in [this section](#2-xcp-ng-iso-with-linstor-support):
```
Cannot upgrade host with LINSTOR using a package source that does not have LINSTOR. Please use as package source the repository on the dedicated ISO.
```
:::

To make sure that everything is going well and avoid impacting your production, we recommend these manual checks after each host reboot:

- Ensure the host node is connected using the command below. If it is not connected yet, wait a few seconds and retry.
```
linstor node list
```
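The upgraded node should be listed with an `Online` connection state.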

- Check if there is an issue with the resources:
```
linstor resource list
linstor advise resource  # suggests possible fix commands in case of problems
```

- Check in XOA that the PBD of the SR of this host is connected. If not, connect it.
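
If you prefer the CLI to XOA for this last check, a minimal sketch using standard `xe` commands (replace the placeholder UUIDs with yours):
```
# Show the PBD of this SR on this host and its attachment state:
xe pbd-list sr-uuid=<SR_UUID> host-uuid=<HOST_UUID> params=uuid,currently-attached

# Plug it if currently-attached is false:
xe pbd-plug uuid=<PBD_UUID>
```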

:::warning
Very important: to avoid breaking quorum and your production environment, you must execute the checks given above after upgrading each host, and not reboot/upgrade the next one until the upgraded host's satellite is operational and its PBD is plugged.
:::

### 4. After pool upgrade

- If you have deactivated auto-eviction as recommended, it's necessary to reactivate it. On the host where the controller resides, execute this command:
```
linstor controller set-property DrbdOptions/AutoEvictAllowEviction True
```

If a node was evicted because the recommendation was not followed, read this [topic](#what-to-do-when-a-node-is-in-an-evicted-state).

- Check the resource states with:
```
linstor r list
```
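Healthy resources should report their volumes as `UpToDate`.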

In case of bad synchronization between volumes, execute this on each machine:
```
systemctl stop linstor-controller
systemctl restart linstor-satellite
```

- In case of a bad node (missing, without a storage pool, or inaccessible via `linstor node list`/`linstor storage-pool list`) due to a failed upgrade, or if the documentation was not followed correctly, you can read this [documentation](#how-to-add-a-new-host-or-fix-a-badly-configured-host) to recover.

## Global questions

### The linstor command does not work!?
@@ -520,6 +608,112 @@ Consider the following:
- If you create a volume of 200 GiB it means that 400 of the 600 GiB are physically used. However, the remaining disk of 200 GiB cannot be used because it cannot be replicated on two different disks.
- If you create 3 volumes of 100 GiB: the SR is filled. In this case, you have 300 GiB of unique data and a replication of 300 GiB.

### How to add a new host or fix a badly configured host?

:::warning
If you want to configure a new host, make sure the pool is up-to-date (see [the update section](#update)) and make sure you have the required packages on the new host by running these commands on it:
```
yum install -y xcp-ng-release-linstor
yum install -y xcp-ng-linstor
```

And then restart the toolstack to detect the LINSTOR driver:
```
xe-toolstack-restart
```

If you are in a situation where you can't safely update your pool, contact [Vates Pro Support](https://vates.tech/pricing-and-support) for guidance specific to your situation.
:::

First ensure you have the same configuration on each PBD of your XOSTOR SR using this command. Replace `<UUID>` with the SR UUID that you use:
```
xe pbd-list sr-uuid=<UUID>
```

Example output where the group-name is `linstor_group/thin_device`:
```
uuid ( RO) : 06d10e9e-c7ad-2ed6-a901-53ac1c2c7486
host-uuid ( RO): 4bac16be-b25b-4d0b-a159-8f5bda930640
sr-uuid ( RO): d5f990f6-abca-0ebf-8582-b7e55901fb50
device-config (MRO): group-name: linstor_group/thin_device; redundancy: 2; provisioning: thin
currently-attached ( RO): true


uuid ( RO) : 06b5e263-8ec1-74e9-3162-d39785be6ba7
host-uuid ( RO): f7737f79-ad49-491c-a303-95ac37fb6a13
sr-uuid ( RO): d5f990f6-abca-0ebf-8582-b7e55901fb50
device-config (MRO): group-name: linstor_group/thin_device; redundancy: 2; provisioning: thin
currently-attached ( RO): true


uuid ( RO) : 1d872d5b-fb60-dbd7-58fc-555a211f18fa
host-uuid ( RO): ef942670-e37d-49e6-81d0-d2a484b0cd10
sr-uuid ( RO): d5f990f6-abca-0ebf-8582-b7e55901fb50
device-config (MRO): group-name: linstor_group/thin_device; redundancy: 2; provisioning: thin
currently-attached ( RO): true
```

Then if you want to fix an incorrect group name value or even add a new host, use this command with the correct `<GROUP_NAME>` and `<HOST_UUID>`:
```
xe host-call-plugin host-uuid=<HOST_UUID> plugin=linstor-manager fn=addHost args:groupName=<GROUP_NAME>
```
In short, this command (re)creates the PBD, opens the DRBD/LINSTOR ports, starts the required services, and adds the node to the LINSTOR database.
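
For example, with the host UUID and `group-name` taken from the `pbd-list` output above:
```
xe host-call-plugin host-uuid=ef942670-e37d-49e6-81d0-d2a484b0cd10 \
  plugin=linstor-manager fn=addHost args:groupName=linstor_group/thin_device
```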

Note that if the host has storage devices to use, a LINSTOR storage pool is not automatically added to the corresponding node. You can verify the storage state like this:
```
linstor storage-pool list
```

A small example, where an `LVM_THIN` entry is missing for `hpmc17`:
```
╭──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
┊ StoragePool ┊ Node ┊ Driver ┊ PoolName ┊ FreeCapacity ┊ TotalCapacity ┊ CanSnapshots ┊ State ┊ SharedName ┊
╞══════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════╡
┊ DfltDisklessStorPool ┊ hpmc15 ┊ DISKLESS ┊ ┊ ┊ ┊ False ┊ Ok ┊ hpmc15;DfltDisklessStorPool ┊
┊ DfltDisklessStorPool ┊ hpmc16 ┊ DISKLESS ┊ ┊ ┊ ┊ False ┊ Ok ┊ hpmc16;DfltDisklessStorPool ┊
┊ DfltDisklessStorPool ┊ hpmc17 ┊ DISKLESS ┊ ┊ ┊ ┊ False ┊ Ok ┊ hpmc17;DfltDisklessStorPool ┊
┊ xcp-sr-linstor_group_thin_device ┊ hpmc15 ┊ LVM_THIN ┊ linstor_group/thin_device ┊ 476.66 GiB ┊ 476.70 GiB ┊ True ┊ Ok ┊ hpmc15;xcp-sr-linstor_group_thin_device ┊
┊ xcp-sr-linstor_group_thin_device ┊ hpmc16 ┊ LVM_THIN ┊ linstor_group/thin_device ┊ 476.66 GiB ┊ 476.70 GiB ┊ True ┊ Ok ┊ hpmc16;xcp-sr-linstor_group_thin_device ┊
╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
```

So if your host has disks that need to be added to the LINSTOR SR, you will need to create the VG (and, for thin provisioning, the LV) yourself.
Connect to the host you want to modify and use `vgcreate` with the desired disks to create a VG on it:
```
vgcreate <GROUP_NAME> <DEVICES>
```

In this example, we want to use `/dev/nvme0n1` with the group `linstor_group`:
```
vgcreate linstor_group /dev/nvme0n1
```

For `thin` provisioning, additional commands are required:
```
lvcreate -l 100%FREE -T <GROUP_NAME>/<LV_THIN_VOLUME>
lvchange -ay <GROUP_NAME>/<LV_THIN_VOLUME>
```

Continuing with this example, we have:
- `<GROUP_NAME>`: `linstor_group`.
- `<LV_THIN_VOLUME>`: `thin_device`.
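
Which gives, for this example:
```
lvcreate -l 100%FREE -T linstor_group/thin_device
lvchange -ay linstor_group/thin_device
```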

Run the appropriate command on the host where the controller is running to add the storage pool to the LINSTOR database:
```
# For thin:
linstor storage-pool create lvmthin <NODE_NAME> <SP_NAME> <VG_NAME>

# For thick:
linstor storage-pool create lvm <NODE_NAME> <SP_NAME> <VG_NAME>
```

In this example, the pool is thin-provisioned, so:
```
linstor storage-pool create lvmthin hpmc17 xcp-sr-linstor_group_thin_device linstor_group/thin_device
```
After this, `linstor storage-pool list` should show an `LVM_THIN` entry for `hpmc17`, matching the other nodes.

### How to use a specific network for DRBD requests?

To use a specific network to handle the DRBD traffic, a new interface must be created on each host: