# Add upgrade documentation for XOSTOR (#319)
Only XCP-ng 8.2.1 is currently supported, with a maximum of 7 machines per pool.

See this documentation: [RPU](/management/updates/#rolling-pool-update-rpu).
## Upgrade

If you are reading this documentation, we assume that you want to upgrade a pool on which XOSTOR is deployed, i.e. change the version of XCP-ng, for example from 8.2 to 8.3.
For updates that don't change the version number of XCP-ng (bug fixes, security fixes), see [the update section](#update).

### 1. Prerequisites

- All hosts must be up to date on the version of XCP-ng you are currently using. For this, refer to [the update section](#update).
- HA must be disabled on your pool (see the check after this list).
- Ensure all nodes are reachable and resources are in an "OK" state via XO's XOSTOR view. Alternatively, you can use the CLI:
```
linstor node list
linstor resource list
linstor advise resource
linstor storage-pool list
```
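
To check and, if needed, disable HA from the CLI, the standard `xe` commands below can be used (a minimal sketch; replace `<POOL_UUID>` with your pool's UUID):

```
# Check whether HA is enabled on the pool
xe pool-param-get uuid=<POOL_UUID> param-name=ha-enabled

# Disable HA if it is enabled
xe pool-ha-disable
```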

### 2. XCP-ng ISO with LINSTOR support

To upgrade your XCP-ng LINSTOR pool without issues, you must use a dedicated ISO with `linstor-upgradeonly` in its name.
Don't try to upgrade using the CLI or the network ISO installer.
The dedicated upgrade ISO can be downloaded from [https://repo.vates.tech/xcp-ng/isos/](https://repo.vates.tech/xcp-ng/isos/).

LINSTOR has several prerequisites to work correctly. If you don't use the right upgrade image:
- LINSTOR's controller and satellite packages will be removed.
- Specific LINSTOR services will be removed by the generic XCP-ng ISO.
- DRBD/LINSTOR ports will not be open on the upgraded host.

### 3. Upgrade steps

From this point, we can proceed with upgrading your XOSTOR-enabled pool.

An upgrade can take quite a long time, so we recommend disabling the auto-evict mechanism during this procedure to avoid bad behavior. On the host where the controller is running:

```
linstor controller set-property DrbdOptions/AutoEvictAllowEviction False
```
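
To confirm the property was applied, you can list the controller properties (the `grep` filter is only a convenience):

```
linstor controller list-properties | grep -i AutoEvict
```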

For each host in the pool (starting with the master), follow the instructions given in [this guide](../installation/upgrade/#-upgrade-via-installation-iso-recommended).

:::warning
If you get this error during the upgrade, you must download the right ISO version as documented in [this section](#2-xcp-ng-iso-with-linstor-support):
```
Cannot upgrade host with LINSTOR using a package source that does not have LINSTOR. Please use as package source the repository on the dedicated ISO.
```
:::

If you want to make sure that everything is going well, so as not to impact your production, we recommend these manual checks after each host reboot:

- Ensure the host node is connected with the command below. If it is not yet listed as connected, wait a few seconds and retry.
```
linstor node list
```

- Check whether there is an issue with the resources:
```
linstor resource list
linstor advise resource # Gives possible fix commands in case of problems.
```

- Check in XOA that the PBD of this host's SR is connected. If not, connect it (a CLI alternative is sketched below).
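
A hedged CLI equivalent of that last check, using standard `xe` commands (the placeholders are yours to fill in):

```
# Show this host's PBD for the XOSTOR SR and whether it is attached
xe pbd-list sr-uuid=<SR_UUID> host-uuid=<HOST_UUID> params=uuid,currently-attached

# Plug the PBD if currently-attached is false
xe pbd-plug uuid=<PBD_UUID>
```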

:::warning
Very important: to avoid breaking the quorum or your production environment, execute the commands given above after upgrading each host, and do not reboot/upgrade the next host until the upgraded host's satellite is operational and its PBD is plugged.
:::

### 4. After pool upgrade

- If you disabled auto-eviction as recommended, you must re-enable it. On the host where the controller resides, execute this command:
```
linstor controller set-property DrbdOptions/AutoEvictAllowEviction True
```

If a node was evicted because the recommendation was not followed, read this [topic](#what-to-do-when-a-node-is-in-an-evicted-state).

- Check the resource states with:
```
linstor resource list
```

In case of bad sync between volumes, execute this on each machine:
```
systemctl stop linstor-controller
systemctl restart linstor-satellite
```
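
Then watch `linstor resource list` until the volumes report a healthy state again; with DRBD, fully synchronized replicas normally show `UpToDate`.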

- In case of a bad node (missing, without a storage pool, or inaccessible via `linstor node list`/`linstor storage-pool list`) due to a failed upgrade, or if the documentation was not followed correctly, read [this documentation](#how-to-add-a-new-host-or-fix-a-badly-configured-host) to recover.

## Global questions

### The linstor command does not work!?

Consider the following:
- If you create a volume of 200 GiB, it means that 400 of the 600 GiB are physically used. However, the remaining 200 GiB of disk cannot be used, because they cannot be replicated on two different disks.
- If you create 3 volumes of 100 GiB, the SR is filled: you have 300 GiB of unique data and 300 GiB of replicated copies.

### How to add a new host or fix a badly configured host?

:::warning
If you want to configure a new host, make sure the pool is up to date (see [the update section](#update)) and make sure you have the required packages on the new host by running these commands on it:
```
yum install -y xcp-ng-release-linstor
yum install -y xcp-ng-linstor
```

Then restart the toolstack so that the LINSTOR driver is detected:
```
xe-toolstack-restart
```

If you are in a situation where you can't safely update your pool, contact [Vates Pro Support](https://vates.tech/pricing-and-support) for guidance applying to your specific situation.
:::
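
After the toolstack restart, you can check that the LINSTOR SM driver is now visible (a hedged check; the exact driver name string may differ):

```
xe sm-list | grep -i linstor
```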

First ensure you have the same configuration on each PBD of your XOSTOR SR using this command, replacing `<UUID>` with the UUID of your SR:
```
xe pbd-list sr-uuid=<UUID>
```

Example output, where the group-name is `linstor_group/thin_device`:
```
uuid ( RO)              : 06d10e9e-c7ad-2ed6-a901-53ac1c2c7486
         host-uuid ( RO): 4bac16be-b25b-4d0b-a159-8f5bda930640
           sr-uuid ( RO): d5f990f6-abca-0ebf-8582-b7e55901fb50
     device-config (MRO): group-name: linstor_group/thin_device; redundancy: 2; provisioning: thin
currently-attached ( RO): true

uuid ( RO)              : 06b5e263-8ec1-74e9-3162-d39785be6ba7
         host-uuid ( RO): f7737f79-ad49-491c-a303-95ac37fb6a13
           sr-uuid ( RO): d5f990f6-abca-0ebf-8582-b7e55901fb50
     device-config (MRO): group-name: linstor_group/thin_device; redundancy: 2; provisioning: thin
currently-attached ( RO): true

uuid ( RO)              : 1d872d5b-fb60-dbd7-58fc-555a211f18fa
         host-uuid ( RO): ef942670-e37d-49e6-81d0-d2a484b0cd10
           sr-uuid ( RO): d5f990f6-abca-0ebf-8582-b7e55901fb50
     device-config (MRO): group-name: linstor_group/thin_device; redundancy: 2; provisioning: thin
currently-attached ( RO): true
```

Then, if you want to fix an incorrect group-name value, or even add a new host, use this command with the correct `<GROUP_NAME>` and `<HOST_UUID>`:
```
xe host-call-plugin host-uuid=<HOST_UUID> plugin=linstor-manager fn=addHost args:groupName=<GROUP_NAME>
```
In short, this command (re)creates the PBD, opens the DRBD/LINSTOR ports, starts the required services and adds the node to the LINSTOR database.
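
For instance, with the values from the example output above (the UUID here is the third host's `host-uuid`, shown purely for illustration):

```
xe host-call-plugin host-uuid=ef942670-e37d-49e6-81d0-d2a484b0cd10 plugin=linstor-manager fn=addHost args:groupName=linstor_group/thin_device
```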

Note that even then, if the host has storage devices to use, a LINSTOR storage pool is not automatically added for the corresponding node. You can verify the storage state like this:
```
linstor storage-pool list
```

Small example: an `LVM_THIN` entry is missing for `hpmc17` in this context:
```
╭──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
┊ StoragePool ┊ Node ┊ Driver ┊ PoolName ┊ FreeCapacity ┊ TotalCapacity ┊ CanSnapshots ┊ State ┊ SharedName ┊
╞══════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════╡
┊ DfltDisklessStorPool ┊ hpmc15 ┊ DISKLESS ┊ ┊ ┊ ┊ False ┊ Ok ┊ hpmc15;DfltDisklessStorPool ┊
┊ DfltDisklessStorPool ┊ hpmc16 ┊ DISKLESS ┊ ┊ ┊ ┊ False ┊ Ok ┊ hpmc16;DfltDisklessStorPool ┊
┊ DfltDisklessStorPool ┊ hpmc17 ┊ DISKLESS ┊ ┊ ┊ ┊ False ┊ Ok ┊ hpmc17;DfltDisklessStorPool ┊
┊ xcp-sr-linstor_group_thin_device ┊ hpmc15 ┊ LVM_THIN ┊ linstor_group/thin_device ┊ 476.66 GiB ┊ 476.70 GiB ┊ True ┊ Ok ┊ hpmc15;xcp-sr-linstor_group_thin_device ┊
┊ xcp-sr-linstor_group_thin_device ┊ hpmc16 ┊ LVM_THIN ┊ linstor_group/thin_device ┊ 476.66 GiB ┊ 476.70 GiB ┊ True ┊ Ok ┊ hpmc16;xcp-sr-linstor_group_thin_device ┊
╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
```

So if your host has disks that need to be added to the LINSTOR SR, you will need to create the VG/LV yourself.
Connect to the machine to modify, and use `vgcreate` with the wanted disks to create a VG on the host:
```
vgcreate <GROUP_NAME> <DEVICES>
```

In this example, where we want to use /dev/nvme0n1 with the group `linstor_group`:
```
vgcreate linstor_group /dev/nvme0n1
```

For `thin` provisioning, additional commands are required:
```
lvcreate -l 100%FREE -T <GROUP_NAME>/<LV_THIN_VOLUME>
lvchange -ay <GROUP_NAME>/<LV_THIN_VOLUME>
```

Still following this example, we have:
- `<GROUP_NAME>`: `linstor_group`.
- `<LV_THIN_VOLUME>`: `thin_device`.
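
Applied to this example, that gives:

```
lvcreate -l 100%FREE -T linstor_group/thin_device
lvchange -ay linstor_group/thin_device
```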

Run the correct command on the host where the controller is running to add the storage pool to the LINSTOR database:
```
# For thin:
linstor storage-pool create lvmthin <NODE_NAME> <SP_NAME> <VG_NAME>/<LV_THIN_VOLUME>

# For thick:
linstor storage-pool create lvm <NODE_NAME> <SP_NAME> <VG_NAME>
```

In this example (a thin pool, hence `lvmthin`):
```
linstor storage-pool create lvmthin hpmc17 xcp-sr-linstor_group_thin_device linstor_group/thin_device
```
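
Afterwards, `linstor storage-pool list` should show the new `LVM_THIN` entry for `hpmc17`.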

### How to use a specific network for DRBD requests?

To use a specific network to handle the DRBD traffic, a new interface must be created on each host:
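
As a minimal sketch of one way to declare such an interface, assuming the dedicated network's IP addresses are already configured on each host (the node and interface names are placeholders, and this uses the upstream LINSTOR client's `node interface create` command rather than anything specific to this page):

```
linstor node interface create <NODE_NAME> <INTERFACE_NAME> <IP_ADDRESS>
```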