* Add additional-guest-memory-overhead-ratio setting
Signed-off-by: Jian Wang <[email protected]>
* Update with more details
Signed-off-by: Jian Wang <[email protected]>
* Add more supporting data
Signed-off-by: Jian Wang <[email protected]>
* Update resource quota about vm memory overhead
Signed-off-by: Jian Wang <[email protected]>
* Hint that changing the setting does not affect running VMs immediately
Signed-off-by: Jian Wang <[email protected]>
* Add new settings and other changes to v1.4 branch
Signed-off-by: Jian Wang <[email protected]>
* Address review comments
Signed-off-by: Jian Wang <[email protected]>
---------
Signed-off-by: Jian Wang <[email protected]>
`docs/advanced/settings.md` (+120 lines)
### `additional-guest-memory-overhead-ratio`
**Definition**: The ratio used to further tune the VM `Memory Overhead`.

Each VM is configured with a memory value that is targeted for the VM guest OS to see and use. In Harvester, the VM runs in a virt-launcher pod, and the CPU/memory resource limits are translated and applied to that pod ([Resource requests and limits of Pod and container](https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/#resource-requests-and-limits-of-pod-and-container)). KubeVirt ensures that a certain amount of memory is reserved in the pod for managing the virtualization process. Harvester and KubeVirt summarize this additional memory as the VM `Memory Overhead`, which is computed by a complex formula. However, OOM (Out Of Memory) events can still happen and the related VM is killed by the Harvester OS; the direct cause is that the whole pod/container exceeds its memory limits. In practice, the `Memory Overhead` varies with the kind of VM, the guest operating system, and the workloads running on the VM.
This setting allows more flexible tuning of the VM `Memory Overhead`.
**Default value**: `"1.5"`
**Valid values**: `""`, `"0"`, and from `"1.0"` to `"10.0"`.
A VM configured with `1 CPU, 2 Gi Memory, 1 Volume and 1 NIC` gets around `240 Mi` of `Memory Overhead` when the ratio is `"1.0"`. When the ratio is `"1.5"`, the `Memory Overhead` is about `360 Mi`; when the ratio is `"3"`, it is about `720 Mi`.
A VM configured with `1 CPU, 64 Gi Memory, 1 Volume and 1 NIC` gets around `250 Mi` of `Memory Overhead` when the ratio is `"1.0"`. The VM memory size does not have a big influence on the computed `Memory Overhead`; the `guest OS pagetables` overhead needs only one bit for every 512b of RAM.
The YAML output of this setting on a new cluster:

```
apiVersion: harvesterhci.io/v1beta1
default: "1.5"
kind: Setting
metadata:
  name: additional-guest-memory-overhead-ratio
value: ""
```
When the `value` field is `""`, the `default` field is used.

When the `value` field is `"0"`, the `additional-guest-memory-overhead-ratio` setting is not used and Harvester falls back to the legacy [Reserved Memory](../../versioned_docs/version-v1.3/vm/create-vm.md#reserved-memory) behavior used in Harvester v1.3.x, v1.2.x and earlier versions. When a new VM is created and the `Reserved Memory` field on the WebUI is not filled, the VM gets the default `100 Mi` `Reserved Memory`.
If you already set a valid value on the `spec.configuration.additionalGuestMemoryOverheadRatio` field of the `kubevirt` object before Harvester v1.4.0 and then upgrade to v1.4.0, Harvester fetches that value and converts it to the `value` field of this setting on the upgrade path. After that, Harvester always syncs this setting to the `kubevirt` object.
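If you prefer the command line over the Harvester UI, the setting can be inspected and changed roughly as follows. This is a minimal sketch that assumes `kubectl` access to the cluster; adjust it to your environment.

```
# Show the current setting object, including the `default` and `value` fields.
kubectl get settings.harvesterhci.io additional-guest-memory-overhead-ratio -o yaml

# Set an explicit value, for example "2.0"; set it back to "" to fall back to the default.
kubectl patch settings.harvesterhci.io additional-guest-memory-overhead-ratio \
  --type merge -p '{"value": "2.0"}'
```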
This setting and the VM configuration field [Reserved Memory](../vm/create-vm.md#reserved-memory) are both taken into account to compute the final `Total Memory Overhead` for each VM.

| VM Configured Memory | Reserved Memory | additional-guest-memory-overhead-ratio | Guest OS Memory | POD Container Memory Limit | Total Memory Overhead |
| --- | --- | --- | --- | --- | --- |
| 2 Gi | "" (not configured) | "0.0" | 2 Gi - 100 Mi | 2 Gi + 240 Mi | ~340 Mi |
| 2 Gi | 256 Mi | "0.0" | 2 Gi - 256 Mi | 2 Gi + 240 Mi | ~500 Mi |
| 2 Gi | "" (not configured) | "1.0" | 2 Gi | 2 Gi + 240*1.0 Mi | ~240 Mi |
| 2 Gi | "" (not configured) | "3.0" | 2 Gi | 2 Gi + 240*3.0 Mi | ~720 Mi |
| 2 Gi | "" (not configured) | "1.5" | 2 Gi | 2 Gi + 240*1.5 Mi | ~360 Mi |
| 2 Gi | 256 Mi | "1.5" | 2 Gi - 256 Mi | 2 Gi + 240*1.5 Mi | ~620 Mi |
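The last column can be read as the base `Memory Overhead` multiplied by the ratio, plus any configured `Reserved Memory` (with the legacy `100 Mi` default applied only when the ratio is `"0"`). A rough sketch of the arithmetic behind two of the rows above, using the ~240 Mi base overhead of the example VM; the numbers are approximations, not the exact KubeVirt formula:

```
# ~240 Mi is the base Memory Overhead of the example VM (1 CPU, 2 Gi memory, 1 volume, 1 NIC).

# Row: ratio "1.5", Reserved Memory not configured  ->  ~360 Mi
awk 'BEGIN { printf "total overhead: ~%d Mi\n", 240 * 1.5 }'

# Row: ratio "1.5", Reserved Memory 256 Mi  ->  ~620 Mi
awk 'BEGIN { printf "total overhead: ~%d Mi\n", 240 * 1.5 + 256 }'
```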
The related information can be fetched from these objects:

```
When `additional-guest-memory-overhead-ratio` is set as "1.5".

The VM object:
...
  memory:
    guest: 2Gi            // Guest OS Memory
  resources:
    limits:
      cpu: "1"
      memory: 2Gi         // VM Configured Memory

The POD object:
...
  resources:
    limits:
      cpu: "1"
      devices.kubevirt.io/kvm: "1"
      devices.kubevirt.io/tun: "1"
      devices.kubevirt.io/vhost-net: "1"
      memory: "2532309561"   // POD Container Memory Limit
```
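If you want to pull these numbers from a live cluster, commands along the following lines work. The VM name `demo` and namespace `default` are placeholders, and the example assumes the standard `vm.kubevirt.io/name` label that KubeVirt puts on virt-launcher pods.

```
# Guest OS memory and VM configured memory from the VM object.
kubectl get vm demo -n default \
  -o jsonpath='guest: {.spec.template.spec.domain.memory.guest}{"\n"}configured: {.spec.template.spec.domain.resources.limits.memory}{"\n"}'

# Memory limit of the virt-launcher pod that backs the VM (the first container is usually "compute").
kubectl get pod -n default -l vm.kubevirt.io/name=demo \
  -o jsonpath='{.items[0].spec.containers[0].resources.limits.memory}{"\n"}'
```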
**Example**:

```
2.0
```
:::note

To reduce the chance of hitting OOM, Harvester suggests the following:

- If the cluster has no memory resource pressure, configure this setting with the value `"2"` to give all VMs about `480 Mi` of `Memory Overhead`.

- For important VMs, configure the `Reserved Memory` field to get a bigger `Total Memory Overhead`. Rules of thumb based on experience:

  - When the `VM Configured Memory` is between `5 Gi` and `10 Gi`, set the `Total Memory Overhead` to `>= VM Configured Memory * 10%`.

  - When the `VM Configured Memory` is greater than `10 Gi`, set the `Total Memory Overhead` to `>= 1 Gi`.

  - Keep observing, tuning and testing to find the best values for each VM.

- Avoid configuring the `spec.configuration.additionalGuestMemoryOverheadRatio` field of the `kubevirt` object directly.

A bigger `Total Memory Overhead` does not mean that this amount of memory is used up all the time; it is reserved to tolerate peaks and hence avoid hitting OOM.

There is no one-size-fits-all solution.

:::
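For example, applying the `10%` rule to an `8 Gi` VM while keeping the default ratio `"1.5"` (which contributes roughly `360 Mi` of overhead for a small VM) suggests a `Reserved Memory` of about `460 Mi`. A rough sketch of that arithmetic, not an exact formula:

```
awk 'BEGIN {
  target     = 8 * 1024 * 0.10   # ~820 Mi: 10% of an 8 Gi VM
  from_ratio = 240 * 1.5         # ~360 Mi contributed by the default ratio "1.5"
  printf "suggested Reserved Memory: ~%d Mi\n", target - from_ratio
}'
```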
:::important

If you have set the `Reserved Memory` field for each VM and plan to keep the legacy [Reserved Memory](../../versioned_docs/version-v1.3/vm/create-vm.md#reserved-memory) behavior, you can set the `additional-guest-memory-overhead-ratio` setting to `"0"` after the cluster is upgraded to Harvester v1.4.0.

Changing the `additional-guest-memory-overhead-ratio` setting affects VMs according to the following rules:

- It affects VMs newly created after the change immediately.

- It does not affect running VMs immediately; those VMs get the new `Memory Overhead` calculated from the setting after they are rebooted.

- When a VM has a user-configured `Reserved Memory`, that value is always kept.

- When the value changes between `"0"` and the range `["", "1.0" .. "10.0"]`, existing VMs that have the default `100 Mi` `Reserved Memory` keep it, and existing VMs that do not have it will not get it automatically.

:::
### `release-download-url`
**Definition**: URL for downloading the software required for upgrades.
`docs/rancher/resource-quota.md` (+20/-7 lines)
Attempts to provision VMs for guest clusters are blocked when the resource quotas are reached. Rancher responds by creating a new VM in a loop, in which each failed attempt to create a VM is immediately followed by another creation attempt. This results in a transient error state in the cluster that is not recorded as the VM is recreated.

:::

:::important

Due to the [Overhead Memory of Virtual Machine](#overhead-memory-of-virtual-machine), each VM needs some additional memory to work. Take this into account when setting the **Memory Limit**. For example, when the project **Memory Limit** is `24 Gi`, it is not possible to run 3 VMs that each have `8 Gi` of memory.

:::

## Overhead Memory of Virtual Machine
Upon creating a virtual machine (VM), the VM controller seamlessly incorporates overhead resources into the VM's configuration. These additional resources intend to guarantee the consistent and uninterrupted functioning of the VM. It's important to note that configuring memory limits requires a higher memory reservation due to the inclusion of these overhead resources.

For example, consider the creation of a new VM with the following configuration:

Memory Overhead is calculated in the following sections:

- **Memory PageTables Overhead:** This accounts for one bit for every 512b RAM size. For instance, a memory of 16Gi requires an overhead of 32Mi.
- **VM Fixed Overhead:** This consists of several components:
  - `VirtLauncherMonitorOverhead`: 25Mi (the `ps` RSS for virt-launcher-monitor)
  - `VirtLauncherOverhead`: 100Mi (the `ps` RSS for the virt-launcher process)
  - `VirtlogdOverhead`: 20Mi (the `ps` RSS for virtlogd)
  - `VirtqemudOverhead`: 35Mi (the `ps` RSS for virtqemud)
  - `QemuOverhead`: 30Mi (the `ps` RSS for qemu, minus the RAM of its (stressed) guest, minus the virtual page table)
- **8Mi per CPU (vCPU) Overhead:** Additionally, 8Mi of overhead per vCPU is added, along with a fixed 8Mi overhead for IOThread.
- **Extra Added Overhead:** This encompasses various factors like video RAM overhead and architecture overhead. Refer to [Additional Overhead](https://github.com/kubevirt/kubevirt/blob/2bb88c3d35d33177ea16c0f1e9fffdef1fd350c6/pkg/virt-controller/services/template.go#L1853-L1890) for further details.
- **additional-guest-memory-overhead-ratio:** You can further tune the `Memory Overhead` with the Harvester setting [additional-guest-memory-overhead-ratio](../advanced/settings.md#additional-guest-memory-overhead-ratio), which defaults to `"1.5"`. This setting is important for reducing the chance that a VM hits OOM (Out Of Memory).

This calculation demonstrates that the VM instance necessitates an additional memory overhead of approximately 380Mi.
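A rough sketch of how those components add up for the example VM (1 vCPU, 2Gi of memory): the per-component numbers come from the list above, the `extra` value standing in for video RAM and similar additions is an illustrative assumption, and the result only approximates the exact KubeVirt formula, landing in the same ballpark as the figure cited above.

```
awk 'BEGIN {
  pagetables = 2048 / 512               # memory pagetables: RAM size / 512 -> ~4 Mi for 2 Gi
  fixed      = 25 + 100 + 20 + 35 + 30  # virt-launcher-monitor, virt-launcher, virtlogd, virtqemud, qemu
  per_cpu    = 8 * 1 + 8                # 8 Mi per vCPU plus a fixed 8 Mi for IOThread
  extra      = 16                       # illustrative value for video RAM and similar extras
  base  = pagetables + fixed + per_cpu + extra
  total = base * 1.5                    # additional-guest-memory-overhead-ratio default "1.5"
  printf "base overhead: ~%d Mi, with ratio 1.5: ~%d Mi\n", base, total
}'
```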
For more information, see [Memory Overhead](https://kubevirt.io/user-guide/virtual_machines/virtual_hardware/#memory-overhead).

For more information on how the memory overhead is calculated in Kubevirt, refer to the source code [GetMemoryOverhead](https://github.com/kubevirt/kubevirt/blob/1466b658f78b9b8bb9517ffb6dafd4b777f33fe6/pkg/virt-controller/services/renderresources.go#L307).

:::note

The `Overhead Memory` varies between Harvester releases (which ship different KubeVirt releases) because the backing components keep adding new features and fixing bugs, and therefore need more memory.

:::
## Automatic adjustment of ResourceQuota during migration
When the allocated resource quota controlled by the `ResourceQuota` object reaches its limit, migrating a VM becomes unfeasible. The migration process automatically creates a new pod mirroring the resource requirements of the source VM. If these pod creation prerequisites surpass the defined quota, the migration operation cannot proceed.
Please be aware of the following constraints of the automatic resizing of `ResourceQuota`:
- When raising the `ResourceQuota` value, if you create, start, or restore other VMs, Harvester will verify if the resources are sufficient based on the original `ResourceQuota`. If the conditions are not met, the system will alert that the migration process is not feasible.
- After expanding `ResourceQuota`, potential resource contention may occur between non-VM pods and VM pods, leading to migration failures. Therefore, deploying custom container workloads and VMs to the same namespace is not recommended.
- Due to the concurrent limitation of the webhook validator, the VM controller will execute a secondary validation to confirm resource sufficiency. If the resource is insufficient, it will auto config the VM's `RunStrategy` to `Halted`, and a new annotation `harvesterhci.io/insufficient-resource-quota` will be added to the VM object, informing you that the VM was shut down due to insufficient resources.
`docs/vm/create-vm.md` (+17 lines)
- Stop: There will be no VM instance. If the guest is already running, it will be stopped. This is the same behavior as `Running: false`.
### Reserved Memory

Each VM is configured with a memory value; this memory is targeted for the VM guest OS to see and use. In Harvester, the VM is carried by a Kubernetes pod, and the memory limit is enforced via Kubernetes [Resource requests and limits of Pod and container](https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/#resource-requests-and-limits-of-pod-and-container). A certain amount of additional memory is required to simulate and manage the `CPU/Memory/Storage/Network/...` for the VM to run. Harvester and KubeVirt summarize this additional memory as the VM `Memory Overhead`. The `Memory Overhead` is computed by a complex formula. However, OOM (Out Of Memory) events can still happen and the related VM is killed by the Harvester OS; the direct cause is that the whole pod/container exceeds its memory limits. In practice, the `Memory Overhead` varies with the kind of VM, the guest operating system, and the workloads running on the VM.

Harvester adds a `Reserved Memory` field and the setting `additional-guest-memory-overhead-ratio` for users to adjust the guest OS memory and the `Total Memory Overhead`.
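For example, with `2 Gi` of configured memory and a `Reserved Memory` of `256 Mi`, the guest OS sees `2 Gi - 256 Mi`. A trivial sketch of that arithmetic, with illustrative values:

```
awk 'BEGIN {
  configured = 2048   # VM configured memory, in Mi
  reserved   = 256    # Reserved Memory field, in Mi
  printf "memory the guest OS sees: %d Mi\n", configured - reserved
}'
```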
`versioned_docs/version-v1.3/rancher/resource-quota.md` (+14/-2 lines)
Attempts to provision VMs for guest clusters are blocked when the resource quotas are reached. Rancher responds by creating a new VM in a loop, in which each failed attempt to create a VM is immediately followed by another creation attempt. This results in a transient error state in the cluster that is not recorded as the VM is recreated.

:::

:::important

Due to the [Overhead Memory of Virtual Machine](#overhead-memory-of-virtual-machine), each VM needs some additional memory to work. Take this into account when setting the **Memory Limit**. For example, when the project **Memory Limit** is `24 Gi`, it is not possible to run 3 VMs that each have `8 Gi` of memory.

:::

## Overhead memory of virtual machine
Upon creating a virtual machine (VM), the VM controller seamlessly incorporates overhead resources into the VM's configuration. These additional resources intend to guarantee the consistent and uninterrupted functioning of the VM. It's important to note that configuring memory limits requires a higher memory reservation due to the inclusion of these overhead resources.
For more information, see [Memory Overhead](https://kubevirt.io/user-guide/virtual_machines/virtual_hardware/#memory-overhead).
For more information on how the memory overhead is calculated in Kubevirt, refer to the source code [GetMemoryOverhead](https://github.com/kubevirt/kubevirt/blob/e8e638edc22587ec7be2cc3d983b61763e33f973/pkg/virt-controller/services/renderresources.go#L299).

:::note

The `Overhead Memory` varies between Harvester releases (which ship different KubeVirt releases) because the backing components keep adding new features and fixing bugs, and therefore need more memory.

:::
## Automatic adjustment of ResourceQuota during migration
When the allocated resource quota controlled by the `ResourceQuota` object reaches its limit, migrating a VM becomes unfeasible. The migration process automatically creates a new pod mirroring the resource requirements of the source VM. If these pod creation prerequisites surpass the defined quota, the migration operation cannot proceed.
Please be aware of the following constraints of the automatic resizing of `ResourceQuota`:
- When raising the `ResourceQuota` value, if you create, start, or restore other VMs, Harvester will verify if the resources are sufficient based on the original `ResourceQuota`. If the conditions are not met, the system will alert that the migration process is not feasible.
- After expanding `ResourceQuota`, potential resource contention may occur between non-VM pods and VM pods, leading to migration failures. Therefore, deploying custom container workloads and VMs to the same namespace is not recommended.
- Due to the concurrent limitation of the webhook validator, the VM controller will execute a secondary validation to confirm resource sufficiency. If the resource is insufficient, it will auto config the VM's `RunStrategy` to `Halted`, and a new annotation `harvesterhci.io/insufficient-resource-quota` will be added to the VM object, informing you that the VM was shut down due to insufficient resources.