Sync master to feature/dynamic-firewalld-control #6662

BengangY · 2025-09-11T07:26:37Z

No description provided.

Desclare the string parameter holding unattend.xml as secret to avoid logging it. Signed-off-by: Christian Lindig <[email protected]> # Conflicts: # ocaml/idl/datamodel_vm.ml # ocaml/xapi/vm_sysprep.mli # ocaml/xapi/xapi_vm.mli

@lindig

This is simply a cherry pick of @lindig 's commit from #6547 now that we have fixed #6622. Let me know if anything more is needed. > CP-308455 VM.sysprep declare XML content as SecretString > Desclare the string parameter holding unattend.xml as secret to avoid logging it. > > Signed-off-by: Christian Lindig <[email protected]> Conflicts: - ocaml/idl/datamodel_vm.ml - ocaml/xapi/vm_sysprep.mli - ocaml/xapi/xapi_vm.mli

…rShell. Signed-off-by: Konstantina Chremmou <[email protected]>

…onf (#6586)" This reverts commit 05e6317, reversing changes made to 1fbdaae. Signed-off-by: Gabriel Buica <[email protected]>

@psafont

This reverts @psafont's PR, #6586, this causing the `/etc/resolve.conf` to be overwritten during update, after the dhclient writes it. This happens because `set_dns` no longer checks the mode before applying the DNS config, This reverts commit 05e6317, reversing changes made to 1fbdaae.

…rShell. (#6642)

Before shutting down, xapi calls `xapi_pre_shutdown` to execute several pre-shutdown scripts, which require the current host's UUID as a parameter. Currently, xapi obtains this UUID in a redundant manner: 1. It retrieves the UUID from the local inventory file. 2. It queries the database for the host's reference using the UUID. 3. It queries the database again for the host's UUID using the reference obtained in step 2. Steps 2 and 3 are unnecessary since the UUID is already available from step 1. Moreover, when the master stops, the slave fails to query the database, increasing xapi shutdown times on the slave. The solution is to directly use the UUID obtained in step 1, eliminating the redundant database queries. Signed-off-by: Bengang Yuan <[email protected]>

Absolute metric should work as follows: Timestamp = 300, 600, 900, 1200 Step = 300 seconds ABSOLUTE DS = 1, 2, 3, 4 But they do not seem to have ever worked correctly - they were previously divided by the interval but incorrectly handled NaNs, resulting in wrong behaviour. Then the refactoring (including 73ca3cc ("CA-404597: rrd - Pass Gauge and Absolute data source values as-is")) broke them further. Divide absolute metrics by the interval in process_ds_value. Fix the unit test to expect the right behaviour - it passes with this fix. Signed-off-by: Andrii Sultanov <[email protected]>

This is a simplified version of commit 846ce82. The feature introduced that commit is to remove the network device renaming functionality. When the feature is merged, there will be 3 states of releases: 1. without the feature. 2. with the feature but renaming is still working. 3. with the feature and renaming has been removed. Originally, the upgrade path was intended to support transitions from state 2 to state 3 only. However, it has become necessary to also support upgrades directly from state 1 to state 3. This commit is to enable the release in state 1 to upgrade to state 3. The change is kept extremely small so it can be merged independently without waiting for the full feature to be merged. In details, during the upgrade, the host-installer can't know the management interface as the eth<N> will not present in the environment the host-installer running. So it needs the MAC address to find out the management interface for setting up network. Signed-off-by: Ming Lu <[email protected]>

This is a simplified version of commit 846ce82. The feature introduced that commit is to remove the network device renaming functionality. When the feature is merged, there will be 3 states of releases: 1. without the feature. 2. with the feature but renaming is still working. 3. with the feature and renaming has been removed. Originally, the upgrade path was intended to support transitions from state 2 to state 3 only. However, it has become necessary to also support upgrades directly from state 1 to state 3. This commit is to enable the release in state 1 to upgrade to state 3. The change is kept extremely small so it can be merged independently without waiting for the full feature to be merged. In details, during the upgrade, the host-installer can't know the management interface as the eth<N> will not present in the environment the host-installer running. So it needs the MAC address to find out the management interface for setting up network.

Absolute metric should work as follows: Timestamp = 300, 600, 900, 1200 Step = 300 seconds ABSOLUTE DS = 1, 2, 3, 4 But they do not seem to have ever worked correctly - they were previously divided by the interval but incorrectly handled NaNs, resulting in wrong behaviour. Then the refactoring (including 73ca3cc ("CA-404597: rrd - Pass Gauge and Absolute data source values as-is")) broke them further. Divide absolute metrics by the interval in process_ds_value. Fix the unit test to expect the right behaviour - it passes with this fix. --- This means absolute metrics are now correctly calculated as unit/s, changing the behaviour in #6640 I am not 100% sure this fix goes all the way either - unit tests should cover expected values in RRAs as well, and the logic is still really confusing to me.

Cross-pool migrations set scheduled_to_be_resident_on to the destination host, reserving memory and vGPUs. If the VM is still halted when migration is finished, the field is not cleared on the destination, preserving the reservations even though they're not necessarily going to be used anytime soon. Call force_state_reset_keep_current_operations in pool_migrate_complete on the destination to clear the reservations among other things at the end of the migration. This fixes an issue when VMs migrated across pools in a halted state would take up memory in xapi's view (but not in RRD's view), which is not intuitive and could prevent further migrations from claiming enough free memory on the host. Signed-off-by: Andrii Sultanov <[email protected]>

This metric represents the rate of VM migrations with vGPUs per second. The total count can be calculated as AVERAGE * seconds. For example, for one-day data granularity, the total count for one day is AVERAGE * 86400, in which 86400 is the seconds of one day. Signed-off-by: Ming Lu <[email protected]>

see cgroups v2 "no internal processes" rule if cgroup.subtree_control is not empty, and we attach a pid to cgroup.procs, kernel would return EBUSY Signed-off-by: Chunjie Zhu <[email protected]>

) Cross-pool migrations set `scheduled_to_be_resident_on` to the destination host, reserving memory and vGPUs. If the VM is still halted when migration is finished, the field is not cleared on the destination, preserving the reservations even though they're not necessarily going to be used anytime soon. Call `force_state_reset_keep_current_operations` in `pool_migrate_complete` on the destination to clear the reservations among other things at the end of the migration. This fixes an issue when VMs migrated across pools in a halted state would take up memory in xapi's view (but not in RRD's view), which is not intuitive and could prevent further migrations from claiming enough free memory on the host. --- I've tested this manually, confirming that migrating halted and running VMs works as designed, with the `scheduled_to_be_resident_on` field cleared at the end of the migration (in the first case it's because of the fix, in the second one it's because the VM is actually started on the host)

Signed-off-by: Rob Hoes <[email protected]>

This metric represents the rate of VM migrations with vGPUs per second. The total count can be calculated as AVERAGE * seconds. For example, for one-day data granularity, the total count for one day is AVERAGE * 86400, in which 86400 is the seconds of one day.

Currently a manually disabled host will be re-enabled on toolstack restarts and host reboots, which will provoke VM migrations in an HA cluster. If maintenance requires many restarts, that could be painful. To allow for keeping a host persistently disabled across toolstack restarts and host reboots, add a new localdb flag "host_auto_enable" (set through the parameter on Host.disable). This coexists with the internal flag of host_disabled_until_reboot, which is only set on host poweroff internally and cannot be controlled by the user directly. With host_auto_enable set to false, xapi will not re-enable a host on its own no matter what: toolstack restarts, host reboots, calls to consider_enabling_host (triggered by PBD plugs etc.) will have no effect. Only a manual call to Host.enable will re-enable the host. Expose the new parameter in the CLI. Also fix up the comment in xapi_host_helpers.mli. Signed-off-by: Andrii Sultanov <[email protected]>

…6653)

Before shutting down, xapi calls `xapi_pre_shutdown` to execute several pre-shutdown scripts, which require the current host's UUID as a parameter. Currently, xapi obtains this UUID in a redundant manner: 1. It retrieves the UUID from the local inventory file. 2. It queries the database for the host's reference using the UUID. 3. It queries the database again for the host's UUID using the reference obtained in step 2. Steps 2 and 3 are unnecessary since the UUID is already available from step 1. Moreover, when the master stops, the slave fails to query the database, increasing xapi shutdown times on the slave. The solution is to directly use the UUID obtained in step 1, eliminating the redundant database queries.

see cgroups v2 "no internal processes" rule if cgroup.subtree_control is not empty, and we attach a pid to cgroup.procs, kernel would return EBUSY

* Use the decoder from the OCaml standard library instead of our own implementation, which this patch removes. * Validate UTF-8/XML conformance for maps and sets, in addition to strings. This is XSA-474 / CVE-2025-58146. Signed-off-by: Christian Lindig <[email protected]> Reviewed-by: Edwin Török <[email protected]>

Signed-off-by: Andrii Sultanov <[email protected]>

Currently `Host.disable` only disables the host for the current xapi's runtime - it will be re-enabled on toolstack restarts and host reboots (if other conditions are met). This greatly complicates manual maintenance in an HA cluster when reboots are needed, since VMs will start moving to the host as soon as it's enabled. To allow for keeping a host persistently disabled across toolstack restarts and host reboots, add a new localdb flag `host_auto_enable` (set through the parameter on Host.disable). This coexists with the internal flag of `host_disabled_until_reboot`, which is only set on host poweroff internally and cannot be controlled by the user directly. With `host_auto_enable` set to false, xapi will not re-enable a host on its own no matter what: toolstack restarts, host reboots, calls to consider_enabling_host (triggered by PBD plugs etc.) will have no effect. Only a manual call to `Host.enable` will re-enable the host. Expose the new parameter in the CLI. Also fix up the comment in `xapi_host_helpers.mli`. I've verified the new functionality manually.

* Use the decoder from the OCaml standard library instead of our own implementation, which this patch removes. * Validate UTF-8/XML conformance for maps and sets, in addition to strings. This is XSA-474 / CVE-2025-58146. Reviewed-by: Edwin Török <[email protected]> Patch from: https://xenbits.xen.org/xsa/advisory-474.html

Christian Lindig and others added 26 commits August 28, 2025 16:06

CP-308455 VM.sysprep declare XML content as SecretString

8c5053a

Desclare the string parameter holding unattend.xml as secret to avoid logging it. Signed-off-by: Christian Lindig <[email protected]> # Conflicts: # ocaml/idl/datamodel_vm.ml # ocaml/xapi/vm_sysprep.mli # ocaml/xapi/xapi_vm.mli

CP-308539: Updated certificate validation to support .NET 8.0 in Powe…

ddaf4ce

…rShell. Signed-off-by: Konstantina Chremmou <[email protected]>

Revert "xapi/nm: Send non-empty dns to networkd when using IPv6 autoc…

ec61f8e

…onf (#6586)" This reverts commit 05e6317, reversing changes made to 1fbdaae. Signed-off-by: Gabriel Buica <[email protected]>

CP-308539: Updated certificate validation to support .NET 8.0 in Powe…

cc2f09e

…rShell. (#6642)

CA-416516: vm.slice/cgroup.procs write operation gets EBUSY

e343dfe

see cgroups v2 "no internal processes" rule if cgroup.subtree_control is not empty, and we attach a pid to cgroup.procs, kernel would return EBUSY Signed-off-by: Chunjie Zhu <[email protected]>

CA-367765: remove reference to obsolete default URL

ec8f3d6

Signed-off-by: Rob Hoes <[email protected]>

Remove obsolete test script

3fecb75

Signed-off-by: Rob Hoes <[email protected]>

CA-367765: remove reference to obsolete default URL and test script (#…

4412cea

…6653)

CA-416516: vm.slice/cgroup.procs write operation gets EBUSY (#6650)

3cc9241

see cgroups v2 "no internal processes" rule if cgroup.subtree_control is not empty, and we attach a pid to cgroup.procs, kernel would return EBUSY

Adjust quality-gate.sh

6aa075a

Signed-off-by: Andrii Sultanov <[email protected]>

BengangY marked this pull request as ready for review September 11, 2025 07:54

changlei-li approved these changes Sep 11, 2025

View reviewed changes

last-genius approved these changes Sep 11, 2025

View reviewed changes

BengangY merged commit 343972b into feature/dynamic-firewalld-control Sep 11, 2025
73 of 74 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Sync master to feature/dynamic-firewalld-control #6662

Sync master to feature/dynamic-firewalld-control #6662

Uh oh!

BengangY commented Sep 11, 2025

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

8 participants

Sync master to feature/dynamic-firewalld-control #6662

Sync master to feature/dynamic-firewalld-control #6662

Uh oh!

Conversation

BengangY commented Sep 11, 2025

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

8 participants