Changes the value of config vm_memballoon_stats_period to 60 #8520
Conversation
Codecov Report: All modified and coverable lines are covered by tests ✅

Additional details and impacted files (coverage diff between main and #8520):

| | main | #8520 | +/- |
|---|---:|---:|---:|
| + Coverage | 18.28% | 30.80% | +12.52% |
| - Complexity | 16826 | 33998 | +17172 |
| Files | 4848 | 5341 | +493 |
| Lines | 324301 | 375027 | +50726 |
| Branches | 45561 | 54554 | +8993 |
| + Hits | 59290 | 115525 | +56235 |
| + Misses | 256437 | 244211 | -12226 |
| - Partials | 8574 | 15291 | +6717 |

Flags with carried forward coverage won't be shown. View full report in Codecov by Sentry.
@andrijapanicsb @alexandremattioli any ideas on this?
This value can be set manually, or by Ansible/Chef/Puppet.
Many of the global settings are set to a certain value, many of them to 0, and can be easily changed (like most of the other global settings). I prefer to keep it as it is and allow the operator to decide what to change.
The value of this configuration was originally proposed as 60. BTW: using the value 0 makes the monitoring stop working in KVM.
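To illustrate what "stop working" means in practice, here is a minimal sketch (the domain name is a placeholder, and the exact counters depend on the QEMU/libvirt versions): without a stats period, the balloon driver does not refresh guest memory statistics, so libvirt typically only reports the balloon size.

```bash
# With the stats period at 0, only the balloon size (and host-side RSS) is
# typically reported; guest usage cannot be derived from this:
virsh dommemstat i-2-10-VM
#   actual 4194304
#   rss 1856432
#
# With a stats period configured (e.g. 60), the guest-reported counters that
# ACS needs appear as well, for example:
#   unused 2803372
#   available 4193848
#   last_update 1706000000
```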
We are free to change the default value in a PR, or before an official release. Once the configuration is included in an official release, we should be careful about changing the default value. @RodrigoDLopez
Worth mentioning that I have faced an issue caused by it in the past: #8453 (comment). Although @shaerul said it worked well in his testing, I think it would be better to enable/disable memory ballooning and statistics at the VM level.
(For what it's worth, I've been told by a RH engineer (in 2019) that memory auto-ballooning is an abandoned project, so be careful with the feature.)
I believe that metrics monitoring should work without operators having to worry about configurations and enabling/disabling the feature. Changing the configuration value from '60' to '0' compromised the functionality for the KVM hypervisor, which in turn can make it seem as if the feature does not work out of the box. An example of this is operators lacking visibility regarding the necessity of adding this configuration to the agent.properties file (see: #8453 (comment)). From my perspective, using the value '60' resolves issues similar to those faced by @shaerul in #8453. What are the main impacts you foresee with this change? Regarding the issue encountered during the migration of a Windows instance with the memory balloon enabled (#8453 (comment)), could you explain the error scenario in detail? This would allow me to attempt to reproduce it locally and, if necessary, extend this PR to address the issue related to the migration of Windows VMs with the memory balloon feature enabled.
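For reference, a minimal sketch of the per-host workaround mentioned above; the property key is an assumption based on the discussion around #8453 and should be verified against the agent.properties shipped with the ACS version in use:

```bash
# On the KVM host: set the memballoon stats period in agent.properties and
# restart the agent so that VMs started afterwards pick up the new period.
echo "vm.memballoon.stats.period=60" >> /etc/cloudstack/agent/agent.properties
systemctl restart cloudstack-agent
```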
In my opinion, the wrong memory usage is not a critical issue for operators.
As I said, I have faced an issue in the past. I am not sure if the issue was caused by the Windows VirtIO drivers or by QEMU/libvirt.
I do not remember the details very clearly. The KVM host was Ubuntu 18 or 20.
@andrijapanicsb @weizhouapache, what are your concerns with this change? Without it, a nice ACS feature that enables users to monitor their VMs via the UI does not work out of the box for KVM deployments, which can cause disappointment and misunderstandings like the ones already seen on the mailing list and in GitHub issues.
Can you please see my comments above? @GutoVeronezi
@weizhouapache you said you have faced an issue in the past, but did not specify how to reproduce it or its details. On the other hand, you said that you are not sure if the issue was caused by the VirtIO drivers; therefore, we do not have a solid reason not to change it.
@GutoVeronezi Anyone who wants to enable the memory stats collection and is happy to take the risk can go for it. ACS already provides the option to do it.
@weizhouapache do you have steps to reproduce it?
@GutoVeronezi
But then we do not have a solid reason not to change the value, just speculation. If we could at least reproduce the error you say exists, we could look for the real cause and solve the problem (or at least find an explanation).
@GutoVeronezi changing vm_memballoon_stats_period to 60 will add a new setting in the VM definition, which will probably cause some regression that is out of our control. We should disable it by default.
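To make the concern concrete, the change surfaces in the libvirt domain XML as a stats period on the memballoon device; this can be checked on the host (a sketch; the domain name below is a placeholder):

```bash
# Inspect the balloon device of a running instance on the KVM host:
virsh dumpxml i-2-10-VM | grep -A 2 "<memballoon"
# With the period set to 60, the definition is expected to contain:
#   <memballoon model='virtio'>
#     <stats period='60'/>
#   </memballoon>
```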
But again, it is speculation. We have solid use cases of operators having to contact the community to understand why their Windows VMs show wrong stats; and every time, the solution is to change this configuration. Please bring solid use cases so we can discuss it further.
interesting
Description
Currently, the default value for the `vm_memballoon_stats_period` configuration is set to `0`. As a result, out of the box in ACS, memory metrics for Windows instances (with VirtIO drivers correctly installed and configured) are compromised when using the KVM hypervisor. This pull request proposes setting the `vm_memballoon_stats_period` configuration value to `60`. This correction addresses situations such as the one reported in issue #8453.

Fixes: #8453
Screenshots (if appropriate):
Before changes:

After changes:

How Has This Been Tested?
A Windows VM was created, and it was confirmed that the VirtIO drivers were correctly installed. However, upon inspecting the metrics collected by ACS, it was observed that they did not align with the actual consumption of the VM. Subsequently, after changing the configuration value from `0` to `60`, the metrics collected by ACS matched the consumption values reported by the VM.
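One way to cross-check a result like this, assuming CloudMonkey (cmk) is configured against the management server; the VM id and domain name below are placeholders:

```bash
# Memory metrics as reported by ACS (what the UI shows):
cmk list virtualmachinesmetrics id=9a1b2c3d-placeholder-id

# Guest-side statistics exposed by the balloon driver on the KVM host,
# for comparison with the values above:
virsh dommemstat i-2-10-VM
```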