Replies: 3 comments
-
Thanks for the report, we are always looking to improve the expression. The OOM handler had a fix for the expression in one of the first 1.12.x releases, so I hope you're not running an outdated version.
The OOM handler doesn't check whether the pod is Cilium or not; it relies on scheduling classes, so if the only group (in the default setup) which doesn't get killed is
You can see the default expressions for triggering the OOM handler and for ranking here: https://docs.siderolabs.com/talos/v1.13/reference/configuration/runtime/oomconfig
Every "kill" is recorded into
-
Thanks for the insight, it’s really helpful, as always.
I suppose you are talking about 7ddb37b1f, which was shipped in
AFAIK, while it’s really important to closely monitor its resource usage, it’s not really recommended to set a hard memory limit on core pods like cilium, and even less recommended to set a CPU limit (throttling issues). So giving it the Guaranteed QoS class is not really an option?
Thanks, I did not know about that. It’s a bit hard to correlate these with actual OOMs though, since the OOMAction does not contain the cgroup it attempted to kill, nor the QoS class (edited: actually it does include the process name, my bad), nor details about the
EDIT: my bad, we can actually see the processes that get killed in the OOMAction; that was a misread on my part.
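To make the QoS point above concrete, here is a minimal sketch of the standard Kubernetes rule for deriving a pod's QoS class from its containers' requests and limits. It is only an illustration: the function name `qos_class` and the input shape are made up for this example, and quantities are compared as plain strings (so "1" and "1000m" would wrongly count as different).

```python
from typing import Dict, List

Resources = Dict[str, str]  # e.g. {"cpu": "100m", "memory": "128Mi"}


def qos_class(containers: List[Dict[str, Resources]]) -> str:
    """Return the Kubernetes QoS class for a pod (hypothetical helper).

    Each container is a dict with optional 'requests' and 'limits' maps.
    Only cpu and memory are considered, as in the Kubernetes rule.
    """
    any_set = False
    all_guaranteed = True
    for container in containers:
        requests = container.get("requests", {})
        limits = container.get("limits", {})
        for resource in ("cpu", "memory"):
            req, lim = requests.get(resource), limits.get(resource)
            if req is not None or lim is not None:
                any_set = True
            # Guaranteed needs a limit on every resource, and the request
            # (which defaults to the limit when omitted) must equal it.
            if lim is None or (req is not None and req != lim):
                all_guaranteed = False
    if not any_set:
        return "BestEffort"
    return "Guaranteed" if all_guaranteed else "Burstable"
```

With this rule, a pod that sets a memory request and limit but no CPU limit comes out as Burstable, which is why dropping the CPU limit and being Guaranteed cannot be combined.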
-
See also #13330 - this change should help.
-
Hi,
Yesterday we noticed some "weird" behaviour with the Talos OOM handler and were wondering whether it was expected or not.
Context:
We had a node that was using 85-90% of its RAM, and on top of that some pods started to spike, triggering the OOM handler.
We are not here to discuss the actual usage of the node (which was bad due to bad sizing on our part).
Nodes were running Talos 1.12 with default OOM settings. No specific configuration on this part.
Observations:
Here are the (filtered) logs of the OOM handler:
We can see two things:
- burstable/podc6eb0aa5-d30e-4daf-8ea8-16c3c2c76f36 was killed three times.
- besteffort/pod1988631d-7143-4b06-888e-54d5dc6ccca6 was killed repeatedly.

The first pod (burstable) was actually ... cilium.
So killing the cni pod actually made things way worse on the node, as some other pods started losing network connectivity and started piling up even more memory because they were unable to send the data they were holding. And at the time there were a lot of other burstable pods (more relevant in terms of memory usage) that could have been killed instead.

The second pod (the one that gets killed repeatedly) was actually the only besteffort pod on the node.
So as per the default configuration, it gets the highest "priority" for being killed by the handler. (btw it was a daemonset pod)
The usage of the pod was quite low and killing it again and again did not really help.
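To illustrate how we understood the behaviour, here is a rough sketch of a strict class-first ranking. This is not Talos's actual ranking expression (see the oomconfig reference linked in the reply above for that); the `CLASS_PRIORITY` table, the `Cgroup` type and `pick_victim` are hypothetical names used only to show why a lone besteffort pod would be chosen every time, however little memory it uses.

```python
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical class priorities: lower number = picked first.
# Classes missing from this table (e.g. guaranteed) are never picked.
CLASS_PRIORITY = {"besteffort": 0, "burstable": 1}


@dataclass
class Cgroup:
    name: str
    qos: str
    memory_bytes: int


def pick_victim(cgroups: List[Cgroup]) -> Optional[Cgroup]:
    """Pick the cgroup to kill: lowest class priority first,
    then the largest memory usage within that class."""
    candidates = [c for c in cgroups if c.qos in CLASS_PRIORITY]
    if not candidates:
        return None
    return min(candidates, key=lambda c: (CLASS_PRIORITY[c.qos], -c.memory_bytes))


# Example node: one small besteffort pod next to a much larger burstable one.
node = [
    Cgroup("besteffort/pod-small", "besteffort", 50 * 1024**2),
    Cgroup("burstable/pod-large", "burstable", 2 * 1024**3),
]
victim = pick_victim(node)
```

Under that assumption, `victim` is the small besteffort pod, and since a daemonset pod is recreated after each kill, the same choice would be made on every trigger without freeing much memory.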
In the end, the kernel OOM killer did its job and the node came back on its own.
I am wondering though, is the Talos OOM handler supposed to do this?
I would have expected it to have some kind of protection for important pods such as cilium (but I am not sure how the handler could actually know about this).
Also, is it expected that it tries to kill the same pod again and again, even considering priority?
We feel like we have not properly understood the intent of this OOM handler. Or have we not configured it properly? TBH I am not sure how to tell the OOM handler to not even try to kill cilium, for example.
While we see the value it brings, we are also considering the "chaos" that goes with it, and are wondering whether we should disable the handler in the end, as others have already done.
Thank you for your guidance 🙏