Heya! The status was always marked as CrashLoopBackOff or somesuch, but I noticed at some point that the message briefly flashed "OOMKilled". Looking at the image-automation-controller deployment manifest, I saw that the memory request was 64 MB, with a limit of 1 GB. I manually edited it to double that (128 MB request, 2 GB limit) and now I can get my services to update again. Hurray!...? So my questions are:
Thanks!
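For anyone who wants to confirm that restarts like these are memory-related: CrashLoopBackOff is only the waiting state, while the reason for the previous termination is recorded in the pod status. Below is a minimal sketch that scans pod JSON (for example, saved output of `kubectl get pod <pod-name> -o json`) for containers terminated with `OOMKilled`. The sample data and the container name `manager` are made up for illustration, not taken from this cluster.

```python
import json


def oom_killed_containers(pod: dict) -> list[str]:
    """Return names of containers whose current or previous state
    is a termination with reason OOMKilled."""
    names = []
    statuses = (pod.get("status") or {}).get("containerStatuses") or []
    for cs in statuses:
        for key in ("state", "lastState"):
            terminated = (cs.get(key) or {}).get("terminated") or {}
            if terminated.get("reason") == "OOMKilled":
                names.append(cs["name"])
                break  # count each container only once
    return names


# Hypothetical, trimmed-down pod status used purely as example input.
sample_pod = json.loads("""
{
  "status": {
    "containerStatuses": [
      {
        "name": "manager",
        "restartCount": 7,
        "state": {"waiting": {"reason": "CrashLoopBackOff"}},
        "lastState": {"terminated": {"reason": "OOMKilled", "exitCode": 137}}
      }
    ]
  }
}
""")

print(oom_killed_containers(sample_pod))
```

A non-empty result means the container was killed for exceeding its memory limit at least once, which matches the "OOMKilled" flash described above.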
Replies: 1 comment
The current version and the version you have upgraded to both come after the recent transition in which we removed libgit2 (the C library) in favor of go-git. That simplified a lot of things and removed many opportunities for memory leaks to sneak in.
This should perform better, but it is possible that it has higher memory usage at baseline. Of the changes you made (increasing the request and increasing the limit), only one was likely to improve the situation: increasing the limit. The OOM condition is reached when the controller hits the memory limit. It is possible, but unlikely in most conditions, that increasing the request could help (only if your cluster is extremely tight for memory all around), and if your cluster is constrained for memory, that's a different issue.
There is a direct relationship between the repository size and the amount of memory used by the controller. If you use Git config repos (rather than storing all of your manifests together in a monorepo or in the application repo, where there's a lot of extraneous data), then you can reduce the size of the Git repo, and Image Update Automation's job will be easier with respect to memory load. Techniques such as shallow clone and sparse checkout cannot be used for Image Update Automation: it needs to make a commit, and it needs the whole clone in order to produce one. So if your repository is large, the performance is going to be poor.
But it sounds like a performance regression, so maybe there's something we can do about that yet. If you have some hard numbers: which version it started crashing on (I suspect it will be the first version that dropped libgit2), what the repository size is, what it looks like in
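One way to get a hard number for the repository size is to total the bytes under the `.git` directory of a full local clone. This is a rough sketch only: on-disk size is a proxy, not a measurement of what the controller actually holds in memory, and the helper names `dir_size_bytes` and `human` are invented for this example.

```python
import os
import sys


def dir_size_bytes(path: str) -> int:
    """Sum the sizes of all regular files under path, skipping symlinks."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            file_path = os.path.join(root, name)
            if not os.path.islink(file_path):
                total += os.path.getsize(file_path)
    return total


def human(num_bytes: float) -> str:
    """Format a byte count using binary units, capped at GiB."""
    for unit in ("B", "KiB", "MiB", "GiB"):
        if num_bytes < 1024 or unit == "GiB":
            return f"{num_bytes:.1f} {unit}"
        num_bytes /= 1024


if __name__ == "__main__":
    # Pass the path to a full (non-shallow) clone; defaults to the current directory.
    repo = sys.argv[1] if len(sys.argv) > 1 else "."
    git_dir = os.path.join(repo, ".git")
    print(f"{git_dir}: {human(dir_size_bytes(git_dir))}")
```

Running it against a fresh full clone of the repository that Image Update Automation writes to gives a figure that can be reported alongside the controller version and the configured memory limit.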