GC in container with multiple processes #84828
-
We're running our application (.NET 7.0.4, SDK 7.0.103) in a container (k8s) with a cgroups memory limit. The .NET GC defaults to 75% of this limit for its heap, all good.

Our app uses a couple of child processes for some isolated workloads, and as far as I can tell they do not collaborate well to fit within that memory limit. I say this because we have some OOM issues where the symptom is the container being terminated, either by the OOM killer of the Linux host node or by the k8s scheduler. This does not happen for a container with a single process; there we instead get an OutOfMemoryException from the GC when breaching that 75% HeapHardLimitPercent.

How does the GC handle the memory limit above in a multiprocess scenario within containers? I don't believe there is a way to calculate

Caveats: We do a lot of Roslyn compilation in both processes, which might cause a lot of native memory allocation that could impact this analysis. Also, we are potentially seeing some issues here #78959 (comment) with the .NET 7 GC not freeing up the managed heap's available space properly (maybe).

I'm thinking about how to solve this issue, and the obvious but even more painful answer is to move these extra processes to their own containers so that only one .NET process runs in each container. While we're already using gRPC, there is still a lot of work to do with scheduling these processes, which are not static and are continuously recycled. Perhaps we're getting something else wrong with the GC settings, and memory limits can be obeyed properly in a multiprocess .NET container?

The other obvious solution is to do something like 70% main process and 10% sub processes, but that would still need a certain amount of math to work with the scaling of the container's available resources (and any mistake bringing up an extra process leaves the whole container vulnerable to being terminated).
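To make the "certain amount of math" in the 70%/10% idea concrete, here is a minimal sketch of how a launcher could turn a container limit into per-process heap limits. It assumes the limits are passed through the `DOTNET_GCHeapHardLimit` environment variable, whose value the .NET documentation describes as hexadecimal. The helper name `heap_limit_env`, the process names, and the 2 GiB limit are hypothetical and only for illustration. The GC heap limit does not cover native allocations (such as those suspected from Roslyn above), so the shares deliberately leave headroom.

```python
# Hypothetical helper: the function name, process names, and percentages
# are illustrative assumptions, not part of any .NET tooling.

def heap_limit_env(container_limit_bytes: int,
                   percent_by_process: dict[str, int]) -> dict[str, dict[str, str]]:
    """Map each process name to the environment variables for its heap limit."""
    total = sum(percent_by_process.values())
    if total > 100:
        raise ValueError(f"shares add up to {total}%, which exceeds the container limit")

    env_by_process = {}
    for name, percent in percent_by_process.items():
        # Integer math avoids floating-point rounding surprises.
        limit_bytes = container_limit_bytes * percent // 100
        # Assumption: the environment variable is read as a hexadecimal number.
        env_by_process[name] = {"DOTNET_GCHeapHardLimit": format(limit_bytes, "x")}
    return env_by_process


if __name__ == "__main__":
    container_limit = 2 * 1024**3  # example only: a 2 GiB cgroup limit
    for name, env in heap_limit_env(container_limit, {"main": 70, "child": 10}).items():
        print(name, env)
```

Each returned dictionary would be merged into the environment of the process being started. Because the shares are recomputed from the container limit, they scale with the container, but every extra child still has to be accounted for in the share table before it is launched.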
Replies: 2 comments 1 reply
-
cc @Maoni0
In early .NET Core versions, the GC relied on cgroup usage versus the limit to trigger GCs. That didn't work too well.
With 3.0, heap limits were introduced, which keep a single process nicely within the limit.
If you're running multiple .NET apps in a container, they will have the same issues that single-process containers had before 3.0.
cc @richlander @janvorli
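The "cgroup usage vs limit" comparison mentioned above is a container-wide number, so it can also be observed from outside the GC. The following is a rough sketch of reading it, assuming a cgroup v2 hierarchy mounted at `/sys/fs/cgroup` (where `memory.current` holds the usage and `memory.max` holds the limit, or the literal `max` when unlimited). The function names are hypothetical, and this is not how the GC itself implements its checks.

```python
from pathlib import Path
from typing import Optional, Tuple


def read_cgroup_memory(cgroup_dir: str = "/sys/fs/cgroup") -> Tuple[int, Optional[int]]:
    """Return (usage_bytes, limit_bytes) from cgroup v2 files; limit is None if unlimited."""
    root = Path(cgroup_dir)
    usage = int((root / "memory.current").read_text().strip())
    raw_limit = (root / "memory.max").read_text().strip()
    # cgroup v2 writes the literal string "max" when no limit is set.
    limit = None if raw_limit == "max" else int(raw_limit)
    return usage, limit


def usage_fraction(usage: int, limit: Optional[int]) -> Optional[float]:
    """Fraction of the limit currently in use, or None when there is no limit."""
    return None if limit is None else usage / limit
```

A supervising process could poll `usage_fraction(*read_cgroup_memory())` and hold off on starting another child when the container as a whole is already close to its limit, since no single process's heap limit reflects what the others are using.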