title | description | ms.date | ms.reviewer | ms.service | ms.custom |
---|---|---|---|---|---|
Troubleshoot High Memory Consumption in Disk-Intensive Applications | Identify and resolve issues causing excessive memory usage due to Linux kernel behavior in pods. | 05/10/2025 | claudiogodoy | azure-kubernetes-service | sap:Node/node pool availability and performance |
Disk input and output operations are costly, so most operating systems implement caching strategies for reading and writing data to the file system. The Linux kernel employs several such strategies, including the page cache, whose primary goal is to keep data that's read from the file system in memory so that it's available for future read operations.
This article helps you identify when this kernel behavior leads to a high memory consumption pattern in a pod.
- The Kubernetes kubectl tool, or a similar tool to connect to the cluster. To install kubectl by using Azure CLI, run the az aks install-cli command.
The following table outlines the common symptoms of memory saturation.
Symptom | Description |
---|---|
working_set metric too high | There's a large difference between the working_set value that the Kubernetes metrics API reports and the amount of memory that the application actually consumes. |
Out-of-memory (OOM) kill | An OOM kill indicates that the pod is experiencing memory issues. |
If you don't already know which pod is consuming too much memory, see Troubleshoot memory saturation in AKS clusters.
The kubectl top pods command shows the working_set value that the Kubernetes metrics API reports.
$ kubectl top pods -A | grep -i "<DEPLOYMENT_NAME>"
NAME CPU(cores) MEMORY(bytes)
my-deployment-fc94b7f98-m9z2l 1m 344Mi
You can inspect the memory statistics of the pod's cgroup by following these steps:
- Connect to the pod by using the kubectl exec command.
- Navigate to the directory that contains the cgroup statistics.
- List all memory-related files:
$ kubectl exec <POD_NAME> -it -- bash
$ ls /sys/fs/cgroup | grep -e memory.stat -e memory.current
memory.current memory.stat
- memory.current: The total amount of memory currently being used by the cgroup and its descendants.
- memory.stat: A breakdown of the cgroup's memory footprint into different types of memory, type-specific details, and other information about the state and past events of the memory management system.
All the values listed in those files are in bytes. Run the following commands to get an overview of how the memory consumption is distributed on the pod:
$ cat /sys/fs/cgroup/memory.current
10645012480
$ cat /sys/fs/cgroup/memory.stat
anon 5197824
inactive_anon 5152768
active_anon 8192
...
file 10256240640
active_file 32768
inactive_file 10256207872
...
slab 354682456
slab_reclaimable 354554400
slab_unreclaimable 128056
...
cAdvisor uses memory.current and inactive_file to compute the working_set metric. You can replicate the calculation by using the following formula:
working_set = (memory.current - inactive_file) / 1048576
= (10645012480 - 10256207872) / 1048576
= 370 MB
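As a quick check, you can reproduce the same calculation directly from the cgroup files inside the pod. The following is a minimal shell sketch that assumes the cgroup v2 paths shown earlier:
CURRENT=$(cat /sys/fs/cgroup/memory.current)
INACTIVE_FILE=$(awk '/^inactive_file / {print $2}' /sys/fs/cgroup/memory.stat)
# working_set = memory.current - inactive_file, converted to MiB
echo "$(( (CURRENT - INACTIVE_FILE) / 1048576 )) MiB"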
As mentioned in the title, this issue may occur in applications that perform frequent file system operations. The kernel typically uses strategies such as the page cache to improve the overall performance of those operations.
The following table describes some of the memory segments shown earlier:
Segment | Description |
---|---|
anon | Amount of memory used in anonymous mappings. Most language runtimes use this segment to allocate memory. |
file | Amount of memory used to cache filesystem data, including tmpfs and shared memory. |
slab | Amount of memory used for storing in-kernel data structures. |
Most language runtimes allocate memory from the anon segment. In this case, anon accounts for 5197824 bytes, which is nowhere near the total that the working_set metric reports.
On the other hand, the slab segment, which the kernel uses for its own data structures, accounts for 354682456 bytes. That's almost all the memory that the working_set metric reports for this pod.
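To pull out just these segment totals without reading the whole file, you can filter memory.stat from the same kubectl exec session. For example, based on the output shown earlier:
$ grep -E '^(anon|file|slab) ' /sys/fs/cgroup/memory.stat
anon 5197824
file 10256240640
slab 354682456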
Note
The following steps might lead to availability and performance issues. Don't run them in a production environment.
Get the node name by using the kubectl get pod command:
$ kubectl get pod -A -o wide | grep "POD_NAME"
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
my-deployment-fc94b7f98-m9z2l 1/1 Running 0 37m 10.244.1.17 aks-agentpool-26052128-vmss000004 <none> <none>
Create a debugger pod by using the kubectl debug command, and then start a session on the node:
kubectl debug node/<NODE_NAME> -it --image=mcr.microsoft.com/cbl-mariner/busybox:2.0
chroot /host
Run the following command to drop the kernel cache:
echo 1 > /proc/sys/vm/drop_caches
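Optionally, as a quick sanity check, you can compare the node's free and cached memory before and after you run the command. The values come from /proc/meminfo on the node:
grep -E '^(MemFree|Cached)' /proc/meminfo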
To determine whether the command had any effect, go through steps 1 and 2 again:
$ kubectl top pods -A | grep -i "<DEPLOYMENT_NAME>"
NAME CPU(cores) MEMORY(bytes)
my-deployment-fc94b7f98-m9z2l 1m 4Mi
$ kubectl exec <POD_NAME> -it -- cat /sys/fs/cgroup/memory.stat
anon 4632576
file 1781760
...
slab_reclaimable 219312
slab_unreclaimable 173456
slab 392768
If you see a similar result, a large decrease in both the working_set metric and the slab memory segment, then a significant amount of the pod's memory is being used by the kernel cache.
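If you want to watch the kernel rebuild the cache as the application resumes its disk activity, you can rerun the metrics query periodically. For example, the following sketch assumes that the watch utility is available on your workstation:
watch -n 30 "kubectl top pods -A | grep -i <DEPLOYMENT_NAME>"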
The only effective workaround for this high memory consumption pattern is to set realistic resource requests and limits on your pods. By configuring appropriate memory requests and limits in the pod specification, you ensure that Kubernetes manages memory allocation more efficiently, which mitigates the impact of excessive kernel-level caching on pod memory usage. For example:
resources:
requests:
memory: 30Mi
limits:
memory: 60Mi
Note
A misconfiguration of pod memory limits may lead to problems such as OOM kills.
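As an alternative to editing the manifest, you can apply equivalent values to an existing deployment by using the kubectl set resources command. The deployment name in this sketch is a placeholder:
kubectl set resources deployment <DEPLOYMENT_NAME> --requests=memory=30Mi --limits=memory=60Mi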
- Learn more about Azure Kubernetes Service (AKS) best practices
- Monitor your Kubernetes cluster performance with Container insights
[!INCLUDE Third-party information disclaimer]
[!INCLUDE Third-party contact information disclaimer]
[!INCLUDE Azure Help Support]