- Release Signoff Checklist
- Summary
- Motivation
- Proposal
- Design Details
- Implementation History
- Drawbacks
- Alternatives
Items marked with (R) are required prior to targeting to a milestone / release.
- (R) Enhancement issue in release milestone, which links to KEP dir in kubernetes/enhancements (not the initial KEP PR)
- (R) KEP approvers have approved the KEP status as implementable
- (R) Design details are appropriately documented
- (R) Test plan is in place, giving consideration to SIG Architecture and SIG Testing input (including test refactors)
- e2e Tests for all Beta API Operations (endpoints)
- (R) Ensure GA e2e tests meet requirements for Conformance Tests
- (R) Minimum Two Week Window for GA e2e tests to prove flake free
- (R) Graduation criteria is in place
- (R) all GA Endpoints must be hit by Conformance Tests
- (R) Production readiness review completed
- (R) Production readiness review approved
- "Implementation History" section is up-to-date for milestone
- User-facing documentation has been created in kubernetes/website, for publication to kubernetes.io
- Supporting documentation—e.g., additional design documents, links to mailing list discussions/SIG meetings, relevant PRs/issues, release notes
Split, clean, and rotate container logs to avoid disk pressure on the kubelet host.
- We manage the Kubernetes ecosystem at our organization. Many of our kubelet hosts experienced disk pressure because a certain set of pods was generating logs at a very high rate, around 3-4 GiB every 15 minutes. We had the kubelet config containerLogMaxSize set to 200Mi and containerLogMaxFiles set to 6, yet the .gz files (compressed rotated pod log files) grew to around 500-600 GiB. We observed that container log rotation was too slow to keep up.
We expect the log file size to always stay under the configured kubelet limit (i.e., containerLogMaxSize), which would prevent such disk pressure issues in the future.
It often happens that containers generating heavy log data end up with compressed log files whose size exceeds the containerLogMaxSize limit set in the kubelet config.
For example, the kubelet is configured with:
- containerLogMaxSize: 200Mi
- containerLogMaxFiles: 6

A Job continuously generating about 10 MiB per iteration with a 0.1-second sleep in between:
```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: generate-huge-logs
spec:
  template:
    spec:
      containers:
      - name: log-generator
        image: busybox
        command: ["/bin/sh", "-c"]
        args:
        - |
          # Generate huge log entries to stdout
          start_time=$(date +%s)
          log_size=0
          target_size=$((4 * 1024 * 1024 * 1024)) # 4 GiB target size in bytes
          while [ $log_size -lt $target_size ]; do
            # Write about 10 MiB of random data to stdout
            echo "Generating huge log entry at $(date) - $(dd if=/dev/urandom bs=10M count=1 2>/dev/null)"
            log_size=$((log_size + 10 * 1024 * 1024)) # Increment by the ~10 MiB just written
            sleep 0.1 # Sleep to control log generation speed
          done
          end_time=$(date +%s)
          echo "Log generation completed in $((end_time - start_time)) seconds"
      restartPolicy: Never
  backoffLimit: 4
```
Resulting log file sizes on the node:

```
-rw-r----- 1 root root 24142862  Jan 1 11:41 0.log
-rw-r--r-- 1 root root 183335398 Jan 1 11:40 0.log.20250101-113948.gz
-rw-r--r-- 1 root root 364144934 Jan 1 11:40 0.log.20250101-114003.gz
-rw-r--r-- 1 root root 487803789 Jan 1 11:40 0.log.20250101-114023.gz
-rw-r--r-- 1 root root 577188544 Jan 1 11:41 0.log.20250101-114047.gz
-rw-r----- 1 root root 730449620 Jan 1 11:41 0.log.20250101-114115
```
The same Job, but generating about 10 MiB per iteration with a 10-second sleep in between:
```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: generate-huge-logs
spec:
  template:
    spec:
      containers:
      - name: log-generator
        image: busybox
        command: ["/bin/sh", "-c"]
        args:
        - |
          # Generate huge log entries to stdout
          start_time=$(date +%s)
          log_size=0
          target_size=$((4 * 1024 * 1024 * 1024)) # 4 GiB target size in bytes
          while [ $log_size -lt $target_size ]; do
            # Write about 10 MiB of random data to stdout
            echo "Generating huge log entry at $(date) - $(dd if=/dev/urandom bs=10M count=1 2>/dev/null)"
            log_size=$((log_size + 10 * 1024 * 1024)) # Increment by the ~10 MiB just written
            sleep 10 # Sleep to control log generation speed
          done
          end_time=$(date +%s)
          echo "Log generation completed in $((end_time - start_time)) seconds"
      restartPolicy: Never
  backoffLimit: 4
```
Resulting log file sizes on the node:

```
-rw-r----- 1 root root 181176268 Jan 1 11:31 0.log
-rw-r--r-- 1 root root 183336647 Jan 1 11:20 0.log.20250101-111730.gz
-rw-r--r-- 1 root root 183323382 Jan 1 11:23 0.log.20250101-112026.gz
-rw-r--r-- 1 root root 183327676 Jan 1 11:26 0.log.20250101-112321.gz
-rw-r--r-- 1 root root 183336376 Jan 1 11:29 0.log.20250101-112616.gz
-rw-r----- 1 root root 205360966 Jan 1 11:29 0.log.20250101-112911
```
If a pod generates logs at gigabyte scale with minimal delay, it can cause disk pressure on the kubelet host, which can in turn affect other pods running on the same node.
- Every kubelet runs a ContainerLogManager. It runs a goroutine in an infinite loop that checks the size of each container's active log file (the file through which all container log reads and writes happen). If that size exceeds the limit mentioned above (containerLogMaxSize), it hands the container off to parallel workers. Each worker compresses the log into a .gz file, deletes old compressed files until at most containerLogMaxFiles files remain in total, and creates a new active file for the container (a simplified sketch follows this list).
- The goal is to split a large active log file into pieces of at most containerLogMaxSize, and then perform the rest of the operations the ContainerLogManager already does.
- Container log rotation should keep working as it does today, but before rotating a file the kubelet will ensure it is under the configured size limit. This way, every compressed log file a container leaves on the host is guaranteed to be under containerLogMaxSize, which helps avoid disk pressure on the host.
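The flow described above can be summarized with the following simplified sketch. It is not the actual kubelet code; the function name rotateIfNeeded and its structure are illustrative stand-ins for the logic in pkg/kubelet/logs/container_log_manager.go.

```go
package containerlogs

import (
	"fmt"
	"os"
	"time"
)

// rotateIfNeeded is an illustrative stand-in for the per-container work done
// by the ContainerLogManager's workers: if the active log file has grown past
// maxSize, rename it with a timestamp suffix so the runtime reopens a fresh
// active file. The real manager additionally gzips rotated files and prunes
// the oldest ones until at most maxFiles files remain.
func rotateIfNeeded(activePath string, maxSize int64, maxFiles int) error {
	info, err := os.Stat(activePath)
	if err != nil {
		return err
	}
	if info.Size() < maxSize {
		return nil // still under the limit, nothing to do
	}
	rotated := fmt.Sprintf("%s.%s", activePath, time.Now().Format("20060102-150405"))
	return os.Rename(activePath, rotated)
}
```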
We do not see any risks at this time.
- Implement a new function (splitAndRotateLatestLog) to be called from the rotateLog function (https://github.com/kubernetes/kubernetes/blob/master/pkg/kubelet/logs/container_log_manager.go#L235-L257).
- rotateLog is called by each worker for the container assigned to it.
- It first performs a cleanup, deleting leftover non-rotated files (rotated logs carry a timestamp suffix in their name) and any .tmp files generated (and not removed) during the last log rotation.
- It then compresses the rotated files that are not yet compressed (it does not touch the active file) and deletes the oldest rotated files until containerLogMaxFiles-2 files are left. This is because the active file will subsequently be rotated and a new active file created, which brings the total back up to containerLogMaxFiles.
- In the new design, before that step the worker will first split the large active log file into pieces of containerLogMaxSize and name them with a .part suffix.
- Say the split produces n parts: the worker can then rotate the first n-1 parts, keep the nth (last) part as the new active file, and run the deletion step as before (essentially the rotate-and-delete step above). A hypothetical sketch of the split step follows this list.
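The following is a hypothetical sketch of the proposed split step, not existing kubelet code. The function name splitActiveLog and the ".partN" naming are assumptions based on the description above; a real implementation would likely split on log-line boundaries rather than raw byte offsets so that no log entry is cut in half.

```go
package containerlogs

import (
	"fmt"
	"io"
	"os"
)

// splitActiveLog splits the file at path into chunks of at most maxSize bytes,
// written as "<path>.partN", and returns the paths of the parts created. The
// caller would then rotate/compress all parts except the last one and keep the
// last part as the new active file, as described in the steps above.
func splitActiveLog(path string, maxSize int64) ([]string, error) {
	src, err := os.Open(path)
	if err != nil {
		return nil, err
	}
	defer src.Close()

	var parts []string
	for i := 0; ; i++ {
		partPath := fmt.Sprintf("%s.part%d", path, i)
		dst, err := os.Create(partPath)
		if err != nil {
			return parts, err
		}
		// Copy at most maxSize bytes of the active log into this part.
		n, copyErr := io.CopyN(dst, src, maxSize)
		dst.Close()
		if n == 0 {
			os.Remove(partPath) // nothing was left to copy
			break
		}
		parts = append(parts, partPath)
		if copyErr == io.EOF {
			break // reached the end of the active log
		}
		if copyErr != nil {
			return parts, copyErr
		}
	}
	return parts, nil
}
```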
[X] I/we understand the owners of the involved components may require updates to existing tests to make this code solid enough prior to committing the changes necessary to implement this enhancement.
- Add detailed unit tests with 100% coverage.
- <package>: <date> - <test coverage>
- Scenarios will be covered in e2e tests.
- Add a test under kubernetes/test/e2e_node.
- Set low values for containerLogMaxSize and containerLogMaxFiles.
- Create a pod that generates heavy logs and expect the container's combined log size to stay within containerLogMaxSize * containerLogMaxFiles (a sketch of this check follows this list).
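The core assertion of that e2e test could look roughly like the sketch below. The helper name verifyLogBudget and the way the log directory is located are assumptions for illustration; the real test would live under kubernetes/test/e2e_node and use the existing node e2e framework.

```go
package containerlogs

import (
	"fmt"
	"os"
)

// verifyLogBudget sums the sizes of all log files for one container under
// logDir (the active file plus its rotated/compressed siblings) and fails if
// the total exceeds containerLogMaxSize * containerLogMaxFiles.
func verifyLogBudget(logDir string, maxSize int64, maxFiles int) error {
	entries, err := os.ReadDir(logDir)
	if err != nil {
		return err
	}
	var total int64
	for _, e := range entries {
		info, err := e.Info()
		if err != nil {
			return err
		}
		total += info.Size()
	}
	budget := maxSize * int64(maxFiles)
	if total > budget {
		return fmt.Errorf("combined log size %d exceeds budget %d in %s", total, budget, logDir)
	}
	return nil
}
```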
Note: Not required until targeted at a release.
- Other
- Describe the mechanism:
- Will enabling / disabling the feature require downtime of the control plane? No; the change is confined to the kubelet.
- Will enabling / disabling the feature require downtime or reprovisioning of a node? No; restarting the kubelet with the updated configuration and version is sufficient.
No
Yes
Add unit tests.
No identified risk.
No identified risk.
Covered by e2e tests.
Is the rollout accompanied by any deprecations and/or removals of features, APIs, fields of API types, flags, etc.?
No
Emit cleanup logs.
Yes, from logs.
NA
What are the SLIs (Service Level Indicators) an operator can use to determine the health of the service?
NA
Are there any missing metrics that would be useful to have to improve observability of this feature?
NA
No
No
No
No
Will enabling / using this feature result in increasing time taken by any operations covered by existing SLIs/SLOs?
No
Will enabling / using this feature result in non-negligible increase of resource usage (CPU, RAM, disk, IO, ...) in any components?
CPU usage of the kubelet's ContainerLogManager will increase.
Can enabling / using this feature result in resource exhaustion of some node resources (PIDs, sockets, inodes, etc.)?
No
NA
NA
NA
No identified drawbacks.
Define two new fields, logRotateDiskCheckInterval and logRotateDiskPressureThreshold, in the kubelet config. logRotateDiskCheckInterval is the interval at which the ContainerLogManager checks overall disk usage on the kubelet host. logRotateDiskPressureThreshold is the threshold for overall disk usage on the kubelet host; if actual disk usage is equal to or greater than this threshold, the kubelet rotates the logs of all containers on the node. A hypothetical sketch of these fields is shown below.
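The sketch below shows one possible shape for these two fields. They do not exist in the current KubeletConfiguration API; the names, types, and grouping are illustrative only.

```go
package containerlogs

import "time"

// LogRotateDiskConfig groups the two proposed settings for illustration; in
// practice they would be fields on the kubelet configuration, not a separate
// struct.
type LogRotateDiskConfig struct {
	// LogRotateDiskCheckInterval is how often the ContainerLogManager checks
	// overall disk usage on the kubelet host, e.g. 30s.
	LogRotateDiskCheckInterval time.Duration
	// LogRotateDiskPressureThreshold is the fraction of disk usage (0.0-1.0)
	// at which the kubelet rotates the logs of all containers on the node,
	// e.g. 0.85 for 85% usage.
	LogRotateDiskPressureThreshold float64
}
```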
Provide a means for an external tool to trigger the kubelet to rotate its logs. That would move the policy decisions outside of the kubelet, for example, into a DaemonSet.