newlog: prevent kernel log loss under heavy system load #5626
rucoder wants to merge 1 commit into lf-edge:master
Conversation
Under heavy system load, newlogd loses the beginning of kernel logs because backpressure from downstream processing cascades up to the kernel message reader, which stops reading /dev/kmsg and causes the kernel ring buffer (128 KB) to overflow silently.

The backpressure chain: doMoveCompressFile slow (disk I/O) -> movefileChan full (cap 5) -> trigMoveToGzip blocks in writelogFile -> unbuffered downstream channels block -> sendLogsToVector blocks -> loggerChan full (cap 10) -> getKernelMsg blocks -> /dev/kmsg not read -> kernel ring buffer overflows -> EPIPE -> early messages gone.

Fix all choke points in the chain:

1. Dedicated kernel buffer channel (cap 500): getKernelMsg now writes to its own channel with non-blocking sends, so it always drains /dev/kmsg regardless of downstream backpressure. A merge goroutine forwards messages to the main pipeline.
2. Sequence number gap detection: track /dev/kmsg sequence numbers to detect and count ring buffer overflows. Populate the new NumKmsgDropped metric (added in pkg/pillar/types) so kernel log loss becomes observable via the controller.
3. Buffer downstream channels: appLogChan, uploadLogChan, keepLogChan changed from unbuffered to cap 100. A single blocked app log send could previously freeze the entire pipeline.
4. Increase loggerChan from 10 to 500 and movefileChan from 5 to 20.
5. Non-blocking gzip handoff: trigMoveToGzip uses a non-blocking send to moveChan. If the main goroutine is busy compressing, the log file grows beyond the size limit until the next attempt succeeds, rather than blocking the entire write pipeline.

Note: the vendored copy of newlogtypes.go is manually updated to include the NumKmsgDropped field. This should be re-vendored properly after the companion pillar PR is merged.

Signed-off-by: Mikhail Malyshev <mike.malyshev@gmail.com>
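The non-blocking reader plus merge-goroutine pattern from fix 1 can be sketched as follows. This is a minimal illustration, not the actual newlogd code: the channel names, capacities, and the `drainNonBlocking` helper are made up for the example.

```go
package main

import "fmt"

// drainNonBlocking sends each message to buf without ever blocking:
// when the buffer is full the message is counted as dropped instead
// of stalling the caller (the /dev/kmsg reader in newlogd's case).
func drainNonBlocking(msgs []string, buf chan<- string) (dropped int) {
	for _, m := range msgs {
		select {
		case buf <- m:
		default:
			dropped++ // buffer full: count, don't block
		}
	}
	return
}

func main() {
	kernelBuf := make(chan string, 500) // dedicated kernel buffer
	pipeline := make(chan string, 500)  // main log pipeline

	dropped := drainNonBlocking([]string{"kmsg 1", "kmsg 2"}, kernelBuf)
	close(kernelBuf)

	// Merge goroutine: the only place that may block on the pipeline,
	// so downstream backpressure never reaches the reader.
	done := make(chan struct{})
	go func() {
		for m := range kernelBuf {
			pipeline <- m
		}
		close(pipeline)
		close(done)
	}()
	<-done

	for m := range pipeline {
		fmt.Println(m)
	}
	fmt.Println("dropped:", dropped)
}
```

The key design point is that only the merge goroutine ever waits on the pipeline; the reader's `select` with a `default` branch always returns immediately.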
Codecov Report ✅ All modified and coverable lines are covered by tests.

```
@@           Coverage Diff            @@
##           master   #5626     +/-  ##
========================================
+ Coverage   19.52%   29.49%   +9.96%
========================================
  Files          19       18       -1
  Lines        3021     2417     -604
========================================
+ Hits          590      713     +123
+ Misses       2310     1552     -758
- Partials      121      152      +31
```
europaul left a comment:
It looks good to me, but I would remove most of those AI-generated comments.
```go
// /dev/kmsg assigns a monotonically increasing sequence number
// to each message. A gap means the kernel ring buffer overflowed
// and messages were lost (EPIPE from the reader's perspective).
if lastSeqNum >= 0 && msg.SequenceNumber > lastSeqNum+1 {
```
we don't need the check lastSeqNum >= 0
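For context, a minimal sketch of the gap check under discussion, assuming the tracker is initialized to -1 before the first message (an assumption of this example, not necessarily how the patch initializes it). Under that assumption the `>= 0` guard is what prevents a spurious gap from being counted on the very first message; the `countGap` helper is hypothetical, not the patch code.

```go
package main

import "fmt"

// countGap returns how many /dev/kmsg messages were lost between the
// previously seen sequence number and the current one. lastSeq == -1
// stands for "no message seen yet" in this sketch.
func countGap(lastSeq, seq int64) int64 {
	if lastSeq >= 0 && seq > lastSeq+1 {
		return seq - lastSeq - 1
	}
	return 0
}

func main() {
	fmt.Println(countGap(-1, 5)) // first message: guard suppresses a spurious gap
	fmt.Println(countGap(5, 6))  // consecutive sequence numbers: nothing lost
	fmt.Println(countGap(6, 10)) // gap: ring buffer overflowed, 3 messages lost
}
```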
we decided to update eve-api as well. moving to draft for now
Can one do additional manual stress testing with the fix by giving newlogd a SIGSTOP and letting it sit for a while before a SIGCONT, to make sure log entries pile up in the system?
just be sure not to trigger the watchdog
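The suggested manual test could look roughly like the following. This is a sketch demonstrated on a stand-in `sleep` process; on a device you would use newlogd's PID instead, and keep the pause well under the watchdog timeout, as noted above.

```shell
# Suspend a process so its input piles up, then resume it.
sleep 300 &
pid=$!

kill -STOP "$pid"                 # suspend: /dev/kmsg would stop draining
state=$(ps -o stat= -p "$pid" | tr -d ' ')
echo "while stopped: $state"      # 'T' means the process is stopped

kill -CONT "$pid"                 # resume; newlogd would now catch up
kill "$pid"
# afterwards, check NumKmsgDropped in newlog metrics for counted gaps
```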
Description
Under heavy system load, newlogd loses the beginning of kernel logs. The root cause is a cascading backpressure chain and slow system boot that prevents getKernelMsg from reading /dev/kmsg, causing the kernel ring buffer (128 KB, CONFIG_LOG_BUF_SHIFT=17) to overflow and silently drop the earliest messages — exactly the ones needed to diagnose the problem.

The backpressure chain: doMoveCompressFile slow (disk I/O) -> movefileChan full (cap 5) -> trigMoveToGzip blocks in writelogFile -> unbuffered downstream channels block -> sendLogsToVector blocks -> loggerChan full (cap 10) -> getKernelMsg blocks -> /dev/kmsg not read -> kernel ring buffer overflows -> EPIPE -> early messages gone.

This PR fixes all choke points:
1. Dedicated kernel buffer channel (cap 500): getKernelMsg now writes to its own channel with non-blocking sends, so it always drains /dev/kmsg regardless of downstream backpressure. A merge goroutine forwards messages to the main pipeline.
2. Sequence number gap detection: track /dev/kmsg sequence numbers to detect and count ring buffer overflows. Populate the new NumKmsgDropped metric so kernel log loss becomes observable via the controller.
3. Buffer downstream channels: appLogChan, uploadLogChan, keepLogChan changed from unbuffered to cap 100. A single blocked app log send could previously freeze the entire pipeline.
4. Increase loggerChan from 10 to 500 and movefileChan from 5 to 20.
5. Non-blocking gzip handoff: trigMoveToGzip uses a non-blocking send to moveChan. If the main goroutine is busy compressing, the log file grows beyond the size limit until the next attempt succeeds, rather than blocking the entire write pipeline.

Kernel message timestamps come from the kernel's own monotonic clock (not from when newlogd reads them), so buffering does not affect timestamp correctness — logs can still be sorted chronologically.

Note: the vendored copy of newlogtypes.go is manually updated to include the NumKmsgDropped field. This should be re-vendored properly after the companion pillar PR is merged.

PR dependencies
Depends on #5625 (pillar/types: add NumKmsgDropped metric to NewlogMetrics) — the vendored copy needs to be regenerated after that PR is merged.

How to test and validate this PR
1. cd pkg/newlog && go test ./cmd/ — all 15 tests pass.
2. Run the device under heavy load (e.g. with stress).
3. Verify that kernel logs are no longer truncated at the beginning.
4. Check NumKmsgDropped in newlog metrics — if non-zero, it indicates the kernel ring buffer overflowed but the loss was detected and counted (previously invisible).
5. Look for "getKernelMsg: detected kernel log gap" warning messages in device logs.
Changelog notes
Fixed kernel log loss under heavy system load. Previously, when the device was under heavy I/O or CPU load, the beginning of kernel logs (dmesg) could be silently lost due to backpressure in the logging pipeline causing the kernel ring buffer to overflow. Kernel log loss is now detected and reported via the NumKmsgDropped metric.

PR Backports
Checklist
I've provided a proper description
I've added the proper documentation
I've tested my PR on amd64 device
I've tested my PR on arm64 device
I've written the test verification instructions
I've set the proper labels to this PR
I've checked the boxes above, or I've provided a good reason why I didn't
check them.