Component(s)
loki.source.file
What's wrong?
This is a repost of a 2+ year old unaddressed promtail bug. It is posted here in response to @JStickler closing that issue and saying promtail would receive no future support or updates.
This report is an updated version of the original posted by @tiimwsuqld on 15 Nov 2023.
Describe the bug: Alloy consumes all RAM (doesn't start swapping) and causes the VM to freeze. OOM doesn't appear to kick in. This happens when /var/log/lastlog, a massively sparse file with almost no data in it, ends up in the pattern match. Alloy shouldn't consume all memory to read large files.
Expected behavior: Alloy should limit its memory use (even at startup) so it can't consume everything on the machine and trigger an oom-kill. Yes, this can be avoided by excluding the lastlog file from being processed, but we should have limits on memory usage.
Environment:
* Infrastructure: Google Cloud VM (e2-standard-2, 8GB RAM)
* Deployment tool: docker-compose
Screenshots, Alloy config, or terminal output:
This is a system exhibiting the "lastlog looks really big but really isn't" issue:
hostname> /bin/ls -sh /var/log/lastlog
76K /var/log/lastlog
hostname> /bin/ls -lh /var/log/lastlog
-rw-rw-r--. 1 root utmp 1.2T Apr 8 14:45 /var/log/lastlog
hostname>
Yes, this can be mitigated by not monitoring lastlog or by using __path_exclude__ for globs that would include it, but Alloy really shouldn't crash systems when it is pointed at a sparse file.
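For anyone hitting this in the meantime, the __path_exclude__ mitigation could look roughly like the sketch below. The component label "logs_var" and the /var/log/* glob are hypothetical and not taken from my actual config; adjust both to whatever glob is pulling in lastlog on your system.

```
// Workaround sketch only: match everything directly under /var/log
// but skip the sparse lastlog file. Label and glob are illustrative.
local.file_match "logs_var" {
  path_targets = [{
    __path__         = "/var/log/*",
    __path_exclude__ = "/var/log/lastlog",
  }]
}
```

This only sidesteps the problem for this one file; any other sparse file matched by a glob would presumably trigger the same behavior.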
Steps to reproduce
- Configure a user on a Linux system with a very large UID (1,000,000+ or so). Run ls -l /var/log/lastlog to verify the file size is reported to be at least as high as the amount of RAM on the system.
- Have a stanza in your config.alloy watching /var/log/lastlog.
- Start alloy.
- Watch memory yo-yo between normal and 100% while alloy is oom-killed, then restarted by systemd.
System information
Linux 5.14.0-611.27.1 x86_64
Software version
Grafana Alloy 1.13.0
Configuration
loki.write "logs_base" {
  endpoint {
    url = "https://loki.example.com/loki/api/v1/push"
    basic_auth {
      username = "loki_writer"
      password = "loki_password"
    }
  }
  external_labels = {}
}
local.file_match "logs_base_lastlog" {
  path_targets = [{
    __address__ = "hostname.example.com",
    __path__    = "/var/log/lastlog",
  }]
}
loki.source.file "logs_base_lastlog" {
  targets    = local.file_match.logs_base_lastlog.targets
  forward_to = [loki.write.logs_base.receiver]
}
Logs