Describe the bug
The Fluentd tail plugin was outputting `If you keep getting this message, please restart Fluentd`. After coming across #3614, we implemented the workaround suggested there:
- changed `follow_inodes` to `true`
- set `rotate_wait` to `0`
Since then we no longer see the original `If you keep getting this message, please restart Fluentd` message, but we still see many `Skip update_watcher because watcher has been already updated by other inotify event` messages.
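To put a number on "lots of", a small script along these lines could count the skip warnings per hour. This is only a sketch: it assumes the default Fluentd log line shape (`YYYY-MM-DD HH:MM:SS +ZZZZ [warn]: ...`), and the helper name `count_skips_per_hour` is hypothetical.

```python
import re
from collections import Counter

# Assumption: log lines start with a timestamp like "2024-01-01 10:05:00 +0000 [warn]: ..."
HOUR_RE = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}):\d{2}:\d{2} ")
SKIP_MSG = "Skip update_watcher because watcher has been already updated by other inotify event"


def count_skips_per_hour(lines):
    """Return a Counter mapping 'YYYY-MM-DD HH' to the number of skip warnings seen."""
    counts = Counter()
    for line in lines:
        if SKIP_MSG not in line:
            continue
        match = HOUR_RE.match(line)
        if match:
            counts[match.group(1)] += 1
    return counts
```

Feeding it the lines of the Fluentd pod log (for example from a saved log file) shows whether the warning rate grows together with CPU and memory.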
This is paired with a pattern of leaking memory and a gradual increase in CPU usage until a restart occurs.
To mitigate this I added `pos_file_compaction_interval 20m` as suggested here, but this had no effect on the resource usage.
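To check whether compaction actually shrinks the position file, its contents can be summarized with a sketch like the one below. It assumes each line of the `pos_file` is `path<TAB>offset<TAB>inode` with both numbers in hexadecimal, and that an offset of `ffffffffffffffff` marks an entry that is no longer watched. The function name `summarize_pos_file` is hypothetical.

```python
from collections import defaultdict

# Assumption: in_tail marks entries it no longer watches with this offset value.
UNWATCHED = 0xFFFFFFFFFFFFFFFF


def summarize_pos_file(text):
    """Summarize pos_file content (passed as a string) into a few counters."""
    total = 0
    unwatched = 0
    inodes_by_path = defaultdict(set)
    for line in text.splitlines():
        parts = line.split("\t")
        if len(parts) != 3:
            continue  # skip blank or malformed lines
        path, pos_hex, inode_hex = parts
        try:
            pos = int(pos_hex, 16)
            inode = int(inode_hex, 16)
        except ValueError:
            continue  # skip lines whose numeric fields are not hexadecimal
        total += 1
        if pos == UNWATCHED:
            unwatched += 1
        inodes_by_path[path].add(inode)
    multi = sum(1 for inodes in inodes_by_path.values() if len(inodes) > 1)
    return {
        "entries": total,
        "unwatched": unwatched,
        "paths_with_multiple_inodes": multi,
    }
```

Running it against `/var/log/fluentd-containers.log.pos` before and after a compaction interval would show whether stale entries are being removed or keep accumulating.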
Related to #3614. More specifically #3614 (comment)
The suspicion is that some watchers are not handled properly, so they leak and drive up CPU and memory consumption until the next restart.
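One way to back up the leak suspicion with data is to sample the resident memory of the Fluentd worker over time. The sketch below assumes a Linux node where `/proc/<pid>/status` contains a `VmRSS:` line in kB. Both helper names are hypothetical, and the sampling loop itself is left out.

```python
import re


def parse_vmrss_kb(status_text):
    """Extract the VmRSS value in kB from the text of /proc/<pid>/status, or None."""
    match = re.search(r"^VmRSS:\s+(\d+)\s+kB", status_text, re.MULTILINE)
    return int(match.group(1)) if match else None


def rss_growth_kb(samples):
    """Return last minus first RSS sample in kB, ignoring samples that failed to parse."""
    values = [s for s in samples if s is not None]
    if len(values) < 2:
        return 0
    return values[-1] - values[0]
```

Collecting one sample every few minutes and comparing the growth against the number of skip warnings in the same window would help confirm or rule out the correlation.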
To Reproduce
Deploy Fluentd (version v1.16.3-debian-forward-1.0) as a DaemonSet in a dynamic Kubernetes cluster of 50-100 nodes. The Fluentd config is listed under Your Configuration below.
Expected behavior
CPU / Memory should stay stable.
Your Environment
- Fluentd version: [v1.16.3-debian-forward-1.0](https://github.com/fluent/fluentd-kubernetes-daemonset#:~:text=debian%2Dcloudwatch%2D1-,Forward,-docker%20pull%20fluent)
Your Configuration
```
<source>
  @type tail
  @id in_tail_container_logs
  path /var/log/containers/*.log
  pos_file /var/log/fluentd-containers.log.pos
  tag kubernetes.*
  read_from_head true
  follow_inodes true
  rotate_wait 0
  exclude_path ["/var/log/containers/fluentd*.log", "/var/log/containers/*kube-system*.log", "/var/log/containers/*calico-system*.log", "/var/log/containers/prometheus-node-exporter*.log", "/var/log/containers/opentelemetry-agent*.log"]
  pos_file_compaction_interval 20m
  <parse>
    @type multi_format
    <pattern>
      format json
      time_key time
      time_type string
      time_format "%Y-%m-%dT%H:%M:%S.%NZ"
      keep_time_key true
    </pattern>
    <pattern>
      format /^(?<time>.+?) (?<stream>stdout|stderr) (?<logtag>[FP]) (?<log>.+)$/
      time_format "%Y-%m-%dT%H:%M:%S.%N%:z"
    </pattern>
  </parse>
  emit_unmatched_lines true
</source>
```
### Your Error Log
```shell
Skip update_watcher because watcher has been already updated by other inotify event
```
Additional context