Description
Bug Report
Describe the bug
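The prometheus_remote_write output plugin drops metrics older than about 1 hour. When such a chunk is retried, the payload is encoded as 0 bytes, the flush is reported as succeeded, and the chunk file is removed.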
To Reproduce
- Create directories
mkdir -p /srv/logshipper/{conf,data,log}
- Create config which enables file system storage and simulates an unreachable server
---
# /srv/logshipper/conf/fluent-bit.yaml
service:
  log_level: trace
  log_file: /fluent-bit/log/main.log
  flush: 300
  storage.path: /fluent-bit/data/storage
  scheduler.base: 60
  scheduler.cap: 3600
pipeline:
  inputs:
    - name: node_exporter_metrics
      tag: node_metrics
      metrics: "loadavg"
      scrape_interval: 60
      storage.type: filesystem
  outputs:
    - name: prometheus_remote_write
      match: node_metrics
      host: unreachable.test
      storage.total_limit_size: 100M
      retry_limit: 1000
- Run fluent-bit
podman run --rm -d -v /srv/logshipper/conf:/fluent-bit/etc:ro -v /srv/logshipper/data:/fluent-bit/data:rw -v /srv/logshipper/log:/fluent-bit/log:rw docker.io/fluent/fluent-bit:4.2.2 -c /fluent-bit/etc/fluent-bit.yaml
- Wait for 1.5-2 hours
- See the chunk files being deleted from /srv/logshipper/data with messages in /srv/logshipper/log/main.log like these:
[2026/01/27 11:16:45.224905057] [debug] [output:prometheus_remote_write:prometheus_remote_write.0] task_id=1 assigned to thread #1
[2026/01/27 11:16:45.224971777] [debug] [output:prometheus_remote_write:prometheus_remote_write.0] cmetrics msgpack size: 2110
[2026/01/27 11:16:45.225003737] [debug] [output:prometheus_remote_write:prometheus_remote_write.0] cmetric_id=0 decoded 0-422 payload_size=0
[2026/01/27 11:16:45.225012738] [debug] [output:prometheus_remote_write:prometheus_remote_write.0] cmetric_id=1 decoded 422-844 payload_size=0
[2026/01/27 11:16:45.225024378] [debug] [output:prometheus_remote_write:prometheus_remote_write.0] cmetric_id=2 decoded 844-1266 payload_size=0
[2026/01/27 11:16:45.225031738] [debug] [output:prometheus_remote_write:prometheus_remote_write.0] cmetric_id=3 decoded 1266-1688 payload_size=0
[2026/01/27 11:16:45.225040218] [debug] [output:prometheus_remote_write:prometheus_remote_write.0] cmetric_id=4 decoded 1688-2110 payload_size=0
[2026/01/27 11:16:45.225042938] [debug] [output:prometheus_remote_write:prometheus_remote_write.0] final payload size: 0
[2026/01/27 11:16:45.225058978] [debug] [out flush] cb_destroy coro_id=43
[2026/01/27 11:16:45.225092898] [ info] [engine] flush chunk '1-1769507997.474870989.flb' succeeded at retry 7: task_id=1, input=node_exporter_metrics.0 > output=prometheus_remote_write.0 (out_id=0)
[2026/01/27 11:16:45.225102658] [debug] [task] destroy task=0xfc3a1fa23960 (task_id=1)
[2026/01/27 11:16:45.225109058] [debug] [input chunk] remove chunk 1-1769507997.474870989.flb with 4096 bytes from plugin prometheus_remote_write.0, the updated fs_chunks_size is 65536 bytes
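To make the pattern easier to spot in a longer log, here is a rough Python sketch that flags flushes reported as succeeded right after a "final payload size: 0" line and estimates how old the chunk was. It rests on assumptions that are mine, not confirmed by the log: that the number after the dash in the chunk file name is the chunk's creation time as a Unix timestamp, that the log timestamps are UTC, and that a payload line belongs to the next flush line. The function and variable names are made up for illustration.

```python
import re
from datetime import datetime, timezone

# Two of the log lines quoted above, copied verbatim.
LOG = """\
[2026/01/27 11:16:45.225042938] [debug] [output:prometheus_remote_write:prometheus_remote_write.0] final payload size: 0
[2026/01/27 11:16:45.225092898] [ info] [engine] flush chunk '1-1769507997.474870989.flb' succeeded at retry 7: task_id=1, input=node_exporter_metrics.0 > output=prometheus_remote_write.0 (out_id=0)
"""

TS = r"\[(\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2})\.\d+\]"
PAYLOAD_RE = re.compile(TS + r".*final payload size: (\d+)")
FLUSH_RE = re.compile(
    TS + r".*flush chunk '\d+-(\d+)\.\d+\.flb' succeeded at retry (\d+)"
)


def empty_flushes(log_text):
    """Yield (chunk_epoch, retry, age_seconds) for 'succeeded' flushes
    that directly follow a zero-byte payload line."""
    last_payload = None
    for line in log_text.splitlines():
        m = PAYLOAD_RE.search(line)
        if m:
            last_payload = int(m.group(2))
            continue
        m = FLUSH_RE.search(line)
        if m:
            if last_payload == 0:
                # Assumption: log timestamps are UTC.
                logged = datetime.strptime(
                    m.group(1), "%Y/%m/%d %H:%M:%S"
                ).replace(tzinfo=timezone.utc)
                # Assumption: the chunk name embeds its creation time (epoch seconds).
                chunk_epoch = int(m.group(2))
                age = int(logged.timestamp()) - chunk_epoch
                yield chunk_epoch, int(m.group(3)), age
            last_payload = None


for chunk_epoch, retry, age in empty_flushes(LOG):
    print(f"chunk created at {chunk_epoch}: empty payload at retry {retry}, age {age}s")
```

If those assumptions hold, the chunk in the excerpt was 4608 seconds (about 77 minutes) old when it was flushed with an empty payload and removed, which matches the "older than 1 hour" observation.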
Expected behavior
Chunk files are not deleted and are kept for as long as the storage.total_limit_size and retry_limit parameters allow.
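For a sense of how long the configured retries should keep a chunk alive, here is a small sketch of the retry schedule. It assumes the backoff works the way I read the Fluent Bit scheduling documentation: the wait before the Nth retry is a random value between scheduler.base and min(scheduler.base * 2^N, scheduler.cap). The helper names are hypothetical, and the sketch ignores the flush interval and connection timeouts.

```python
def retry_window(base, cap, attempt):
    """Return (min_wait, max_wait) in seconds before the given retry (1-based).

    Assumed formula: random wait between base and min(base * 2**attempt, cap).
    """
    return base, min(cap, base * 2 ** attempt)


def cumulative_wait(base, cap, retries):
    """Return (min_total, max_total) seconds of waiting across `retries` retries."""
    windows = [retry_window(base, cap, n) for n in range(1, retries + 1)]
    return sum(low for low, _ in windows), sum(high for _, high in windows)


# Values from the config above: scheduler.base 60, scheduler.cap 3600.
low, high = cumulative_wait(60, 3600, 7)
print(f"7 retries: between {low}s and {high}s of backoff")

low, high = cumulative_wait(60, 3600, 1000)
print(f"1000 retries: between {low / 3600:.1f}h and {high / 86400:.1f} days of backoff")
```

Under this assumed formula, retry_limit: 1000 would keep retrying for at least 1000 × 60 seconds, roughly 16.7 hours, so a chunk disappearing at retry 7 is far short of what the configuration asks for.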
Your Environment
- Version used: 4.2.2, official Docker image
Additional context
This makes it impossible to use Fluent Bit as a temporary buffer during a prolonged server outage: in that scenario, old metrics are lost.