Description
Is your feature request related to a problem? Please describe.
Currently, when starting up Fluentd outputs, we only check whether each buffer
chunk is non-empty; if it contains any bytes, we assume it holds valid data.
This model turns out to have a few issues:
- If the data is in fact corrupted, the behavior is undefined. It likely causes
many kinds of errors in various parts of the pipeline.
- It is also hard to tell from td-agent.log which chunks were corrupted.
This is important because users probably want to recover the lost data.
We should perform more rigorous buffer checks on startup,
so that Fluentd can handle corrupted chunks gracefully.
Describe the solution you'd like
- Perform more sanity checks on buffer chunks on startup.
- Emit more error logs regarding the corrupted chunks.
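
To make the two points above concrete, here is a minimal sketch of a startup check that validates each chunk and logs the path and reason for every corrupted one. It is written in Python for illustration only and is not Fluentd's implementation. The names `validate_chunk` and `check_buffer_dir`, the `*.log` glob, and the newline-delimited JSON chunk format are all assumptions made for the example; a real check would parse Fluentd's actual on-disk chunk and metadata format.

```python
import json
import logging
from pathlib import Path
from typing import List, Optional, Tuple

log = logging.getLogger("buffer_check")


def validate_chunk(path: Path) -> Optional[str]:
    """Return None if the chunk looks valid, else a short reason string.

    ASSUMPTION: chunks are newline-delimited JSON. This is a stand-in
    format for illustration, not Fluentd's actual on-disk format.
    """
    data = path.read_bytes()
    if not data:
        return "empty file"
    if not data.endswith(b"\n"):
        # A partial write typically leaves the final record unterminated.
        return "truncated last record"
    for lineno, line in enumerate(data.splitlines(), start=1):
        try:
            json.loads(line)
        except ValueError:
            return f"record {lineno} is not valid JSON"
    return None


def check_buffer_dir(buffer_dir: Path, pattern: str = "*.log") -> List[Tuple[Path, str]]:
    """Validate every chunk under buffer_dir (hypothetical layout).

    Each corrupted chunk is logged with its path and the reason, so the
    user can locate the file and attempt recovery.
    """
    corrupted = []
    for path in sorted(buffer_dir.glob(pattern)):
        reason = validate_chunk(path)
        if reason is not None:
            log.error("corrupted buffer chunk: path=%s reason=%s", path, reason)
            corrupted.append((path, reason))
    return corrupted
```

Returning the list of corrupted chunks, rather than raising on the first one, lets the caller decide what to do with them (skip them, move them aside, or abort startup) while valid chunks are still resumed.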
Describe alternatives you've considered
N/A
Additional context
No response