Buffer flush blocked because of empty chunks #4945

Open
@whitepiratebaku


Describe the bug

Sometimes (rarely) we end up in a situation where the fluentd buffer contains chunks that are empty, usually with the .meta file missing as well. With only one worker/thread plus retry_forever, fluentd then repeatedly tries to flush that empty chunk and fails with:

2025-04-10 16:33:04 +0000 [warn]: [opensearch] failed to flush the buffer. retry_times=32 next_retry_time=2025-04-10 16:33:33 +0000 chunk="6324310443eec1c01824af9ad547fba4" error_class=Fluent::Plugin::OpenSearchOutput::RecoverableRequestFailure error="could not push logs to OpenSearch cluster (system): [400] {"error":{"root_cause":[{"type":"parse_exception","reason":"request body is required"}],"type":"parse_exception","reason":"request body is required"},"status":400}"

This blocks the whole buffer and does not recover until we manually remove the empty chunk. The buffer section of the config looks like the following.

        <buffer>
          @type file
          path /var/log/fluentd-buffers/system.buffer
          flush_mode interval
          retry_type exponential_backoff
          flush_thread_count 1
          flush_interval 15s
          retry_forever
          retry_max_interval 30
          chunk_limit_size "#{ENV['OUTPUT_BUFFER_CHUNK_LIMIT']}"
          total_limit_size "#{ENV['OUTPUT_BUFFER_TOTAL_LIMIT']}"
          overflow_action block
        </buffer>

We understand that retry_forever + 'overflow_action block' + 'flush_thread_count 1' means the empty chunk is never discarded and blocks all logs. But we do not want to use retry_max_times, even though it would drop the empty chunk, because it cannot tell a bad chunk from a healthy one: it would also drop good chunks whenever OpenSearch is down or temporarily cannot accept logs. We only want to drop a chunk when the chunk itself is the problem. Can you recommend any solution?
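
For reference, the manual unblock we currently fall back on looks roughly like the sketch below. It is only an illustration: the quarantine directory is made up, and it assumes the chunk files sit directly under the configured buffer path (as buffer.*.log, like in the repro step below) and are not being written at that moment.

# Sketch of the manual cleanup, run inside /var/log/fluentd-buffers/system.buffer.
# Move aside chunks that are zero bytes or that lost their .meta companion,
# so the single flush thread can move on to the healthy chunks.
mkdir -p /tmp/fluentd-bad-chunks
for f in buffer.*.log; do
  [ -e "$f" ] || continue                      # glob matched nothing
  if [ ! -s "$f" ] || [ ! -e "${f}.meta" ]; then
    echo "quarantining: $f"
    mv -v "$f" /tmp/fluentd-bad-chunks/
    [ -e "${f}.meta" ] && mv -v "${f}.meta" /tmp/fluentd-bad-chunks/
  fi
done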

To Reproduce

Running the following in the buffer folder can help reproduce the issue (it picks a random chunk, deletes its .meta file, and truncates the chunk to zero bytes). It does not always trigger the problem; you usually need to run it multiple times to get a reproduction.

file=$(ls buffer.*.log | shuf -n1); echo "Selected: $file"; rm -v "${file}.meta"; truncate -s 0 "$file"

Expected behavior

It would be nice if fluentd could detect an empty chunk and discard it.
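
To make that concrete, an operator-side approximation of the check we have in mind (illustrative only, not fluentd internals; the path layout follows the repro step above, and the one-minute age threshold is an assumption):

# Print chunks that have stayed zero bytes for more than a minute, i.e. longer
# than our 15s flush_interval, and so will keep failing with the
# "request body is required" error shown above.
find /var/log/fluentd-buffers/system.buffer -maxdepth 1 -name 'buffer.*.log' -size 0 -mmin +1 -print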

Your Environment

- Fluentd version: 1.18
- Operating system: Debian 12
- Kernel version: 5.15.0-102-generic

Your Configuration

<buffer>
  @type file
  path /var/log/fluentd-buffers/system.buffer
  flush_mode interval
  retry_type exponential_backoff
  flush_thread_count 1
  flush_interval 15s
  retry_forever
  retry_max_interval 30
  chunk_limit_size "#{ENV['OUTPUT_BUFFER_CHUNK_LIMIT']}"
  total_limit_size "#{ENV['OUTPUT_BUFFER_TOTAL_LIMIT']}"
  overflow_action block
</buffer>

Your Error Log

2025-04-10 16:33:04 +0000 [warn]: [opensearch] failed to flush the buffer. retry_times=32 next_retry_time=2025-04-10 16:33:33 +0000 chunk="6324310443eec1c01824af9ad547fba4" error_class=Fluent::Plugin::OpenSearchOutput::RecoverableRequestFailure error="could not push logs to OpenSearch cluster (system): [400] {\"error\":{\"root_cause\":[{\"type\":\"parse_exception\",\"reason\":\"request body is required\"}],\"type\":\"parse_exception\",\"reason\":\"request body is required\"},\"status\":400}"

Additional context

No response
