in_tail: group setting judges the limit by the average number, not the total number #4184

Open
@daipom

Description

Describe the bug

#3535 (comment)

The limit of group setting should control the total number of lines collected for a group (https://docs.fluentd.org/input/tail#less-than-group-greater-than-section).

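However, in_tail judges the limit by the average number of lines per file (in the reproduction below, limit 10 across 2 files gives 5 lines per file), not by the total number of lines for the group.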

To Reproduce

Run Fluentd with the following command and the config below
(limit: 10 and rate_period: 1m).

$ rm -f /test/fluentd/pos/pos; echo A-1 > /test/fluentd/input/A.log; echo B-1 > /test/fluentd/input/B.log; \
bundle exec fluentd -c /test/fluentd/config/group/fluent.conf

After running Fluentd, run the following command on another console to add logs to /test/fluentd/input/A.log.

$ for i in `seq 2 9`; do sleep 1; echo A-$i >> /test/fluentd/input/A.log; done;

After 9 seconds, the log files to be collected contain the following lines.

  • /test/fluentd/input/A.log
A-1
A-2
A-3
A-4
A-5
A-6
A-7
A-8
A-9
  • /test/fluentd/input/B.log
B-1

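However, Fluentd's output is as shown in "Your Error Log" below.
It shows that the data is limited by the average number per file, not by the total number for the group.

After collecting 5 lines from /test/fluentd/input/A.log, in_tail stops collecting, even though only 6 lines in total (5 from A.log and 1 from B.log) have been collected in the period.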

2023-05-23 11:41:08.271353203 +0900 test: {"message":"A-1"}
2023-05-23 11:41:08.271511446 +0900 test: {"message":"B-1"}
# Run the command to add logs
2023-05-23 11:41:12.717511479 +0900 test: {"message":"A-2"} 
2023-05-23 11:41:13.721082681 +0900 test: {"message":"A-3"}
2023-05-23 11:41:14.724453738 +0900 test: {"message":"A-4"}
2023-05-23 11:41:15.727744777 +0900 test: {"message":"A-5"}

After 1 minute, in_tail resumes collecting.

2023-05-23 11:42:08.272066658 +0900 test: {"message":"A-6"}
2023-05-23 11:42:08.272076384 +0900 test: {"message":"A-7"}
2023-05-23 11:42:08.272079574 +0900 test: {"message":"A-8"}
2023-05-23 11:42:08.272082425 +0900 test: {"message":"A-9"}
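
The observed output is consistent with a per-file cap of limit divided by the number of files in the group. The following is a minimal Python sketch of that interpretation, not Fluentd's actual code: the function `simulate_averaged_limit`, the event tuples, and the arrival times in seconds are all assumptions made for illustration.

```python
from collections import defaultdict

def simulate_averaged_limit(events, limit, rate_period):
    """Replay (arrival_sec, path, line) events with a per-file cap of
    limit // number_of_files per rate period (assumed model of the bug)."""
    n_files = len({path for _, path, _ in events})
    file_cap = limit // n_files
    if file_cap <= 0:
        raise ValueError("limit must be at least the number of files")
    queue = sorted(events)
    emitted = []                      # (emit_sec, line)
    period = 0
    while queue:
        start = period * rate_period
        end = start + rate_period
        read = defaultdict(int)       # lines read per file in this period
        held = []                     # lines deferred to a later period
        for arrival, path, line in queue:
            if arrival < end and read[path] < file_cap:
                read[path] += 1
                emitted.append((max(arrival, start), line))
            else:
                held.append((arrival, path, line))
        queue = held
        period += 1
    return emitted

# Hypothetical timings: A-1 and B-1 exist at startup, A-2..A-9 arrive once per second.
events = [(0, "A.log", "A-1"), (0, "B.log", "B-1")]
events += [(i + 2, "A.log", f"A-{i}") for i in range(2, 10)]

for emit_sec, line in simulate_averaged_limit(events, limit=10, rate_period=60):
    print(emit_sec, line)
```

Under these assumptions, A-1 to A-5 and B-1 are emitted in the first period, and A-6 to A-9 are held until second 60, which matches the timestamps above.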

Expected behavior

The limit of group setting should control the total number of lines collected for a group (https://docs.fluentd.org/input/tail#less-than-group-greater-than-section).

So all 10 lines in To Reproduce (9 from A.log and 1 from B.log) should be collected within one rate period.
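
As a rough illustration of the expected semantics, the sketch below keeps one shared counter for the whole group, so the limit applies to the total regardless of which file a line comes from. The class `TotalGroupLimiter` and its `allow` method are hypothetical names, not part of Fluentd, and the arrival times are assumed.

```python
class TotalGroupLimiter:
    """Hypothetical limiter: one shared counter for every file in the group."""

    def __init__(self, limit, rate_period):
        self.limit = limit
        self.rate_period = rate_period
        self.period_start = 0.0
        self.lines_read = 0

    def allow(self, now):
        """Return True if one more line may be collected at time `now` (seconds)."""
        if now - self.period_start >= self.rate_period:
            # A new rate period has begun: advance the window and reset the counter.
            elapsed = int((now - self.period_start) // self.rate_period)
            self.period_start += elapsed * self.rate_period
            self.lines_read = 0
        if self.lines_read >= self.limit:
            return False
        self.lines_read += 1
        return True

limiter = TotalGroupLimiter(limit=10, rate_period=60)
# Assumed arrival times: A-1 and B-1 at startup, then A-2..A-9 once per second.
arrivals = [0, 0] + list(range(4, 12))
results = [limiter.allow(t) for t in arrivals]
print(results.count(True), "of", len(results), "lines collected in the first period")
```

With a shared counter, all 10 lines pass within the first period; only an 11th line in the same period would be held back.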

Your Environment

- Fluentd version: 1.16.1
- Operating system: Ubuntu 20.04.6 LTS
- Kernel version: 5.15.0-71-generic

Your Configuration

<source>
  @type tail
  tag test
  path /test/fluentd/input/*.log
  pos_file /test/fluentd/pos/pos
  read_from_head true
  refresh_interval 5s
  <group>
    rate_period 1m
    pattern /^(?<log>.*)$/
    <rule>
      match {"log": "/./"}
      limit 10
    </rule>
  </group>
  <parse>
    @type none
  </parse>
</source>

<match test.**>
  @type stdout
</match>

Your Error Log

2023-05-23 11:41:08 +0900 [info]: #0 starting fluentd worker pid=1334352 ppid=1334332 worker=0
2023-05-23 11:41:08 +0900 [info]: #0 following tail of /test/fluentd/input/A.log
2023-05-23 11:41:08.271353203 +0900 test: {"message":"A-1"}
2023-05-23 11:41:08 +0900 [info]: #0 following tail of /test/fluentd/input/B.log
2023-05-23 11:41:08.271511446 +0900 test: {"message":"B-1"}
2023-05-23 11:41:08 +0900 [info]: #0 fluentd worker is now running worker=0
2023-05-23 11:41:12.717511479 +0900 test: {"message":"A-2"}
2023-05-23 11:41:13.721082681 +0900 test: {"message":"A-3"}
2023-05-23 11:41:14.724453738 +0900 test: {"message":"A-4"}
2023-05-23 11:41:15.727744777 +0900 test: {"message":"A-5"}
2023-05-23 11:42:08.272066658 +0900 test: {"message":"A-6"}
2023-05-23 11:42:08.272076384 +0900 test: {"message":"A-7"}
2023-05-23 11:42:08.272079574 +0900 test: {"message":"A-8"}
2023-05-23 11:42:08.272082425 +0900 test: {"message":"A-9"}
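
To check the log above mechanically, the following Python sketch counts emitted records per file and per rate period. It is only an assumption-laden helper: `count_per_period` is a made-up name, the regular expression is tailored to this stdout format and the `test` tag, and periods are measured from the first emitted record rather than from Fluentd's internal period start.

```python
import re
from collections import Counter
from datetime import datetime

# Matches stdout records such as:
# 2023-05-23 11:41:08.271353203 +0900 test: {"message":"A-1"}
LINE_RE = re.compile(
    r'^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\.\d+ \+\d{4} test: '
    r'\{"message":"([A-Z])-\d+"\}'
)

def count_per_period(log_lines, rate_period_sec=60):
    """Count records per (period index, file letter), skipping [info] lines."""
    counts = Counter()
    first = None
    for raw in log_lines:
        m = LINE_RE.match(raw.strip())
        if m is None:
            continue
        ts = datetime.strptime(m.group(1), "%Y-%m-%d %H:%M:%S")
        if first is None:
            first = ts
        period = int((ts - first).total_seconds() // rate_period_sec)
        counts[(period, m.group(2))] += 1
    return counts

log = """\
2023-05-23 11:41:08 +0900 [info]: #0 following tail of /test/fluentd/input/A.log
2023-05-23 11:41:08.271353203 +0900 test: {"message":"A-1"}
2023-05-23 11:41:08.271511446 +0900 test: {"message":"B-1"}
2023-05-23 11:41:12.717511479 +0900 test: {"message":"A-2"}
2023-05-23 11:41:13.721082681 +0900 test: {"message":"A-3"}
2023-05-23 11:41:14.724453738 +0900 test: {"message":"A-4"}
2023-05-23 11:41:15.727744777 +0900 test: {"message":"A-5"}
2023-05-23 11:42:08.272066658 +0900 test: {"message":"A-6"}
2023-05-23 11:42:08.272076384 +0900 test: {"message":"A-7"}
2023-05-23 11:42:08.272079574 +0900 test: {"message":"A-8"}
2023-05-23 11:42:08.272082425 +0900 test: {"message":"A-9"}
""".splitlines()

counts = count_per_period(log)
for (period, name), n in sorted(counts.items()):
    print(f"period {period}: {name}.log -> {n} lines")
```

For this log, the first period contains 5 lines from A.log and 1 from B.log (6 in total, below the limit of 10), and the remaining 4 lines from A.log fall into the second period.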

Additional context

No response
