
bug: add flush_interval_sec #1459

Open
@SohilShri

Description

Describe the issue

We are using:

  1. fluentbit image: kubesphere/fluentbit-3.1.8

  2. Fluent Operator version: https://github.com/fluent/fluent-operator/releases/tag/v3.2.0

We recently enabled the logToMetrics plugin via the Fluent Operator CRD.
Since then, Prometheus has been logging errors about duplicate metrics arriving with different values but the same timestamp:

ts=2025-01-27T12:51:31.436Z caller=scrape.go:1754 level=warn component="scrape manager" scrape_pool=serviceMonitor/cloud/fluent-bit-product-01/0 target=http://10.149.29.143:2021/api/v2/metrics/prometheus msg="Error on ingesting samples with different value but same timestamp" num_dropped=16
ts=2025-01-27T12:51:32.029Z caller=scrape.go:1754 level=warn component="scrape manager" scrape_pool=serviceMonitor/cloud/fluent-bit-product-01/0 target=http://10.149.39.145:2021/api/v2/metrics/prometheus msg="Error on ingesting samples with different value but same timestamp" num_dropped=8
ts=2025-01-27T12:51:42.463Z caller=scrape.go:1754 level=warn component="scrape manager" scrape_pool=serviceMonitor/cloud/fluent-bit-product-01/0 target=http://10.149.7.132:2021/api/v2/metrics/prometheus msg="Error on ingesting samples with different value but same timestamp" num_dropped=30
ts=2025-01-27T12:51:43.822Z caller=scrape.go:1754 level=warn component="scrape manager" scrape_pool=serviceMonitor/cloud/fluent-bit-product-01/0 target=http://10.149.1.52:2021/api/v2/metrics/prometheus msg="Error on ingesting samples with different value but same timestamp" num_dropped=3

Below is the endpoint in our ServiceMonitor:

  endpoints:
    - port: metrics
      path: /api/v2/metrics/prometheus
      interval: 30s
Below is our logToMetrics plugin configuration:
 - logToMetrics:
      addLabel:
      - timestamp os.time()
      kubernetesMode: true
      metricDescription: Count of logs processed by fluent-bit - Gauge
      metricMode: counter
      metricName: product_log_to_metrics
      tag: product_log_to_metrics

This issue has already been reported on the Fluent Bit side. The suggested workaround, setting flush_interval_sec, is not available in the latest Fluent Operator CRD:
fluent/fluent-bit#9413
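For reference, the workaround discussed upstream applies the flush interval directly in a raw Fluent Bit filter section, outside the operator CRD. A minimal sketch, assuming the flush_interval_sec option documented for the log_to_metrics filter (the interval value of 5 and the Match pattern are arbitrary examples, not values from this report):

```
[FILTER]
    Name                 log_to_metrics
    Match                kube.*
    Tag                  product_log_to_metrics
    Metric_mode          counter
    Metric_name          product_log_to_metrics
    Metric_description   Count of logs processed by fluent-bit
    Kubernetes_mode      on
    # Workaround from fluent/fluent-bit#9413: emit metrics at a fixed
    # interval instead of per record, avoiding same-timestamp duplicates.
    Flush_interval_sec   5
```

Exposing this option through the operator's logToMetrics CRD is what this issue is requesting.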

To Reproduce

Same as: fluent/fluent-bit#9413

Steps to reproduce the problem:

  1. Deploy fluent-bit 3.1.4 as a DaemonSet into a Kubernetes cluster with a tail input config (as in the linked issue) and verify that container logs and metrics are collected successfully.

  2. Update the fluent-bit image to 3.1.5 (or newer, <= 3.1.8) and check the /metrics endpoint on port 2021.

Expected behavior

No duplicate metrics on the additional /metrics endpoint exposed by the log_to_metrics feature (usually on port 2021), no warnings in the Prometheus logs, and no PrometheusDuplicateTimestamps errors.

Your Environment

- Fluent Operator version: 3.2.0
- Environment name and version (e.g. Kubernetes? What version?): EKS Kubernetes 1.27
- Filter and Plugins: kubernetes, log_to_metrics

How did you install fluent operator?

No response

Additional context

No response
