Hints annotations based autodiscovery does not work as expected #10641

@belimawr

Description

When using Hints annotations based autodiscovery with packages, two
problems happen:

  • Packages that use multiple datasets duplicate data across all of
    their datasets.
  • No ingest pipelines are created to process the data, so fields are
    not set as expected.

This can be reproduced by following our documentation.

How to reproduce

For this example I'll use:

  • Kind to create our Kubernetes cluster.
  • Nginx as the application.
  • The Nginx integration for Standalone Elastic Agent.

1. Create a Kind cluster mapping one port to the host

kind-config.yaml

apiVersion: kind.x-k8s.io/v1alpha4
kind: Cluster
nodes:
- role: control-plane
  extraPortMappings:
  - containerPort: 30080
    hostPort: 30080
    listenAddress: "0.0.0.0" # Optional, defaults to "0.0.0.0"
    protocol: tcp # Optional, defaults to tcp

kind create cluster --config kind-config.yaml 

2. Deploy Elastic Agent Standalone with hints enabled

Make sure to edit the output credentials to match your setup

elastic-agent-standalone-kubernetes.yaml

# For more information https://www.elastic.co/guide/en/fleet/current/running-on-kubernetes-standalone.html
apiVersion: v1
kind: ConfigMap
metadata:
  name: agent-node-datastreams
  namespace: kube-system
  labels:
    app.kubernetes.io/name: elastic-agent-standalone
data:
  agent.yml: |-
    outputs:
      default:
        type: elasticsearch
        hosts:
          - >-
            https://localhost:9200
        username: elastic
        password: changeme
    agent:
      monitoring:
        enabled: true
        use_output: default
        logs: true
        metrics: true
    providers.kubernetes:
      node: ${NODE_NAME}
      scope: node
      #Uncomment to enable hints' support - https://www.elastic.co/guide/en/fleet/current/hints-annotations-autodiscovery.html
      hints.enabled: true
      hints.default_container_logs: false
---
# For more information refer https://www.elastic.co/guide/en/fleet/current/running-on-kubernetes-standalone.html
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: elastic-agent-standalone
  namespace: kube-system
  labels:
    app.kubernetes.io/name: elastic-agent-standalone
spec:
  selector:
    matchLabels:
      app.kubernetes.io/name: elastic-agent-standalone
  template:
    metadata:
      labels:
        app.kubernetes.io/name: elastic-agent-standalone
    spec:
      # Tolerations are needed to run Elastic Agent on Kubernetes control-plane nodes.
      # Agents running on control-plane nodes collect metrics from the control plane components (scheduler, controller manager) of Kubernetes
      tolerations:
        - key: node-role.kubernetes.io/control-plane
          effect: NoSchedule
        - key: node-role.kubernetes.io/master
          effect: NoSchedule
      serviceAccountName: elastic-agent-standalone
      hostNetwork: true
      dnsPolicy: ClusterFirstWithHostNet
      # Uncomment if using hints feature
      initContainers:
       - name: k8s-templates-downloader
         image: docker.elastic.co/elastic-agent/elastic-agent:9.1.5
         restartPolicy: Never
         command: ['bash']
         args:
           - -c
           - >-
             mkdir -p /etc/elastic-agent/inputs.d &&
             curl -sL https://github.com/elastic/elastic-agent/archive/9.0.tar.gz | tar xz -C /etc/elastic-agent/inputs.d --strip=5 "elastic-agent-9.0/deploy/kubernetes/elastic-agent-standalone/templates.d"
         volumeMounts:
           - name: external-inputs
             mountPath: /etc/elastic-agent/inputs.d
      containers:
        - name: elastic-agent-standalone
          image: docker.elastic.co/elastic-agent/elastic-agent:9.1.5
          args: ["-c", "/etc/elastic-agent/agent.yml", "-e"]
          env:
            - name: NODE_NAME
              valueFrom:
                fieldRef:
                  fieldPath: spec.nodeName
            - name: POD_NAME
              valueFrom:
                fieldRef:
                  fieldPath: metadata.name
            # The following ELASTIC_NETINFO:false variable will disable the netinfo.enabled option of add-host-metadata processor. This will remove fields host.ip and host.mac.
            # For more info: https://www.elastic.co/guide/en/beats/metricbeat/current/add-host-metadata.html
            - name: ELASTIC_NETINFO
              value: "false"
          securityContext:
            runAsUser: 0
            # The following capabilities are needed for Universal Profiling.
            # More fine graded capabilities are only available for newer Linux kernels.
            # If you are using the Universal Profiling integration, please uncomment these lines before applying.
            #procMount: "Unmasked"
            #privileged: true
            #capabilities:
            #  add:
            #    - SYS_ADMIN
          resources:
            limits:
              memory: 1Gi
            requests:
              cpu: 100m
              memory: 500Mi
          volumeMounts:
            - name: datastreams
              mountPath: /etc/elastic-agent/agent.yml
              readOnly: true
              subPath: agent.yml
            - name: proc
              mountPath: /hostfs/proc
              readOnly: true
            - name: cgroup
              mountPath: /hostfs/sys/fs/cgroup
              readOnly: true
            - name: varlibdockercontainers
              mountPath: /var/lib/docker/containers
              readOnly: true
            - name: varlog
              mountPath: /var/log
              readOnly: true
            - name: etc-full
              mountPath: /hostfs/etc
              readOnly: true
            - name: var-lib
              mountPath: /hostfs/var/lib
              readOnly: true
            - name: sys-kernel-debug
              mountPath: /sys/kernel/debug
            - name: elastic-agent-state
              mountPath: /usr/share/elastic-agent/state
            # Uncomment if using hints feature
            - name: external-inputs
              mountPath: /usr/share/elastic-agent/state/inputs.d
      volumes:
        - name: datastreams
          configMap:
            defaultMode: 0644
            name: agent-node-datastreams
        - name: proc
          hostPath:
            path: /proc
        - name: cgroup
          hostPath:
            path: /sys/fs/cgroup
        - name: varlibdockercontainers
          hostPath:
            path: /var/lib/docker/containers
        - name: varlog
          hostPath:
            path: /var/log
        # The following volumes are needed for Cloud Security Posture integration (cloudbeat)
        # If you are not using this integration, then these volumes and the corresponding
        # mounts can be removed.
        - name: etc-full
          hostPath:
            path: /etc
        - name: var-lib
          hostPath:
            path: /var/lib
        # Needed for Universal Profiling
        # If you are not using this integration, then these volumes and the corresponding
        # mounts can be removed.
        - name: sys-kernel-debug
          hostPath:
            path: /sys/kernel/debug
        # Mount /var/lib/elastic-agent-managed/kube-system/state to store elastic-agent state
        # Update 'kube-system' with the namespace of your agent installation
        - name: elastic-agent-state
          hostPath:
            path: /var/lib/elastic-agent-standalone/kube-system/state
            type: DirectoryOrCreate
        # Uncomment if using hints feature
        - name: external-inputs
          emptyDir: {}
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: elastic-agent-standalone
subjects:
  - kind: ServiceAccount
    name: elastic-agent-standalone
    namespace: kube-system
roleRef:
  kind: ClusterRole
  name: elastic-agent-standalone
  apiGroup: rbac.authorization.k8s.io
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  namespace: kube-system
  name: elastic-agent-standalone
subjects:
  - kind: ServiceAccount
    name: elastic-agent-standalone
    namespace: kube-system
roleRef:
  kind: Role
  name: elastic-agent-standalone
  apiGroup: rbac.authorization.k8s.io
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: elastic-agent-standalone-kubeadm-config
  namespace: kube-system
subjects:
  - kind: ServiceAccount
    name: elastic-agent-standalone
    namespace: kube-system
roleRef:
  kind: Role
  name: elastic-agent-standalone-kubeadm-config
  apiGroup: rbac.authorization.k8s.io
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: elastic-agent-standalone
  labels:
    app.kubernetes.io/name: elastic-agent-standalone
rules:
  - apiGroups: [""]
    resources:
      - nodes
      - namespaces
      - events
      - pods
      - services
      - configmaps
      # Needed for cloudbeat
      - serviceaccounts
      - persistentvolumes
      - persistentvolumeclaims
    verbs: ["get", "list", "watch"]
  # Enable this rule only if planing to use kubernetes_secrets provider
  #- apiGroups: [""]
  #  resources:
  #  - secrets
  #  verbs: ["get"]
  - apiGroups: ["extensions"]
    resources:
      - replicasets
    verbs: ["get", "list", "watch"]
  - apiGroups: ["apps"]
    resources:
      - statefulsets
      - deployments
      - replicasets
      - daemonsets
    verbs: ["get", "list", "watch"]
  - apiGroups: ["batch"]
    resources:
      - jobs
      - cronjobs
    verbs: ["get", "list", "watch"]
  - apiGroups:
      - ""
    resources:
      - nodes/stats
    verbs:
      - get
  # Needed for apiserver
  - nonResourceURLs:
      - "/metrics"
    verbs:
      - get
  # Needed for cloudbeat
  - apiGroups: ["rbac.authorization.k8s.io"]
    resources:
      - clusterrolebindings
      - clusterroles
      - rolebindings
      - roles
    verbs: ["get", "list", "watch"]
  # Needed for cloudbeat
  - apiGroups: ["policy"]
    resources:
      - podsecuritypolicies
    verbs: ["get", "list", "watch"]
  - apiGroups: [ "storage.k8s.io" ]
    resources:
      - storageclasses
    verbs: [ "get", "list", "watch" ]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: elastic-agent-standalone
  # Should be the namespace where elastic-agent is running
  namespace: kube-system
  labels:
    app.kubernetes.io/name: elastic-agent-standalone
rules:
  - apiGroups:
      - coordination.k8s.io
    resources:
      - leases
    verbs: ["get", "create", "update"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: elastic-agent-standalone-kubeadm-config
  namespace: kube-system
  labels:
    app.kubernetes.io/name: elastic-agent-standalone
rules:
  - apiGroups: [""]
    resources:
      - configmaps
    resourceNames:
      - kubeadm-config
    verbs: ["get"]
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: elastic-agent-standalone
  namespace: kube-system
  labels:
    app.kubernetes.io/name: elastic-agent-standalone
---
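The initContainer in the manifest above downloads per-package input templates into inputs.d. For context on the duplication symptom, the sketch below shows roughly what the nginx template contains; it is abridged and the exact field values are assumptions based on templates.d/nginx.yml in the elastic-agent repository, not a copy of the real file:

```yaml
# Abridged sketch of templates.d/nginx.yml (assumption; see the
# elastic-agent repository for the authoritative file).
# Both streams read the same container log path, and each condition is
# also satisfied by the package-level hint alone.
inputs:
  - name: filestream-nginx
    type: filestream
    streams:
      - condition: ${kubernetes.hints.nginx.access.enabled} == true or ${kubernetes.hints.nginx.enabled} == true
        data_stream:
          dataset: nginx.access
          type: logs
        paths:
          - /var/log/containers/*${kubernetes.hints.container_id}.log
      - condition: ${kubernetes.hints.nginx.error.enabled} == true or ${kubernetes.hints.nginx.enabled} == true
        data_stream:
          dataset: nginx.error
          type: logs
        paths:
          - /var/log/containers/*${kubernetes.hints.container_id}.log
```

Because both stream conditions also accept the package-level hint, a pod annotated only with co.elastic.hints/package: "nginx" would enable both streams against the same log file, which is consistent with the duplication reported here.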

# Set the current namespace to kube-system
kubectl config set-context $(kubectl config current-context) --namespace kube-system

# Deploy Elastic Agent
kubectl apply -f elastic-agent-standalone-kubernetes.yaml

# Ensure the Elastic Agent pod is running
kubectl get pods

Go to Kibana and ensure you're receiving logs from your Elastic Agent.

3. Deploy Nginx with hints enabled

nginx-k8s.yaml

apiVersion: v1
kind: ConfigMap
metadata:
  name: nginx-config
data:
  nginx.conf: |
    events {}

    http {
        server {
            listen 80;
            server_name localhost;

            location / {
                return 200 "Welcome to Nginx!\n";
                add_header Content-Type text/plain;
            }

            location /error {
                root /usr/share/nginx/html/errors;
                index index.html;
            }
        }
    }

---
apiVersion: v1
kind: Pod
metadata:
  name: nginx
  labels:
    app: nginx
  annotations:
    co.elastic.hints/package: "nginx"
spec:
  containers:
    - name: nginx
      image: nginx
      ports:
        - containerPort: 80
      volumeMounts:
        - name: nginx-config-volume
          mountPath: /etc/nginx/nginx.conf
          subPath: nginx.conf
        - name: errors-volume
          mountPath: /usr/share/nginx/html/errors
  volumes:
    - name: nginx-config-volume
      configMap:
        name: nginx-config
    - name: errors-volume
      emptyDir: {}
---
apiVersion: v1
kind: Service
metadata:
  name: nginx-service
spec:
  type: NodePort
  selector:
    app: nginx
  ports:
    - protocol: TCP
      port: 80
      # targetPort: 80
      nodePort: 30080
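The pod above uses only the package-level hint, which enables every data stream in the package. The hints autodiscovery documentation also describes a co.elastic.hints/data_streams annotation to target specific streams; a hedged variant of the Pod metadata (the annotation name and value format are taken from those docs, and whether it behaves correctly is part of what this issue is about):

```yaml
# Hypothetical variant: restrict the hint to the access data stream only.
metadata:
  name: nginx
  labels:
    app: nginx
  annotations:
    co.elastic.hints/package: "nginx"
    co.elastic.hints/data_streams: "access"
```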

# Deploy Nginx
kubectl apply -f nginx-k8s.yaml

# Ensure the container is running
kubectl get pods

Make a few requests that will be successful:

curl http://localhost:30080/
curl http://localhost:30080/
curl http://localhost:30080/

Then some others that will fail:

curl http://localhost:30080/error
curl http://localhost:30080/error
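To quantify the duplication without opening Discover, you can count documents per dataset directly in Elasticsearch. A Query DSL sketch, runnable from Kibana Dev Tools (the index pattern assumes the default namespace used above):

```
GET logs-nginx.*-default/_search
{
  "size": 0,
  "aggs": {
    "per_dataset": {
      "terms": { "field": "data_stream.dataset" }
    }
  }
}
```

With the bug present, the nginx.access and nginx.error buckets report the same document count, because every log line lands in both data streams.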

If you go to Kibana, you will see all messages duplicated in both data streams (nginx.access and nginx.error):

Duplicated data in Kibana


Looking at the details of each event, you will also see that no fields
have been parsed from the Nginx logs.
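For contrast, here is roughly what the nginx.access ingest pipeline would extract from the raw message in the event below. The regex is an illustrative approximation of the pipeline's grok parsing, not the real pipeline definition; a working pipeline maps these captures to ECS fields such as http.response.status_code, url.original, and user_agent.original:

```python
import re

# Illustrative approximation of the grok parsing the nginx.access ingest
# pipeline performs on a combined-format access log line (NOT the real
# pipeline definition).
ACCESS_RE = re.compile(
    r'(?P<remote_ip>\S+) - (?P<user>\S+) '
    r'\[(?P<timestamp>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<http_version>[^"]+)" '
    r'(?P<status>\d{3}) (?P<body_bytes>\d+) '
    r'"(?P<referrer>[^"]*)" "(?P<user_agent>[^"]*)"'
)

# The raw "message" field from the example event below.
line = ('10.244.0.1 - - [16/Oct/2025:16:00:10 +0000] '
        '"GET /error HTTP/1.1" 404 153 "-" "curl/8.16.0"')

fields = ACCESS_RE.match(line).groupdict()
print(fields["status"], fields["path"], fields["user_agent"])
# → 404 /error curl/8.16.0
```

In the event below, none of these fields are present; only the raw message was indexed.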

Example event as JSON

{
  "_index": ".ds-logs-nginx.access-default-2025.10.16-000001",
  "_id": "AZntwG-0FtZvvkYXMKpi",
  "_version": 1,
  "_source": {
    "container": {
      "image": {
        "name": "nginx"
      },
      "runtime": "containerd",
      "id": "f235123c3da3d87c05053b6df0883ddcbb1df316e69ac1cc64da5aa9c39a52d0"
    },
    "kubernetes": {
      "container": {
        "name": "nginx"
      },
      "node": {
        "uid": "bc29818f-e886-4717-bc5d-7681a6b5678c",
        "hostname": "kind-control-plane",
        "name": "kind-control-plane",
        "labels": {
          "kubernetes_io/hostname": "kind-control-plane",
          "node-role_kubernetes_io/control-plane": "",
          "beta_kubernetes_io/os": "linux",
          "kubernetes_io/arch": "amd64",
          "kubernetes_io/os": "linux",
          "beta_kubernetes_io/arch": "amd64"
        }
      },
      "pod": {
        "uid": "01327f9c-d4ff-4428-a277-d8dc19fe6f78",
        "ip": "10.244.0.5",
        "name": "nginx"
      },
      "namespace": "kube-system",
      "namespace_uid": "84f83d2d-67ef-415b-a078-58e92088bd93",
      "namespace_labels": {
        "kubernetes_io/metadata_name": "kube-system"
      },
      "labels": {
        "app": "nginx"
      }
    },
    "agent": {
      "name": "kind-control-plane",
      "id": "cef8bf9b-8b10-4257-9590-28cfd74f9b2d",
      "type": "filebeat",
      "ephemeral_id": "45be407f-0412-47f8-8c32-ed2257193556",
      "version": "9.1.5"
    },
    "log": {
      "file": {
        "inode": "6697079",
        "path": "/var/log/containers/nginx_kube-system_nginx-f235123c3da3d87c05053b6df0883ddcbb1df316e69ac1cc64da5aa9c39a52d0.log",
        "device_id": "64768",
        "fingerprint": "1189074e47c90922e09907adf8ff6b0d42478522699edcccb3036a944f83fef4"
      },
      "offset": 5629
    },
    "elastic_agent": {
      "id": "cef8bf9b-8b10-4257-9590-28cfd74f9b2d",
      "version": "9.1.5",
      "snapshot": false
    },
    "message": "10.244.0.1 - - [16/Oct/2025:16:00:10 +0000] \"GET /error HTTP/1.1\" 404 153 \"-\" \"curl/8.16.0\"",
    "tags": [
      "nginx-access"
    ],
    "input": {
      "type": "filestream"
    },
    "orchestrator": {
      "cluster": {
        "name": "kind",
        "url": "kind-control-plane:6443"
      }
    },
    "@timestamp": "2025-10-16T16:00:10.512Z",
    "ecs": {
      "version": "8.0.0"
    },
    "stream": "stdout",
    "data_stream": {
      "namespace": "default",
      "type": "logs",
      "dataset": "nginx.access"
    },
    "host": {
      "hostname": "kind-control-plane",
      "os": {
        "kernel": "6.17.1-arch1-1",
        "codename": "Plow",
        "name": "Red Hat Enterprise Linux",
        "type": "linux",
        "family": "redhat",
        "version": "9.6 (Plow)",
        "platform": "rhel"
      },
      "containerized": false,
      "name": "kind-control-plane",
      "id": "671d8b1fdab2123ffbbbe8f6aad4da95",
      "architecture": "x86_64"
    },
    "event": {
      "timezone": "+00:00",
      "dataset": "nginx.access"
    }
  },
  "fields": {
    "kubernetes.node.uid": [
      "bc29818f-e886-4717-bc5d-7681a6b5678c"
    ],
    "orchestrator.cluster.name": [
      "kind"
    ],
    "elastic_agent.version": [
      "9.1.5"
    ],
    "kubernetes.namespace_uid": [
      "84f83d2d-67ef-415b-a078-58e92088bd93"
    ],
    "host.os.name.text": [
      "Red Hat Enterprise Linux"
    ],
    "host.hostname": [
      "kind-control-plane"
    ],
    "kubernetes.node.labels.kubernetes_io/os": [
      "linux"
    ],
    "container.id": [
      "f235123c3da3d87c05053b6df0883ddcbb1df316e69ac1cc64da5aa9c39a52d0"
    ],
    "container.image.name": [
      "nginx"
    ],
    "agent.name.text": [
      "kind-control-plane"
    ],
    "host.os.version": [
      "9.6 (Plow)"
    ],
    "kubernetes.labels.app": [
      "nginx"
    ],
    "kubernetes.namespace": [
      "kube-system"
    ],
    "kubernetes.node.labels.beta_kubernetes_io/os": [
      "linux"
    ],
    "host.os.name": [
      "Red Hat Enterprise Linux"
    ],
    "agent.name": [
      "kind-control-plane"
    ],
    "host.name": [
      "kind-control-plane"
    ],
    "host.os.type": [
      "linux"
    ],
    "input.type": [
      "filestream"
    ],
    "log.offset": [
      5629
    ],
    "kubernetes.container.name.text": [
      "nginx"
    ],
    "data_stream.type": [
      "logs"
    ],
    "tags": [
      "nginx-access"
    ],
    "host.architecture": [
      "x86_64"
    ],
    "container.runtime": [
      "containerd"
    ],
    "kubernetes.node.name.text": [
      "kind-control-plane"
    ],
    "agent.id": [
      "cef8bf9b-8b10-4257-9590-28cfd74f9b2d"
    ],
    "ecs.version": [
      "8.0.0"
    ],
    "host.containerized": [
      false
    ],
    "kubernetes.node.labels.node-role_kubernetes_io/control-plane": [
      ""
    ],
    "agent.version": [
      "9.1.5"
    ],
    "host.os.family": [
      "redhat"
    ],
    "kubernetes.node.name": [
      "kind-control-plane"
    ],
    "kubernetes.pod.name.text": [
      "nginx"
    ],
    "kubernetes.node.hostname": [
      "kind-control-plane"
    ],
    "kubernetes.pod.uid": [
      "01327f9c-d4ff-4428-a277-d8dc19fe6f78"
    ],
    "agent.type": [
      "filebeat"
    ],
    "orchestrator.cluster.url": [
      "kind-control-plane:6443"
    ],
    "stream": [
      "stdout"
    ],
    "host.os.kernel": [
      "6.17.1-arch1-1"
    ],
    "log.file.device_id": [
      "64768"
    ],
    "log.file.path.text": [
      "/var/log/containers/nginx_kube-system_nginx-f235123c3da3d87c05053b6df0883ddcbb1df316e69ac1cc64da5aa9c39a52d0.log"
    ],
    "kubernetes.pod.name": [
      "nginx"
    ],
    "elastic_agent.snapshot": [
      false
    ],
    "host.id": [
      "671d8b1fdab2123ffbbbe8f6aad4da95"
    ],
    "event.timezone": [
      "+00:00"
    ],
    "kubernetes.pod.ip": [
      "10.244.0.5"
    ],
    "kubernetes.container.name": [
      "nginx"
    ],
    "elastic_agent.id": [
      "cef8bf9b-8b10-4257-9590-28cfd74f9b2d"
    ],
    "data_stream.namespace": [
      "default"
    ],
    "host.os.codename": [
      "Plow"
    ],
    "kubernetes.namespace_labels.kubernetes_io/metadata_name": [
      "kube-system"
    ],
    "message": [
      "10.244.0.1 - - [16/Oct/2025:16:00:10 +0000] \"GET /error HTTP/1.1\" 404 153 \"-\" \"curl/8.16.0\""
    ],
    "kubernetes.node.labels.kubernetes_io/hostname": [
      "kind-control-plane"
    ],
    "kubernetes.node.labels.beta_kubernetes_io/arch": [
      "amd64"
    ],
    "orchestrator.cluster.name.text": [
      "kind"
    ],
    "@timestamp": [
      "2025-10-16T16:00:10.512Z"
    ],
    "host.os.platform": [
      "rhel"
    ],
    "log.file.inode": [
      "6697079"
    ],
    "data_stream.dataset": [
      "nginx.access"
    ],
    "log.file.path": [
      "/var/log/containers/nginx_kube-system_nginx-f235123c3da3d87c05053b6df0883ddcbb1df316e69ac1cc64da5aa9c39a52d0.log"
    ],
    "agent.ephemeral_id": [
      "45be407f-0412-47f8-8c32-ed2257193556"
    ],
    "kubernetes.node.labels.kubernetes_io/arch": [
      "amd64"
    ],
    "container.image.name.text": [
      "nginx"
    ],
    "log.file.fingerprint": [
      "1189074e47c90922e09907adf8ff6b0d42478522699edcccb3036a944f83fef4"
    ],
    "event.dataset": [
      "nginx.access"
    ]
  }
}
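One way to confirm the missing-pipeline half of the problem is to ask Elasticsearch whether the integration's ingest pipelines exist at all, e.g. from Kibana Dev Tools (the pipeline name pattern below is an assumption based on Fleet's logs-&lt;dataset&gt;-&lt;version&gt; naming convention):

```
GET _ingest/pipeline/logs-nginx.access-*
```

If the Nginx integration assets were never installed, the response is empty, which would be consistent with events carrying only the raw message as shown above.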
