[extension/opampextension] Enrich AgentDescription with OS, Kubernetes, deployment mode, and user-configurable identifying attributes #47279

@rohit-sonawane

Component(s)

extension/opamp

Is your feature request related to a problem? Please describe.

When managing a fleet of OpenTelemetry Collectors via OpAMP, the AgentDescription reported by the upstream opampextension lacks several attributes that are critical for effective fleet management:

  1. No OS version: The extension reports os.description (a free-form string like "Ubuntu 22.04.3 LTS") but not a machine-parseable os.version (e.g. "22.04"). This makes it difficult to programmatically filter, sort, or alert on OS versions across the fleet, for example to identify all agents running on an OS release affected by a CVE.

  2. No Kubernetes context: When collectors run as DaemonSets or Deployments in Kubernetes, the OpAMP server receives no information about the pod name, namespace, node, cluster, or owning workload. Operators are forced to cross-reference agent IDs against Kubernetes metadata in a separate system, which defeats the purpose of having a centralized fleet management protocol.

  3. No deployment mode: There is no way to distinguish whether a collector is running as an "agent" (per-node) or a "gateway" (centralized). This makes it impossible to apply mode-specific configuration policies or to get accurate counts of agents vs. gateways in a mixed fleet.

  4. No executable path: In environments with multiple collector binaries (e.g. side-by-side upgrades, canary deployments), there is no attribute to identify which binary is actually running, making rollout verification harder than it needs to be.

  5. No user-configurable identifying attributes: The identifying attributes are fixed to service.instance.id, service.name, and service.version. Operators who need to include custom identifiers — such as a tenant ID, fleet group, or deployment region — at the identifying level have no mechanism to do so. This forces workarounds like overloading service.name with encoded metadata, which breaks semantic conventions and downstream tooling.

Collectively, these gaps mean that the OpAMP server receives a minimal, inflexible description of each agent, requiring operators to build and maintain external enrichment pipelines to get the context they need for day-to-day fleet operations.

Describe the solution you'd like

Enrich the AgentDescription reported by opampextension with the following additions:

New non-identifying attributes

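| Attribute | Source | Notes |
| --- | --- | --- |
| `os.version` | `host.Info().PlatformVersion` | Machine-parseable OS version (e.g. "15.3", "22.04") |
| `k8s.node.name` | `K8S_NODE_NAME` env var | Auto-detected when `KUBERNETES_SERVICE_HOST` is present |
| `k8s.pod.name` | `K8S_POD_NAME` env var | Auto-detected when `KUBERNETES_SERVICE_HOST` is present |
| `k8s.namespace.name` | `K8S_NAMESPACE` env var | Auto-detected when `KUBERNETES_SERVICE_HOST` is present |
| `k8s.pod.uid` | `K8S_POD_UID` env var | Auto-detected when `KUBERNETES_SERVICE_HOST` is present |
| `k8s.cluster.name` | `OTEL_K8S_CLUSTER_NAME` env var | Auto-detected when `KUBERNETES_SERVICE_HOST` is present |
| `k8s.daemonset.name` | `K8S_DAEMONSET_NAME` env var | Auto-detected when `KUBERNETES_SERVICE_HOST` is present |
| `k8s.deployment.name` | `K8S_DEPLOYMENT_NAME` env var | Auto-detected when `KUBERNETES_SERVICE_HOST` is present |
| `otelcol.deployment.mode` | `OTEL_COLLECTOR_MODE` env var → config → default `"agent"` | Values: `"agent"` or `"gateway"` |
| `process.executable.path` | `os.Executable()` | Silently omitted if detection fails |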
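
The detection rules in the table can be illustrated with a minimal Go sketch. It uses only the standard library and is not the extension's actual code: the helper names `detectK8sAttributes` and `addExecutablePath` are hypothetical, and passing `getenv` as a parameter is an assumption made here so the logic can be exercised without touching the real environment.

```go
package main

import (
	"fmt"
	"os"
)

// k8sEnvToAttr maps the env vars listed in the table to attribute keys.
var k8sEnvToAttr = map[string]string{
	"K8S_NODE_NAME":         "k8s.node.name",
	"K8S_POD_NAME":          "k8s.pod.name",
	"K8S_NAMESPACE":         "k8s.namespace.name",
	"K8S_POD_UID":           "k8s.pod.uid",
	"OTEL_K8S_CLUSTER_NAME": "k8s.cluster.name",
	"K8S_DAEMONSET_NAME":    "k8s.daemonset.name",
	"K8S_DEPLOYMENT_NAME":   "k8s.deployment.name",
}

// detectK8sAttributes (hypothetical helper) returns the k8s.* attributes
// found through getenv. It returns an empty map unless
// KUBERNETES_SERVICE_HOST is set, and it skips env vars that are empty.
func detectK8sAttributes(getenv func(string) string) map[string]string {
	attrs := map[string]string{}
	if getenv("KUBERNETES_SERVICE_HOST") == "" {
		return attrs
	}
	for env, key := range k8sEnvToAttr {
		if v := getenv(env); v != "" {
			attrs[key] = v
		}
	}
	return attrs
}

// addExecutablePath (hypothetical helper) sets process.executable.path and
// silently leaves it out when os.Executable returns an error.
func addExecutablePath(attrs map[string]string) {
	if path, err := os.Executable(); err == nil {
		attrs["process.executable.path"] = path
	}
}

func main() {
	attrs := detectK8sAttributes(os.Getenv)
	addExecutablePath(attrs)
	fmt.Println(attrs)
}
```

Outside Kubernetes this prints only the executable path; inside a pod it also prints whichever `k8s.*` attributes have their env vars set.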

New config fields

```yaml
extensions:
  opamp:
    agent_description:
      # Explicit deployment mode; overridden by OTEL_COLLECTOR_MODE env var.
      # Valid values: "agent", "gateway". Default: "agent".
      deployment_mode: "gateway"

      # Custom key-value pairs merged into identifying attributes.
      # Override auto-detected values; excluded from non-identifying attrs.
      identifying_attributes:
        fleet.id: "us-west-2-prod"
        custom.tenant: "acme-corp"
```
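
To make the intended semantics of these two fields concrete, here is a rough Go sketch of the precedence and merge behavior. It is one reading of the comments above, not the extension's implementation: `resolveDeploymentMode` and `mergeIdentifying` are hypothetical helper names, and two behaviors are assumptions of this sketch: unrecognized mode values fall through to the next source, and a key promoted to identifying is dropped from the non-identifying set.

```go
package main

import "fmt"

const (
	modeAgent   = "agent"
	modeGateway = "gateway"
)

// resolveDeploymentMode (hypothetical helper) applies the precedence
// env var, then config, then the "agent" default. Values other than
// "agent" or "gateway" are skipped (an assumption of this sketch).
func resolveDeploymentMode(envValue, configValue string) string {
	for _, candidate := range []string{envValue, configValue} {
		switch candidate {
		case modeAgent, modeGateway:
			return candidate
		}
	}
	return modeAgent
}

// mergeIdentifying (hypothetical helper) overlays the custom identifying
// attributes on the auto-detected ones and deletes the same keys from the
// non-identifying set so that no key is reported at both levels.
func mergeIdentifying(identifying, nonIdentifying, custom map[string]string) {
	for k, v := range custom {
		identifying[k] = v
		delete(nonIdentifying, k)
	}
}

func main() {
	identifying := map[string]string{"service.name": "otelcol-contrib"}
	nonIdentifying := map[string]string{"host.name": "worker-node-3"}
	custom := map[string]string{
		"fleet.id":      "us-west-2-prod",
		"custom.tenant": "acme-corp",
	}

	mergeIdentifying(identifying, nonIdentifying, custom)
	// Empty env value, so the config value "gateway" is used.
	nonIdentifying["otelcol.deployment.mode"] = resolveDeploymentMode("", "gateway")

	fmt.Println(identifying)
	fmt.Println(nonIdentifying)
}
```

Whether an invalid mode should be skipped, as here, or rejected at config validation time is a design choice the proposal leaves open.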

### Describe alternatives you've considered

_No response_

### Additional context



### Implementation reference

This feature set has been implemented and tested, and is running in production within the [Splunk OpAMP fork](https://github.com/signalfx/splunk-otel-collector). The proposed changes port and refine that work for upstream contribution, ensuring alignment with OTel semantic conventions and upstream code style.

### Example: agent description before and after

**Before (upstream today):**

```json
{
  "identifying_attributes": {
    "service.instance.id": "f47ac10b-58cc",
    "service.name": "otelcol-contrib",
    "service.version": "0.102.0"
  },
  "non_identifying_attributes": {
    "os.type": "linux",
    "os.description": "Ubuntu 22.04.3 LTS",
    "host.arch": "amd64",
    "host.name": "worker-node-3"
  }
}
```

**After (with this feature):**

```json
{
  "identifying_attributes": {
    "service.instance.id": "f47ac10b-58cc",
    "service.name": "otelcol-contrib",
    "service.version": "0.102.0",
    "fleet.id": "us-west-2-prod"
  },
  "non_identifying_attributes": {
    "os.type": "linux",
    "os.description": "Ubuntu 22.04.3 LTS",
    "os.version": "22.04",
    "host.arch": "amd64",
    "host.name": "worker-node-3",
    "process.executable.path": "/usr/bin/otelcol-contrib",
    "otelcol.deployment.mode": "agent",
    "k8s.node.name": "worker-node-3",
    "k8s.pod.name": "otelcol-agent-7b9f4",
    "k8s.namespace.name": "monitoring",
    "k8s.pod.uid": "a1b2c3d4-e5f6-7890",
    "k8s.cluster.name": "prod-us-west-2",
    "k8s.daemonset.name": "otelcol-agent"
  }
}
```

### Example config

```yaml
extensions:
  opamp:
    server:
      ws:
        endpoint: wss://opamp-server.example.com/v1/opamp
    agent_description:
      deployment_mode: "gateway"
      identifying_attributes:
        fleet.id: "us-west-2-prod"
        custom.tenant: "acme-corp"
```

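### Kubernetes Downward API setup

For K8s attribute auto-detection to work, operators need to expose the corresponding env vars in their pod spec. Pod and node fields come from the Downward API; values it does not expose, such as the cluster name and the owning workload name, are set as static values. Example: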

```yaml
env:
  - name: K8S_NODE_NAME
    valueFrom:
      fieldRef:
        fieldPath: spec.nodeName
  - name: K8S_POD_NAME
    valueFrom:
      fieldRef:
        fieldPath: metadata.name
  - name: K8S_NAMESPACE
    valueFrom:
      fieldRef:
        fieldPath: metadata.namespace
  - name: K8S_POD_UID
    valueFrom:
      fieldRef:
        fieldPath: metadata.uid
  - name: OTEL_K8S_CLUSTER_NAME
    value: "prod-us-west-2"
  - name: K8S_DAEMONSET_NAME
    value: "otelcol-agent"
```
