Integrating with Agent Control

Defining a sub-agent workload

AC manages a workload by reading a YAML file that describes how to run, configure, manage and health-check it. This file is what we call an agent type definition. In some places in the codebase, the workload created for a given agent type is referred to as an agent type instance.

A set of agent type definitions is shipped built into AC, but the AC team or external teams can also add new supported agents without rebuilding AC. See Where agent type definitions come from for the sources AC resolves definitions from and their precedence.

Each agent type definition targets a single (platform, operating_system) pair. The platform is either host or kubernetes; operating_system is required when platform: host (linux or windows) and must not be set when platform: kubernetes. An agent that supports more than one such pair (for example, the Infrastructure Agent which runs on host Linux, host Windows and Kubernetes) is defined by one YAML file per pair, all sharing the same namespace, name and version. At startup, Agent Control loads only the definitions whose platform (and operating_system, when platform: host) match the binary it's running in.

The definition for an agent type consists of a single YAML file with the three main areas described below.

We recommend that you read the following sections, but at any time feel free to check the currently available definitions in their dedicated docs to see working examples of the explained concepts.

Agent Type Metadata

Contains top-level fields for the name of the agent type, with a namespace, the version, and the platform this definition targets.

The version used here is not the version of your agent, but the version of the agent type definition. For example, at the time of writing this we may use version: 0.1.0 for our Infrastructure Agent definition, but the version of the actual Infrastructure Agent binary that AC ends up running as sub-agent would be the most recent one (1.60.1).

Agent types are versioned to ensure compatibility with a given set of configuration values (no breaking changes, see below). As of now, we maintain only one version per agent type and use a fixed 0.1.0 value for it, because these definitions are not easily visible to FC, yet FC needs to know which agent types exist and their versions to make the metadata visible in New Relic's UI. As of now we prohibit pushing breaking changes to these definitions, and any exception needs to be validated by at least the AC and FC teams.

Separately from the agent type version, every definition must declare a top-level protocol_version. This is not a metadata field: it is parsed on its own and versions the agent-type schema language itself, that is, the set of fields and their meaning that Agent Control knows how to parse, including the shape of the metadata block described here. It is decoupled from both the agent type version (semver) and the Agent Control release version. It is a quoted MAJOR.MINOR string (for example "1.0"); the value must be quoted, because YAML would otherwise interpret an unquoted value such as 1.0 as a float and the field would be rejected.

Because it gates the rest of the document, Agent Control reads and validates protocol_version first, at the registry ingestion boundary, before the metadata and the other sections are interpreted. Each Agent Control release understands a single maximum protocol version, and the protocol_version is treated as a single ordered MAJOR.MINOR value. The compatibility rules are:

  • Newer than supported (higher major, or same major with a higher minor): rejected. The file is newer than this Agent Control understands.
  • Equal to or older than supported: accepted. Agent Control understands every protocol version up to and including the supported one.

For example, an Agent Control that supports protocol version 1.6 accepts everything up to 1.6 (including 0.9 and every version from 1.0 through 1.6) and rejects anything newer (1.7, 2.0, ...).
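
As a sketch of these rules, the fragments below show how different protocol_version values would be treated by an Agent Control release whose maximum supported protocol version is 1.6 (the same assumption as in the example above). Each YAML document stands for a separate definition file, reduced to the one field that matters here.

```yaml
# accepted: equal to the supported version
protocol_version: "1.6"
---
# accepted: same major, lower minor
protocol_version: "1.0"
---
# accepted: lower major
protocol_version: "0.9"
---
# rejected: same major, higher minor
protocol_version: "1.7"
---
# rejected: higher major
protocol_version: "2.0"
---
# rejected: unquoted, so YAML parses the value as a float instead of a string
protocol_version: 1.6
```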

The platform field is required, and operating_system is required when platform: host. The supported combinations are:

  • platform: kubernetes (no operating_system).
  • platform: host with operating_system: linux.
  • platform: host with operating_system: windows.

Any other combination (for example platform: host without an OS, or platform: kubernetes with one) is rejected at parse time.
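
For illustration, each YAML document below stands for a separate definition file, reduced to the platform fields. The comments state how the rules above apply to the two rejected combinations named in the text.

```yaml
# rejected at parse time: platform host requires operating_system
platform: host
---
# rejected at parse time: operating_system must not be set for kubernetes
platform: kubernetes
operating_system: linux
```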

This is an example section for the metadata fields, using the Kubernetes definition of the New Relic Infrastructure Agent.

namespace: newrelic
name: com.newrelic.infrastructure
version: 0.1.0
protocol_version: "1.0"
platform: kubernetes
# ...

The Linux and Windows host variants share the same namespace/name/version and platform: host, and differ only in the operating_system value:

namespace: newrelic
name: com.newrelic.infrastructure
version: 0.1.0
protocol_version: "1.0"
platform: host
operating_system: linux
# ...

Agent Type Variables

This section, defined under the top-level field variables, enables the dynamic configuration of the workload created by AC by exposing arbitrary variables. Variables are declared as a flat tree directly under variables — there are no per-platform sub-keys. If an agent type supports multiple platforms, each per-platform YAML file declares its own variables independently (they may overlap or differ between platforms).

Variables can be arbitrarily grouped into common fields forming a tree, where the final leaf will determine the actual variable, its type and its allowed contents.

Defining variables is entirely optional, but if no variables are defined then no dynamic configuration will be possible for this sub-agent. In that case AC will only be capable of adding or removing it as a workload, using its deployment instructions and at most the environment variables available to AC at the time it's running (see the deployment section below).

The following is a section of the defined configuration variables for the Kubernetes definition of the New Relic Infrastructure Agent. You can read a detailed explanation below.

variables:
  chart_values:
    newrelic-infrastructure:
      description: "newrelic-infrastructure chart values"
      type: yaml
      required: false
      default: {}
    nri-metadata-injection:
      description: "nri-metadata-injection chart values"
      type: yaml
      required: false
      default: {}
    global:
      description: "Global chart values"
      type: yaml
      required: false
      default: {}
  chart_version:
    description: "nri-bundle chart version"
    type: string
    required: true

Here, chart_values is a grouping field that contains three nested variables (newrelic-infrastructure, nri-metadata-injection and global), while chart_version is a sibling top-level variable.

When referencing these variables elsewhere, as you will see in the deployment and applying configuration sections, you would access these nested fields using a dot (.), as usual for accessing fields in programming languages. For our example, we would use chart_values.newrelic-infrastructure, chart_values.nri-metadata-injection, chart_values.global and chart_version respectively.

The variables can theoretically be nested this way indefinitely, but for usability purposes we advise keeping the nesting at a reasonable level.

Variable definition

For the leaf nodes of the variable definitions, we currently support the following fields:

description (String)

A description of the variable, for documentation purposes.

type (String)

The value type that is accepted for this variable. As of now, the following types are supported (using the allowed values for the field):

  • string.
  • bool.
  • number: Integer or floating point are supported.
  • yaml: An arbitrary YAML value, like an array, an object or even a scalar.
  • map[string]yaml: A YAML value whose top level is guaranteed to consist of string keys mapping to other values.

required (bool)

Specifies if providing a value for this variable is required or not. If required is false, a default value of its specified type needs to be provided. If required is true, then a default value cannot be specified.

default (optional)

A default value for this variable, for the cases where no configuration value has been passed for this variable when creating an instance for the agent type. Its value must be of the same type as the one declared for the variable.

In the case of the yaml variable type, it is recommended to explicitly set a null default value as default: null.
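
The interplay between type, required and default can be sketched with one leaf per supported type. All variable names and default values below are hypothetical and chosen only for illustration; the field names and the rules in the comments are the ones described in this section.

```yaml
variables:
  log_level:                    # hypothetical variable name
    description: "Log level for the agent"
    type: string
    required: false
    default: "info"             # required: false, so a default must be given
  enable_debug:                 # hypothetical variable name
    description: "Toggle for debug output"
    type: bool
    required: false
    default: false
  listen_port:                  # hypothetical variable name
    description: "Port the agent listens on"
    type: number
    required: true              # required: true, so no default may be specified
  extra_settings:               # hypothetical variable name
    description: "Arbitrary YAML passed through to the agent"
    type: yaml
    required: false
    default: null               # explicit null default, as recommended for yaml
  config_files:                 # hypothetical variable name
    description: "Map of file names to YAML contents"
    type: map[string]yaml
    required: false
    default: {}
```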

variants (optional)

Only available for String variables.

A list of accepted values for this variable. If any configuration includes a value for this variable that is not among the specified variants, the configuration will be invalid. The accepted values can be changed in the Agent Control configuration, as in the example below:

Agent type:

my_variable:
  # ...
  type: string
  variants:
    ac_config_field: "my_variable_variants" # If this field is set in `agent_type_var_constraints.variants`, the configured values are used instead of the default ones.
    values: ["value1", "value2"] # Otherwise the values defined here are used

AC config:

agent_type_var_constraints:
  variants: # map of variants
    my_variable_variants: ["supported_value1", "supported_value2"] # The key should match what is defined in the Agent Type

By default, no variants are set, resulting in no variant validation.
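
As a sketch of how this plays out for a user, consider the following configuration values for an instance of the illustrative agent type above. The comments apply the two cases described in this section.

```yaml
# Configuration values for an agent type instance.
my_variable: "value1"
# Accepted when the list declared in the agent type applies
# ("value1", "value2").
# Invalid when the AC config sets
# agent_type_var_constraints.variants.my_variable_variants as shown above,
# because only "supported_value1" and "supported_value2" are then accepted.
```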

Agent Type Deployment

This section defines how the workload is created and managed by AC, and it lives under the top-level field deployment. The shape of deployment depends on the platform declared in the metadata: an on-host definition uses on-host deployment fields (executables, filesystem, packages, …), and a Kubernetes definition uses Kubernetes deployment fields (objects, …). Each per-platform YAML file describes a single deployment block.

The deployment field is required and cannot be empty.

The role of variables

These instructions can be rendered dynamically, taking as inputs the values of the variables exposed above, environment variables and other internal information exposed by AC. To reference any of these we use a template syntax of the form ${<NAMESPACE>:<VARIABLE_REF>}. The NAMESPACE part determines what the VARIABLE_REF part represents, and can take the following values:

  • nr-var: a variable exposed as in the previous section. If you defined a variable called configs.some_toggle then you can reference it inside the deployment section as ${nr-var:configs.some_toggle}.
  • nr-env: environment variables. So, if AC started running with an env var called MY_ENV defined, it can be used inside the deployment section with ${nr-env:MY_ENV}.
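  • nr-sub: metadata variables related to the current workload, populated automatically by AC. The variable documented here is agent_id, a unique, human-friendly identifier of the current workload; the on-host examples in this document also reference ${nr-sub:filesystem_agent_dir} and ${nr-sub:packages.infra-agent.dir} through this namespace.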
  • nr-ac: global metadata used by AC (see Global metadata list).

We will often use the term configuration values, or just values, for what is supplied for the variables declared in the variables field of an agent type definition, whether as local configuration or received as remote configuration for an agent type instance.

All of these variable references are replaced with actual values in a rendering stage that produces the final deployment instructions. The values come from the provided configuration values or, when a non-required variable is missing, from its default.

When adding these values as a user, either as a local config for AC (a file in the filesystem for on-host or a ConfigMap for Kubernetes) or as remote configs made available from FC, the format used is a YAML file whose values follow the same tree-like structure defined for the variables in the agent type definition, with the leaf nodes holding the actual values.
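
As a minimal sketch, a values file for the Kubernetes variables tree shown earlier (chart_values and chart_version) could look as follows. The chart version and the keys inside each chart value are placeholders, not real chart settings.

```yaml
chart_version: "1.2.3"            # required variable; placeholder version
chart_values:
  newrelic-infrastructure:        # type yaml, so any YAML value is accepted
    example_key: example_value    # hypothetical chart value
  global:
    another_key: true             # hypothetical chart value
  # nri-metadata-injection is omitted, so its default ({}) applies
```

Inside the deployment section these leaves are then available as ${nr-var:chart_version}, ${nr-var:chart_values.newrelic-infrastructure} and so on.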

For examples of this with actual agent type definitions, see kubernetes config examples and host config examples in the official New Relic documentation site.

Environment variable expansion on configuration values

Any AC environment variable can be referenced within local or remote configuration values using the ${nr-env:<ENVIRONMENT_VARIABLE>} syntax. During the rendering process, AC will resolve the ENVIRONMENT_VARIABLE and replace the placeholder with its corresponding value.

For example, consider the following configuration snippet:

config_agent:
  license_key: ${nr-env:LICENSE_KEY}

In this case, AC will look for an environment variable named LICENSE_KEY and substitute its value into the configuration.

It is important to note that the availability of environment variables depends on the environment where AC is running:

  • On-host installations: AC will have access to environment variables configured at the systemd service level. Ensure that any required variables are properly defined in the service configuration.
  • Kubernetes deployments: AC will have access to environment variables attached to the AC Pod. These variables can be defined in the Pod's manifest, typically under the env section.

By leveraging this mechanism, you can dynamically inject environment-specific values into your configurations, simplifying deployment and ensuring flexibility across different environments.

The following examples show the deployment block for the Linux, Windows and Kubernetes definitions of the New Relic Infrastructure Agent — each in its own per-platform YAML file.

Linux (platform: host, operating_system: linux):

deployment:
  enable_file_logging: ${nr-var:enable_file_logging}
  health:
    interval: 5s
    initial_delay: 5s
    timeout: 5s
    http:
      path: "/v1/status/health"
      port: ${nr-var:health_port}
  packages:
    infra-agent:
      download:
        oci:
          repository: ${nr-var:oci.repository}
          version: ${nr-var:version}
  filesystem:
    config:
      kind: dir
      entries:
        newrelic-infra.yaml:
          kind: file
          text: |
            ${nr-var:config_agent}
    integrations.d:
      kind: dir_content_from_map
      source: ${nr-var:config_integrations}
    logging.d:
      kind: dir_content_from_map
      source: ${nr-var:config_logging}
  executables:
    - id: newrelic-infra
      path: ${nr-sub:packages.infra-agent.dir}/newrelic-infra
      args:
        - --config
        - ${nr-sub:filesystem_agent_dir}/config/newrelic-infra.yaml
      env:
        NRIA_PLUGIN_DIR: "${nr-sub:filesystem_agent_dir}/integrations.d"
        NRIA_LOGGING_CONFIGS_DIR: "${nr-sub:filesystem_agent_dir}/logging.d"
        NRIA_STATUS_SERVER_ENABLED: true
        NRIA_STATUS_SERVER_PORT: "${nr-var:health_port}"
        NR_HOST_ID: "${nr-ac:host_id}"
      restart_policy:
        backoff_strategy:
          type: fixed
          backoff_delay: ${nr-var:backoff_delay}

Windows (platform: host, operating_system: windows):

deployment:
  enable_file_logging: ${nr-var:enable_file_logging}
  health:
    interval: 5s
    initial_delay: 5s
    timeout: 5s
    http:
      path: "/v1/status/health"
      port: ${nr-var:health_port}
  packages:
    infra-agent:
      download:
        oci:
          repository: ${nr-var:oci.repository}
          version: ${nr-var:version}
  filesystem:
    config:
      kind: dir
      entries:
        newrelic-infra.yaml:
          kind: file
          text: |
            ${nr-var:config_agent}
    integrations.d:
      kind: dir_content_from_map
      source: ${nr-var:config_integrations}
    logging.d:
      kind: dir_content_from_map
      source: ${nr-var:config_logging}
  executables:
    - id: newrelic-infra
      path: ${nr-sub:packages.infra-agent.dir}\\newrelic-infra.exe
      args:
        - --config
        - ${nr-sub:filesystem_agent_dir}\\config\\newrelic-infra.yaml
      env:
        NRIA_PLUGIN_DIR: "${nr-sub:filesystem_agent_dir}\\integrations.d"
        NRIA_LOGGING_CONFIGS_DIR: "${nr-sub:filesystem_agent_dir}\\logging.d"
        NRIA_STATUS_SERVER_ENABLED: true
        NRIA_STATUS_SERVER_PORT: "${nr-var:health_port}"
        NR_HOST_ID: "${nr-ac:host_id}"
      restart_policy:
        backoff_strategy:
          type: fixed
          backoff_delay: ${nr-var:backoff_delay}

Kubernetes (platform: kubernetes):

deployment:
  health:
    interval: 30s
    initial_delay: 30s
    checks:
      - namespace: ${nr-ac:namespace}
        name: ${nr-sub:agent_id}
        kind: HelmReleaseWorkload
        target_namespace: ${nr-ac:namespace_agents}
  objects:
    release:
      apiVersion: helm.toolkit.fluxcd.io/v2
      kind: HelmRelease
      metadata:
        name: ${nr-sub:agent_id}
        namespace: ${nr-ac:namespace}
      spec:
        targetNamespace: ${nr-ac:namespace_agents}
        releaseName: ${nr-sub:agent_id}
        interval: 3m
        # ... omitted for brevity
        values:
          newrelic-infrastructure: ${nr-var:chart_values.newrelic-infrastructure}
          nri-metadata-injection: ${nr-var:chart_values.nri-metadata-injection}
          kube-state-metrics: ${nr-var:chart_values.kube-state-metrics}
          nri-kube-events: ${nr-var:chart_values.nri-kube-events}
          global: ${nr-var:chart_values.global}

Global metadata list

Some global metadata is available for both on-host and k8s deployments. Be aware that the available fields differ between the two.

For on-host, we have:

  • host_id: contains an identifier calculated from the retrieved information about the host, such as the hostname or cloud-related data (when available).
  • filesystem_agent_dir: contains the absolute path to a dedicated file system directory for this sub-agent. The default value in Linux systems is /var/lib/newrelic-agent-control/filesystem/<AGENT_ID>. Note how the agent type definition uses this variable for content added via the filesystem field (see below).

For k8s, we have:

  • namespace: the namespace where Agent Control and Flux will be created.
  • namespace_agents: the namespace where sub-agents will be created. Due to a limitation in the k8s-agents-operator, Instrumentation CRs are created in this namespace too.

On-host deployment definition

The following fields are used for configuring the on-host deployment of a sub-agent.

executables

Instructions to actually run the sub-agent process. It is composed of the following fields:

  • path: Full path to the executable binary. A string.
  • args: Command line arguments passed to the executable. An array of strings.
  • env: A key-value mapping of environment variables and their respective values. Strings.
  • restart_policy: How the sub-agent should behave if it ends execution. If this policy's limits are exceeded, the sub-agent will be marked as unhealthy (see Health status below) and will no longer be restarted. Accepts the following fields:
    • backoff_strategy: Timing-related configuration for the restart, to prevent wasteful crash-loops. Accepts the following values:
      • type: either fixed, linear or exponential.
      • backoff_delay: Time between restarts. This is a time string in the form of 10s, 1h, etc.
      • max_retries: Maximum number of restart tries. A number.
      • last_retry_interval: Time window after which the restart state is reset. If more than this interval passes after the restart policy was triggered, the policy's state, such as the current retry count and the back-off delays, is reset. This is a time string in the form of 10s, 1h, etc.
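
A restart policy that sets every field listed above might look like the following sketch. The concrete values are illustrative only and are not recommended defaults.

```yaml
restart_policy:
  backoff_strategy:
    type: exponential          # one of fixed, linear or exponential
    backoff_delay: 10s         # time between restarts
    max_retries: 5             # maximum number of restart tries
    last_retry_interval: 1h    # retry count and delays are reset after this interval
```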

As of now, the executables field is an array and is actually optional. This was intended to cover the APM agents use case for on-host, in which the agents are not processes but libraries or plugins injected into other processes (customer applications) whose lifecycle AC must not manage (see Agent-less supervisors below). However, this is not yet supported. An agent without executables is accepted as valid, but AC will just spawn an internal supervisor structure for the sub-agent without actually doing anything besides checking health, if it was configured.

filesystem

Represents the file system configuration for the deployment of a host agent. It consists of a set of directories (map keys), which in turn contain a set of files (nested map keys) with their respective content (map values).

The contents defined here will be written to the sub-agent's dedicated directory for filesystem files, which can be referenced in other fields via the variable ${nr-sub:filesystem_agent_dir}.

The files can be hardcoded, with the contents possibly containing templates, or the whole set of files can be templated, so a directory contains an arbitrary number of files (a place to use a map[string]yaml variable type). The paths cannot be templated individually.

Every directory and every file is declared with a kind, and directory trees are built recursively via an entries: field. A directory's contents can also be templated from a map[string]yaml variable using kind: dir_content_from_map, in which case the map's keys become filenames and the values become file contents.

Each key names a single entry at its own level — it must be a single path segment (a leaf), not a slash-separated sub-path. A nested directory has to be spelled out level by level with explicit kind: dir + entries: blocks; a key such as newrelic-infra/newrelic-integrations/logging is rejected. Declare it as:

newrelic-infra:
  kind: dir
  entries:
    newrelic-integrations:
      kind: dir
      entries:
        logging:
          kind: dir

This applies to projected filenames too: the keys of a map[string]yaml used by dir_content_from_map must also be single segments.
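
For example, in configuration values feeding a dir_content_from_map entry, the rule would apply as sketched below. The conf.d prefix is a hypothetical sub-path used only to show a rejected key.

```yaml
config_integrations:
  nri-mysql.yaml: |            # valid key: a single path segment
    integrations:
      - name: nri-mysql
  # A key like the following would be rejected, because it is a sub-path:
  # conf.d/nri-redis.yaml: |
  #   integrations:
  #     - name: nri-redis
```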

The example below uses these variables:

variables:
  config_agent:
    description: "Newrelic infra configuration"
    type: yaml
    required: false
    default: ""
  config_integrations:
    description: "map of YAML configs for the OHIs"
    type: map[string]yaml
    required: false
    default: {}
  config_logging:
    description: "map of YAML config for logging"
    type: map[string]yaml
    required: false
    default: {}

And this filesystem block:

filesystem:
  newrelic-infra.yaml:
    kind: file
    persistent: true
    text: |
      ${nr-var:config_agent}

  config:
    kind: dir
    persistent: true

  logging.d:
    kind: dir_content_from_map
    source: ${nr-var:config_logging}

  agent:
    kind: dir
    entries:
      data:
        kind: dir
        persistent: true
      integrations.d:
        kind: dir_content_from_map
        source: ${nr-var:config_integrations}
      newrelic-infra.yaml:
        kind: file
        text: |
          ${nr-var:config_agent}

Worked examples

Given these user-supplied values:

config_agent: |
  license_key: REDACTED
  log:
    level: info

config_integrations:
  nri-mysql.yaml: |
    integrations:
      - name: nri-mysql
        env:
          HOSTNAME: localhost
  nri-redis.yaml: |
    integrations:
      - name: nri-redis
        env:
          HOSTNAME: localhost

config_logging:
  syslog.yaml: |
    logs:
      - name: syslog
        file: /var/log/syslog

The runtime produces the following on disk under ${nr-sub:filesystem_agent_dir}. Each kind is shown in isolation.

kind: file: single file rendered from the templated text: field. persistent: true keeps it across agent-control stop and restarts.

newrelic-infra.yaml      ← contents from ${nr-var:config_agent}

kind: dir: an explicitly declared directory. With no entries: it's just an (optionally persistent) empty directory; with entries: it builds a tree, where each child is itself any of the three kinds, including another dir, so recursion is uniform.

config/                  ← empty, persistent

agent/
├── data/                ← empty, persistent
├── integrations.d/      ← projected from config_integrations (see below)
│   ├── nri-mysql.yaml
│   └── nri-redis.yaml
└── newrelic-infra.yaml  ← contents from ${nr-var:config_agent}

kind: dir_content_from_map: a directory whose entries are projected from a map[string]yaml variable at deploy time. Map keys become filenames; map values become file bodies.

logging.d/
└── syslog.yaml          ← contents from config_logging["syslog.yaml"]

agent/integrations.d/
├── nri-mysql.yaml       ← contents from config_integrations["nri-mysql.yaml"]
└── nri-redis.yaml       ← contents from config_integrations["nri-redis.yaml"]

Entry kinds reference

file — a single file with literal or templated content.

| Field | Required | Default | Description |
| --- | --- | --- | --- |
| kind | yes | | Must be file. |
| text | yes | | File body. May reference ${nr-var:…} / ${nr-sub:…}. |
| persistent | no | false | If true, survives sub-agent stop/restart. |

dir — an explicitly declared directory. Its children, if any, live under entries:.

| Field | Required | Default | Description |
| --- | --- | --- | --- |
| kind | yes | | Must be dir. |
| entries | no | {} | Map of child entries (any kind). Recursive. Each key must be a single path segment, not a sub-path. |
| persistent | no | false | If true, this directory survives stop/restart. Not inherited: each child is judged by its own persistent flag (see Persistence in Filesystem). |

dir_content_from_map — a directory whose set of files is computed at deploy time from a map[string]yaml variable. The map's keys become filenames; the values become file contents.

| Field | Required | Default | Description |
| --- | --- | --- | --- |
| kind | yes | | Must be dir_content_from_map. |
| source | yes | | Reference to a map[string]yaml variable (${nr-var:…}). |

Persistence in Filesystem

Every file and dir entry accepts a boolean persistent: (default false). Two independent mechanisms govern lifecycle:

  • The persistent flag controls whether the entry's on-disk path is wiped when the tree is cleaned: on sub-agent stop, and just before every (re)write of the tree (start, restart, and config apply). Ephemeral entries are wiped at those points; persistent entries are kept. Wiping before each write means leftover ephemeral content never carries across — even after an ungraceful shutdown (crash/SIGKILL) that skipped the stop-time cleanup.
  • The manifest drives reconciliation on every write event. Anything Agent Control wrote on the previous successful write is recorded in the manifest. On the next write, Agent Control diffs the manifest against the new declared set: paths it owned previously and no longer owns are deleted; paths it never owned are left alone.

The flag does not shield the entry from intentional removal: if you delete an entry from the agent type (or remove a key from a dir_content_from_map source map), the manifest diff catches it and the on-disk path is deleted on the next write event.

persistent applies per entry and does not cascade to children. When cleaning (on stop, and before each (re)write), cleanup walks the declared tree: a persistent entry is kept and the walk descends into its children, while an ephemeral entry is deleted together with its entire on-disk subtree (a recursive remove_dir_all, which stops the walk there). So a nested path survives cleanup only if every declared node on the path is persistent: true.
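
The non-cascading behaviour can be sketched with a small tree. All entry names here are hypothetical; the comments state what the cleanup walk described above would do with each node.

```yaml
filesystem:
  state:
    kind: dir
    persistent: true          # kept; the walk descends into its children
    entries:
      cache:
        kind: dir
        persistent: true      # kept: every declared node on the path is persistent
      tmp:
        kind: dir             # ephemeral (default): deleted with its whole subtree
        entries:
          keep-me:
            kind: dir
            persistent: true  # still deleted, because its parent tmp is ephemeral
```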

dir_content_from_map has no persistent flag. Agent Control owns and re-renders the projected files on every write, so it is always ephemeral. A persistent: key left in the YAML is silently ignored, so older configs still parse.

Manifest

After every successful write, Agent Control writes .ac-managed-paths.json inside the sub-agent's filesystem directory listing the absolute paths it just wrote. This filename is reserved — agent types must not declare it.

The manifest is the source of truth for "what Agent Control owns." Files the sub-agent process creates at runtime are never in the manifest, so they're invisible to reconciliation: they survive every write event, every sub-agent restart, and every config update. They're only removed if some declared ancestor directory is itself removed from the agent type (the remove_dir_all of the parent takes them as collateral) or if the agent is removed from the fleet.

The manifest stores rendered paths, not agent-type declarations. Reconciliation runs on the tree after variable substitution, so the manifest records the concrete absolute paths Agent Control actually wrote. This matters most for dir_content_from_map: the agent type only names the directory and a source: variable, but at render time each map key is expanded into its own file path, and every one of those paths is recorded individually in the manifest.

As a result, removing a key from the source map is reconciled exactly like deleting a literal entry from the agent type: the rendered path is in the previous manifest but absent from the new declared set, so it is deleted on the next write.

Example. Agent type:

filesystem:
  integrations.d:
    kind: dir_content_from_map
    source: ${nr-var:config_integrations}

The config_integrations variable (a map[string]yaml) supplied at deploy time:

config_integrations:
  nri-mysql.yaml: |
    integrations:
      - name: nri-mysql
  nri-redis.yaml: |
    integrations:
      - name: nri-redis

With ${nr-sub:filesystem_agent_dir} resolving to /var/lib/newrelic-agent-control/filesystem/nr-infra, the write produces integrations.d/nri-mysql.yaml and integrations.d/nri-redis.yaml, and the resulting .ac-managed-paths.json is:

{
  "managed_paths": [
    "/var/lib/newrelic-agent-control/filesystem/nr-infra/integrations.d",
    "/var/lib/newrelic-agent-control/filesystem/nr-infra/integrations.d/nri-mysql.yaml",
    "/var/lib/newrelic-agent-control/filesystem/nr-infra/integrations.d/nri-redis.yaml"
  ]
}

Note the directory plus one entry per rendered map key: it is the variable's content that lands in the manifest, not the source: reference. If the next deploy drops nri-redis.yaml from config_integrations, the new declared set no longer contains …/integrations.d/nri-redis.yaml while the previous manifest still does, so that file is deleted on the next write.

Lifecycle

  • Ephemeral (persistent: false, default). Wiped on sub-agent stop, and again just before every (re)write of the tree (start, restart, config apply); the declared entry itself is then re-created by the write. Leftover content (including files the agent created inside an ephemeral directory) never carries across a restart, even an ungraceful one.
  • Persistent (persistent: true). Kept on stop and across (re)writes; each write just re-renders its declared content.
  • Removed from fleet. When an agent is removed from the fleet config (via remote config or by being absent at AC startup after a previous deploy), its entire filesystem directory is deleted by ResourceCleaner. The persistent flag is bypassed.

| Event | Ephemeral (persistent: false) | Persistent (persistent: true) |
| --- | --- | --- |
| Agent start | Wiped, then reconcile (manifest diff) + write | Kept; reconcile (manifest diff) + write |
| Agent stop | Path deleted | Path kept |
| Agent restart | Wiped, then reconcile + write | Kept; reconcile + write |
| Config update | Wiped, then reconcile + write | Kept; reconcile + write |
| Removed from fleet | Filesystem dir deleted by ResourceCleaner | Filesystem dir deleted by ResourceCleaner |

Agent-process-created files survive a reconcile + write (they're not in the manifest and not declared) except files inside an ephemeral directory, which are wiped along with it on stop and before each (re)write. To keep agent-created content across restarts, place it under a persistent directory.

packages

Defines OCI packages containing the executables and data to be downloaded and installed for the sub-agent. This is a map where keys are package identifiers and values contain package metadata and download configuration.

Each value in the map looks like this:

  download:
    oci:
      repository: ${nr-var:oci.repository}
      version: ${nr-var:version}
      public_key_url: https://publickeys.newrelic.com/g/agent-control-oci/global/nrinfraagent/jwks.json

The package version can be:

  • A tag (:v1.0.0)
  • A digest (@sha256:...)
  • Both tag and digest (:v1.0.0@sha256:...). When both are specified, the digest takes precedence.

public_key_url is an optional field. When it is not configured, signature verification is skipped and this is logged at warn level.

Warning

The package in the OCI repository MUST follow a specific structure.

Post-Download Hook:

The post_download_hook is an optional field that allows executing a custom script after the package is downloaded and extracted. This is useful for:

  • Installing system dependencies
  • Compiling native code
  • Performing system configuration
  • Validating installation requirements
  • Running setup scripts that cannot be handled through simple file extraction

The hook runs with a hardcoded, non-configurable timeout of 300 seconds (5 minutes). If the script exits with a non-zero status code, the package installation fails.

  post_download_hook:
    path: /bin/bash           # or just "bash" (searches in PATH)
    args:
      - /absolute/path/to/script.sh
      - --arg1
      - --arg2
    env:
      PACKAGE_VERSION: ${nr-var:version}
      CUSTOM_VAR: some-value

Fields:

  • path: Path to the command/interpreter. Can be absolute (e.g., /bin/bash, C:\Windows\System32\cmd.exe) or relative (e.g., bash, python3, cmd) which will be searched in the system PATH. Required.
  • args: List of arguments passed to the command. The structure depends on your use case (see examples below). Can be empty for binaries that don't require arguments. Required.
  • env: Optional map of environment variables passed to the script process.

The script execution environment includes:

  • PACKAGE_DIR: Automatically set to the package installation directory
  • Current working directory: Set to the package directory
  • stdout: Discarded (to avoid log noise)
  • stderr: Captured and logged on failure

Note

On Unix systems, if path points to a file, it will be automatically made executable (chmod +x) before execution. This ensures scripts extracted from OCI packages work even if they don't have execute permissions in the archive.
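The execution environment described above can be approximated with a short sketch. This is an illustration under stated assumptions, not AC's implementation: the function name is hypothetical, and inheriting the parent process environment is an assumption that the text above does not confirm.

```python
import os
import subprocess
from typing import Dict, List, Optional

HOOK_TIMEOUT_SECONDS = 300  # hardcoded timeout stated in the docs


def run_post_download_hook(
    path: str,
    args: List[str],
    package_dir: str,
    env: Optional[Dict[str, str]] = None,
) -> None:
    """Run the hook in the package directory; raise if it fails."""
    hook_env = dict(os.environ)  # assumption: parent environment is inherited
    hook_env.update(env or {})
    hook_env["PACKAGE_DIR"] = package_dir

    # On Unix, make the target executable if it is a file lacking that bit.
    if os.name == "posix" and os.path.isfile(path) and not os.access(path, os.X_OK):
        os.chmod(path, os.stat(path).st_mode | 0o111)

    result = subprocess.run(
        [path, *args],
        cwd=package_dir,            # working directory is the package directory
        env=hook_env,
        stdout=subprocess.DEVNULL,  # stdout is discarded
        stderr=subprocess.PIPE,     # stderr is captured for the failure message
        timeout=HOOK_TIMEOUT_SECONDS,
        check=False,
    )
    if result.returncode != 0:
        stderr = result.stderr.decode(errors="replace")
        raise RuntimeError(
            f"post-download hook exited with {result.returncode}: {stderr}"
        )
```

If the hook exceeds the timeout, `subprocess.run` raises `subprocess.TimeoutExpired`, which a caller would treat as an installation failure as well.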

Linux Examples:

# Using bash from PATH with absolute script path
post_download_hook:
  path: bash
  args:
    - /opt/newrelic/install.sh
    - --check-dependencies
  env:
    AGENT_VERSION: ${nr-var:version}

# Using absolute interpreter path
post_download_hook:
  path: /usr/bin/python3
  args:
    - /opt/newrelic/setup.py
    - --install

# Using relative script path (relative to package directory)
post_download_hook:
  path: bash
  args:
    - ./install.sh
    - --verbose

# Direct binary execution without arguments
post_download_hook:
  path: /usr/bin/validate-system
  args: []

Windows Examples:

# Using cmd.exe with /c flag
post_download_hook:
  path: cmd
  args:
    - /c
    - C:\newrelic\install.bat
    - --check-dependencies
  env:
    AGENT_VERSION: ${nr-var:version}

# Using PowerShell
post_download_hook:
  path: powershell
  args:
    - -ExecutionPolicy
    - Bypass
    - -File
    - C:\newrelic\setup.ps1

# Direct batch script execution (Windows can execute .bat/.cmd directly)
post_download_hook:
  path: C:\newrelic\install.bat
  args:
    - --verbose

Complete package example:

    packages:
      ebpf-agent:
        download:
          oci:
            repository: ${nr-var:oci.repository}
            version: ${nr-var:version}
        post_download_hook:
          path: bash
          args:
            - ./install.sh
            - --check-dependencies
          env:
            AGENT_VERSION: ${nr-var:version}

Accessing Package Contents:

After installation, the package directory path is available via the reserved variable ${nr-sub:packages.<package-id>.dir}, where <package-id> is the key used in the packages map.

In the following example:

  • A package named infra-agent is downloaded from an OCI registry.
  • The package installation directory is referenced in the executable path using ${nr-sub:packages.infra-agent.dir}.

    packages:
      infra-agent:
        download:
          oci:
            repository: ${nr-var:oci.repository}
            version: ${nr-var:version}
    executables:
      - id: newrelic-infra
        path: ${nr-sub:packages.infra-agent.dir}\\newrelic-infra.exe

enable_file_logging (bool)

When set, this redirects the stdout and stderr of the created process to files inside AC's logging directory (see on-host troubleshooting in the official public documentation). These log files will reside inside a directory dedicated to the current sub-agent, identifiable by its agent_id.

health (on-host)

Enables periodically checking the health of the sub-agent. See Health status below for more details. Accepts the following values:

  • interval: Periodicity of the check. A duration string.
  • initial_delay: Initial delay before the first health check is performed. A duration string.
  • timeout: Maximum duration a health check may run before it is considered failed.
  • http or file: The type of health check used.
    • http means that the supervisor for this sub-agent will attempt to query an HTTP endpoint and will decide on healthiness depending on the status code. Accepts the following fields:
      • host, string.
      • path, string.
      • port, a number.
      • headers: key-value pairs for authentication or other required info.
      • healthy_status_codes: The status codes that mean a healthy state. If not set, status codes in the 200s are currently considered healthy and the rest unhealthy.
    • file means that the supervisor for this sub-agent will attempt to read a file and find expected contents. Failing to do so, or reading information that means an unhealthy state, will mark the sub-agent as unhealthy. Accepts path as its only field.

If no health configuration is defined, AC will label the sub-agent as unhealthy when its restart policy (if one is defined) is exceeded.

Kubernetes namespace usage

Agent Control in Kubernetes uses two distinct namespaces for resource management:

  • Agent Control namespace (namespace): This is where Agent Control, Flux, and their supporting resources are created and managed.
  • Agents namespace (namespace_agents): This is dedicated to sub-agents and their managed resources. Ideally, Instrumentation CRs should be in the Agent Control namespace, but due to a limitation in the k8s-agents-operator, they must be in the same namespace as the operator.

This separation improves security: sub-agents can't use the Flux or Agent Control Service Accounts, which have wide privileges. When defining agent types or configuring deployments, ensure that resources are created in the correct namespace. The variables ${nr-ac:namespace} and ${nr-ac:namespace_agents} are available for templating these values in your agent type definitions.

Kubernetes deployment definition

The following fields are used for configuring the Kubernetes deployment of a sub-agent.

health (Kubernetes)

The health configuration for Kubernetes. See Health status below for more details. Accepts the following values:

  • interval: Periodicity of the check. A duration string. Defaults to 60s.
  • initial_delay: Initial delay before the first health check is performed. A duration string. Defaults to zero.
  • checks: An optional list of Kubernetes resources to health-check. If omitted or empty, health checking is disabled for this sub-agent. Each entry accepts:
    • name: The name of the Kubernetes object (supports template variables).
    • namespace: The namespace where the object lives (supports template variables).
    • kind: The kind of resource to check. One of:
      • Deployment, DaemonSet, StatefulSet: checks the named workload directly. If the resource does not exist, the sub-agent is considered healthy (a missing workload is not treated as a failure). Health is computed from the workload's status, e.g. desired vs. available replicas.
      • Instrumentation: checks a New Relic Instrumentation CR. If the resource does not exist, the health check reports an error.
      • HelmReleaseWorkload: checks the named HelmRelease CR plus the Deployment, DaemonSet, and StatefulSet workloads belonging to the release (discovered via the Flux label helm.toolkit.fluxcd.io/name). If the HelmRelease CR does not exist, the health check reports an error.
    • target_namespace: the namespace where the Helm-deployed workloads run. Defaults to namespace. Use this when the HelmRelease installs workloads into a different namespace than the one containing the HelmRelease CR itself.

Example for a Helm-based agent deploying workloads into a separate namespace:

health:
  interval: 30s
  initial_delay: 30s
  checks:
    - namespace: ${nr-ac:namespace}
      name: ${nr-sub:agent_id}
      kind: HelmReleaseWorkload
      target_namespace: ${nr-ac:namespace_agents}

Example for an APM agent using an Instrumentation CR:

health:
  interval: 30s
  initial_delay: 30s
  checks:
    - namespace: ${nr-ac:namespace_agents}
      name: ${nr-sub:agent_id}
      kind: Instrumentation

Example checking individual workload kinds explicitly:

health:
  interval: 30s
  initial_delay: 30s
  checks:
    - namespace: ${nr-ac:namespace_agents}
      name: my-deployment
      kind: Deployment
    - namespace: ${nr-ac:namespace_agents}
      name: my-daemonset
      kind: DaemonSet
    - namespace: ${nr-ac:namespace_agents}
      name: my-statefulset
      kind: StatefulSet

Note

In the example above the agent will be considered unhealthy if any of the corresponding resources is found but its status doesn't meet the workload criteria. This allows supporting agents with configurable workloads.

objects

Key-value pairs of the Kubernetes Objects to be created by this sub-agent on deployment. The key is an internal identifier of the object, while the value is the object itself which accepts the following values:

  • apiVersion, a string.
  • kind, a string.
  • metadata: Accepting the following:
    • name, a string.
    • namespace, a string.
    • labels: key-value pair of strings representing Kubernetes labels.
  • And a collection of arbitrary fields representing the actual data (e.g. the spec) of the object.

Most Agent Control sub-agents currently deploy Flux CRs, which result in a Helm chart installation.

You can check an existing agent type with a Kubernetes deployment as an example. This file includes all necessary Flux CR configurations required for Agent Control to manage sub-agent deployments effectively. It serves as a comprehensive reference for understanding the integration and deployment process.

Where agent type definitions come from

Before AC can create a sub-agent, it must resolve the agent type referenced in your config (for example newrelic/com.newrelic.infrastructure:0.1.0) to an actual definition. AC looks for that definition in three sources, in a fixed order of precedence. The first source that provides a matching definition wins, so the order is:

  1. Custom (local) definitions — highest precedence. These are YAML files you place in AC's dynamic agent types directory (on-host: /etc/newrelic-agent-control/dynamic-agent-types), read from disk at startup. A custom definition whose id matches a built-in one overrides the built-in. If two custom files declare the same id, the one whose file name sorts last wins.

    Custom definitions are intended for development and testing only, not for production use. They are a way to iterate on a definition locally before it is shipped as an embedded definition or published to a remote registry.

  2. Embedded (built-in) definitions — the agent types shipped with AC. They are compiled into the binary, so they are always available with no network or filesystem dependency. The currently embedded definitions are listed in the agent type registry.

  3. Remote definitions — lowest precedence. If the agent type is not found locally, AC fetches it from an OCI registry.

Only definitions that target the environment of the running binary are considered: the on-host binary only sees host definitions matching its operating system, and the Kubernetes binary only sees kubernetes definitions (see Agent Type Metadata). A definition that targets a different platform is treated as not found, so the lookup falls through to the next source.
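The lookup order and the platform filter can be sketched as follows. This is a simplified model for illustration: the function names and the dictionary-based representation of a definition (with hypothetical keys `id`, `platform` and `operating_system`) are assumptions, and applying the platform filter before the file-name tie-break is also an assumption.

```python
from typing import Callable, Dict, List, Optional, Tuple

Definition = Dict[str, str]
Target = Tuple[str, Optional[str]]  # (platform, operating_system)


def targets(defn: Definition, target: Target) -> bool:
    """True if the definition matches the running binary's environment."""
    platform, operating_system = target
    if defn.get("platform") != platform:
        return False
    return platform != "host" or defn.get("operating_system") == operating_system


def resolve(
    agent_type_id: str,
    target: Target,
    custom_files: Dict[str, Definition],  # file name -> definition
    embedded: List[Definition],
    fetch_remote: Callable[[str], Optional[Definition]],
) -> Optional[Definition]:
    # 1. Custom definitions: if several files match, the last file name wins.
    matches = [
        defn
        for _name, defn in sorted(custom_files.items(), key=lambda item: item[0])
        if defn["id"] == agent_type_id and targets(defn, target)
    ]
    if matches:
        return matches[-1]
    # 2. Embedded definitions compiled into the binary.
    for defn in embedded:
        if defn["id"] == agent_type_id and targets(defn, target):
            return defn
    # 3. Remote definitions, only reached when nothing matched locally.
    remote = fetch_remote(agent_type_id)
    if remote is not None and targets(remote, target):
        return remote
    return None
```

A custom file that targets a different platform is skipped by the filter, so the lookup falls through to the embedded definition.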

This precedence is what makes the custom directory useful for development. You can add a brand-new agent type, or iterate on and override an existing one, simply by dropping a file there, without rebuilding AC or editing the embedded registry, while still falling back to the built-in and remote sources for everything else. For a step-by-step walkthrough of adding a custom on-host agent type, see the development guide in the agent type overview.

Applying configurations

When AC first runs with static configs, or when it is already running and receives remote configuration values from FC, it creates an internal entity called a supervisor for each of the declared sub-agents. Each of these supervisors has the following responsibilities:

  1. Retrieve the configuration available for it, either locally or by listening for remote if FC is enabled.
  2. Attempt to assemble the actual, effective config that the sub-agent will have.
  3. If the assembly is successful, attempt to deploy (spawn process or create Kubernetes resources) the sub-agent using the effective config.
  4. Once the sub-agent is deployed:
    • Perform regular health checks.
    • Restart it if it crashes, according to the configured restart policy (for on-host).
    • Ensure that the resources match the ones defined in the agent type (for k8s).
  5. If Fleet Control is enabled, the supervisor will listen for incoming remote configs different from the one currently in use:
    • When receiving one, the supervisor will stop its workload and restart from step 1 again.
    • If an empty config is passed, it means that this agent should be retired, so the supervisor will just stop its workload and exit.
  6. On failure of assembly or deployment, the supervisor will be kept alive, but will report itself as unhealthy. If FC is enabled, this offers the user the possibility of pushing a new remote config, in case the sub-agent was left in a bad state due to receiving an invalid one.
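The responsibilities listed above boil down to a small decision for each config the supervisor handles. The sketch below models only that decision; the function name, the callback parameters and the returned dictionary are hypothetical, and health checks, restarts and the actual stopping of the workload are left out.

```python
from typing import Any, Callable, Dict, Optional


def apply_config(
    config: Optional[Dict[str, Any]],
    assemble: Callable[[Dict[str, Any]], Any],
    deploy: Callable[[Any], None],
) -> Dict[str, Optional[str]]:
    """Apply one config and report the resulting supervisor state."""
    if not config:
        # An empty config means the agent is retired: stop and exit.
        return {"state": "retired", "last_error": None}
    try:
        effective = assemble(config)  # build the effective config
        deploy(effective)             # spawn a process or create k8s resources
    except Exception as err:
        # The supervisor stays alive and reports itself as unhealthy,
        # so a corrected remote config can still be pushed later.
        return {"state": "unhealthy", "last_error": str(err)}
    return {"state": "running", "last_error": None}
```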

Agent Control itself shares much of the behavior of a supervisor. That is how, if FC is enabled, it can receive remote configs (mainly the desired list of sub-agents) and apply them.

A note on the effective config

When we mention a sub-agent's effective config, we actually mean a concept from the OpAMP protocol. It consists of the configuration values that can be received remotely from an OpAMP server, so it does not necessarily (and often just won't) match the configuration of the workload itself. The configuration values are expected to be combined with the agent type definition to render the final instructions on how to run the agents. You can assume that the effective config is more for the supervisor than for the sub-agent itself.

Of course, these values might still contain your observability agent's own config among the rest of the values, but it should not be assumed that these values fully determine and represent the actual state of your agent's config at all times. For example, if your agent is designed in a way that can accept remote configs through other means (like over the network) that take precedence over the config it first runs with or the configs present in files it watches (as these could be rendered by the supervisor), integrating your agent with AC does not make it aware of these other configuration means, so a mismatch of what we call the effective config vs the actual config of your agent is to be expected.

The following flowchart illustrates the config application of a sub-agent via its supervisor, though it omits the health checks and their explicit reporting (along with the effective config) to FC.

%%{ init: { 'theme': 'neutral' }}%%
flowchart TB
    classDef central fill:#00E580;
    classDef optional stroke-dasharray: 2 2

    AC@{ label: Agent Control process }
    ACC@{ shape: doc, label: "AC config values
      (local or remote)"}
    S@{ shape: procs, label: Agent Supervisor }
    C@{ shape: doc, label: "Sub-agent config values
      (local or remote)"}
    T@{ shape: doc, label: Agent type definition}
    A@{ shape: diamond, label: Assemble }
    G@{ shape: lean-r, label: Assembled Agent }
    P@{ label: Config assets }
    D@{ shape: diamond, label: Deploy }
    DA@{ label: Deployed Agent }
    F@{ shape: doc, label: Effective config }

    AC -->|reads or listens for| ACC
    AC -->|creates| S
    S -->|read or listen for| C
    S -->|for| T
    C & T --> A
    A -->|ok| G
    G -.->|"persists (if any)"| P
    G --> D
    G -->|has| F
    DA -.->|"reads (if any)"| P
    A & D -->|error| S
    D --> DA

    class AC central
    class C central
    class ACC central
    class T central
    class DA central
    class F central

    class P optional

Health status

The health status that AC reports to FC follows the definition of component health used by the OpAMP protocol. Essentially, for each sub-agent we will send the following information:

  • If it is healthy or not.
  • The time the sub-agent was started (as UNIX time in nanoseconds).
  • The time of the last health check (as UNIX time in nanoseconds).
  • A status message using agent-specific semantics.
  • If the sub-agent is unhealthy, a human-readable error message commonly called last error.
  • Optionally, we are capable of sending this same information for arbitrary levels of subcomponents, representing a composite, more granular health. As of now, this information won't be used by FC.

However, we don't offer the same degree of support for agent type authors to populate this information, and in some cases we provide this information internally from AC. Where complete support is offered, the author of the agent type is ultimately responsible for the contents of the health messages built (for example, ensuring the status message uses agent-specific semantics or the error message is human-friendly).

Health for On-host

HTTP

Currently, HTTP support for on-host health checks is kept simple. With the host, path and port provided in the agent type definition, AC will compose a URL and the supervisor for any of these sub-agents will periodically perform an HTTP GET request to it. The response body of this request converted to UTF-8 will be used as the health status message.

If the status code is in the 200s (successful) or is one of the configured healthy status codes, the sub-agent will be reported as healthy. If the request times out or the status code is not one defined as healthy, AC will report the sub-agent as unhealthy, using "Health check failed with HTTP response status code <CODE>" as the last error string.
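The status-code rule in the previous paragraph can be written as a small pure function. This is a sketch that follows the wording of that paragraph (2xx or a configured code counts as healthy); the function name and signature are hypothetical, and the HTTP request itself and the timeout case are left out.

```python
from typing import Iterable, Optional, Tuple


def evaluate_http_health(
    status_code: int,
    healthy_status_codes: Iterable[int] = (),
) -> Tuple[bool, Optional[str]]:
    """Return (healthy, last_error) for an HTTP health check response."""
    if 200 <= status_code <= 299 or status_code in set(healthy_status_codes):
        return True, None
    return (
        False,
        f"Health check failed with HTTP response status code {status_code}",
    )
```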

File

With file-based health checks, a YAML file is expected to be present at the location configured, with the following format:

healthy: false
status: "some agent-specific message"
last_error: "some error message" # optional, for the case healthy == false
start_time_unix_nano: 1725444000000000000
status_time_unix_nano: 1725444001000000000

AC will periodically attempt to read this file and forward its contents as a health message. If healthy is true but last_error has contents, the latter will be ignored and the check will be considered healthy.

The workload is responsible for keeping this file updated over time, as AC will not check that property. It will only parse and propagate the values as the health message.
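As an illustration of how the parsed file maps to a health message, here is a minimal sketch. It assumes the YAML has already been parsed into a dictionary with the keys shown in the format above; the function name and the shape of the returned message are hypothetical.

```python
from typing import Any, Dict


def health_message_from_file(parsed: Dict[str, Any]) -> Dict[str, Any]:
    """Build a health message from the parsed contents of the health file."""
    healthy = bool(parsed["healthy"])
    message: Dict[str, Any] = {
        "healthy": healthy,
        "status": parsed.get("status", ""),
        "start_time_unix_nano": parsed.get("start_time_unix_nano"),
        "status_time_unix_nano": parsed.get("status_time_unix_nano"),
    }
    if not healthy:
        # last_error is only meaningful for an unhealthy report; when
        # healthy is true, any last_error present in the file is ignored.
        message["last_error"] = parsed.get("last_error", "")
    return message
```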

The file-based health check is implemented for the New Relic APM agents, and is leveraged internally when running on Kubernetes. See Instrumentation CR and APM for details.

Health on Kubernetes

The approaches followed by on-host are not trivial to implement for Kubernetes, and Kubernetes already provides built-in mechanisms to inspect the health of its resources, so AC leverages these built-ins.

Health checking for Kubernetes is driven by the checks entries declared in the agent type's health configuration (see health (Kubernetes) above). Each check targets one resource by name, namespace, and kind, and the operations performed depend on the kind. If no checks are declared, health checking is disabled for the sub-agent.

All these checks are very similar in nature. They mostly involve querying the Kubernetes API server for a certain resource, looking up specific fields of its object representation (like its status or its metadata), and evaluating the values contained within them. The only difference is that the structure of the Instrumentation objects is defined by New Relic, while the remaining ones are defined by Kubernetes itself or by well-known tooling of the Kubernetes ecosystem such as Helm.

For agents that do not define Helm releases, how they work, and the Instrumentation CRD, see Agent-less supervisors below.

Helm releases

If the agent type's Kubernetes deployment section defines Helm releases, the health check will query different information sources and evaluate their contents with resource-specific logic. If any of these evaluations reports an unhealthy result, the sub-agent will be considered unhealthy.

The sources inspected are listed below:

Helm Release

AC will attempt to retrieve the status field of the Helm release object. Inside this status it will retrieve a list of conditions and check if the ready condition exists and is true, which means healthy. If it's false, it will consider this check unhealthy and emit a message as last error.

StatefulSets

For StatefulSets, AC will report healthy whenever the number of replicas matches the number of ready replicas.

DaemonSets

With DaemonSets, the health check will evaluate whether the number of pods is the desired one, whether no pods are unavailable, and, if the DaemonSet uses the rolling update strategy, whether all pods are running the latest version.

Deployment

For Deployments, AC will report healthy whenever there are no unavailable replicas.
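The three workload rules above can be sketched against the `status` object of each resource. The field names used here (`replicas`, `readyReplicas`, `unavailableReplicas`, `desiredNumberScheduled`, `currentNumberScheduled`, `numberUnavailable`, `updatedNumberScheduled`) are standard Kubernetes status fields, but which exact fields AC reads is an assumption, and the function names are hypothetical.

```python
from typing import Any, Dict

Status = Dict[str, Any]


def statefulset_healthy(status: Status) -> bool:
    """Healthy when the replica count matches the ready replica count."""
    return status.get("replicas", 0) == status.get("readyReplicas", 0)


def deployment_healthy(status: Status) -> bool:
    """Healthy when there are no unavailable replicas."""
    return status.get("unavailableReplicas", 0) == 0


def daemonset_healthy(status: Status, rolling_update: bool) -> bool:
    """Healthy when pods are scheduled as desired, available and up to date."""
    desired = status.get("desiredNumberScheduled", 0)
    if status.get("currentNumberScheduled", 0) != desired:
        return False
    if status.get("numberUnavailable", 0) != 0:
        return False
    # Only the rolling update strategy requires all pods on the latest version.
    if rolling_update and status.get("updatedNumberScheduled", 0) != desired:
        return False
    return True
```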

Instrumentation CR

The Instrumentation is a custom resource defined by New Relic to represent the status of agents inside Kubernetes, mostly to enable the use case of supervising Agent-less workloads. Much of the information specified here will contain details specific to them and how they work on Kubernetes, mentioning components such as the Kubernetes Operator or the Sidecar. Retrieving the health information of Instrumentations is a completely custom procedure that is not strictly related to Kubernetes beyond the retrieval of the resource itself.

As of now, the health check for Instrumentations involves reading their status value, which should contain the following fields:

  • podsMatching is the number of pods which match the Instrumentation.spec.podLabelSelectors and Instrumentation.spec.NamespaceLabelSelectors.
  • podsHealthy is the number of pods which match based on podsMatching and podsInjected (see below) and for which the operator (see below) was able to get:
    • The correct pod IP/port.
    • A health response which had a healthy status reported via the YAML field healthy.
    • An HTTP status code of 200.
  • podsInjected is the number of pods which matched the Instrumentation based on podsMatching which had the health sidecar injected.
  • podsNotReady is the number of pods counted in podsMatching and podsInjected which are not in a ready state (Pod.status.phase != "Running").
  • podsOutdated is the number of pods which match based on podsMatching and podsInjected, but where there's a mismatch between the Instrumentation.generation and the injected pod's annotation (to identify changes to the spec).
  • podsUnhealthy is the number of pods which failed a health check for any of the following reasons: the operator couldn't get the pod's IP or port, communication issues, a timeout, a non-200 HTTP status, a failure to decode the HTTP response, or a last_error field reported in the response.
  • unhealthyPodsErrors is a list of pods (namespace.name/pod.name) and either the last error from the response or the error from the operator while trying to collect health.
    • pod is the name of the pod.
    • last_error is the error string.

As of now, the logic determining if a sub-agent is healthy is the following:

  • podsNotReady must be 0.
  • podsUnhealthy must be 0.
  • podsHealthy must be more than 0.
  • podsMatching must be more than 0.
  • podsInjected must be equal to podsMatching.

If unhealthy, the last_error field will be populated with the Instrumentation status' unhealthyPodsErrors field.
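The conditions listed above translate directly into a predicate over the Instrumentation status. The sketch below is illustrative: the function name is hypothetical, and the way the entries of unhealthyPodsErrors are joined into a single last error string is an assumption.

```python
from typing import Any, Dict, Tuple


def instrumentation_health(status: Dict[str, Any]) -> Tuple[bool, str]:
    """Return (healthy, last_error) for an Instrumentation CR status."""
    healthy = (
        status["podsNotReady"] == 0
        and status["podsUnhealthy"] == 0
        and status["podsHealthy"] > 0
        and status["podsMatching"] > 0
        and status["podsInjected"] == status["podsMatching"]
    )
    if healthy:
        return True, ""
    # Assumption: render each entry as "<pod>: <last_error>", joined by "; ".
    errors = status.get("unhealthyPodsErrors", [])
    last_error = "; ".join(f"{e['pod']}: {e['last_error']}" for e in errors)
    return False, last_error
```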

Agent-less supervisors

If we hadn't mentioned the possibility of agent-less supervisors before, you might have assumed that AC can only support agents that run as an actual, stand-alone process (either a traditional one on a server or a Kubernetes-based one such as a pod). After all, the main example of a supported agent type is the Infrastructure Agent, a separate binary that is intended to run alongside the customer's business workload, but separate from it.

It turns out that there are other use cases for which New Relic Control might be useful where a process either does not exist or must not be managed by AC. APM is one of them.

APM

With APM, a customer instruments some existing application by plugging in a shared object, library, or some other plug-in component to the programming language runtime. For example, you could add some Java-specific command line options pointing to the New Relic Java agent's JAR file when running your Java application, so the agent is hooked into the JVM. In this case, the only process is the actual customer application, whose lifecycle must not be managed by AC as opposed to a separate observability agent.

This is why AC supports defining agent types that do not include any actual stand-alone process but otherwise have observability agent functionality, can receive remote configs and expose health information. As of now, this is mostly supported in Kubernetes only, with on-host planned for the future.

APM supervisors on Kubernetes

The APM use case for Kubernetes is supported by using some additional components that run in the cluster alongside AC. The main one is Kubernetes APM auto-attach (also known as Kubernetes agents operator or just the Kubernetes operator), which is defined as an agent type for AC.

Normally, the Kubernetes operator intercepts API requests for deploying pods onto nodes and, depending on the configuration specified, adds the appropriate language agent to the application via an init container. To make this possible, the operator creates the Instrumentation Custom Resource Definition (CRD), so that Instrumentation resources can later be created and configured to match application pods via pod label selectors. After these Instrumentation resources are created, the operator injects the init container into new pods matching these labels.

When used with AC, the operator will also inject a sidecar container next to each application pod. This sidecar has the role of retrieving the health status using a similar method to the file-based health check approach for on-host:

  1. The language agents inside the application pods will write the health file (or files) into the file system.
  2. The sidecar will read these files and expose their contents as an HTTP endpoint. If it reads many files, it will coalesce their information into a single health output.
  3. The Kubernetes operator will fetch the health information from the sidecars and will update the Instrumentation CR's status with the value of this health.
  4. The health inside the Instrumentation CR is read periodically by AC, as explained above when discussing health in Kubernetes on Instrumentation CR, and then reported to FC.

So, to support APM use cases with AC, we can define an agent type that specifies an Instrumentation CR appropriate for the language. As long as this agent type instance is deployed alongside a single instance of the Kubernetes operator (also an agent type deployable with AC), the auto-instrumentation of applications and the health reporting will work.

As the Instrumentation CR also enables configuring the APM agents, pushing remote configurations from FC is also possible.

You can check our agent type definitions for the currently supported APM languages in our hardcoded registry.

The following diagram reflects the flow of an APM agent being added as a remote config to a Kubernetes cluster where the Kubernetes operator is already deployed. Some of the arrows connecting entities are numbered to represent the timing.

%%{ init: { 'theme': 'neutral' }}%%
flowchart TB
    classDef central fill:#00E580;
    classDef optional stroke-dasharray: 2 2

    FC@{ label: Fleet Control }
    AC@{ label: Agent Control }

    KO@{ label: Kubernetes Operator agent }
    I@{ label: Instrumentation CR }

    subgraph POD [Application Pod]
      direction TB
      APM@{ label: APM Agent init container }
      UA@{ label: User application }
      S@{ label: Sidecar }
    end

    class KO central
    class I central
    class APM central
    class S central

    FC -->|1 - adds Operator config to| AC
    AC -->|2 - creates resources of| KO
    KO -->|3 - defines for all languages| I
    FC -->|4 - adds APM agent config to| AC
    AC -->|5 - adds| I
    KO -->|6 - injects| APM & S
    APM -->|injects agent| UA
    UA -->|writes health| S 
    KO -->|7 - monitors| S
    KO -->|8 - updates status| I
    AC -->|manages| KO
    AC -->|monitors health| I & KO
    AC -->|applies remote configs| I & KO