
feat: support global config for agent types #2436

Open
danielorihuela wants to merge 6 commits into feat/support-oci-auth-in-agent-types from feat/support-global-config-for-agent-types

Conversation

@danielorihuela danielorihuela commented Apr 17, 2026

This PR adds support for global defaults in Agent Control, which allows setting "base" values for all agent types that will be instantiated on the host machine.

Notice that this feature works for k8s too, even though it's not needed at the moment.

Also, this PR is on top of #2414.

Context

In the refinement, it was decided that the user should be able to configure some default values "globally" (for all agent types) and that these should be overridable on a per-agent basis.

Something along these lines (this is not the final shape of the config).

# Agent Control Config
agent_control:
  oci:
    registry: ...
    auth: ...

# Then agent types can be configured like
nr_infra_agent:
  oci:
    repository: ...
    version: ...
# Non configured and mandatory fields will use the global defaults

# We can also override global defaults
nr_infra_agent_specific_host_machine:
  oci:
    repository: ...
    version: ...
    registry: specific-url

Essentially, we have two requirements:

  1. We must have a way of defining global defaults that all agent types can use
    We can promote using these as much as possible, but it's the responsibility of the agent type author to use them.
  2. Global defaults have less weight than specific agent type configurations.
    Basically, if a user sets the value of a variable, this takes precedence over the global default.
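
The precedence rule in requirement 2 can be sketched in a few lines. This is only an illustration under assumptions: `effective_value` is a hypothetical helper, not a function from this PR, and it models a single string variable.

```rust
/// Hypothetical helper (not part of this PR): choose the value that ends up
/// in the agent type variable. A user-configured value always wins; the
/// global default is used only when the user did not set anything.
fn effective_value<'a>(
    user_value: Option<&'a str>,
    global_default: Option<&'a str>,
) -> Option<&'a str> {
    user_value.or(global_default)
}
```

With this shape, `effective_value(Some("specific-url"), Some("docker.io"))` yields the user's `specific-url`, while `effective_value(None, Some("docker.io"))` falls back to the global default.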

From a technical point of view, we have two additional requirements.

  1. Ensure that authentication methods can point to a secret, environment variable, or similar.
  2. Backwards compatibility

Review

I think the easiest way to review this PR is to check all changes at once. If you want to go commit by commit, here is the high-level summary. Read it to understand what each commit adds.

  • commit 1 adds global config and simple string replacement in defaults
  • commit 2 adds templatable strings in defaults
  • commit 3 extends support of templatable defaults for other types (booleans, numbers, etc)
  • commit 4 modifies the docs
  • commit 5 automates the generation of the hashmap with default values
  • commit 6 adds support for nr-env in the default field of the variables section

Technical decisions

Added nr-default namespace

I added a new namespace called nr-default to handle the rendering of default values in the variables section. This was not strictly needed, and someone could argue that it's even a bad decision. However, I think it helps with consistency. Let me explain this contradiction.

We have several supported namespaces for agent type template rendering: nr-env, nr-ac, nr-sub and nr-var. All of them are rendered in the deployment section of the agent type definition. This is the "runtime" part. Adding nr-default seems a good idea, until you realize that we are going to use it in the variables section, which is supposed to be static. I don't think that's a big deal. Now default values will be dynamic.
My main concern is that we now have two groups of namespaces. One of them works inside variables and the other works inside deployments.

I decided to go down that path to reuse abstractions and keep some degree of consistency with our terminology. I acknowledge that it could be confusing for agent type authors. They might assume they can use nr-default in the deployment section. That's not the case, and I assume that we should solve that through documentation. We could use a different abstraction and have something different from nr-default. Maybe global-default:whatever. However, I'm not sure this will be clearer and it requires extra effort to duplicate and generalize all the code or part of the code that we have for templating.

We could also pass nr-default values along with the other namespaces to template the deployment section, but I didn't think it was useful. The idea is to have a global default that can be overridden. With the current implementation that separation is a feature, not a bug. It won't work because it's not supposed to work.

Regarding other namespaces, I don't think it makes sense to support them in the variables section. I only added support for nr-env, in case an agent type author wants to retrieve the value from an env var by default and give the user the possibility to change it. I'm not sure this is that useful. We can remove it.

Open to feedback!!!

Add DefaultValue

Now that we can template default values in the variable section, we need a way to postpone type checking. Hence, I introduced this new struct DefaultValue that allows storing in the default field either a value or a template string.

The type checking is postponed from agent type loading to default values templating. Without this, we can't have global default values with types other than string (e.g. bool). This is not strictly necessary for the current feature, but we might need it in the future. It's best to introduce it now.
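
A minimal sketch of the idea, not the PR's actual code: the names `Kind`, `Typed`, `parse_as` and `load_default` are hypothetical, the real DefaultValue may be shaped differently, and detecting a template by looking for `${` is an assumption made for illustration.

```rust
/// Hypothetical declared type of a variable.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Kind {
    Bool,
    Number,
    Text,
}

/// Hypothetical type-checked value.
#[derive(Debug, PartialEq)]
enum Typed {
    Bool(bool),
    Number(f64),
    Text(String),
}

/// Either a value that was type-checked at load time, or a template string
/// whose type check is postponed until it has been rendered.
#[derive(Debug, PartialEq)]
enum DefaultValue {
    Value(Typed),
    Template(String),
}

/// Type-check a raw string against the declared kind.
fn parse_as(kind: Kind, raw: &str) -> Result<Typed, String> {
    match kind {
        Kind::Bool => raw
            .parse::<bool>()
            .map(Typed::Bool)
            .map_err(|e| format!("expected a bool, got {raw:?}: {e}")),
        Kind::Number => raw
            .parse::<f64>()
            .map(Typed::Number)
            .map_err(|e| format!("expected a number, got {raw:?}: {e}")),
        Kind::Text => Ok(Typed::Text(raw.to_string())),
    }
}

/// At agent type loading: literals are checked immediately, templates are
/// stored untouched (assumption: a template is anything containing `${`).
fn load_default(kind: Kind, raw: &str) -> Result<DefaultValue, String> {
    if raw.contains("${") {
        Ok(DefaultValue::Template(raw.to_string()))
    } else {
        parse_as(kind, raw).map(DefaultValue::Value)
    }
}
```

In this sketch a bool variable with default `${nr-default:oci.enabled}` (a made-up key) loads successfully as a `Template`, whereas a literal default of `maybe` is still rejected at load time.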

Two-step rendering

The renderer is getting more complex. Originally, we had a "two-step rendering" process.

  1. Discover secrets
  2. Load secrets
  3. Template user configured values with secrets (first render step)
  4. Fill agent type variables
  5. Template deployment section (second render step)

The main piece in charge of templating is template_string, which assumes that the received variables contain the final value. Thus, this second rendering step was needed. If template_string were able to detect that a template resulted in another template, we could keep replacing them in a loop until we get a final value. This is a potential refactor suggested by @DavSanchez.
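
The loop-based alternative could look roughly like the sketch below. It does not reflect the real template_string: `render_once` and `render_until_stable` are hypothetical names, the `${key}` syntax handling is simplified, and the pass limit is an assumption added to guard against templates that reference each other in a cycle.

```rust
use std::collections::HashMap;

/// One pass: replace every `${key}` whose key is present in `vars`.
/// Unknown or unterminated placeholders are copied through unchanged.
fn render_once(input: &str, vars: &HashMap<String, String>) -> String {
    let mut out = String::with_capacity(input.len());
    let mut rest = input;
    while let Some(start) = rest.find("${") {
        out.push_str(&rest[..start]);
        let after = &rest[start + 2..];
        match after.find('}') {
            Some(end) => {
                let key = &after[..end];
                match vars.get(key) {
                    Some(value) => out.push_str(value),
                    None => {
                        out.push_str("${");
                        out.push_str(key);
                        out.push('}');
                    }
                }
                rest = &after[end + 1..];
            }
            None => {
                // Unterminated placeholder: keep the remainder verbatim.
                out.push_str(&rest[start..]);
                rest = "";
            }
        }
    }
    out.push_str(rest);
    out
}

/// Re-render until a pass changes nothing. One extra pass is needed to
/// confirm stability, so `max_passes` must exceed the nesting depth.
fn render_until_stable(
    input: &str,
    vars: &HashMap<String, String>,
    max_passes: usize,
) -> Result<String, String> {
    let mut current = input.to_string();
    for _ in 0..max_passes {
        let next = render_once(&current, vars);
        if next == current {
            return Ok(current);
        }
        current = next;
    }
    Err(format!("template did not stabilise after {max_passes} passes"))
}
```

The trade-off is that a single loop replaces the fixed number of render steps, at the cost of having to decide what to do with cycles and with placeholders that never resolve.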

Now, we need to templatise the default values in the variables section. This requires a tweak.

  1. Discover secrets (from user configured values AND global defaults)
  2. Load secrets
  3. Template user configured values with secrets (first render step)
  4. Template global defaults with secrets (second render step)
  5. Fill agent type variables
  6. Template deployment section (third render step)

Why is the extra step (4) needed?
Variables have a specific type (bool, yaml, etc), and the only way we can make sure that the template in the default field can be templatised into the expected type is by doing it at the beginning. That way, when we fill the agent type variables, we can check that the type is as expected. Otherwise, we would always receive a template string, which is not always the expected type.
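
To illustrate why the order matters, here is a small sketch that renders a default first and only then type-checks it. It collapses steps 4 and 5 into two functions and ignores secrets entirely; `render_default`, `fill_variable`, `Kind`, `Value` and the `oci.verify` key are all hypothetical.

```rust
use std::collections::HashMap;

#[derive(Debug, PartialEq)]
enum Value {
    Bool(bool),
    Text(String),
}

#[derive(Debug, Clone, Copy)]
enum Kind {
    Bool,
    Text,
}

/// Step 4 (sketch): replace each `${nr-default:KEY}` with the global default.
fn render_default(template: &str, globals: &HashMap<&str, &str>) -> Result<String, String> {
    const OPEN: &str = "${nr-default:";
    let mut out = String::new();
    let mut rest = template;
    while let Some(start) = rest.find(OPEN) {
        out.push_str(&rest[..start]);
        let after = &rest[start + OPEN.len()..];
        let end = after
            .find('}')
            .ok_or_else(|| format!("unterminated placeholder in {template:?}"))?;
        let key = &after[..end];
        let value = globals
            .get(key)
            .ok_or_else(|| format!("no global default for {key:?}"))?;
        out.push_str(value);
        rest = &after[end + 1..];
    }
    out.push_str(rest);
    Ok(out)
}

/// Step 5 (sketch): the user value wins; otherwise use the rendered default.
/// Either way the final string is checked against the declared kind.
fn fill_variable(
    kind: Kind,
    user_value: Option<&str>,
    default_template: &str,
    globals: &HashMap<&str, &str>,
) -> Result<Value, String> {
    let raw = match user_value {
        Some(v) => v.to_string(),
        None => render_default(default_template, globals)?,
    };
    match kind {
        Kind::Bool => raw
            .parse::<bool>()
            .map(Value::Bool)
            .map_err(|_| format!("expected a bool, got {raw:?}")),
        Kind::Text => Ok(Value::Text(raw)),
    }
}
```

If the default were left as a template until the deployment section is rendered, the bool check above would only ever see the literal string `${nr-default:oci.verify}` and could not succeed.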

Backwards compatibility

The new global defaults are optional. We are not using Rust's Option type; instead, if the fields are missing when deserializing the data, we fall back to default values. With that, old Agent Control configs without global defaults will still work.

@danielorihuela danielorihuela changed the title Feat/support global config for agent types feat: support global config for agent types Apr 17, 2026
@danielorihuela danielorihuela force-pushed the feat/support-global-config-for-agent-types branch 7 times, most recently from 35c52c3 to 2766a4c Compare April 20, 2026 11:58
@danielorihuela danielorihuela force-pushed the feat/support-global-config-for-agent-types branch from 2766a4c to 8fab117 Compare April 20, 2026 14:18
@danielorihuela danielorihuela force-pushed the feat/support-global-config-for-agent-types branch from 8fab117 to e348494 Compare April 20, 2026 14:44
@danielorihuela danielorihuela added k8s-extended-e2e Trigger extended k8s e2e on a PR onhost-extended-e2e Execution of on host e2e in the current branch labels Apr 20, 2026
@danielorihuela danielorihuela force-pushed the feat/support-global-config-for-agent-types branch from acc0ccf to bd3c3e4 Compare April 21, 2026 09:06
@danielorihuela danielorihuela marked this pull request as ready for review April 21, 2026 10:09
@danielorihuela danielorihuela requested a review from a team as a code owner April 21, 2026 10:09
Comment on lines +36 to 39
default: "${nr-default:oci.registry}"
variants:
ac_config_field: "oci_registry_urls"
values: [ "docker.io" ]

@sigilioso sigilioso Apr 21, 2026

As discussed offline, the default should be consistent with the values in variants (if defined). We can sort it out with documentation, but it would be nice to provide some tools to avoid easy failures. E.g.:

        default: "${nr-default:oci.registry}"
        variants:
          ac_config_field: "oci_registry_urls"
          values: [ "${nr-default:oci.registry}" ]
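
One possible tool for this is a consistency check run after the default has been rendered. The sketch below is only an assumption about how such a check might look: `check_default_in_variants` is a hypothetical name, and treating an empty variants list as "no restriction" is a guess at the intended semantics.

```rust
/// Hypothetical check: once the default has been rendered, verify that it is
/// one of the allowed variants. An empty list is treated as "no restriction".
fn check_default_in_variants(default: &str, variants: &[&str]) -> Result<(), String> {
    if variants.is_empty() || variants.contains(&default) {
        Ok(())
    } else {
        Err(format!(
            "default {default:?} is not among the allowed variants {variants:?}"
        ))
    }
}
```

Running this at agent type load time, after templating, would surface a mismatch between a global default and hardcoded variants as an explicit error.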

@sigilioso
Contributor

Added nr-default namespace
[...]
We have several supported namespaces for agent types template rendering. We have nr-env, nr-ac, nr-sub and nr-var. All of them are rendered in the deployment section of the agent type definition. This is the "runtime" part. Adding an nr-default seems a good idea, until you realize that we are going to use this one in the variables section, which is supposed to be static. I don't think that's a big deal. Now default values will be dynamic.
My main concern is that we now have two groups of namespaces. One of them work inside variables and one of them works inside deployments.

We could also have sorted that out with a completely different tool (same as we already do with the variants support that is also taking values from configuration). Eg:

default:
  value: "default-hardcoded-in-agent-type" # Value used if the field pointed to 'ac_config_field' is not defined
  ac_config_field: "oci_registry" # Gets the value from config in "defaults.oci_registry" if any

We would avoid the problem of having two groups of namespaces and it may be less confusing for Agent Type authors.

@danielorihuela
Contributor Author

Added nr-default namespace
[...]
We have several supported namespaces for agent types template rendering. We have nr-env, nr-ac, nr-sub and nr-var. All of them are rendered in the deployment section of the agent type definition. This is the "runtime" part. Adding an nr-default seems a good idea, until you realize that we are going to use this one in the variables section, which is supposed to be static. I don't think that's a big deal. Now default values will be dynamic.
My main concern is that we now have two groups of namespaces. One of them work inside variables and one of them works inside deployments.

We could also have sorted that out with a completely different tool (same as we already do with the variants support that is also taking values from configuration). Eg:

default:
  value: "default-hardcoded-in-agent-type" # Value used if the field pointed to 'ac_config_field' is not defined
  ac_config_field: "oci_registry" # Gets the value from config in "defaults.oci_registry" if any

We would avoid the problem of having two groups of namespaces and it may be less confusing for Agent Type authors.

Not sure the value part makes sense. We always want a value from the global defaults I think. So, the agent type shouldn't have a default. Unless they want to make sure that the global default doesn't work. We could then simplify your solution to

default: "oci_registry"

This is what I had in the first commit. But I didn't like having a "magic replaceable string" there. I like having something like nr-default, which makes it obvious. Also, it would be hard to build more complex things if we need them. Like

default: "${nr-default:whatever}/more-stuff"

The current implementation supports that, and I'm not sure we could with your approach. I'm not against it. Just mentioning this for whenever (if) we discuss the implications with the team.

2 participants