Merged
Commits
36 commits
8815078
Rename `loadRemoteConfig` to `loadLastRecvdRemoteConfig`
douglascamata May 26, 2025
51f1749
Fix typo in logs
douglascamata May 26, 2025
e985290
Create variable only in the scope where it is used
douglascamata May 26, 2025
b7655ca
Implement magic config files for internal config
douglascamata May 26, 2025
0a98cf6
Add changelog entry
douglascamata May 26, 2025
e0f247b
go fmt
douglascamata May 26, 2025
ea6bbbd
Avoid shadowing err
douglascamata May 26, 2025
5a8f5af
Fix and improve specification docs on configuration
douglascamata May 28, 2025
3df002c
Fix formatting
douglascamata May 28, 2025
5f6bae8
Update example and remove repeated statement
douglascamata May 28, 2025
3ca60a8
Address PR review feedback
douglascamata Jun 2, 2025
5c9fe5b
Update changelog
douglascamata Jun 2, 2025
e92e09b
Fix typo
douglascamata Jun 2, 2025
b913316
Merge branch 'main' of github.com:open-telemetry/opentelemetry-collec…
douglascamata Jun 2, 2025
69c2823
Merge branch 'main' into supervisor-config-file-revamp
douglascamata Jun 12, 2025
2ea3933
Fix conflict resolution
douglascamata Jun 12, 2025
3a383d3
Apply suggestions from code review for comments
douglascamata Jul 2, 2025
d3b4383
Remove default config file enforcmeent from config validation
douglascamata Jul 2, 2025
e55c4f7
Simplify logic to add special config files
douglascamata Jul 2, 2025
e9b5276
Add test case for special config file validation
douglascamata Jul 2, 2025
ce6b08a
Add a few test cases for `Supervisor.addSpecialConfigFiles()`
douglascamata Jul 2, 2025
343812f
Merge branch 'main' of github.com:open-telemetry/opentelemetry-collec…
douglascamata Jul 2, 2025
e3eefa6
Retrigger build due to weird failure
douglascamata Jul 2, 2025
393759e
Add extra changelog entry for breaking change in agent config files b…
douglascamata Jul 2, 2025
d239fb8
Fix typo (`OWN_METRICS` -> `OWN_TELEMETRY`)
douglascamata Jul 3, 2025
59a69f8
Add more tests for special config file handling
douglascamata Jul 3, 2025
cd9e581
Merge branch 'main' of github.com:open-telemetry/opentelemetry-collec…
douglascamata Jul 3, 2025
7596730
Fix comment in special config file loading
douglascamata Jul 3, 2025
a40349d
Fix changelog for breaking change
douglascamata Jul 3, 2025
c33d92f
Rename changelog file
douglascamata Jul 3, 2025
bfd4eb4
Merge `OWN_TELEMETRY_CONFIG` into `BUILTIN_CONFIG`
douglascamata Jul 3, 2025
d0b4864
Rename built-in special config file
douglascamata Jul 3, 2025
07d6db3
Renames for consistency (`extra config` -> `extra telemetry config`, …
douglascamata Jul 3, 2025
ceca3ef
Delete old `extraconfig.yaml` file
douglascamata Jul 3, 2025
56078a1
Rename `konf` to `conf`
douglascamata Jul 7, 2025
d34f600
Merge branch 'main' of github.com:open-telemetry/opentelemetry-collec…
douglascamata Jul 7, 2025
35 changes: 35 additions & 0 deletions .chloggen/supervisor-special-config-merge-order.yaml
@@ -0,0 +1,35 @@
# Use this changelog template to create an entry for release notes.

# One of 'breaking', 'deprecation', 'new_component', 'enhancement', 'bug_fix'
change_type: breaking

# The name of the component, or a single word describing the area of concern, (e.g. filelogreceiver)
component: cmd/opampsupervisor

# A brief description of the change. Surround your text with quotes ("") if it needs to start with a backtick (`).
note: Remote configuration by default now merges on top of user-provided config files.

# Mandatory: One or more tracking issues related to the change. You can use the PR number here if no issue exists.
issues: [39963]

# (Optional) One or more lines of additional information to render under the primary note.
# These lines will be padded with 2 spaces and then inserted directly into the document.
# Use pipe (|) for multiline entries.
subtext: |
Previously, by default, user-provided config files were merged on top of all
other configuration. This is no longer the case.

The new default order of configuration merging is as follows (from lowest to highest precedence):

- `$OWN_TELEMETRY_CONFIG`
- <USER_PROVIDED_CONFIG_FILES>
- `$OPAMP_EXTENSION_CONFIG`
- `$REMOTE_CONFIG`
# If your change doesn't affect end users or the exported elements of any package,
# you should instead start your pull request title with [chore] or use the "Skip Changelog" label.
# Optional: The change log or logs in which this entry should be included.
# e.g. '[user]' or '[user, api]'
# Include 'user' if the change is relevant to end users.
# Include 'api' if there is a change to a library API.
# Default: '[user]'
change_logs: [user]
58 changes: 58 additions & 0 deletions .chloggen/supervisor-special-config.yaml
@@ -0,0 +1,58 @@
# Use this changelog template to create an entry for release notes.

# One of 'breaking', 'deprecation', 'new_component', 'enhancement', 'bug_fix'
change_type: enhancement

# The name of the component, or a single word describing the area of concern, (e.g. filelogreceiver)
component: cmd/opampsupervisor

# A brief description of the change. Surround your text with quotes ("") if it needs to start with a backtick (`).
note: Add support for total control of configuration merging through special configuration files

# Mandatory: One or more tracking issues related to the change. You can use the PR number here if no issue exists.
issues: [39963]

# (Optional) One or more lines of additional information to render under the primary note.
# These lines will be padded with 2 spaces and then inserted directly into the document.
# Use pipe (|) for multiline entries.
subtext: |
The special configuration files can be used through the `agent::config_files` option to control the order
in which configuration is merged. This allows greater customization of this feature, so that it can adapt
to many use cases without requiring code changes.

Configuration is merged from the top of the list to the bottom, in order. This means that the first configuration
files will get overwritten by the later ones.

Here's a list of the available special configuration options and what they represent:

- "$OWN_TELEMETRY_CONFIG": configuration to set up the agent's own telemetry (resource, identifying and non-identifying attributes, etc.).
- "$OPAMP_EXTENSION_CONFIG": configuration for the agent's OpAMP extension to connect to the Supervisor.
- "$REMOTE_CONFIG": remote configuration received by the Supervisor.

Here's an example that could be used to configure the Agent:

```
agent:
config_files:
- base_config.yaml
- $OWN_TELEMETRY_CONFIG
- $OPAMP_EXTENSION_CONFIG
- $REMOTE_CONFIG
- compliance_config.yaml
```

If **one or more** of the special files are not specified, they are automatically
added at predetermined positions in the list. The order is as follows:

- `$OWN_TELEMETRY_CONFIG`
- <USER_PROVIDED_CONFIG_FILES>
- `$OPAMP_EXTENSION_CONFIG`
- `$REMOTE_CONFIG`
# If your change doesn't affect end users or the exported elements of any package,
# you should instead start your pull request title with [chore] or use the "Skip Changelog" label.
# Optional: The change log or logs in which this entry should be included.
# e.g. '[user]' or '[user, api]'
# Include 'user' if the change is relevant to end users.
# Include 'api' if there is a change to a library API.
# Default: '[user]'
change_logs: [user]
12 changes: 7 additions & 5 deletions cmd/opampsupervisor/e2e_test.go
Expand Up @@ -459,7 +459,7 @@ func TestSupervisorStartsCollectorWithNoOpAMPServerUsingLastRemoteConfig(t *test
storageDir := t.TempDir()
remoteConfigFilePath := filepath.Join(storageDir, "last_recv_remote_config.dat")

cfg, hash, healthcheckPort := createHealthCheckCollectorConf(t)
cfg, hash, healthcheckPort := createHealthCheckCollectorConf(t, true)
remoteConfigProto := &protobufs.AgentRemoteConfig{
Config: &protobufs.AgentConfigMap{
ConfigMap: map[string]*protobufs.AgentConfigFile{
Expand Down Expand Up @@ -528,7 +528,7 @@ func TestSupervisorStartsCollectorWithRemoteConfigAndExecParams(t *testing.T) {

// create remote config to check agent's health
remoteConfigFilePath := filepath.Join(storageDir, "last_recv_remote_config.dat")
cfg, hash, healthcheckPort := createHealthCheckCollectorConf(t)
cfg, hash, healthcheckPort := createHealthCheckCollectorConf(t, false)
remoteConfigProto := &protobufs.AgentRemoteConfig{
Config: &protobufs.AgentConfigMap{
ConfigMap: map[string]*protobufs.AgentConfigFile{
Expand Down Expand Up @@ -1261,8 +1261,8 @@ func createBadCollectorConf(t *testing.T) (*bytes.Buffer, []byte) {
return bytes.NewBuffer(colCfg), h.Sum(nil)
}

func createHealthCheckCollectorConf(t *testing.T) (cfg *bytes.Buffer, hash []byte, remotePort int) {
colCfgTpl, err := os.ReadFile(path.Join("testdata", "collector", "healthcheck_config.yaml"))
func createHealthCheckCollectorConf(t *testing.T, nopPipeline bool) (cfg *bytes.Buffer, hash []byte, remotePort int) {
colCfgTpl, err := os.ReadFile(path.Join("testdata", "collector", "healthcheck_config.tmpl.yaml"))
require.NoError(t, err)

templ, err := template.New("").Parse(string(colCfgTpl))
Expand All @@ -1271,7 +1271,9 @@ func createHealthCheckCollectorConf(t *testing.T) (cfg *bytes.Buffer, hash []byt
var confmapBuf bytes.Buffer
err = templ.Execute(
&confmapBuf,
map[string]string{},
map[string]any{
"nopPipeline": nopPipeline,
},
)
require.NoError(t, err)

Expand Down
100 changes: 70 additions & 30 deletions cmd/opampsupervisor/specification/README.md
Expand Up @@ -88,31 +88,31 @@ capabilities:

# The Supervisor will report EffectiveConfig to the Server.
reports_effective_config: # true if unspecified

# The Supervisor can accept Collector executable package updates.
# If enabled the Supervisor will also report package status to the
# Server.
accepts_packages: # false if unspecified

# The Collector will report own metrics to the destination specified by
# the Server.
reports_own_metrics: # true if unspecified

# The Collector will report own logs to the destination specified by
# the Server.
reports_own_logs: # true if unspecified

# The Collector will report own traces to the destination specified by
# the Server.
reports_own_traces: # true if unspecified

# The Collector will accept connections settings for exporters
# from the Server.
accepts_other_connection_settings: # false if unspecified

# The Supervisor will accept restart requests.
accepts_restart_command: # true if unspecified

# The Collector will report Health.
reports_health: # true if unspecified

Expand All @@ -130,22 +130,26 @@ agent:
# The interval on which the Collector checks to see if it's been orphaned.
orphan_detection_interval: 5s

# The maximum wait duration for retrieving bootstrapping information from the agent
# The maximum wait duration for retrieving bootstrapping information from the agent
bootstrap_timeout: 3s

# Extra command line flags to pass to the Collector executable.
args:

# Extra environment variables to set when executing the Collector.
env:

# Optional user name to drop the privileges to when running the
# Collector process.
run_as: myuser
# Path to optional local Collector config files to be merged with the
# config provided by the OpAMP server.
config_files:
- /etc/otelcol/config.yaml
# List of configuration files to be merged to build the Collector's effective
# configuration. It includes a few "special" files. Read the "Config Files" section
# below for more details.
config_files:
- $OPAMP_EXTENSION_CONFIG
- $OWN_TELEMETRY_CONFIG
- $REMOTE_CONFIG

# Optional directories that are allowed to be read/written by the
# Collector.
# If unspecified then NO access to the filesystem is allowed.
Expand All @@ -155,7 +159,7 @@ agent:
deny: \[/var/log/secret_logs\]
write:
allow: \[/var/otelcol\]

# Optional key-value pairs to add to either the identifying attributes or
# non-identifying attributes of the agent description sent to the OpAMP server.
# Values here override the values in the agent description retrieved from the collector's
Expand All @@ -166,9 +170,9 @@ agent:
non_identifying_attributes:
custom.attribute: "custom-value"

# The port the Supervisor will start its OpAmp server on and the Collector's
# The port the Supervisor will start its OpAmp server on and the Collector's
# OpAmp extension will connect to
opamp_server_port:
opamp_server_port:

# Supervisor's internal telemetry settings.
telemetry:
Expand Down Expand Up @@ -220,19 +224,57 @@ telemetry:

```

**Note:**
#### Notes on `agent::config_files`, `agent::args`, and `agent::env`

Please be aware that when using the `agent::config_files` parameter,
the configuration files specified are applied in the order they are specified.
In other words, configuration files are merged from the top of the list to the bottom.
Configuration added by files at the top of the list may be overwritten by the later ones.

The indicated configuration files are merged in memory and the resulting configuration
is written to `<storage::directory>/effective.yaml`.
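
As a rough illustration of the top-to-bottom merge described above, the following Go sketch merges configuration layers in list order so that later layers override earlier ones. It is a simplified model and not the Supervisor's actual merge code: `mergeMaps` and `mergeInOrder` are hypothetical names, and details such as how lists are merged are assumptions (here, non-map values are simply replaced).

```go
package main

import "fmt"

// mergeMaps recursively merges src into dst and returns dst.
// On conflict, the value from src wins. Nested maps are merged key by key;
// any other value (including lists) is replaced wholesale (assumption).
func mergeMaps(dst, src map[string]any) map[string]any {
	for k, v := range src {
		if srcMap, ok := v.(map[string]any); ok {
			dstMap, ok := dst[k].(map[string]any)
			if !ok {
				// No existing map under this key: start from an empty one so
				// the input layer is copied rather than aliased.
				dstMap = map[string]any{}
			}
			dst[k] = mergeMaps(dstMap, srcMap)
			continue
		}
		dst[k] = v
	}
	return dst
}

// mergeInOrder merges the layers from first to last, mirroring a
// config_files list read from top to bottom.
func mergeInOrder(layers ...map[string]any) map[string]any {
	result := map[string]any{}
	for _, layer := range layers {
		mergeMaps(result, layer)
	}
	return result
}

func main() {
	base := map[string]any{
		"exporters": map[string]any{
			"debug": map[string]any{"verbosity": "basic"},
		},
	}
	override := map[string]any{
		"exporters": map[string]any{
			"debug": map[string]any{"verbosity": "detailed"},
		},
	}
	// The later layer overrides the conflicting key from the earlier one.
	fmt.Println(mergeInOrder(base, override))
}
```

In this model, placing a file lower in `agent::config_files` corresponds to passing it later to `mergeInOrder`, which is why it takes precedence.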

Please be aware that when using the `.agent.config_files` parameter,
the configuration files specified are applied after the configuration from the OpAMP server.
After the configuration files, arguments present in `.agent.args` are passed to the executable binary.
The environmanet variables specified in `.agent.env` are set in the collector process environment.
There are a few "special" configuration files that can be used to completely
customize the final configuration given to the Collector. Below are the available
values and what they represent:

The following configuration:
- `$OPAMP_EXTENSION_CONFIG`: configuration for the OpAMP extension to connect to the Supervisor.
- `$OWN_TELEMETRY_CONFIG`: configuration for the agent to report its own telemetry.
- `$REMOTE_CONFIG`: remote configuration received by the Supervisor.

**NOTE**: These configuration snippets, particularly `$OPAMP_EXTENSION_CONFIG`, are essential for the Supervisor and Collector to work together. Overriding values in these may result in the Supervisor failing to properly start the Collector and should be done with caution.

These special files can be mixed with user-provided configuration files to create complex
configuration merge orders, for instance, creating base-layer configuration at the
lowest priority while keeping compliance configuration at the highest priority:

```yaml
agent:
config_files:
- base_config.yaml
- $OWN_TELEMETRY_CONFIG
- $OPAMP_EXTENSION_CONFIG
- $REMOTE_CONFIG
- compliance_config.yaml
```

If **one or more** of the special files are not specified, they are automatically
added at predetermined positions in the list. The order is as follows:

- `$OWN_TELEMETRY_CONFIG`
- <USER_PROVIDED_CONFIG_FILES>
- `$OPAMP_EXTENSION_CONFIG`
- `$REMOTE_CONFIG`

Arguments present in `agent::args` are passed to the executable binary **after** the configuration files.
The environment variables specified in `agent::env` are set in the Collector process environment.

Take the configuration below as an example:

```yaml
agent:
executable: ./otel-binary
config_files:
config_files:
- './custom-config.yaml'
- './another-custom-config.yaml'
args:
Expand All @@ -242,14 +284,12 @@ agent:
GO_HOME: '~/go'
```

results to the following startup parameters for the collector process:
This results in the following Collector process invocation:

```shell
./otel-binary --config opamp-config.yaml --config custom-config.yaml --config another-custom-config.yaml --feature-gates exporter.datadogexporter.UseLogsAgentExporter,exporter.datadogexporter.metricexportnativeclient
./otel-binary --config /var/lib/otelcol/supervisor/effective.yaml --feature-gates exporter.datadogexporter.UseLogsAgentExporter,exporter.datadogexporter.metricexportnativeclient
```

In case of conflicting values in the configuration files, the latest applied value takes precedence.

### Operation When OpAMP Server is Unavailable

When the supervisor cannot connect to the OpAMP server, the collector will
Expand Down Expand Up @@ -335,8 +375,8 @@ configuration.
To overcome this problem the Supervisor starts the Collector with an
"noop" configuration that collects nothing but allows the opamp
extension to be started. The "noop" configuration is a single pipeline
with an nop receiver, a nop exporter, and the opamp extension.
The purpose of the "noop" configuration is to make sure the Collector starts
with an nop receiver, a nop exporter, and the opamp extension.
The purpose of the "noop" configuration is to make sure the Collector starts
and the opamp extension communicates with the Supervisor. The Collector is stopped
after the AgentDescription is received from the Collector.

Expand Down Expand Up @@ -479,7 +519,7 @@ will populate exporter settings from OpAMP ConnectionSettings message
the following way:

| **ConnectionSettings** | **Exporter setting** |
|---------------------------|----------------------|
| ------------------------- | -------------------- |
| destination_endpoint | endpoint |
| headers | headers |
| certificate.public_key | tls.cert_file |
Expand Down
25 changes: 25 additions & 0 deletions cmd/opampsupervisor/supervisor/config/config.go
Expand Up @@ -12,6 +12,8 @@ import (
"os"
"path/filepath"
"runtime"
"slices"
"strings"
"time"

"github.com/open-telemetry/opamp-go/protobufs"
Expand Down Expand Up @@ -219,13 +221,36 @@ func (a Agent) Validate() error {
return errors.New("agent::config_apply_timeout must be valid duration")
}

for _, file := range a.ConfigFiles {
if !strings.HasPrefix(file, "$") {
continue
}
if !slices.Contains(SpecialConfigFiles, SpecialConfigFile(file)) {
return fmt.Errorf("agent::config_files contains invalid special file: %q. Must be one of %v", file, SpecialConfigFiles)
}
}

if runtime.GOOS == "windows" && a.UseHUPConfigReload {
return errors.New("agent::use_hup_config_reload is not supported on Windows")
}

return nil
}

type SpecialConfigFile string

const (
SpecialConfigFileOwnTelemetry SpecialConfigFile = "$OWN_TELEMETRY_CONFIG"
SpecialConfigFileOpAMPExtension SpecialConfigFile = "$OPAMP_EXTENSION_CONFIG"
SpecialConfigFileRemoteConfig SpecialConfigFile = "$REMOTE_CONFIG"
)

var SpecialConfigFiles = []SpecialConfigFile{
SpecialConfigFileOwnTelemetry,
SpecialConfigFileOpAMPExtension,
SpecialConfigFileRemoteConfig,
}

type AgentDescription struct {
IdentifyingAttributes map[string]string `mapstructure:"identifying_attributes"`
NonIdentifyingAttributes map[string]string `mapstructure:"non_identifying_attributes"`
Expand Down