Commit b303a5e

teresaromero, claude, and MichelLosier authored
[Fleet] Restore dataset override for OTel input packages (elastic#262000)
## Summary

Restores user-controlled `data_stream.dataset` routing for OTel input packages, fixing a regression introduced in elastic#260385.

Two categories of packages were affected:

- **Non-dynamic traces packages** (Zipkin, Jaeger, APM intake): spans always landed in `traces-generic.otel-<namespace>` with no way to override, because `generateOtelTypeTransforms` hard-coded `null` for the span context regardless of the dataset passed in.
- **`dynamic_signal_types` packages** (Kafka, MySQL, SQL Server): all signals landed in `generic.otel`-prefixed data streams, because `generateOTelAttributesTransform` always passed `null` for dataset, making the `data_stream.dataset` policy variable have no effect on routing.

For `dynamic_signal_types`, Fleet **again emits** `data_stream.dataset` in generated OTTL (package default or user override). That is intentional; see `agent_policy_otel_routing.ts` (file header + tests) for acceptance criteria.

The **`.otel` suffix on Elasticsearch index templates** remains separate (EPM / `getRegistryDataStreamAssetBaseName` when `isOtelInputType`); see `dev_docs/data_streams.md` (OpenTelemetry section) and related JSDoc on this branch.

### Changes

**`otel_collector.ts`**

- Restore `dataset` (instead of `null`) for the `span` context in `generateOtelTypeTransforms` traces case.
- Use the same `dataset` for `spanevent` (logs routing) so span events follow the policy dataset and overrides.
- Restore `dataset` (instead of `null`) in the `dynamic_signal_types` path of `generateOTelAttributesTransform`.

**`policy_template.ts`**

- Remove the hardcoded `generic.otel` fallback for `dynamic_signal_types` packages from `getNormalizedDataStreams`. The default dataset is now `datasetName || createDefaultDatasetName(packageInfo, policyTemplate)` for all input-only packages, making the package manifest the authority for the default value (via the `data_stream.dataset` var declared in the package).

**`package_policies_to_agent_permissions.ts`**

- Grant extra `logs-*-*` privileges for OTel span events using the **same** dataset as routing (`compiled_stream` / stream `data_stream.dataset`, with `data_stream.dataset` stream var override parity for `otelcol`).
- When resolving span-event `logs` index privileges, normalize the `data_stream.dataset` stream var: accept only a non-empty string (after trim) or an object with a non-empty string `dataset`; otherwise fall back to the compiled/stream dataset. This avoids malformed index patterns from invalid `any`-typed values (e.g. `{}`, arrays).

**Tests**

- Updated unit tests in `otel_collector.test.ts` to expect `data_stream.dataset` in routing statements for dynamic packages, traces/span context, and related cases.
- Updated integration tests in `agent_policy_otel_routing.ts` to assert dataset is set (Test 1: package default `generic` from `test_otel_dynamic` fixture manifest; Test 2: user-provided override).
- Updated `package_policies_to_agent_permissions.test.ts` expectations for span-event logs indices (`logs-{streamDataset}-*`), **plus** Jest coverage for invalid dataset var shapes (empty object, array, object `{ dataset }`, whitespace / empty nested `dataset`).
- Added Fleet API integration tests in `agent_policy_input_logfile_dataset.ts` for `data_stream.dataset` defaults and overrides on logfile input packages in the full agent policy.
- Added unit tests in `package_policy.test.ts` (`_compilePackagePolicyInputs`) for compiling integration streams with `data_stream.dataset` from the package stream var default, user override, and two-level dataset names.
- Consolidated redundant `policy_template.test.ts` cases into a single test that documents `dynamic_signal_types` no longer affects the default dataset computation.

**Documentation (Fleet dev docs + comments)**

- `dev_docs/data_streams.md`: OTel registry vs EPM `.otel` vs policy dataset, overrides, permissions caveat.
- JSDoc on `getRegistryDataStreamAssetBaseName` and `generateOtelcolConfig`; comment on `getFullInputStreams` otelcol dataset var.

Tracking issue: elastic/ingest-dev#7403
Regression introduced by: elastic#260385

### Checklist

- [ ] Any text added follows [EUI's writing guidelines](https://elastic.github.io/eui/#/guidelines/writing), uses sentence case text and includes [i18n support](https://github.com/elastic/kibana/blob/main/src/platform/packages/shared/kbn-i18n/README.md)
- [x] [Documentation](https://www.elastic.co/guide/en/kibana/master/development-documentation.html) was added for features that require explanation or tutorials (`dev_docs/data_streams.md` OTel section + JSDoc; no end-user tutorial)
- [x] [Unit or functional tests](https://www.elastic.co/guide/en/kibana/master/development-tests.html) were updated or added to match the most common scenarios
- [ ] If a plugin configuration key changed, check if it needs to be allowlisted in the cloud and added to the [docker list](https://github.com/elastic/kibana/blob/main/src/dev/build/tasks/os_packages/docker_generator/resources/base/bin/kibana-docker)
- [ ] This was checked for breaking HTTP API changes, and any breaking changes have been approved by the breaking-change committee. The `release_note:breaking` label should be applied in these situations.
- [ ] [Flaky Test Runner](https://ci-stats.kibana.dev/trigger_flaky_test_runner/1) was used on any tests changed
- [x] The PR description includes the appropriate Release Notes section, and the correct `release_note:*` label is applied per the [guidelines](https://www.elastic.co/guide/en/kibana/master/contributing.html#kibana-release-notes-process)
- [x] Review the [backport guidelines](https://docs.google.com/document/d/1VyN5k91e5OVumlc0Gb9RPa3h1ewuPE705nRtioPiTvY/edit?usp=sharing) and apply applicable `backport:*` labels.

### Identify risks

- **Span event routing**: Span events use the same policy `data_stream.dataset` as spans for OTTL logs routing (not a hardcoded `null`). Extra agent output privileges for span events now follow that dataset (and the `data_stream.dataset` stream var when set) so permissions stay aligned with routing.
- **Malformed stream var payloads**: For span-event output index privileges, invalid `data_stream.dataset` var shapes fall back to the compiled/stream dataset so agents are not granted unusable index patterns (the var `value` is typed as `any` in saved objects/API).
- **Template vs override**: Installed index templates still use registry dataset + Fleet `.otel` for EPM naming; a custom dataset var can still diverge from templates if not coordinated at the package level (documented in `dev_docs/data_streams.md`).

### Release Notes

N/A — `release_note:skip`

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: Michel Losier <michel.losier@elastic.co>
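The stream var normalization described for `package_policies_to_agent_permissions.ts` can be sketched roughly as follows. This is a minimal illustration and not the code from the commit: the function names `normalizeDatasetVar` and `spanEventLogsIndexPattern` are hypothetical, and returning the trimmed string is an assumption. Only the acceptance rules (non-empty string after trim, or an object with a non-empty string `dataset`) and the `logs-{streamDataset}-*` pattern come from the description above.

```typescript
// Hypothetical helper: returns a usable dataset string, or undefined when the
// `any`-typed var value has a shape that cannot name a dataset.
function normalizeDatasetVar(value: unknown): string | undefined {
  if (typeof value === 'string') {
    const trimmed = value.trim();
    return trimmed.length > 0 ? trimmed : undefined;
  }
  // Accept `{ dataset: 'x' }`; reject arrays, `{}`, and blank nested values.
  if (typeof value === 'object' && value !== null && !Array.isArray(value)) {
    const nested = (value as { dataset?: unknown }).dataset;
    if (typeof nested === 'string' && nested.trim().length > 0) {
      return nested.trim();
    }
  }
  return undefined;
}

// Hypothetical helper: builds the span-event logs index pattern, falling back
// to the compiled/stream dataset when the var is missing or malformed.
function spanEventLogsIndexPattern(varValue: unknown, fallbackDataset: string): string {
  const dataset = normalizeDatasetVar(varValue) ?? fallbackDataset;
  return `logs-${dataset}-*`;
}

console.log(spanEventLogsIndexPattern('custom', 'zipkinreceiver')); // logs-custom-*
console.log(spanEventLogsIndexPattern({}, 'zipkinreceiver')); // logs-zipkinreceiver-*
```

Falling back instead of throwing keeps the granted index pattern well-formed even when a saved object carries an unexpected value.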
1 parent 7fdf33f commit b303a5e

14 files changed

Lines changed: 827 additions & 102 deletions


x-pack/platform/plugins/shared/fleet/common/services/datastream_es_name.ts

Lines changed: 17 additions & 2 deletions
@@ -12,8 +12,23 @@ import {
 } from '../constants';
 
 /**
- * Creates the base name for Elasticsearch assets in the form of
- * {type}-{dataset}
+ * Creates the base name for Elasticsearch assets (index template patterns,
+ * related EPM naming) in the form `{type}-{dataset}`, optionally with an
+ * OpenTelemetry suffix.
+ *
+ * When `isOtelInputType` is true (OTel `otelcol` data streams with
+ * `enableOtelIntegrations`), Fleet appends `.{OTEL_TEMPLATE_SUFFIX}` (`otel`)
+ * so patterns match bases such as `traces-generic.otel`. This applies only to
+ * **Elasticsearch asset naming at package install** — not to the
+ * `data_stream.dataset` string stored on package policies or emitted in
+ * generated OTel collector OTTL (see `generateOtelcolConfig` and
+ * `getFullInputStreams`).
+ *
+ * If the registry `dataset` already contained `.otel` as part of its logical
+ * name, this function still appends the suffix; callers should not rely on
+ * implicit deduplication.
+ *
+ * See: `dev_docs/data_streams.md` (OpenTelemetry integrations and the `.otel` suffix).
  */
 export function getRegistryDataStreamAssetBaseName(
   dataStream: {
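The naming rule in the JSDoc above can be illustrated with a simplified sketch. It is not the body of `getRegistryDataStreamAssetBaseName`, which is not shown in this diff and may handle more cases; `assetBaseNameSketch` and `RegistryDataStreamLike` are hypothetical names, and only the `{type}-{dataset}` form, the `otel` suffix value, and the lack of deduplication are taken from the comment.

```typescript
// Value stated in the JSDoc for OTEL_TEMPLATE_SUFFIX.
const OTEL_TEMPLATE_SUFFIX = 'otel';

// Hypothetical minimal shape; the real parameter type has more fields.
interface RegistryDataStreamLike {
  type: string;
  dataset: string;
}

// Simplified sketch: `{type}-{dataset}`, plus `.otel` for OTel input types.
function assetBaseNameSketch(dataStream: RegistryDataStreamLike, isOtelInputType = false): string {
  const base = `${dataStream.type}-${dataStream.dataset}`;
  return isOtelInputType ? `${base}.${OTEL_TEMPLATE_SUFFIX}` : base;
}

console.log(assetBaseNameSketch({ type: 'traces', dataset: 'generic' }, true)); // traces-generic.otel
// No deduplication: a dataset that already ends in `.otel` gets the suffix again.
console.log(assetBaseNameSketch({ type: 'traces', dataset: 'generic.otel' }, true)); // traces-generic.otel.otel
```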

x-pack/platform/plugins/shared/fleet/common/services/policy_template.test.ts

Lines changed: 26 additions & 40 deletions
@@ -470,47 +470,33 @@ describe('getNormalizedDataStreams', () => {
     expect(useApmVar?.default).toEqual(true);
   });
 
-  it('should use generic.otel as default dataset for dynamic_signal_types packages', () => {
-    const result = getNormalizedDataStreams({
-      ...integrationPkg,
-      type: 'input',
-      policy_templates: [
-        {
-          input: 'otelcol',
-          name: 'otlpreceiver',
-          template_path: 'some/path.hbl',
-          title: 'OTLP',
-          description: 'OTLP input',
-          dynamic_signal_types: true,
-          vars: [],
-        },
-      ],
-    });
-    expect(result).toHaveLength(1);
-    // Dataset should be 'generic.otel', NOT the policy_template-based default 'nginx.otlpreceiver'
-    expect(result[0].dataset).toEqual('generic.otel');
-    expect(result[0].path).toEqual('generic.otel');
-  });
+  it('should derive default dataset from packageName.templateName regardless of dynamic_signal_types', () => {
+    const makePkg = (name: string, dynamicSignalTypes?: boolean) =>
+      getNormalizedDataStreams({
+        ...integrationPkg,
+        type: 'input',
+        policy_templates: [
+          {
+            input: 'otelcol',
+            name,
+            template_path: 'some/path.hbl',
+            title: name,
+            description: name,
+            ...(dynamicSignalTypes !== undefined
+              ? { dynamic_signal_types: dynamicSignalTypes }
+              : {}),
+            vars: [],
+          },
+        ],
+      });
 
-  it('should use policy_template-based default dataset for non-dynamic_signal_types packages', () => {
-    const result = getNormalizedDataStreams({
-      ...integrationPkg,
-      type: 'input',
-      policy_templates: [
-        {
-          input: 'otelcol',
-          name: 'mysqlreceiver',
-          type: 'metrics',
-          template_path: 'some/path.hbl',
-          title: 'MySQL OTel',
-          description: 'MySQL metrics via OTel',
-          vars: [],
-        },
-      ],
-    });
-    expect(result).toHaveLength(1);
-    // Without dynamic_signal_types, dataset is derived from packageName.templateName
-    expect(result[0].dataset).toEqual('nginx.mysqlreceiver');
+    // dynamic_signal_types: true — same createDefaultDatasetName behaviour as non-dynamic
+    expect(makePkg('otlpreceiver', true)[0].dataset).toEqual('nginx.otlpreceiver');
+    expect(makePkg('otlpreceiver', true)[0].path).toEqual('nginx.otlpreceiver');
+
+    // dynamic_signal_types: false / absent — same behaviour
+    expect(makePkg('mysqlreceiver', false)[0].dataset).toEqual('nginx.mysqlreceiver');
+    expect(makePkg('mysqlreceiver')[0].dataset).toEqual('nginx.mysqlreceiver');
   });
 });
 

x-pack/platform/plugins/shared/fleet/common/services/policy_template.ts

Lines changed: 1 addition & 9 deletions
@@ -199,15 +199,7 @@ export function getNormalizedDataStreams(
   }
 
   return policyTemplates.map((policyTemplate) => {
-    const isOtelDynamicSignalTypes = policyTemplate.dynamic_signal_types === true;
-    // Packages with dynamic_signal_types defer dataset routing to the ES exporter (via scope.name
-    // or explicit data_stream.* attrs). Use 'generic.otel' as the default so any fallback lands
-    // in the generic OTel data streams rather than a policy-template-named data stream.
-    const dataset =
-      datasetName ||
-      (isOtelDynamicSignalTypes
-        ? 'generic.otel'
-        : createDefaultDatasetName(packageInfo, policyTemplate));
+    const dataset = datasetName || createDefaultDatasetName(packageInfo, policyTemplate);
 
     let vars = addDatasetVarIfNotPresent(policyTemplate.vars, policyTemplate.name);
     if (
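The effect of this change on the default dataset can be sketched in isolation. The helper below is a stand-in for `createDefaultDatasetName`, whose body is not part of this diff; the `packageName.templateName` shape is inferred from the updated test expectations (`nginx.otlpreceiver`, `nginx.mysqlreceiver`), and the `*Like` types and `resolveDatasetSketch` are hypothetical.

```typescript
// Hypothetical minimal shapes for illustration only.
interface PackageInfoLike {
  name: string;
}

interface PolicyTemplateLike {
  name: string;
  dynamic_signal_types?: boolean;
}

// Stand-in for createDefaultDatasetName: `<packageName>.<templateName>`,
// as the updated test expects. The real helper may differ in detail.
function createDefaultDatasetNameSketch(pkg: PackageInfoLike, tpl: PolicyTemplateLike): string {
  return `${pkg.name}.${tpl.name}`;
}

// After the change, `dynamic_signal_types` no longer influences the default:
// an explicit dataset name wins, otherwise the package-derived default applies.
function resolveDatasetSketch(
  datasetName: string | undefined,
  pkg: PackageInfoLike,
  tpl: PolicyTemplateLike
): string {
  return datasetName || createDefaultDatasetNameSketch(pkg, tpl);
}

const pkg: PackageInfoLike = { name: 'nginx' };
console.log(resolveDatasetSketch(undefined, pkg, { name: 'otlpreceiver', dynamic_signal_types: true })); // nginx.otlpreceiver
console.log(resolveDatasetSketch('custom', pkg, { name: 'otlpreceiver', dynamic_signal_types: true })); // custom
```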

x-pack/platform/plugins/shared/fleet/dev_docs/data_streams.md

Lines changed: 17 additions & 0 deletions
@@ -13,6 +13,23 @@ A data stream is an index template with the data stream flag set to true. Each d
 Other details to note about the index template:
 - we set priority to 200, this is to beat the generic `logs-*-*`, `metrics-*-*`, `synthetics-*-*` index templates. We advise users set their own index template priority below 100 [here](https://www.elastic.co/guide/en/elasticsearch/reference/current/index-templates.html).
 - Fleet index templates are set to managed to deter users from editing them. However it is not necessarily safe to assume that Fleet index templates (or any managed asset) haven't been modified by the user, but if they have been modified we do not have to preserve these changes.
+
+### OpenTelemetry integrations and the `.otel` suffix
+
+OpenTelemetry (`otelcol`) integration packages use two related but different notions of **dataset**:
+
+1. **Registry / package `dataset`** — The value declared on the integration data stream in the package manifest (often short, e.g. `generic`). The Fleet UI and saved package policy streams refer to this registry identity when matching streams to package definitions.
+
+2. **Elasticsearch index naming** — When experimental `enableOtelIntegrations` is on and a data stream uses the `otelcol` input, Fleet appends a **`.otel` segment** only when computing Elasticsearch asset names (index template patterns, etc.). This happens in [`getRegistryDataStreamAssetBaseName`](../common/services/datastream_es_name.ts) via `isOtelInputType`, producing bases such as `traces-generic.otel` (not by requiring `.otel` inside the manifest `dataset` string). The Elastic Agent does not add this suffix to templates; **Kibana EPM** installs templates using that naming at package install time.
+
+3. **`data_stream.dataset` on the collector** — The merged agent policy carries `data_stream.dataset` for each OTel stream. Fleet generates OpenTelemetry Collector config (including OTTL `set(attributes["data_stream.dataset"], "...")` statements) from that value **as-is**; it does **not** append `.otel` there. Optional stream variable `data_stream.dataset` overrides replace the dataset string verbatim for policy output (see [`getFullInputStreams`](../server/services/agent_policies/package_policies_to_agent_inputs.ts)). Further routing defaults may still apply inside the collector or Elasticsearch exporter at runtime (outside Kibana).
+
+**Overrides:** If a user sets `data_stream.dataset` to a custom value (including values that already contain `.otel`), Fleet embeds that literal string in generated OTTL. Fleet does not strip or deduplicate a trailing `.otel`. Installed index templates remain tied to the **registry** dataset plus Fleet’s `.otel` suffix for EPM naming, **not** to the live policy variable—so a custom dataset can target backing indices that only resolve correctly when templates, `dataset_is_prefix`, or exporter routing align with that choice.
+
+**Agent output privileges:** [`storedPackagePoliciesToAgentPermissions`](../server/services/agent_policies/package_policies_to_agent_permissions.ts) builds index names from `compiled_stream?.data_stream?.dataset ?? stream.data_stream.dataset`. It does **not** apply the same `stream.vars['data_stream.dataset']` merge as `getFullInputStreams`. When debugging “permission denied” vs routing, compare full agent policy `data_stream.dataset` with the privilege index patterns.
+
+**Acceptance tests:** Routing transforms and dataset override behaviour for full agent policies are covered in [`agent_policy_otel_routing.ts`](../../../../test/fleet_api_integration/apis/agent_policy/agent_policy_otel_routing.ts) (Fleet API integration tests).
+
 ### Component Templates (as of 8.2)
 In order of priority from highest to lowest:
 - `.fleet_agent_id_verification-1` - added when agent id verification is enabled, sets the `.fleet_final_pipeline-1` and agent ID mappings. ([we plan to remove the ability to disable agent ID verification](https://github.com/elastic/kibana/issues/127041) )

x-pack/platform/plugins/shared/fleet/server/services/agent_policies/otel_collector.test.ts

Lines changed: 47 additions & 3 deletions
@@ -615,13 +615,15 @@ describe('generateOtelcolConfig', () => {
         context: 'span',
         statements: [
           'set(attributes["data_stream.type"], "traces")',
+          'set(attributes["data_stream.dataset"], "zipkinreceiver")',
           'set(attributes["data_stream.namespace"], "apmtest")',
         ],
       },
       {
         context: 'spanevent',
         statements: [
           'set(attributes["data_stream.type"], "logs")',
+          'set(attributes["data_stream.dataset"], "zipkinreceiver")',
           'set(attributes["data_stream.namespace"], "apmtest")',
         ],
       },
@@ -671,6 +673,39 @@ describe('generateOtelcolConfig', () => {
     });
   });
 
+  it('should include dataset in span routing transform for traces input without use_apm', () => {
+    const otelTracesInputNoAPM: FullAgentPolicyInput = {
+      ...otelTracesInputWithAPM,
+      streams: otelTracesInputWithAPM.streams?.map((stream) => {
+        const { use_apm: _useApm, ...rest } = stream as any;
+        return rest;
+      }),
+    };
+    const inputs: FullAgentPolicyInput[] = [otelTracesInputNoAPM];
+    const result = generateOtelcolConfig({ inputs, dataOutput: defaultOutput });
+
+    expect(
+      result.processors?.['transform/test-traces-stream-id-1-routing']?.trace_statements
+    ).toEqual([
+      {
+        context: 'span',
+        statements: [
+          'set(attributes["data_stream.type"], "traces")',
+          'set(attributes["data_stream.dataset"], "zipkinreceiver")',
+          'set(attributes["data_stream.namespace"], "apmtest")',
+        ],
+      },
+      {
+        context: 'spanevent',
+        statements: [
+          'set(attributes["data_stream.type"], "logs")',
+          'set(attributes["data_stream.dataset"], "zipkinreceiver")',
+          'set(attributes["data_stream.namespace"], "apmtest")',
+        ],
+      },
+    ]);
+  });
+
   it('should produce separate aggregated-apm-metrics pipelines for two APM package policies with different namespaces', () => {
     const inputA: FullAgentPolicyInput = {
       ...otelTracesInputWithAPM,
@@ -967,13 +1002,13 @@ describe('generateOtelcolConfig', () => {
     const inputs: FullAgentPolicyInput[] = [otelInputWithMultipleSignalTypes];
     const result = generateOtelcolConfig({ inputs, dataOutput: defaultOutput, packageInfoCache });
 
-    // dynamic_signal_types: data_stream.dataset is NOT set — deferred to ES exporter routing
     expect(result.processors?.['transform/test-multi-signal-stream-id-1-routing']).toEqual({
       log_statements: [
         {
           context: 'log',
           statements: [
             'set(attributes["data_stream.type"], "logs")',
+            'set(attributes["data_stream.dataset"], "multidataset")',
             'set(attributes["data_stream.namespace"], "default")',
           ],
         },
@@ -983,6 +1018,7 @@ describe('generateOtelcolConfig', () => {
           context: 'datapoint',
           statements: [
             'set(attributes["data_stream.type"], "metrics")',
+            'set(attributes["data_stream.dataset"], "multidataset")',
             'set(attributes["data_stream.namespace"], "default")',
           ],
         },
@@ -992,13 +1028,15 @@ describe('generateOtelcolConfig', () => {
           context: 'span',
           statements: [
             'set(attributes["data_stream.type"], "traces")',
+            'set(attributes["data_stream.dataset"], "multidataset")',
             'set(attributes["data_stream.namespace"], "default")',
           ],
         },
         {
           context: 'spanevent',
           statements: [
             'set(attributes["data_stream.type"], "logs")',
+            'set(attributes["data_stream.dataset"], "multidataset")',
             'set(attributes["data_stream.namespace"], "default")',
           ],
         },
@@ -1008,6 +1046,7 @@ describe('generateOtelcolConfig', () => {
           context: 'profile',
           statements: [
             'set(attributes["data_stream.type"], "profiles")',
+            'set(attributes["data_stream.dataset"], "multidataset")',
             'set(attributes["data_stream.namespace"], "default")',
           ],
         },
@@ -1019,13 +1058,13 @@ describe('generateOtelcolConfig', () => {
     const inputs: FullAgentPolicyInput[] = [otelInputWithMultipleSignalTypes2];
     const result = generateOtelcolConfig({ inputs, dataOutput: defaultOutput, packageInfoCache });
 
-    // dynamic_signal_types: data_stream.dataset is NOT set — deferred to ES exporter routing
     expect(result.processors?.['transform/test-multi-signal-stream-id-1-routing']).toEqual({
       log_statements: [
         {
           context: 'log',
           statements: [
             'set(attributes["data_stream.type"], "logs")',
+            'set(attributes["data_stream.dataset"], "multidataset")',
             'set(attributes["data_stream.namespace"], "default")',
           ],
         },
@@ -1035,6 +1074,7 @@ describe('generateOtelcolConfig', () => {
           context: 'datapoint',
           statements: [
             'set(attributes["data_stream.type"], "metrics")',
+            'set(attributes["data_stream.dataset"], "multidataset")',
             'set(attributes["data_stream.namespace"], "default")',
           ],
         },
@@ -1044,13 +1084,15 @@ describe('generateOtelcolConfig', () => {
           context: 'span',
           statements: [
             'set(attributes["data_stream.type"], "traces")',
+            'set(attributes["data_stream.dataset"], "multidataset")',
             'set(attributes["data_stream.namespace"], "default")',
           ],
         },
         {
           context: 'spanevent',
           statements: [
             'set(attributes["data_stream.type"], "logs")',
+            'set(attributes["data_stream.dataset"], "multidataset")',
             'set(attributes["data_stream.namespace"], "default")',
           ],
         },
@@ -1060,6 +1102,7 @@ describe('generateOtelcolConfig', () => {
           context: 'profile',
           statements: [
             'set(attributes["data_stream.type"], "profiles")',
+            'set(attributes["data_stream.dataset"], "multidataset")',
             'set(attributes["data_stream.namespace"], "default")',
           ],
         },
@@ -1095,13 +1138,13 @@ describe('generateOtelcolConfig', () => {
     const inputs: FullAgentPolicyInput[] = [otelInputWithSubsetSignalTypes];
     const result = generateOtelcolConfig({ inputs, dataOutput: defaultOutput, packageInfoCache });
 
-    // dynamic_signal_types: data_stream.dataset is NOT set — deferred to ES exporter routing
     expect(result.processors?.['transform/test-multi-signal-stream-id-1-routing']).toEqual({
       log_statements: [
         {
           context: 'log',
           statements: [
             'set(attributes["data_stream.type"], "logs")',
+            'set(attributes["data_stream.dataset"], "multidataset")',
             'set(attributes["data_stream.namespace"], "default")',
           ],
         },
@@ -1111,6 +1154,7 @@ describe('generateOtelcolConfig', () => {
           context: 'datapoint',
           statements: [
             'set(attributes["data_stream.type"], "metrics")',
+            'set(attributes["data_stream.dataset"], "multidataset")',
             'set(attributes["data_stream.namespace"], "default")',
           ],
         },
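The routing statements these tests expect follow a regular shape, which the sketch below reproduces. It is an illustration under stated assumptions and not the code in `otel_collector.ts`: `routingStatementsSketch` and `CONTEXT_TO_TYPE` are hypothetical names, and treating `null` as "omit the dataset statement" mirrors the pre-fix behaviour described in the commit message rather than a confirmed signature.

```typescript
type SignalContext = 'log' | 'datapoint' | 'span' | 'spanevent' | 'profile';

// Context-to-type mapping as it appears in the test expectations above;
// span events are routed as logs.
const CONTEXT_TO_TYPE: Record<SignalContext, string> = {
  log: 'logs',
  datapoint: 'metrics',
  span: 'traces',
  spanevent: 'logs',
  profile: 'profiles',
};

// Hypothetical generator for one context's OTTL statements. A `null` dataset
// skips the data_stream.dataset statement, which is what the regression did
// for every dynamic_signal_types stream.
function routingStatementsSketch(
  context: SignalContext,
  dataset: string | null,
  namespace: string
): string[] {
  const statements = [`set(attributes["data_stream.type"], "${CONTEXT_TO_TYPE[context]}")`];
  if (dataset !== null) {
    statements.push(`set(attributes["data_stream.dataset"], "${dataset}")`);
  }
  statements.push(`set(attributes["data_stream.namespace"], "${namespace}")`);
  return statements;
}

console.log(routingStatementsSketch('spanevent', 'zipkinreceiver', 'apmtest'));
```

Passing the same dataset for `span` and `spanevent` is what keeps span events on the policy dataset, as the commit message describes.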
