
Resolve discrepancy between text subfield handling for *.name fields in ecs@mappings #2353

Open
@felixbarny


The ecs@mappings component template that ships with Elasticsearch by default has a dynamic template with path match on *.name that adds a .text subfield. However, in actual ECS, not all *.name fields have a .text subfield.
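To make the mismatch concrete, here is a minimal sketch of what a `*.name` path-match rule does. The template name `all_name_fields`, the `keyword` / `match_only_text` types, and the helper `gets_text_subfield` are assumptions for illustration, not a copy of the shipped ecs@mappings. Python's `fnmatchcase` stands in for Elasticsearch's wildcard matching on the full dotted path.

```python
from fnmatch import fnmatchcase

# Illustrative dynamic template. The shape follows Elasticsearch's dynamic
# template syntax; the rule name and the concrete types are assumptions.
NAME_TEMPLATE = {
    "all_name_fields": {  # hypothetical rule name
        "path_match": "*.name",
        "mapping": {
            "type": "keyword",
            "fields": {"text": {"type": "match_only_text"}},
        },
    }
}


def gets_text_subfield(field_path: str) -> bool:
    """Return True if the wildcard rule would add a .text subfield."""
    rule = NAME_TEMPLATE["all_name_fields"]
    return fnmatchcase(field_path, rule["path_match"])


if __name__ == "__main__":
    for path in ("user.name", "host.name", "host.hostname"):
        print(path, gets_text_subfield(path))
```

The rule matches purely on the path, so every field ending in `.name` gets the subfield, regardless of what the ECS definition for that field says.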

We still added the *.name rule in elastic/elasticsearch#96171 because it makes the component template smaller, more consistent, more generic, and more forward compatible: we don't need to keep adding field definitions for every new *.name field.

However, in effect, the ecs@mappings component template isn't technically ECS compliant, which leads to issues like elastic-package reporting errors when integrations rely on ecs@mappings: elastic/elastic-package#1971. It also seems like whether or not a field should have a text subfield is a decision we should make at the ECS level rather than being a side-effect of how ecs@mappings is implemented.

In total, there are 150 ECS fields that end with .name. Of these, 41 have a .text subfield and 109 don't.
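A count like this can be reproduced with a small script over a flat map of field definitions. The sketch below assumes each definition lists its subfields under a `multi_fields` key with a `name` entry; the `SAMPLE_FIELDS` data is a tiny hand-written stand-in, not the real ECS field set, so it does not reproduce the 41/109 split.

```python
def split_name_fields(flat_fields):
    """Partition fields ending in .name by whether they define a text subfield."""
    with_text, without_text = [], []
    for name, definition in flat_fields.items():
        if not name.endswith(".name"):
            continue
        has_text = any(
            mf.get("name") == "text" for mf in definition.get("multi_fields", [])
        )
        (with_text if has_text else without_text).append(name)
    return sorted(with_text), sorted(without_text)


# Hypothetical sample data, for illustration only.
SAMPLE_FIELDS = {
    "user.name": {"type": "keyword", "multi_fields": [{"name": "text"}]},
    "process.name": {"type": "keyword", "multi_fields": [{"name": "text"}]},
    "host.name": {"type": "keyword"},
    "host.hostname": {"type": "keyword"},
}

if __name__ == "__main__":
    with_text, without_text = split_name_fields(SAMPLE_FIELDS)
    print(len(with_text), "with .text;", len(without_text), "without")
```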

There are multiple options to move forward from here:

  1. Do nothing: define that having additional subfields is not a violation of ECS. We'll update the validation logic for elastic-package accordingly. However, ECS compatibility then carries weaker guarantees. It also means that ecs@mappings has a less efficient mapping compared to "proper ECS".
  2. Align ecs@mappings with the current definition of ECS. We'll probably implement this by listing the fields that have a .text subfield, as there are fewer of them. In other words, *.name fields won't have a .text subfield by default. We should expect changes to ecs@mappings to be a bit more frequent and the mapping to be less forward compatible.
  3. Align ECS with ecs@mappings to make sure all *.name fields are mapped consistently. It'll be a bit easier for users to reason about what type of queries they can expect to work on *.name fields. It would also bring us closer to a place where ECS is built around naming conventions rather than one-off per field decisions. However, this would add a bit more storage overhead compared to what we have today.
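The difference between option 1 and today's strict behavior can be sketched as a single flag in a subfield comparison. This is a hypothetical helper, not elastic-package's actual validation code; `subfield_violations` and `allow_extra` are names made up for the example.

```python
def subfield_violations(expected, actual, allow_extra=False):
    """Compare the subfields of an actual mapping against the ECS definition.

    expected / actual are mapping dicts with an optional "fields" key.
    With allow_extra=True (option 1), additional subfields are tolerated.
    """
    expected_subs = set(expected.get("fields", {}))
    actual_subs = set(actual.get("fields", {}))
    problems = [f"missing subfield: {s}" for s in sorted(expected_subs - actual_subs)]
    if not allow_extra:
        problems += [
            f"unexpected subfield: {s}" for s in sorted(actual_subs - expected_subs)
        ]
    return problems


if __name__ == "__main__":
    ecs_definition = {"type": "keyword"}  # field defined without .text
    from_wildcard_rule = {
        "type": "keyword",
        "fields": {"text": {"type": "match_only_text"}},
    }
    print(subfield_violations(ecs_definition, from_wildcard_rule))
    print(subfield_violations(ecs_definition, from_wildcard_rule, allow_extra=True))
```

Under options 2 and 3 the strict check would pass without the flag, because the two definitions would be brought into agreement instead.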

cc'ing a couple of folks that may have thoughts on this: @ruflin @eyalkoren @jsoriano @zmoog @andrewkroh @P1llus
