Many duplicate dynamic mappers are created when indexing arrays of unknown fields. #117593

@original-brownbear

Description

In org.elasticsearch.index.mapper.DocumentParserContext#addDynamicMapper we have a TODO that asks to optimize the way we dynamically map object arrays:

        // TODO we may want to stop adding object mappers to the dynamic mappers list: most times they will be mapped when parsing their
        // sub-fields (see ObjectMapper.Builder#addDynamic), which causes extra work as the two variants of the same object field
        // will be merged together when creating the final dynamic update. The only cases where object fields need extra treatment are
        // dynamically mapped objects when the incoming document defines no sub-fields in them:
        // 1) by default, they would be empty containers in the mappings, is it then important to map them?
        // 2) they can be the result of applying a dynamic template which may define sub-fields or set dynamic, enabled or subobjects.
        dynamicMappers.computeIfAbsent(mapper.fullPath(), k -> new ArrayList<>()).add(mapper);

We should address this TODO. The current behavior can create many duplicate mapper instances, which causes a massive spike in heap usage when the first document that introduces a new (sub-)field via an array of objects is mapped.
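To make the growth pattern concrete, here is a minimal, self-contained sketch. It does not use the real Elasticsearch classes: `SketchMapper`, `DynamicMapperSketch`, `addNaive`, `addDeduplicated` and `countForArray` are all hypothetical names, and the sketch assumes that each element of an object array yields a fresh mapper for the same path, as described above. `addNaive` mirrors the shape of the `computeIfAbsent(...).add(mapper)` line; `addDeduplicated` is one possible direction, not a proposed patch.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a mapper; NOT the real Elasticsearch Mapper class.
final class SketchMapper {
    final String fullPath;
    final boolean isObject;

    SketchMapper(String fullPath, boolean isObject) {
        this.fullPath = fullPath;
        this.isObject = isObject;
    }
}

class DynamicMapperSketch {

    // Same shape as the quoted line: every call appends, even if the path was already seen.
    static void addNaive(Map<String, List<SketchMapper>> dynamicMappers, SketchMapper mapper) {
        dynamicMappers.computeIfAbsent(mapper.fullPath, k -> new ArrayList<>()).add(mapper);
    }

    // Assumed alternative: keep only the first object mapper recorded for a path.
    // Whether this is safe for dynamic-template results (case 2 in the TODO) is not settled here.
    static void addDeduplicated(Map<String, List<SketchMapper>> dynamicMappers, SketchMapper mapper) {
        List<SketchMapper> existing = dynamicMappers.computeIfAbsent(mapper.fullPath, k -> new ArrayList<>());
        if (mapper.isObject && !existing.isEmpty()) {
            return; // an object mapper for this path is already pending
        }
        existing.add(mapper);
    }

    // Simulates parsing an array of `elements` objects under the unknown field "items"
    // and returns how many mappers end up pending for that path.
    static int countForArray(int elements, boolean deduplicate) {
        Map<String, List<SketchMapper>> dynamicMappers = new HashMap<>();
        for (int i = 0; i < elements; i++) {
            // A new instance per array element; the sketch does not avoid this allocation.
            SketchMapper mapper = new SketchMapper("items", true);
            if (deduplicate) {
                addDeduplicated(dynamicMappers, mapper);
            } else {
                addNaive(dynamicMappers, mapper);
            }
        }
        return dynamicMappers.get("items").size();
    }

    public static void main(String[] args) {
        System.out.println(countForArray(10_000, false)); // prints 10000
        System.out.println(countForArray(10_000, true));  // prints 1
    }
}
```

In this toy model the naive variant retains one mapper per array element until the dynamic update is built, while the deduplicating variant retains one. The sketch only shows the retained list; a real fix would presumably also need to avoid building the duplicate instances in the first place.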

Metadata

Labels

:Search Foundations/Mapping (Index mappings, including merging and defining field types)
>bug
Team:Search Foundations (Meta label for the Search Foundations team in Elasticsearch)
priority:normal (A label for assessing bug priority to be used by ES engineers)
