Description
For array fields treated as unordered sets, we should add synthetic_source_keep: "none" to the mappings to optimize storage under LogsDB. Fields like host.ip and related.ip would be candidates because order and duplicates are irrelevant. Adding this option prevents the array field from being stored in _source.
Support for this is in progress in Elasticsearch and will first be available in 8.16.
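As a rough illustration of what the resulting mapping could look like, the sketch below builds a mappings body for the two candidate fields named above. The helper `build_mapping` and the constant `UNORDERED_IP_FIELDS` are hypothetical names used only for this example, and the `ip` field type is assumed from the ECS definitions of these fields. How the option would actually be expressed in the ECS schema files and generated artifacts is not decided here.

```python
import json

# Candidate fields from this issue; order and duplicates do not matter for them.
UNORDERED_IP_FIELDS = ["host.ip", "related.ip"]


def build_mapping(field_paths):
    """Build a mappings body where each dotted path is an ip field
    carrying synthetic_source_keep: "none" (hypothetical helper)."""
    root = {"properties": {}}
    for path in field_paths:
        *parents, leaf = path.split(".")
        node = root
        # Walk or create the nested object mappings for each parent segment.
        for name in parents:
            node = node["properties"].setdefault(name, {"properties": {}})
        node["properties"][leaf] = {
            "type": "ip",  # assumed type, matching ECS for host.ip and related.ip
            "synthetic_source_keep": "none",
        }
    return root


mapping = build_mapping(UNORDERED_IP_FIELDS)
print(json.dumps({"mappings": mapping}, indent=2))
```

The printed JSON has the shape of a request body for index or component template mappings. Since the parameter is expected to first ship in 8.16, applying such a mapping to an older cluster would presumably be rejected.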
References
- https://github.com/elastic/elasticsearch/blob/a2df7e7229e772eddc9a8aba2ecfdbd162810c78/server/src/main/java/org/elasticsearch/index/mapper/Mapper.java#L33
- https://www.elastic.co/guide/en/elasticsearch/reference/current/mapping-source-field.html#synthetic-source-keep
Related
- LogsDB compatibility: specify if the ordering of arrays needs to be preserved #2372 (no longer relevant, as we switched to an opt-in model for array optimization in LogsDB)