ESQL: incorrect planning of remote enrich #118531

Open
@bpintea

Description

A remote ENRICH query is planned slightly incorrectly when its policy enriches with a field that is already present in the query: the query works, but the plan fails verification.

Example: the policy hosts matches on ip and has ip and os as its enrich_fields:

FROM *:events,events
| EVAL ip = TO_STR(host)
| SORT timestamp, user, ip
| LIMIT 5
| ENRICH _REMOTE:hosts ON ip
| KEEP host, timestamp, user, os

This produces the optimised physical plan:

ProjectExec[[host{f}#14, timestamp{f}#16, user{f}#15, os{r}#21]]
\_TopNExec[[Order[timestamp{f}#16,ASC,LAST], Order[user{f}#15,ASC,LAST], Order[ip{r}#3,ASC,LAST]],5[INTEGER],null]
  \_ExchangeExec[[host{f}#14, timestamp{f}#16, user{f}#15, os{r}#21, ip{r}#3],false]
    \_FragmentExec[filter=null, estimatedRowSize=0, reducer=[], fragment=[<>
        Project[[host{f}#14, timestamp{f}#16, user{f}#15, os{r}#21, ip{r}#3]]
        \_Enrich[REMOTE,[68 6f 73 74 73][KEYWORD],ip{r}#3,{"match":{"indices":[],"match_field":"ip","enrich_fields":["ip","os"]}},{=.enrich-hosts-1733836249291, c1=.enrich-hosts-1733836248939, c2=.enrich-hosts-1733836249107},[ip{r}#20, os{r}#21]]
          \_TopN[[Order[timestamp{f}#16,ASC,LAST], Order[user{f}#15,ASC,LAST], Order[ip{r}#3,ASC,LAST]],5[INTEGER]]
            \_Eval[[TOSTRING(host{f}#14) AS ip]]
              \_EsRelation[events,c1:events,c2:events][host{f}#14, timestamp{f}#16, user{f}#15]<>]]

Note that in the fragment, Project outputs ip{r}#3 (this node is produced by ProjectAwayColumns, based on the TopN below Enrich), whereas the Enrich below it outputs ip{r}#20 (since the policy also enriches with its own ip field). Verification therefore fails later, when remapping the fragment plan on ProjectExec, because its inputs do not provide an ip{r}#3. (If we also KEEP ip, verification would fail as well, this time due to attributes with duplicate names.)
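The mismatch can be reproduced outside the planner with a small model. The following is only a sketch, not Elasticsearch code: `Attr`, `enrich_output` and `missing_inputs` are hypothetical names, and the shadowing rule (an enrich field replaces a same-named child attribute) is an assumption based on the plan above. Attributes are compared by name and id, as in the `name{r}#id` notation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Attr:
    """Hypothetical stand-in for a plan attribute such as ip{r}#3."""
    name: str
    id: int


def enrich_output(child_output, enrich_fields):
    # Assumption: enrich fields shadow same-named attributes of the child.
    shadowed = {a.name for a in enrich_fields}
    return [a for a in child_output if a.name not in shadowed] + list(enrich_fields)


def missing_inputs(required, provided):
    # Attributes a node references that its child does not output (name AND id must match).
    available = set(provided)
    return [a for a in required if a not in available]


host, timestamp, user = Attr("host", 14), Attr("timestamp", 16), Attr("user", 15)
eval_ip = Attr("ip", 3)                               # produced by EVAL ip = TO_STR(host)
enrich_ip, enrich_os = Attr("ip", 20), Attr("os", 21)  # produced by the hosts policy

topn_output = [host, timestamp, user, eval_ip]
fragment_enrich = enrich_output(topn_output, [enrich_ip, enrich_os])

# The fragment's Project, as generated from the TopN below Enrich.
project_refs = [host, timestamp, user, enrich_os, eval_ip]

missing = missing_inputs(project_refs, fragment_enrich)
print(missing)  # [Attr(name='ip', id=3)]
```

Under these assumptions the only unresolved reference is ip#3, which matches the verification failure described above: the name ip is available from Enrich, but under a different id.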

Normally we would drop ip after the TopN, but remote Enrich planning pushes the Enrich to the remote cluster, and ip is still needed by the coordinator TopN.

Related: #118307.
