Closed
Description
Meta issue: #121652
The initial implementation of FORK will be under snapshot and will have the following restrictions:
- First-level data retrieval only - not yet a general-purpose bifurcation of the stream. This allows us to support multiple different subqueries. To support bifurcation of the stream, the planner will have to determine that the fork is actually being performed in second-stage retrieval. This is a pragmatic limitation that we can lift later.
- All branches of the fork must return the same data schema (same columns). This is a pragmatic limitation that we can lift later. For this reason, only WHERE, SORT, and LIMIT are supported within fork subqueries.
- No fork within a fork. This is a pragmatic limitation that we can lift later.
- Lucene queries are independent - no point-in-time. We can add this later.
- Fork branches are automatically named. We can provide the ability to name the branches later.
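
As an illustration of the restrictions above, here is a minimal sketch of a query that should fit within them. It reuses the employees index and the salary and languages fields that appear in the bug reports further down; the thresholds and limits are arbitrary values chosen for the example, not taken from any test.

```esql
FROM employees
| FORK (WHERE salary > 50000 | SORT salary DESC | LIMIT 5)
       (WHERE languages == 1 | SORT salary ASC | LIMIT 5)
```

FORK is applied directly to the FROM source (first-level retrieval), each branch uses only WHERE, SORT, and LIMIT so both return the same columns, and neither branch contains another FORK. The branches are not named in the query, since naming is automatic.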
Tasks:
- ES|QL: Initial grammar and changes for FORK (snapshot) #121948
- streaming results from subplans, rather than using the Inline stats trick from EsqlSession Streaming execution for FORK #126389
- support queries using semantic search ES|QL: Enable semantic search in FORK #125960
- optimize field resolution so we don't return all fields when FORK is used (see fieldNames in EsqlSession) ES|QL: Make Fork n-ary #126074
- add verification and metric gathering for subplans ES|QL: Make Fork n-ary #126074
- fix nesting of command names in expressions (see testCommandNamesAsIdentifiersWithLimit)
- revisit the use of explicit analyser rules, e.g. ImplicitForkCasting, etc, with the intention of generalising them (rather than an explicit "fork variant" for each rule) ES|QL: Make Fork n-ary #126074
- validation when using more than a single FORK Initial FORK with restrictions #121950
- make sure the profile information contains all drivers and that they are properly tagged with the fork branch identifier ES|QL: label drivers in profile info for FORK #128318
- field names shenanigans ES|QL: Fix column references for FORK #127209 ES|QL: Improve field resolution for FORK #128501
- ES|QL: Return columns for unsupported field types in FORK #128508
- ES|QL: Support STATS after FORK #128745 fix for
FROM employees | FORK (EVAL x = 1) ( EVAL y = 1) | stats max(salary)
ends up with "stack_trace": "org.elasticsearch.ElasticsearchException$1: Cannot invoke \"org.elasticsearch.xpack.esql.planner.Layout$ChannelAndType.channel()\" because the return value of \"org.elasticsearch.xpack.esql.planner.Layout.get(org.elasticsearch.xpack.esql.core.expression.NameId)\" is null
- ES|QL: Support LOOKUP JOIN with FORK #128839 fix for
FROM employees | EVAL language_code = languages | LOOKUP JOIN languages_lookup ON language_code | FORK (EVAL x = 1) ( EVAL y = 1) | keep l*
results in org.elasticsearch.xpack.esql.plan.physical.ProjectExec cannot be cast to class org.elasticsearch.xpack.esql.plan.physical.EsQueryExec
- FORK and union types - fixed as part of ES|QL: Add FORK generative tests #129135
- telemetry (sneaked it in ES|QL: CCS check for FORK/RERANK/COMPLETION #129463)
- Generative tests for FORK using existing CSV tests ES|QL: Add FORK generative tests #129135
- ES|QL: Add test query generator for FORK #129415
- Disable CCS ES|QL: CCS check for FORK/RERANK/COMPLETION #129463
- Fix for FORK branches with mixed outputs and unsupported field types #129636
- Out of snapshot PR ES|QL: Make fork available in release builds #129606
- ES|QL: Add number of max branches for FORK #129834
- ES|QL: Resolve Keep plan added to FORK branches #129754
- docs ES|QL: Add docs for FORK #130314