Skip to content

[ES|QL] Date nanos implicit casting in union types #123678

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed

Conversation

fang-xing-esql
Copy link
Member

@fang-xing-esql fang-xing-esql commented Feb 28, 2025

Resolves #110009

The goal of this PR is to support union typed fields that reference both date and date_nanos fields without explicit casting. Union typed(mixed with date and date_nanos) fields are casted to a common data type during Analyzer phase automatically. numeric or string typed fields are not within the scope of this PR.

Today, queries that reference union(mixed with date and date_nanos) typed fields like below either return nulls or fail, because they reference union typed fields without explicit casting. After this PR, they should not fail or return null.

mapping
curl -u elastic:password -X PUT "localhost:9200/sample_data_1?pretty" -H 'Content-Type: application/json' -d'
{
  "mappings": {
    "properties": {
      "client.ip": {"type": "ip"},
      "message": {"type": "text"},
      "test_date": {"type": "date"},
      "event.duration": {"type": "long"}
    }
  }
}
'
curl -u elastic:password -X PUT "localhost:9200/sample_data_2?pretty" -H 'Content-Type: application/json' -d'
{
  "mappings": {
    "properties": {
      "client.ip": {"type": "ip"},
      "message": {"type": "keyword"},
      "test_date": {"type": "date_nanos"},
      "event.duration": {"type": "integer"}
    }
  }
}
'
queries
+ curl -u elastic:password -X POST 'localhost:9200/_query?format=txt&pretty' -H 'Content-Type: application/json' '-d
{
  "query": "FROM sample_data*"
}
'
       @timestamp       |   client.ip   |event.duration |    message    |   test_date   
------------------------+---------------+---------------+---------------+---------------
2023-10-23T12:15:03.360Z|172.21.2.162   |null           |null           |null           
2023-10-23T12:27:28.948Z|172.21.2.113   |null           |null           |null           
2023-10-23T13:33:34.937Z|172.21.0.5     |null           |null           |null           
2023-10-23T13:51:54.732Z|172.21.3.15    |null           |null           |null           
2023-10-23T13:52:55.015Z|172.21.3.15    |null           |null           |null           
2023-10-23T13:53:55.832Z|172.21.3.15    |null           |null           |null           
2023-10-23T13:55:01.543Z|172.21.3.15    |null           |null           |null           
2023-10-23T12:15:03.360Z|172.22.2.162   |null           |null           |null           
2023-10-23T12:27:28.948Z|172.22.2.113   |null           |null           |null           
2023-10-23T13:33:34.937Z|172.22.0.5     |null           |null           |null           
2023-10-23T13:51:54.732Z|172.22.3.15    |null           |null           |null           
2023-10-23T13:52:55.015Z|172.22.3.15    |null           |null           |null           
2023-10-23T13:53:55.832Z|172.22.3.15    |null           |null           |null           
2023-10-23T13:55:01.543Z|172.22.3.15    |null           |null           |null           

+ curl -u elastic:password -X POST 'localhost:9200/_query?format=txt&pretty' -H 'Content-Type: application/json' '-d
{
  "query": "FROM sample_data* | sort test_date"
}
'
{
  "error" : {
    "root_cause" : [
      {
        "type" : "verification_exception",
        "reason" : "Found 1 problem\nline 1:26: Cannot use field [test_date] due to ambiguities being mapped as [2] incompatible types: [date_nanos] in [sample_data_2], [datetime] in [sample_data_1]"
      }
    ],
    "type" : "verification_exception",
    "reason" : "Found 1 problem\nline 1:26: Cannot use field [test_date] due to ambiguities being mapped as [2] incompatible types: [date_nanos] in [sample_data_2], [datetime] in [sample_data_1]"
  },
  "status" : 400
}

A field can be referenced in nearly all the processing commands, and this PR should cover all the possible places a (union typed)field can appear:

  • evalCommand
  • whereCommand
  • keepCommand
  • sortCommand
  • dropCommand
  • renameCommand
  • statsCommand
  • dissectCommand - works for keyword and text only, out of the scope of this PR
  • grokCommand - works for keyword and text only, out of the scope of this PR
  • enrichCommand - enrich does not support date_nanos yet
  • mvExpandCommand
  • joinCommand - lookup join on date_nanos field does not work today, out of the scope of this PR

@elasticsearchmachine
Copy link
Collaborator

Hi @fang-xing-esql, I've created a changelog YAML for you.

@fang-xing-esql
Copy link
Member Author

@elasticmachine update branch

@costin
Copy link
Member

costin commented Mar 20, 2025

Fang, can you please double check the status of this PR? Is it ready for review or still working on it?

@fang-xing-esql
Copy link
Member Author

Fang, can you please double check the status of this PR? Is it ready for review or still working on it?

I'm trying to make enrich work with union typed fields, and this is the last piece that I want to figure out before making this ready for review.

@fang-xing-esql fang-xing-esql marked this pull request as ready for review April 8, 2025 02:35
@elasticsearchmachine elasticsearchmachine added the Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) label Apr 8, 2025
@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/es-analytical-engine (Team:Analytics)

@alex-spies alex-spies self-requested a review April 8, 2025 15:19
Copy link
Contributor

@alex-spies alex-spies left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not done with the review yet, but left a first round of remarks - mostly questions.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we emit a warning, given that the user may be unaware of the auto-cast, AND the auto-cast can lead to some dates being out of range of the cast?

Copy link
Contributor

@alex-spies alex-spies Apr 9, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did we consider adding an optional query parameter that would enable this lenient behavior, rather than making this the default?

Changing the output for the hire_date field from all nulls to the auto-casted data is technically breaking, which would be mitigated by this approach; although it'd be admittedly very weird if a query relied on the nulls being there.

Update: Or a setting. That's how the changing of the default behavior on partial failures is being rolled out: there is a setting that controls the behavior, and the actual change of default behavior is then just a switch of the default value of the setting.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also wonder:

For wide index pattern queries like the simple FROM logs-*, currently the result on large clusters with many indices consists of maaaany null columns, because of mapping conflicts.

Loading/sending constant null blocks is comparably cheap.

If we start enabling auto-casting whenever we can, these queries will now read and return actual data; this might cause performance issues for this kind of query, no?

@fang-xing-esql , maybe we should check in with the others, ensure that this case is covered by our nightlies, and observe if this noticeably changes the performance in some cases, once this is merged?


FROM employees, employees_incompatible
| DROP birth_date
| EVAL x = emp_no + 1, y = hire_date + 1 year, z = hire_date::datetime
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens to compute hire_date::datetime? Is all converted to nanos, first, when extracting the field initially - and then cast to datetimes?

If so, this would be losing datetimes that are far in the future/past. We should probably ensure that we skip the first cast to date_nanos if the first thing we do with the field attribute is casting it back to datetime. If we don't do that already. (Not deep enough into the review, yet, so apologies if that's already addressed.)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should add a test case for this. We should probably stick to our current union type logic when encountering an explicit cast of a union type field, e.g. hire_date::datetime - and therefore, dates outside of the nanos range should pose no problem, while nanos will have to undergo rounding.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't double cast, if a (union typed)field has been cast to a type explicitly, it won't be casted to date_nanos implicitly again. ResolveUnionTypes transforms the InvalidMappedFields(resolved) with explicit casting into MultiTypeEsFields, and ImplicitCastingForUnionTypedFields happens after it. ImplicitCastingForUnionTypedFields finds the InvalidMappedFields(resolved, mixed with date and date_nanos) without explicit casting, and adds implicit casting for them, the AnalyzerTests.testImplicitCastingForDateAndDateNanosFields is to check there is no double casting.

Copy link
Contributor

@alex-spies alex-spies left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, done now. Thanks @fang-xing-esql , this is a good start and I like the strong test coverage.

However, I see the following problems:

  • This seems to go straight to production. Please, let's guard this behind SNAPSHOT for now. We could add a setting or pragma to control this, and flip the setting's default value later when we're confident that all is well.
  • The scope is expanded to all numerical types. I think we should check back that this really doesn't have unintended side effects before moving forward.
  • This could accidentally worsen the performance of wide index pattern queries, like FROM log-s* | limit 10. We'd have to confirm via benchmarks.
  • I think the scope is also accidentally expanded to allow arbitrary comparisons between date time and nanos in any expression, due to the change to the EsqlDataTypeConverter, but we don't have corresponding tests.
  • The code needs to special case, in multiple places, for a white list of commands that are currently working and tested with this. This is brittle and will break when we add new commands. This also adds a lot of code to facilitate this separately from the ResolveUnionTypes analyzer rule by injecting Evals, which I think could be simplified. We also require changes to the type resolution of several commands to account for UNSUPPORTED, which we didn't need for union types, previously.

I think the injecting of EVALs also implies that in a query like

FROM idx* | eval x  =  multi_typed_date_and_nanos | eval y = multi_typed_date_and_nanos::datetime

the column y will be computed by first transforming to nanos on block loading, and then to datetimes in the compute engine - rather than loading the blocks as datetimes directly, leading to a lossy conversion.

I'm not sure, but the same could happen between long and float/double, potentially leading to unintended lossy conversions if the long has more digits than float/double can hold?

I think there may be a simpler way to implement this, which would lean on the existing union type logic more and wouldn't have to special case so much, or require injecting Evals, or run into the problem of accidental lossy conversions.
Currently, in ResolveUnionTypes, we scan the query for expressions of the kind some_multi_typed_field::target_type - and if we find them, we create a new field attribute in the EsRelation for the field loaded as the target type, and replace the expression by this field attribute. (We also deduplicate based on the target type if we've already come across this conversion).
I expect that we could get a simpler implementation if we expanded the scope of the rule to also look for multi-typed field attributes (except those that were already replaced because they were inside a convert function) and treat those field attributes as if they were of the form some_multi_typed_field::target_type, and just apply the same substitution by the field attribute that's been cast to the target type.
If I'm correct, the changes may be reduced to just determining which target type to apply there - everything else should work as it always did with union types.

required_capability: implicit_casting_union_typed_numeric_and_date

FROM employees, employees_incompatible
| STATS c = count(*), b = BUCKET(hire_date, 1 year) + 1 year BY yr = BUCKET(hire_date, 1 year)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we could add a test where we use both hire_date::datetime and hire_date (auto-cast to nanos) in the same STATS?

Comment on lines 1343 to 1349
if (p instanceof OrderBy
|| p instanceof EsqlProject
|| p instanceof Aggregate
|| p instanceof MvExpand
|| p instanceof Eval
|| p instanceof LookupJoin
|| p instanceof Enrich) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A static white list of commands will be brittle. Newly added commands will not work with this approach and will require that implementers go back and understand the implementation here.

Copy link
Member Author

@fang-xing-esql fang-xing-esql Apr 11, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll keep a note here before I forget why I decided to do this outside of ResolveUnionTypes, instead of adding more complexity into ResolveUnionTypes. There are a couple of reasons that I can think of now:

No.1 is related to resolving LogicalPlan, especially when a union typed field is referenced multiple times in the query.

For example there is a query like below:

from log | where union_typed_field < "2025-04-11" | sort union_typed_field;

If we do the implicit casting in ResolveUnionTypes, without checking the parent/child plan, we may end up cast the union_typed_field twice, which is unnecessary. We may need to add some complexity in this rule to check parent/child relationship, and reduce the amount the casting.

No.2 is that not all of the places where take an InvalidMappedField, take an expression like a ConversionFunction, an Eval plan needs to be added before the plan that references an InvalidMappedField, for example keep/drop/mv_expand/enrich they take identifier or patterns, but they don't take a functionExpression.

To be able to tangle these situation, we need to know the LogicalPlan, knowing it is an InvalidMappedField is not enough. I think that's the main reason why that I decided not to add this complexity into ResolveUnionTypes, and leave ResolveUnionTypes do its own job to deal with conversion functions.

When adding a new LogicalPlan, we will need to keep in mind that what kind of expressions are supported.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we do the implicit casting in ResolveUnionTypes, without checking the parent/child plan, we may end up cast the union_typed_field twice, which is unnecessary. We may need to add some complexity in this rule to check parent/child relationship, and reduce the amount the casting.

This is a good thing to be paranoid about, but avoiding multiple casts of the same field attributes is a problem that ResolveUnionTypes already solved.

For each target type that union_typed_field is cast to in the query, ResolveUnionTypes adds 1 field attribute $$union_typed_field$converted_to$target_type to the EsRelation, and all casts of this field to target_type are simply replaced by this new field attribute in all expressions. Then later, once field extractions are getting inserted, only one extraction is inserted at the first point where ``$$union_typed_field$converted_to$target_type` is first referenced in the query plan.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I spent some time trying to solve this problem with a smaller change to ResolveUnionTypes, only, and while multiple casts would be easily solved this way, I didn't manage to find a solution for the problem that you stated:

No.2 is that not all of the places where take an InvalidMappedField, take an expression like a ConversionFunction, an Eval plan needs to be added before the plan that references an InvalidMappedField, for example keep/drop/mv_expand/enrich they take identifier or patterns, but they don't take a functionExpression.

Replacing datetime_nanos_field by $$datetime_nanos_field$converted_to$nanos everywhere in the plan as part of ResolveUnionTypes can't work because changing the name messes up the resolution, e.g. when there's a downstream EVAL that shadows datetime_nanos field.

Comment on lines 1355 to 1357
// Add an implicit EsqlProject on top of the whole query if there isn't one yet,
// without explicit or implicit casting, a union typed field is returned as a null.
newPlan = addImplicitProjectForInvalidMappedFields(newPlan);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is already logic that does this in UnionTypesCleanup, which seems to be duplicated here at least to some extent.

Comment on lines 1410 to 1412
case EsqlProject p -> {
return esqlProjectForInvalidMappedField(p.source(), p.child(), aliases, newProjections);
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Explicit special casing for all kinds of commands will be brittle. Also, it looks like we're not leveraging the existing code and logic in ResolveUnionTypes.

for (Expression e : newProjections) { // Add groupings
newAggs.add(Expressions.attribute(e));
}
return agg.with(evalForInvalidMappedField(agg.source(), agg.child(), aliases), newProjections, newAggs);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because we inject an EVAL here, there's a chance that we'll accidentally inject EVALs multiple times. Which could lead to broken plans, when an upstream EVAL already replaced the multi-typed field attribute by a reference attribute of the same name. Or, put the other way around, since we're replacing a field by another one with the same name, downstream plans may have invalid references now.

This could happen in case that the reference resolution manages to resolve the whole query plan, with multiple references to multi-typed fields, before we start adding Evals.

List<? extends NamedExpression> origAggs = agg.aggregates();
List<NamedExpression> newAggs = new ArrayList<>(origAggs.size());
for (int i = 0; i < origAggs.size() - newProjections.size(); i++) { // Add aggregate functions
newAggs.add(origAggs.get(i));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Normally, the way to update the expressions of a plan is to transform the expressions and use AttributeMap.resolve with an AttributeMap<Attribute> that contains which attribute should be replaced by which other attribute.

A date in the valid date-nanos range should be acceptable to parse. This issue was showing up in a difference in behaviour between Elasticsearch ingest and the CsvTests. This fix makes CsvTests behave the same as Elasticearch itself.
Copy link
Contributor

@craigtaverner craigtaverner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ran out of time, so I'll post the comments so far.

@fang-xing-esql fang-xing-esql changed the title [ES|QL] Date nanos implicit casting [ES|QL] Date nanos implicit casting in union types Apr 21, 2025
@fang-xing-esql
Copy link
Member Author

fang-xing-esql commented Apr 22, 2025

Thanks for reviewing and brainstorming @craigtaverner @alex-spies ! Most of the comments have been addressed, here is a summary of the list of things that have been discussed:

  1. The scope of this PR is reduced to do implicit casting for union typed fields with date and date_nanos only. It makes sense to take numeric types out, as there are a lot of questions on date_nanos already, date_nanos can be a pilot, so that we can collect feedbacks before supporting the other data types.
  2. We don't do double casting, if there is explicit casting on union typed fields with date and date_nanos coded in the query, they won't be cast to date_nanos implicitly.
  3. The implicit casting for union typed fields with date and date_nanos is taken out of the existing ImplicitCasting rule, a new rule is added after ResolveUnionTypes, to do implicit casting for the union typed fields(without explicit casting) that are not resolved by ResolveUnionTypes. We are not always able to replace a union typed field with a conversion function in place, sometimes an eval child plan needs to be added for the conversion functions, we need to traverse the plan tree, we can do it inside the ResolveUnionTypes rule, if it is preferred, however it will add some complexity to it, that's the main reason that I keep it as a separate rule in ImplicitCastingForUnionTypedFields.
  4. Enrich doesn't support date_nanos as a match field(checked in resolveEnrich), it is out of the scope of this PR.
  5. Lookup join does not work with date_nanos join keys yet, it is out of the scope of this PR, join is also a binary plan, in the future if we want to support it, we need some special handlings for binary plans when traversing and transforming the plan.

There are some comments that haven't been addressed yet:

  1. Add a warning message when we do implicit casting to date_nanos, I think this is a good idea, I haven't added it as I want to clarify with the other folks whether casting to date_nanos is the final decision, as date_nanos have a smaller range of valid values compared to date. I'll reach out to relevant folks to collect feedbacks on which data type to cast to.
  2. Performance. Filter and aggregations are the most interesting ones. Adding implicit casting may affect performance as it may prevent predicates to be pushed down into Lucene, and adding conversion functions to union typed fields add another layer of processing for operators that are not pushed down into Lucene(today we don't push down aggregations in a lot of cases). Although these queries that reference union typed fields fail at Verifier today, and allowing them to run is a step forward, if we can improve their performance by pushing down more into Lucene and reduce an extra layer of processing, that will be ideal. I'll investigate more about the possibility of unwrapping the conversion functions and pushing down more at data nodes.

Copy link
Contributor

@craigtaverner craigtaverner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks much nicer to me! I have a few comments and questions, but otherwise I think it is good.

@@ -180,7 +187,9 @@ public class Analyzer extends ParameterizedRuleExecutor<LogicalPlan, AnalyzerCon
*/
new ImplicitCasting(),
new ResolveRefs(),
new ResolveUnionTypes() // Must be after ResolveRefs, so union types can be found
new ResolveUnionTypes(), // Must be after ResolveRefs, so union types can be found
// Must be after ResolveUnionTypes, if there is explicit casting on the union typed fields, implicit casting won't be added
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice solution!

// MultiTypeEsField, and don't be double cast here.
Map<FieldAttribute, Alias> invalidMappedFieldCasted = new HashMap<>();
LogicalPlan transformedPlan = plan.transformUp(LogicalPlan.class, p -> {
if (p instanceof UnaryPlan == false && p instanceof LookupJoin == false) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So we allow through UnaryPlan and LookupJoin, but later we do nothing for LookupJoin. Is this a placeholder for a further update supporting LookupJoin?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll add a comment here for LookupJoin, and remove the instanceof check, I was attempting to make date_nanos fields work as join keys, however it didn't workout easily for me. Thank you for creating a separate issue for it!

private static Expression castInvalidMappedField(DataType targetType, FieldAttribute fa) {
Source source = fa.source();
return switch (targetType) {
case DATETIME -> new ToDatetime(source, fa); // in case we decided to use DATE as a common type instead of DATE_NANOS
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice as an example, but is effectively dead code. I wonder what are the chances we'll change our mind on this one?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It feels like we're making a decision prioritizing the LogsDb plans. But could we be causing trouble for other users? Or is the thinking that other users need to add an explicit cast? That seems a reasonable assumption.

// if so add a project with eval, so that a not null value can be returned for a union typed field
if (logicalPlan.resolved()) {
List<Attribute> output = logicalPlan.output();
Map<FieldAttribute, Alias> newAliases = Maps.newHashMapWithExpectedSize(output.size());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This code is very similar to code above, but that also took into account previous/existing field mappings. Why is that not needed here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The goal of castInvalidMappedFieldInFinalOutput is to add implicit casting to mixed date/date_nanos fields, that are not referenced by the query(there was no chance to do implicit casting for them if they are not referenced in any commands), however they are in the returned resultsets. For example, from index*. It only checks the top level plan's output, it doesn't traverse the entire tree, so it doesn't have a map to keep track of the casted fields map across all plans/subplans.

@fang-xing-esql fang-xing-esql added the test-release Trigger CI checks against release build label Apr 25, 2025
@alex-spies
Copy link
Contributor

alex-spies commented Apr 25, 2025

Discussed offline with @fang-xing-esql and @craigtaverner.

I think the proposed solution should work, it's just a little complex given that it needs to replicate a) some logic present in InsertFieldExtraction (inserting EVALs for implicitly cast fields just before they're used) and b) for passing on fields that users never explicitly dropped or kept into the final output correctly.

I suggested one more approach which we agreed to try before going forward with the Eval-injection approach. The alternative would turn invalid mapped fields of date/nanos type into multi typed fields per default, before any resolution step (multi typed fields in field attributes trigger automatic conversion on field extraction). I threw together a draft as a commit on top of this PR, which seems to still run most tests fine (all existing union type tests + most tests from this PR).

This approach still needs to special case for the situation where users specifically don't want to convert into nanos, but dates, i.e. when the query has datetime_nanos_field::datetime in some expression and the classical union type mechanism should kick in.

@fang-xing-esql
Copy link
Member Author

fang-xing-esql commented May 7, 2025

Discussed offline with @fang-xing-esql and @craigtaverner.

I think the proposed solution should work, it's just a little complex given that it needs to replicate a) some logic present in InsertFieldExtraction (inserting EVALs for implicitly cast fields just before they're used) and b) for passing on fields that users never explicitly dropped or kept into the final output correctly.

I suggested one more approach which we agreed to try before going forward with the Eval-injection approach. The alternative would turn invalid mapped fields of date/nanos type into multi typed fields per default, before any resolution step (multi typed fields in field attributes trigger automatic conversion on field extraction). I threw together a draft as a commit on top of this PR, which seems to still run most tests fine (all existing union type tests + most tests from this PR).

This approach still needs to special case for the situation where users specifically don't want to convert into nanos, but dates, i.e. when the query has datetime_nanos_field::datetime in some expression and the classical union type mechanism should kick in.

The other approach that converts InvalidMappedField to MultiTypeEsField for date and date_nanos in place in EsRelation is here, either of these two approaches look reasonable to me. There are some issues with release tests, not relevant to these two PRs, the delivery team will fix it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
:Analytics/ES|QL AKA ESQL >enhancement Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) test-release Trigger CI checks against release build v9.1.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[ESQL] Support autocasting to resolve types between nanosecond dates and millisecond dates
6 participants