Skip to content

CASE WHEN expression labels evaluated eagerly instead of lazily #65

@dallmair

Description

@dallmair

Background

We have a table similar to:

Type (string) Value (string)
text abc
date 07/27/2023 00:00:00

We're trying to transform the date value to a different format using NQuery with:

SELECT  Type
        , CASE WHEN Type = 'date' THEN FORMAT(TO_DATETIME(Value), 'yyyy-MM-dd') ELSE Value END AS Value
FROM    ourTable

Expected result

Returns table with date value transformed to 2023-07-27.

Actual result

Throws exception:

System.FormatException: The string 'abc' was not recognized as a valid DateTime. There is an unknown word starting at index '0'.
   at System.Convert.ToDateTime(String value, IFormatProvider provider)
   at System.String.System.IConvertible.ToDateTime(IFormatProvider provider)
   at System.Convert.ToDateTime(Object value, IFormatProvider provider)
   at NQuery.Symbols.BuiltInFunctions.ToDateTime(Object value) in .\nquery-vnext\src\NQuery\Symbols\BuiltInFunctions.cs:line 176
   at lambda_method11(Closure )
   at NQuery.Iterators.ComputeScalarIterator.Read() in .\nquery-vnext\src\NQuery\Iterators\ComputeScalarIterator.cs:line 41
   at NQuery.Iterators.ProjectionIterator.Read() in .\nquery-vnext\src\NQuery\Iterators\ProjectionIterator.cs:line 31
   at NQuery.QueryReader.Read() in .\nquery-vnext\src\NQuery\QueryReader.cs:line 43
   at NQuery.Data.QueryReaderExtensions.ExecuteDataTable(QueryReader queryReader) in .\nquery-vnext\src\NQuery.Data\QueryReaderExtensions.cs:line 21
   at NQueryViewer.MainWindow.<>c__DisplayClass15_0.<ExecuteQuery>b__0() in .\nquery-vnext\src\NQueryViewer\MainWindow.xaml.cs:line 143
   at System.Threading.Tasks.Task`1.InnerInvoke()
   at System.Threading.ExecutionContext.RunFromThreadPoolDispatchLoop(Thread threadPoolThread, ExecutionContext executionContext, ContextCallback callback, Object state)
--- End of stack trace from previous location ---
   at System.Threading.Tasks.Task.ExecuteWithThreadLocal(Task& currentTaskSlot, Thread threadPoolThread)
--- End of stack trace from previous location ---
   at NQueryViewer.MainWindow.ExecuteQuery() in .\nquery-vnext\src\NQueryViewer\MainWindow.xaml.cs:line 140

Analysis

The problem is caused by the caching of expression values in variables implemented in ExpressionBuilder.BuildCachedExpression. This code transforms expressions like a.b.c.Prop to

var temp1 = a.b;
var temp2 = temp1.c;
return temp2.Prop;

And it transforms our CASE WHEN Type = 'date' THEN FORMAT(TO_DATETIME(Value), 'yyyy-MM-dd') ELSE Value END to:

var temp1 = Type;
var temp2 = 'date';
var temp3 = Value;
var temp4 = TO_DATETIME(temp3);
var temp5 = 'yyyy-MM-dd';
return temp1 = temp2 ? FORMAT(temp4, temp5) : Value;

In words: It pulls calculations out of the CASE expression, changing the semantics (searched CASE expressions are short-circuited in SQL).

Open questions

As a little experiment, I removed caching by replacing all calls to BuildCachedExpression with BuildLiftedExpression. All unit tests are still green, but I'm unsure whether that is a good idea as I do not know:

  1. Why was the caching of expression values introduced?
  2. Does it have any measurable performance effect?
  3. Is it necessary for correctness in some cases I don't understand?
  4. Can caching be removed altogether? Or would this have negative effects, and it's necessary to disable it selectively to fix the CASE WHEN bug described here?

Highly appreciate any insights.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions