feat: Binary CASE WHEN expression with support for nested conditions#13

Merged
lukekim merged 4 commits into develop from lukim/binary
Jan 29, 2026
Conversation

@lukekim lukekim commented Jan 26, 2026

Adds support for the SQL CASE WHEN expression to the Vortex expression system, together with its conversion from DataFusion, pushdown validation, and benchmarks. Only supported forms are converted: the "searched CASE" form is handled, while the "simple CASE" form is not.

Support for CASE WHEN expressions:

  • Added a new module case_when to vortex-array's expression system and re-exported its functions, enabling construction and evaluation of CASE WHEN and nested CASE WHEN expressions. (vortex-array/src/expr/exprs/mod.rs)
  • Registered the new CaseWhen expression in the ExprSession so it can be used in expression evaluation. (vortex-array/src/expr/session.rs)
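
For readers unfamiliar with the binary form, here is a minimal sketch of the semantics in plain Rust. It is not the Vortex API: the `case_when` function below is a hypothetical stand-in that works on slices and ignores nulls. It shows how a multi-branch CASE WHEN can be evaluated by nesting binary ones.

```rust
/// Element-wise binary CASE WHEN (hypothetical helper, not a Vortex function):
/// pick `then_vals[i]` where `cond[i]` is true, otherwise `else_vals[i]`.
fn case_when(cond: &[bool], then_vals: &[i64], else_vals: &[i64]) -> Vec<i64> {
    assert_eq!(cond.len(), then_vals.len());
    assert_eq!(cond.len(), else_vals.len());
    cond.iter()
        .zip(then_vals.iter().zip(else_vals))
        .map(|(&c, (&t, &e))| if c { t } else { e })
        .collect()
}

fn main() {
    let x = [5_i64, 50, 500];

    // CASE WHEN x > 100 THEN 2 WHEN x > 10 THEN 1 ELSE 0 END
    let gt_100: Vec<bool> = x.iter().map(|&v| v > 100).collect();
    let gt_10: Vec<bool> = x.iter().map(|&v| v > 10).collect();

    // The second WHEN becomes the ELSE branch of the first one.
    let inner = case_when(&gt_10, &[1, 1, 1], &[0, 0, 0]);
    let outer = case_when(&gt_100, &[2, 2, 2], &inner);

    println!("{outer:?}"); // prints [0, 1, 2]
}
```

The first matching condition wins because the outer condition is checked before the nested result is consulted.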

DataFusion integration and conversion:

  • Implemented conversion from DataFusion's CaseExpr to Vortex's nested case_when expressions, with validation to only support the "searched CASE" form (not "simple CASE"). (vortex-datafusion/src/convert/exprs.rs)
  • Updated the pushdown logic to recognize and validate CASE WHEN expressions, including recursive checks for convertible sub-expressions and else clauses. (vortex-datafusion/src/convert/exprs.rs)
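
The conversion described above can be sketched roughly as follows. This is an illustration under assumptions, not the code in this PR: `Expr`, `Case`, and `to_nested` are hypothetical types and names, and a missing ELSE is modeled as NULL, which is the standard SQL behavior. The sketch refuses the "simple CASE" form and folds the WHEN/THEN pairs of a "searched CASE" from right to left into nested binary expressions.

```rust
/// Hypothetical stand-in for an expression tree; not a Vortex or DataFusion type.
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Column(String),
    Literal(i64),
    Null,
    /// Binary form: condition, THEN value, ELSE value.
    CaseWhen(Box<Expr>, Box<Expr>, Box<Expr>),
}

/// Hypothetical description of a SQL CASE expression.
struct Case {
    /// `Some` for the simple form (`CASE x WHEN 1 THEN ...`), `None` for searched.
    operand: Option<Expr>,
    when_then: Vec<(Expr, Expr)>,
    else_expr: Option<Expr>,
}

fn to_nested(case: Case) -> Result<Expr, String> {
    if case.operand.is_some() {
        return Err("simple CASE is not supported".to_string());
    }
    if case.when_then.is_empty() {
        return Err("CASE needs at least one WHEN clause".to_string());
    }
    // Start from the ELSE value (NULL when absent) and wrap it with each
    // WHEN/THEN pair, last pair first, so the first pair ends up outermost.
    let tail = case.else_expr.unwrap_or(Expr::Null);
    Ok(case
        .when_then
        .into_iter()
        .rev()
        .fold(tail, |acc, (when, then)| {
            Expr::CaseWhen(Box::new(when), Box::new(then), Box::new(acc))
        }))
}

fn main() {
    let searched = Case {
        operand: None,
        when_then: vec![
            (Expr::Column("a".into()), Expr::Literal(1)),
            (Expr::Column("b".into()), Expr::Literal(2)),
        ],
        else_expr: None,
    };
    println!("{:?}", to_nested(searched));
}
```

A pushdown check can follow the same shape: it would return false where this sketch returns `Err`, and recurse into each condition, THEN value, and ELSE value.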

Benchmarks and protocol updates:

  • Added a new benchmark suite for CASE WHEN expressions, covering simple, nested, all-true, and all-false scenarios with varying array sizes. (vortex-array/benches/expr/case_when_bench.rs, vortex-array/Cargo.toml)
  • Extended the protocol buffer definitions to include options for CASE WHEN expressions, specifying the number of when/then pairs and presence of an else clause. (vortex-proto/proto/expr.proto)
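
To illustrate why a pair count and an else flag are enough to describe a serialized CASE WHEN, here is a small sketch. The struct, its field names, and the flat child layout `[when_0, then_0, ..., else?]` are all assumptions made for this example; the actual message in expr.proto may differ.

```rust
/// Hypothetical mirror of the serialized options; field names are assumptions.
#[derive(Debug, Clone, Copy, PartialEq)]
struct CaseWhenOptions {
    num_when_then_pairs: u32,
    has_else: bool,
}

impl CaseWhenOptions {
    /// Child count under the assumed layout [when_0, then_0, ..., else?].
    fn num_children(&self) -> usize {
        2 * self.num_when_then_pairs as usize + usize::from(self.has_else)
    }

    /// Split a flat child list back into (when, then) pairs and an optional else.
    /// Returns `None` if the list length does not match the options.
    fn split_children<'a, T>(
        &self,
        children: &'a [T],
    ) -> Option<(Vec<(&'a T, &'a T)>, Option<&'a T>)> {
        if children.len() != self.num_children() {
            return None;
        }
        let n = 2 * self.num_when_then_pairs as usize;
        let pairs = children[..n]
            .chunks_exact(2)
            .map(|pair| (&pair[0], &pair[1]))
            .collect();
        let else_child = if self.has_else { children.last() } else { None };
        Some((pairs, else_child))
    }
}

fn main() {
    let opts = CaseWhenOptions { num_when_then_pairs: 2, has_else: true };
    let children = ["a > 100", "2", "a > 10", "1", "0"];
    println!("{:?}", opts.split_children(&children));
}
```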

Bench:

Timer precision: 16 ns
expr_case_when                    fastest       │ slowest       │ median        │ mean          │ samples │ iters
├─ case_when_all_false                          │               │               │               │         │
│  ├─ 1000                        3.183 µs      │ 1.572 ms      │ 3.295 µs      │ 19.01 µs      │ 100     │ 100
│  ├─ 10000                       4.047 µs      │ 5.311 µs      │ 4.175 µs      │ 4.192 µs      │ 100     │ 100
│  ╰─ 100000                      12.36 µs      │ 17.1 µs       │ 12.52 µs      │ 12.6 µs       │ 100     │ 100
├─ case_when_all_true                           │               │               │               │         │
│  ├─ 1000                        3.167 µs      │ 4.047 µs      │ 3.311 µs      │ 3.324 µs      │ 100     │ 100
│  ├─ 10000                       4.095 µs      │ 7.407 µs      │ 4.191 µs      │ 4.234 µs      │ 100     │ 100
│  ╰─ 100000                      12.36 µs      │ 14.43 µs      │ 12.49 µs      │ 12.52 µs      │ 100     │ 100
├─ case_when_nested_3_conditions                │               │               │               │         │
│  ├─ 1000                        13.53 µs      │ 159.4 µs      │ 13.75 µs      │ 15.28 µs      │ 100     │ 100
│  ├─ 10000                       18.41 µs      │ 21.19 µs      │ 18.75 µs      │ 18.78 µs      │ 100     │ 100
│  ╰─ 100000                      203.6 µs      │ 424.2 µs      │ 236.5 µs      │ 252.2 µs      │ 100     │ 100
╰─ case_when_simple                             │               │               │               │         │
   ├─ 1000                        4.591 µs      │ 6.991 µs      │ 4.735 µs      │ 4.764 µs      │ 100     │ 100
   ├─ 10000                       6.415 µs      │ 9.471 µs      │ 6.527 µs      │ 6.567 µs      │ 100     │ 100
   ╰─ 100000                      147.5 µs      │ 184.5 µs      │ 153.3 µs      │ 153.5 µs      │ 100     │ 100
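
To compare the scenarios on a per-row basis, the medians above can be divided by the array size. The snippet below is only a convenience calculation over numbers copied from the table; `ns_per_row` is a helper introduced here, not part of the benchmark suite.

```rust
/// Convert a median time in microseconds into nanoseconds per row.
fn ns_per_row(median_us: f64, rows: u64) -> f64 {
    median_us * 1_000.0 / rows as f64
}

fn main() {
    // (benchmark, rows, median in µs) copied from the 100000-row lines above.
    let medians: [(&str, u64, f64); 4] = [
        ("case_when_all_false", 100_000, 12.52),
        ("case_when_all_true", 100_000, 12.49),
        ("case_when_nested_3_conditions", 100_000, 236.5),
        ("case_when_simple", 100_000, 153.3),
    ];
    for (name, rows, median_us) in medians {
        println!("{name}: {:.3} ns/row", ns_per_row(median_us, rows));
    }
}
```

At 100000 rows this works out to roughly 0.125 ns/row for the all-true and all-false cases, about 1.5 ns/row for the simple case, and about 2.4 ns/row for three nested conditions.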

@lukekim lukekim changed the title from "Lukim/binary" to "feat: Binary CASE WHEN expression with support for nested conditions" Jan 27, 2026
@lukekim lukekim self-assigned this Jan 27, 2026
@lukekim lukekim added the enhancement label Jan 27, 2026
Comment thread vortex-array/src/expr/exprs/case_when.rs
Comment thread vortex-array/src/expr/exprs/case_when.rs Outdated
Comment thread vortex-array/src/expr/exprs/case_when.rs Outdated
Comment thread vortex-array/src/expr/exprs/case_when.rs
@joseph-isaacs

Can you profile this at all?

@lukekim lukekim merged commit c981498 into develop Jan 29, 2026
17 of 44 checks passed
@lukekim lukekim deleted the lukim/binary branch January 29, 2026 04:03
lukekim added a commit that referenced this pull request Mar 4, 2026
…13)

* feat: implement binary CASE WHEN expression with support for nested conditions
lukekim pushed a commit that referenced this pull request May 18, 2026
## Summary

Fix for the second part of: vortex-data#7808 

```
(gdb) bt
#0  __pthread_kill_implementation (no_tid=0, signo=6, threadid=<optimized out>)
    at ./nptl/pthread_kill.c:44
#1  __pthread_kill_internal (signo=6, threadid=<optimized out>) at ./nptl/pthread_kill.c:78
#2  __GI___pthread_kill (threadid=<optimized out>, signo=signo@entry=6)
    at ./nptl/pthread_kill.c:89
#3  0x000076a38cc4527e in __GI_raise (sig=sig@entry=6) at ../sysdeps/posix/raise.c:26
#4  0x000076a38cc288ff in __GI_abort () at ./stdlib/abort.c:79
#5  0x000076a38cc297b6 in __libc_message_impl (fmt=fmt@entry=0x76a38cdce8d7 "%s\n")
    at ../sysdeps/posix/libc_fatal.c:134
#6  0x000076a38cca8ff5 in malloc_printerr (
    str=str@entry=0x76a38cdd1bf0 "free(): double free detected in tcache 2")
    at ./malloc/malloc.c:5775
#7  0x000076a38ccab55f in _int_free (av=0x76a38ce03ac0 <main_arena>, p=<optimized out>, 
    have_lock=0) at ./malloc/malloc.c:4541
#8  0x000076a38ccaddce in __GI___libc_free (mem=0x5be5cd9632c0) at ./malloc/malloc.c:3398
#9  0x000076a38eb6807e in alloc::alloc::dealloc (ptr=0x5be5cd9632c0, layout=...)
    at /home/ubuntu/.rustup/toolchains/1.91.0-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/alloc/src/alloc.rs:114
#10 alloc::alloc::{impl#1}::deallocate (self=0x5be5cd95f708, ptr=..., layout=...)
    at /home/ubuntu/.rustup/toolchains/1.91.0-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/alloc/src/alloc.rs:271
#11 0x000076a38ead9a64 in alloc::boxed::{impl#8}::drop<dyn vortex_scan::Partition, alloc::alloc::Global> (self=0x5be5cd95f6f8)
    at /home/ubuntu/.rustup/toolchains/1.91.0-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/alloc/src/boxed.rs:1666
#12 0x000076a38ead349e in core::ptr::drop_in_place<alloc::boxed::Box<dyn vortex_scan::Partition, alloc::alloc::Global>> ()
    at /home/ubuntu/.rustup/toolchains/1.91.0-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/ptr/mod.rs:804
#13 0x000076a38e8764de in core::ptr::drop_in_place<vortex_ffi::scan::VxPartitionScan> ()
    at /home/ubuntu/.rustup/toolchains/1.91.0-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/ptr/mod.rs:804
#14 0x000076a38e876fb8 in core::ptr::drop_in_place<alloc::boxed::Box<vortex_ffi::scan::VxPartitionScan, alloc::alloc::Global>> ()
    at /home/ubuntu/.rustup/toolchains/1.91.0-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/ptr/mod.rs:804
#15 0x000076a38e87f2f5 in core::mem::drop<alloc::boxed::Box<vortex_ffi::scan::VxPartitionScan, alloc::alloc::Global>> (_x=0x5be5cd95f6f0)
    at /home/ubuntu/.rustup/toolchains/1.91.0-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/mem/mod.rs:961
#16 0x000076a38e84efa7 in vortex_ffi::scan::vx_partition_free (ptr=0x5be5cd95f6f0)
    at vortex-ffi/src/macros.rs:295
#17 0x00005be5b0c81126 in operator() (__closure=0x7fff2208a8b0)
    at /home/ubuntu/vortex/vortex-ffi/test/scan.cpp:940
```

## Testing

Verifying existing behavior is maintained.

Signed-off-by: Dergousov Maksim <dergousovmaxim99@gmail.com>
Labels

changelog/feature, enhancement

3 participants