Commit b9e6f42

cigrainger and claude authored
feat: add cond/if/in to Query macro, document function pass-through (#32)
Query macro enrichment:
- `cond do ... end` compiles to SQL CASE WHEN ... THEN ... ELSE ... END
- `if(cond, do: x, else: y)` compiles to simple CASE WHEN
- `x in [1, 2, 3]` and `x in ^list` compile to SQL IN (...)

Documentation:
- Expanded Query moduledoc with all operators, conditionals, IN, and comprehensive DuckDB function examples by category (date, string, regex, list, struct, math, null, type)
- Updated getting-started guide with cond, if, in, DuckDB functions, and compute/1 in materialization section
- Updated README with cond/in examples and function pass-through note

52 query tests including happy path, adversarial (SQL injection strings, null values, many branches), combined cond+in, and DuckDB function pass-through verification.

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
1 parent 6926939 commit b9e6f42

5 files changed

Lines changed: 656 additions & 19 deletions

README.md

Lines changed: 15 additions & 1 deletion
@@ -65,7 +65,21 @@ min_amount = 500
 Dux.filter(df, amount > ^min_amount and status == "active")
 ```
 
-The `_with` variants accept raw DuckDB SQL for anything the macro doesn't cover:
+All DuckDB functions work inside expressions — `year()`, `lower()`, `coalesce()`, `regexp_matches()`, and [hundreds more](https://duckdb.org/docs/sql/functions/overview). `cond` maps to `CASE WHEN`, `in` maps to `IN`:
+
+```elixir
+Dux.mutate(df,
+  tier: cond do
+    amount > 1000 -> "gold"
+    amount > 100 -> "silver"
+    true -> "bronze"
+  end
+)
+
+Dux.filter(df, status in ["active", "pending"])
+```
+
+The `_with` variants accept raw DuckDB SQL for window functions and other constructs the macro doesn't cover:
 
 ```elixir
 Dux.mutate_with(df, rank: "ROW_NUMBER() OVER (PARTITION BY dept ORDER BY salary DESC)")

guides/getting-started.livemd

Lines changed: 112 additions & 3 deletions
@@ -35,9 +35,18 @@ penguins
 
 ## Materialization
 
-Pipelines are lazy — nothing executes until you materialise. `compute/1`
-executes and returns a `%Dux{}` with the data. `to_rows/1` and `to_columns/1`
-extract raw Elixir data structures:
+Pipelines are lazy — nothing executes until you materialise:
+
+- `compute/1` — execute and return a `%Dux{}` with the data
+- `to_rows/1` — execute and return a list of maps
+- `to_columns/1` — execute and return a map of lists
+
+```elixir
+penguins
+|> Dux.filter(species == "Adelie")
+|> Dux.head(3)
+|> Dux.compute()
+```
 
 ```elixir
 penguins
@@ -113,6 +122,106 @@ penguins
 |> Dux.compute()
 ```
 
+## Conditional Expressions
+
+Elixir's `cond` compiles to SQL `CASE WHEN` — classify rows directly in your pipeline:
+
+```elixir
+penguins
+|> Dux.drop_nil([:body_mass_g])
+|> Dux.mutate(
+  size: cond do
+    body_mass_g > 5000 -> "large"
+    body_mass_g > 3500 -> "medium"
+    true -> "small"
+  end
+)
+|> Dux.group_by(:size)
+|> Dux.summarise(n: count(species))
+|> Dux.compute()
+```
+
+For simple two-branch logic, use `if/else`:
+
+```elixir
+penguins
+|> Dux.drop_nil([:body_mass_g])
+|> Dux.mutate(heavy: if(body_mass_g > 4500, do: "yes", else: "no"))
+|> Dux.group_by(:heavy)
+|> Dux.summarise(n: count(species))
+|> Dux.compute()
+```
+
+## Membership Tests
+
+Use `in` to filter by a set of values:
+
+```elixir
+penguins
+|> Dux.filter(species in ["Adelie", "Chinstrap"])
+|> Dux.group_by(:species)
+|> Dux.summarise(n: count(species))
+|> Dux.compute()
+```
+
+Works with interpolated lists too:
+
+```elixir
+target_islands = ["Biscoe", "Dream"]
+
+penguins
+|> Dux.filter(island in ^target_islands)
+|> Dux.group_by(:island)
+|> Dux.summarise(n: count(species))
+|> Dux.compute()
+```
+
+## Using DuckDB Functions
+
+All DuckDB functions work inside the query macro — they pass through as SQL
+function calls. Here are some useful ones:
+
+```elixir
+# String functions
+penguins
+|> Dux.mutate(
+  species_lower: lower(species),
+  first_char: left(species, 1)
+)
+|> Dux.select([:species, :species_lower, :first_char])
+|> Dux.distinct()
+|> Dux.compute()
+```
+
+```elixir
+# Math and rounding
+penguins
+|> Dux.drop_nil([:bill_length_mm, :bill_depth_mm])
+|> Dux.mutate(
+  bill_ratio: round(bill_length_mm / bill_depth_mm, 2),
+  log_mass: round(ln(body_mass_g), 2)
+)
+|> Dux.select([:species, :bill_ratio, :log_mass])
+|> Dux.head(5)
+|> Dux.compute()
+```
+
+```elixir
+# Null handling with coalesce
+penguins
+|> Dux.mutate(safe_sex: coalesce(sex, "unknown"))
+|> Dux.group_by(:safe_sex)
+|> Dux.summarise(n: count(species))
+|> Dux.compute()
+```
+
+> #### Any DuckDB function works {: .info}
+>
+> `lower()`, `upper()`, `trim()`, `year()`, `month()`, `date_diff()`,
+> `regexp_matches()`, `list_extract()`, `struct_extract()`, `coalesce()`,
+> `cast()`, and [hundreds more](https://duckdb.org/docs/sql/functions/overview).
+> If DuckDB has it, you can call it in a Dux expression.
+
 ## See the SQL
 
 Every pipeline compiles to SQL CTEs. Use `sql_preview/1` to inspect
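
As a rough illustration of what "pass through as SQL function calls" could mean, the sketch below walks a quoted Elixir expression and renders every call as `name(args)`. `PassThroughSketch` and `to_sql/1` are hypothetical names, not part of Dux. The sketch inlines literals and ignores operators, which is an assumption made for brevity; the real macro binds interpolated values as parameters.

```elixir
defmodule PassThroughSketch do
  # Hypothetical helper, not part of Dux: render a quoted call as SQL text.

  # Function call such as lower(species): keep the name, recurse into the args.
  def to_sql({name, _meta, args}) when is_atom(name) and is_list(args) do
    "#{name}(#{Enum.map_join(args, ", ", &to_sql/1)})"
  end

  # Bare identifier such as species: treat it as a column name.
  def to_sql({name, _meta, context}) when is_atom(name) and is_atom(context) do
    Atom.to_string(name)
  end

  # Literals are inlined here only to keep the sketch short.
  def to_sql(value) when is_number(value), do: to_string(value)

  def to_sql(value) when is_binary(value) do
    "'" <> String.replace(value, "'", "''") <> "'"
  end
end

ast = quote do: round(ln(body_mass_g), 2)
IO.puts(PassThroughSketch.to_sql(ast))
# prints: round(ln(body_mass_g), 2)
```

Because the function name is copied as-is, nothing in this approach needs a list of known DuckDB functions, which fits the guide's claim that any function DuckDB has can be called.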

lib/dux/query.ex

Lines changed: 138 additions & 15 deletions
@@ -8,42 +8,122 @@ defmodule Dux.Query do
   Use `^` to interpolate Elixir variables — these become parameter bindings
   in the generated SQL, preventing SQL injection by construction.
 
-  ## Supported in queries
-
   Queries are used in `Dux.filter/2`, `Dux.mutate/2`, `Dux.summarise/2`,
   and `Dux.sort_by/2`.
 
   ## Operators
 
-  Comparison: `==`, `!=`, `>`, `>=`, `<`, `<=`
-  Arithmetic: `+`, `-`, `*`, `/`
-  Logical: `and`, `or`, `not`
-  String: `<>` (concatenation)
+  | Elixir | SQL |
+  |--------|-----|
+  | `==`, `!=`, `>`, `>=`, `<`, `<=` | `=`, `!=`, `>`, `>=`, `<`, `<=` |
+  | `+`, `-`, `*`, `/` | `+`, `-`, `*`, `/` |
+  | `and`, `or`, `not` | `AND`, `OR`, `NOT` |
+  | `<>` | `\|\|` (string concatenation) |
+  | `in [...]` | `IN (...)` |
 
-  ## Aggregation functions
+  ## Conditional expressions
+
+  Elixir's `cond` maps to SQL `CASE WHEN`:
+
+      Dux.mutate(df,
+        tier: cond do
+          amount > 1000 -> "gold"
+          amount > 100 -> "silver"
+          true -> "bronze"
+        end
+      )
+
+  Elixir's `if/else` for simple two-branch conditionals:
 
-  `sum(col)`, `mean(col)`, `min(col)`, `max(col)`, `count(col)`,
-  `count_distinct(col)`, `avg(col)`, `std(col)`, `variance(col)`
+      Dux.mutate(df, label: if(amount > 0, do: "positive", else: "negative"))
 
-  ## Other functions
+  ## Membership test
 
-  `col("name")` for columns with unusual names.
-  `cast(expr, type)` for type casting.
+  Elixir's `in` works with literal and pinned lists:
+
+      Dux.filter(df, status in ["active", "pending"])
+
+      allowed = ["active", "pending"]
+      Dux.filter(df, status in ^allowed)
 
   ## Interpolation
 
-  Use `^` to access variables defined outside the query:
+  Use `^` to interpolate Elixir values as parameter bindings:
 
       min_val = 10
       Dux.filter(df, x > ^min_val)
 
+  ## Aggregation functions
+
+  `sum`, `avg`/`mean`, `min`, `max`, `count`, `count_distinct`, `std`, `variance`
+
+  ## DuckDB functions
+
+  **All DuckDB functions work in queries.** Function calls pass through
+  to DuckDB unchanged. A few examples by category:
+
+  **Date/time:**
+
+      Dux.mutate(df, y: year(d), m: month(d), trunc: date_trunc("month", d))
+      Dux.mutate(df, age_days: date_diff("day", created_at, current_date()))
+
+  **String:**
+
+      Dux.mutate(df, low: lower(name), parts: string_split(path, "/"))
+      Dux.filter(df, starts_with(name, "A"))
+      Dux.mutate(df, clean: trim(replace(name, " ", " ")))
+
+  **Regex:**
+
+      Dux.filter(df, regexp_matches(email, ".*@gmail\\\\.com"))
+      Dux.mutate(df, domain: regexp_extract(email, "@(.+)", 1))
+
+  **List/Array:**
+
+      Dux.mutate(df, first: list_extract(tags, 1), n: len(tags))
+
+  **Struct:**
+
+      Dux.mutate(df, city: struct_extract(address, "city"))
+
+  **Math:**
+
+      Dux.mutate(df, log_amt: ln(amount), pct: round(ratio * 100, 2))
+
+  **Null handling:**
+
+      Dux.mutate(df, safe: coalesce(nullable_col, 0))
+
+  **Type casting:**
+
+      Dux.mutate(df, as_text: cast(id, "VARCHAR"), as_int: cast(amount, "INTEGER"))
+
+  For the full list, see the
+  [DuckDB Functions reference](https://duckdb.org/docs/sql/functions/overview).
+
+  For anything the macro doesn't support (window functions, subqueries),
+  use the `_with` variants (`mutate_with/2`, `filter_with/2`) which accept
+  raw DuckDB SQL strings.
+
+  ## Column references
+
+  `col("name")` for columns with spaces or special characters:
+
+      Dux.filter(df, col("Total Amount") > 100)
+
   ## Examples
 
       # Filter
       Dux.filter(df, age > 18 and status == "active")
 
-      # Mutate
-      Dux.mutate(df, revenue: price * quantity, tax: price * ^tax_rate)
+      # Mutate with conditional
+      Dux.mutate(df,
+        revenue: price * quantity,
+        tier: cond do
+          price > 100 -> "premium"
+          true -> "standard"
+        end
+      )
 
       # Summarise
       Dux.summarise(df, total: sum(amount), n: count(id))
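
To make the `cond` to `CASE WHEN` mapping in the moduledoc concrete, here is a minimal sketch of the SQL text shape, modelled on the string template in this commit's `lib/dux/query/compiler.ex` change. `CaseWhenSketch.render/2` is a hypothetical helper that takes already-rendered SQL fragments; how Dux renders column names and literals is not shown in this diff, so the fragments below are assumptions.

```elixir
defmodule CaseWhenSketch do
  # Hypothetical helper: works on SQL fragments, not on Dux AST nodes.
  def render(pairs, else_sql \\ nil) do
    whens =
      Enum.map_join(pairs, " ", fn {cond_sql, result_sql} ->
        "WHEN #{cond_sql} THEN #{result_sql}"
      end)

    # A missing ELSE branch is simply omitted, as in the compiler's nil case.
    else_clause = if else_sql, do: " ELSE #{else_sql}", else: ""

    "(CASE #{whens}#{else_clause} END)"
  end
end

pairs = [{"amount > 1000", "'gold'"}, {"amount > 100", "'silver'"}]
IO.puts(CaseWhenSketch.render(pairs, "'bronze'"))
# prints: (CASE WHEN amount > 1000 THEN 'gold' WHEN amount > 100 THEN 'silver' ELSE 'bronze' END)
```

The outer parentheses follow the compiler template, which keeps the expression safe to embed inside a larger predicate.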
@@ -137,6 +217,49 @@ defmodule Dux.Query do
     {{:negate, ast}, pins}
   end
 
+  # cond → CASE WHEN ... THEN ... ELSE ... END
+  defp traverse({:cond, _meta, [[do: clauses]]}, pins) do
+    {pairs, else_expr, pins} =
+      Enum.reduce(clauses, {[], nil, pins}, fn {:->, _m, [[condition], result]},
+                                               {pairs, _else, pins} ->
+        {cond_ast, pins} = traverse(condition, pins)
+        {result_ast, pins} = traverse(result, pins)
+
+        case cond_ast do
+          {:lit, true} ->
+            # `true -> expr` becomes the ELSE branch
+            {pairs, result_ast, pins}
+
+          _ ->
+            {pairs ++ [{cond_ast, result_ast}], nil, pins}
+        end
+      end)
+
+    {{:case_when, pairs, else_expr}, pins}
+  end
+
+  # if/else → CASE WHEN cond THEN then_expr ELSE else_expr END
+  defp traverse({:if, _meta, [condition, [do: then_expr, else: else_expr]]}, pins) do
+    {cond_ast, pins} = traverse(condition, pins)
+    {then_ast, pins} = traverse(then_expr, pins)
+    {else_ast, pins} = traverse(else_expr, pins)
+    {{:case_when, [{cond_ast, then_ast}], else_ast}, pins}
+  end
+
+  # if without else → CASE WHEN cond THEN then_expr ELSE NULL END
+  defp traverse({:if, _meta, [condition, [do: then_expr]]}, pins) do
+    {cond_ast, pins} = traverse(condition, pins)
+    {then_ast, pins} = traverse(then_expr, pins)
+    {{:case_when, [{cond_ast, then_ast}], {:lit, nil}}, pins}
+  end
+
+  # in operator → SQL IN
+  defp traverse({:in, _meta, [left, right]}, pins) do
+    {l_ast, pins} = traverse(left, pins)
+    {r_ast, pins} = traverse(right, pins)
+    {{:in, l_ast, r_ast}, pins}
+  end
+
   # String concatenation: <>
   defp traverse({:<>, _meta, [left, right]}, pins) do
     {l_ast, pins} = traverse(left, pins)
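
The `cond` clause above pattern-matches on the shape Elixir itself gives a quoted `cond`. The standalone sketch below uses only `quote` and `Macro.to_string/1` from the standard library to show that shape; it does not call any Dux code.

```elixir
ast =
  quote do
    cond do
      amount > 1000 -> "gold"
      amount > 100 -> "silver"
      true -> "bronze"
    end
  end

# Same destructuring as the traverse/2 clause head in the diff.
{:cond, _meta, [[do: clauses]]} = ast

# Each clause is {:->, meta, [[condition], result]}.
Enum.each(clauses, fn {:->, _m, [[condition], result]} ->
  IO.puts("#{Macro.to_string(condition)} -> #{inspect(result)}")
end)

# prints:
# amount > 1000 -> "gold"
# amount > 100 -> "silver"
# true -> "bronze"
```

The final condition is the plain literal `true`, which is presumably what the `{:lit, true}` check sees after traversal. Reading the fold as written, a clause placed after `true ->` would reset the ELSE branch to `nil`, so the catch-all is expected to come last.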

lib/dux/query/compiler.ex

Lines changed: 51 additions & 0 deletions
@@ -139,6 +139,57 @@ defmodule Dux.Query.Compiler do
     {"#{sql_name}(#{Enum.join(arg_sqls, ", ")})", all_params, idx}
   end
 
+  # --- CASE WHEN ---
+
+  defp compile({:case_when, pairs, else_expr}, pins, idx) do
+    {when_clauses, all_params, idx} =
+      Enum.reduce(pairs, {[], [], idx}, fn {condition, result}, {clauses, params, idx} ->
+        {cond_sql, cond_params, idx} = compile(condition, pins, idx)
+        {result_sql, result_params, idx} = compile(result, pins, idx)
+        clause = "WHEN #{cond_sql} THEN #{result_sql}"
+        {clauses ++ [clause], params ++ cond_params ++ result_params, idx}
+      end)
+
+    {else_clause, else_params, idx} =
+      case else_expr do
+        nil ->
+          {"", [], idx}
+
+        expr ->
+          {sql, params, idx} = compile(expr, pins, idx)
+          {" ELSE #{sql}", params, idx}
+      end
+
+    sql = "(CASE #{Enum.join(when_clauses, " ")}#{else_clause} END)"
+    {sql, all_params ++ else_params, idx}
+  end
+
+  # --- IN operator ---
+
+  defp compile({:in, left, {:pin, pin_idx}}, pins, idx) do
+    {l_sql, l_params, idx} = compile(left, pins, idx)
+    values = Enum.at(pins, pin_idx)
+    # Pinned list — expand to individual parameter bindings
+    {placeholders, idx} =
+      Enum.reduce(values, {[], idx}, fn _v, {phs, idx} ->
+        {phs ++ ["$#{idx + 1}"], idx + 1}
+      end)
+
+    {"(#{l_sql} IN (#{Enum.join(placeholders, ", ")}))", l_params ++ values, idx}
+  end
+
+  defp compile({:in, left, right}, pins, idx) when is_list(right) do
+    {l_sql, l_params, idx} = compile(left, pins, idx)
+
+    {val_sqls, val_params, idx} =
+      Enum.reduce(right, {[], [], idx}, fn item, {sqls, params, idx} ->
+        {sql, new_params, idx} = compile(item, pins, idx)
+        {sqls ++ [sql], params ++ new_params, idx}
+      end)
+
+    {"(#{l_sql} IN (#{Enum.join(val_sqls, ", ")}))", l_params ++ val_params, idx}
+  end
+
   # --- Sort direction markers ---
 
   defp compile({:asc, expr}, pins, idx) do
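
The pinned-list `IN` clause in this hunk expands one list into one numbered placeholder per element. A minimal standalone sketch of that expansion follows; `InSketch.expand/3` is a hypothetical name, and the column SQL is passed in as a string rather than compiled from an AST.

```elixir
defmodule InSketch do
  # Hypothetical helper mirroring the pinned-list clause:
  # one $n placeholder per value, continuing from the current index.
  def expand(column_sql, values, idx) do
    {placeholders, idx} =
      Enum.map_reduce(values, idx, fn _value, i -> {"$#{i + 1}", i + 1} end)

    {"(#{column_sql} IN (#{Enum.join(placeholders, ", ")}))", values, idx}
  end
end

IO.inspect(InSketch.expand("status", ["active", "pending"], 0))
# prints: {"(status IN ($1, $2))", ["active", "pending"], 2}
```

The values travel separately from the SQL text, which is how the moduledoc's "parameter bindings" claim is upheld for lists. With an empty list this shape yields `(status IN ())`; whether DuckDB accepts that is not established by the diff, so callers may want to guard against empty lists.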
