Commit e654f0c

claudespicelukekim
authored and committed
docs: Add table function (kind: table) documentation
Document user-defined table functions alongside the existing scalar function docs. Table functions return multiple rows/columns, use a SELECT query as body (with args exposed via a virtual args table), and are always SQL-only (never surfaced as LLM tools).
1 parent 6d5a287 commit e654f0c

2 files changed

Lines changed: 79 additions & 10 deletions

website/docs/features/functions/index.md

Lines changed: 58 additions & 4 deletions
@@ -1,7 +1,7 @@
 ---
 title: 'Functions'
 sidebar_label: 'Functions'
-description: 'Define custom scalar SQL functions inline (SQL tier) or by calling remote HTTP services (Remote tier), automatically exposed as SQL functions and LLM tools.'
+description: 'Define custom scalar and table SQL functions inline (SQL tier) or by calling remote HTTP services (Remote tier), automatically exposed as SQL functions and LLM tools.'
 sidebar_position: 11
 pagination_prev: null
 pagination_next: null
@@ -12,7 +12,7 @@ tags:
 - tools
 ---

-Functions extend Spice's SQL engine with custom scalar logic declared in your Spicepod. Each function can be:
+Functions extend Spice's SQL engine with custom logic declared in your Spicepod. Functions can be **scalar** (one value per row) or **table** (returning multiple rows and columns). Each function can be:

 - **Called directly in SQL** like any built-in function (`SELECT my_fn(col) FROM ...`).
 - **Surfaced to LLMs as tools** for tool-calling workflows.
@@ -169,6 +169,60 @@ The runtime sends a single HTTP `POST` per batch with `Content-Type: application

 Calls to remote functions require the runtime to be configured with [`runtime.auth.api-key`](../../reference/spicepod/runtime#runtimeauth) — they execute under the read-write API key context.

+## Table Functions
+
+Table functions (`kind: table`) return multiple rows and columns instead of a single scalar value. They are called using standard SQL table-function syntax: `SELECT ... FROM my_table_fn(args)`.
+
+### Declaring a table function
+
+Set `kind: table` and provide `returns` as a list of output columns (instead of a single type string):
+
+```yaml
+functions:
+  - name: emit_pair
+    from: sql
+    kind: table
+    description: Emit an input row and its successor.
+    volatility: immutable
+    signature:
+      args: [{ name: x, type: int64 }]
+      returns:
+        - { name: value, type: int64 }
+        - { name: doubled, type: int64 }
+    body: |
+      SELECT x AS value, x * 2 AS doubled FROM args
+      UNION ALL
+      SELECT x + 1 AS value, (x + 1) * 2 AS doubled FROM args
+```
+
+```sql
+SELECT value, doubled FROM emit_pair(4) ORDER BY value;
+-- value | doubled
+-- ------+--------
+--     4 |       8
+--     5 |      10
+```
+
+### Key differences from scalar functions
+
+| Aspect | Scalar | Table |
+| --- | --- | --- |
+| `kind:` | `scalar` (default) | `table` |
+| `signature.returns:` | Single Arrow type string (e.g., `int64`) | List of `{name, type}` output columns |
+| `body:` (SQL tier) | Single SQL expression | Full SQL `SELECT` query |
+| LLM tool exposure | `as_tool: true` (default) | Always SQL-only |
+
+### SQL body for table functions
+
+The `body` of a SQL table function is a complete `SELECT` query (not an expression). Scalar arguments are exposed through a virtual one-row table named `args`:
+
+```yaml
+body: |
+  SELECT x AS value, x * 2 AS doubled FROM args
+```
+
+The query's output columns must match the declared `returns` schema.
+
 ## Volatility

 Volatility tells the optimizer how the function behaves across calls. Pick the strongest level that's actually true — the default (`volatile`) is the safest but disables constant folding, query-level caching, and pushdown.
@@ -235,7 +289,7 @@ The `list_udfs()` UDTF returns every function registered in the runtime, includi
 | ------------- | --------------------------------------------------------------------- |
 | `name` | Function identifier. |
 | `source` | `user` for declared functions, `builtin` for Spice/DataFusion ones. |
-| `kind` | `scalar` for user functions, `NULL` for built-ins. |
+| `kind` | `scalar` or `table` for user functions, `NULL` for built-ins. |
 | `volatility` | `immutable` / `stable` / `volatile`. |
 | `from` | `sql`, `http://...`, or `https://...`. |
 | `description` | The declared description, if any. |
@@ -250,7 +304,7 @@ Returns a JSON array of user functions only (built-ins are excluded). Each entry

 ## Functions as LLM tools

-Every declared function is automatically callable from LLMs as a tool with the same name and description. This lets a model reason in natural language and then invoke `haversine_km(...)` or `classify_intent(...)` directly.
+Every declared scalar function is automatically callable from LLMs as a tool with the same name and description. This lets a model reason in natural language and then invoke `haversine_km(...)` or `classify_intent(...)` directly. Table functions (`kind: table`) are always SQL-only and are not surfaced as LLM tools.

 :::tip[Many functions? Use the Tool Registry]
 A Spicepod with many functions can quickly cross the threshold where injecting every function definition into every chat turn becomes expensive. The [Tool Registry](../tool-registry/index.md) replaces individual tool definitions with searchable `tool_search` / `tool_invoke` meta-tools, typically saving ~10× the per-turn tool-definition tokens. Set `tools: auto` on the model and the registry kicks in automatically once the function count crosses the threshold.
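
The `args` semantics documented in the first file (scalar arguments exposed as a virtual one-row table, body written as a full `SELECT`) can be tried out locally with a rough emulation. The sketch below uses Python's built-in `sqlite3` as a stand-in engine; it is not how Spice evaluates table functions, and `call_table_fn` is a hypothetical helper introduced only for illustration. It runs the `emit_pair` body from the diff and compares the output column names (not the Arrow types) against the declared `returns` list.

```python
import sqlite3

# Body and `returns` copied from the emit_pair example in the diff.
BODY = """
SELECT x AS value, x * 2 AS doubled FROM args
UNION ALL
SELECT x + 1 AS value, (x + 1) * 2 AS doubled FROM args
"""
RETURNS = [
    {"name": "value", "type": "int64"},
    {"name": "doubled", "type": "int64"},
]


def call_table_fn(body, **kwargs):
    """Hypothetical helper: bind kwargs as a one-row `args` table, then run the body.

    Assumes at least one argument; SQLite stands in for the real engine.
    """
    conn = sqlite3.connect(":memory:")
    try:
        names = list(kwargs)
        cols = ", ".join(f'"{n}"' for n in names)
        marks = ", ".join("?" for _ in names)
        conn.execute(f"CREATE TABLE args ({cols})")
        conn.execute(f"INSERT INTO args VALUES ({marks})", [kwargs[n] for n in names])
        cur = conn.execute(body)
        columns = [d[0] for d in cur.description]
        return columns, cur.fetchall()
    finally:
        conn.close()


columns, rows = call_table_fn(BODY, x=4)

# The query's output columns should line up with the declared schema (names only here).
assert columns == [c["name"] for c in RETURNS]

print(columns)       # ['value', 'doubled']
print(sorted(rows))  # [(4, 8), (5, 10)]
```

The result matches the two rows shown for `emit_pair(4)` in the documented SQL example. Type checking against `int64` is left out because SQLite does not model Arrow types.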

website/docs/reference/spicepod/functions.md

Lines changed: 21 additions & 6 deletions
@@ -4,7 +4,7 @@ sidebar_label: 'Functions'
 description: 'User-defined functions YAML reference'
 ---

-Functions extend Spice's SQL engine with custom scalar logic. Each entry in the top-level `functions:` block is registered as a callable SQL function and (by default) as an LLM tool.
+Functions extend Spice's SQL engine with custom scalar and table logic. Each entry in the top-level `functions:` block is registered as a callable SQL function and (for scalar functions, by default) as an LLM tool.

 For an overview, examples, and execution-tier details, see [Functions](../../features/functions).

@@ -14,7 +14,7 @@ The `functions:` section is only honored when [`runtime.functions.enabled`](./ru

 ## `functions`

-The `functions:` section in your configuration declares one or more scalar functions.
+The `functions:` section in your configuration declares one or more scalar or table functions.

 Example:

@@ -58,7 +58,12 @@ Optional. Free-form description surfaced in `list_udfs()`, `GET /v1/functions`,

 ### `kind`

-Optional. Defaults to `scalar`. Only scalar functions are supported in the current beta.
+Optional. Defaults to `scalar`.
+
+| Value | Description |
+| -------- | -------------------------------------------------------------------------------------------------------- |
+| `scalar` | (default) Returns a single value per row. Called as `SELECT my_fn(x) FROM ...`. |
+| `table` | Returns multiple rows and columns. Called as `SELECT ... FROM my_fn(x)`. Always SQL-only (`as_tool` is ignored). |

 ### `volatility`

@@ -95,11 +100,21 @@ Each entry has:

 #### `signature.returns`

-Required for scalar functions. The Arrow type of the function's output, in the same format as argument types.
+Required. For **scalar functions**, a single Arrow type string (e.g., `int64`, `utf8`). For **table functions**, a list of output column definitions:
+
+```yaml
+# Scalar function
+returns: int64
+
+# Table function
+returns:
+  - { name: value, type: int64 }
+  - { name: label, type: utf8 }
+```

 ### `body`

-Inline SQL expression body. Required when `from: sql` (unless `body_ref` is set instead). Must be a single SQL expression (not a statement) referencing the function's arguments by name.
+Inline SQL body. Required when `from: sql` (unless `body_ref` is set instead). For **scalar functions**, must be a single SQL expression referencing the function's arguments by name. For **table functions**, must be a single `SELECT` query; scalar arguments are available via a virtual `args` table.

 ```yaml
 body: |
@@ -153,7 +168,7 @@ metadata:

 ### `as_tool`

-Optional. Defaults to `true`. When `true`, the function is registered as an LLM tool with the same name and description and becomes callable from chat completions, `POST /v1/tools/<name>`, and the `/v1/tools` listing.
+Optional. Defaults to `true` for scalar functions. When `true`, the function is registered as an LLM tool with the same name and description and becomes callable from chat completions, `POST /v1/tools/<name>`, and the `/v1/tools` listing. Table functions (`kind: table`) are always SQL-only regardless of this setting.

 Set to `false` to keep the function SQL-only:
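
The per-kind rules this reference change introduces (shape of `signature.returns`, default `kind`, and tool exposure) can be summarized as a small checker. This is a minimal sketch under stated assumptions: `check_function` and `is_llm_tool` are hypothetical names, the entry is assumed to be already parsed from YAML into a Python dict, and nothing here reflects how the Spice runtime actually validates a Spicepod.

```python
def check_function(entry):
    """Return a list of problems for one `functions:` entry (hypothetical checker)."""
    problems = []
    kind = entry.get("kind", "scalar")  # `kind` is optional and defaults to scalar
    returns = entry.get("signature", {}).get("returns")
    if kind == "scalar":
        # Scalar functions declare a single Arrow type string, e.g. "int64".
        if not isinstance(returns, str):
            problems.append("scalar: signature.returns must be a single type string")
    elif kind == "table":
        # Table functions declare a list of {name, type} output columns.
        ok = isinstance(returns, list) and all(
            isinstance(col, dict) and {"name", "type"} <= set(col) for col in returns
        )
        if not ok:
            problems.append("table: signature.returns must be a list of {name, type} columns")
    else:
        problems.append(f"unknown kind: {kind!r}")
    return problems


def is_llm_tool(entry):
    """Table functions are always SQL-only; scalar functions honor `as_tool` (default true)."""
    if entry.get("kind", "scalar") == "table":
        return False
    return bool(entry.get("as_tool", True))


emit_pair = {
    "name": "emit_pair",
    "from": "sql",
    "kind": "table",
    "as_tool": True,  # ignored for table functions
    "signature": {
        "args": [{"name": "x", "type": "int64"}],
        "returns": [
            {"name": "value", "type": "int64"},
            {"name": "doubled", "type": "int64"},
        ],
    },
}

print(check_function(emit_pair))  # []
print(is_llm_tool(emit_pair))     # False
```

A scalar entry with `returns: int64` and no `as_tool` key passes the same check and reports `True` from `is_llm_tool`, mirroring the documented defaults.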
