docs: Add table function (kind: table) documentation

claudespice · lukekim · commit e654f0c5cd34 · 2026-05-05T09:03:59.000-07:00
Document user-defined table functions alongside the existing scalar
function docs. Table functions return multiple rows/columns, use a
SELECT query as body (with args exposed via a virtual args table),
and are always SQL-only (never surfaced as LLM tools).
diff --git a/website/docs/features/functions/index.md b/website/docs/features/functions/index.md
@@ -1,7 +1,7 @@
 ---
 title: 'Functions'
 sidebar_label: 'Functions'
-description: 'Define custom scalar SQL functions inline (SQL tier) or by calling remote HTTP services (Remote tier), automatically exposed as SQL functions and LLM tools.'
+description: 'Define custom scalar and table SQL functions inline (SQL tier) or by calling remote HTTP services (Remote tier). All functions are callable from SQL; scalar functions are also exposed as LLM tools.'
 sidebar_position: 11
 pagination_prev: null
 pagination_next: null
@@ -12,7 +12,7 @@ tags:
   - tools
 ---
 
-Functions extend Spice's SQL engine with custom scalar logic declared in your Spicepod. Each function can be:
+Functions extend Spice's SQL engine with custom logic declared in your Spicepod. Functions can be **scalar** (one value per row) or **table** (returning multiple rows and columns). Each function can be:
 
 - **Called directly in SQL** like any built-in function (`SELECT my_fn(col) FROM ...`).
 - **Surfaced to LLMs as tools** for tool-calling workflows.
@@ -169,6 +169,60 @@ The runtime sends a single HTTP `POST` per batch with `Content-Type: application
 
 Calls to remote functions require the runtime to be configured with [`runtime.auth.api-key`](../../reference/spicepod/runtime#runtimeauth) — they execute under the read-write API key context.
 
+## Table Functions
+
+Table functions (`kind: table`) return multiple rows and columns instead of a single scalar value. They are called using standard SQL table-function syntax: `SELECT ... FROM my_table_fn(args)`.
+
+### Declaring a table function
+
+Set `kind: table` and provide `signature.returns` as a list of output columns (instead of a single type string):
+
+```yaml
+functions:
+  - name: emit_pair
+    from: sql
+    kind: table
+    description: Emit the input value and its successor, each paired with its double.
+    volatility: immutable
+    signature:
+      args: [{ name: x, type: int64 }]
+      returns:
+        - { name: value, type: int64 }
+        - { name: doubled, type: int64 }
+    body: |
+      SELECT x AS value, x * 2 AS doubled FROM args
+      UNION ALL
+      SELECT x + 1 AS value, (x + 1) * 2 AS doubled FROM args
+```
+
+```sql
+SELECT value, doubled FROM emit_pair(4) ORDER BY value;
+-- value | doubled
+-- ------+--------
+--     4 |       8
+--     5 |      10
+```
+
+### Key differences from scalar functions
+
+| Aspect | Scalar | Table |
+| --- | --- | --- |
+| `kind:` | `scalar` (default) | `table` |
+| `signature.returns:` | Single Arrow type string (e.g., `int64`) | List of `{name, type}` output columns |
+| `body:` (SQL tier) | Single SQL expression | Full SQL `SELECT` query |
+| LLM tool exposure | `as_tool: true` (default) | Always SQL-only |
+
+### SQL body for table functions
+
+The `body` of a SQL table function is a complete `SELECT` query (not an expression). The function's arguments are exposed as columns of a virtual one-row table named `args`, so the query reads them with `FROM args`:
+
+```yaml
+body: |
+  SELECT x AS value, x * 2 AS doubled FROM args
+```
+
+The query's output columns must match the declared `returns` schema.
+
 ## Volatility
 
 Volatility tells the optimizer how the function behaves across calls. Pick the strongest level that's actually true — the default (`volatile`) is the safest but disables constant folding, query-level caching, and pushdown.
@@ -235,7 +289,7 @@ The `list_udfs()` UDTF returns every function registered in the runtime, includi
 | ------------- | --------------------------------------------------------------------- |
 | `name`        | Function identifier.                                                  |
 | `source`      | `user` for declared functions, `builtin` for Spice/DataFusion ones.    |
-| `kind`        | `scalar` for user functions, `NULL` for built-ins.                    |
+| `kind`        | `scalar` or `table` for user functions, `NULL` for built-ins.         |
 | `volatility`  | `immutable` / `stable` / `volatile`.                                  |
 | `from`        | `sql`, `http://...`, or `https://...`.                                |
 | `description` | The declared description, if any.                                     |
@@ -250,7 +304,7 @@ Returns a JSON array of user functions only (built-ins are excluded). Each entry
 
 ## Functions as LLM tools
 
-Every declared function is automatically callable from LLMs as a tool with the same name and description. This lets a model reason in natural language and then invoke `haversine_km(...)` or `classify_intent(...)` directly.
+By default, every declared scalar function is callable from LLMs as a tool with the same name and description. This lets a model reason in natural language and then invoke `haversine_km(...)` or `classify_intent(...)` directly. Table functions (`kind: table`) are always SQL-only and are never surfaced as LLM tools.
 
 :::tip[Many functions? Use the Tool Registry]
 A Spicepod with many functions can quickly cross the threshold where injecting every function definition into every chat turn becomes expensive. The [Tool Registry](../tool-registry/index.md) replaces individual tool definitions with searchable `tool_search` / `tool_invoke` meta-tools, typically saving ~10× the per-turn tool-definition tokens. Set `tools: auto` on the model and the registry kicks in automatically once the function count crosses the threshold.
diff --git a/website/docs/reference/spicepod/functions.md b/website/docs/reference/spicepod/functions.md
@@ -4,7 +4,7 @@ sidebar_label: 'Functions'
 description: 'User-defined functions YAML reference'
 ---
 
-Functions extend Spice's SQL engine with custom scalar logic. Each entry in the top-level `functions:` block is registered as a callable SQL function and (by default) as an LLM tool.
+Functions extend Spice's SQL engine with custom scalar and table logic. Each entry in the top-level `functions:` block is registered as a callable SQL function and (for scalar functions, by default) as an LLM tool.
 
 For an overview, examples, and execution-tier details, see [Functions](../../features/functions).
 
@@ -14,7 +14,7 @@ The `functions:` section is only honored when [`runtime.functions.enabled`](./ru
 
 ## `functions`
 
-The `functions:` section in your configuration declares one or more scalar functions.
+The `functions:` section in your configuration declares one or more scalar or table functions.
 
 Example:
 
@@ -58,7 +58,12 @@ Optional. Free-form description surfaced in `list_udfs()`, `GET /v1/functions`,
 
 ### `kind`
 
-Optional. Defaults to `scalar`. Only scalar functions are supported in the current beta.
+Optional. Defaults to `scalar`.
+
+| Value    | Description                                                                                              |
+| -------- | -------------------------------------------------------------------------------------------------------- |
+| `scalar` | (default) Returns a single value per row. Called as `SELECT my_fn(x) FROM ...`.                           |
+| `table`  | Returns multiple rows and columns. Called as `SELECT ... FROM my_fn(x)`. Always SQL-only (`as_tool` is ignored). |
 
 ### `volatility`
 
@@ -95,11 +100,21 @@ Each entry has:
 
 #### `signature.returns`
 
-Required for scalar functions. The Arrow type of the function's output, in the same format as argument types.
+Required. For **scalar functions**, a single Arrow type string (e.g., `int64`, `utf8`). For **table functions**, a list of output column definitions:
+
+```yaml
+# Scalar function
+returns: int64
+
+# Table function
+returns:
+  - { name: value, type: int64 }
+  - { name: label, type: utf8 }
+```
 
 ### `body`
 
-Inline SQL expression body. Required when `from: sql` (unless `body_ref` is set instead). Must be a single SQL expression (not a statement) referencing the function's arguments by name.
+Inline SQL body. Required when `from: sql` (unless `body_ref` is set instead). For **scalar functions**, must be a single SQL expression referencing the function's arguments by name. For **table functions**, must be a single `SELECT` query; scalar arguments are available via a virtual `args` table.
 
 ```yaml
 body: |
@@ -153,7 +168,7 @@ metadata:
 
 ### `as_tool`
 
-Optional. Defaults to `true`. When `true`, the function is registered as an LLM tool with the same name and description and becomes callable from chat completions, `POST /v1/tools/<name>`, and the `/v1/tools` listing.
+Optional. Defaults to `true` for scalar functions. When `true`, the function is registered as an LLM tool with the same name and description and becomes callable from chat completions, `POST /v1/tools/<name>`, and the `/v1/tools` listing. Table functions (`kind: table`) are always SQL-only regardless of this setting.
 
 Set to `false` to keep the function SQL-only: