Add a "Struct outputs" section to the UDFs guide showing how to return
multiple related values from a single UDF using pa.struct as the
data_type, illustrated with the image dimensions example.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
> **Note**: Batched UDFs require you to specify `data_type` in the `@udf` decorator, which defines the `pyarrow.DataType` of the returned `pyarrow.Array`.
### Struct outputs
A UDF can return multiple related values as a single `struct` column by setting `data_type` to a `pa.struct(...)` and returning a tuple (matched by field order) or a `dict` keyed by field name.
Downstream UDFs can then read individual fields via dot notation in `input_columns` (see below).
### Struct fields and list inputs
You can pass nested `struct` fields directly into a UDF by specifying `input_columns` with dot notation. For list-typed inputs, Geneva can pass a NumPy array when the argument is annotated as `np.ndarray` (use `np.ndarray | None` for nullable lists).