Skip to content

Conversation

@benbellick
Copy link
Member

@benbellick benbellick commented Oct 25, 2025

  • feat: introduces a notion of YAML-defined table functions which are represented as a new TableFunctionRel (take in no relations)

Closes #823

// string that case-insensitively matches one of the allowed options.
repeated FunctionArgument arguments = 3;

// The derived fields indicates whether or not the YAML file produced the schema:
Copy link
Member Author

@benbellick benbellick Oct 25, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Every other function type has a FunctionOption field around here. Do we need that for table functions as well?

message TableFunctionRel {
    ...
    repeated FunctionOption options = 4;
    ...
}

$ref: "#/$defs/sessionDependent"
deterministic:
$ref: "#/$defs/deterministic"
schema:
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be better to come up with a more explicit way to mark the inability of deriving an output schema for a particular table function? Or is it sufficient to just let the absence of the schema field indicate this?

//
// Future extensions may add an optional input field to support transformation
// table functions that operate on input relations.
message TableFunctionRel {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is a more appropriate name for this GeneratorTableFunctionRel as introduced by
@jacques-n here?

In my mind there are two possible paths we can go down which determine the appropriate name.

  1. We could explore expanding this proto (by e.g. adding a relations input field) so that it appropriately models both kinds of table functions. In this case, keeping the name as TableFunctionRel makes sense.
  2. We could introduce a brand new relation to represent table functions which take relations as input. In this case, it would make sense to rename this relation to GeneratorTableRel or some name which distinguishes it from the other kind of table function.

}
}

message TableFunction {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added this to be consistent with the other function definitions, though AFAICT, this isn't really used anywhere. I found #193 from a few years ago which suggests that this should continue to be updated, but I'm not sure if that is still the prevailing philosophy.


NamedStruct schema = 9;

repeated Implementation implementations = 10;
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Other functions have variadic final argument logic like:

message TableFunction {
    ...
    oneof final_variable_behavior {
      FinalArgVariadic variadic = 10;
      FinalArgNormal normal = 11;
    }
    ...
}

However, I skipped it for now for simplicity. Is it necessary on a first pass?


- **Transformation table functions**: Functions that take an input relation and transform it (by adding an optional `input` field to `TableFunctionRel`)
- **Set-returning functions**: Functions that process input records and produce multiple output records per input
- **Lateral joins**: Applying table functions to each row of an input relation
Copy link
Member Author

@benbellick benbellick Oct 25, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ideally, we would be able to model something like:

SELECT unnest(col) FROM table;

However, my understanding is that this is really syntactic sugar for:

SELECT u.value 
FROM table t 
CROSS JOIN LATERAL unnest(t.col) AS u(value);

If that is correct, then we really need LateralJoinRel to properly model this usage of unnest.

It is probably worth then clarifying somewhere in the documentation introduced in this PR that the arguments to a table function can't be field references.

# If the return value for the function implementation is multiple lines long,
# print each line separately. This is the case for some functions in
# functions_arithmetic_decimal.yaml
if "\n" in impl["return"]:
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TODO: this could probably use a little work to make it a bit nicer. I'll work on that later once we have some general agreement on the spec changes.

The change here is for now just to get CI/CD to pass.

@benbellick benbellick force-pushed the benbellick/simple-table-fns branch from 62ca59b to 2b04636 Compare October 25, 2025 20:50
introduces a notion of YAML-defined functions which are represented as
a new `TableFunctionRel`.

Closes #823
This is for consistency with other functions
@benbellick benbellick force-pushed the benbellick/simple-table-fns branch from 2b04636 to 26fec6b Compare October 27, 2025 20:54
- **Determinism**: Whether the function produces the same output for the same inputs
- **Session Dependency**: Whether the function depends on session state

## Schema Determination
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TODO: I think this section could use a little bit of work:

  • add some examples with YAML files to demonstrate the difference between derived / explicit
  • Make the fact that there are these two kinds a bit clearer.

@benbellick benbellick force-pushed the benbellick/simple-table-fns branch from f707684 to b090c4e Compare October 28, 2025 19:54
@benbellick benbellick changed the title feat: introduction of simple table functions (WIP) feat: introduction of simple table functions Oct 28, 2025
@benbellick benbellick marked this pull request as ready for review October 28, 2025 19:55
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Where can Table Function be used?

2 participants