feat: update VirtualTable to use an expression for defining table records #786
|
While I like this simplification, there is one thing I'm concerned about: the lack of widespread support for complex types (here, lists and structs). A simplistic Substrait consumer could still be considered compliant without that support. But if we require it, systems like DuckDB will need that support before they can even define a virtual table, which is a way of providing data without a physical table. I suppose we could implement this as an alternative, with the expectation that both methods could still be options a few years from now. |
|
Thanks, @EpsilonPrime ! That makes sense--rather than deprecating the field, I've updated it to use a |
|
@EpsilonPrime, could you take a look at the latest iteration? I had to wrap the |
|
I'm sure this will be discussed at this week's community meeting. In the meantime two possibilities could be taken here:
|
|
Ah, but the goal here is to return all of the data for the table in one expression and not to allow dynamic parameters throughout. So my solutions don't work. It's either this new way or the old way. It's probably still fine without the oneof. Providing a whole table as a dynamic parameter is possible. The closest analog we have with the existing behavior would be to be able to return an entire column (except we don't have anything that specifies it is a row definition). I suppose we could continue to use the expressions table with an enum that specifies whether the expressions are values (current behavior), columns (requiring that each expression makes sense as a column), or a table (requiring only one expression). I need to fiddle with that idea to make it work though (I haven't gone through all the cases). |
Actually, the main idea was that a single expression can accommodate both patterns (or anything in between). That is, you can have (1) a single dynamic expression of type list, (2) a list expression of dynamic expressions of type struct, or (3) a list expression of struct expressions where each field is a dynamic expression corresponding to a column in the virtual table. The original concern about this approach was that there is a lack of widespread support for complex types (here, lists and structs). However, I assume the main concern is lists, since structs are already required for compatibility with the latest version of |
I won't be able to attend tomorrow, but please pass along any questions that come up. |
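To make the three shapes concrete, here is a rough Python sketch. The classes (`Literal`, `DynamicParameter`, `StructExpr`, `ListExpr`) and the `evaluate` helper are hypothetical stand-ins invented for illustration; they are not Substrait messages or any library's API. The only point is that all three forms evaluate to the same list of records.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Union


# Hypothetical expression nodes (illustrative only, not Substrait messages).
@dataclass
class Literal:
    value: Any


@dataclass
class DynamicParameter:
    name: str  # looked up in the bindings at evaluation time


@dataclass
class StructExpr:
    fields: List["Expr"]


@dataclass
class ListExpr:
    items: List["Expr"]


Expr = Union[Literal, DynamicParameter, StructExpr, ListExpr]


def evaluate(expr: Expr, bindings: Dict[str, Any]) -> Any:
    """Evaluate an expression tree; structs become tuples, lists become lists."""
    if isinstance(expr, Literal):
        return expr.value
    if isinstance(expr, DynamicParameter):
        return bindings[expr.name]
    if isinstance(expr, StructExpr):
        return tuple(evaluate(f, bindings) for f in expr.fields)
    if isinstance(expr, ListExpr):
        return [evaluate(i, bindings) for i in expr.items]
    raise TypeError(f"unsupported expression: {expr!r}")


# (1) One dynamic parameter supplies the whole list of records.
whole_table = DynamicParameter("table")

# (2) A list expression whose items are dynamic parameters of struct type.
per_row = ListExpr([DynamicParameter("row0"), DynamicParameter("row1")])

# (3) A list of struct expressions whose fields are dynamic parameters.
per_field = ListExpr([
    StructExpr([DynamicParameter("a0"), DynamicParameter("b0")]),
    StructExpr([DynamicParameter("a1"), DynamicParameter("b1")]),
])

r1 = evaluate(whole_table, {"table": [(1, "x"), (2, "y")]})
r2 = evaluate(per_row, {"row0": (1, "x"), "row1": (2, "y")})
r3 = evaluate(per_field, {"a0": 1, "b0": "x", "a1": 2, "b1": "y"})
```

Mixed forms (for example, literal fields next to dynamic ones) fall out of the same recursion, which is the "anything in between" case.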
|
What do you think about just adding an Unnest relation? Then we can use a single literal expression in a virtual table and push towards unnest for the actual unrolling (rather than trying to overload virtual table to do this). |
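For anyone unfamiliar with the unrolling being suggested, a minimal Python sketch of what an unnest step does with a single list-of-structs value. The `unnest` function and the column names are assumptions made up for this example; no such relation is defined in this thread.

```python
from typing import Any, Dict, Iterator, Sequence, Tuple


def unnest(
    records: Sequence[Tuple[Any, ...]],
    column_names: Sequence[str],
) -> Iterator[Dict[str, Any]]:
    """Turn one list-of-structs value into one output row per struct."""
    for record in records:
        if len(record) != len(column_names):
            raise ValueError("record arity does not match the column count")
        yield dict(zip(column_names, record))


# A single literal value holding every record of the table.
table_literal = [(1, "x"), (2, "y")]

rows = list(unnest(table_literal, ["id", "label"]))
```

Under this split, the virtual table only has to carry one value, and the row-by-row expansion lives in a separate step.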
feat: This PR modifies the VirtualTable message to introduce a new record_list_expression field. The record_list_expression field is an Expression that evaluates into a list of structs (LIST<STRUCT<T1, ..., Tn>>). This improvement leverages Substrait's support for nested types and provides greater flexibility than the existing representation, i.e., it allows using a single dynamic parameter expression (#780) to represent the records in a virtual table.
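As an illustration of the LIST<STRUCT<T1, ..., Tn>> shape, a consumer might check an evaluated value along these lines. This is only a sketch: `is_record_list` is a hypothetical helper, and it assumes structs are modeled as Python tuples and field types as Python classes, neither of which comes from the PR.

```python
from typing import Any, Sequence, Type


def is_record_list(value: Any, field_types: Sequence[Type]) -> bool:
    """Return True if value looks like LIST<STRUCT<T1, ..., Tn>>.

    Assumption for this sketch: a struct is a tuple, and each field
    type Ti is represented by a Python class.
    """
    if not isinstance(value, list):
        return False
    for record in value:
        if not isinstance(record, tuple) or len(record) != len(field_types):
            return False
        if not all(isinstance(v, t) for v, t in zip(record, field_types)):
            return False
    return True


ok = is_record_list([(1, "x"), (2, "y")], [int, str])
wrong_arity = is_record_list([(1, "x", 3.0)], [int, str])
wrong_type = is_record_list([("1", "x")], [int, str])
```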