GetConnectionId and expose the BindData struct in the scalar UDF executor #457
Thanks a lot for the PR! It looks like it would be simpler to pass context.Context directly to the UDF instead of BindData. However, in that case, we would need to implement logic to store the Go context at the duckdb.Conn (duckdb.Connector) level. Also, we need to implement the same logic to retrieve the connection ID from the table UDF bind info.
I have prepared a proposal based on your branch: taniabogatsch#3. User data can be passed through the context, which is the usual Go way. The big limitation is that when a user runs concurrent queries on one connection, the context in the UDF will be the one from the last query. Could you take a look?
EDIT: I was writing this before you submitted your PR to my fork - will check that out now (thanks for the input!) :)
What I like about having something like

Here's a bit more information on how I envision an (optional) bind callback in the future, once we have this PR available: duckdb/duckdb#17666. For example, internally, during bind, we can have the input parameters available as

typedef unique_ptr<FunctionData> (*bind_scalar_function_t)(ClientContext &context, ScalarFunction &bound_function,
                                                          vector<unique_ptr<Expression>> &arguments);

In the future, we can expose these in the C API, and make it possible to, e.g., have access to a constant expression value, or a return type. With that knowledge, the client can then store additional information (necessary for execution) in the
That's a good point, I haven't checked if we already have the necessary C API functions in place for that.
My last comment was posted at the same time as yours :)
I just had a quick look at your PR, it's pretty neat! I'll iterate on it later today, then merge it and update this draft after I've played around with your changes a bit.
Thanks @taniabogatsch!
Thanks both of you for your thorough review feedback! 💪 I've implemented all suggestions, and left a comment w.r.t. the
EDIT: I just saw the comments - will not merge this yet.
all good, you can merge it
feel free to merge when you're ready :)
We use closures when constructing the executor function in order to pass the context through. But yes, you're right: the function is called for each chunk, and there are indeed disproportionately more read operations.

On 27. May 2025, at 18:16, Lorenzo Paoliani wrote:
@lorenzowritescode commented on this pull request.
In context.go:
+type ctxStore struct {
+ mu sync.Mutex
+ store map[uint64]context.Context
+}
won't we need to read from the context store every time we call RowExecutor? That's a lot of reads compared to writes, no? Or maybe I am missing something and we don't need to access the map every time
Expose GetConnectionId: Adds a new struct BindData, which contains data DuckDB, and in the future also clients, can set during scalar function binding. Currently, this only contains the connection ID. Together with GetConnectionId, this enables users to map their scalar functions to the connection that executes the scalar function.

I'm opening this as a draft PR, as there are still a few unresolved points. Also, to make arbitrary binding data available during execution, I need to PR more C API functions to DuckDB main.
Questions:
Conn or even BindData?
type ScalarFuncExecutor struct with another field allowing a new type of executor. This should be in line with Go's compatibility rules (https://go.dev/blog/module-compatibility) for structs. But please let me know if I am missing something here.

cc @VGSML @lorenzowritescode