Transport-Trino: Manage StdUDF state using instance factory #118
Currently, UDF state in Trino's `StdUdfWrapper` is initialized in the `specialize()` method and updated in `eval()` under certain conditions. Initializing state in `specialize()` is not reliable because the result of `specialize()` can be cached across multiple UDF invocations, so one invocation can use state initialized by another, leading to issues like query contamination. This patch moves away from manipulating state through the `specialize()` method in Trino UDFs and instead uses a `State` class to keep track of state (in an object conventionally called an instance factory). A key property of the `State` class is that its constructor is parameterless. To enable the `State` class to be parameterless while still holding a reference to the enclosing `StdUDF` (see the patch for why the reference is needed), we resort to code generation to create a custom `State` class for each `StdUDF`, along with the expected `StdUDF` reference. All state manipulation now moves to the `eval()` function.
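
To illustrate the caching problem described above, here is a minimal, self-contained sketch in plain Java. It does not use the Trino or Transport APIs: `ToyUdf`, `ToyState`, `SharedSpecialization`, and `FactorySpecialization` are hypothetical names invented for this example, and a `Supplier` closure stands in for the generated parameterless `State` class, so this is not how the patch itself is implemented. It only contrasts state that lives inside a cached specialization with state created per invocation and manipulated in `eval()`.

```java
import java.util.function.Supplier;

public class InstanceFactorySketch {
    // Hypothetical stand-in for a StdUDF; not the real Transport class.
    static class ToyUdf {
        final String name;
        ToyUdf(String name) { this.name = name; }
    }

    // Hypothetical per-invocation state that keeps a reference to its UDF.
    static class ToyState {
        final ToyUdf udf;
        boolean initialized = false;
        int rowsSeen = 0;
        ToyState(ToyUdf udf) { this.udf = udf; }
    }

    // Problematic style: state is created when the specialization is built,
    // so every invocation that reuses the cached object shares it.
    static class SharedSpecialization {
        private final ToyState state;
        SharedSpecialization(ToyUdf udf) { this.state = new ToyState(udf); }
        int eval() {
            state.rowsSeen++;
            return state.rowsSeen;
        }
    }

    // Instance-factory style: the cached object holds only a factory,
    // and each invocation asks it for a fresh state.
    static class FactorySpecialization {
        private final Supplier<ToyState> stateFactory;
        FactorySpecialization(ToyUdf udf) {
            // Assumption: the closure plays the role of the generated
            // parameterless State class that already knows its UDF.
            this.stateFactory = () -> new ToyState(udf);
        }
        ToyState newState() { return stateFactory.get(); }
        int eval(ToyState state) {
            // All state manipulation, including lazy initialization, is in eval().
            if (!state.initialized) {
                state.initialized = true;
            }
            state.rowsSeen++;
            return state.rowsSeen;
        }
    }

    public static void main(String[] args) {
        ToyUdf udf = new ToyUdf("toy");

        // Two logical invocations reuse one cached specialization.
        SharedSpecialization shared = new SharedSpecialization(udf);
        System.out.println("shared, invocation A: " + shared.eval());
        System.out.println("shared, invocation B: " + shared.eval());

        // With a factory, each invocation starts from its own state.
        FactorySpecialization cached = new FactorySpecialization(udf);
        System.out.println("factory, invocation A: " + cached.eval(cached.newState()));
        System.out.println("factory, invocation B: " + cached.eval(cached.newState()));
    }
}
```

In the shared variant the second invocation observes the counter left behind by the first, which is the contamination the description refers to. In the factory variant the cached object is safe to reuse because it carries no mutable state of its own.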