Skip to content

Conversation

@Broshen
Copy link
Contributor

@Broshen Broshen commented Jul 30, 2025

BREAKING CHANGE: this changes the function signature of ResolveType, which is a public method on public structs ScalarFunctionVariant/AggregateFunctionVariant/WindowFunctionVariant

If anybody is using/referencing the function/these structs in their own projects, outside of the substrait-go repo, their function calls would no longer work because of the change in signature.

@codecov
Copy link

codecov bot commented Jul 30, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 65.61%. Comparing base (8ca89e3) to head (2d9ea58).
⚠️ Report is 2 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #147      +/-   ##
==========================================
+ Coverage   65.07%   65.61%   +0.54%     
==========================================
  Files          45       45              
  Lines       11625    11625              
==========================================
+ Hits         7565     7628      +63     
+ Misses       3734     3668      -66     
- Partials      326      329       +3     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Copy link
Member

@vbarua vbarua left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did a cursory pass and noticed a couple of things.

Pinging @zeroshade as well, as he did a bunch of work in these files.

type variant interface {
*extensions.ScalarFunctionVariant | *extensions.AggregateFunctionVariant | *extensions.WindowFunctionVariant
ResolveType([]types.Type) (types.Type, error)
ResolveType([]types.Type, extensions.Set) (types.Type, error)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Technically speaking, this is a breaking change and we should mark it as such.

I think it's reasonable though, because type resolution needs access to the extensions.Set in order to get and/or set user-defined types if they are present.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updated PR description to be

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe just mark resolveType deprecated and add resolveTypeExt or something like that?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe just mark resolveType deprecated and add resolveTypeExt or something like that?

We could, but I'd rather just make the API "correct" and not deal with sharp-edges poking around. There's also another breaking change I'm looking at this week, so we might as well get a 2-in-1 deal.

Copy link
Member

@vbarua vbarua left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some more minor comments, but after that I'm happy to merge this in.

for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
result, err := extensions.EvaluateTypeExpression(tt.nulls, tt.ret, tt.extArgs, &tt.variadic, tt.args)
result, err := extensions.EvaluateTypeExpression(tt.nulls, tt.ret, tt.extArgs, &tt.variadic, tt.args, extensions.NewSet(), "test://uri")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

minor: for consistency you should pick one of /test_uri or test://uri for your fake URI

"typeAnchor": 2,
"name": "custom_type2"
}
},
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These two bits here, just the extensionType types, are the only thing we need to check here, the rest is just noise IMO. Do you think it would be reasonable to check those directly rather than embedding the whole plan?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OH i thought we wanted to check the outputType.userDefined.typeReference matches the extensionType.typeAnchor here too so we'd have to check the full plan 🤦
i'll refactor

// argumentTypes: the actual argument types provided to the function
func EvaluateTypeExpression(nullHandling NullabilityHandling, returnTypeExpr types.FuncDefArgType,
funcParameters FuncParameterList, variadic *VariadicBehavior, argumentTypes []types.Type) (types.Type, error) {
funcParameters FuncParameterList, variadic *VariadicBehavior, argumentTypes []types.Type, registry Set, uri string) (types.Type, error) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

minor: could you make uri the first arg to this. From looking at call sites

	return EvaluateTypeExpression(s.impl.Nullability, s.impl.Return.ValueType, s.impl.Args, s.impl.Variadic, argumentTypes, registry, s.uri)

it makes sense to group the arguments from the function variant together. URI is the only one that comes from *Variant and not the *Impl so it feels like it should come first

Could you update the function doc as well.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done - do the https://pkg.go.dev/github.com/substrait-io/substrait-go/v4/... function docs get updated automatically w/ adding it as comments here? or is there something manual to do?


if udt, ok := outType.(*types.UserDefinedType); ok {
name := strings.TrimPrefix(returnTypeExpr.ShortString(), "u!") // short string contains the u! prefix, but type definitions in the extensions don't
udt.TypeReference = registry.GetTypeAnchor(ID{Name: name, URI: uri}) // assume that the user defined type is in the same extension as the function
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should encode the "uri is expected to be the function uri" into the function docs instead of a comment.

Copy link
Member

@vbarua vbarua left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changes look good overall. I think you left some debugging printlns in. If remove them, I'll merge this in tomorrow.

// custom_type1 is referenced as an argument and return type, so should be registered in the extensions
// custom_type2 is referenced as an argument and return type, so should be registered in the extensions
// custom_type3 is only referenced as a return type, but should still be registered in the extensions
// custom_type4 is not referenced in the plan at all, so not be registerd in the extensions
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ooooh, good idea to check for a type that's not used.

@vbarua vbarua merged commit 485f6dc into substrait-io:main Aug 8, 2025
8 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants