-
Notifications
You must be signed in to change notification settings - Fork 106
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(csharp): improve handling of StructArrays #2587
Conversation
davidhcoe
commented
Mar 7, 2025
- improves the handling of structs to return objects or JsonString (defaults to JsonString to not break existing callers)
- additional testing for each return type
- updates to the ADO.NET wrapper to support both struct types
- fixes csharp: ValueAt extension causes error when StringArray length = 0 #2586
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the change! I've left a few comments and questions to consider, but nothing I'd think of as seriously blocking.
In hindsight, I think this is arguably the wrong approach to dealing with "nonstandard" values whether they're structured or decimal. It would have been better to keep all conversions in Arrow "vector" space instead of dealing with them one-by-one in a "get scalar" function. That way, if I'm a consumer who wants to deal with the results as an array but I don't want to have to handle values one at a time I can say "convert this struct array into a string array" and then it's just a regular Arrow string vector and I can keep going in vector space. For full generality, this might require a change to the C# Arrow implementation to support a common interface between C# arrays and Arrow arrays, but that's probably worth doing or at least thinking about.
(And we can obviously move in those directions over time.)
@@ -76,7 +83,9 @@ public static class IArrowArrayExtensions | |||
case ArrowTypeId.Int64: | |||
return ((Int64Array)arrowArray).GetValue(index); | |||
case ArrowTypeId.String: | |||
return ((StringArray)arrowArray).GetString(index); | |||
StringArray sArray = (StringArray)arrowArray; | |||
if (sArray.Length == 0) { return null; } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How can we get here? Why is this not an error, and why does it impact only StringArray and not other arrays?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Still curious about this as it looks strictly wrong. Is there a call stack which shows how we'd get here?
csharp/src/Apache.Arrow.Adbc/Extensions/IArrowArrayExtensions.cs
Outdated
Show resolved
Hide resolved
csharp/src/Apache.Arrow.Adbc/Extensions/IArrowArrayExtensions.cs
Outdated
Show resolved
Hide resolved
csharp/src/Apache.Arrow.Adbc/Extensions/IArrowArrayExtensions.cs
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll wait a bit before checking in to see if we can clarify the added code in ValueAt
.
It happens with:
so it is a larger Struct of Structs ... perhaps it's a parsing error there instead of a check for GetString? |
I just noticed that at lines 335 and lines 363 we're (potentially) calling the wrong version of |
After the most recent change, I think the lines 350-357 could be simplified to just |