feat(csharp): improve handling of StructArrays #2587

Merged
merged 7 commits into from
Mar 10, 2025

Conversation

davidhcoe
Contributor

@davidhcoe davidhcoe changed the title from "feat(csharp): improve handling of structs" to "feat(csharp): improve handling of StructArrays" Mar 7, 2025
@davidhcoe davidhcoe marked this pull request as ready for review March 8, 2025 00:00
@github-actions github-actions bot added this to the ADBC Libraries 18 milestone Mar 8, 2025
Contributor

@CurtHagenlocher CurtHagenlocher left a comment

Thanks for the change! I've left a few comments and questions to consider, but nothing I'd think of as seriously blocking.

In hindsight, I think this is arguably the wrong approach to dealing with "nonstandard" values, whether they're structured or decimal. It would have been better to keep all conversions in Arrow "vector" space instead of dealing with them one by one in a "get scalar" function. That way, if I'm a consumer who wants to deal with the results as an array and doesn't want to handle values one at a time, I can say "convert this struct array into a string array"; the result is then just a regular Arrow string vector and I can keep going in vector space. For full generality, this might require a change to the C# Arrow implementation to support a common interface between C# arrays and Arrow arrays, but that's probably worth doing or at least thinking about.

(And we can obviously move in those directions over time.)
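
To illustrate the "vector space" idea in the comment above, here is a minimal sketch of converting a whole `StructArray` into a `StringArray` of JSON strings in one pass. The class and method names (`StructArrayVectorSketch`, `ToJsonStringArray`, `ScalarAt`) are hypothetical and not part of this PR or of the Arrow library; the sketch assumes the Apache.Arrow and System.Text.Json packages, handles only `Int64` and `String` children, and ignores sliced arrays.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Text.Json;
using Apache.Arrow;
using Apache.Arrow.Types;

public static class StructArrayVectorSketch
{
    // Hypothetical helper: turns every row of a StructArray into a JSON string
    // and returns the result as an ordinary Arrow StringArray.
    public static StringArray ToJsonStringArray(StructArray structArray)
    {
        StructType structType = (StructType)structArray.Data.DataType;
        var builder = new StringArray.Builder();

        for (int row = 0; row < structArray.Length; row++)
        {
            if (structArray.IsNull(row))
            {
                // A null struct row stays null in the output vector.
                builder.AppendNull();
                continue;
            }

            var fields = new Dictionary<string, object?>();
            for (int i = 0; i < structType.Fields.Count; i++)
            {
                fields[structType.Fields[i].Name] = ScalarAt(structArray.Fields[i], row);
            }
            builder.Append(JsonSerializer.Serialize(fields));
        }

        return builder.Build();
    }

    // Assumption: only two child types are supported in this sketch.
    private static object? ScalarAt(IArrowArray child, int row)
    {
        if (child.IsNull(row))
        {
            return null;
        }

        switch (child)
        {
            case Int64Array int64Array:
                return int64Array.GetValue(row);
            case StringArray stringArray:
                return stringArray.GetString(row);
            default:
                throw new NotSupportedException(
                    $"Child type {child.Data.DataType.Name} is not handled in this sketch.");
        }
    }
}
```

A consumer could then treat the returned `StringArray` like any other Arrow string vector instead of calling a per-row accessor for each value.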

@@ -76,7 +83,9 @@ public static class IArrowArrayExtensions
                 case ArrowTypeId.Int64:
                     return ((Int64Array)arrowArray).GetValue(index);
                 case ArrowTypeId.String:
-                    return ((StringArray)arrowArray).GetString(index);
+                    StringArray sArray = (StringArray)arrowArray;
+                    if (sArray.Length == 0) { return null; }

How can we get here? Why is this not an error, and why does it impact only StringArray and not other arrays?

Still curious about this as it looks strictly wrong. Is there a call stack which shows how we'd get here?
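
For context on why the zero-length check looks suspicious: it covers only one way an index can be out of range, and only for one array type. A hypothetical alternative, sketched below, is a type-independent guard that fails loudly for any `IArrowArray`. The names `BoundsCheckSketch`, `EnsureInRange`, and `GetStringChecked` are illustrative assumptions and do not appear in the PR.

```csharp
#nullable enable
using System;
using Apache.Arrow;

public static class BoundsCheckSketch
{
    // Hypothetical guard: rejects any index outside [0, Length) for any array type,
    // rather than special-casing empty StringArrays.
    public static void EnsureInRange(IArrowArray arrowArray, int index)
    {
        if (index < 0 || index >= arrowArray.Length)
        {
            throw new ArgumentOutOfRangeException(
                nameof(index),
                index,
                $"Index must be in [0, {arrowArray.Length}) for {arrowArray.Data.DataType.Name}.");
        }
    }

    // Example use for the String case; a null slot still yields null from GetString.
    public static string? GetStringChecked(IArrowArray arrowArray, int index)
    {
        EnsureInRange(arrowArray, index);
        return ((StringArray)arrowArray).GetString(index);
    }
}
```

A guard like this does not explain how an out-of-range index arises; it only makes the failure explicit and uniform across array types instead of masking it as a null value.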

Contributor

@CurtHagenlocher CurtHagenlocher left a comment

I'll wait a bit before checking in to see if we can clarify the added code in ValueAt.

@davidhcoe
Contributor Author

It happens with:

Message: 
  System.ArgumentOutOfRangeException : Specified argument was out of the range of valid values. (Parameter 'index')

Stack Trace: 
  BinaryArray.GetBytes(Int32 index, Boolean& isNull)
  StringArray.GetString(Int32 index, Encoding encoding)
  IArrowArrayExtensions.ValueAt(IArrowArray arrowArray, Int32 index, StructResultType resultType) line 107
  IArrowArrayExtensions.ValueAt(IArrowArray arrowArray, Int32 index) line 47
  IArrowArrayExtensions.ParseStructArray(StructArray structArray, Int32 index) line 335
  IArrowArrayExtensions.ParseStructArray(StructArray structArray, Int32 index) line 352
  IArrowArrayExtensions.SerializeToJson(StructArray structArray, Int32 index) line 316
  <>c.<GetValueConverter>b__3_25(IArrowArray array, Int32 index) line 298
  AdbcDataReader.GetValue(Int32 ordinal) line 277
  ClientTests.CanClientExecuteQuery(AdbcConnection adbcConnection, TestConfiguration testConfiguration, Action`1 additionalCommandOptionsSetter, String customQuery, Nullable`1 expectedResultsCount, String environmentName) line 134
  ClientTests.CanClientExecuteQuery() line 98
  RuntimeMethodHandle.InvokeMethod(Object target, Void** arguments, Signature sig, Boolean isConstructor)
  MethodBaseInvoker.InvokeWithNoArgs(Object obj, BindingFlags invokeAttr)

So it is a larger Struct of Structs. Perhaps the real problem is a parsing error there, rather than a missing check before GetString?

@CurtHagenlocher
Contributor

I just noticed that at lines 335 and 363 we're (potentially) calling the wrong version of ValueAt. We need to pass StructResultType.Object. Could that be the cause of the problem?
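
The overload concern can be shown with a toy model that does not use Arrow at all. In the sketch below, nested dictionaries stand in for struct values, and the enum is a hypothetical stand-in for the driver's `StructResultType`: only `Object` is named in this thread, so the `JsonString` member and the assumption that the two-argument overload defaults to it are guesses made for illustration.

```csharp
#nullable enable
using System.Collections.Generic;
using System.Text.Json;

// Hypothetical stand-in; only the Object member is confirmed by the discussion.
public enum StructResultType { JsonString, Object }

public static class NestedStructSketch
{
    // Assumed default: without an explicit result type, structs come back as JSON text.
    public static object? ValueAt(object? cell)
    {
        return ValueAt(cell, StructResultType.JsonString);
    }

    public static object? ValueAt(object? cell, StructResultType resultType)
    {
        if (cell is Dictionary<string, object?> structCell)
        {
            Dictionary<string, object?> parsed = ParseStruct(structCell);
            if (resultType == StructResultType.Object)
            {
                return parsed;
            }
            return JsonSerializer.Serialize(parsed);
        }

        // Non-struct values pass through unchanged in this toy model.
        return cell;
    }

    private static Dictionary<string, object?> ParseStruct(Dictionary<string, object?> structCell)
    {
        var result = new Dictionary<string, object?>();
        foreach (KeyValuePair<string, object?> field in structCell)
        {
            // Requesting Object keeps a nested struct as a dictionary. Calling the
            // one-argument overload here would embed an already-serialized string.
            result[field.Key] = ValueAt(field.Value, StructResultType.Object);
        }
        return result;
    }
}
```

Under these assumptions, the outer call serializes the whole tree exactly once. If the recursion used the default overload instead, each nested struct would be serialized to a string first and then escaped again by the outer serialization. Whether this mismatch also explains the `ArgumentOutOfRangeException` in the stack trace is not established in this thread.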

@CurtHagenlocher
Copy link
Contributor

After the most recent change, I think lines 350-357 could be simplified to just jsonDictionary.Add(name, children); because children.Count should always equal structArray1.Length, which we know isn't zero. (And if it's negative, then something bad has happened.)

@CurtHagenlocher CurtHagenlocher merged commit 861f009 into apache:main Mar 10, 2025
5 of 6 checks passed
Development

Successfully merging this pull request may close these issues.

csharp: ValueAt extension causes error when StringArray length = 0
2 participants