How to pass an array column as an argument to a VectorUdf? #866
Replies: 1 comment
I noticed that the C# implementation of Apache Arrow seems to contain a lot less code than the Java implementation. The README isn't that encouraging either: https://github.com/apache/arrow/tree/master/csharp

Supposedly "List" is implemented in C#: https://github.com/apache/arrow/blob/master/docs/source/status.rst ... but that doesn't seem to be the case, based on your error message.

Here is where the error message comes from: Interestingly, it looks like we are using an older implementation than what is shown. If things executed as shown in my link above, we would have reached a new "List" case in the switch block. That wouldn't help you much either, though, since it seems overly restrictive and would only allow a "List" of one item.

I think the larger question is why the Apache Arrow implementation for C# seems so far behind the other implementations. Someone may want to start putting more effort into that. Here is the NuGet

In the short term, your best option may be to use the non-vectorized UDF. Arrays seemed to work fine in that scenario, based on my testing, although it will obviously run slower.
I'm trying to implement a Vector UDF.
I have set up a .NET for Apache Spark environment by following Spark .NET. Vector UDFs (both Apache Arrow and Microsoft.Data.Analysis) worked for me with an IntegerType column. Now I am trying to pass an integer array column to a Vector UDF and can't find a way to do this.
usings
program
The UDFs above work if I pass the "id" column instead of the "array" column. I'm not sure what type the UDF argument should be for the "array" column. The code above results in the same error, shown below, for both Apache.Arrow and Microsoft.Data.Analysis:
I have asked this on Stack Overflow as well.
Thanks