Description
I'm a longtime C# programmer but just getting my feet wet with .NET for Apache Spark. Following many "getting started" instructions and videos, I installed:
- 7-Zip
- Java 8
- Apache Spark, downloaded from https://spark.apache.org/downloads.html
- .NET for Apache Spark v2.1.1
- WinUtils.exe

I'm running this on Windows 10.
Problem:
When I call DataFrame.Show() after a DataFrame.WithColumn() call that uses a UDF, I always get this error:
[2023-02-07T15:45:31.3903664Z] [DESKTOP-H37P8Q0] [Error] [TaskRunner] [0] ProcessStream() failed with exception: System.ArgumentNullException: Value cannot be null. Parameter name: type
TestCases.csv looks like this:
TestCases.csv
OrderList.csv looks like this:
OrderList.csv
Here is the Program class of the TestSparkApp console project:
Program.cs.txt
and supporting classes:
Player.cs.txt
Collector.cs.txt
Here is the output of the above app:
TestSpartAppOutput.txt
Note that the same error appears when executing many different methods on the DataFrame object, but only after a call to the WithColumn method using a UDF. In this case, the code looks like this:
// user-defined function wrapping GetSubstance (two string inputs, int result)
Func<Column, Column, Column> GetSubst = Udf<string, string, int>(
    (strOrder, strPlayers) =>
    {
        return GetSubstance(strOrder, strPlayers);
    });

// call the user-defined function and add a new column to the DataFrame
ordersFrame = ordersFrame.WithColumn(
    "substance",
    GetSubst(ordersFrame["names"], ordersFrame["players"]).Cast("Integer"));

// print out the data
// *** The error is thrown by the Show() call below; if I comment it out, the same error is thrown later
ordersFrame.Show(20, 20, false);
However, I've tried other UDFs followed by other DataFrame method calls and I always get the same error. In the Main() function, you will see a later foreach loop. If I comment out the ordersFrame.Show() call and uncomment the contents of the loop, I get the same error when I access row.Values[0].ToString().
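A stripped-down repro along the following lines might help separate the UDF mechanism from my own classes and CSV files. This is only a sketch: the UdfRepro namespace, the CombinedLength helper (a hypothetical stand-in for GetSubstance), and the one-row DataFrame built with spark.Sql are all made up for illustration, and it assumes the Microsoft.Spark NuGet package and the same spark-submit setup as the real app.

```csharp
using System;
using Microsoft.Spark.Sql;
using static Microsoft.Spark.Sql.Functions;

namespace UdfRepro
{
    public static class Program
    {
        // Hypothetical stand-in for GetSubstance: pure, no dependency on other classes.
        public static int CombinedLength(string a, string b) =>
            (a?.Length ?? 0) + (b?.Length ?? 0);

        public static void Main()
        {
            SparkSession spark = SparkSession
                .Builder()
                .AppName("UdfRepro")
                .GetOrCreate();

            // One-row DataFrame built in SQL, so no CSV files are involved (made-up data).
            DataFrame df = spark.Sql("SELECT 'a,b' AS names, 'x,y,z' AS players");

            // Same UDF shape as in the real app: two string inputs, one int output.
            Func<Column, Column, Column> combinedLength =
                Udf<string, string, int>((a, b) => CombinedLength(a, b));

            df = df.WithColumn("substance", combinedLength(df["names"], df["players"]));

            // Show() forces the UDF to be evaluated by the worker.
            df.Show();

            spark.Stop();
        }
    }
}
```

If this small program fails with the same ArgumentNullException, the cause is more likely in the setup than in GetSubstance or the CSV data; if it succeeds, the difference between it and the real app is the place to look.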
I wonder if I have missed something in my installation?
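To make the installation easier to review, the environment variables that the getting-started guides typically ask for can be printed with a small console program like the sketch below. The list of names (JAVA_HOME, SPARK_HOME, HADOOP_HOME, DOTNET_WORKER_DIR) is an assumption on my part about what matters, and I don't know whether any of them is related to this error; EnvCheck and Describe are names invented for this example.

```csharp
using System;

public static class EnvCheck
{
    // Assumed list of variables relevant to a Windows Spark / .NET for Apache Spark setup.
    public static readonly string[] Names =
        { "JAVA_HOME", "SPARK_HOME", "HADOOP_HOME", "DOTNET_WORKER_DIR" };

    // Returns a one-line description of a variable, or a note that it is missing.
    public static string Describe(string name)
    {
        string value = Environment.GetEnvironmentVariable(name);
        return string.IsNullOrEmpty(value) ? $"{name} is not set" : $"{name} = {value}";
    }

    public static void Main()
    {
        foreach (string name in Names)
        {
            Console.WriteLine(Describe(name));
        }
    }
}
```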
Desktop:
- OS: Windows 10
- Browser: n/a
- Version: see above