Description
Update 28thFeb2023
Thanks to @dragorosson, and following their advice on:
- Building Spark .NET
- Building Spark .NET Scala Extensions Layer
On Windows, it is possible to set up openjdk-8-jdk, mvn, and spark-3.2.3-bin-hadoop3.2 and build version 2.1.1 of the jar:
microsoft-spark-3-2/target/microsoft-spark-3-2-3_2.12-[2.1.1].jar
Compiling Microsoft.Spark.Worker with .NET 7, instead of .NET 6.0, consistently resolves the BinaryFormatter compile error. UDFs now work. Delta works.
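To make it easier to check which extension jar goes with which Spark install, here is a minimal sketch of a naming helper. It assumes the pattern `microsoft-spark-<major>-<minor>_<scala>-<version>.jar`, which is only inferred from the jar names quoted in the 20 Feb update of this issue. The function name `spark_jar_name` is hypothetical and not part of any Spark .NET tooling.

```shell
# Hypothetical helper: derive the expected extension jar name.
# Assumed pattern (inferred from jar names quoted in this issue):
#   microsoft-spark-<major>-<minor>_<scala>-<version>.jar
spark_jar_name() {
  spark_version="$1"   # e.g. 3.2.3
  scala_version="$2"   # e.g. 2.12
  ext_version="$3"     # e.g. 2.1.1

  major="${spark_version%%.*}"   # text before the first dot
  rest="${spark_version#*.}"     # text after the first dot
  minor="${rest%%.*}"            # second version component

  printf 'microsoft-spark-%s-%s_%s-%s.jar\n' \
    "$major" "$minor" "$scala_version" "$ext_version"
}

spark_jar_name 3.2.3 2.12 2.1.1
```

With these inputs the helper prints `microsoft-spark-3-2_2.12-2.1.1.jar`. Only the major and minor parts of the Spark version are used, so a 3.1.x install maps to the `3-1` jar.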
Next open issue:
- The key remaining issue is support for microsoft-spark-3-3.jar
Previous discussions
#### Update 23rdFeb2023
Please share your feedback on the observations provided here, but please do so within this issue for tracking purposes.
This project is key for .NET developers who want to stay within .NET when using Spark for big data analytics. It is unclear WHY the commitment shown here has been questionable and sporadic. If this effort fails OR is delayed further, it could have a ripple effect on the ENTIRE machine learning and deep learning .NET effort.
The THREE-PRONGED .NET effort to KEEP big data analytics within .NET could then be called into question:
- Machine Learning (ML.NET)
- PolyGlot
- .NET for Spark
Update 20thFeb2023
- Test the WIP .NET6 version
  - One of the main TODOs for the (yet-to-be-released) .NET6 version is UDF support. The BinaryFormatter problem has been reported since Dec 2020. This will be the top priority for the .NET6 version before it is released.
- Check the WIP .NET6 with microsoft-spark-3-2.jar
- Check the WIP .NET6 with microsoft-spark-3-2_2.12-2.1.1.jar
  - So far, I could only get the WIP merged .NET6 to work with microsoft-spark-3-1_2.12-2.1.1.jar
Update 15thFeb2023
It seems the Azure Synapse team officially deleted ALL C# .NET for Spark samples for Synapse in Jan 2023. Sad!