Connect to a remote spark cluster (Databricks Connect) #516
-
We are working with Databricks engineers to enable using C# with databricks-connect. We will update this thread when we have more info. cc: @elvaliuliuliu
-
Thank you! My best bet would be to keep this Python API for now, then? Or is there another way to have my web service trigger a SQL query against a Databricks model? Also, is there an ETA around this? Thank you very much for such a quick reply!
-
I have posted about this on https://ideas.databricks.com/ideas/ as recommended here and in other threads on this topic. However, is this really up to the Databricks folks to implement? I'm not sure it is. My understanding is that the databricks-connect libraries for Python, R, etc. rely on a vendored set of .jar files that are shipped when you install databricks-connect, i.e. after having performed the basic setup (sketched below):
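(Roughly, following the Databricks Connect documentation; the exact package version must match the cluster's Databricks Runtime, and the values below are illustrative:)

```bash
# Install the databricks-connect client that matches the cluster's runtime.
pip install -U "databricks-connect==9.1.*"

# Prompts for workspace URL, token, cluster id, org id and port.
databricks-connect configure

# Verifies connectivity to the remote cluster.
databricks-connect test
```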
Then you can find the location of the installed 'modified' Spark implementation using:
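(Using the commands the databricks-connect CLI itself provides:)

```bash
# The vendored Spark distribution and its jars ship with the pip package;
# databricks-connect exposes their locations directly.
databricks-connect get-spark-home
databricks-connect get-jar-dir
```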
However, it seems that what is currently lacking is a way in Microsoft.Spark to use those values as overrides, specifically the jars folder. Am I mistaken in my understanding? I've seen multiple threads saying this is something that needs to be implemented by the Databricks folks, but my understanding is that that is not correct. The databricks-connect implementation for arbitrary consumers already exists; it is simply:
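(As a rough, unofficial sketch of what that could look like for .NET for Apache Spark: point the launcher at the vendored distribution instead of a stock Spark. Paths, versions and the master setting here are illustrative, and this is precisely the part that is not a supported path today:)

```bash
# Use the Spark distribution vendored by databricks-connect.
export SPARK_HOME="$(databricks-connect get-spark-home)"

# Launch the usual .NET for Apache Spark entry point through it
# (jar name and application are placeholders).
"$SPARK_HOME/bin/spark-submit" \
  --class org.apache.spark.deploy.dotnet.DotnetRunner \
  --master local \
  microsoft-spark-<version>.jar \
  dotnet MyApp.dll
```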
tl;dr: this seems like a game of hand-ball to me, and I'm not sure it's fair to say this is being blocked by the Databricks side...
-
@shadowmint, .NET for Apache Spark works with databricks-connect as long as you don't use UDFs written in C#. Databricks Connect works by sending the logical plans to the server side, and while the plans can be parsed on the server side, the C# UDFs cannot be executed there. I had a call with a Databricks engineer and got confirmation that the server side needs to be updated. I will update this thread with the latest status. Thanks.
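(For illustration, a minimal sketch of that distinction using the standard Microsoft.Spark API; the session is assumed to be backed by a databricks-connect configured Spark:)

```csharp
using Microsoft.Spark.Sql;
using static Microsoft.Spark.Sql.Functions;

class Example
{
    static void Main()
    {
        // With databricks-connect, the connection details come from its
        // configuration; the application code itself is unchanged.
        SparkSession spark = SparkSession.Builder().GetOrCreate();

        // Plain SQL / DataFrame operations only build a logical plan on the
        // client, so they can be shipped to and run on the remote cluster.
        DataFrame df = spark.Sql("SELECT id FROM range(10)");
        df.Show();

        // A C# UDF, by contrast, needs the .NET worker available on the
        // cluster's executors, which is the part that requires server-side
        // changes on Databricks today.
        var plusOne = Udf<int, int>(x => x + 1);
        df.Select(plusOne(df["id"])).Show();
    }
}
```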
-
To answer your second question:
We are working on making .NET for Spark usable for all Spark engines and experiences. However, each company that offers Spark services has to make its own decision on whether or not to include it out of the box. If they do so, we are happy to help.
At this point, I am aware of Azure HDInsight and Azure Synapse workspaces offering .NET for Spark out of the box. I do like the Synapse experience, especially since it provides nice integration with the nteract-based notebooks in Synapse.
Best regards
Michael
-
Hi!
I need to refactor a Python API that uses PySpark to execute a SQL statement on a Spark cluster in Azure Databricks. It uses Databricks Connect (https://docs.databricks.com/dev-tools/databricks-connect.html) to initialize the connectivity to the cluster.
Is it possible to achieve something like this in C#? I would love to have an MVC Core API!
Thanks!
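(For reference, a minimal sketch of what such an endpoint could look like with Microsoft.Spark inside an ASP.NET Core controller. The controller and route names are hypothetical; it assumes the Spark driver setup and databricks-connect connectivity are handled outside the request path, and it ignores session lifetime and SQL-injection concerns:)

```csharp
using System.Linq;
using Microsoft.AspNetCore.Mvc;
using Microsoft.Spark.Sql;

[ApiController]
[Route("api/[controller]")]
public class SparkQueryController : ControllerBase
{
    [HttpGet]
    public IActionResult Get(string sql)
    {
        // Reuses an existing session if one is already active on the driver.
        SparkSession spark = SparkSession.Builder().GetOrCreate();

        // Run the statement and return a small result set as JSON.
        var rows = spark.Sql(sql)
                        .Collect()
                        .Select(row => row.Values)
                        .ToList();

        return Ok(rows);
    }
}
```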