Description
I am trying to create a structured streaming example that uses .NET Spark to analyze incoming Tweets (e.g. filter them, and maybe apply ML.NET sentiment analysis efficiently). I found a NuGet package called Tweetinvi that lets me create a stream of incoming Tweets.
When I tried to find a way to connect a Twitter stream to .NET Spark, I wasn't sure what the best option was. The only input options for DataStreamReader.Format() that I could find in the examples in this repo were "Kafka" and "socket."
Is there a comprehensive list of all the formats we can use in a .NET Spark structured streaming scenario, or are socket and Kafka the only two options? Is there any explanation of what counts as a socket connection (e.g. if I can get host/port information through Tweetinvi, could I just use the socket format in my Spark program)?
If these are the only two format options, or if no format suits the stream I currently have, I figure I may need to add another step, such as feeding my Twitter stream into Kafka and then building a Kafka-based Spark streaming scenario.
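
To make the socket idea above more concrete, here is a minimal sketch of a bridge. It assumes that Spark's socket source connects as a TCP client to a given host and port and reads newline-delimited UTF-8 text, as in the `nc -lk 9999` style examples. The names `serve_lines` and `read_all` are hypothetical, the list of strings stands in for Tweets coming from Tweetinvi, and `read_all` only stands in for the Spark side so the sketch can be run on its own. Python is used purely to illustrate the wire format; a real bridge for Tweetinvi would presumably be written in .NET.

```python
import socket
import threading


def serve_lines(lines, host="127.0.0.1", port=0):
    """Serve each string in `lines` as one newline-terminated UTF-8 line
    to the first TCP client that connects, then close the connection.

    Returns (thread, bound_port). port=0 asks the OS for a free port.
    """
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    server.bind((host, port))
    server.listen(1)
    bound_port = server.getsockname()[1]

    def run():
        conn, _ = server.accept()
        with conn:
            for text in lines:
                # One record per line: collapse embedded newlines so that
                # a multi-line Tweet still arrives as a single row.
                record = " ".join(text.splitlines())
                conn.sendall(record.encode("utf-8") + b"\n")
        server.close()

    thread = threading.Thread(target=run, daemon=True)
    thread.start()
    return thread, bound_port


def read_all(host, port):
    """Stand-in for the consumer: connect as a client, read lines until EOF."""
    data = b""
    with socket.create_connection((host, port)) as client:
        while True:
            chunk = client.recv(4096)
            if not chunk:
                break
            data += chunk
    return data.decode("utf-8").splitlines()
```

Under these assumptions, the Spark program would use the socket format with its `host` and `port` options pointing at wherever this bridge listens, rather than at Twitter directly. If the socket source turns out to be meant only for testing, the Kafka route described above would be the alternative.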