Where to find possible streaming formats? #197

Open
@bamurtaugh

I am trying to create a structured streaming example that uses .NET Spark to analyze incoming Tweets (e.g. filter them, and maybe apply ML.NET sentiment analysis efficiently). I found a NuGet package called Tweetinvi that allows me to create a stream of incoming Tweets.

When I was trying to find a way to connect a Twitter stream with .NET Spark, I wasn't sure what the best option was. The only input options for DataStreamReader.Format() that I could find in the examples in this repo were "kafka" and "socket".

Is there a comprehensive list of all possible formats we can use in a .NET Spark structured streaming scenario, or are socket and Kafka the only two options? Is there any explanation of what counts as a socket connection (e.g. if I find host/port information through Tweetinvi, could I just use the socket format for my Spark program)?

If these are the only two format options, or if there is no format suitable for the stream I currently have, I figure I may need to add another step, such as feeding my Twitter stream into Kafka and then creating a Kafka-based Spark streaming scenario.
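
As a point of reference for the socket question: Spark's socket source connects as a TCP client to a host and port and reads newline-delimited UTF-8 text. A library that delivers tweets through callbacks rather than a listening TCP port would therefore need a small relay in front of it. Below is a minimal sketch of such a relay using only the .NET base class library. `LineRelay`, `ServeOnce`, port 9999, and the sample strings are hypothetical names and values chosen for illustration; nothing here comes from Tweetinvi or from this repo.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Net;
using System.Net.Sockets;
using System.Text;

// Hypothetical helper: not part of Tweetinvi or .NET for Apache Spark.
public static class LineRelay
{
    // Accepts a single TCP client and writes each message to it as one
    // newline-terminated UTF-8 line, then closes the connection.
    public static void ServeOnce(TcpListener listener, IEnumerable<string> messages)
    {
        using TcpClient client = listener.AcceptTcpClient();
        using var writer = new StreamWriter(client.GetStream(), new UTF8Encoding(false));
        writer.NewLine = "\n";
        foreach (string message in messages)
        {
            // A line-based reader treats every line break as a record boundary,
            // so embedded line breaks are replaced with spaces.
            writer.WriteLine(message.Replace("\r", " ").Replace("\n", " "));
            writer.Flush();
        }
    }
}

public static class Program
{
    public static void Main()
    {
        // Assumed port; any free local port would do.
        var listener = new TcpListener(IPAddress.Loopback, 9999);
        listener.Start();
        Console.WriteLine("Waiting for a reader on localhost:9999");

        // Placeholder text standing in for tweets received from a stream.
        LineRelay.ServeOnce(listener, new[] { "first sample tweet", "second sample tweet" });
        listener.Stop();
    }
}
```

With a relay like this running, the Spark side would presumably use `Format("socket")` with the `host` and `port` options pointing at it, as in the socket example in this repo. The Spark documentation describes the socket source as intended for testing, so Kafka is likely the better fit for anything beyond a demo.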
