SentimentAnalysisKafka

A data-intensive application that analyzes tweets from the Twitter API and classifies their sentiment as positive, negative, or neutral.

This project is a good starting point for those who have little or no experience with Kafka & Apache Spark Streaming.

Input data: Tweets that mention company names.

Main model: A data-intensive application that scales and runs efficiently, using suitable data models and encoding schemes. It preprocesses the tweets and applies sentiment analysis to them.

Output: Text containing all the tweets with their sentiment analysis and the competitor names.

We use Python 3.11, Spark 3.5.0, and Kafka 3.6.0.

Part 1: Ingest Data using Kafka

This part sends tweets from the Twitter Sentiment Analysis data into Kafka. To do this, follow the instructions for ingesting data using Kafka.
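As a rough illustration of the ingestion step, the sketch below publishes tweet records to a Kafka topic as JSON. It is not the project's actual producer: it assumes the kafka-python client, a broker at `localhost:9092`, and hypothetical names for the CSV file (`tweets.csv`) and topic (`tweets`).

```python
import csv
import json
from typing import Iterable


def encode_tweet(row: dict) -> bytes:
    """Serialize one tweet record as UTF-8 JSON for the Kafka message value."""
    return json.dumps(row, ensure_ascii=False).encode("utf-8")


def publish_tweets(rows: Iterable[dict], producer, topic: str) -> int:
    """Send every row to `topic` and return how many messages were sent.

    `producer` is any object with a Kafka-style send(topic, value=...) method.
    """
    sent = 0
    for row in rows:
        producer.send(topic, value=encode_tweet(row))
        sent += 1
    return sent


def main(csv_path: str = "tweets.csv", topic: str = "tweets") -> None:
    # Assumption: kafka-python is installed and a broker listens on localhost:9092.
    # Imported here so the helpers above work without the client installed.
    from kafka import KafkaProducer

    producer = KafkaProducer(bootstrap_servers="localhost:9092")
    with open(csv_path, newline="", encoding="utf-8") as f:
        count = publish_tweets(csv.DictReader(f), producer, topic)
    producer.flush()  # block until buffered messages have been sent
    print(f"sent {count} tweets to topic {topic!r}")
```

Calling `main()` with the path to your own data file would start the ingestion; the file name, topic, and broker address above are placeholders to adapt.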

Part 2: Tweet preprocessing and sentiment analysis

In this part, we receive tweets from Kafka and preprocess them with the pyspark library, which is Python's API for Spark. We then apply sentiment analysis using textblob, a Python library for processing textual data, and extract competitor names using the spacy library.
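The preprocessing and scoring logic might look roughly like the sketch below. It is written as plain Python functions (which could be wrapped in Spark UDFs) rather than as the project's actual pyspark code. The cleaning rules, the `0.05` neutrality threshold, and all function names are assumptions made for illustration; the textblob and spacy calls are imported lazily so the pure helpers run on their own.

```python
import re

URL_RE = re.compile(r"https?://\S+")
MENTION_RE = re.compile(r"@\w+")
NON_TEXT_RE = re.compile(r"[^A-Za-z0-9\s']")


def clean_tweet(text: str) -> str:
    """Strip URLs, @mentions, and punctuation; collapse whitespace; lowercase."""
    text = URL_RE.sub(" ", text)
    text = MENTION_RE.sub(" ", text)
    text = NON_TEXT_RE.sub(" ", text)  # also turns '#tag' into 'tag'
    return " ".join(text.split()).lower()


def label_polarity(polarity: float, threshold: float = 0.05) -> str:
    """Map a polarity score in [-1, 1] to a label (threshold is an assumption)."""
    if polarity > threshold:
        return "positive"
    if polarity < -threshold:
        return "negative"
    return "neutral"


def score_tweet(text: str) -> float:
    """Return TextBlob's polarity for the cleaned tweet (requires textblob)."""
    from textblob import TextBlob

    return TextBlob(clean_tweet(text)).sentiment.polarity


def extract_organizations(text: str, nlp) -> list[str]:
    """Return the ORG entities spaCy finds in the text.

    `nlp` is a loaded spaCy pipeline, e.g. spacy.load("en_core_web_sm").
    The raw text is used here so that capitalization is preserved.
    """
    return [ent.text for ent in nlp(text).ents if ent.label_ == "ORG"]
```

For a single tweet, `label_polarity(score_tweet(tweet))` would give the sentiment label, and `extract_organizations(tweet, nlp)` a list of candidate company names that could then be filtered down to competitors.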

After sentiment analysis, we write the sentiment results and competitor names to a dashboard built with a Flask application. The results can also be stored in a Parquet file, a columnar data storage format.
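One way such a dashboard backend could be organized is sketched below. This is not the project's Flask application: the routes (`/api/tweets`, `/api/summary`), the record fields (`sentiment`, `competitors`), and the in-memory list used as storage are all hypothetical choices made for illustration.

```python
from collections import Counter


def summarize(records: list[dict]) -> dict:
    """Aggregate scored tweets into the counts a dashboard might chart."""
    sentiments = Counter(r["sentiment"] for r in records)
    competitors = Counter(
        name for r in records for name in r.get("competitors", [])
    )
    return {
        "total": len(records),
        "by_sentiment": dict(sentiments),
        "top_competitors": competitors.most_common(5),
    }


def create_app(records: list[dict]):
    """Build a small Flask app around a shared list of scored tweets."""
    # Imported lazily so summarize() can be used without Flask installed.
    from flask import Flask, jsonify, request

    app = Flask(__name__)

    @app.route("/api/tweets", methods=["POST"])
    def add_tweet():
        # The streaming job would POST one scored tweet per request.
        records.append(request.get_json())
        return jsonify({"stored": len(records)}), 201

    @app.route("/api/summary")
    def summary():
        return jsonify(summarize(records))

    return app
```

Running `create_app([]).run()` would start a local development server. For the Parquet option, a Spark DataFrame can be written with a call such as `df.write.mode("append").parquet(path)`, where `path` is an output directory of your choosing.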

Part 3: Data Warehouse (Snowflake) and OLAP (Online Analytical Processing)

Snowflake serves as a single source of truth: an integrated, non-volatile store for the historical data streamed by Spark. Analysts can run complex queries against it to produce daily, monthly, and quarterly summaries.

Example 1: Average Sentiment Score per Company

Example 2: Top Competitors by Mention Count

Example 3: Sentiment Trend Over Time/Daily Summary
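To make the three examples concrete, here is a small sketch that runs comparable aggregate queries from Python. It uses an in-memory SQLite database as a stand-in for Snowflake so that it runs anywhere. The table name `tweet_sentiment`, its columns, and the sample rows are invented for illustration and are not the project's schema, queries, or data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tweet_sentiment ("
    " company TEXT, competitor TEXT, polarity REAL, tweet_date TEXT)"
)
# Made-up sample rows, only to give the queries something to aggregate.
conn.executemany(
    "INSERT INTO tweet_sentiment VALUES (?, ?, ?, ?)",
    [
        ("Acme", "Beta", 0.5, "2024-01-01"),
        ("Acme", "Beta", -0.1, "2024-01-01"),
        ("Acme", "Gamma", 0.2, "2024-01-02"),
        ("Beta", "Acme", -0.4, "2024-01-02"),
    ],
)

# Example 1: average sentiment score per company.
avg_by_company = conn.execute(
    "SELECT company, AVG(polarity) FROM tweet_sentiment"
    " GROUP BY company ORDER BY company"
).fetchall()

# Example 2: top competitors by mention count.
top_competitors = conn.execute(
    "SELECT competitor, COUNT(*) AS mentions FROM tweet_sentiment"
    " GROUP BY competitor ORDER BY mentions DESC, competitor"
).fetchall()

# Example 3: daily sentiment trend (average score and tweet count per day).
daily_trend = conn.execute(
    "SELECT tweet_date, AVG(polarity), COUNT(*) FROM tweet_sentiment"
    " GROUP BY tweet_date ORDER BY tweet_date"
).fetchall()

for name, rows in [
    ("avg_by_company", avg_by_company),
    ("top_competitors", top_competitors),
    ("daily_trend", daily_trend),
]:
    print(name, rows)
```

The same `GROUP BY` patterns should carry over to a warehouse table; monthly or quarterly summaries would group on a truncated date instead of the raw `tweet_date`.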

zi78494umbcedu/SentimentAnalysisKafka
