Sentiment Analysis on Automobile Brands Using DonanımHaber Data
Sentiment Analysis on Automobile Brands is a sentiment analysis project that helps you gain insights from text data by classifying the sentiment expressed in the text as negative, neutral, or positive.
The project analyzes and visualizes sentiment toward automobile brands using comments from the DonanımHaber forum site. The comments collected from DonanımHaber are real-time and unlabelled.
Sentiment Analysis on Automobile Brands is a project that offers the following key features:
Data Collection: Collect data from the DonanımHaber website for use in model training and sentiment analysis.
Data Preprocessing: Prepare and preprocess text data for improved readability, analysis, and modeling.
Data Annotation: Annotate collected data with sentiment labels (positive, negative, neutral) to build ground-truth datasets.
Feature Extraction: Feature extraction enables the accurate representation of sentiment in text data.
Create Model: Create and train a model to ensure high accuracy in sentiment prediction.
Sentiment Analysis: Preprocess the analysis data, extract features, and feed them to the model to make predictions.
Process Result: Process predictions for data visualization.
Data Visualization: Use graphics to communicate and share analysis results more clearly and effectively.
Web Application: Create a dynamic web application that displays the visualizations generated from sentiment analysis and lets users interact with them.
If you have Git installed on your computer, clone the repo.
Go to the directory where you want to clone the repository, for example Desktop:
cd ~/Desktop
Clone the repository:
git clone https://github.com/EmineSener/Sentiment-Analysis-on-Automobile-Brands
If you are not using Git, click the green Code button at the top right of the repository and select the Download ZIP option; the project source code will be downloaded to your local computer.
Open a command line and enter the project folder.
Install the required packages:
pip install -r requirements.txt
To run it:
python app.py
1. Data Collection and Sentiment Analysis
When you run the project, you can view the results of the model that has been trained with the data I collected previously.
However, this project offers more than just that. You have the option to create your dataset by collecting the latest comments and then train the model using your dataset.
Afterward, you can analyze and view the results.
You will need Jupyter Notebook for this.
Open the terminal and go to the directory where the project is located.
Start the Jupyter Notebook server:
jupyter notebook
To run cells in Jupyter Notebook, click on the cells and press "Shift+Enter".
2. Web Application with Flask
First, you need to install Flask on your computer if it is not already installed.
pip install Flask
Go to the directory where the "app.py" file is located.
Run the Flask application.
python app.py
After launching your Flask application, a URL (typically http://localhost:5000) will be displayed in the terminal or command prompt.
You can then enter this URL into your web browser's address bar and press Enter to access the application.
I collected the reviews needed to train and test the model from the DonanımHaber news site with create_dataset.ipynb in this folder.
Creating the dataset from scratch may be time-consuming, so you can use my dataset instead. My dataset is quite large, which is why I haven't uploaded it to GitHub.
You can download the dataset
If you want to create your own dataset, run create_dataset.ipynb.
You can customize settings for your own dataset using the information provided in the notebook.
You can also download the data required for analysis via create_dataset.ipynb.
If you don't want to collect the analysis data yourself, you can download my analysis data.
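For orientation, below is a minimal scraping sketch in the spirit of create_dataset.ipynb, using requests and BeautifulSoup. The URL pattern, page parameter, and CSS selectors are placeholders rather than the notebook's actual values.

```python
# A minimal scraping sketch; the forum URL, the "sayfa" page parameter, and the
# CSS selectors below are placeholders, not the project's actual values.
import requests
from bs4 import BeautifulSoup
import pandas as pd

def scrape_comments(topic_url, pages=1):
    """Collect raw comment texts and dates from a forum topic (illustrative only)."""
    rows = []
    for page in range(1, pages + 1):
        response = requests.get(f"{topic_url}?sayfa={page}", timeout=10)
        soup = BeautifulSoup(response.text, "html.parser")
        for post in soup.select("div.post"):              # assumed selector
            text = post.select_one("div.post-text")       # assumed selector
            date = post.select_one("span.post-date")      # assumed selector
            if text and date:
                rows.append({"comment": text.get_text(strip=True),
                             "date": date.get_text(strip=True)})
    return pd.DataFrame(rows)

# Example usage (placeholder URL):
# df = scrape_comments("https://forum.donanimhaber.com/some-brand-topic", pages=3)
# df.to_csv("dataset.csv", index=False)
```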
In this section, we apply machine learning techniques to perform natural language processing in Turkish.
Preprocessing is the cleaning and transforming of unstructured text data to prepare it for analysis. It includes punctuation removal, stemming, lemmatization, stop-word removal, and part-of-speech tagging.
The preprocessing steps involve removing punctuation marks, stop words, non-sentiment phrases, URLs, and numbers. Additionally, all letters are converted to lowercase, and time information is transformed into a date format to prepare the data for analysis. Explanations of why these operations are necessary for sentiment analysis are provided alongside the preprocessing steps.
You can prepare your raw dataset for natural language processing with the preprocessing tool, or you can use my preprocessed dataset.
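As an illustration, a minimal preprocessing function covering the steps listed above might look like the following. It assumes NLTK's Turkish stop-word list, which may differ from the list the preprocessing tool actually uses.

```python
# A minimal preprocessing sketch: lowercasing, then removing URLs, numbers,
# punctuation, and Turkish stop words. NLTK's stop-word list is an assumption.
import re
import string
import nltk

nltk.download("stopwords", quiet=True)
TURKISH_STOPWORDS = set(nltk.corpus.stopwords.words("turkish"))

def preprocess(text):
    text = text.lower()                                   # lowercase all letters
    text = re.sub(r"http\S+|www\.\S+", " ", text)         # remove URLs
    text = re.sub(r"\d+", " ", text)                      # remove numbers
    text = text.translate(str.maketrans("", "", string.punctuation))  # remove punctuation
    tokens = [t for t in text.split() if t not in TURKISH_STOPWORDS]  # drop stop words
    return " ".join(tokens)

# Example:
# cleaned = preprocess("Bu araba çok iyi, bkz: https://example.com")
```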
The comments on the DonanımHaber site lack sentiment tags. If the comments had a star rating feature like other websites, we could easily categorize them based on their star ratings.
Since we don't have any data other than the text itself, we must rely on the textual context to label the comments as negative, positive, or neutral.
To achieve this, we employed two different approaches:
1. Data Annotation with OpenAI API
If you are using the free version of OpenAI, you can't send prompts continuously: the free tier hits a rate-limit error after roughly 20 seconds of requests, so we wait 20 seconds whenever an error occurs. This approach proved very time-consuming on the full dataset, so we couldn't label the entire dataset this way. However, comparing the few comments I labeled with the OpenAI API against the alternative approach, I can confidently say that this method gives more accurate results.
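A hedged sketch of such an annotation loop is shown below; the model name, prompt wording, and error handling are assumptions rather than the notebook's exact code.

```python
# Sketch of an OpenAI-based annotation loop: send each comment as a prompt and
# wait 20 seconds whenever a rate-limit error is raised, as described above.
import time
from openai import OpenAI  # requires the openai package (v1.x client shown here)

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def label_comment(comment):
    prompt = (
        "Classify the sentiment of the following Turkish comment as "
        "positive, negative, or neutral. Reply with a single word.\n\n" + comment
    )
    while True:
        try:
            response = client.chat.completions.create(
                model="gpt-3.5-turbo",  # assumed model name
                messages=[{"role": "user", "content": prompt}],
            )
            return response.choices[0].message.content.strip().lower()
        except Exception:
            time.sleep(20)  # free-tier rate limit: back off and retry

# labels = [label_comment(c) for c in df["comment"]]
```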
2. Data Annotation with FLAN-T5
The FLAN-T5 base model was used for this second annotation approach. FLAN-T5 is a general-purpose, instruction-tuned text-to-text model; to make it suitable for sentiment analysis, it was fine-tuned on a labelled dataset, a process that adjusts a large number of its parameters.
The labeled dataset used in this context was not generated through web scraping. It was pre-existing and obtained from https://www.kaggle.com/datasets/seymasa/turkish-sales-comments/data.
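As a rough illustration of the fine-tuning step, the sketch below uses the Hugging Face transformers and datasets libraries. The local file name, column names, and hyperparameters are assumptions and may not match the notebook.

```python
# Condensed sketch: fine-tune FLAN-T5 as a text-to-text sentiment labeller.
# "turkish_sales_comments.csv", the "comment"/"label" columns, and the
# hyperparameters are assumptions.
from datasets import load_dataset
from transformers import (AutoTokenizer, AutoModelForSeq2SeqLM,
                          Seq2SeqTrainingArguments, Seq2SeqTrainer,
                          DataCollatorForSeq2Seq)

model_name = "google/flan-t5-base"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

# The Kaggle CSV referenced above, exported locally (hypothetical file name).
dataset = load_dataset("csv", data_files="turkish_sales_comments.csv")["train"]

def tokenize(batch):
    # Encode the comment as the input and the label word as the target text.
    inputs = tokenizer(["sentiment: " + t for t in batch["comment"]],
                       truncation=True, max_length=256)
    targets = tokenizer(text_target=batch["label"], truncation=True, max_length=8)
    inputs["labels"] = targets["input_ids"]
    return inputs

tokenized = dataset.map(tokenize, batched=True, remove_columns=dataset.column_names)

trainer = Seq2SeqTrainer(
    model=model,
    args=Seq2SeqTrainingArguments(output_dir="flan_t5_sentiment",
                                  per_device_train_batch_size=8,
                                  num_train_epochs=3),
    train_dataset=tokenized,
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
)
trainer.train()
```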
You have the option to label the unlabeled dataset, which was generated through web scraping, using data annotation with FLAN. Alternatively, you can also download my dataset that has been labeled with FLAN.
However, it can be stated that manual tagging is the most trustworthy approach for labeling text.
At this stage, I received assistance from FLAN.
The pre-trained Turkish BERT model is utilized for performing sentiment analysis in Turkish.
With create_model_with_bert.ipynb, you can train and save a model for performing sentiment analysis on Turkish texts.
During this process, feature extraction was performed using the BERT model, and the data were converted into tensors to be fed into the model.
The model was then created and trained using the Keras Sequential() API.
Finally, the model results were visualized.
Each stage is explained in detail in create_model_with_bert.ipynb and I recommend that you read it.
create_model_with_bert.ipynb may take some time to run, but since the model is saved as best_model.h5 after training, you can use the model multiple times without the need to train it again.
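For reference, a compact sketch of this pipeline (BERT feature extraction followed by a Keras Sequential classifier) might look like the following. The Turkish BERT checkpoint name and the layer sizes are assumptions, not necessarily those used in create_model_with_bert.ipynb.

```python
# Sketch: extract sentence embeddings with a pre-trained Turkish BERT, then train
# a small Keras Sequential classifier. Checkpoint name and layer sizes are assumed.
import numpy as np
import tensorflow as tf
from transformers import AutoTokenizer, TFAutoModel

tokenizer = AutoTokenizer.from_pretrained("dbmdz/bert-base-turkish-cased")
bert = TFAutoModel.from_pretrained("dbmdz/bert-base-turkish-cased")

def extract_features(texts, batch_size=32):
    """Return the [CLS] embedding of each text as a NumPy array."""
    features = []
    for i in range(0, len(texts), batch_size):
        enc = tokenizer(texts[i:i + batch_size], padding=True, truncation=True,
                        max_length=128, return_tensors="tf")
        out = bert(**enc).last_hidden_state[:, 0, :]  # [CLS] token embedding
        features.append(out.numpy())
    return np.vstack(features)

# X: embeddings, y: integer labels (0=negative, 1=neutral, 2=positive)
# X = extract_features(list(df["comment"])); y = df["label"].to_numpy()

model = tf.keras.Sequential([
    tf.keras.layers.Dense(128, activation="relu", input_shape=(768,)),
    tf.keras.layers.Dropout(0.3),
    tf.keras.layers.Dense(3, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
# model.fit(X, y, epochs=5, validation_split=0.1)
# model.save("best_model.h5")
```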
At this stage, I received assistance from Sentiment_Analyse.
The model is loaded, and sentiment analysis is conducted on comments obtained through separate web scraping processes for each brand.
The resulting sentiment scores are organized based on the date of the comments.
Additionally, the analysis calculates the number of comments made on each day.
The outcomes are visualized using Matplotlib and saved for potential use in various plots within the web application.
You can perform sentiment analysis by running the sentiment_analyse.ipynb file.
The results obtained are saved in the Scores directory.
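A simplified sketch of this analysis step is shown below; the column names, file paths, and the extract_features() helper are assumptions carried over from the earlier sketches.

```python
# Sketch: load the saved classifier, score one brand's comments, and aggregate
# mean sentiment and comment counts per day with pandas. File and column names
# are assumptions.
import pandas as pd
import tensorflow as tf

model = tf.keras.models.load_model("best_model.h5")

def analyse_brand(csv_path):
    df = pd.read_csv(csv_path, parse_dates=["date"])
    probs = model.predict(extract_features(list(df["comment"])))
    df["sentiment"] = probs.argmax(axis=1)  # 0=negative, 1=neutral, 2=positive
    daily = df.groupby(df["date"].dt.date).agg(
        mean_sentiment=("sentiment", "mean"),
        comment_count=("comment", "size"),
    )
    return daily

# daily_scores = analyse_brand("Mercedes_comments.csv")   # hypothetical file
# daily_scores.to_csv("Scores/Mercedes.csv")
```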
Sentiment analysis results will be visualized using Echarts charts in the web application created with Flask.
By running processing_for_visualization.ipynb, the analysis results are prepared for use in Echarts.
The results have been stored in the GraphicData folder.
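As an illustration, the conversion to an ECharts-friendly format could look roughly like this; the field names and directory layout are assumptions.

```python
# Sketch: reshape the per-day scores into the list-of-values format ECharts expects
# and dump them as JSON files under GraphicData. Field names are assumptions.
import json
import pandas as pd
from pathlib import Path

def export_for_echarts(scores_csv, out_path):
    daily = pd.read_csv(scores_csv)
    chart_data = {
        "dates": daily["date"].tolist(),                          # x-axis categories
        "sentiment": daily["mean_sentiment"].round(3).tolist(),   # line series
        "counts": daily["comment_count"].tolist(),                # bar series
    }
    Path(out_path).parent.mkdir(parents=True, exist_ok=True)
    Path(out_path).write_text(json.dumps(chart_data, ensure_ascii=False))

# export_for_echarts("Scores/Mercedes.csv", "GraphicData/Mercedes.json")
```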
Here are a few analysis charts.
1. Graph showing daily sentiment scores of all brands.
2. A graph displaying the total comments for each brand.
3. Brand-specific daily and total sentiment analysis graphs.
The charts were turned into a web application, built with Flask, that runs on localhost.
The result is a dynamic website displaying the sentiment analysis results.
I have also created a static, limited website so you can preview the web application without running the Flask server. You can access the website
Because it is only a sample, separate HTML pages could not be generated for all 29 brands on the static website; instead, a single page for Mercedes is provided as an example.
To access the pages for the other brands, you need to run the Flask application.
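For context, a minimal Flask sketch of how such an application could serve the brand pages is shown below; the routes, template names, and file paths are assumptions, not the project's exact ones.

```python
# Minimal Flask sketch: load the JSON prepared for ECharts and pass it to a
# template. Template and file names here are assumptions.
import json
from flask import Flask, render_template

app = Flask(__name__)

@app.route("/")
def index():
    return render_template("index.html")

@app.route("/brand/<name>")
def brand(name):
    with open(f"GraphicData/{name}.json", encoding="utf-8") as f:
        chart_data = json.load(f)
    # The template is expected to feed chart_data into an ECharts option object.
    return render_template("brand.html", brand=name, data=chart_data)

if __name__ == "__main__":
    app.run(debug=True)  # serves on http://localhost:5000 by default
```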



