Text-Analytics-With-R

Text Analytics of Twitter Data

This code shows a general workflow for text analytics. The steps are:

Tokens
DFM: Document-Feature Matrix
TF-IDF
SVD: Singular Value Decomposition
Random Forest
Prediction and Confusion Matrix
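
The feature-building steps above (tokens, DFM, TF-IDF, SVD) can be sketched in base R as follows. This is a minimal illustration, not the code from the notebook: the four example sentences, the variable names, and the choice of `k <- 2` are all assumptions made for the sketch, and base `svd()` stands in for whatever truncated-SVD routine the notebook uses.

```r
# Toy documents (hypothetical, not taken from the Kaggle data)
docs <- c(
  "Forest fire near the town, residents evacuated",
  "I love this sunny day at the beach",
  "Flood warning issued after heavy rain",
  "Great coffee and a good book today"
)

# 1. Tokens: lowercase, replace non-letters with spaces, split on whitespace
clean  <- tolower(gsub("[^[:alpha:][:space:]]", " ", docs))
tokens <- strsplit(trimws(clean), "[[:space:]]+")

# 2. DFM: one row per document, one column per term, cells hold term counts
vocab <- sort(unique(unlist(tokens)))
dfm   <- t(sapply(tokens, function(tok) table(factor(tok, levels = vocab))))

# 3. TF-IDF: term frequency within each document, weighted by how rare
#    the term is across documents
tf    <- dfm / rowSums(dfm)
idf   <- log10(nrow(dfm) / colSums(dfm > 0))
tfidf <- sweep(tf, 2, idf, `*`)

# 4. SVD: keep the first k singular vectors as dense document features
k        <- 2
s        <- svd(tfidf, nu = k, nv = k)
features <- s$u %*% diag(s$d[1:k], nrow = k)

dim(features)  # one row per document, k columns
```

The rows of `features` are what a classifier would then be trained on, for example with `randomForest::randomForest(x = features, y = labels)`, where `labels` is a factor of class labels (a hypothetical name here).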

On test data with a sample size of 39, the model reached an accuracy of 62.16%, with a sensitivity of 0.4667 and a specificity of 0.7273. The model is clearly far from perfect, but it gives a basic idea of how text analysis works.
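
Accuracy, sensitivity, and specificity all follow from the four cells of a confusion matrix. The sketch below uses hypothetical counts: they are inferred as whole numbers that reproduce the reported rates, and they sum to 37 rather than the stated 39, so treat them as illustrative and not as the notebook's actual table. Which class counts as "positive" is also an assumption.

```r
# Hypothetical confusion-matrix cells, chosen to match the reported rates
tp <- 7   # positives predicted as positive
fn <- 8   # positives predicted as negative
tn <- 16  # negatives predicted as negative
fp <- 6   # negatives predicted as positive

sensitivity <- tp / (tp + fn)                   # true positive rate
specificity <- tn / (tn + fp)                   # true negative rate
accuracy    <- (tp + tn) / (tp + fn + tn + fp)  # share of correct predictions

round(c(accuracy = accuracy,
        sensitivity = sensitivity,
        specificity = specificity), 4)
# accuracy 0.6216, sensitivity 0.4667, specificity 0.7273
```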

The data is a randomly picked subset of the Kaggle dataset "Real or Not? NLP with Disaster Tweets". Because of limited computational power, I chose 200 data points at random.
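
Drawing such a subset can be sketched as below. The data frame here is a synthetic stand-in for the full Kaggle training data, and the seed, column names, and row count are assumptions made for the example.

```r
set.seed(123)  # hypothetical seed, only so the draw is reproducible

# Synthetic stand-in for the full dataset (hypothetical columns and size)
tweets <- data.frame(
  id     = seq_len(1000),
  text   = paste("example tweet", seq_len(1000)),
  target = sample(c(0, 1), 1000, replace = TRUE),
  stringsAsFactors = FALSE
)

# Draw 200 distinct rows uniformly at random
subset_rows   <- sample(nrow(tweets), size = 200)
tweets_subset <- tweets[subset_rows, ]

nrow(tweets_subset)
```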

Thanks to Data Science Dojo for their YouTube series "Introduction to Text Analytics in R".

Dataset link: https://www.kaggle.com/c/nlp-getting-started
DS Dojo YouTube playlist: https://www.youtube.com/playlist?list=PL8eNk_zTBST8olxIRFoo0YeXxEOkYdoxi

Thank You

