Built simple and multiple (additive and interaction) linear regression models on the Facebook dataset to draw inferences through hypothesis testing. Where the sample size is small, bootstrapping was used to approximate the sampling distribution needed to assess the uncertainty of the estimated coefficients and make inferences. The models are visualized using ggplot and plotly.
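To make the bootstrapping idea concrete, here is a minimal sketch in Python with NumPy. It is not the project's own code (that lives in the notebook) and it runs on synthetic data rather than the Facebook data. The variable names (`x`, `paid`), the sample size, and the choice of case resampling with percentile intervals are all assumptions made for illustration; the project may use a different bootstrap variant.

```python
import numpy as np


def fit_ols(X, y):
    """Least-squares coefficients for y ~ X (X already contains an intercept column)."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta


def bootstrap_coefs(X, y, n_boot=2000, seed=0):
    """Refit the model on rows resampled with replacement; one row of output per refit."""
    rng = np.random.default_rng(seed)
    n = len(y)
    draws = np.empty((n_boot, X.shape[1]))
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)  # indices of one bootstrap sample
        draws[b] = fit_ols(X[idx], y[idx])
    return draws


# Synthetic stand-in data (hypothetical): one continuous predictor, one binary predictor.
rng = np.random.default_rng(42)
n = 40
x = rng.normal(size=n)
paid = rng.integers(0, 2, size=n).astype(float)
y = 1.0 + 2.0 * x + 0.5 * paid + 1.5 * x * paid + rng.normal(scale=0.5, size=n)

# Interaction model: intercept, x, paid, and the x:paid product term.
# Dropping the last column would give the additive model.
X = np.column_stack([np.ones(n), x, paid, x * paid])

beta_hat = fit_ols(X, y)
draws = bootstrap_coefs(X, y)

# Percentile 95% intervals, one per coefficient.
ci_low, ci_high = np.percentile(draws, [2.5, 97.5], axis=0)

for name, est, lo, hi in zip(["intercept", "x", "paid", "x:paid"], beta_hat, ci_low, ci_high):
    print(f"{name:>9}: {est:.3f}  [{lo:.3f}, {hi:.3f}]")
```

An interval for the interaction term that excludes zero would suggest the slope differs between the two groups, which is the kind of inference the hypothesis tests in this project address.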
This dataset describes user engagement with posts published during 2014 on the Facebook page of a well-known cosmetics brand. The original dataset contains 500 observations covering different classes of posts and is available on data.world. After data cleaning, 491 observations remain. The dataset was first analyzed by Moro et al. (2016) in their data mining work on predicting the performance of different post metrics, which are also based on the type of post. The original dataset has 17 continuous and discrete variables; in this project, we extracted five of them for facebook_data. Details are explained in the notebook or PDF file.