Success Prediction of Mobile Apps in Google Playstore

Description

Many app developers put a lot of effort into building and publishing their apps on the Google Play Store, expecting good revenue in return, but most of the time they do not earn as much as anticipated. This project predicts the success of an app, helping developers decide which factors to build their apps around and where to focus their skills.

The project takes a few inputs from the user, such as the app size, whether the app is free or paid, whether it is ad supported, whether it offers in-app purchases, the number of months since the release date, and the number of months since the last update. It then returns the average number of downloads that an app with those characteristics could expect.
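The two "months since" inputs can be derived from calendar dates. The following is a minimal sketch of one way to do that, not the notebooks' own code; the function name `months_between` and the example dates are hypothetical.

```python
from datetime import date


def months_between(start: date, end: date) -> int:
    """Return the number of whole calendar months from start to end."""
    months = (end.year - start.year) * 12 + (end.month - start.month)
    # If the start day-of-month has not been reached yet, the last month is incomplete.
    if end.day < start.day:
        months -= 1
    # Dates in the wrong order are clamped to zero rather than going negative.
    return max(months, 0)


# Hypothetical app: released 15 Jan 2020, last updated 1 Mar 2021, evaluated 10 Jun 2021.
today = date(2021, 6, 10)
months_since_release = months_between(date(2020, 1, 15), today)
months_since_update = months_between(date(2021, 3, 1), today)
print(months_since_release, months_since_update)
```

Counting only whole months is a design choice made here for simplicity; the project may round differently.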

This helps developers shape their apps in a way that is most likely to bring the expected revenue.

About the Project

The data for this project was imported from Kaggle datasets. It is large compared with other tabular Play Store datasets, which helps the project by mitigating the shortage of training data.

This project was developed in Python with the most common Python packages. All predictions are made using classical machine learning methods.

Repository Tree

F:.
├───exp-notebooks
├───google-playstore-apps
├───images
└───notebooks
    └───.ipynb_checkpoints

Structure

Importing data from Kaggle, using an API key for authentication.
Data cleaning.
Visualization and exploratory analysis of the cleaned dataset.
Data transformation based on the above analysis.
Regression models on the transformed data: around 10 regression models are compared to find the best-fit model.
Hyperparameter tuning to reduce the MSE (Mean Squared Error) of the best-fit model.
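Choosing a best-fit model by MSE comes down to scoring each model's predictions against the same targets and keeping the lowest score. The sketch below shows that comparison in plain Python, without any library; the helper names, model names, and numbers are made up for illustration and are not results from this project.

```python
def mse(y_true, y_pred):
    """Mean squared error: average of squared differences between targets and predictions."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def best_model_by_mse(y_true, predictions):
    """Return (name, score) for the model whose predictions have the lowest MSE."""
    scores = {name: mse(y_true, y_pred) for name, y_pred in predictions.items()}
    best = min(scores, key=scores.get)
    return best, scores[best]


# Made-up targets and predictions, for illustration only.
y_true = [10.0, 12.0, 15.0, 20.0]
predictions = {
    "model_a": [11.0, 12.0, 14.0, 18.0],
    "model_b": [10.0, 13.0, 15.0, 20.0],
}
best_name, best_score = best_model_by_mse(y_true, predictions)
print(best_name, best_score)
```

In practice the predictions would come from fitted regressors evaluated on a held-out split.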

NOTE:

The google-playstore-apps and images directories are generated automatically when the notebooks are executed as described below. The exp-notebooks directory contains experimental notebooks that are kept for reference only and play no part in executing the main notebooks.

Built With

The most important Python modules used in this project are listed below.

Numpy
Pandas
Matplotlib
Seaborn
sklearn
Various other packages containing boosting algorithms

Getting Started

Execute the steps below in the order given so that each stage runs without errors.

Step-1:

Create a Python virtual environment and activate it (on Windows, activate with venv\Scripts\activate instead of the source command).

python -m venv venv
source venv/bin/activate

Step-2:

Clone the repository.

git clone https://github.com/Nithish1201/Success_prediction_of_app.git

Step-3:

Install the required packages from inside the cloned repository directory:

pip install -r requirements.txt

Step-4:

Log in to Kaggle, obtain your API key, and enter it when prompted.
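Before running the notebooks, it can be useful to check whether Kaggle credentials are already available on the machine. The sketch below looks in the two places the official Kaggle API client conventionally reads: the KAGGLE_USERNAME and KAGGLE_KEY environment variables, and a kaggle.json token file in the ~/.kaggle directory. The function name `find_kaggle_credentials` is hypothetical, and whether the notebooks use these locations or only the interactive prompt is an assumption.

```python
import json
import os
from pathlib import Path


def find_kaggle_credentials(config_dir=None):
    """Return (username, key) from env vars or kaggle.json, or None if neither is set."""
    username = os.environ.get("KAGGLE_USERNAME")
    key = os.environ.get("KAGGLE_KEY")
    if username and key:
        return username, key

    # Conventional token location of the official Kaggle API client.
    directory = Path(config_dir) if config_dir else Path.home() / ".kaggle"
    token_file = directory / "kaggle.json"
    if token_file.is_file():
        data = json.loads(token_file.read_text())
        if data.get("username") and data.get("key"):
            return data["username"], data["key"]
    return None


credentials = find_kaggle_credentials()
print("Kaggle credentials found" if credentials else "No stored Kaggle credentials")
```

If nothing is found, the key can still be entered at the prompt as described above.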

Step-5:

Order of Executing the Notebooks

1.Playstore-Cleaning.ipynb

Data cleaning using Pandas:
    - Getting rid of unwanted columns.
    - Handling irrelevant data.
    - Giving each feature its appropriate data type.
    - Handling missing values.
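The four cleaning steps above can be sketched with Pandas as follows. This is an illustration on a tiny made-up table, not the notebook's code; the column names and values are hypothetical, and filling missing sizes with the median is just one possible strategy.

```python
import pandas as pd

# Tiny made-up sample; the column names are hypothetical.
raw = pd.DataFrame({
    "App Name": ["A", "B", "C", "D"],
    "Size": ["10M", "Varies with device", "2.5M", None],
    "Free": ["True", "False", "True", "True"],
    "Scraped Time": ["t1", "t2", "t3", "t4"],
})

# 1. Get rid of columns that are not needed for prediction.
df = raw.drop(columns=["Scraped Time"])

# 2 and 3. Strip the unit suffix and convert to numbers; irrelevant
# placeholder text that cannot be parsed becomes NaN.
df["Size"] = pd.to_numeric(df["Size"].str.rstrip("M"), errors="coerce")
df["Free"] = df["Free"].map({"True": True, "False": False}).astype(bool)

# 4. Handle missing values, here by filling with the column median.
df["Size"] = df["Size"].fillna(df["Size"].median())

print(df)
```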

NOTE: Running this notebook writes a compressed data file to your google-playstore-apps directory. This file is loaded by the next notebook, 2.Playstore-Visualization.ipynb.

2.Playstore-Visualization.ipynb

- Visualizing the data to gain insights.
- Styled graphs and charts using Matplotlib and Seaborn.

NOTE: Running this notebook saves images to the images directory (for reference only).

3.Playstore-EDA&Responses.ipynb

- Exploring the dimensions and structure of the data.
- Gaining useful insights: distribution of the data and presence of outliers.
- Feature scaling.
- Handling some categorical features for a good prediction result.
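Feature scaling and categorical handling can take several forms. The sketch below shows two common ones in plain Python, z-score standardization and one-hot encoding, as an illustration only; which techniques the notebook actually applies is an assumption, and the helper names are hypothetical.

```python
from statistics import mean, pstdev


def standardize(values):
    """Scale values to zero mean and unit variance (z-scores)."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # A constant feature carries no information; map it to zeros.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def one_hot(categories):
    """Return (levels, rows) where each row is a 0/1 indicator list over the sorted levels."""
    levels = sorted(set(categories))
    rows = [[1 if c == level else 0 for level in levels] for c in categories]
    return levels, rows


sizes = [2.0, 4.0, 6.0]            # made-up app sizes
pricing = ["Free", "Paid", "Free"]  # made-up categorical feature

scaled_sizes = standardize(sizes)
levels, encoded_pricing = one_hot(pricing)
print(scaled_sizes)
print(levels, encoded_pricing)
```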

NOTE: Running this notebook writes a compressed data file to your google-playstore-apps directory. This file is loaded by the next notebook, 4.Playstore-RegressionModels.ipynb.

4.Playstore-RegressionModels.ipynb

- Testing various regression models to find the most accurate one.
- Classical machine learning models were chosen not only from sklearn but also from other packages that provide more aggressive algorithms, such as boosting.

Warning: On a machine with basic specifications, this notebook may take a long time to run.

5.Playstore-FeatureSelection.ipynb

- Feature selection on the best-fit model; it produced no change in the MAE score.

Warning: On a machine with basic specifications, this notebook may take a long time to run.

6.Playstore-RegressionCatboost.ipynb

- Manual hyperparameter tuning on final model.
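Manual hyperparameter tuning usually means trying a hand-picked set of values and keeping the combination with the lowest validation error. The sketch below shows that loop in a library-free form. The function `manual_search`, the grid values, and the stand-in objective are all hypothetical; `learning_rate` and `depth` are used only as example parameter names, and a real run would train the model inside the evaluation function.

```python
from itertools import product


def manual_search(param_grid, evaluate):
    """Try every combination in param_grid and return (best_params, best_score).

    evaluate(params) must return a score where lower is better,
    for example a validation MSE.
    """
    names = list(param_grid)
    best_params, best_score = None, float("inf")
    for combo in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, combo))
        score = evaluate(params)
        if score < best_score:
            best_params, best_score = params, score
    return best_params, best_score


def toy_validation_error(params):
    """Stand-in objective for illustration; not a real model evaluation."""
    return (params["learning_rate"] - 0.1) ** 2 + (params["depth"] - 6) ** 2


grid = {"learning_rate": [0.03, 0.1, 0.3], "depth": [4, 6, 8]}
best_params, best_score = manual_search(grid, toy_validation_error)
print(best_params, best_score)
```

Because every combination is evaluated, the cost grows with the product of the list lengths, which is one reason to keep a manual grid small.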

Warning: On a machine with basic specifications, this notebook may take a long time to run.

Note: The theoretical basis and a fuller understanding of the project can be found in the final documentation.

Note: Notebooks 4 and 5 are optional.

Contributors

Aditi Mittal
Nithish kumar
Varun

Contact Info

Name: Nithish kumar
Github ID: Nithish1201
Repository Name: Success_prediction_of_app
Email ID: [email protected]
Contact: +91 9360637610
