- Dependencies
- Project Introduction
- Instructions for running the scripts
- Project Structure
- File Descriptions
- Results
- Licensing, Authors, and Acknowledgements
The code should run with no issues using Python 3. Other libraries used in this project are:
- numpy
- pandas
- flask
- nltk
- pickle
- matplotlib
- scikit-learn
- sqlalchemy
The task of this project is to analyze disaster messages from Figure Eight and build a machine learning model that classifies them. The data set contains real messages that were sent during disaster events. A machine learning pipeline is created to categorize these messages so that they can be sent to the appropriate disaster relief agency. The project also includes a web app where an emergency worker can input a new message and get classification results across several categories.
- Run the following commands in the project's root directory to set up your database and model:
  - To run the ETL pipeline that cleans the data and stores it in a database:
    `python data/process_data.py data/disaster_messages.csv data/disaster_categories.csv data/DisasterResponse.db`
  - To run the ML pipeline that trains the classifier and saves it:
    `python models/train_classifier.py data/DisasterResponse.db models/classifier.pkl`
- Run the following command in the app's directory to run the web app:
  `python run.py`
- Go to http://0.0.0.0:3001/
- app folder
  - templates folder
    - go.html
    - master.html
  - run.py
- data folder
  - disaster_categories.csv
  - disaster_messages.csv
  - DisasterResponse.db
  - process_data.py
- models folder
  - classifier.pkl
  - train_classifier.py
- jupyter notebooks folder
  - categories.csv
  - messages.csv
  - ETL Pipeline Preparation.ipynb
  - ML Pipeline Preparation.ipynb
- sample_images folder
- README.md
The app folder contains the files necessary for the functioning of the web app. The templates folder contains two HTML files: go.html is used to render the information about the training data as bar graphs and the classification results across 36 different categories, while master.html is used to render the web page. The run.py file runs the Flask web app.
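For orientation, here is a minimal sketch of what run.py roughly does, assuming standard Flask and the pickled model from the models folder; the route names, template variables, and file paths are illustrative, not a copy of the actual script.

```python
import pickle
from flask import Flask, render_template, request

app = Flask(__name__)

# Load the trained model once at startup (path relative to the app folder)
with open("../models/classifier.pkl", "rb") as f:
    model = pickle.load(f)

@app.route("/")
def index():
    # master.html renders the main page
    return render_template("master.html")

@app.route("/go")
def go():
    # Classify the message typed by the emergency worker and pass the
    # predicted labels for the 36 categories to go.html for rendering
    query = request.args.get("query", "")
    labels = model.predict([query])[0]
    return render_template("go.html", query=query, labels=labels)

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=3001, debug=True)
```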
The data folder contains two CSV files (disaster_messages.csv contains the disaster messages, and disaster_categories.csv contains the 36 different categories into which a disaster message can be classified) and a SQL database file, DisasterResponse.db, which contains the cleaned and processed disaster messages for training the classification model. The process_data.py script merges the two CSV files, cleans the disaster messages, and stores the cleaned/processed messages in the SQL database.
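As a rough illustration of these ETL steps, the sketch below assumes the Figure Eight column layout (a shared id column and a single semicolon-separated categories column with values such as "related-1;request-0;..."); the table name and the exact cleaning rules in the actual script may differ.

```python
import sys
import pandas as pd
from sqlalchemy import create_engine

def etl(messages_csv, categories_csv, db_path):
    # Load and merge the two CSV files on their shared id column
    messages = pd.read_csv(messages_csv)
    categories = pd.read_csv(categories_csv)
    df = messages.merge(categories, on="id")

    # Expand the semicolon-separated categories column into 36 binary
    # columns, e.g. "related-1;request-0" -> related=1, request=0
    cats = df["categories"].str.split(";", expand=True)
    cats.columns = cats.iloc[0].str.rsplit("-", n=1).str[0]
    cats = cats.apply(lambda col: col.str.rsplit("-", n=1).str[1].astype(int))

    # Replace the raw column with the expanded labels and drop duplicates
    df = pd.concat([df.drop(columns="categories"), cats], axis=1)
    df = df.drop_duplicates()

    # Store the cleaned messages in a SQLite database for training
    engine = create_engine(f"sqlite:///{db_path}")
    df.to_sql("DisasterMessages", engine, index=False, if_exists="replace")

if __name__ == "__main__":
    # e.g. python data/process_data.py data/disaster_messages.csv \
    #      data/disaster_categories.csv data/DisasterResponse.db
    etl(sys.argv[1], sys.argv[2], sys.argv[3])
```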
The train_classifier.py script inside the models folder loads the cleaned disaster messages from the SQL database, creates new features (the number of words, the number of characters, and the number of non-stopwords in each message), builds a machine learning pipeline, performs a GridSearchCV to find the best hyperparameters for the classification model, evaluates the trained model on a test set, and then saves the trained model as a pickle file for deployment in the web app. The classifier.pkl file contains this trained model.
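The following is a condensed sketch of that training flow, assuming a TF-IDF text representation and a random forest wrapped in MultiOutputClassifier; the custom count features (words, characters, non-stopwords) and the full hyperparameter grid are omitted for brevity, so the actual script will differ in its details.

```python
import pickle
import pandas as pd
from sqlalchemy import create_engine
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.multioutput import MultiOutputClassifier
from sklearn.pipeline import Pipeline

# Load the cleaned messages produced by process_data.py
engine = create_engine("sqlite:///data/DisasterResponse.db")
df = pd.read_sql_table("DisasterMessages", engine)
X = df["message"]
Y = df.iloc[:, 4:]  # assumes the 36 label columns follow the metadata columns

# Text features feed a multi-output classifier, one output per category
pipeline = Pipeline([
    ("tfidf", TfidfVectorizer(stop_words="english")),
    ("clf", MultiOutputClassifier(RandomForestClassifier())),
])

# Small grid for illustration; the real search space is likely larger
params = {"clf__estimator__n_estimators": [50, 100]}
cv = GridSearchCV(pipeline, param_grid=params, cv=3)

X_train, X_test, Y_train, Y_test = train_test_split(X, Y, test_size=0.2)
cv.fit(X_train, Y_train)
print("test score:", cv.score(X_test, Y_test))

# Persist the best model for the web app
with open("models/classifier.pkl", "wb") as f:
    pickle.dump(cv.best_estimator_, f)
```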
The jupyter notebooks folder contains two Jupyter notebooks. The ETL Pipeline Preparation.ipynb notebook merges the messages and categories CSV files and performs the Extract, Transform, and Load steps on them; the process_data.py script was prepared from this notebook. The ML Pipeline Preparation.ipynb notebook contains the machine learning pipeline that classifies disaster messages into 36 different categories; the train_classifier.py script was prepared from this notebook.
The sample_images folder contains images of the visualizations from the ETL notebook and of the working web app, for quick demonstration in the Results section below.
Some visualizations from this project:
Must give credit to Udacity for the data and the Python 3 notebooks.