This project is my final project submission for the Data Engineering chapter of the Udacity Data Science Nanodegree program.
In this project, we built a web application that classifies messages sent during disasters into one of 36 categories. The application uses a machine learning model trained on a set of pre-labeled real-life examples. After a series of data engineering and cleaning steps, a multiclass-multioutput model is trained and stored. Through the web app, users can interact with the model and classify unseen messages; the app also provides several visualizations of the underlying data.
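As an illustration, the sketch below shows what such a multiclass-multioutput pipeline can look like with scikit-learn. This is a minimal sketch, not the project's exact configuration: the table name "DisasterResponse", the column layout and the TF-IDF + random forest setup are assumptions, and the authoritative configuration lives in models/train_classifier.py.

# Minimal sketch of a multiclass-multioutput text classification pipeline
# (illustrative only -- see models/train_classifier.py for the real configuration).
import pandas as pd
from sqlalchemy import create_engine
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import CountVectorizer, TfidfTransformer
from sklearn.model_selection import train_test_split
from sklearn.multioutput import MultiOutputClassifier
from sklearn.pipeline import Pipeline

# Load the cleaned data produced by the ETL step; "DisasterResponse" is an
# assumed table name -- check process_data.py for the name actually used.
engine = create_engine("sqlite:///data/DisasterResponse.db")
df = pd.read_sql_table("DisasterResponse", engine)
X = df["message"]
Y = df.iloc[:, 4:]  # assumes the 36 binary category columns start at column 5

pipeline = Pipeline([
    ("vect", CountVectorizer()),    # tokenize messages and count words
    ("tfidf", TfidfTransformer()),  # re-weight counts by TF-IDF
    ("clf", MultiOutputClassifier(RandomForestClassifier())),  # one classifier per category
])

X_train, X_test, Y_train, Y_test = train_test_split(X, Y, test_size=0.2, random_state=42)
pipeline.fit(X_train, Y_train)
print(pipeline.score(X_test, Y_test))  # subset accuracy on the held-out messages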
This project addresses one of the most challenging problems in ML and data science today. Following a disaster, millions of messages, tweets and texts are typically generated, overwhelming disaster response organizations at a time when they have the least capacity. It is crucial for them to filter out the most important messages so that they can address the most pressing situations quickly and appropriately, and forward each message to the right response team depending on the topic or type of help that is needed. The provided application allows disaster response teams to analyze, filter and prioritize this vast amount of messages in a quick and automated fashion, so that they can better target their resources at the people in need after a disaster.
run.py - Python script to launch the web application.
Folder: templates - web dependency files (go.html & master.html) required to run the web application.
disaster_messages.csv - real messages sent during disaster events (provided by Figure Eight)
disaster_categories.csv - categories of the messages
process_data.py - ETL pipeline used to load, clean, extract features and store the data in a SQLite database (a short sketch of these steps follows the file list)
ETL Pipeline Preparation.ipynb - Jupyter Notebook used to prepare ETL pipeline
DisasterResponse.db - cleaned data stored in a SQLite database
train_classifier.py - ML pipeline used to load the cleaned data, train the model and save it as a pickle (.pkl) file for later use
classifier.pkl - pickle file containing the trained model
ML Pipeline Preparation.ipynb - Jupyter Notebook used to prepare ML pipeline
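As referenced above for process_data.py, the following is a minimal sketch of the ETL steps (load, merge, split the category string into 36 binary columns, drop duplicates, store in SQLite). Column and table names here are assumptions; the authoritative version is data/process_data.py.

# Sketch of the ETL steps -- illustrative only, see data/process_data.py.
import pandas as pd
from sqlalchemy import create_engine

# Load and merge the two raw CSVs; assumes both share an "id" column.
messages = pd.read_csv("data/disaster_messages.csv")
categories = pd.read_csv("data/disaster_categories.csv")
df = messages.merge(categories, on="id")

# Split the single "categories" string (e.g. "related-1;request-0;...")
# into 36 separate binary columns.
categories = df["categories"].str.split(";", expand=True)
categories.columns = categories.iloc[0].str.rsplit("-", n=1).str[0]
for column in categories.columns:
    categories[column] = categories[column].str.rsplit("-", n=1).str[1].astype(int)

df = pd.concat([df.drop(columns="categories"), categories], axis=1)
df = df.drop_duplicates()

# Persist the cleaned table to SQLite for the ML pipeline to consume.
engine = create_engine("sqlite:///data/DisasterResponse.db")
df.to_sql("DisasterResponse", engine, index=False, if_exists="replace")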
All required libraries are included in the Anaconda distribution.
- Run the following commands in the project's root directory to set up your database and model:
  - To run the ETL pipeline that cleans the data and stores it in the database:
    python data/process_data.py data/disaster_messages.csv data/disaster_categories.csv data/DisasterResponse.db
  - To run the ML pipeline that trains the classifier and saves the model:
    python models/train_classifier.py data/DisasterResponse.db models/classifier.pkl
- Run the following command in the app's directory to run your web app:
  python run.py
- Go to http://0.0.0.0:3001/