In this project, data analysis was carried out to get insights into how to solve the water crisis in a country called 'Maji Ndogo'.
This project was one of the mandatory requirements for success in the Data Analysis scholarship course at the African Leadership Academy (ALX).
The Project was divided into three separate parts, each with a specific objective and goal. The first part was carried out through Microsoft Excel, and the second part was carried out using MySQL Workbench. While the third part was carried out using Microsoft PowerBi.
The data was sourced from the Course providers (ALX), which provides the foundation for my analysis, containing detailed information on site visits, water sources, locations, provinces, survey results, contaminated sources, and employees' information. Through a series of SQL scripts, I explore key questions such as the water sources, the percentage of citizens using shared taps, crime rates, and costs for the project to be completed.
1- Microsoft Excel: First part of the project was to conduct data cleanup, sorting, filtering, and perform equations using Microsoft Excel.
2- MySQL WorkBench: The second part of the project was to upload, clean up, manipulate, and alter the provided database using MySQL Workbench.
3- Powerbi: The third part of the project was to provide a data visualization dashboard for the above data to get meaningful insights that could alleviate the country's crisis using Microsoft Powerbi.
Below are the questions I want to answer in my project (NB: Inside the SQL codes, there are more questions documented. Those were presented during the exam of the course):
* How many visits from employees to the sites of water sources?
* Where are most of the water sources in Naji Ndogo? And what is the most dominant water source?
* What is the percentage of people using taps for water?
* What is the percentage of people with water infrastructure?
* How long is the queue time for each citizen to get water from shared taps, and what is the busiest time of the day?
* What are the names of the two worst-performing employees who visited the fewest sites, and how many sites did the worst-performing employee visit?
* Who are the main victims of the crime during the waiting time?
* What are the highest assaults that victims face?
* In Amanzi, Women face significantly less risk of being crime victims. Why is this the case in Amanzi?
* Can you investigate the mistakes in the site visits record and determine if the mistake was made on purpose or not? If it was made on purpose, will you figure out who the employee is who made this mistake?
* Which towns should we upgrade shared taps first?
* How many UV filters do we have to install in total?
- Create a pie chart to illustrate the total Urban and Rural population split in Maji Ndogo. Use SUM(number_of_people_served), split by location_type
- Create a bar (column) chart showing the total population using the various water sources. Use SUM(number_of_people_served), split by type_of_water_source
- Create a line chart that plots the average queue times for the hour of the day, and add the days of the week as different coloured lines.
- Create a bar chart that shows the average composition of queues for each day of the week. HINT: Plot the average percent_female, average percent_male and average percent_child on the Y-axis and day_of_week on the X-axis.
- Create a scatter plot with number_of_people_served vs. Average of time_in_queue. Make sure to add a filter to this plot to include only data where the visit_count > 1. To use it as a filter condition, add visit_count to the filter pane, and use an advanced filter.
- Plot the different crime types to show the gender disparity that women face. Try to identify the crimes that are affecting women in Maji Ndogo.
- Plot the total number of crimes affecting men, women and children. These are crimes related to people collecting water. What are the patterns you notice?
- Plot the number of crimes for the time of day and create another visual that shows the number of crimes per day of the week for the different types of victims. Do you note any patterns?
- Analyse the number of crimes per province. Which provinces have high crime rates, and which provinces have low crime rates, specifically focusing on women?
- Map of Provinces in Maji Ndogo.
- A pie/doughnut chart of the population split between urban and rural.
- A tree map of the total number of people per source type.
- Column chart showing the total number of each source of water type, for every town.
- Column chart that counts the different sources by type.
Public questions:
- How the project is going. Very simply: Which sources have been completed?
- How much money has been spent so far?
- Where was the money spent?
- What the money was spent on?
- Details about everything in every town.
Decision makers questions:
- How far is the project?
- How much money has been spent so far?
- Where was the money spent?
- What the money was spent on?
- Will we have enough money to complete the project?
- Where can we cut costs?
- I want to see data at the national, provincial and town level.