Tahernezhad/UK-Regional-Insights

UK Regional Insights: A Geospatial Analysis of Economic GVA and Deprivation

Animated Map of UK GVA per Capita (1998-2023)

This repository contains a data analytics project on regional economic inequality in the United Kingdom. It implements an end-to-end ETL pipeline that processes 26 years (1998 to 2023) of geospatial and economic data from the Office for National Statistics (ONS).

The project then uses advanced geospatial statistics, time-series analysis, and interactive visualization to identify economic "hot spots," "cold spots," and long-term growth trends.

🚀 About the Project

The goal of this project is to explore the relationship between economic productivity (measured by Gross Value Added, GVA) and socioeconomic factors (such as population and the Index of Multiple Deprivation, IMD).

It moves beyond a simple static analysis by building a rich, time-series dataset spanning from 1998 to 2023. This allows us to ask deeper questions:

  • Where are the statistically significant clusters of wealth and deprivation?
  • Have these "hot spots" and "cold spots" changed over the last 26 years?
  • What are the underlying growth trajectories of different regions?
  • Which features (deprivation, location) are the best predictors of economic output?

✨ Packages

This project showcases a full-stack data science workflow, from data engineering to visualization.

Python 3.11, Conda, Pandas, GeoPandas, PySAL, Scikit-learn, Folium, Plotly, Streamlit, Jupyter


🏗️ Project Structure

The repository is organized into a modular pipeline, separating data engineering from analysis and the final application.

UK-Regional-Insights/
├─ data_pipeline/
│  └─ transformers.py         # Python script to clean and merge all raw data
├─ notebooks/
│  └─ data_exploration.ipynb  # Jupyter notebook with all analyses
├─ app/
│  └─ streamlit_dashboard.py  # (Future) Code for the interactive web app
├─ models/
│  ├─ inequality_predictor.py     # (Future) Baseline ML models
│  └─ gnn_regional_networks.py    # (Future) GNN model
├─ assets/
│  ├─ animated-map-demo.gif   # Demo GIFs for this README
│  └─ lisa-hotspot-map.png    # Saved plots
├─ data/
│  ├─ input/       # (Ignored by git) Raw .xlsx and .geojson files
│  └─ processed/   # (Ignored by git) The final master GeoPackage
├─ outputs/        # (Ignored by git) Saved .html and .png visualizations
├─ .gitignore          # Ignores all data, output, and cache files
├─ environment.yml     # Reproducible Conda environment
└─ README.md           # You are here!

📊 Core Analyses & Visualizations

The analysis notebook (notebooks/data_exploration.ipynb) performs the analyses below, moving from raw data to actionable insights.

1. Static & Animated Economic Maps

First, a 2D choropleth map (using folium) visualizes the GVA per capita for the most recent year (2023), clearly showing the static economic landscape. This is complemented by an animated time-series GIF (created from a folium.plugins.TimestampedGeoJson map) that shows the dramatic economic changes from 1998 to 2023.

Animated map of GVA

2. Hot Spot & Cold Spot Analysis (LISA)

This analysis uses Local Moran's I (LISA) from the PySAL library to find statistically significant spatial clusters. It clearly identifies the "High-High" (Hot Spot) cluster around London and "Low-Low" (Cold Spot) clusters in former industrial areas and rural regions.

LISA Hot Spot / Cold Spot Map
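To make the statistic concrete, here is a minimal plain-Python sketch of Local Moran's I with row-standardised weights. It is not the PySAL call the notebook uses. The region names, values, and neighbour lists are made up for illustration, and the permutation-based significance test that PySAL performs is left out, so the labels below are raw quadrants rather than statistically significant clusters.

```python
from statistics import fmean


def local_morans_i(values, neighbors):
    """Return {region: (I_i, quadrant)} using row-standardised weights.

    values:    {region: attribute value}, e.g. GVA per capita
    neighbors: {region: [adjacent regions]} (hypothetical contiguity lists;
               every region is assumed to have at least one neighbour)
    """
    mean = fmean(values.values())
    z = {region: v - mean for region, v in values.items()}
    m2 = sum(d * d for d in z.values()) / len(z)  # variance of the attribute

    result = {}
    for region, nbrs in neighbors.items():
        # Spatial lag: mean deviation of the neighbours (row-standardised).
        lag = fmean(z[n] for n in nbrs)
        i_local = z[region] * lag / m2
        if z[region] >= 0:
            quadrant = "High-High" if lag >= 0 else "High-Low"
        else:
            quadrant = "Low-Low" if lag < 0 else "Low-High"
        result[region] = (i_local, quadrant)
    return result


# Six made-up regions in a row: high values at one end, low at the other.
values = {"A": 30, "B": 28, "C": 26, "D": 12, "E": 10, "F": 8}
neighbors = {
    "A": ["B"], "B": ["A", "C"], "C": ["B", "D"],
    "D": ["C", "E"], "E": ["D", "F"], "F": ["E"],
}

result = local_morans_i(values, neighbors)
for region, (i_local, quadrant) in result.items():
    print(f"{region}: I = {i_local:.3f} ({quadrant})")
```

A positive I_i means a region resembles its neighbours (High-High or Low-Low); a negative value marks a spatial outlier (High-Low or Low-High).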

3. Economic "Winners & Losers" (CAGR Analysis)

This analysis moves beyond a static snapshot to identify long-term economic trajectories. It calculates the Compound Annual Growth Rate (CAGR) for GVA per capita for every Local Authority from 1998 to 2023. This metric reveals the "winners" (fastest-growing regions) and "losers" (stagnating or declining regions) over the past quarter-century.

Animated map of GVA Growth Rate
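The standard CAGR formula is CAGR = (end / start)^(1/n) - 1, where n is the number of years elapsed (25 between 1998 and 2023). The sketch below applies it in plain Python; the area names and values are made up for illustration and are not ONS figures, and the notebook's own implementation may differ.

```python
def cagr(start_value: float, end_value: float, years: int) -> float:
    """Compound annual growth rate between two observations `years` apart."""
    if years <= 0:
        raise ValueError("years must be positive")
    if start_value <= 0 or end_value < 0:
        raise ValueError("start must be positive and end non-negative")
    return (end_value / start_value) ** (1 / years) - 1


# Illustrative (made-up) GVA per capita values in GBP; not real ONS data.
gva_per_capita = {
    "Area A": {1998: 10_000.0, 2023: 20_000.0},
    "Area B": {1998: 12_000.0, 2023: 15_000.0},
}

growth = {
    area: cagr(series[1998], series[2023], years=2023 - 1998)
    for area, series in gva_per_capita.items()
}

# Rank from fastest- to slowest-growing ("winners" first).
ranking = sorted(growth, key=growth.get, reverse=True)
for area in ranking:
    print(f"{area}: {growth[area]:.2%}")
```

Because CAGR uses only the first and last observations, it smooths over everything in between, which is why it works as a ranking metric rather than a description of the full trajectory.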

4. Interactive Inequality Analysis (2D & 3D Scatter Plots)

These plotly charts visualize the relationship between GVA per capita, deprivation (IMD rank), and population. The 3D plot shows how the three variables interact, and each point's marker shape encodes its LISA cluster (Hot Spot or Cold Spot), tying the analysis back to the spatial results.

3D scatter plot of GVA, IMD, and Population


🔧 Getting Started

Prerequisites

This project uses Conda to manage its environment and dependencies. You'll need to have Anaconda or Miniconda installed.

Installation

  1. Clone the repository:

    git clone https://github.com/Tahernezhad/UK-Regional-Insights.git
    cd UK-Regional-Insights
  2. Create the Conda environment: The provided environment.yml file installs all the necessary packages.

    conda env create -f environment.yml
  3. Activate the environment:

    conda activate geoml
  4. Download the Data: The raw data files are not included in this repository. Please download them from the official sources and place them in a data/input/ folder (you will need to create this folder).

How to Run

  1. Run the ETL Pipeline: Execute the transformers.py script to process all raw files into a single master GeoPackage.

    python data_pipeline/transformers.py
  2. Run the Analysis Notebook: Launch Jupyter and open the main notebook to see all the analyses and visualizations.

    jupyter notebook notebooks/data_exploration.ipynb

🔮 Future Work

This project provides a robust foundation for predictive modeling. The next steps are:

  • Streamlit Dashboard: Populate the app/streamlit_dashboard.py file to create a fully interactive web application.
  • Baseline ML Model: Build a baseline RandomForest or XGBoost model to predict GVA_per_capita using the features from the exploration notebook.
  • Graph Neural Network (GNN): Implement a GNN (using the models/ directory) to model the spatial network explicitly. The spatial weights matrix from the LISA analysis will serve as the graph's adjacency matrix, allowing the model to learn from neighboring regions.
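As a rough illustration of the GNN idea, the sketch below turns neighbour lists into a dense adjacency matrix and runs one round of neighbour averaging, the simplest form of message passing. It is plain Python with made-up regions and features, assumes every region has at least one neighbour, and is not the planned model; a real implementation would likely use a dedicated GNN library and learned weights.

```python
def adjacency_matrix(neighbors):
    """Dense 0/1 adjacency matrix plus the region order used for its rows."""
    order = sorted(neighbors)
    index = {region: i for i, region in enumerate(order)}
    matrix = [[0] * len(order) for _ in order]
    for region, nbrs in neighbors.items():
        for n in nbrs:
            matrix[index[region]][index[n]] = 1
    return order, matrix


def mean_aggregate(matrix, features):
    """One round of neighbour averaging: row-normalised A times X."""
    n_features = len(features[0])
    out = []
    for row in matrix:
        degree = sum(row)  # number of neighbours; assumed to be non-zero
        out.append([
            sum(a * x[k] for a, x in zip(row, features)) / degree
            for k in range(n_features)
        ])
    return out


# Hypothetical three-region chain A - B - C.
neighbors = {"A": ["B"], "B": ["A", "C"], "C": ["B"]}
order, A = adjacency_matrix(neighbors)

# Made-up feature rows (two features per region), ordered like `order`.
X = [[1.0, 0.0], [4.0, 2.0], [5.0, 4.0]]
H = mean_aggregate(A, X)

for region, row in zip(order, H):
    print(region, row)
```

Each output row is the average of the neighbouring regions' features, which is how information from adjacent areas would reach a region's representation.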

About

A geospatial data analytics project analyzing 26 years of UK economic (GVA) and deprivation (IMD) data. Features an ETL pipeline, spatial statistics, and interactive visualizations.