Abdullah-967/railway-trains-databricks

🚄 Railway Trains: Databricks Performance Pipeline

A comprehensive data engineering project analyzing passenger train performance across the Dutch railway network. This pipeline processes historical stop and service data to generate actionable insights into delays, cancellations, and platform changes.

πŸ—οΈ Project Structure

  • data/: Documentation on data sources and a detailed data dictionary.
  • pipeline_code/: The core Databricks logic, organized into a Medallion architecture.
  • visuals/: Screenshots and a demo video of the final performance dashboard.

βš™οΈ Data Architecture (Medallion)

We use a multi-layered approach to transform raw data into insights:

  1. Bronze (Raw): Raw CSV ingestion with schema evolution and basic sanitization.
  2. Silver (Cleaned): Data typing, cleaning, and enrichment. Includes derived on-time flags (a stop counts as on time when its delay is <= 5 min) and performance classification.
  3. Gold (Business): Optimized dimensional models (fact_stops, dim_station) and daily performance aggregations for reporting.
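
To make the Silver on-time flag and a Gold-style daily aggregate concrete, here is a minimal sketch in plain Python (not the PySpark code in pipeline_code/, so it runs without a cluster). Only the 5-minute threshold comes from this README. The `Stop` record, its field names, the 15-minute band edge, and the class labels are assumptions made for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date
from typing import Optional

ON_TIME_THRESHOLD_MIN = 5  # from the README: on time means delay <= 5 minutes


@dataclass
class Stop:
    # Hypothetical Silver-layer record; field names are assumptions.
    service_date: date
    station_code: str
    arrival_delay_min: Optional[float]  # None when no arrival delay was recorded


def is_on_time(delay_min: Optional[float]) -> Optional[bool]:
    """Return True/False for a known delay, None when the delay is missing."""
    if delay_min is None:
        return None
    return delay_min <= ON_TIME_THRESHOLD_MIN


def classify(delay_min: Optional[float]) -> str:
    """Illustrative performance bands; only the 5-minute cut-off is from the README."""
    if delay_min is None:
        return "unknown"
    if delay_min <= ON_TIME_THRESHOLD_MIN:
        return "on_time"
    if delay_min <= 15:  # assumed band edge
        return "minor_delay"
    return "major_delay"


def daily_on_time_pct(stops):
    """Per day, the percentage of stops with a known delay that were on time."""
    counts = defaultdict(lambda: [0, 0])  # date -> [on_time, total_known]
    for s in stops:
        flag = is_on_time(s.arrival_delay_min)
        if flag is None:
            continue  # skip stops without a recorded delay
        counts[s.service_date][1] += 1
        if flag:
            counts[s.service_date][0] += 1
    return {d: 100.0 * on / total for d, (on, total) in counts.items()}
```

Keeping the missing-delay case as `None` rather than `False` means stops without a recorded delay do not drag the on-time percentage down. Whether the actual pipeline treats them this way is not stated here.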

📊 Insights & Dashboards

The pipeline feeds a dashboard that tracks KPIs like:

  • Arrival/Departure On-Time %
  • Cancellation Rates
  • Platform Change Severity
  • Peak Hour Performance (Morning vs. Evening Rush)
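
As a rough illustration of how two of these KPIs could be derived, the sketch below buckets a timestamp into morning rush, evening rush, or off-peak, and computes a cancellation rate. It is plain Python rather than the dashboard's own queries. The rush-hour windows, the weekend rule, and the function names are assumptions, since this README does not define them.

```python
from datetime import datetime, time

# Assumed rush-hour windows; the README does not define them.
MORNING_RUSH = (time(6, 30), time(9, 0))
EVENING_RUSH = (time(16, 0), time(18, 30))


def peak_bucket(ts: datetime) -> str:
    """Label a timestamp as morning_rush, evening_rush, or off_peak."""
    if ts.weekday() >= 5:  # assumption: weekends count as off-peak
        return "off_peak"
    t = ts.time()
    if MORNING_RUSH[0] <= t < MORNING_RUSH[1]:
        return "morning_rush"
    if EVENING_RUSH[0] <= t < EVENING_RUSH[1]:
        return "evening_rush"
    return "off_peak"


def cancellation_rate(cancelled_flags) -> float:
    """Percentage of stops flagged as cancelled; 0.0 for an empty input."""
    flags = list(cancelled_flags)
    if not flags:
        return 0.0
    return 100.0 * sum(1 for f in flags if f) / len(flags)
```

Grouping on-time percentages by the `peak_bucket` label would give the morning versus evening comparison listed above.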

Project Demo Video

Check out the visuals folder for more breakdowns.

🔗 Data Source

Data is curated from the NS API by Rijden de Treinen. You can find more details in how_to_get_data.md.
