I'm a third-year Computer Science student at the University of Dundee (Data Science & AI), on track for a First Class degree. I build production-grade systems across three tracks: resilient data pipelines and ETL architectures (Airflow, AWS RDS, FastAPI, Star Schema), end-to-end ML and LLM-powered systems (Scikit-learn, PyTorch, RAG, LangChain, LLM-as-judge evaluation), and full-stack cloud applications (React, React Native, Flask, WebSockets, AWS, CI/CD). I care about engineering rigour: schema design, leakage prevention, test coverage, and operational observability are first-class concerns in everything I build.
I'm currently seeking a 12-month industrial placement or summer internship starting in 2026, in Data Engineering, ML/AI Engineering, or Software Engineering.
- BSc (Hons) Computer Science (Data Science & AI) - expected graduation June 2028
- AWS Academy - Machine Learning Foundations
- Microsoft Learn - Foundations of Azure AI: Concepts, Capabilities, and Implementation
- AWS Academy - Cloud Foundations
- Based in Dundee, Scotland - open to relocation
- ATM Platform - migrating ingestion to an event-driven architecture (Kafka/Redis Streams) with async worker horizontal scaling and Prometheus/Grafana real-time observability
- StockLens - migrating from Firebase to a self-hosted FastAPI/PostgreSQL backend for production-grade data handling, adding ML-driven spending analytics and forward-looking portfolio projections
- Unix VCS - implementing branching and three-way merge support, an improved diff algorithm, and benchmarking performance against Git on representative workloads
I treat reliability and observability as non-negotiable from the start, not retrofitted after the fact. That means dead-letter routing before data reaches a database, leakage prevention baked into sklearn Pipelines before any cross-validation fold runs, and CI/CD gates that abort deployment on any test failure rather than hoping nothing breaks in production. I'm drawn to problems where silent failures are the hardest kind to debug (concurrent write contention, foreign key mismatches, retrieval quality in RAG systems), and I build systems that make those failures impossible to miss. I work from requirements before writing code, with functional, non-functional, and acceptance criteria defined first, and I treat API documentation, schema contracts, and test plans as deliverables in their own right, not afterthoughts.
| Automated tests written | 667 (541 DevSync · 48 ATM · 78 StockLens) |
| Star schema dimensions | 9 (W3C ETL Pipeline) |
| Countries of geodata enriched | 78 (W3C ETL Pipeline) |
| Parallel Airflow tasks | 9 fan-out, 8× faster than sequential |
| ML classifiers benchmarked | 7 (Haggis - ~90% accuracy) |
| ATM anomaly types modelled | 7 (synthetic data generator) |
| Security vulnerabilities caught & patched | 5 (ATM platform - incl. JWT privilege escalation) |
| Test tiers (ATM platform) | 5 (unit · integration · API · schema · concurrency) |
Python FastAPI SQLite APScheduler LangChain ChromaDB Ollama React Vite
Industry project for NCR Atleos - production ATM log ingestion, anomaly diagnostics, and a fully local air-gapped RAG diagnostic assistant. Led backend and data engineering end-to-end across a 7-person Agile team. Built 7 custom parsers, dead-letter routing, retry-with-backoff, LLM-as-judge evaluation, and 48 automated tests across 5 tiers; patched 5 critical pre-release defects, including a JWT privilege escalation vulnerability.
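
The interplay between dead-letter routing and retry-with-backoff can be sketched roughly as follows. This is an illustration under assumptions, not the platform's actual code: `ingest`, `parse`, `store`, and `dead_letters` are hypothetical names, and the choice of `ValueError` for malformed input and `ConnectionError` for transient failures is assumed for the example.

```python
import time
from typing import Callable, Iterable, List, Tuple


def ingest(
    lines: Iterable[str],
    parse: Callable[[str], dict],
    store: Callable[[dict], None],
    dead_letters: List[Tuple[str, str]],
    max_attempts: int = 3,
    base_delay: float = 0.1,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Parse and store each line; return how many records were stored.

    Hypothetical sketch: malformed lines are dead-lettered immediately,
    transient store failures are retried with exponential backoff, and a
    record that exhausts its retries is dead-lettered instead of dropped.
    """
    stored = 0
    for line in lines:
        try:
            record = parse(line)
        except ValueError as exc:
            # A malformed line will never succeed on retry: route it aside.
            dead_letters.append((line, f"parse error: {exc}"))
            continue

        for attempt in range(max_attempts):
            try:
                store(record)
                stored += 1
                break
            except ConnectionError as exc:
                if attempt == max_attempts - 1:
                    # Retries exhausted: keep the line for later inspection.
                    dead_letters.append(
                        (line, f"store failed after {max_attempts} attempts: {exc}")
                    )
                else:
                    # Exponential backoff: base_delay, 2x, 4x, ...
                    sleep(base_delay * 2 ** attempt)
    return stored
```

Passing `sleep` as a parameter keeps the backoff testable without real waiting, and every input line ends up either stored or in `dead_letters` with a reason attached.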
React Flask Socket.io GitHub API PostgreSQL AWS Docker Pytest Jest Cypress
Full-stack project tracker with real-time WebSocket collaboration, GitHub OAuth 2.0, and bidirectional Issue/PR linking. ECS Fargate in a custom VPC, RDS in a private subnet, CloudFront frontend - 541 automated tests gate every PR via GitHub Actions with OIDC federation. Deployment aborts on any failure.
Apache Airflow Python PostgreSQL AWS RDS Power BI Power Automate
Fully automated ETL pipeline transforming raw W3C IIS logs into a 9-dimension Star Schema on AWS RDS. 9-way parallel Airflow fan-out makes phase three 8× faster than sequential. Geolocation enrichment across 78 countries, -1 surrogate key fallback ensuring zero dropped records, and Power Automate failure alerting. 7-page Power BI dashboard including P95 response time via DAX.
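
A minimal sketch of the -1 surrogate key fallback idea, assuming each dimension table is pre-seeded with an "Unknown" row whose key is -1. The dimension contents, column names, and the helpers `resolve_key` and `build_fact_rows` are hypothetical and only illustrate the technique; the real pipeline's schema may differ.

```python
from typing import Dict, List, Optional

UNKNOWN_KEY = -1  # assumed key of a pre-seeded "Unknown" row in every dimension


def resolve_key(dimension: Dict[str, int], natural_key: Optional[str]) -> int:
    """Map a natural key to its surrogate key, falling back to UNKNOWN_KEY."""
    if natural_key is None:
        return UNKNOWN_KEY
    return dimension.get(natural_key.strip().lower(), UNKNOWN_KEY)


def build_fact_rows(
    log_rows: List[dict],
    country_dim: Dict[str, int],
    status_dim: Dict[str, int],
) -> List[dict]:
    """Build one fact row per log row; unmatched lookups never drop a row."""
    facts = []
    for row in log_rows:
        facts.append(
            {
                "country_key": resolve_key(country_dim, row.get("country")),
                "status_key": resolve_key(status_dim, row.get("status")),
                "time_taken_ms": row["time_taken_ms"],
            }
        )
    return facts
```

Because an unmatched lookup resolves to the "Unknown" row instead of failing a foreign key constraint, the fact table keeps the same row count as the source, and the unmatched records stay queryable by filtering on the -1 key.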
React Native TypeScript Firebase Node.js Jest Alpha Vantage API
Full-stack mobile app converting physical receipts via OCR into structured financial records, mapping spending to stock tickers via Alpha Vantage, and projecting portfolio performance using ARIMA forecasting and Linear Regression. AES encryption at rest, biometric auth, 78 Jest tests. Currently migrating to FastAPI/PostgreSQL backend.
Python Scikit-learn Flask MovieLens Dataset
Hybrid recommendation engine (collaborative filtering + content-based) on MovieLens. ~78% hit rate, ~0.22 Precision@10. Dependency-injected strategy pattern means recommendation algorithms are fully swappable without touching the API layer. Cold-start problem addressed via hybrid signal combination.
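
The swappable-strategy idea can be sketched as below. This is an assumption-laden illustration, not the project's code: `Recommender`, `PopularityRecommender`, `HybridRecommender`, and `RecommendationService` are hypothetical names, and the weighted blend with `alpha` is one simple way to combine signals, not necessarily the one used in the project.

```python
from typing import Dict, List, Protocol


class Recommender(Protocol):
    """Strategy interface: anything with this method can be injected."""

    def recommend(self, user_id: int, k: int) -> List[int]: ...


class PopularityRecommender:
    """Returns the same globally ranked items for every user."""

    def __init__(self, ranked_items: List[int]) -> None:
        self._ranked = list(ranked_items)

    def recommend(self, user_id: int, k: int) -> List[int]:
        return self._ranked[:k]


class HybridRecommender:
    """Blends collaborative and content-based scores with weight alpha."""

    def __init__(
        self,
        collaborative: Dict[int, Dict[int, float]],
        content: Dict[int, Dict[int, float]],
        alpha: float = 0.5,
    ) -> None:
        self._cf = collaborative
        self._cb = content
        self._alpha = alpha

    def recommend(self, user_id: int, k: int) -> List[int]:
        # A cold-start user has no collaborative scores, so the content
        # signal alone decides the ranking.
        cf = self._cf.get(user_id, {})
        cb = self._cb.get(user_id, {})
        blended = {
            item: self._alpha * cf.get(item, 0.0)
            + (1 - self._alpha) * cb.get(item, 0.0)
            for item in set(cf) | set(cb)
        }
        # Sort by score descending, then item id for a stable order.
        return sorted(blended, key=lambda item: (-blended[item], item))[:k]


class RecommendationService:
    """API-facing layer: depends only on the Recommender interface."""

    def __init__(self, recommender: Recommender) -> None:
        self._recommender = recommender

    def top_k(self, user_id: int, k: int = 10) -> dict:
        return {"user_id": user_id, "items": self._recommender.recommend(user_id, k)}
```

Swapping algorithms then means constructing `RecommendationService` with a different strategy object; the service and any route handlers calling `top_k` stay unchanged.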
Python Scikit-learn XGBoost Pandas Matplotlib Jupyter Notebook
End-to-end ML pipeline: 7 classifiers benchmarked (~90% accuracy), 2 novel ratio-based features engineered that became top-3 predictors, GridSearchCV 5-fold CV, K-Means + DBSCAN clustering, and Linear Regression (R² = 0.756). Strict leakage prevention via sklearn Pipelines with ColumnTransformer throughout.
C++
Modular C++ OOP system - polymorphic vehicle hierarchy, generic repository template pattern, zero raw pointer usage (smart pointers throughout). Levenshtein distance fuzzy search, automated late fee and loyalty rewards logic, file-based persistence, and an 8-scenario E2E test suite.
Bash Unix
Git-like VCS built from scratch in pure Bash - zero external dependencies beyond native Unix utilities. File locking, timestamped versioning, automatic diff generation, filterable activity logs with user attribution, multi-repo support, and compressed archive export. Currently implementing branching, three-way merge, and benchmarking against Git.
Languages
Frontend & Mobile
ML & AI
Data Engineering
Backend & APIs
Cloud & DevOps
Testing



