I learn by building things from scratch, then layering complexity one level at a time.
Each project starts simple and progressively adds real-world concerns — state management, persistence, security boundaries, production patterns. The goal is to understand how things actually work, not just how to use a framework.
Build an AI agent from scratch in 7 levels — no frameworks, just a while loop and an LLM.
Starts with a hardcoded state machine, ends with a Redis-backed production agent with plan-first execution, history compaction, and chat mode. Each level is a complete runnable script; diff them to see exactly what changed.
Python · Gemini API · Redis · Agent Architecture
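To give a feel for the starting point, here is a minimal sketch of a "while loop plus a decision function" agent. It is not code from the project: `fake_llm`, `TOOLS`, and `AgentState` are hypothetical names, and the model call is replaced by a hardcoded policy so the script runs without an API key.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


def add(a: int, b: int) -> int:
    """Hypothetical tool; a real agent would expose search, code execution, etc."""
    return a + b


TOOLS: Dict[str, Callable[..., int]] = {"add": add}


@dataclass
class AgentState:
    goal: str
    history: List[dict] = field(default_factory=list)
    done: bool = False
    answer: Optional[int] = None


def fake_llm(state: AgentState) -> dict:
    """Stand-in for a model call: a hardcoded policy that picks the next action."""
    if not state.history:
        return {"action": "tool", "name": "add", "args": {"a": 2, "b": 3}}
    return {"action": "finish", "answer": state.history[-1]["result"]}


def run_agent(goal: str, max_steps: int = 5) -> AgentState:
    state = AgentState(goal=goal)
    steps = 0
    # The whole "framework": decide, act, record, repeat until done or out of budget.
    while not state.done and steps < max_steps:
        decision = fake_llm(state)
        if decision["action"] == "tool":
            result = TOOLS[decision["name"]](**decision["args"])
            state.history.append({"tool": decision["name"], "result": result})
        else:
            state.answer = decision["answer"]
            state.done = True
        steps += 1
    return state


if __name__ == "__main__":
    print(run_agent("add 2 and 3").answer)
```

Swapping `fake_llm` for a real model call is the only change needed to move from a state machine to an LLM-driven loop; the `max_steps` cap is there so a confused model cannot loop forever.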
Two projects exploring Pydantic AI from different angles:
- SQL Safety Assistant: learn the framework across 8 levels (0–7), covering dependency injection, human-in-the-loop approval, cost guardrails, FastAPI, multi-turn sessions, Redis persistence, and multi-agent escalation.
- HR Pipeline Demo: three runnable demos of production patterns, namely DI as a security boundary, loop-level control (audit trails, mutation caps, replan loops), and a Chainlit UI with real-time approval flows.
Python · Pydantic AI · FastAPI · Chainlit · DuckDB · Redis
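The idea of dependency injection as a security boundary, combined with human approval for mutations, can be sketched in plain Python. This deliberately does not use Pydantic AI's API; `Deps`, `run_sql_tool`, and the prefix-based read-only check are illustrative assumptions, and a real implementation would parse the SQL rather than inspect its first keyword.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet


@dataclass(frozen=True)
class Deps:
    """Built by the application and injected into tools; the model never sets these."""
    allowed_tables: FrozenSet[str]
    approve: Callable[[str], bool]  # human-in-the-loop hook (hypothetical)


def is_read_only(sql: str) -> bool:
    # Naive check for illustration only: treats anything starting with SELECT as safe.
    return sql.strip().lower().startswith("select")


def run_sql_tool(deps: Deps, table: str, sql: str) -> str:
    # Boundary 1: the allow-list comes from injected deps, not from model output.
    if table not in deps.allowed_tables:
        return f"denied: table {table!r} is outside the injected allow-list"
    # Boundary 2: anything that is not read-only needs explicit human approval.
    if not is_read_only(sql) and not deps.approve(sql):
        return "rejected: a human declined this mutation"
    # A real tool would execute the statement here; this sketch only reports it.
    return f"would execute: {sql}"


if __name__ == "__main__":
    deps = Deps(frozenset({"orders"}), approve=lambda sql: False)
    print(run_sql_tool(deps, "users", "SELECT * FROM users"))
    print(run_sql_tool(deps, "orders", "DELETE FROM orders"))
```

Because the tool only ever sees what the application injected, a prompt-injected request for a table outside the allow-list fails at the boundary regardless of what the model outputs.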
End-to-end data platform built around a husbando gacha game. CDC streaming from PostgreSQL through Debezium → Pub/Sub → Apache Beam into a DuckDB/BigQuery warehouse, then dbt transforms raw events into a Kimball star schema (Bronze → Silver → Gold). Full Docker Compose stack — no cloud account needed.
Python · Apache Beam · Debezium · PostgreSQL · DuckDB · BigQuery · Pub/Sub · dbt · Docker
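As a rough illustration of the first hop in that pipeline, the sketch below flattens a Debezium-style change event into a raw (Bronze) row. The envelope fields `op`, `before`, `after`, `ts_ms`, and `source` follow Debezium's change event format; the function name, the `_op`/`_changed_at`/`_source_table` metadata columns, and the `pulls` table with its columns are assumptions made for the example, not the project's actual schema.

```python
from datetime import datetime, timezone
from typing import Any, Dict, Optional

# Debezium operation codes: create, update, delete, snapshot read.
OP_NAMES = {"c": "insert", "u": "update", "d": "delete", "r": "snapshot"}


def flatten_change_event(event: Dict[str, Any]) -> Optional[Dict[str, Any]]:
    """Turn one change event into a flat row with change metadata attached."""
    # With schemas enabled the envelope sits under "payload"; otherwise it is top level.
    payload = event.get("payload", event)
    op = payload.get("op")
    if op not in OP_NAMES:
        return None  # skip anything that is not a recognised row change
    # Deletes carry the last known row state in "before"; other ops use "after".
    row = payload["before"] if op == "d" else payload["after"]
    changed_at = datetime.fromtimestamp(payload["ts_ms"] / 1000, tz=timezone.utc)
    return {
        **row,
        "_op": OP_NAMES[op],
        "_changed_at": changed_at.isoformat(),
        "_source_table": payload.get("source", {}).get("table"),
    }


if __name__ == "__main__":
    event = {
        "payload": {
            "op": "c",
            "before": None,
            "after": {"pull_id": 1, "rarity": "SSR"},  # hypothetical columns
            "ts_ms": 0,
            "source": {"table": "pulls"},
        }
    }
    print(flatten_change_event(event))
```

Keeping the operation type and change timestamp on every raw row is what lets downstream models rebuild current state or history without going back to the source database.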
Notebooks from 2017–2020 — older and less polished, kept for authenticity.
- Statistics — Linear regression from first principles (R², F-stat, QQ plots) and Bayesian linear regression with PyMC3. From my MSc in Statistics.
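For a sense of what "from first principles" means here, this is a small standard-library sketch of simple (one-predictor) least squares with R² and the F-statistic. It is written for this page rather than taken from the notebooks, and the function name and sample data are made up.

```python
from typing import Dict, Sequence


def simple_ols(x: Sequence[float], y: Sequence[float]) -> Dict[str, float]:
    """Fit y = intercept + slope * x by least squares and report R^2 and F."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar

    fitted = [intercept + slope * xi for xi in x]
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))  # unexplained variation
    ss_tot = sum((yi - y_bar) ** 2 for yi in y)                # total variation
    r2 = 1 - ss_res / ss_tot
    # One predictor: F = (explained SS / 1) / (residual SS / (n - 2)).
    # Note this divides by zero if the fit is exact (ss_res == 0).
    f_stat = (ss_tot - ss_res) / (ss_res / (n - 2))
    return {"slope": slope, "intercept": intercept, "r2": r2, "f_stat": f_stat}


if __name__ == "__main__":
    result = simple_ols([1, 2, 3, 4, 5], [2.1, 3.9, 6.2, 7.8, 10.1])
    print(result)
```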
Utilities for extracting content from technical books.
- extract_text.py: PDF text extractor using pdfplumber, used to pull pages from The Data Warehouse Toolkit (Kimball).
Python · pdfplumber
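A page-range extractor along these lines might look like the sketch below. It is not the contents of `extract_text.py`; the function names and the 1-based inclusive range convention are assumptions. It relies only on `pdfplumber.open`, `pdf.pages`, and `page.extract_text()`.

```python
from typing import List


def page_indices(first: int, last: int, total: int) -> List[int]:
    """Convert a 1-based inclusive page range into 0-based indices, clamped to the document."""
    first = max(first, 1)
    last = min(last, total)
    return list(range(first - 1, last))


def extract_pages(pdf_path: str, first: int, last: int) -> str:
    """Return the text of pages first..last (1-based, inclusive), separated by blank lines."""
    # Third-party dependency, imported here so page_indices works without it installed.
    import pdfplumber

    with pdfplumber.open(pdf_path) as pdf:
        wanted = page_indices(first, last, len(pdf.pages))
        # extract_text() can return nothing for image-only pages, so fall back to "".
        texts = [pdf.pages[i].extract_text() or "" for i in wanted]
    return "\n\n".join(texts)


if __name__ == "__main__":
    import sys

    # Usage: python this_script.py book.pdf 10 12
    path, start, end = sys.argv[1], int(sys.argv[2]), int(sys.argv[3])
    print(extract_pages(path, start, end))
```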
Senior Data Engineer with 8+ years of experience in real-time data platforms, LLMOps, and multi-region cloud architecture on GCP. I build and scale production systems — CDC pipelines, AI agent infrastructure, analytics platforms — while mentoring engineers and driving cross-functional adoption.