A webapp to explore Citi Bike system data. Composed of four services:
- pipeline: a data pipeline that extracts historical data from the tripdata bucket, transforms the data into Parquet, and uploads it to S3
- clickhouse: a ClickHouse server that reads the Parquet data and further normalizes it for querying
- map: a Next.js app that exposes an interface to explore the data (including serverless API routes)
- mapdata: provisions AWS infrastructure and Lambda implementations for fetching static data that map depends on
There are two compose files: docker-compose.etl.yml and docker-compose.local.yml. The former runs all three services, while the latter skips the pipeline (and ClickHouse initialization, if it has already been initialized).
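
As a rough sketch, either compose file can be started by passing it to Docker Compose with the -f flag. This assumes Docker Compose v2 (the `docker compose` subcommand) and that the commands are run from the directory containing the compose files. Any environment variables or AWS credentials the services need are not shown here.

```shell
# Full run: starts every service defined in the ETL compose file,
# including the data pipeline.
docker compose -f docker-compose.etl.yml up

# Local run: skips the pipeline, so this is the one to use once the
# Parquet data has already been produced.
docker compose -f docker-compose.local.yml up
```

On older installations that only have the standalone binary, `docker-compose -f <file> up` is the equivalent invocation.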