Skip to content

tsargent/buoy-data-project

Repository files navigation

Buoy Sonification Project

CI

Transform real-time and historical NOAA buoy observations into reliable, well-structured APIs and derived sonification events for exploratory auditory analytics.

ChatGPT Image Nov 14, 2025, 11_40_20 PM

πŸ—οΈ Architecture

This is a TypeScript monorepo using pnpm workspaces with a three-layer architecture:

Layer Overview

β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚            Infrastructure Layer (Docker)             β”‚
β”‚  PostgreSQL (database) + Redis (message bus)        β”‚
β”‚  Started once: docker compose up -d                  β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
                      ↕ (used by)
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚         Application Layer (Node.js + TypeScript)     β”‚
β”‚  β€’ Server (Fastify API + SSE streaming)             β”‚
β”‚  β€’ Worker (NDBC data fetcher + Redis publisher)     β”‚
β”‚  Started separately: pnpm dev (with hot reload)     β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
                      ↕ (serves)
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚              Client Layer (Browsers/Apps)            β”‚
β”‚  EventSource API consuming real-time observations    β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

Why separate layers?

  • Docker services (PostgreSQL, Redis) are infrastructure - start once, rarely restart
  • Node.js apps (Server, Worker) are your code - restart frequently during development
  • Fast iteration: Hot reload without rebuilding Docker images

Project Structure

apps/
  server/     - Fastify REST API + SSE streaming endpoint
  worker/     - BullMQ job processor + NDBC data fetcher
  web-demo/   - (Future) Web audio client demo

packages/
  shared/     - Shared types and schemas

docs/
  PRD.md                  - Product requirements
  AUDIO_CLIENTS.md        - Audio integration guide
  SSE_IMPLEMENTATION.md   - Real-time streaming architecture

Tech Stack:

  • Runtime: Node.js 20+ with TypeScript
  • Server: Fastify 5 + Prisma ORM
  • Queue: BullMQ + Redis
  • Database: PostgreSQL
  • Streaming: Server-Sent Events (SSE) + Redis Pub/Sub
  • Testing: Vitest
  • Linting: ESLint + Prettier

πŸ“‹ Prerequisites

  • Node.js 20 or later
  • pnpm 10+ (npm install -g pnpm)
  • Docker & Docker Compose (for local Postgres + Redis)

πŸš€ Quick Start

1. Clone and Install

git clone https://github.com/tsargent/buoy-data-project.git
cd buoy-sonification
pnpm install

2. Start Infrastructure Services

# Start PostgreSQL + Redis (infrastructure layer only)
docker compose up -d

# Verify services are running
docker compose ps

Note: docker compose up only starts database and Redis - NOT your application. The server and worker are Node.js apps started separately (next steps) for fast development iteration.

3. Configure Environment

# Server environment
cp apps/server/.env.example apps/server/.env

# Worker environment
cp apps/worker/.env.example apps/worker/.env

# Default configuration works out of the box with Docker services
# Both use: postgresql://postgres:postgres@localhost:5432/buoys

4. Run Database Migrations

pnpm -F @app/server prisma:generate
pnpm -F @app/server prisma:migrate

5. Seed Database with Stations

pnpm -F @app/server prisma:seed

This seeds 5 active NOAA buoy stations:

  • 44009: Delaware Bay
  • 44013: Boston
  • 46022: Eel River, CA
  • 46050: Stonewall Bank, OR
  • 42001: Mid-Gulf

6. Start Application Servers

# Terminal 1: Start API server (application layer)
pnpm -F @app/server dev

# Terminal 2: Start worker (application layer)
pnpm -F worker dev

Why separate processes?

  • Server = Fastify API + SSE streaming endpoint
  • Worker = Background job that fetches NDBC data and publishes to Redis
  • Both are Node.js apps (not Docker) for hot reload during development
  • Both connect to Docker infrastructure (PostgreSQL + Redis)

The API will be available at http://localhost:3000.

Worker will automatically:

  • Fetch latest observations from all active stations
  • Parse and validate NOAA data
  • Store observations in Postgres (handling missing sensors gracefully)
  • Publish new observations to Redis (for real-time streaming)
  • Ingest ~6,000-7,000 observations per station per run

Verify it's working:

# Check stations
curl http://localhost:3000/v1/stations | jq '.data | length'

# Wait ~10 seconds for first ingestion, then check observations
curl "http://localhost:3000/v1/observations/by-station/44009?limit=5" | jq '.meta.total'

# You should see thousands of observations per station!

πŸ§ͺ Testing

# Run all tests
pnpm test

# Run tests in watch mode
pnpm test

# Run tests with UI
pnpm test:ui

# Run server tests only
pnpm -F @app/server test

Note: Integration tests expect a test database or will gracefully handle DB unavailability. For full integration testing, ensure Docker services are running.

πŸ“š Available Scripts

Root Level

  • pnpm build - Build all packages
  • pnpm test - Run tests in watch mode
  • pnpm test:run - Run tests once
  • pnpm lint - Lint all code
  • pnpm format - Format all code with Prettier

Server (apps/server)

  • pnpm dev - Start dev server with hot reload
  • pnpm build - Build for production
  • pnpm start - Start production server
  • pnpm test - Run server tests
  • pnpm prisma:generate - Generate Prisma client
  • pnpm prisma:migrate - Run database migrations
  • pnpm prisma:studio - Open Prisma Studio GUI
  • pnpm prisma:seed - Seed database with sample NOAA buoy stations

Worker (apps/worker)

  • pnpm dev - Start worker in dev mode (fetches real NOAA data every 5 min)

πŸ”Œ API Endpoints

Base URL: http://localhost:3000
API Version: All API endpoints are versioned under /v1

Try it now:

# List all stations
curl http://localhost:3000/v1/stations | jq .

# Get latest observations from Delaware Bay buoy
curl "http://localhost:3000/v1/observations/by-station/44009?limit=5" | jq .

Health

  • GET /health - Service health check (no versioning)
  • GET /metrics - Prometheus metrics endpoint (no versioning)

Stations

  • GET /v1/stations - List all active stations

    • Query params:

      • page (optional): Page number, default 1
      • limit (optional): Results per page, default 100, max 500
    • Response:

      {
        "data": [ ...stations... ],
        "meta": {
          "page": 1,
          "limit": 100,
          "total": 250
        }
      }
  • GET /v1/stations/:id - Get station by ID

Observations

  • GET /v1/observations/by-station/:stationId - Get observations for a station
    • Query params:

      • page (optional): Page number, default 1
      • limit (optional): Results per page, default 100, max 500
      • since (optional): ISO 8601 date, filter observations after this time
    • Response:

      {
        "data": [ ...observations... ],
        "meta": {
          "page": 1,
          "limit": 100,
          "total": 1500
        }
      }

Error Responses

All errors follow a consistent shape per the project constitution:

{
  "error": {
    "code": "VALIDATION_ERROR",
    "message": "Invalid query parameters",
    "details": { ... }
  }
}

Error codes: NOT_FOUND, VALIDATION_ERROR, INTERNAL_ERROR, UNAUTHORIZED, FORBIDDEN, BAD_REQUEST

Rate Limiting

All API endpoints are rate-limited to prevent abuse:

  • Default: 100 requests per minute
  • Observations endpoint: 60 requests per minute (stricter due to larger payloads)

Rate limit information is included in response headers:

  • x-ratelimit-limit: Total requests allowed in time window
  • x-ratelimit-remaining: Requests remaining in current window
  • x-ratelimit-reset: Timestamp when limit resets

When rate limit is exceeded, the API returns 429 Too Many Requests.

πŸ—„οΈ Database Schema

Station

Field Type Description
id String Station ID (e.g. "44009")
name String Station name
lat Float Latitude
lon Float Longitude
source String Data source (default "NDBC")
isActive Boolean Active status
createdAt DateTime Record creation time
updatedAt DateTime Last update time

Observation

Field Type Description
id String Unique observation ID (cuid)
stationId String Foreign key to Station
observedAt DateTime Observation timestamp
waveHeightM Float? Wave height in meters
windSpeedMps Float? Wind speed in m/s
windDirDeg Int? Wind direction in degrees
waterTempC Float? Water temperature in Celsius
pressureHpa Float? Atmospheric pressure in hPa
createdAt DateTime Record creation time

🎡 Audio Integration

See docs/AUDIO_CLIENTS.md for detailed guidance on integrating with:

  • Web Audio API / Tone.js
  • Max/MSP, Pure Data
  • SuperCollider, ChucK
  • TouchDesigner, VCV Rack
  • Unity, Unreal Engine

Transports supported: SSE (default), WebSocket (future), OSC, MIDI, MQTT

πŸ“– Documentation

πŸ› οΈ Development Workflow

Adding a New Feature

  1. Write failing tests first (TDD per constitution)
  2. Implement feature
  3. Ensure tests pass: pnpm test:run
  4. Lint and format: pnpm lint && pnpm format
  5. Verify types compile: pnpm build

Database Changes

  1. Edit apps/server/prisma/schema.prisma
  2. Run migration: pnpm -F @app/server prisma:migrate
  3. Regenerate client: pnpm -F @app/server prisma:generate

Debugging

  • Server logs: Structured JSON via pino (configured in Fastify)
  • Database queries: Set DEBUG=prisma:* or use Prisma Studio (pnpm -F @app/server prisma:studio)
  • Worker jobs: Check Redis with docker exec -it buoy-sonification-redis-1 redis-cli
  • View ingested data: curl "http://localhost:3000/v1/observations/by-station/44009?limit=10" | jq .

πŸ›οΈ Project Principles

This project adheres to strict engineering principles defined in our Constitution:

  • Test-First Delivery (non-negotiable): Red β†’ Green β†’ Refactor
  • Type-Centric Contracts: Explicit TypeScript types, runtime validation
  • Observability by Design: Structured logging, metrics, error semantics
  • Performance Awareness: No N+1 queries, pagination mandatory
  • Security Baselines: Input validation, secrets in env vars, PII redaction

See Engineering Principles for detailed guidelines.

🚧 Current Status

Phase: Production-ready (v0.1.0) - Phase 3 Complete βœ…

Completed:

  • βœ… Test framework (Vitest) - 30 passing tests
  • βœ… Input validation (Zod schemas)
  • βœ… Error handling (standardized shapes)
  • βœ… API routes with pagination (stations, observations)
  • βœ… Database schema (Prisma ORM)
  • βœ… Docker Compose setup (Postgres + Redis)
  • βœ… CI/CD pipeline (GitHub Actions)
  • βœ… Metrics endpoint (Prometheus format)
  • βœ… Worker architecture (BullMQ + Redis)
  • βœ… Real data ingestion - Worker fetches live NOAA buoy data every 5 minutes
  • βœ… NDBC parser - Handles missing sensors, NaN values, and data validation
  • βœ… API versioning (/v1 prefix)
  • βœ… Rate limiting (100 req/min global, 60 req/min observations)
  • βœ… Security hardening (Helmet, CORS, JWT validation, PII redaction)
  • βœ… ADR documentation (Prisma, BullMQ, SSE)
  • βœ… Database seeding (5 active NOAA buoy stations)
  • βœ… Constitution Compliance: 100%

Live Data Available:

  • 🌊 30,000+ real observations ingested from 5 NOAA buoys
  • πŸ“‘ Stations: Delaware Bay, Boston, Eel River, Stonewall Bank, Mid-Gulf
  • πŸ“Š Data includes: wave height, wind speed/direction, water temp, pressure
  • ♻️ Auto-refresh every 5 minutes with latest observations

Planned (Phase 4+):

  • πŸ“… Real-time SSE stream endpoint
  • πŸ“… E2E smoke tests
  • πŸ“… Web demo client
  • πŸ“… OSC bridge for audio tools
  • πŸ“… Production deployment

See Analysis Report for detailed technical assessment.

🀝 Contributing

  1. Follow the test-first mandate: write failing tests before implementation
  2. Ensure pnpm lint and pnpm format pass
  3. Keep PRs focused (< ~400 LOC net change)
  4. Reference constitution principles in PR descriptions
  5. Add ADR for architectural decisions (docs/adr/)

πŸ“ License

ISC

πŸ”— Links


Onboarding Time Target: < 60 minutes from clone to first API request (per constitution 2.9)

Questions? See docs or open an issue.

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Contributors 2

  •  
  •