A production-grade two-stream real-time data pipeline ingesting live equity trades and news headlines, processing 15,000+ events/second with sub-100ms end-to-end latency, exactly-once semantics, and ML-powered anomaly detection.
| Link | URL |
|---|---|
| Website | tradepulse.nikhilgiridharan.com |
| Medium | https://medium.com/@nikhilgiridharan/building-tradepulse-a-production-grade-real-time-market-data-pipeline-2dbf5be6dc10 |
| Video | https://www.youtube.com/watch?v=BupQapP_J0k |
```
┌────────────────────────────────────────────────────────────────────┐
│ STREAM 1 – EQUITY DATA (10,000+ events/sec)                        │
│                                                                    │
│ Polygon.io ──▶ Kafka ────▶ Validator ──▶ Faust ───▶ DynamoDB       │
│ WebSocket      market      Pydantic      Stream     Hot store      │
│ Live trades    .trades     + DLQ         engine     48hr TTL       │
│                               │             │           │          │
│                               ▼             ▼           ▼          │
│                            SQS DLQ      Anomaly ML   S3 Parquet    │
│                                         Feat Store   Cold store    │
│                                                                    │
│ STREAM 2 – NEWS DATA            ◀────────────────                  │
│                                  60s temporal join                 │
│ Finnhub ──▶ Kafka ──▶ VADER ────▶ Correlator ──▶ DynamoDB          │
│ REST poll   market    Sentiment   News+market    market_sentiment  │
│ 60s poll    .news     <1ms        signal join                      │
│                                                                    │
│ FastAPI ──▶ Custom Dashboard ──▶ tradepulse.nikhilgiridharan.com   │
└────────────────────────────────────────────────────────────────────┘
```
Financial data has zero tolerance for duplicates. A replayed trade processed twice corrupts every downstream aggregation. TradePulse enforces exactly-once at three layers simultaneously:
- **Layer 1 – Kafka producer:** `enable.idempotence=True`, so the broker deduplicates via sequence numbers
- **Layer 2 – Faust processing:** `processing_guarantee='exactly_once'`, so processing and the offset commit are atomic
- **Layer 3 – DynamoDB write:** `ConditionExpression: attribute_not_exists`, which rejects duplicate items on replay
If Faust crashes after writing to DynamoDB but before committing the Kafka offset, the event replays on restart, but the conditional write silently rejects the duplicate. Exactly-once is guaranteed end to end.
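The Layer 3 replay guard can be sketched as follows. This is a minimal illustration using an in-memory dict as a stand-in for DynamoDB (class and function names here are illustrative, not the repo's); with boto3, the same idea is `table.put_item(Item=item, ConditionExpression="attribute_not_exists(event_id)")`, which raises `ConditionalCheckFailedException` on a duplicate.

```python
class DuplicateEvent(Exception):
    """Raised when a conditional write sees an existing item."""

class ConditionalStore:
    """In-memory stand-in for a DynamoDB table with a conditional write."""
    def __init__(self):
        self._items = {}

    def put_if_absent(self, event_id, item):
        # Mirrors attribute_not_exists: the first write wins, replays are rejected.
        if event_id in self._items:
            raise DuplicateEvent(event_id)
        self._items[event_id] = item

def write_trade(store, trade):
    # Called by the stream processor; safe to call again after a crash/replay.
    try:
        store.put_if_absent(trade["event_id"], trade)
        return "written"
    except DuplicateEvent:
        return "duplicate_ignored"

store = ConditionalStore()
trade = {"event_id": "AAPL-1700000000-42", "price": 189.5}
first = write_trade(store, trade)   # "written"
replay = write_trade(store, trade)  # "duplicate_ignored"
```

The key property: `write_trade` is idempotent, so a replayed offset never double-counts a trade.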
Using the bare ticker as the partition key routes all AAPL writes to one DynamoDB partition. At market open (~800 writes/sec for AAPL alone), this exceeds DynamoDB's per-partition throughput limit and causes throttling, which is exactly how the February 3rd incident happened.
```
Naive key "AAPL":                  Shard key "AAPL#3":
All writes → 1 partition           shard = hash(ticker + timestamp) % 8
800 writes/sec → THROTTLE          8 partitions × 100 writes/sec → HEALTHY
```
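An illustrative shard-key computation (the repo's `dynamo_writer.py` may differ in details). Appending a deterministic shard suffix spreads one hot ticker's writes across 8 partitions; reads stay cheap because a reader can query all 8 keys (`AAPL#0`..`AAPL#7`) and merge.

```python
import zlib

NUM_SHARDS = 8

def shard_key(ticker: str, timestamp_ms: int) -> str:
    # crc32 rather than Python's hash() so the shard is stable across processes.
    shard = zlib.crc32(f"{ticker}:{timestamp_ms}".encode()) % NUM_SHARDS
    return f"{ticker}#{shard}"

# Writes within the same millisecond land on the same shard; across a burst,
# timestamps vary, so load spreads roughly evenly over all 8 partitions.
key = shard_key("AAPL", 1_700_000_000_000)
```

The modulus keeps the fan-out bounded: 8 shards is enough headroom for ~800 writes/sec against a ~1,000 writes/sec per-partition limit, without making reads prohibitively wide.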
Failed messages are never dropped silently. After 3 retries, messages route to SQS with full context (original payload, error reason, Kafka offset, retry count). A DLQ consumer retries every 15 minutes. After 5 DLQ retries, messages archive to s3://tradepulse-data/dead-letters/ for manual inspection.
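A minimal sketch of the DLQ envelope and retry decision described above (field and function names are assumptions, not the repo's schema). In production the envelope would be sent to SQS via boto3's `sqs.send_message`; here we only build it.

```python
import json
import time

MAX_RETRIES = 3  # in-process retries before routing to the DLQ

def build_dlq_message(payload, error, topic, partition, offset, retry_count):
    # Full context travels with the failure, so triage needs no log archaeology.
    return {
        "original_payload": payload,
        "error_reason": str(error),
        "kafka": {"topic": topic, "partition": partition, "offset": offset},
        "retry_count": retry_count,
        "failed_at": time.time(),
    }

def handle_failure(payload, error, topic, partition, offset, retry_count):
    # Retry in-process first; only exhausted messages reach SQS.
    if retry_count < MAX_RETRIES:
        return ("retry", retry_count + 1)
    body = json.dumps(build_dlq_message(
        payload, error, topic, partition, offset, retry_count))
    return ("dlq", body)
```

After the SQS-side retries are also exhausted, the same envelope (plus the DLQ retry count) is what lands in the S3 dead-letters prefix.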
When DynamoDB write latency exceeds 100ms for 3 consecutive writes, Faust pauses consumption for 500ms. Without backpressure, slow downstream writes let the in-memory buffer grow without bound, eventually causing an OOM kill or a crash. Backpressure trades a controlled increase in Kafka consumer lag (safe, since Kafka holds the data durably) for stable memory usage.
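The pause decision can be isolated from the async machinery and sketched on its own (thresholds from the text; the class and method names are illustrative). The real agent would `await asyncio.sleep(0.5)` and resume the Faust consumer afterward.

```python
LATENCY_THRESHOLD_MS = 100  # per-write DynamoDB latency budget
CONSECUTIVE_LIMIT = 3       # slow writes in a row before pausing
PAUSE_MS = 500              # how long to stop consuming

class BackpressureGate:
    def __init__(self):
        self.consecutive_slow = 0

    def record_write(self, latency_ms: float) -> int:
        """Return how long to pause (ms); 0 means keep consuming."""
        if latency_ms > LATENCY_THRESHOLD_MS:
            self.consecutive_slow += 1
        else:
            self.consecutive_slow = 0  # any fast write resets the streak
        if self.consecutive_slow >= CONSECUTIVE_LIMIT:
            self.consecutive_slow = 0
            return PAUSE_MS
        return 0

gate = BackpressureGate()
# 120, 130 are slow, 95 resets the streak, then 110, 140, 150 trigger a pause.
pauses = [gate.record_write(ms) for ms in (120, 130, 95, 110, 140, 150)]
```

Requiring three consecutive slow writes (rather than one) keeps a single latency blip from stalling the pipeline.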
A news article alone is a weak signal. A volume spike alone is ambiguous. When strong-sentiment news arrives within 60 seconds of a volume z-score spike (>2.0), TradePulse surfaces a correlated market event: the compound signal that quantitative traders call "news alpha."
```
09:31:42  NVDA volume z-score: 4.7
09:31:55  Finnhub: "NVIDIA Blackwell GPU shipments accelerate ahead of schedule"
09:31:55  VADER sentiment: +0.891 (strongly positive)
09:31:55  Correlation: STRONG (Δt = 13 seconds)
```
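The 60-second temporal join can be sketched as a pure function (the z-score threshold and window come from the text; the function name and the strong-sentiment cutoff of 0.5 are assumptions).

```python
Z_SCORE_THRESHOLD = 2.0
STRONG_SENTIMENT = 0.5    # assumed VADER compound cutoff for "strong"
JOIN_WINDOW_SECONDS = 60

def correlate(spike_ts, z_score, news_ts, sentiment):
    """Return a correlated-event record, or None if the signals don't compound."""
    delta_t = abs(news_ts - spike_ts)
    if (z_score > Z_SCORE_THRESHOLD
            and abs(sentiment) >= STRONG_SENTIMENT
            and delta_t <= JOIN_WINDOW_SECONDS):
        return {"correlation": "STRONG", "delta_t": delta_t}
    return None

# The NVDA example above: spike at 09:31:42, news 13 seconds later.
event = correlate(spike_ts=42, z_score=4.7, news_ts=55, sentiment=0.891)
```

Both legs are required: strong news outside the window, or a spike without strong sentiment, yields no event.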
Isolation Forest runs inside the Faust agent at 0.3ms per event. The model trains on the last 1,000 events per ticker and retrains every 500 new events, because market conditions change throughout the day and a static model trained at open would be badly miscalibrated by close.
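The rolling-retrain schedule (window size and cadence from the text) can be sketched separately from the model itself. The real agent fits a scikit-learn `IsolationForest`; here the model is a pluggable `fit_fn` callback so the scheduling logic stands alone, and all class names are illustrative.

```python
from collections import defaultdict, deque

WINDOW = 1000        # train on the last 1,000 events per ticker
RETRAIN_EVERY = 500  # refit after every 500 new events

class RollingRetrainer:
    def __init__(self, fit_fn):
        self.fit_fn = fit_fn  # e.g. lambda rows: IsolationForest().fit(rows)
        self.windows = defaultdict(lambda: deque(maxlen=WINDOW))
        self.since_fit = defaultdict(int)
        self.models = {}

    def observe(self, ticker, features):
        window = self.windows[ticker]
        window.append(features)
        self.since_fit[ticker] += 1
        # Refit on the rolling window so the model tracks intraday regime shifts.
        if self.models.get(ticker) is None or self.since_fit[ticker] >= RETRAIN_EVERY:
            self.models[ticker] = self.fit_fn(list(window))
            self.since_fit[ticker] = 0
        return self.models[ticker]

# Record each refit's training-window size with a stub fit_fn.
fits = []
trainer = RollingRetrainer(fit_fn=lambda rows: fits.append(len(rows)) or len(rows))
for i in range(1200):
    trainer.observe("NVDA", [float(i)])
```

The `deque(maxlen=WINDOW)` gives the sliding window for free: the first fit happens immediately, later fits see at most the last 1,000 events.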
Load tested using a custom script simulating market-open burst traffic (8-10x sustained volume) across 5 tickers. Measurement: end-to-end from WebSocket receive to DynamoDB write confirmed.
| Metric | Result |
|---|---|
| Sustained throughput | 14,800 events/sec |
| Peak throughput (30s burst) | 31,200 events/sec |
| End-to-end latency p50 | 12ms |
| End-to-end latency p95 | 34ms |
| End-to-end latency p99 | 67ms |
| DynamoDB write latency p99 | 18ms |
| Kafka consumer lag (sustained) | <50 messages |
| Anomaly detection inference | 0.3ms/event |
| VADER sentiment inference | <1ms/article |
| Data loss under failure conditions | 0 events |
| Method | Endpoint | Description | Cache |
|---|---|---|---|
| GET | /quotes/{ticker} | Latest trade price and volume | 1s |
| GET | /aggregations/{ticker} | VWAP, z-score, momentum | 5s |
| GET | /anomalies/{ticker} | Recent Isolation Forest detections | None |
| GET | /features/{ticker} | Real-time feature vector | 1s |
| GET | /sentiment/{ticker} | News sentiment with market correlation | None |
| GET | /health | Pipeline health status | None |
| Layer | Technology | Why |
|---|---|---|
| Data source (equity) | Polygon.io WebSocket | Live tick data, free tier |
| Data source (news) | Finnhub REST API | Company-level news, free tier |
| Message broker | Apache Kafka | Durable buffer, replay capability, exactly-once |
| Stream processing | Apache Faust | Python-native, fault-tolerant state, exactly-once |
| Sentiment analysis | VADER | <1ms inference vs 200ms+ for FinBERT |
| Anomaly detection | Scikit-learn Isolation Forest | Multi-dimensional, adapts to market conditions |
| Hot storage | AWS DynamoDB | Single-digit ms reads, zero ops overhead |
| Cold storage | AWS S3 + Parquet | Columnar queries via Athena, Snappy compression |
| Dead letter queue | AWS SQS | Durable failure capture with retry |
| API layer | FastAPI | Async, auto OpenAPI, Pydantic validation |
| Containerization | Docker Compose | One-command local stack |
| Deployment | Railway | FastAPI dashboard in demo mode |
| Observability | AWS CloudWatch | Metrics, alarms, dashboards |
At 10x (150k events/sec): Move from local Kafka to Amazon MSK, switch DynamoDB to on-demand capacity mode, run 3 Faust worker instances in parallel, reduce S3 flush interval from 5min to 2min.
At 100x (1.5M events/sec): Replace Faust with Apache Flink, add Redis caching layer in front of DynamoDB, move to multi-region DynamoDB Global Tables, spin anomaly detection into a dedicated microservice, add Redshift for sub-second historical analytics.
Visit tradepulse.nikhilgiridharan.com (no setup required). It runs in demo mode with simulated market data seeded from real current prices.
Prerequisites:
- Docker and Docker Compose
- Polygon.io API key (free tier)
- Finnhub API key (free tier)
- AWS account with DynamoDB, S3, and SQS access
```bash
# Clone and configure
git clone https://github.com/nikhilgiridharan/TradePulse
cd TradePulse
cp .env.example .env

# Add your keys to .env:
#   POLYGON_API_KEY, FINNHUB_API_KEY
#   AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY

# Start everything
make up

# Dashboard available at http://localhost:8000
```

Services started by `make up`:
- Kafka + Zookeeper
- Polygon.io equity producer
- Finnhub news producer
- Faust stream processor
- FastAPI dashboard
```bash
make logs   # Tail all service logs
make test   # Run unit + integration tests
make down   # Stop all services
```

| Document | Description |
|---|---|
| Architecture | Full system design with component responsibilities |
| Schema | DynamoDB table design, partition keys, access patterns |
| Benchmarks | Load test methodology and full results |
| Runbook | Operational procedures and alert thresholds |
| Feature Store | Feature definitions and update frequency |
| Postmortem | Hot partition incident: cause, fix, learnings |
| Deployment | Railway setup and local full-pipeline instructions |
```
TradePulse/
├── src/
│   ├── producer/
│   │   ├── polygon_producer.py    # WebSocket → Kafka, idempotent
│   │   └── news_producer.py       # Finnhub polling → Kafka
│   ├── processing/
│   │   ├── faust_app.py           # Main stream processor, exactly-once
│   │   ├── aggregations.py        # VWAP, z-score, momentum
│   │   ├── anomaly_detection.py   # Isolation Forest, rolling retrain
│   │   ├── sentiment_analyzer.py  # VADER + market correlation
│   │   └── feature_store.py       # Precomputed features → DynamoDB
│   ├── validation/
│   │   └── schema_validator.py    # Pydantic validation + DLQ routing
│   ├── storage/
│   │   ├── dynamo_writer.py       # Shard keys, conditional writes
│   │   ├── s3_writer.py           # Parquet buffering, Hive partitioning
│   │   └── dlq_handler.py         # SQS retry + S3 archival
│   ├── api/
│   │   ├── main.py                # FastAPI, 6 endpoints, CORS
│   │   └── static/
│   │       ├── index.html         # Live dashboard
│   │       └── about.html         # Documentation page
│   └── monitoring/
│       └── cloudwatch_metrics.py  # Batched CloudWatch emission
├── tests/
│   ├── unit/                      # Aggregation math, anomaly, validation
│   └── integration/               # Kafka, DynamoDB, DLQ end-to-end
├── docs/                          # Architecture, schema, runbook, postmortem
├── docker-compose.yml             # Full local stack
├── Dockerfile                     # Railway production image
├── railway.toml                   # Railway deployment config
└── Makefile                       # make up · make test · make lint
```