High-performance cryptocurrency market data ingestion and API platform
Kirby ingests real-time and historical market data from multiple cryptocurrency exchanges and serves it via a fast, reliable REST API. Named after the Nintendo character who can inhale almost anything, Kirby efficiently handles massive volumes of market data.
- Real-time Data Collection: WebSocket connections for live OHLCV candles, funding rates, and open interest
- Secure Authentication: API key-based authentication with admin role-based access control
- 1-Minute Buffering: Optimized storage for funding/OI data (98.3% storage reduction)
- Historical Backfills: Automated retrieval of historical candle and funding rate data
- High Performance: Async I/O with optimized bulk inserts using asyncpg
- Time-Series Optimized: TimescaleDB with minute-precision timestamps aligned across all data types
- Modular Architecture: Easy to add new exchanges, coins, and market types
- Production Ready: Health checks, monitoring, structured logging, Docker deployment
- Type-Safe: Full Pydantic validation and type hints throughout
- API-First: FastAPI with auto-generated OpenAPI documentation
git clone https://github.com/oakwoodgates/kirby.git
cd kirby
./deploy.sh
That's it! See QUICKSTART.md for details.
- Docker and Docker Compose
- Python 3.13+ (for local development)
# Clone the repository
git clone https://github.com/oakwoodgates/kirby.git
cd kirby
# Copy and configure environment
cp .env.example .env
nano .env # Edit POSTGRES_PASSWORD
# Build and start all services
docker compose build
docker compose up -d
# Run database migrations
docker compose exec collector alembic upgrade head
# Sync configuration to database
docker compose exec collector python -m scripts.sync_config
# Check status
docker compose ps
docker compose logs -f
The API will be available at http://localhost:8000.
- QUICKSTART.md - 5-minute quick start guide
- DEPLOYMENT.md - Complete Digital Ocean deployment guide
- Server setup and security hardening
- Production configuration
- Monitoring and maintenance
- Backup strategies
- Troubleshooting guide
- docs/BOOTSTRAP.md - Admin user bootstrap guide
- First admin user creation
- Manual and automatic bootstrap
- Security best practices
# Create virtual environment
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
# Install dependencies
pip install -e ".[dev]"
# Copy environment template
cp .env.example .env
# Edit .env with your settings
# Start TimescaleDB (via Docker or local installation)
docker-compose up -d timescaledb
# Run migrations
alembic upgrade head
# Sync configuration
python -m scripts.sync_config
Kirby uses TimescaleDB (PostgreSQL with a time-series extension). The Docker Compose setup includes TimescaleDB, but for local PostgreSQL:
# Install TimescaleDB extension
CREATE EXTENSION IF NOT EXISTS timescaledb;
# Run migrations
alembic upgrade head
Copy .env.example to .env and configure:
# Database
DATABASE_URL=postgresql+asyncpg://kirby:password@localhost:5432/kirby
DATABASE_POOL_SIZE=20
# API
API_HOST=0.0.0.0
API_PORT=8000
API_WORKERS=4
# Logging
ENVIRONMENT=development
LOG_LEVEL=info
LOG_FORMAT=json
# Collectors
COLLECTOR_RESTART_DELAY=5
COLLECTOR_MAX_RETRIES=3
COLLECTOR_BACKFILL_ON_GAP=true
See .env.example for all available options.
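As a minimal sketch of how these collector variables might be parsed (the helper below is illustrative and not Kirby's actual configuration loader, which uses Pydantic):

```python
import os

def load_collector_settings() -> dict:
    """Parse collector settings from environment variables.

    Variable names mirror the .env example above; the parsing logic is
    an illustrative sketch, not Kirby's actual config loader.
    """
    def as_bool(raw: str) -> bool:
        return raw.strip().lower() in {"1", "true", "yes", "on"}

    return {
        "restart_delay": int(os.getenv("COLLECTOR_RESTART_DELAY", "5")),
        "max_retries": int(os.getenv("COLLECTOR_MAX_RETRIES", "3")),
        "backfill_on_gap": as_bool(os.getenv("COLLECTOR_BACKFILL_ON_GAP", "true")),
    }

print(load_collector_settings())
```

With no variables set, the defaults above match the .env example values.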
Define what data to collect in config/starlistings.yaml:
exchanges:
- name: hyperliquid
display_name: Hyperliquid
active: true
coins:
- symbol: BTC
name: Bitcoin
active: true
market_types:
- name: perps
display_name: Perpetuals
active: true
starlistings:
- exchange: hyperliquid
coin: BTC
market_type: perps
intervals:
- 1m
- 15m
- 4h
- 1d
active: true
After editing, sync to database:
python -m scripts.sync_config
# Start all services (database, API, collector)
docker-compose up -d
# Start only specific services
docker-compose up -d timescaledb api
# View logs
docker-compose logs -f api
docker-compose logs -f collector
Important: All backfill commands must run inside the Docker container using docker compose exec collector.
# Backfill all active starlistings (1 year)
docker compose exec collector python -m scripts.backfill --days=365
# Backfill specific exchange and coin
docker compose exec collector python -m scripts.backfill --exchange=hyperliquid --coin=BTC --days=90
# Backfill specific coin (all intervals)
docker compose exec collector python -m scripts.backfill --coin=SOL --days=30
# Backfill all active coins (1 year)
docker compose exec collector python -m scripts.backfill_funding --days=365
# Backfill specific coin (BTC, 30 days)
docker compose exec collector python -m scripts.backfill_funding --coin=BTC --days=30
# Backfill using --all flag
docker compose exec collector python -m scripts.backfill_funding --all
Hyperliquid API Limitations:
- Historical funding data only includes funding_rate and premium
- No historical data for: mark_price, oracle_price, mid_price, open_interest
- The real-time collector captures ALL fields going forward
- Backfill uses COALESCE to preserve existing complete data (safe to re-run)
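The COALESCE idea can be sketched as a SQL builder; the table and column names below are illustrative, not Kirby's actual schema. On conflict, COALESCE keeps any existing non-NULL value, so a backfill (which lacks mark/oracle/mid prices) never overwrites complete rows written by the real-time collector:

```python
def build_funding_upsert(columns: list[str]) -> str:
    """Build an idempotent funding-row upsert using the COALESCE pattern.

    Table and column names are illustrative, not Kirby's actual schema.
    """
    # One placeholder for the time column plus one per data column.
    placeholders = ", ".join(f"${i}" for i in range(1, len(columns) + 2))
    # Prefer the existing stored value; fall back to the incoming one.
    updates = ", ".join(
        f"{col} = COALESCE(funding_rates.{col}, EXCLUDED.{col})" for col in columns
    )
    return (
        f"INSERT INTO funding_rates (time, {', '.join(columns)}) "
        f"VALUES ({placeholders}) "
        f"ON CONFLICT (time) DO UPDATE SET {updates}"
    )

print(build_funding_upsert(["funding_rate", "premium"]))
```

Because existing non-NULL values win, re-running the backfill is safe.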
# Manual health check
python -m scripts.health_check
# Or via API
curl http://localhost:8000/health
curl http://localhost:8000/health/hyperliquid
The API base URL is http://localhost:8000.
Most API endpoints require authentication via API keys. Only the /health endpoint is public.
On a fresh deployment, the ./deploy.sh script automatically creates the first admin user and API key if no users exist. The API key is displayed in the terminal output.
Manual bootstrap (if needed):
# Inside Docker container
docker compose exec collector python -m scripts.bootstrap_admin
# With custom credentials
docker compose exec collector python -m scripts.bootstrap_admin \
--email admin@mycompany.com \
--username myadmin
The bootstrap script will display your API key once - save it immediately!
See docs/BOOTSTRAP.md for complete bootstrap guide.
After bootstrapping, use the admin API to create additional users and API keys:
# Create a new user
curl -X POST "http://localhost:8000/admin/users" \
-H "Authorization: Bearer kb_YOUR_ADMIN_KEY" \
-H "Content-Type: application/json" \
-d '{
"email": "user@example.com",
"username": "user1",
"is_admin": false
}'
# Create an API key for the user (replace USER_ID)
curl -X POST "http://localhost:8000/admin/users/USER_ID/keys" \
-H "Authorization: Bearer kb_YOUR_ADMIN_KEY" \
-H "Content-Type: application/json" \
-d '{
"name": "Production Key",
"rate_limit": 1000
}'
The response will include the full API key. Save it immediately - it's only shown once:
{
"id": 2,
"name": "Production Key",
"key": "kb_a1b2c3d4e5f6789012345678901234567890abcd",
"key_prefix": "kb_a1b2c3d4",
"rate_limit": 1000,
"is_active": true,
"created_at": "2025-11-17T10:00:00Z"
}
REST API - Include the API key in the Authorization header:
curl -H "Authorization: Bearer kb_123456KEY" \
"http://localhost:8000/starlistings"WebSocket - Include the API key as a query parameter:
const ws = new WebSocket('ws://localhost:8000/ws?api_key=kb_123456KEY');
# List all API keys for a user
curl -H "Authorization: Bearer {admin_key}" \
"http://localhost:8000/admin/users/{user_id}/keys"
# Deactivate an API key
curl -X PATCH "http://localhost:8000/admin/keys/{key_id}/deactivate" \
-H "Authorization: Bearer {admin_key}"
# Delete an API key
curl -X DELETE "http://localhost:8000/admin/keys/{key_id}" \
-H "Authorization: Bearer {admin_key}"GET /candles/{exchange}/{coin}/{quote}/{market_type}/{interval}Parameters:
exchange- Exchange name (e.g.,hyperliquid)coin- Base coin symbol (e.g.,BTC)quote- Quote currency symbol (e.g.,USD)market_type- Market type (e.g.,perps)interval- Time interval (e.g.,15m,4h,1d)
Query Parameters:
start_time(optional) - Start time (ISO 8601 or Unix timestamp)end_time(optional) - End time (ISO 8601 or Unix timestamp)limit(optional) - Maximum number of candles (default: 1000, max: 5000)
Example:
curl -H "Authorization: Bearer {your_api_key}" \
"http://localhost:8000/candles/hyperliquid/BTC/USD/perps/15m?limit=100"Response:
{
"data": [
{
"time": "2025-10-26T12:00:00Z",
"open": "67500.50",
"high": "67800.00",
"low": "67400.25",
"close": "67650.75",
"volume": "1234.5678",
"num_trades": 542
}
],
"metadata": {
"exchange": "hyperliquid",
"coin": "BTC",
"quote": "USD",
"trading_pair": "BTC/USD",
"market_type": "perps",
"interval": "15m",
"count": 100
}
}
GET /funding/{exchange}/{coin}/{quote}/{market_type}
Parameters:
- exchange - Exchange name (e.g., hyperliquid)
- coin - Base coin symbol (e.g., BTC)
- quote - Quote currency symbol (e.g., USD)
- market_type - Market type (e.g., perps)
Query Parameters:
- start_time (optional) - Start time (ISO 8601 or Unix timestamp)
- end_time (optional) - End time (ISO 8601 or Unix timestamp)
- limit (optional) - Maximum number of records (default: 1000, max: 5000)
Example:
curl -H "Authorization: Bearer {your_api_key}" \
"http://localhost:8000/funding/hyperliquid/BTC/USD/perps?limit=10"Response:
{
"data": [
{
"time": "2025-11-20T09:53:00Z",
"funding_rate": "0.000012500000000000",
"premium": "-0.000141919800000000",
"mark_price": "91587.000000000000000000",
"index_price": "91601.000000000000000000",
"oracle_price": "91601.000000000000000000",
"mid_price": "91587.500000000000000000",
"next_funding_time": null
}
],
"metadata": {
"exchange": "hyperliquid",
"coin": "BTC",
"quote": "USD",
"trading_pair": "BTC/USD",
"market_type": "perps",
"count": 10
}
}
GET /open-interest/{exchange}/{coin}/{quote}/{market_type}
Parameters:
- exchange - Exchange name (e.g., hyperliquid)
- coin - Base coin symbol (e.g., BTC)
- quote - Quote currency symbol (e.g., USD)
- market_type - Market type (e.g., perps)
Query Parameters:
- start_time (optional) - Start time (ISO 8601 or Unix timestamp)
- end_time (optional) - End time (ISO 8601 or Unix timestamp)
- limit (optional) - Maximum number of records (default: 1000, max: 5000)
Example:
curl -H "Authorization: Bearer {your_api_key}" \
"http://localhost:8000/open-interest/hyperliquid/BTC/USD/perps?limit=10"Response:
{
"data": [
{
"time": "2025-11-20T09:53:00Z",
"open_interest": "29242.785200000000000000",
"notional_value": "2678258968.112400000000000000",
"day_base_volume": "42898.608070000000000000",
"day_notional_volume": "3889408146.746299266800000000"
}
],
"metadata": {
"exchange": "hyperliquid",
"coin": "BTC",
"quote": "USD",
"trading_pair": "BTC/USD",
"market_type": "perps",
"count": 10
}
}
GET /starlistings
Example:
curl -H "Authorization: Bearer {your_api_key}" \
"http://localhost:8000/starlistings"Response:
{
"starlistings": [
{
"id": 1,
"exchange": "hyperliquid",
"exchange_display": "Hyperliquid",
"coin": "BTC",
"coin_name": "Bitcoin",
"quote": "USD",
"quote_name": "US Dollar",
"trading_pair": "BTC/USD",
"trading_pair_id": 1,
"market_type": "perps",
"market_type_display": "Perpetuals",
"interval": "15m",
"interval_seconds": 900,
"active": true
}
],
"total_count": 1
}
GET /starlistings/{starlisting_id}
Parameters:
- starlisting_id - Starlisting ID (integer)
Example:
curl -H "Authorization: Bearer {your_api_key}" \
"http://localhost:8000/starlistings/1"Response:
{
"id": 1,
"exchange": "hyperliquid",
"exchange_display": "Hyperliquid",
"coin": "BTC",
"coin_name": "Bitcoin",
"quote": "USD",
"quote_name": "US Dollar",
"trading_pair": "BTC/USD",
"trading_pair_id": 1,
"market_type": "perps",
"market_type_display": "Perpetuals",
"interval": "15m",
"interval_seconds": 900,
"active": true
}
Error Response (404):
{
"detail": "Starlisting with ID 999 not found"
}
GET /health
GET /health/{exchange}
Example:
curl http://localhost:8000/health
Response:
{
"status": "healthy",
"timestamp": "2025-10-26T12:00:00Z",
"database": "connected",
"collectors": {
"hyperliquid": "running"
}
}
Once the API is running, visit:
- Swagger UI: http://localhost:8000/docs
- ReDoc: http://localhost:8000/redoc
Kirby provides a WebSocket API for real-time candle data streaming.
ws://localhost:8000/ws
Python:
import asyncio
import json
import websockets
async def stream_candles():
api_key = "kb_123456KEY" # Replace with your API key
async with websockets.connect(f"ws://localhost:8000/ws?api_key={api_key}") as ws:
# Subscribe to BTC/USD 1m candles
subscribe_msg = {
"action": "subscribe",
"starlisting_ids": [1],
"history": 10 # Get 10 historical candles
}
await ws.send(json.dumps(subscribe_msg))
# Receive real-time updates
async for message in ws:
data = json.loads(message)
if data["type"] == "candle":
print(f"New candle: {data['data']}")
asyncio.run(stream_candles())
JavaScript:
const apiKey = "kb_123456KEY"; // Replace with your API key
const ws = new WebSocket(`ws://localhost:8000/ws?api_key=${apiKey}`);
ws.onmessage = (event) => {
const data = JSON.parse(event.data);
if (data.type === "candle") {
console.log("New candle:", data.data);
}
};
// Subscribe to multiple starlistings
ws.send(JSON.stringify({
action: "subscribe",
starlisting_ids: [1, 2, 3],
history: 10
}));
- ✅ Real-time updates (~50-100ms latency via PostgreSQL LISTEN/NOTIFY)
- ✅ Subscribe to multiple starlistings simultaneously
- ✅ Historical data on connect (optional, up to 1000 candles)
- ✅ Heartbeat/ping for connection health
- ✅ Auto-reconnection support
- ✅ Validated messages with error responses
See docs/WEBSOCKET_API.md for complete WebSocket API documentation including:
- Message protocol specification
- Client examples (Python & JavaScript)
- Error handling and reconnection strategies
- Performance considerations
- Troubleshooting guide
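A common client-side reconnection strategy is capped exponential backoff with jitter. The helper below is a generic sketch, not code shipped with Kirby:

```python
import itertools
import random

def backoff_delays(base: float = 1.0, cap: float = 30.0):
    """Yield capped exponential reconnect delays with a little jitter.

    A generic client-side sketch of the auto-reconnection the WebSocket
    API supports; this helper is not part of Kirby itself.
    """
    for attempt in itertools.count():
        # Cap the exponent too, so the intermediate value never overflows.
        delay = min(cap, base * (2 ** min(attempt, 20)))
        yield delay + random.uniform(0, delay * 0.1)

# A reconnect loop would wrap the websockets.connect() call, sleep for
# next(delays) after each disconnect, then resend the subscribe message.
delays = backoff_delays()
print([round(next(delays), 1) for _ in range(4)])
```

Jitter spreads out reconnect attempts so many clients don't hammer the server in lockstep after an outage.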
Python Test Client:
python scripts/test_websocket_client.py
JavaScript Test Client:
# Open in browser
open docs/examples/websocket_client.html
Kirby provides powerful export capabilities for AI/ML training, backtesting, and external analysis.
Four CLI scripts are available for exporting data in CSV and Parquet formats:
- export_candles.py - OHLCV candle data with multi-interval support
- export_funding.py - Funding rate data with price context
- export_oi.py - Open interest data with volume metrics
- export_all.py - Merged datasets (candles + funding + OI) for ML/backtesting
# Export BTC 1m candles for last 30 days (both CSV and Parquet)
docker compose exec collector python -m scripts.export_candles \
--coin BTC --intervals 1m --days 30
# Export merged dataset for ML training (Parquet only)
docker compose exec collector python -m scripts.export_all \
--coin BTC --intervals 1m --days 90 --format parquet
# Export all intervals for multi-timeframe backtesting
docker compose exec collector python -m scripts.export_all \
--coin BTC --intervals all --days 365
# Export funding rates only
docker compose exec collector python -m scripts.export_funding \
--coin BTC --days 30
- CSV: Universal format, human-readable, larger file size
- Parquet: Columnar format, ~10x smaller, optimized for pandas/PyTorch/TensorFlow
The export_all.py script creates ML-ready datasets by merging:
- Candle data (OHLCV)
- Funding rates (with mark/index/oracle prices)
- Open interest (with volume metrics)
All data is aligned by minute-precision timestamps. Missing values are preserved as NULL (no forward-filling).
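The minute alignment and NULL preservation can be sketched as a simple in-memory outer join (the real export_all.py works against the database, not dicts; this is illustrative only):

```python
from datetime import datetime

def floor_to_minute(ts: datetime) -> datetime:
    """Truncate a timestamp to minute precision, as Kirby's tables do."""
    return ts.replace(second=0, microsecond=0)

def merge_by_minute(candles: dict, funding: dict, oi: dict) -> list[dict]:
    """Outer-join three {minute: row} maps; a missing side stays None.

    A simplified, in-memory sketch of the join export_all.py performs;
    no forward-filling - gaps are preserved as None/NULL.
    """
    minutes = sorted(set(candles) | set(funding) | set(oi))
    return [
        {"time": m, "candle": candles.get(m), "funding": funding.get(m), "oi": oi.get(m)}
        for m in minutes
    ]

t0 = floor_to_minute(datetime(2025, 11, 2, 14, 30, 22))
t1 = floor_to_minute(datetime(2025, 11, 2, 14, 31, 5))
rows = merge_by_minute(
    {t0: {"close": "67650.75"}, t1: {"close": "67660.00"}},
    {t0: {"funding_rate": "0.0000125"}},
    {},
)
print(rows[1])
```

Note the second row keeps None for funding and OI rather than carrying the previous minute's values forward.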
Exports are saved to the exports/ directory with timestamped filenames:
exports/
├── merged_hyperliquid_BTC_USD_perps_1m_20251102_143022.parquet
├── merged_hyperliquid_BTC_USD_perps_1m_20251102_143022.json
└── ... (metadata files)
For comprehensive export documentation including:
- Advanced usage examples
- Integration with ML frameworks (PyTorch, TensorFlow, scikit-learn)
- Best practices and troubleshooting
- Format comparison and optimization tips
See docs/EXPORT.md
kirby/
├── src/
│   ├── api/           # FastAPI application
│   ├── collectors/    # Data collectors
│   ├── db/            # Database models and repositories
│   ├── schemas/       # Pydantic schemas
│   ├── config/        # Configuration management
│   └── utils/         # Utilities
├── tests/
│   ├── unit/          # Unit tests
│   └── integration/   # Integration tests
├── scripts/           # Operational scripts
├── config/            # Configuration files
├── migrations/        # Database migrations
└── docker/            # Docker configuration
1. Create a collector class in src/collectors/{exchange_name}.py:
from src.collectors.base import BaseCollector

class NewExchangeCollector(BaseCollector):
    async def connect(self):
        # Implement WebSocket connection
        pass

    async def collect(self):
        # Implement data collection
        pass

2. Update configuration in config/starlistings.yaml
3. Register the collector in src/collectors/main.py
4. Test and backfill
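Fleshing out step 1, a runnable sketch of a collector might look like the following. BaseCollector's interface is assumed from the snippet above (a stand-in class is defined here so the example is self-contained), and the message handling is purely illustrative:

```python
import asyncio

class BaseCollector:
    """Stand-in for src.collectors.base.BaseCollector (interface assumed)."""
    async def connect(self) -> None: ...
    async def collect(self) -> None: ...

class NewExchangeCollector(BaseCollector):
    def __init__(self, url: str):
        self.url = url
        self.connected = False
        self.candles: list[dict] = []

    async def connect(self) -> None:
        # Real code would open a WebSocket to self.url and subscribe to
        # the channels configured in starlistings.yaml.
        self.connected = True

    async def collect(self) -> None:
        # Real code would iterate WebSocket messages; simulate two candles.
        for message in ({"t": 0, "close": "100"}, {"t": 60, "close": "101"}):
            self.candles.append(message)
            await asyncio.sleep(0)

collector = NewExchangeCollector("wss://example.invalid/ws")

async def run() -> None:
    await collector.connect()
    await collector.collect()

asyncio.run(run())
print(len(collector.candles))
```

The real collector would also parse exchange payloads into Pydantic schemas and hand rows to the bulk-insert repository layer.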
# Format code
black .
# Lint
ruff check .
# Type check
mypy src
# Run all checks
black . && ruff check . && mypy src
# Create new migration
alembic revision --autogenerate -m "description"
# Apply migrations
alembic upgrade head
# Rollback
alembic downgrade -1
First-time setup:
# Windows
scripts\setup_dev.bat
# Mac/Linux
bash scripts/setup_dev.sh
This will:
- Create a Python virtual environment
- Install Kirby with dev dependencies
- Prepare your environment for testing
# Easy way - automatically sets up test database
python scripts/run_tests.py
# Or run pytest directly
pytest
# Unit tests only
pytest tests/unit -m unit
# Integration tests only
pytest tests/integration -m integration
# With coverage
pytest --cov=src --cov-report=html
# Specific test file
pytest tests/unit/test_helpers.py
# Verbose output
pytest -v
The test suite includes:
- Unit tests: Helper functions, utilities, data validation
- Integration tests: API endpoints, database operations, repositories
- Coverage reporting: HTML and terminal reports
For detailed testing documentation, see TESTING.md.
Integration tests automatically create and use a separate test database (kirby_test). No manual configuration needed!
Complete step-by-step guide: DEPLOYMENT.md
Quick Deploy:
# On your Digital Ocean droplet:
git clone https://github.com/oakwoodgates/kirby.git
cd kirby
./deploy.sh
The deployment guide includes:
- Digital Ocean Droplet setup ($12-24/month)
- Server configuration and security hardening
- Docker installation and configuration
- One-command or manual deployment options
- Monitoring and maintenance procedures
- Security: UFW firewall, fail2ban, SSH hardening
- Automated backup strategies
- Comprehensive troubleshooting guide
# Development
ENVIRONMENT=development
LOG_LEVEL=debug
DATABASE_POOL_SIZE=10
# Production
ENVIRONMENT=production
LOG_LEVEL=info
DATABASE_POOL_SIZE=20
Before going live:
- Changed default PostgreSQL password
- Configured firewall (UFW)
- Set up SSL/TLS for API (if public-facing)
- Configured log rotation
- Set up automated backups
- Enabled fail2ban
- Monitored services for 24 hours
- Set up alerting (optional: Grafana, Prometheus)
See full checklist in DEPLOYMENT.md
- Data Freshness: Time since last candle/funding/OI received
- Collection Lag: Delay between exchange and ingestion
- API Latency: Response time percentiles (P50, P95, P99)
- Error Rates: Failed requests, collector crashes
- Throughput: Candles/second, requests/second
- Buffer Flush: Check logs for "Flushed buffers to database" every minute
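The once-a-minute buffer flush can be sketched as follows. This illustrates the 1-minute buffering idea behind the 98.3% storage reduction (many ticks in, one row out per minute); it is not Kirby's actual collector code:

```python
from datetime import datetime

class MinuteBuffer:
    """Keep only the latest funding/OI tick per minute.

    Illustrative sketch of the 1-minute buffering pattern, not Kirby's
    actual collector code.
    """
    def __init__(self) -> None:
        self.latest: dict[datetime, dict] = {}

    def add(self, ts: datetime, tick: dict) -> None:
        minute = ts.replace(second=0, microsecond=0)
        self.latest[minute] = tick  # later ticks overwrite earlier ones

    def flush(self) -> list[tuple[datetime, dict]]:
        """Return one row per minute and clear the buffer."""
        rows = sorted(self.latest.items())
        self.latest.clear()
        return rows

buf = MinuteBuffer()
for sec, rate in ((1, "0.0000120"), (20, "0.0000124"), (59, "0.0000125")):
    buf.add(datetime(2025, 11, 20, 9, 53, sec), {"funding_rate": rate})
print(buf.flush())
```

Three ticks in the same minute collapse to a single row carrying the last observed values, which is what the "Flushed buffers to database" log line reports.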
Structured JSON logs in production:
{
"timestamp": "2025-10-26T12:00:00Z",
"level": "INFO",
"logger": "kirby.collector.hyperliquid",
"message": "Collected 120 candles",
"extra": {
"exchange": "hyperliquid",
"candles": 120,
"duration_ms": 45
}
}
Database connection errors:
- Verify DATABASE_URL is correct
- Ensure TimescaleDB is running: docker-compose ps
- Check database logs: docker-compose logs timescaledb
Collector not receiving data:
- Check exchange API status
- Verify WebSocket connection: docker-compose logs collector
- Check rate limiting settings
Missing data gaps:
- Run gap detection: python -m scripts.detect_gaps
- Trigger backfill: python -m scripts.backfill --fill-gaps
API slow response:
- Check database query performance
- Verify indexes are created: alembic current
- Increase DATABASE_POOL_SIZE in .env
Enable debug logging:
# In .env
LOG_LEVEL=debug
API_LOG_LEVEL=debug
# Restart services
docker-compose restart
- ✅ TimescaleDB schema with hypertables
- ✅ Configuration management (YAML → database)
- ✅ Database repositories and connection pooling
- ✅ Hyperliquid WebSocket collector (candles)
- ✅ Hyperliquid WebSocket collector (funding/OI with 1-minute buffering)
- ✅ Historical backfill system (candles + funding rates)
- ✅ REST API endpoints (candles, funding, OI, starlistings, health)
- ✅ Health checks and monitoring
- ✅ Docker deployment with production guide
- ✅ 1-minute buffering for funding/OI (98.3% storage reduction)
- ✅ COALESCE pattern for safe backfills
- ✅ Minute-precision timestamp alignment across all tables
- WebSocket API for real-time streaming (with Redis for sub-minute data)
- Advanced monitoring (Prometheus, Grafana)
- Additional exchanges (Binance, Coinbase, OKX)
- More data types (trades, order book, liquidations)
- Caching layer (Redis for real-time serving)
- Multi-region deployment
- Rate limiting and authentication
- Automated data retention policies
We welcome contributions! Please follow these guidelines:
- Fork the repository
- Create a feature branch: git checkout -b feature/amazing-feature
- Write tests for new functionality
- Follow code style: run black and ruff
- Update documentation as needed
- Commit changes: git commit -m 'Add amazing feature'
- Push to branch: git push origin feature/amazing-feature
- Open a Pull Request
- Type hints for all functions
- Pydantic validation for external data
- Async/await for I/O operations
- Comprehensive error handling
- Unit and integration tests
- Docstrings for public APIs
This project is licensed under the MIT License - see the LICENSE file for details.
- FastAPI: Modern, fast web framework
- TimescaleDB: Time-series database built on PostgreSQL
- CCXT: Cryptocurrency exchange trading library
- Hyperliquid: Exchange API and SDK
For issues, questions, or contributions:
- Open an issue on GitHub
- Review existing documentation in CLAUDE.md
- Check API docs at http://localhost:8000/docs
Built with โก by Oakwood Gates