Arctic Mirror is a high-performance data replication system that captures PostgreSQL changes in real time and stores them in Apache Iceberg format. It also provides a DuckDB proxy for querying the replicated data over a PostgreSQL-compatible interface.
- Real-time PostgreSQL Replication: Captures changes using logical replication
- Apache Iceberg Storage: Stores data in open, efficient Iceberg format
- DuckDB Proxy: PostgreSQL-compatible query interface
- Health Monitoring: Built-in health checks and Prometheus metrics
- Proxy Auth & Slow Query Logging: Optional username/password auth and slow query logging in DuckDB proxy
- WAL Checkpointing: Replication resumes from last persisted LSN
- Docker Support: Easy deployment with Docker and Docker Compose
- Comprehensive Testing: Full test coverage for all components
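The WAL checkpointing feature listed above depends on persisting and comparing LSNs. PostgreSQL prints an LSN as two hexadecimal numbers separated by a slash (for example `16/B374D848`), which together form a 64-bit WAL position. The Python sketch below is illustrative only: it parses that text form and maintains a checkpoint file that only ever moves forward. The helper names, the checkpoint file and the write-then-rename strategy are assumptions for this example, not a description of how Arctic Mirror stores its checkpoint.

```python
import os
import tempfile


def parse_lsn(text: str) -> int:
    """Convert PostgreSQL's 'hi/lo' hexadecimal LSN text to a 64-bit integer."""
    hi, lo = text.strip().split("/")
    return (int(hi, 16) << 32) | int(lo, 16)


def format_lsn(value: int) -> str:
    """Convert a 64-bit integer back to PostgreSQL's 'hi/lo' text form."""
    return f"{value >> 32:X}/{value & 0xFFFFFFFF:X}"


def load_checkpoint(path: str) -> int:
    """Return the last persisted LSN, or 0 if no checkpoint exists yet."""
    try:
        with open(path, encoding="ascii") as f:
            return parse_lsn(f.read())
    except FileNotFoundError:
        return 0


def save_checkpoint(path: str, lsn: int) -> bool:
    """Persist lsn if it is ahead of the stored one; return True if written."""
    if lsn <= load_checkpoint(path):
        return False  # never move the checkpoint backwards
    directory = os.path.dirname(os.path.abspath(path))
    # Write to a temp file, then rename, so a crash cannot leave a torn file.
    fd, tmp = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w", encoding="ascii") as f:
        f.write(format_lsn(lsn))
        f.flush()
        os.fsync(f.fileno())
    os.replace(tmp, path)
    return True
```

On restart, a replicator following this pattern would call `load_checkpoint` and request streaming from the returned position, so changes already written are not replayed from the beginning of the slot.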
PostgreSQL → Logical Replication → Arctic Mirror → Iceberg Files
                                                         ↓
                                                   DuckDB Proxy ← Clients
- Replicator: Handles PostgreSQL logical replication
- Iceberg Writer: Converts replication events to Iceberg format
- DuckDB Proxy: Provides PostgreSQL-compatible query interface
- Health Monitor: Monitors system health and provides metrics
- Storage Layer: Supports local filesystem and S3 storage
- Go 1.24+
- Docker and Docker Compose (for containerized deployment)
- PostgreSQL 15+ with logical replication enabled
To build and run locally:
- Clone the repository:
  git clone <repository-url>
  cd arctic-mirror
- Install dependencies:
  make deps
- Run tests:
  make test
- Build the application:
  make build
- Run locally:
  make run
To run with Docker Compose:
- Start all services:
  make docker-run
- Check status:
  make status
- View logs:
  docker-compose logs -f
- Stop services:
  make docker-stop
Create a config.yaml file:
postgres:
  host: localhost
  port: 5432
  user: replicator
  password: secret
  database: mydb
  slot: iceberg_replica
  publication: pub_all
tables:
  - schema: public
    name: users
  - schema: public
    name: orders
iceberg:
  path: /data/warehouse
proxy:
  port: 5433
  auth_user: ""          # Optional; set to enable cleartext auth
  auth_password: ""      # Optional; required if auth_user is set
  slow_query_millis: 0   # Optional; log queries slower than N ms
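The comments on the proxy settings imply one cross-field rule: `auth_password` is required once `auth_user` is set. The Python sketch below shows one way to check that rule on a proxy section that has already been parsed into a dictionary (for example by a YAML library). The function name `validate_proxy_config`, the port range check and the non-negative check on `slow_query_millis` are assumptions for illustration; the source does not describe how Arctic Mirror itself validates its configuration.

```python
def _is_int(value) -> bool:
    """True for real integers; bool is excluded even though it subclasses int."""
    return isinstance(value, int) and not isinstance(value, bool)


def validate_proxy_config(proxy: dict) -> list[str]:
    """Return a list of human-readable problems; an empty list means valid."""
    problems = []

    port = proxy.get("port")
    if not _is_int(port) or not 1 <= port <= 65535:
        problems.append("proxy.port must be an integer between 1 and 65535")

    # Rule taken from the config comments: a user without a password is invalid.
    user = proxy.get("auth_user", "")
    password = proxy.get("auth_password", "")
    if user and not password:
        problems.append("proxy.auth_password is required when proxy.auth_user is set")

    # Assumed rule: the slow-query threshold cannot be negative.
    slow = proxy.get("slow_query_millis", 0)
    if not _is_int(slow) or slow < 0:
        problems.append("proxy.slow_query_millis must be a non-negative integer")

    return problems
```

Returning every problem at once, instead of raising on the first, lets a user fix a config file in a single pass.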
Environment variables:
- POSTGRES_HOST: PostgreSQL host
- POSTGRES_PORT: PostgreSQL port
- POSTGRES_USER: PostgreSQL user
- POSTGRES_PASSWORD: PostgreSQL password
- POSTGRES_DB: PostgreSQL database
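This README names the environment variables but does not say how they interact with `config.yaml`. A common convention is to let environment values override file values, and the Python sketch below illustrates that convention only. The override behaviour, the `apply_env_overrides` helper and the mapping of `POSTGRES_DB` to the `database` key are assumptions, not documented Arctic Mirror behaviour.

```python
import os

# Maps each environment variable to a key in the `postgres` config section.
# Assumption: environment values take precedence over config.yaml.
ENV_TO_KEY = {
    "POSTGRES_HOST": "host",
    "POSTGRES_PORT": "port",
    "POSTGRES_USER": "user",
    "POSTGRES_PASSWORD": "password",
    "POSTGRES_DB": "database",
}


def apply_env_overrides(postgres: dict, environ=None) -> dict:
    """Return a copy of the postgres settings with env values layered on top."""
    environ = os.environ if environ is None else environ
    merged = dict(postgres)
    for var, key in ENV_TO_KEY.items():
        value = environ.get(var)
        if value is None or value == "":
            continue  # unset or empty: keep the value from the file
        merged[key] = int(value) if key == "port" else value
    return merged
```

Passing `environ` explicitly keeps the function easy to test; in normal use it falls back to `os.environ`.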
# Basic usage
./arctic-mirror --config config.yaml
# With custom health port
./arctic-mirror --config config.yaml --health-port 8080
# Help
./arctic-mirror --help
The application provides health check endpoints:
- Health Check: GET /health
- Detailed Health: GET /health/detailed
- Metrics: GET /metrics
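A small probe over these endpoints can serve as a readiness check in scripts or CI. The Python sketch below uses only the standard library and assumes that a healthy endpoint answers with HTTP 200 and that the health server listens on port 8080, as in the `--health-port 8080` example in this README. The function names are hypothetical, and the response bodies are not inspected because their format is not specified here.

```python
import urllib.error
import urllib.request

ENDPOINTS = ("/health", "/health/detailed", "/metrics")


def http_status(url: str, timeout: float = 2.0) -> int:
    """Return the HTTP status code for url, or 0 if the server is unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        return err.code  # the server answered, but with an error status
    except (urllib.error.URLError, OSError):
        return 0  # connection refused, DNS failure, timeout, ...


def probe(base_url: str, fetch=http_status) -> dict:
    """Map each endpoint path to True if it answered with HTTP 200."""
    base = base_url.rstrip("/")
    return {path: fetch(base + path) == 200 for path in ENDPOINTS}
```

Calling `probe("http://localhost:8080")` returns a dictionary keyed by endpoint path. The `fetch` parameter exists so the HTTP call can be replaced with a stub in tests.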
Connect to the DuckDB proxy using any PostgreSQL client:
# Using psql
psql -h localhost -p 5433 -U replicator -d mydb
# Using any PostgreSQL client library
Example queries:
-- Query replicated data
SELECT * FROM users;
-- Join with orders
SELECT u.username, o.total_amount, o.status
FROM users u
JOIN orders o ON u.id = o.user_id;
-- Aggregations
SELECT status, COUNT(*), AVG(total_amount)
FROM orders
GROUP BY status;
arctic-mirror/
├── config/ # Configuration management
├── health/ # Health monitoring
├── iceberg/ # Iceberg format handling
├── proxy/ # DuckDB proxy server
├── replication/ # PostgreSQL replication
├── schema/ # Schema management
├── storage/ # Storage abstractions
├── main.go # Main application
├── Dockerfile # Docker configuration
├── docker-compose.yml # Docker Compose setup
├── Makefile # Development tasks
└── README.md # This file
# All tests
make test
# Tests with race detection
make test-race
# Specific package
go test ./config -v
# Format code
make fmt
# Run linter
make lint
# Install development tools
make install-tools
# Complete development workflow
make dev
# Full rebuild
make rebuild
The health monitor checks the following components:
- PostgreSQL Connection: Connection status and latency
- DuckDB Health: Database availability and performance
- Iceberg Storage: File system accessibility and permissions
- System Configuration: Configuration validation
Prometheus-compatible metrics are available at /metrics:
- System uptime
- Component health status
- Connection latencies
- Error counts
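Because `/metrics` is Prometheus-compatible, its output follows the Prometheus text exposition format: `# HELP` and `# TYPE` comment lines, then one sample per line with an optional label set and an optional trailing timestamp. The Python sketch below parses that format into a dictionary so that a script can assert on individual values. It is a simplified parser that does not handle escaped characters inside label values. No specific metric names are given in this README, so any names used with it are placeholders.

```python
def parse_metrics(text: str) -> dict:
    """Parse Prometheus text exposition into {series: float}.

    The series key keeps any label set verbatim, e.g. 'up{job="x"}'.
    """
    samples = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        if "}" in line:
            # Label values may contain spaces, so split after the closing brace.
            end = line.rindex("}") + 1
            series, rest = line[:end], line[end:].split()
        else:
            series, *rest = line.split()
        if not rest:
            continue  # malformed line without a value
        samples[series] = float(rest[0])  # rest[1], if present, is a timestamp
    return samples
```

Combined with an HTTP fetch of `/metrics`, this is enough to alert when, for instance, an error counter grows between two scrapes.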
Structured logs are emitted at different levels and cover:
- Startup and configuration information
- Replication events and errors
- Health check results
- Performance metrics
- PostgreSQL Connection Failed
  - Verify PostgreSQL is running
  - Check connection credentials
  - Ensure logical replication is enabled
- Replication Slot Creation Failed
  - Verify user has replication privileges
  - Check if slot already exists
  - Ensure WAL level is set to logical
- Iceberg Write Permission Denied
  - Check directory permissions
  - Verify disk space
  - Ensure user has write access
- Proxy Connection Refused
  - Verify proxy port is not in use
  - Check firewall settings
  - Ensure DuckDB extensions are loaded
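Two of the checks above, whether the proxy port is already taken and whether the Iceberg directory is writable with free space, are easy to automate before starting the service. The Python sketch below is one way to do that with the standard library. The helper names and the 1 GiB free-space threshold are assumptions chosen for illustration.

```python
import os
import shutil
import socket


def port_in_use(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if something is already listening on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.settimeout(1.0)
        return s.connect_ex((host, port)) == 0


def warehouse_problems(path: str, min_free_bytes: int = 1 << 30) -> list[str]:
    """Check that the Iceberg path exists, is writable, and has free space."""
    if not os.path.isdir(path):
        return [f"{path} is not a directory"]
    problems = []
    # Creating files in a directory needs both write and search permission.
    if not os.access(path, os.W_OK | os.X_OK):
        problems.append(f"{path} is not writable by the current user")
    free = shutil.disk_usage(path).free
    if free < min_free_bytes:
        problems.append(f"{path} has only {free} bytes free")
    return problems
```

With the sample configuration, the relevant calls would be `port_in_use(5433)` and `warehouse_problems("/data/warehouse")`, run as the same user that runs Arctic Mirror.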
Enable verbose logging by setting the log level:
export LOG_LEVEL=debug
./arctic-mirror --config config.yaml
Check component health:
# Basic health
curl http://localhost:8080/health
# Detailed health
curl http://localhost:8080/health/detailed
# Metrics
curl http://localhost:8080/metrics
Performance tuning:
- PostgreSQL
  - Adjust WAL buffer size
  - Optimize table schemas
  - Use appropriate replication slot settings
- Iceberg storage
  - Use SSD storage for Iceberg files
  - Optimize Parquet compression
  - Consider partitioning strategies
- DuckDB proxy
  - Tune DuckDB memory settings
  - Optimize query patterns
  - Monitor connection pooling
To contribute:
- Fork the repository
- Create a feature branch
- Make your changes
- Add tests for new functionality
- Ensure all tests pass
- Submit a pull request
Contribution guidelines:
- Follow Go coding standards
- Write comprehensive tests
- Update documentation
- Use meaningful commit messages
This project is licensed under the MIT License - see the LICENSE file for details.
For support and questions:
- Create an issue on GitHub
- Check the troubleshooting section
- Review the configuration examples
- Consult the health monitoring endpoints
Roadmap:
- S3 storage backend
- Additional database support
- Advanced partitioning strategies
- Real-time analytics
- Kubernetes deployment
- Performance benchmarks
- Additional monitoring integrations