⚠️ Removed ArgoCD and Argo Rollouts for faster, easier development
- gRPC Server (rpc-server/) - Go-based user service with PostgreSQL persistence
- API Gateway (rpc-client/) - Python FastAPI REST facade over gRPC
- Database - PostgreSQL with CloudNativePG operator and sqlc code generation
- Cache - Valkey (Redis-compatible) for session/user caching
- Kubernetes - Local cluster via OrbStack with GitOps deployment
- Service Mesh - Istio with mTLS, traffic management, and telemetry
- GitOps - ArgoCD for declarative deployments and automated sync
- Rollouts - Argo Rollouts with canary deployments
- Observability - Complete O11y stack with correlation:
  - Metrics - Prometheus + Grafana dashboards
  - Traces - OpenTelemetry + Jaeger with B3 propagation
  - Logs - EFK stack (Elasticsearch, Fluent Bit, Kibana)
- Security - Wolfi base images, Istio mTLS, cert-manager
- Protocols - gRPC with Protobuf, REST APIs, OpenTelemetry OTLP
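
The B3 propagation named in the Traces item can be illustrated with a small parser for the B3 single-header format (`TraceId-SpanId-SamplingState-ParentSpanId`). This is a minimal sketch, not code from this repository: `B3Context`, `parse_b3_single`, and `child_header` are hypothetical names, and the sampling-only forms of the header (a bare `0`, `1`, or `d`) are deliberately not handled.

```python
import re
from dataclasses import dataclass
from typing import Optional

_TRACE_ID = re.compile(r"[0-9a-f]{16}|[0-9a-f]{32}")
_SPAN_ID = re.compile(r"[0-9a-f]{16}")


@dataclass(frozen=True)
class B3Context:
    """Hypothetical container for the fields of a B3 single header."""
    trace_id: str
    span_id: str
    sampled: Optional[str]         # "1", "0", "d" (debug), or None if absent
    parent_span_id: Optional[str]  # None for a root span


def parse_b3_single(header: str) -> B3Context:
    """Parse 'TraceId-SpanId[-SamplingState[-ParentSpanId]]'."""
    parts = header.strip().split("-")
    if not 2 <= len(parts) <= 4:
        raise ValueError(f"unexpected B3 header shape: {header!r}")
    trace_id, span_id = parts[0], parts[1]
    if not _TRACE_ID.fullmatch(trace_id) or not _SPAN_ID.fullmatch(span_id):
        raise ValueError(f"invalid trace or span id in {header!r}")
    sampled = parts[2] if len(parts) > 2 else None
    parent = parts[3] if len(parts) > 3 else None
    return B3Context(trace_id, span_id, sampled, parent)


def child_header(ctx: B3Context, new_span_id: str) -> str:
    """Build the header for a downstream call: same trace, new span,
    and the current span recorded as the parent when a sampling state exists."""
    parts = [ctx.trace_id, new_span_id]
    if ctx.sampled is not None:
        parts += [ctx.sampled, ctx.span_id]
    return "-".join(parts)
```

Keeping the trace ID constant while replacing the span ID on each hop is what lets sidecar spans and application spans line up in one trace.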
- Main Repository (this repo) - Source code, Dockerfiles, CI/CD pipelines
- Manifest Repository - arch-manifest - Kubernetes manifests managed by ArgoCD
- Code Push → GitHub Actions builds Docker images → DockerHub
- Automated Update → GitHub Actions updates image tags in manifest repo using kustomize edit
- GitOps Sync → ArgoCD detects changes and deploys to Kubernetes cluster
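
The automated update step above can be sketched as a helper that builds the `kustomize edit set image` command a CI job might run inside an overlay directory of the manifest repo. This is an illustration under assumptions: the function name, the image name `example/rpc-server`, and the idea of using a commit SHA as the tag are hypothetical and not taken from the actual workflow files.

```python
import shlex
from typing import List


def kustomize_set_image_cmd(image: str, tag: str) -> List[str]:
    """Return the argv for pinning `image` to `tag` in kustomization.yaml.

    `kustomize edit set image NAME=NAME:TAG` rewrites the `images:` entry
    of the kustomization file in the current working directory.
    """
    if not image or not tag:
        raise ValueError("image and tag must be non-empty")
    return ["kustomize", "edit", "set", "image", f"{image}={image}:{tag}"]


# Hypothetical values: a real pipeline would take these from the CI context.
cmd = kustomize_set_image_cmd("example/rpc-server", "3f2c1ab")
print(shlex.join(cmd))
```

A CI job would typically run the resulting command with `subprocess.run(cmd, cwd=overlay_dir, check=True)` and then commit the changed `kustomization.yaml`, which is the change the GitOps sync picks up.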
- Kustomize bases/overlays + ArgoCD GitOps deployment
- CI/CD pipelines with automated image updates via kustomize edit
- Separate manifest repository for GitOps workflow
- Wolfi secure container images + CloudNativePG operator
- Infrastructure as code (Terraform/Pulumi) for multi-environment
- Canary deployments with analysis-driven automated rollback
- Secrets management and configuration drift detection
- Istio service mesh with mTLS and ingress gateway
- Complete OpenTelemetry + Jaeger integration with B3 context propagation
- Unified trace correlation: Istio sidecar ↔ application spans
- Canary deployments with Argo Rollouts and traffic analysis
- Canary deployments with traffic shifting (weight/header-based routing)
- Circuit breakers, retries, timeouts, and RBAC policies
- Fault injection for chaos engineering
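
To make weight/header-based routing concrete, here is a minimal sketch of the decision a canary traffic split encodes. In the mesh this decision is made by the Istio sidecar from routing configuration, not by application code; the function name `pick_subset` and the header `x-canary` are hypothetical and used only for illustration.

```python
import random
from typing import Callable, Mapping


def pick_subset(
    headers: Mapping[str, str],
    canary_weight: int,
    rng: Callable[[], float] = random.random,
    canary_header: str = "x-canary",  # hypothetical opt-in header
) -> str:
    """Return "canary" or "stable" for one request.

    Header match wins first (explicit opt-in for testers); otherwise
    `canary_weight` percent of requests are sent to the canary.
    """
    if not 0 <= canary_weight <= 100:
        raise ValueError("canary_weight must be between 0 and 100")
    if headers.get(canary_header) == "true":
        return "canary"
    return "canary" if rng() < canary_weight / 100 else "stable"


# With a 10% weight, roughly one request in ten lands on the canary.
sample = [pick_subset({}, 10) for _ in range(1000)]
print(sample.count("canary"), "of 1000 requests routed to canary")
```

Injecting `rng` keeps the function deterministic under test, which is also how a rollout step can be reasoned about: raising the weight only changes the threshold, never the header rule.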
- Prometheus + Grafana dashboards for infrastructure/applications
- EFK stack: Elasticsearch (ECK), Fluent Bit, Kibana with TLS
- Distributed tracing with end-to-end correlation (gRPC → DB/Cache)
- SLI/SLO monitoring with error budgets and alerting
- Golden signals alerting (Latency, Errors, Traffic, Saturation)
- Anomaly detection and synthetic monitoring
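
The arithmetic behind SLO monitoring with error budgets is short enough to show directly. The sketch below assumes a request-based availability SLO; the function names and the example numbers are illustrative and do not come from this project's dashboards or alert rules.

```python
def error_budget_remaining(slo_target: float, total: int, failed: int) -> float:
    """Fraction of the error budget still unspent (negative if overspent).

    With a 99.9% target, 0.1% of requests may fail; that allowance is the budget.
    """
    allowed_failures = (1.0 - slo_target) * total
    if allowed_failures == 0:
        return 1.0 if failed == 0 else 0.0
    return 1.0 - failed / allowed_failures


def burn_rate(slo_target: float, total: int, failed: int) -> float:
    """Observed error rate divided by the error rate the SLO allows.

    A value of 1.0 spends the budget exactly over the SLO window;
    values above 1.0 exhaust it early and are a common alerting signal.
    """
    if total == 0:
        return 0.0
    return (failed / total) / (1.0 - slo_target)


# Illustrative numbers: 1,000,000 requests, 250 failures, 99.9% target.
print(round(error_budget_remaining(0.999, 1_000_000, 250), 3))
print(round(burn_rate(0.999, 1_000_000, 250), 3))
```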
- SQL migrations + sqlc code generation + repository pattern
- PostgreSQL with CloudNativePG operator in Kubernetes
- Valkey (Redis-compatible) caching with TTL policies and operation tracing
- Database backup/restore, HA failover, and performance tuning
- Cache clustering and warming strategies
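
Caching with TTL policies usually follows the cache-aside pattern: read from the cache, fall back to the database on a miss, and store the result with an expiry. The sketch below uses an in-memory dictionary as a stand-in for Valkey so it runs without a server; `TTLCache`, `get_user`, and the `user:{id}` key layout are hypothetical and not the service's actual code.

```python
import time
from typing import Any, Callable, Dict, Optional, Tuple


class TTLCache:
    """In-memory stand-in for a cache with per-entry expiry."""

    def __init__(self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._data: Dict[str, Tuple[Any, float]] = {}

    def get(self, key: str) -> Optional[Any]:
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            del self._data[key]  # lazily evict the expired entry
            return None
        return value

    def set(self, key: str, value: Any) -> None:
        self._data[key] = (value, self._clock() + self._ttl)


def get_user(user_id: int, cache: TTLCache, load_from_db: Callable[[int], Any]) -> Any:
    """Cache-aside read: serve from cache, otherwise load and populate."""
    key = f"user:{user_id}"
    cached = cache.get(key)
    if cached is not None:
        return cached
    user = load_from_db(user_id)
    cache.set(key, user)
    return user
```

The TTL bounds how stale a cached user can be, which trades a small consistency window for fewer database round trips.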
- gRPC services with Protobuf + FastAPI REST gateway
- Async gRPC clients with comprehensive error handling
- Message queue system (NATS/Kafka/Redis Streams) with workers
- API versioning, authentication, and rate limiting
- Event-driven architecture with idempotency patterns
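
An idempotency pattern for event handling can be sketched as a wrapper that remembers the result for each idempotency key and skips the side effect on redelivery. This is a simplified in-memory illustration with hypothetical names; a real worker would need a durable store and an atomic check-and-insert to stay correct across restarts and concurrent consumers.

```python
from typing import Any, Callable, Dict


class IdempotentHandler:
    """Runs `handler` at most once per idempotency key."""

    def __init__(self, handler: Callable[[Any], Any]):
        self._handler = handler
        self._results: Dict[str, Any] = {}  # assumption: in-memory only

    def handle(self, idempotency_key: str, payload: Any) -> Any:
        # A redelivered message carries the same key, so the stored result
        # is returned and the side effect is not repeated.
        if idempotency_key in self._results:
            return self._results[idempotency_key]
        result = self._handler(payload)
        self._results[idempotency_key] = result
        return result


# Example: a message queue delivers the same event twice.
created = []
handler = IdempotentHandler(lambda p: created.append(p["name"]) or len(created))
first = handler.handle("evt-1", {"name": "alice"})
again = handler.handle("evt-1", {"name": "alice"})
print(first, again, created)
```

Message queues with at-least-once delivery make this kind of deduplication necessary, since a retry after a lost acknowledgement looks identical to a new message without the key.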
- Chaos engineering with Litmus/Chaos Mesh
- Load testing, disaster recovery automation
- Incident response with PagerDuty/Opsgenie integration
- Policy as code (OPA/Gatekeeper) and compliance scanning