Production-ready Go microservice template demonstrating resilient service patterns including graceful shutdown, context propagation, and retry/timeout mechanisms.
- Graceful Shutdown - Clean termination on SIGTERM/SIGINT with configurable drain timeout
- Context Propagation - Request context passed through all layers, cancels retries immediately
- Retry with Exponential Backoff - Configurable attempt count and initial delay, with the delay doubling after each failed attempt
- Prometheus Metrics - Comprehensive metrics for requests, retries, and shutdown duration
- Structured Logging - JSON-formatted logs using Go's log/slog
- Kubernetes Ready - Includes deployment manifests with probes, PDB, and HPA
- 12-Factor App - Configuration via environment variables with .env support
- Production Dockerfile - Multi-stage build with distroless base image
- Go 1.23+
- Docker (optional, for containerized builds)
- Make
- Clone the repository:
git clone <repository-url>
cd resilient
- Install dependencies:
go mod download
- Create .env file (optional, uses defaults otherwise):
cp .env.example .env
# Edit .env with your settings
- Run the service:
make run
Or build and run manually:
make build
./bin/resilient
The service will start on port 8080 by default.
Build Docker image:
make docker-build
Run container:
docker run -p 8080:8080 --env-file .env resilient:latest

| Method | Path | Description |
|---|---|---|
| GET | /healthz | Readiness probe |
| GET | /livez | Liveness probe |
| GET | /ping | Main endpoint with retry logic |
| GET | /metrics | Prometheus metrics endpoint |
# Health check
curl http://localhost:8080/healthz
# Liveness probe
curl http://localhost:8080/livez
# Main endpoint (may retry internally)
curl http://localhost:8080/ping
# Prometheus metrics
curl http://localhost:8080/metrics
/ping response:
{
  "pong": true
}
/healthz response:
{
  "ready": true
}
All configuration is done via environment variables:
| Variable | Default | Description |
|---|---|---|
| PORT | 8080 | HTTP server port |
| RETRY_ATTEMPTS | 3 | Maximum retry attempts |
| RETRY_BACKOFF_MS | 200 | Initial backoff in milliseconds |
| REQUEST_TIMEOUT_MS | 1500 | Request timeout in milliseconds |
| SHUTDOWN_TIMEOUT_SEC | 20 | Graceful shutdown timeout in seconds |
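
One way the variables and defaults in the table could be loaded is sketched below. The `Config` struct, its field names, and the `LoadFrom` function are hypothetical and only illustrate the idea. The actual internal/config/config.go may be organized differently, and this sketch does not read a .env file.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
	"time"
)

// Config mirrors the documented variables; the struct and field names are hypothetical.
type Config struct {
	Port            int
	RetryAttempts   int
	RetryBackoff    time.Duration
	RequestTimeout  time.Duration
	ShutdownTimeout time.Duration
}

// LoadFrom builds a Config using lookup (for example os.LookupEnv) and keeps
// the documented default for any variable that is unset or empty.
func LoadFrom(lookup func(string) (string, bool)) (Config, error) {
	vals := map[string]int{
		"PORT":                 8080,
		"RETRY_ATTEMPTS":       3,
		"RETRY_BACKOFF_MS":     200,
		"REQUEST_TIMEOUT_MS":   1500,
		"SHUTDOWN_TIMEOUT_SEC": 20,
	}
	for key := range vals {
		raw, ok := lookup(key)
		if !ok || raw == "" {
			continue // keep the default
		}
		n, err := strconv.Atoi(raw)
		if err != nil || n < 0 {
			return Config{}, fmt.Errorf("%s: expected a non-negative integer, got %q", key, raw)
		}
		vals[key] = n
	}
	return Config{
		Port:            vals["PORT"],
		RetryAttempts:   vals["RETRY_ATTEMPTS"],
		RetryBackoff:    time.Duration(vals["RETRY_BACKOFF_MS"]) * time.Millisecond,
		RequestTimeout:  time.Duration(vals["REQUEST_TIMEOUT_MS"]) * time.Millisecond,
		ShutdownTimeout: time.Duration(vals["SHUTDOWN_TIMEOUT_SEC"]) * time.Second,
	}, nil
}

func main() {
	cfg, err := LoadFrom(os.LookupEnv)
	if err != nil {
		fmt.Fprintln(os.Stderr, "config error:", err)
		os.Exit(1)
	}
	fmt.Printf("%+v\n", cfg)
}
```

Passing the lookup function in, rather than calling `os.LookupEnv` directly, keeps the loader easy to test without touching the process environment.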
PORT=8080
RETRY_ATTEMPTS=3
RETRY_BACKOFF_MS=200
REQUEST_TIMEOUT_MS=1500
SHUTDOWN_TIMEOUT_SEC=20

.
├── cmd/
│   └── app/
│       └── main.go              # Application entry point
├── internal/
│   ├── config/
│   │   └── config.go            # Configuration management
│   ├── log/
│   │   └── logger.go            # Structured logging setup
│   ├── metrics/
│   │   └── metrics.go           # Prometheus metrics
│   ├── retry/
│   │   ├── retry.go             # Retry logic with backoff
│   │   └── retry_test.go        # Retry tests
│   ├── server/
│   │   └── http.go              # HTTP server and routes
│   └── shutdown/
│       ├── graceful.go          # Graceful shutdown handler
│       └── graceful_test.go     # Shutdown tests
├── deploy/
│   └── k8s/
│       ├── deployment.yaml      # Kubernetes deployment
│       ├── service.yaml         # Kubernetes service
│       ├── configmap.yaml       # Configuration map
│       ├── pdb.yaml             # Pod disruption budget
│       └── hpa.yaml             # Horizontal pod autoscaler
├── Dockerfile                   # Multi-stage Docker build
├── Makefile                     # Build commands
├── .golangci.yml                # Linter configuration
├── go.mod                       # Go dependencies
└── README.md                    # This file
make build
make test
make lint
The project uses golangci-lint configured via .golangci.yml. This file:
- Defines which linters to run
- Sets up rules and their severity
- Ensures consistent code quality across the project
- Can be customized for project-specific needs
Enabled linters:
- errcheck - Checks for unchecked errors in code
- gosimple - Simplifies code by applying Go's idiomatic patterns
- govet - Reports suspicious constructs in Go code (vet)
- ineffassign - Detects ineffectual assignments
- staticcheck - Static analysis with comprehensive checks
- unused - Finds unused code (constants, variables, functions, types, etc.)
- gofmt - Checks code formatting
- goimports - Checks import statement formatting
- misspell - Finds commonly misspelled English words in comments
- revive - Fast, configurable linter with customizable rules
Configuration highlights:
- revive checks exported names and return types
- errcheck includes type assertion checking
- staticcheck runs all available checks
- No issue limits (all violations are reported)
See .golangci.yml for complete configuration details.
make fmt
- Kubernetes cluster
- kubectl configured
kubectl apply -f deploy/k8s/
kubectl get pods -l app=resilient
kubectl get svc resilient
If using port-forward:
kubectl port-forward svc/resilient 8080:80
curl http://localhost:8080/ping
The service exposes Prometheus metrics at /metrics:
- http_requests_total - Total HTTP requests (by route and status)
- retry_attempts_total - Total retry attempts (by route)
- http_requests_in_flight - Current in-flight requests
- shutdown_duration_seconds - Graceful shutdown duration histogram
# Request rate by route
rate(http_requests_total[5m])
# Retry rate
rate(retry_attempts_total[5m])
# Current in-flight requests
http_requests_in_flight
The /ping endpoint demonstrates retry with exponential backoff:
- Attempts the external call with a timeout
- On failure, waits with exponential backoff (doubles each attempt)
- Checks context cancellation before each retry
- Logs each retry attempt
- Returns error if all attempts fail
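
The steps above can be sketched roughly as follows. The `Retry` function, its signature, and the `runDemo` helper are hypothetical names chosen for this illustration. The real internal/retry package may differ in its API and logging.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"log/slog"
	"time"
)

// Retry calls fn up to attempts times, giving each call its own timeout and
// doubling the wait between failures. Name and signature are hypothetical.
func Retry(ctx context.Context, attempts int, backoff, timeout time.Duration, fn func(context.Context) error) error {
	if attempts < 1 {
		attempts = 1
	}
	var err error
	for i := 1; i <= attempts; i++ {
		// Stop before starting an attempt if the caller has already given up.
		if ctxErr := ctx.Err(); ctxErr != nil {
			return ctxErr
		}
		attemptCtx, cancel := context.WithTimeout(ctx, timeout)
		err = fn(attemptCtx)
		cancel()
		if err == nil {
			return nil
		}
		if i == attempts {
			break
		}
		slog.Warn("attempt failed, retrying", "attempt", i, "backoff", backoff, "error", err)
		select {
		case <-time.After(backoff):
			backoff *= 2 // exponential backoff: double the wait each time
		case <-ctx.Done():
			return ctx.Err() // cancellation interrupts the wait immediately
		}
	}
	return fmt.Errorf("all %d attempts failed: %w", attempts, err)
}

// runDemo simulates a call that fails the first `failures` times, retried with
// at most three attempts. It returns how many times the call was made.
func runDemo(failures int) (calls int, err error) {
	err = Retry(context.Background(), 3, 10*time.Millisecond, time.Second, func(context.Context) error {
		calls++
		if calls <= failures {
			return errors.New("temporary failure")
		}
		return nil
	})
	return calls, err
}

func main() {
	calls, err := runDemo(2)
	fmt.Println("calls:", calls, "error:", err)
}
```

Waiting inside a `select` rather than with `time.Sleep` is what lets a cancelled context cut the backoff short.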
On receiving SIGTERM or SIGINT:
- Stops accepting new connections
- Waits for active requests to complete
- Shuts down within the configured timeout
- Logs shutdown duration
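
A minimal sketch of this sequence, built on `signal.NotifyContext` and `http.Server.Shutdown` from the standard library, is shown below. The `serveUntilDone` and `demo` functions are hypothetical. The demo calls `stop()` to stand in for a real signal so that the example exits on its own. The template's internal/shutdown package may be structured differently.

```go
package main

import (
	"context"
	"errors"
	"log/slog"
	"net"
	"net/http"
	"os/signal"
	"syscall"
	"time"
)

// serveUntilDone serves on ln until ctx is cancelled, then drains active
// requests for at most drain. It returns how long the shutdown took.
func serveUntilDone(ctx context.Context, srv *http.Server, ln net.Listener, drain time.Duration) (time.Duration, error) {
	errCh := make(chan error, 1)
	go func() { errCh <- srv.Serve(ln) }()

	select {
	case err := <-errCh:
		return 0, err // the server stopped before shutdown was requested
	case <-ctx.Done():
	}

	start := time.Now()
	shutdownCtx, cancel := context.WithTimeout(context.Background(), drain)
	defer cancel()
	// Shutdown closes the listener, then waits for in-flight requests.
	err := srv.Shutdown(shutdownCtx)
	if serveErr := <-errCh; err == nil && !errors.Is(serveErr, http.ErrServerClosed) {
		err = serveErr
	}
	return time.Since(start), err
}

// demo wires the signal context, sends one slow request, and simulates the
// signal while that request is in flight. It returns the client's status code.
func demo() (int, time.Duration, error) {
	// ctx is cancelled when SIGTERM or SIGINT arrives, or when stop is called.
	ctx, stop := signal.NotifyContext(context.Background(), syscall.SIGTERM, syscall.SIGINT)
	defer stop()

	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return 0, 0, err
	}
	started := make(chan struct{})
	mux := http.NewServeMux()
	mux.HandleFunc("/ping", func(w http.ResponseWriter, _ *http.Request) {
		close(started)
		time.Sleep(200 * time.Millisecond) // simulated work
		w.WriteHeader(http.StatusOK)
	})

	statusCh := make(chan int, 1)
	go func() {
		resp, err := http.Get("http://" + ln.Addr().String() + "/ping")
		if err != nil {
			stop() // unblock the server if the request never got through
			statusCh <- 0
			return
		}
		resp.Body.Close()
		statusCh <- resp.StatusCode
	}()
	// Demo only: stand in for a real signal once the request is being handled.
	go func() { <-started; stop() }()

	took, err := serveUntilDone(ctx, &http.Server{Handler: mux}, ln, 20*time.Second)
	return <-statusCh, took, err
}

func main() {
	status, took, err := demo()
	slog.Info("shutdown complete", "status", status, "duration", took, "error", err)
}
```

In Kubernetes, the pod's terminationGracePeriodSeconds should be longer than the drain timeout, otherwise the kubelet may send SIGKILL before draining finishes.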
- Request context (r.Context()) is passed through all layers
- Retry logic checks ctx.Done() before each attempt
- External calls respect request timeouts
- Cancelled context immediately stops retries
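
To illustrate the propagation described here, the sketch below derives a per-request timeout from `r.Context()` and passes it to a simulated downstream call. The `callExternal`, `pingHandler`, and `statusFor` names, and the choice of 504 for a timeout and 502 for other failures, are assumptions made for this example.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"net/http"
	"net/http/httptest"
	"time"
)

// callExternal stands in for a downstream call (hypothetical). It needs `work`
// to finish but returns early if ctx is cancelled or its deadline passes.
func callExternal(ctx context.Context, work time.Duration) error {
	select {
	case <-time.After(work):
		return nil
	case <-ctx.Done():
		return ctx.Err()
	}
}

// pingHandler bounds the downstream call with a timeout derived from the
// request context, so a client disconnect or deadline cancels the call too.
func pingHandler(timeout, work time.Duration) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		ctx, cancel := context.WithTimeout(r.Context(), timeout)
		defer cancel()
		if err := callExternal(ctx, work); err != nil {
			status := http.StatusBadGateway
			if errors.Is(err, context.DeadlineExceeded) {
				status = http.StatusGatewayTimeout
			}
			http.Error(w, err.Error(), status)
			return
		}
		fmt.Fprintln(w, `{"pong": true}`)
	}
}

// statusFor runs the handler once in-process and returns the HTTP status.
func statusFor(timeout, work time.Duration) int {
	rec := httptest.NewRecorder()
	req := httptest.NewRequest(http.MethodGet, "/ping", nil)
	pingHandler(timeout, work)(rec, req)
	return rec.Code
}

func main() {
	// The call finishes well inside the timeout.
	fmt.Println(statusFor(50*time.Millisecond, time.Millisecond))
	// The timeout expires before the call finishes.
	fmt.Println(statusFor(time.Millisecond, 50*time.Millisecond))
}
```

Because the timeout context is a child of the request context, cancelling the request cancels every call made on its behalf.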
Run all tests:
go test ./...
Run tests with coverage:
go test -cover ./...
Run specific test:
go test ./internal/retry -v
Test graceful and hard shutdown scenarios with Docker:
./test-shutdown.sh
This script tests:
- Graceful shutdown (SIGTERM)
- Hard kill (SIGKILL)
- Active requests during shutdown
- Metrics verification
See SHUTDOWN_TEST_RESULTS.md for detailed results.
To verify that Prometheus metrics are working correctly:
./test-metrics.sh
Or manually check metrics:
curl http://localhost:8080/metrics | grep -E "(http_requests_total|retry_attempts_total|http_requests_in_flight)"

- Graceful shutdown with configurable drain period
- Health and liveness probes for Kubernetes
- Prometheus metrics for monitoring
- Structured JSON logging
- Context-aware retry logic
- Minimal Docker image (distroless)
- Resource limits in Kubernetes manifests
- Pod disruption budget for high availability
- Horizontal pod autoscaler configuration
Check logs for configuration errors:
./bin/resilient
Verify environment variables:
env | grep -E "(PORT|RETRY|SHUTDOWN)"

- Check RETRY_ATTEMPTS environment variable
- Verify logs for retry attempts
- Check metrics at /metrics for retry counts

- Verify SHUTDOWN_TIMEOUT_SEC is sufficient
- Check Kubernetes terminationGracePeriodSeconds
- Review logs for shutdown duration
- Follow Go code style guidelines
- Add tests for new features
- Update documentation
- Run make lint before committing
[Add your license here]
Based on patterns from: Building Resilient Go Services
For detailed technical requirements, see ASSIGNMENT.md