dos0/resilient

Resilient Go Microservice

Production-ready Go microservice template demonstrating resilient service patterns including graceful shutdown, context propagation, and retry/timeout mechanisms.

Features

  • Graceful Shutdown - Clean termination on SIGTERM/SIGINT with configurable drain timeout
  • Context Propagation - Request context is passed through all layers, so a cancelled request stops retries immediately
  • Retry with Exponential Backoff - Configurable number of attempts and initial backoff, with the delay doubling after each failed attempt
  • Prometheus Metrics - Metrics for request counts, retry attempts, in-flight requests, and shutdown duration
  • Structured Logging - JSON-formatted logs using Go's log/slog
  • Kubernetes Ready - Includes deployment manifests with probes, PDB, and HPA
  • 12-Factor App - Configuration via environment variables with .env support
  • Production Dockerfile - Multi-stage build with distroless base image

Quick Start

Prerequisites

  • Go 1.23+
  • Docker (optional, for containerized builds)
  • Make

Local Development

  1. Clone the repository:
git clone <repository-url>
cd resilient
  2. Install dependencies:
go mod download
  3. Create .env file (optional, uses defaults otherwise):
cp .env.example .env
# Edit .env with your settings
  4. Run the service:
make run

Or build and run manually:

make build
./bin/resilient

The service will start on port 8080 by default.

Docker

Build Docker image:

make docker-build

Run container:

docker run -p 8080:8080 --env-file .env resilient:latest

API Endpoints

Method  Path      Description
GET     /healthz  Readiness probe
GET     /livez    Liveness probe
GET     /ping     Main endpoint with retry logic
GET     /metrics  Prometheus metrics endpoint

Example Usage

# Readiness probe
curl http://localhost:8080/healthz

# Liveness probe
curl http://localhost:8080/livez

# Main endpoint (may retry internally)
curl http://localhost:8080/ping

# Prometheus metrics
curl http://localhost:8080/metrics

Response Examples

/ping response:

{
  "pong": true
}

/healthz response:

{
  "ready": true
}
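
The two responses above could be produced by handlers along the following lines. This is a minimal sketch, not the repository's internal/server/http.go: the helper names writeJSON, newMux, and get are hypothetical, and the encoder emits compact JSON rather than the pretty-printed form shown above.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// writeJSON encodes v as the JSON response body with the given status code.
func writeJSON(w http.ResponseWriter, status int, v any) {
	w.Header().Set("Content-Type", "application/json")
	w.WriteHeader(status)
	_ = json.NewEncoder(w).Encode(v)
}

// newMux registers the two documented JSON endpoints (hypothetical wiring).
func newMux() *http.ServeMux {
	mux := http.NewServeMux()
	mux.HandleFunc("/healthz", func(w http.ResponseWriter, _ *http.Request) {
		writeJSON(w, http.StatusOK, map[string]bool{"ready": true})
	})
	mux.HandleFunc("/ping", func(w http.ResponseWriter, _ *http.Request) {
		writeJSON(w, http.StatusOK, map[string]bool{"pong": true})
	})
	return mux
}

// get issues an in-memory GET request and returns the status code and body.
func get(path string) (int, string) {
	rec := httptest.NewRecorder()
	newMux().ServeHTTP(rec, httptest.NewRequest(http.MethodGet, path, nil))
	return rec.Code, rec.Body.String()
}

func main() {
	for _, p := range []string{"/healthz", "/ping"} {
		code, body := get(p)
		fmt.Printf("%s -> %d %s", p, code, body)
	}
}
```

The sketch exercises the handlers through net/http/httptest, so it runs without binding a port.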

Configuration

All configuration is done via environment variables:

Variable              Default  Description
PORT                  8080     HTTP server port
RETRY_ATTEMPTS        3        Maximum retry attempts
RETRY_BACKOFF_MS      200      Initial backoff in milliseconds
REQUEST_TIMEOUT_MS    1500     Request timeout in milliseconds
SHUTDOWN_TIMEOUT_SEC  20       Graceful shutdown timeout in seconds

Example .env file

PORT=8080
RETRY_ATTEMPTS=3
RETRY_BACKOFF_MS=200
REQUEST_TIMEOUT_MS=1500
SHUTDOWN_TIMEOUT_SEC=20
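
As an illustration of how these variables might be read with their documented defaults, here is a minimal sketch. It is not the repository's internal/config/config.go: the Config struct, its field names, and the envInt and Load helpers are assumptions, and .env file loading is left out.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
	"time"
)

// Config mirrors the documented environment variables.
// The struct and field names are illustrative assumptions.
type Config struct {
	Port            string
	RetryAttempts   int
	RetryBackoff    time.Duration
	RequestTimeout  time.Duration
	ShutdownTimeout time.Duration
}

// envInt returns the integer value of key, or def when the variable
// is unset or is not a valid integer.
func envInt(key string, def int) int {
	v, ok := os.LookupEnv(key)
	if !ok {
		return def
	}
	n, err := strconv.Atoi(v)
	if err != nil {
		return def
	}
	return n
}

// Load builds a Config from the environment, applying the documented defaults.
func Load() Config {
	port := os.Getenv("PORT")
	if port == "" {
		port = "8080"
	}
	return Config{
		Port:            port,
		RetryAttempts:   envInt("RETRY_ATTEMPTS", 3),
		RetryBackoff:    time.Duration(envInt("RETRY_BACKOFF_MS", 200)) * time.Millisecond,
		RequestTimeout:  time.Duration(envInt("REQUEST_TIMEOUT_MS", 1500)) * time.Millisecond,
		ShutdownTimeout: time.Duration(envInt("SHUTDOWN_TIMEOUT_SEC", 20)) * time.Second,
	}
}

func main() {
	fmt.Printf("%+v\n", Load())
}
```

Falling back to the default on an unparsable value keeps the sketch short; a real service may prefer to fail at startup so that a typo in the environment is noticed.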

Project Structure

.
├── cmd/
│   └── app/
│       └── main.go              # Application entry point
├── internal/
│   ├── config/
│   │   └── config.go           # Configuration management
│   ├── log/
│   │   └── logger.go          # Structured logging setup
│   ├── metrics/
│   │   └── metrics.go          # Prometheus metrics
│   ├── retry/
│   │   ├── retry.go             # Retry logic with backoff
│   │   └── retry_test.go        # Retry tests
│   ├── server/
│   │   └── http.go              # HTTP server and routes
│   └── shutdown/
│       ├── graceful.go          # Graceful shutdown handler
│       └── graceful_test.go    # Shutdown tests
├── deploy/
│   └── k8s/
│       ├── deployment.yaml      # Kubernetes deployment
│       ├── service.yaml         # Kubernetes service
│       ├── configmap.yaml       # Configuration map
│       ├── pdb.yaml             # Pod disruption budget
│       └── hpa.yaml             # Horizontal pod autoscaler
├── Dockerfile                   # Multi-stage Docker build
├── Makefile                     # Build commands
├── .golangci.yml                # Linter configuration
├── go.mod                       # Go dependencies
└── README.md                    # This file

Development

Build

make build

Run Tests

make test

Lint

make lint

The project uses golangci-lint configured via .golangci.yml. This file:

  • Defines which linters to run
  • Sets up rules and their severity
  • Ensures consistent code quality across the project
  • Can be customized for project-specific needs

Enabled linters:

  • errcheck - Checks for unchecked errors in code
  • gosimple - Suggests simplifications for needlessly complex code
  • govet - Reports suspicious constructs in Go code (vet)
  • ineffassign - Detects ineffectual assignments
  • staticcheck - Static analysis with comprehensive checks
  • unused - Finds unused code (constants, variables, functions, types, etc.)
  • gofmt - Checks code formatting
  • goimports - Checks import statement formatting
  • misspell - Finds commonly misspelled English words in comments
  • revive - Fast, configurable linter with customizable rules

Configuration highlights:

  • revive checks exported names and return types
  • errcheck includes type assertion checking
  • staticcheck runs all available checks
  • No issue limits (all violations are reported)

See .golangci.yml for complete configuration details.

Format Code

make fmt

Kubernetes Deployment

Prerequisites

  • Kubernetes cluster
  • kubectl configured

Deploy

kubectl apply -f deploy/k8s/

Verify Deployment

kubectl get pods -l app=resilient
kubectl get svc resilient

Access the Service

If using port-forward:

kubectl port-forward svc/resilient 8080:80
curl http://localhost:8080/ping

Metrics

The service exposes Prometheus metrics at /metrics:

  • http_requests_total - Total HTTP requests (by route and status)
  • retry_attempts_total - Total retry attempts (by route)
  • http_requests_in_flight - Current in-flight requests
  • shutdown_duration_seconds - Graceful shutdown duration histogram

Example Metrics Query

# Request rate per series (one series per route and status combination)
rate(http_requests_total[5m])

# Retry rate
rate(retry_attempts_total[5m])

# Current in-flight requests
http_requests_in_flight

Architecture

Retry Logic

The /ping endpoint demonstrates retry with exponential backoff:

  1. Attempts the external call with a timeout
  2. On failure, waits with exponential backoff (doubles each attempt)
  3. Checks context cancellation before each retry
  4. Logs each retry attempt
  5. Returns error if all attempts fail
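
The steps above can be sketched as follows. This is an illustration under stated assumptions rather than the code in internal/retry/retry.go: the function name Do, its signature, and the flakyCalls demo helper are hypothetical, and the per-attempt timeout from step 1 is omitted to keep the loop readable.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// Do calls fn up to attempts times. After each failure it waits for backoff,
// then doubles it. It returns as soon as ctx is cancelled.
func Do(ctx context.Context, attempts int, backoff time.Duration, fn func(context.Context) error) error {
	if attempts < 1 {
		attempts = 1
	}
	var lastErr error
	for i := 1; i <= attempts; i++ {
		// Check for cancellation before every attempt.
		if err := ctx.Err(); err != nil {
			return err
		}
		if lastErr = fn(ctx); lastErr == nil {
			return nil
		}
		if i == attempts {
			break // no point waiting after the final attempt
		}
		fmt.Printf("attempt %d failed (%v), retrying in %s\n", i, lastErr, backoff)

		// Wait for the backoff, but give up at once if ctx is cancelled.
		timer := time.NewTimer(backoff)
		select {
		case <-ctx.Done():
			timer.Stop()
			return ctx.Err()
		case <-timer.C:
		}
		backoff *= 2
	}
	return fmt.Errorf("all %d attempts failed: %w", attempts, lastErr)
}

// flakyCalls runs Do against a function that fails twice and then succeeds,
// and reports how many calls were made.
func flakyCalls() (int, error) {
	calls := 0
	err := Do(context.Background(), 3, 10*time.Millisecond, func(context.Context) error {
		calls++
		if calls < 3 {
			return errors.New("temporary failure")
		}
		return nil
	})
	return calls, err
}

func main() {
	calls, err := flakyCalls()
	fmt.Println("calls:", calls, "err:", err)
}
```

With the documented defaults (3 attempts, 200 ms initial backoff), a loop of this shape would wait 200 ms after the first failure and 400 ms after the second.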

Graceful Shutdown

On receiving SIGTERM or SIGINT:

  1. Stops accepting new connections
  2. Waits for active requests to complete
  3. Shuts down within the configured timeout
  4. Logs shutdown duration
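
A minimal sketch of this sequence using only the standard library is shown below. It is not the repository's internal/shutdown/graceful.go: the run function is hypothetical, the 20-second drain is hard-coded from the documented default, and the demo sends itself SIGTERM after a short delay (Unix-like systems only) so that it exits on its own.

```go
package main

import (
	"context"
	"errors"
	"log/slog"
	"net"
	"net/http"
	"os"
	"os/signal"
	"syscall"
	"time"
)

// run serves HTTP until SIGTERM or SIGINT arrives, then drains active
// requests for up to 20 seconds (the documented SHUTDOWN_TIMEOUT_SEC default).
func run() error {
	// stop is cancelled when the process receives SIGTERM or SIGINT.
	stop, cancel := signal.NotifyContext(context.Background(), syscall.SIGTERM, os.Interrupt)
	defer cancel()

	ln, err := net.Listen("tcp", "127.0.0.1:0") // random free port for the demo
	if err != nil {
		return err
	}
	srv := &http.Server{Handler: http.NewServeMux()}
	serveErr := make(chan error, 1)
	go func() { serveErr <- srv.Serve(ln) }()

	// Demo only: signal this process so the example terminates by itself.
	go func() {
		time.Sleep(100 * time.Millisecond)
		if p, err := os.FindProcess(os.Getpid()); err == nil {
			_ = p.Signal(syscall.SIGTERM)
		}
	}()

	select {
	case err := <-serveErr:
		return err // the server failed before any signal arrived
	case <-stop.Done():
	}

	start := time.Now()
	ctx, cancelDrain := context.WithTimeout(context.Background(), 20*time.Second)
	defer cancelDrain()
	// Shutdown closes the listener, then waits for active requests to finish
	// or for ctx to expire, whichever comes first.
	err = srv.Shutdown(ctx)
	slog.Info("shutdown finished", "duration", time.Since(start).String())
	return err
}

func main() {
	if err := run(); err != nil && !errors.Is(err, http.ErrServerClosed) {
		slog.Error("server error", "err", err)
		os.Exit(1)
	}
}
```

If the drain context expires first, Shutdown returns the context's error and any remaining connections are left open, which is why the timeout needs to be long enough for the slowest expected request.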

Context Propagation

  • Request context (r.Context()) is passed through all layers
  • Retry logic checks ctx.Done() before each attempt
  • External calls respect request timeouts
  • Cancelled context immediately stops retries
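
The following sketch illustrates the idea with plain functions. The names callExternal, handle, timedOut, and cancelledParentStops are hypothetical; in an HTTP handler the parent context would be r.Context(), and the timeout would come from REQUEST_TIMEOUT_MS.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// callExternal stands in for a downstream call that needs `work` to finish
// unless its context ends first.
func callExternal(ctx context.Context, work time.Duration) error {
	select {
	case <-time.After(work):
		return nil
	case <-ctx.Done():
		return ctx.Err()
	}
}

// handle derives a per-request deadline from the parent context and passes
// the derived context down to the external call.
func handle(parent context.Context, timeout, work time.Duration) error {
	ctx, cancel := context.WithTimeout(parent, timeout)
	defer cancel()
	return callExternal(ctx, work)
}

// timedOut reports whether a slow call is cut off by the request timeout.
func timedOut() bool {
	err := handle(context.Background(), 20*time.Millisecond, time.Second)
	return errors.Is(err, context.DeadlineExceeded)
}

// cancelledParentStops reports whether a cancelled parent context stops
// the call without waiting for the work or the timeout.
func cancelledParentStops() bool {
	parent, cancel := context.WithCancel(context.Background())
	cancel() // simulates a client that has already disconnected
	err := handle(parent, time.Second, time.Second)
	return errors.Is(err, context.Canceled)
}

func main() {
	fmt.Println("timed out:", timedOut())
	fmt.Println("cancelled parent stops call:", cancelledParentStops())
}
```

Because the timeout context is derived from the parent, either a client disconnect or the deadline ends the downstream call, and a retry loop that checks the same context stops with it.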

Testing

Run all tests:

go test ./...

Run tests with coverage:

go test -cover ./...

Run specific test:

go test ./internal/retry -v

Docker Shutdown Testing

Test graceful and hard shutdown scenarios with Docker:

./test-shutdown.sh

This script tests:

  • Graceful shutdown (SIGTERM)
  • Hard kill (SIGKILL)
  • Active requests during shutdown
  • Metrics verification

See SHUTDOWN_TEST_RESULTS.md for detailed results.

Metrics Verification

To verify that Prometheus metrics are working correctly:

./test-metrics.sh

Or manually check metrics:

curl http://localhost:8080/metrics | grep -E "(http_requests_total|retry_attempts_total|http_requests_in_flight)"

Production Considerations

  • Graceful shutdown with configurable drain period
  • Readiness and liveness probes for Kubernetes
  • Prometheus metrics for monitoring
  • Structured JSON logging
  • Context-aware retry logic
  • Minimal Docker image (distroless)
  • Resource limits in Kubernetes manifests
  • Pod disruption budget for high availability
  • Horizontal pod autoscaler configuration

Troubleshooting

Service won't start

Check logs for configuration errors:

./bin/resilient

Verify environment variables:

env | grep -E "(PORT|RETRY|REQUEST_TIMEOUT|SHUTDOWN)"

Retries not working

  • Check RETRY_ATTEMPTS environment variable
  • Verify logs for retry attempts
  • Check metrics at /metrics for retry counts

Graceful shutdown issues

  • Verify SHUTDOWN_TIMEOUT_SEC is sufficient
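  • Check that Kubernetes terminationGracePeriodSeconds is longer than SHUTDOWN_TIMEOUT_SEC; otherwise the pod receives SIGKILL before the drain finishes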
  • Review logs for shutdown duration

Contributing

  1. Follow Go code style guidelines
  2. Add tests for new features
  3. Update documentation
  4. Run make lint before committing

License

[Add your license here]

References

Based on patterns from: Building Resilient Go Services


For detailed technical requirements, see ASSIGNMENT.md
