Composable resilience for Go function calls.
It all started with killing a 1500-line framework. We had 20 files, 12 terms, and a God Object — all for one idea: "if a function fails, maybe try again." We deleted everything and replaced it with one type: func(ctx, call) error. That type covers retry, backoff, and any future resilience pattern — composed like Lego.
Mono-module today. The project is currently a single Go module, so `go get` pulls all transitive dependencies, including the OTEL SDK. A zero-dependency core via separate Go modules is the target architecture; it is blocked on multimod, a standalone tool extracted from this repository.
```bash
go get github.com/thumbrise/resilience
```

```go
// One-off call: no setup needed
err := resilience.Do(ctx, callAPI,
	retry.On(ErrTimeout, 3, backoff.Exponential(1*time.Second, 30*time.Second)),
)

// Multiple retry rules: different errors, different strategies
err = resilience.Do(ctx, callAPI,
	retry.On(ErrTimeout, 3, backoff.Exponential(1*time.Second, 30*time.Second)),
	retry.On(ErrRateLimit, 5, backoff.Constant(10*time.Second),
		retry.WithWaitHint(extractRetryAfter), // Retry-After header overrides backoff
	),
)

// Client with OTEL observability: create once, pass everywhere
client := resilience.NewClient(rsotel.Plugin())
err = client.Call(callAPI).
	With(retry.On(ErrTimeout, 3, backoff.Exponential(1*time.Second, 30*time.Second))).
	Do(ctx)
```

Every resilience pattern has the same shape:
```go
type Option func(ctx context.Context, call func(context.Context) error) error
```

Retry, timeout, circuit breaker, bulkhead: any pattern is a function that wraps a call. The architecture is designed so that future patterns (timeout, circuit breaker, rate limiter) are the same `Option` type. Write your own in 5–15 lines:
```go
// Timeout: not shipped yet, but this is all it takes.
func Timeout(d time.Duration) resilience.Option {
	return func(ctx context.Context, call func(context.Context) error) error {
		ctx, cancel := context.WithTimeout(ctx, d)
		defer cancel()
		return call(ctx)
	}
}
```

See the roadmap for what's ready and what's planned.
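To illustrate how such wrappers nest, here is a minimal sketch that declares a local `Option` type with the same shape, so it runs without the library. The names `retryN` and `chain` are hypothetical and are not the library's API; the real `With` may order or wire options differently.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// Option mirrors the shape shown above; it is declared locally
// so this sketch does not depend on the library.
type Option func(ctx context.Context, call func(context.Context) error) error

// timeout is the Timeout example from above, adapted to the local type.
func timeout(d time.Duration) Option {
	return func(ctx context.Context, call func(context.Context) error) error {
		ctx, cancel := context.WithTimeout(ctx, d)
		defer cancel()
		return call(ctx)
	}
}

// retryN (hypothetical) invokes the call up to n times until it succeeds.
func retryN(n int) Option {
	return func(ctx context.Context, call func(context.Context) error) error {
		var err error
		for i := 0; i < n; i++ {
			if err = call(ctx); err == nil {
				return nil
			}
			if ctx.Err() != nil { // caller gave up: stop retrying
				return ctx.Err()
			}
		}
		return err
	}
}

// chain (hypothetical) nests options so that the first one is outermost.
func chain(opts ...Option) Option {
	return func(ctx context.Context, call func(context.Context) error) error {
		wrapped := call
		for i := len(opts) - 1; i >= 0; i-- {
			opt, next := opts[i], wrapped
			wrapped = func(ctx context.Context) error { return opt(ctx, next) }
		}
		return wrapped(ctx)
	}
}

// demo runs a call that fails twice and then succeeds,
// and reports how many attempts were made.
func demo() (int, error) {
	attempts := 0
	flaky := func(ctx context.Context) error {
		attempts++
		if attempts < 3 {
			return errors.New("temporary failure")
		}
		return nil
	}
	// Retry is outermost, so every attempt gets its own fresh timeout.
	err := chain(retryN(5), timeout(time.Second))(context.Background(), flaky)
	return attempts, err
}

func main() {
	attempts, err := demo()
	fmt.Println(attempts, err) // 3 <nil>
}
```

Because every pattern has the same signature, ordering is the only composition decision: placing `retryN` outside `timeout` gives each attempt its own deadline, while the reverse would put one deadline around all attempts.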
| | Option | Plugin |
|---|---|---|
| What | `func(ctx, call) error` | Interface: `Name()` + `Events()` |
| Controls execution? | Yes, wraps the call | No, observes only |
| State | Per-call, fresh each `Do()` | Shared across calls |
| Use for | retry (ready), timeout/circuit/bulkhead (planned) | metrics (OTEL, ready), logging (planned) |
The compiler enforces the boundary: With() accepts Option, NewClient() accepts Plugin. Can't mix them up.
| Package | What's inside |
|---|---|
| `resilience` | Core: `Do`, `Client`, `Option`, `Plugin`, `Events` |
| `resilience/backoff` | `Exponential`, `ExponentialWith`, `Constant`, `Default` |
| `resilience/retry` | `On`, `OnFunc`, `WithWaitHint` |
| `resilience/otel` | `Plugin()` for OTEL metrics |
All packages live in a single Go module today. See devlog #3 for the multi-module plan.
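To make a call like `backoff.Exponential(1*time.Second, 30*time.Second)` more concrete, the sketch below computes a capped exponential schedule. It assumes the first argument is the base delay and the second is the cap, with a doubling factor and no jitter; none of this is confirmed by the text above, and the library's actual curve may differ. `exponentialDelay` and `schedule` are hypothetical helpers.

```go
package main

import (
	"fmt"
	"time"
)

// exponentialDelay (hypothetical) returns base * 2^attempt, capped at max.
// attempt is zero-based, so attempt 0 waits exactly base.
func exponentialDelay(base, max time.Duration, attempt int) time.Duration {
	d := base
	for i := 0; i < attempt; i++ {
		if d >= max/2 { // doubling would reach or exceed the cap
			return max
		}
		d *= 2
	}
	if d > max {
		return max
	}
	return d
}

// schedule returns the delays, in whole seconds, for the first n attempts
// with a 1s base and a 30s cap.
func schedule(n int) []int {
	out := make([]int, 0, n)
	for attempt := 0; attempt < n; attempt++ {
		d := exponentialDelay(time.Second, 30*time.Second, attempt)
		out = append(out, int(d/time.Second))
	}
	return out
}

func main() {
	fmt.Println(schedule(7)) // [1 2 4 8 16 30 30]
}
```

Checking against the cap before doubling keeps the delay from overflowing on high attempt counts.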
| | cenkalti/backoff | sony/gobreaker | failsafe-go | resilience |
|---|---|---|---|---|
| Custom pattern in 10 lines | no | no | no | `func(ctx, call) error` |
| Compose retry + timeout + circuit | manually | no | builder chain | `With(a, b, c)` |
| Observability without code changes | no | no | per-policy | Plugin interface |
| `context.Context` by design | bolted on | no | yes | yes |
| Error matching | `Permanent()` only | no | yes | `errors.Is` + `errors.As` + custom |
| WaitHint (Retry-After) | no | N/A | no | yes |
| Per-call state (no data races) | N/A | shared mutable | shared mutable | by construction |
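The error-matching and WaitHint rows rest on the standard library's `errors.Is` and `errors.As`. The sketch below shows both on wrapped errors, using only the standard library. `RateLimitError`, `classify`, and the signature of `extractRetryAfter` are assumptions made for illustration; the signature that `retry.WithWaitHint` actually expects is not stated here.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// ErrTimeout is a sentinel error, matched by identity with errors.Is.
var ErrTimeout = errors.New("timeout")

// RateLimitError (hypothetical) carries the server's Retry-After value.
type RateLimitError struct {
	RetryAfter time.Duration
}

func (e *RateLimitError) Error() string {
	return fmt.Sprintf("rate limited, retry after %s", e.RetryAfter)
}

// extractRetryAfter (hypothetical signature) reports the wait hint
// carried anywhere in err's wrap chain, if there is one.
func extractRetryAfter(err error) (time.Duration, bool) {
	var rl *RateLimitError
	if errors.As(err, &rl) {
		return rl.RetryAfter, true
	}
	return 0, false
}

// classify applies sentinel matching first, then typed matching.
func classify(err error) string {
	if errors.Is(err, ErrTimeout) {
		return "timeout"
	}
	if wait, ok := extractRetryAfter(err); ok {
		return fmt.Sprintf("rate-limit wait=%s", wait)
	}
	return "other"
}

// demo classifies two wrapped errors and one unrelated error.
func demo() []string {
	wrappedTimeout := fmt.Errorf("GET /items: %w", ErrTimeout)
	wrappedLimit := fmt.Errorf("GET /items: %w", &RateLimitError{RetryAfter: 10 * time.Second})
	return []string{
		classify(wrappedTimeout),
		classify(wrappedLimit),
		classify(errors.New("boom")),
	}
}

func main() {
	for _, s := range demo() {
		fmt.Println(s)
	}
}
```

Both checks walk the `%w` wrap chain, so matching keeps working when intermediate layers add context to the error.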
thumbrise.github.io/resilience — full docs, guide, devlog.
- Getting Started — install, first call, core concepts
- Retry — error matching, budgets, WaitHint
- Backoff — Exponential, Constant, custom
- OTEL — metrics plugin, one line setup
- Roadmap — what's ready, what's next, what's on the horizon
- Devlog — design decisions, dead ends, lessons learned
Apache 2.0