A fast, concurrent pipeline library for Go with support for complex DAGs, automatic retries, and circuit breakers.
It's an implementation of the ideas from the original post: How to design a pipeline in Go.
Benchmarked on an Apple M1 Pro. Pretty happy with these numbers:
| Pipeline | Latency | Throughput | Memory |
|---|---|---|---|
| Single Node | 1.8 μs | 547K ops/sec | 792 B/op |
| Two Nodes | 5.2 μs | 192K ops/sec | 1.4 KB/op |
| Three Nodes | 8.8 μs | 113K ops/sec | 2.1 KB/op |
The actual processing is around 40 ns per node. Most of the overhead comes from Go's channels and goroutine scheduling, which honestly doesn't leave much room for further optimization.
```text
BenchmarkThroughput-8    2,824,549    6,878 ns/op    145,394 ops/sec

BenchmarkLatencyBreakdown:
  SingleNode-8           6,675,679    1,825 ns/op      792 B/op    13 allocs/op
  TwoNodes-8             2,331,255    5,202 ns/op    1,464 B/op    26 allocs/op
  ThreeNodes-8           1,372,046    8,771 ns/op    2,136 B/op    39 allocs/op
```
Install with:

```bash
go get github.com/huahuayu/pipeline
```

Here's the simplest case - a two-stage pipeline that converts strings to uppercase and then counts characters:
```go
package main

import (
	"context"
	"fmt"
	"strings"
	"time"

	"github.com/huahuayu/pipeline"
)

func main() {
	// First node: convert to uppercase
	toUpper := pipeline.NewNode[string, string]("uppercase",
		func(ctx context.Context, input string) (string, error) {
			return strings.ToUpper(input), nil
		})

	// Second node: count characters
	counter := pipeline.NewNode[string, int]("counter",
		func(ctx context.Context, input string) (int, error) {
			return len(input), nil
		})

	// Wire them together
	toUpper.Connect(counter)

	// Create pipeline starting from the first node
	p, err := pipeline.NewPipeline(toUpper)
	if err != nil {
		panic(err)
	}

	// Start it up
	ctx := context.Background()
	if err := p.Start(ctx); err != nil {
		panic(err)
	}

	// Send some data
	if err := p.SendWithTimeout("hello world", 5*time.Second); err != nil {
		fmt.Printf("Failed: %v\n", err)
	}

	// Clean shutdown
	if err := p.Stop(10*time.Second); err != nil {
		fmt.Printf("Shutdown error: %v\n", err)
	}
}
```

Key features:

- Type-safe with generics - No interface{} shenanigans, full compile-time type checking
- DAG support - Not just linear pipelines, build any directed acyclic graph
- Context aware - Proper cancellation and timeout support throughout
- Panic recovery - Workers handle panics gracefully
- Built-in metrics - Track throughput, latency, queue sizes
- Circuit breakers - Prevent cascading failures (disabled by default)
- Retries with backoff - Configurable retry logic (disabled by default)
- Zero dependencies - Just standard library
By default, the pipeline runs with minimal features enabled (fail-fast mode). You can enable more complex behavior when needed:
```go
config := pipeline.NodeConfig{
	BufferSize: 100, // Job queue size
	Workers:    8,   // Concurrent workers

	// Retries are disabled by default
	MaxRetries: 3,
	RetryDelay: 100 * time.Millisecond,

	// Circuit breaker is disabled by default
	CircuitBreaker: pipeline.CircuitBreakerConfig{
		Enabled:          true,
		FailureThreshold: 5,
		ResetTimeout:     30 * time.Second,
	},
}

node := pipeline.NewNode[Input, Output]("my-node", processFunc, config)
```

You're not limited to linear pipelines. Here's a diamond pattern:
```go
// Build this graph:
//
//      B -> D
//     /      \
//    A        F
//     \      /
//      C -> E
nodeA.Connect(nodeB)
nodeA.Connect(nodeC)
nodeB.Connect(nodeD)
nodeC.Connect(nodeE)
nodeD.Connect(nodeF)
nodeE.Connect(nodeF)
// The pipeline figures out the topology automatically
p, _ := pipeline.NewPipeline(nodeA)
```

Errors are collected and available after processing:
```go
node := pipeline.NewNode[string, string]("validator",
	func(ctx context.Context, input string) (string, error) {
		if input == "" {
			return "", errors.New("empty input not allowed")
		}
		return input, nil
	}).WithErrorHandler(func(err error) {
	// Log it, send to metrics, whatever you need
	log.Printf("Validation failed: %v", err)
})

// After processing, check for errors
for _, err := range p.Errors() {
	log.Printf("Pipeline error: %v", err)
}
```

Each node tracks its own metrics:
```go
metrics := node.Metrics()
fmt.Printf("Processed: %d\n", metrics.ProcessedCount.Load())
fmt.Printf("Failed: %d\n", metrics.FailedCount.Load())
fmt.Printf("Queue size: %d\n", metrics.CurrentQueueSize.Load())
fmt.Printf("Avg latency: %v\n", metrics.GetAverageLatency())Everything respects context cancellation:
```go
ctx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
defer cancel()

// This will respect the timeout
err := p.Send(ctx, data)
if errors.Is(err, context.DeadlineExceeded) {
	// Handle timeout
}
```

The pipeline is built around a few core ideas (a simplified sketch of the underlying pattern follows the list):
- Generic nodes - Each node is strongly typed with input/output types
- Worker pools - Each node runs N workers processing jobs concurrently
- Buffered channels - Used for job queues with configurable buffer sizes
- Result channels - Synchronous handoff between nodes provides natural backpressure
- State management - Atomic operations ensure thread safety without heavy locking
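
To make the worker-pool and backpressure ideas concrete, here is a simplified, self-contained sketch of the pattern a node uses internally. It is illustrative only, not the library's actual source; the `job` struct and `runWorkers` helper are invented for this example.

```go
package main

import (
	"context"
	"fmt"
	"strings"
	"sync"
)

// job pairs an input with an unbuffered result channel. The unbuffered
// handoff is what provides natural backpressure: a worker blocks on the
// send until the next stage (or the caller) is ready to receive.
type job struct {
	input  string
	result chan string
}

// runWorkers starts n workers draining the buffered jobs queue. Each
// worker recovers from panics in the user function so one bad input
// can't take down the pool.
func runWorkers(ctx context.Context, n int, jobs <-chan job, fn func(string) string) *sync.WaitGroup {
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for j := range jobs {
				func() {
					defer func() {
						if r := recover(); r != nil {
							fmt.Println("recovered from panic:", r)
						}
					}()
					select {
					case j.result <- fn(j.input): // synchronous handoff to the next stage
					case <-ctx.Done():
					}
				}()
			}
		}()
	}
	return &wg
}

func main() {
	ctx := context.Background()
	jobs := make(chan job, 100) // buffered job queue, like BufferSize
	wg := runWorkers(ctx, 4, jobs, strings.ToUpper)

	j := job{input: "hello", result: make(chan string)}
	jobs <- j
	fmt.Println(<-j.result) // HELLO

	close(jobs)
	wg.Wait()
}
```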
For a typical two-node pipeline, here's roughly where the ~5 μs per message goes:
- Creating context: ~100ns
- Channel send/receive: ~2000ns (multiple hops)
- Goroutine scheduling: ~1000ns
- Actual work: ~80ns (40ns per node)
- Result propagation: ~2000ns
Most overhead is from Go's runtime, not the pipeline itself.
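
If you want to sanity-check these numbers on your own hardware, a benchmark along these lines is a reasonable starting point. It's a sketch written against the API from the quick-start example above, and it assumes SendWithTimeout blocks until the message has been handed off to the first node:

```go
package pipeline_test

import (
	"context"
	"strings"
	"testing"
	"time"

	"github.com/huahuayu/pipeline"
)

// BenchmarkSingleNode pushes one message at a time through a one-node
// pipeline, mirroring the SingleNode row in the table above.
func BenchmarkSingleNode(b *testing.B) {
	node := pipeline.NewNode[string, string]("uppercase",
		func(ctx context.Context, in string) (string, error) {
			return strings.ToUpper(in), nil // the ~40ns of actual work
		})

	p, err := pipeline.NewPipeline(node)
	if err != nil {
		b.Fatal(err)
	}
	if err := p.Start(context.Background()); err != nil {
		b.Fatal(err)
	}
	defer p.Stop(10 * time.Second)

	b.ReportAllocs()
	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		if err := p.SendWithTimeout("hello world", time.Second); err != nil {
			b.Fatal(err)
		}
	}
}
```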
To run the test suite:

```bash
# Standard tests
go test ./...

# Race detection (always passes!)
go test -race ./...

# Benchmarks
go test -bench=. -benchmem

# Coverage
go test -cover ./...
```

Check out pipeline_test.go for more examples including stress tests, error scenarios, and complex DAG configurations.
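
For reference, a minimal test of your own pipeline could look roughly like this. It's a sketch that only uses the API shown earlier in this README, and it assumes errors returned by a process function surface through p.Errors() once the pipeline has stopped:

```go
package mypipeline_test

import (
	"context"
	"errors"
	"testing"
	"time"

	"github.com/huahuayu/pipeline"
)

func TestValidatorRejectsEmptyInput(t *testing.T) {
	node := pipeline.NewNode[string, string]("validator",
		func(ctx context.Context, in string) (string, error) {
			if in == "" {
				return "", errors.New("empty input not allowed")
			}
			return in, nil
		})

	p, err := pipeline.NewPipeline(node)
	if err != nil {
		t.Fatal(err)
	}
	if err := p.Start(context.Background()); err != nil {
		t.Fatal(err)
	}

	if err := p.SendWithTimeout("", time.Second); err != nil {
		t.Fatalf("send failed: %v", err)
	}
	if err := p.Stop(5 * time.Second); err != nil {
		t.Fatalf("stop failed: %v", err)
	}

	// Assumption: handler errors are collected and exposed via Errors()
	// after processing, as described in the error-handling section.
	if len(p.Errors()) == 0 {
		t.Error("expected at least one pipeline error for empty input")
	}
}
```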
A few best practices:

- Start simple - Use defaults first, add complexity only when needed
- Size buffers appropriately - Too small causes backpressure, too large wastes memory
- One pipeline per flow - Don't try to reuse pipelines for different data flows
- Always call Stop() - Ensures graceful shutdown and cleanup
- Monitor in production - The built-in metrics are there for a reason (a sketch pulling these practices together follows this list)
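
Pulling a few of these together, a production-leaning setup might look roughly like this. It's a sketch built from the config and metrics APIs shown above; the buffer size, worker count, and 10-second logging interval are arbitrary placeholders, not recommendations from the library:

```go
package main

import (
	"context"
	"log"
	"strings"
	"time"

	"github.com/huahuayu/pipeline"
)

func main() {
	// Size the buffer for expected bursts, not the average rate.
	cfg := pipeline.NodeConfig{
		BufferSize: 1024,
		Workers:    8,
	}

	node := pipeline.NewNode[string, string]("normalize",
		func(ctx context.Context, in string) (string, error) {
			return strings.ToLower(strings.TrimSpace(in)), nil
		}, cfg)

	p, err := pipeline.NewPipeline(node)
	if err != nil {
		log.Fatal(err)
	}

	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()
	if err := p.Start(ctx); err != nil {
		log.Fatal(err)
	}
	// Always stop the pipeline so in-flight work drains cleanly.
	defer func() {
		if err := p.Stop(30 * time.Second); err != nil {
			log.Printf("shutdown error: %v", err)
		}
	}()

	// Monitor: log the node's built-in metrics periodically.
	go func() {
		ticker := time.NewTicker(10 * time.Second)
		defer ticker.Stop()
		for {
			select {
			case <-ctx.Done():
				return
			case <-ticker.C:
				m := node.Metrics()
				log.Printf("normalize: processed=%d failed=%d queue=%d",
					m.ProcessedCount.Load(), m.FailedCount.Load(), m.CurrentQueueSize.Load())
			}
		}
	}()

	// Feed data from your source here.
	if err := p.SendWithTimeout("  Hello World  ", 5*time.Second); err != nil {
		log.Printf("send failed: %v", err)
	}
}
```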
PRs welcome. Just make sure:
- Tests pass (including race detector)
- Benchmarks don't regress significantly
- New features include tests
- Public APIs have godoc comments
License: MIT
