Description
Memory Leak in otel library code.
Environment
- OS: Linux
- Architecture: x86
- Go Version: 1.23.2
- go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp v0.56.0
- go.opentelemetry.io/otel v1.31.0
- go.opentelemetry.io/otel/exporters/otlp/otlpmetric/otlpmetricgrpc v1.31.0
- go.opentelemetry.io/otel/exporters/otlp/otlptrace v1.31.0
- go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracegrpc v1.31.0
- go.opentelemetry.io/otel/sdk v1.31.0
- go.opentelemetry.io/otel/trace v1.31.0
Steps To Reproduce
See Comment here: #5190 (comment)
I am pretty sure this is still an issue or something else in the golang otel ecosystem. I will get a pprof setup possibly tomorrow, but here's some anecdotal evidence I have:
It is pretty easy to see from that graph when tracing was implemented. And yes, I have removed our tracing implementation and memory usage went back to normal.
Here is a rough draft of our setup. Please let me know if I am doing anything egregiously dumb, but for the most part, it's all pretty standard stuff taken from various docs:
go.mod
    go 1.23.2

    require (
        go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp v0.56.0
        go.opentelemetry.io/otel v1.31.0
        go.opentelemetry.io/otel/exporters/otlp/otlpmetric/otlpmetricgrpc v1.31.0
        go.opentelemetry.io/otel/exporters/otlp/otlptrace v1.31.0
        go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracegrpc v1.31.0
        go.opentelemetry.io/otel/sdk v1.31.0
        go.opentelemetry.io/otel/trace v1.31.0
    )
We are wrapping otelhttp.NewHandler around the top-level muxer, so everything is traced. Yes, I know this is expensive, but it shouldn't leak memory. Eventually we will change this to include/exclude/drop stuff, just so we aren't taking in so much volume (ping routes, health checks, etc.), and do more aggressive downsampling.
    func NewApi(c config.Config) *Api {
        return &Api{
            c:           &c,
            controllers: []Controller{newControllers(c)},
            server: &http.Server{
                ReadTimeout:  c.ReadTimeout,
                WriteTimeout: c.WriteTimeout,
                IdleTimeout:  c.IdleTimeout,
                Addr:         fmt.Sprintf(":%d", c.Port),
                Handler:      otelhttp.NewHandler(chi.NewMux(), "INGRESS", otelhttp.WithFilter(traceFilter)),
            },
            done:       make(chan bool),
            sigChannel: make(chan os.Signal, 1024),
        }
    }

Here is how we are initializing our trace and metrics providers once on boot:
    // TracerProvider creates an OTLP exporter and configures the corresponding trace provider.
    func TracerProvider(ctx context.Context, res *resource.Resource) (func(context.Context) error, error) {
        // Set up a trace exporter
        traceExporter, err := otlptrace.New(ctx, otlptracegrpc.NewClient())
        if err != nil {
            return nil, errors.Wrap(err, "failed to create trace exporter")
        }

        // Register the trace exporter with a TracerProvider, using a batch
        // span processor to aggregate spans before export.
        tracerProvider := sdktrace.NewTracerProvider(
            sdktrace.WithSampler(sdktrace.AlwaysSample()),
            sdktrace.WithResource(res),
            sdktrace.WithBatcher(traceExporter),
        )
        otel.SetTracerProvider(tracerProvider)
        otel.SetTextMapPropagator(
            propagation.NewCompositeTextMapPropagator(
                propagation.TraceContext{},
                propagation.Baggage{},
            ))

        // Shutdown will flush any remaining spans and shut down the exporter.
        return tracerProvider.Shutdown, nil
    }

    // MeterProvider creates an OTLP exporter and configures the corresponding meter provider.
    func MeterProvider(ctx context.Context, res *resource.Resource) (func(context.Context) error, error) {
        metricExporter, err := otlpmetricgrpc.New(ctx)
        if err != nil {
            return nil, errors.Wrap(err, "failed to create metric exporter")
        }

        meterProvider := sdkmetric.NewMeterProvider(
            sdkmetric.WithReader(sdkmetric.NewPeriodicReader(metricExporter)),
            sdkmetric.WithResource(res),
        )
        otel.SetMeterProvider(meterProvider)

        return meterProvider.Shutdown, nil
    }

These are then called, and shut down, in main:
    shutDownTracer, err := traceinstrument.TracerProvider(ctx, traceRes)
    if err != nil {
        log.Logger.Fatal("failed to create trace provider", zap.Error(err))
    }
    defer func(onShutdown func(ctx context.Context) error) {
        if errr := onShutdown(ctx); errr != nil {
            log.Logger.Error("error shutting down trace provider", zap.Error(errr))
        }
    }(shutDownTracer)

    shutdownTraceMetrics, err := traceinstrument.MeterProvider(ctx, traceRes)
    if err != nil {
        log.Logger.Fatal("failed to create meter provider", zap.Error(err))
    }
    defer func(onShutdown func(ctx context.Context) error) {
        if errr := onShutdown(ctx); errr != nil {
            log.Logger.Error("error shutting down metrics provider", zap.Error(errr))
        }
    }(shutdownTraceMetrics)

Note. We are also using the otelhttp.NewTransport to wrap the default logging transport:
    http.DefaultTransport = otelhttp.NewTransport(http.DefaultTransport)

If we remove tracing setup, memory usage goes back to normal. So the leak is definitely in our tracing setup.
Expected behavior
Memory does not continuously increase over time.