Skip to content

feat: implement otlp prom exporter #24158

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 9 commits into
base: main
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
14 changes: 13 additions & 1 deletion go.mod
Original file line number Diff line number Diff line change
Expand Up @@ -44,6 +44,7 @@ require (
github.com/manifoldco/promptui v0.9.0
github.com/mattn/go-isatty v0.0.20
github.com/mdp/qrterminal/v3 v3.2.1
github.com/prometheus/client_model v0.6.1
github.com/prometheus/client_golang v1.22.0
github.com/prometheus/common v0.63.0
github.com/rs/zerolog v1.34.0
Expand All @@ -53,6 +54,11 @@ require (
github.com/spf13/viper v1.20.1
github.com/stretchr/testify v1.10.0
github.com/tendermint/go-amino v0.16.0
go.opentelemetry.io/otel v1.35.0
go.opentelemetry.io/otel/exporters/otlp/otlpmetric/otlpmetrichttp v1.35.0
go.opentelemetry.io/otel/metric v1.35.0
go.opentelemetry.io/otel/sdk v1.35.0
go.opentelemetry.io/otel/sdk/metric v1.35.0
go.uber.org/mock v0.5.1
golang.org/x/crypto v0.37.0
golang.org/x/sync v0.13.0
Expand Down Expand Up @@ -101,6 +107,8 @@ require (
github.com/go-kit/kit v0.13.0 // indirect
github.com/go-kit/log v0.2.1 // indirect
github.com/go-logfmt/logfmt v0.6.0 // indirect
github.com/go-logr/logr v1.4.2 // indirect
github.com/go-logr/stdr v1.2.2 // indirect
github.com/go-viper/mapstructure/v2 v2.2.1 // indirect
github.com/godbus/dbus v0.0.0-20190726142602-4481cbc300e2 // indirect
github.com/gogo/googleapis v1.4.1 // indirect
Expand All @@ -111,7 +119,9 @@ require (
github.com/google/btree v1.1.3 // indirect
github.com/google/flatbuffers v1.12.1 // indirect
github.com/google/orderedcode v0.0.1 // indirect
github.com/google/uuid v1.6.0 // indirect
github.com/gorilla/websocket v1.5.3 // indirect
github.com/grpc-ecosystem/grpc-gateway/v2 v2.26.1 // indirect
github.com/gsterjov/go-libsecret v0.0.0-20161001094733-a6f4afe4910c // indirect
github.com/hashicorp/go-hclog v1.6.3 // indirect
github.com/hashicorp/go-immutable-radix v1.3.1 // indirect
Expand All @@ -138,7 +148,6 @@ require (
github.com/petermattis/goid v0.0.0-20240813172612-4fcff4a6cae7 // indirect
github.com/pkg/errors v0.9.1 // indirect
github.com/pmezard/go-difflib v1.0.1-0.20181226105442-5d4384ee4fb2 // indirect
github.com/prometheus/client_model v0.6.1 // indirect
github.com/prometheus/procfs v0.15.1 // indirect
github.com/rcrowley/go-metrics v0.0.0-20201227073835-cf1acfcdf475 // indirect
github.com/rogpeppe/go-internal v1.14.1 // indirect
Expand All @@ -155,6 +164,9 @@ require (
github.com/zondax/ledger-go v0.14.3 // indirect
go.etcd.io/bbolt v1.4.0-alpha.0.0.20240404170359-43604f3112c5 // indirect
go.opencensus.io v0.24.0 // indirect
go.opentelemetry.io/auto/sdk v1.1.0 // indirect
go.opentelemetry.io/otel/trace v1.35.0 // indirect
go.opentelemetry.io/proto/otlp v1.5.0 // indirect
go.uber.org/multierr v1.10.0 // indirect
golang.org/x/arch v0.15.0 // indirect
golang.org/x/exp v0.0.0-20250305212735-054e65f0b394 // indirect
Expand Down
27 changes: 17 additions & 10 deletions go.sum
Original file line number Diff line number Diff line change
Expand Up @@ -256,6 +256,7 @@ github.com/go-logfmt/logfmt v0.4.0/go.mod h1:3RMwSq7FuexP4Kalkev3ejPJsZTpXXBr9+V
github.com/go-logfmt/logfmt v0.5.0/go.mod h1:wCYkCAKZfumFQihp8CzCvQ3paCTfi41vtzG1KdI/P7A=
github.com/go-logfmt/logfmt v0.6.0 h1:wGYYu3uicYdqXVgoYbvnkrPVXkuLM1p1ifugDMEdRi4=
github.com/go-logfmt/logfmt v0.6.0/go.mod h1:WYhtIu8zTZfxdn5+rREduYbwxfcBr/Vr6KEVveWlfTs=
github.com/go-logr/logr v1.2.2/go.mod h1:jdQByPbusPIv2/zmleS9BjJVeZ6kBagPoEUsqbVz/1A=
github.com/go-logr/logr v1.4.2 h1:6pFjapn8bFcIbiKo3XT4j/BhANplGihG6tvd+8rYgrY=
github.com/go-logr/logr v1.4.2/go.mod h1:9T104GzyrTigFIr8wt5mBrctHMim0Nb2HLGrmQ40KvY=
github.com/go-logr/stdr v1.2.2 h1:hSWxHoqTgW2S2qGc0LTAI563KZ5YKYRhT3MFKZMbjag=
Expand Down Expand Up @@ -374,6 +375,8 @@ github.com/grpc-ecosystem/go-grpc-prometheus v1.2.0/go.mod h1:8NvIoxWQoOIhqOTXgf
github.com/grpc-ecosystem/grpc-gateway v1.9.5/go.mod h1:vNeuVxBJEsws4ogUvrchl83t/GYV9WGTSLVdBhOQFDY=
github.com/grpc-ecosystem/grpc-gateway v1.16.0 h1:gmcG1KaJ57LophUzW0Hy8NmPhnMZb4M0+kPpLofRdBo=
github.com/grpc-ecosystem/grpc-gateway v1.16.0/go.mod h1:BDjrQk3hbvj6Nolgz8mAMFbcEtjT1g+wF4CSlocrBnw=
github.com/grpc-ecosystem/grpc-gateway/v2 v2.26.1 h1:e9Rjr40Z98/clHv5Yg79Is0NtosR5LXRvdr7o/6NwbA=
github.com/grpc-ecosystem/grpc-gateway/v2 v2.26.1/go.mod h1:tIxuGz/9mpox++sgp9fJjHO0+q1X9/UOWd798aAm22M=
github.com/gsterjov/go-libsecret v0.0.0-20161001094733-a6f4afe4910c h1:6rhixN/i8ZofjG1Y75iExal34USq5p+wiN1tpie8IrU=
github.com/gsterjov/go-libsecret v0.0.0-20161001094733-a6f4afe4910c/go.mod h1:NMPJylDgVpX0MLRlPy15sqSwOFv/U1GZ2m21JhFfek0=
github.com/hashicorp/consul/api v1.3.0/go.mod h1:MmDNSzIMUjNpY/mQ398R4bk2FnqQLoPndWW5VkKPlCE=
Expand Down Expand Up @@ -741,17 +744,21 @@ go.opencensus.io v0.24.0 h1:y73uSU6J157QMP2kn2r30vwW1A2W2WFwSCGnAVxeaD0=
go.opencensus.io v0.24.0/go.mod h1:vNK8G9p7aAivkbmorf4v+7Hgx+Zs0yY+0fOtgBfjQKo=
go.opentelemetry.io/auto/sdk v1.1.0 h1:cH53jehLUN6UFLY71z+NDOiNJqDdPRaXzTel0sJySYA=
go.opentelemetry.io/auto/sdk v1.1.0/go.mod h1:3wSPjt5PWp2RhlCcmmOial7AvC4DQqZb7a7wCow3W8A=
go.opentelemetry.io/otel v1.34.0 h1:zRLXxLCgL1WyKsPVrgbSdMN4c0FMkDAskSTQP+0hdUY=
go.opentelemetry.io/otel v1.34.0/go.mod h1:OWFPOQ+h4G8xpyjgqo4SxJYdDQ/qmRH+wivy7zzx9oI=
go.opentelemetry.io/otel/metric v1.34.0 h1:+eTR3U0MyfWjRDhmFMxe2SsW64QrZ84AOhvqS7Y+PoQ=
go.opentelemetry.io/otel/metric v1.34.0/go.mod h1:CEDrp0fy2D0MvkXE+dPV7cMi8tWZwX3dmaIhwPOaqHE=
go.opentelemetry.io/otel/sdk v1.34.0 h1:95zS4k/2GOy069d321O8jWgYsW3MzVV+KuSPKp7Wr1A=
go.opentelemetry.io/otel/sdk v1.34.0/go.mod h1:0e/pNiaMAqaykJGKbi+tSjWfNNHMTxoC9qANsCzbyxU=
go.opentelemetry.io/otel/sdk/metric v1.34.0 h1:5CeK9ujjbFVL5c1PhLuStg1wxA7vQv7ce1EK0Gyvahk=
go.opentelemetry.io/otel/sdk/metric v1.34.0/go.mod h1:jQ/r8Ze28zRKoNRdkjCZxfs6YvBTG1+YIqyFVFYec5w=
go.opentelemetry.io/otel/trace v1.34.0 h1:+ouXS2V8Rd4hp4580a8q23bg0azF2nI8cqLYnC8mh/k=
go.opentelemetry.io/otel/trace v1.34.0/go.mod h1:Svm7lSjQD7kG7KJ/MUHPVXSDGz2OX4h0M2jHBhmSfRE=
go.opentelemetry.io/otel v1.35.0 h1:xKWKPxrxB6OtMCbmMY021CqC45J+3Onta9MqjhnusiQ=
go.opentelemetry.io/otel v1.35.0/go.mod h1:UEqy8Zp11hpkUrL73gSlELM0DupHoiq72dR+Zqel/+Y=
go.opentelemetry.io/otel/exporters/otlp/otlpmetric/otlpmetrichttp v1.35.0 h1:0NIXxOCFx+SKbhCVxwl3ETG8ClLPAa0KuKV6p3yhxP8=
go.opentelemetry.io/otel/exporters/otlp/otlpmetric/otlpmetrichttp v1.35.0/go.mod h1:ChZSJbbfbl/DcRZNc9Gqh6DYGlfjw4PvO1pEOZH1ZsE=
go.opentelemetry.io/otel/metric v1.35.0 h1:0znxYu2SNyuMSQT4Y9WDWej0VpcsxkuklLa4/siN90M=
go.opentelemetry.io/otel/metric v1.35.0/go.mod h1:nKVFgxBZ2fReX6IlyW28MgZojkoAkJGaE8CpgeAU3oE=
go.opentelemetry.io/otel/sdk v1.35.0 h1:iPctf8iprVySXSKJffSS79eOjl9pvxV9ZqOWT0QejKY=
go.opentelemetry.io/otel/sdk v1.35.0/go.mod h1:+ga1bZliga3DxJ3CQGg3updiaAJoNECOgJREo9KHGQg=
go.opentelemetry.io/otel/sdk/metric v1.35.0 h1:1RriWBmCKgkeHEhM7a2uMjMUfP7MsOF5JpUCaEqEI9o=
go.opentelemetry.io/otel/sdk/metric v1.35.0/go.mod h1:is6XYCUMpcKi+ZsOvfluY5YstFnhW0BidkR+gL+qN+w=
go.opentelemetry.io/otel/trace v1.35.0 h1:dPpEfJu1sDIqruz7BHFG3c7528f6ddfSWfFDVt/xgMs=
go.opentelemetry.io/otel/trace v1.35.0/go.mod h1:WUk7DtFp1Aw2MkvqGdwiXYDZZNvA/1J8o6xRXLrIkyc=
go.opentelemetry.io/proto/otlp v0.7.0/go.mod h1:PqfVotwruBrMGOCsRd/89rSnXhoiJIqeYNgFYFoEGnI=
go.opentelemetry.io/proto/otlp v1.5.0 h1:xJvq7gMzB31/d406fB8U5CBdyQGw4P399D1aQWU/3i4=
go.opentelemetry.io/proto/otlp v1.5.0/go.mod h1:keN8WnHxOy8PG0rQZjJJ5A2ebUoafqWp0eVQ4yIXvJ4=
go.uber.org/atomic v1.3.2/go.mod h1:gD2HeocX3+yG+ygLZcrzQJaqmWj9AIm7n08wl/qW/PE=
go.uber.org/atomic v1.4.0/go.mod h1:gD2HeocX3+yG+ygLZcrzQJaqmWj9AIm7n08wl/qW/PE=
go.uber.org/atomic v1.5.0/go.mod h1:sABNBOSYdrvTF6hTgEIbc7YasKWGhgEQZyfxyTvoXHQ=
Expand Down
3 changes: 3 additions & 0 deletions server/start.go
Original file line number Diff line number Diff line change
Expand Up @@ -533,6 +533,9 @@ func startAPIServer(
}

func startTelemetry(cfg serverconfig.Config) (*telemetry.Metrics, error) {
if cfg.Telemetry.OtlpExporterEnabled {
telemetry.StartOtlpExporter(cfg.Telemetry)
}
return telemetry.New(cfg.Telemetry)
Comment on lines +536 to 539
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Exporter is started without lifecycle management or error propagation

StartOtlpExporter (1) blocks fatal‑logging on failure and (2) launches a goroutine that never stops.
Starting it here means:

  • No way to shut it down during node shutdown (graceDuration, tests, etc.).
  • Potential race: exporter scrapes metrics before telemetry.New registers the Prom sink.

Recommend returning a cancel/cleanup func and wiring it into the existing errgroup, then starting after telemetry.New:

- if cfg.Telemetry.OtlpExporterEnabled {
-     telemetry.StartOtlpExporter(cfg.Telemetry)
- }
- return telemetry.New(cfg.Telemetry)
+ m, err := telemetry.New(cfg.Telemetry)
+ if err != nil {
+     return nil, err
+ }
+ if cfg.Telemetry.OtlpExporterEnabled {
+     cleanup, err := telemetry.StartOtlpExporter(ctx, cfg.Telemetry) // ctx from getCtx
+     if err != nil {
+         return nil, err
+     }
+     g.Go(func() error { <-ctx.Done(); cleanup(); return nil })      // tie to lifecycle
+ }
+ return m, nil

Committable suggestion skipped: line range outside the PR's diff.

}

Expand Down
1 change: 1 addition & 0 deletions simapp/go.mod
Original file line number Diff line number Diff line change
Expand Up @@ -122,6 +122,7 @@ require (
github.com/gorilla/websocket v1.5.3 // indirect
github.com/grpc-ecosystem/go-grpc-middleware v1.4.0 // indirect
github.com/grpc-ecosystem/grpc-gateway v1.16.0 // indirect
github.com/grpc-ecosystem/grpc-gateway/v2 v2.26.1 // indirect
github.com/gsterjov/go-libsecret v0.0.0-20161001094733-a6f4afe4910c // indirect
github.com/hashicorp/go-cleanhttp v0.5.2 // indirect
github.com/hashicorp/go-getter v1.7.8 // indirect
Expand Down
34 changes: 18 additions & 16 deletions simapp/go.sum
Original file line number Diff line number Diff line change
Expand Up @@ -1145,8 +1145,8 @@ github.com/grpc-ecosystem/go-grpc-prometheus v1.2.0/go.mod h1:8NvIoxWQoOIhqOTXgf
github.com/grpc-ecosystem/grpc-gateway v1.9.5/go.mod h1:vNeuVxBJEsws4ogUvrchl83t/GYV9WGTSLVdBhOQFDY=
github.com/grpc-ecosystem/grpc-gateway v1.16.0 h1:gmcG1KaJ57LophUzW0Hy8NmPhnMZb4M0+kPpLofRdBo=
github.com/grpc-ecosystem/grpc-gateway v1.16.0/go.mod h1:BDjrQk3hbvj6Nolgz8mAMFbcEtjT1g+wF4CSlocrBnw=
github.com/grpc-ecosystem/grpc-gateway/v2 v2.7.0/go.mod h1:hgWBS7lorOAVIJEQMi4ZsPv9hVvWI6+ch50m39Pf2Ks=
github.com/grpc-ecosystem/grpc-gateway/v2 v2.11.3/go.mod h1:o//XUCC/F+yRGJoPO/VU0GSB0f8Nhgmxx0VIRUvaC0w=
github.com/grpc-ecosystem/grpc-gateway/v2 v2.26.1 h1:e9Rjr40Z98/clHv5Yg79Is0NtosR5LXRvdr7o/6NwbA=
github.com/grpc-ecosystem/grpc-gateway/v2 v2.26.1/go.mod h1:tIxuGz/9mpox++sgp9fJjHO0+q1X9/UOWd798aAm22M=
github.com/gsterjov/go-libsecret v0.0.0-20161001094733-a6f4afe4910c h1:6rhixN/i8ZofjG1Y75iExal34USq5p+wiN1tpie8IrU=
github.com/gsterjov/go-libsecret v0.0.0-20161001094733-a6f4afe4910c/go.mod h1:NMPJylDgVpX0MLRlPy15sqSwOFv/U1GZ2m21JhFfek0=
github.com/hashicorp/consul/api v1.3.0/go.mod h1:MmDNSzIMUjNpY/mQ398R4bk2FnqQLoPndWW5VkKPlCE=
Expand Down Expand Up @@ -1589,22 +1589,24 @@ go.opentelemetry.io/auto/sdk v1.1.0 h1:cH53jehLUN6UFLY71z+NDOiNJqDdPRaXzTel0sJyS
go.opentelemetry.io/auto/sdk v1.1.0/go.mod h1:3wSPjt5PWp2RhlCcmmOial7AvC4DQqZb7a7wCow3W8A=
go.opentelemetry.io/contrib/detectors/gcp v1.34.0 h1:JRxssobiPg23otYU5SbWtQC//snGVIM3Tx6QRzlQBao=
go.opentelemetry.io/contrib/detectors/gcp v1.34.0/go.mod h1:cV4BMFcscUR/ckqLkbfQmF0PRsq8w/lMGzdbCSveBHo=
go.opentelemetry.io/contrib/instrumentation/google.golang.org/grpc/otelgrpc v0.58.0 h1:PS8wXpbyaDJQ2VDHHncMe9Vct0Zn1fEjpsjrLxGJoSc=
go.opentelemetry.io/contrib/instrumentation/google.golang.org/grpc/otelgrpc v0.58.0/go.mod h1:HDBUsEjOuRC0EzKZ1bSaRGZWUBAzo+MhAcUUORSr4D0=
go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp v0.58.0 h1:yd02MEjBdJkG3uabWP9apV+OuWRIXGDuJEUJbOHmCFU=
go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp v0.58.0/go.mod h1:umTcuxiv1n/s/S6/c2AT/g2CQ7u5C59sHDNmfSwgz7Q=
go.opentelemetry.io/otel v1.34.0 h1:zRLXxLCgL1WyKsPVrgbSdMN4c0FMkDAskSTQP+0hdUY=
go.opentelemetry.io/otel v1.34.0/go.mod h1:OWFPOQ+h4G8xpyjgqo4SxJYdDQ/qmRH+wivy7zzx9oI=
go.opentelemetry.io/contrib/instrumentation/google.golang.org/grpc/otelgrpc v0.54.0 h1:r6I7RJCN86bpD/FQwedZ0vSixDpwuWREjW9oRMsmqDc=
go.opentelemetry.io/contrib/instrumentation/google.golang.org/grpc/otelgrpc v0.54.0/go.mod h1:B9yO6b04uB80CzjedvewuqDhxJxi11s7/GtiGa8bAjI=
go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp v0.54.0 h1:TT4fX+nBOA/+LUkobKGW1ydGcn+G3vRw9+g5HwCphpk=
go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp v0.54.0/go.mod h1:L7UH0GbB0p47T4Rri3uHjbpCFYrVrwc1I25QhNPiGK8=
go.opentelemetry.io/otel v1.35.0 h1:xKWKPxrxB6OtMCbmMY021CqC45J+3Onta9MqjhnusiQ=
go.opentelemetry.io/otel v1.35.0/go.mod h1:UEqy8Zp11hpkUrL73gSlELM0DupHoiq72dR+Zqel/+Y=
go.opentelemetry.io/otel/exporters/otlp/otlpmetric/otlpmetricgrpc v1.35.0 h1:QcFwRrZLc82r8wODjvyCbP7Ifp3UANaBSmhDSFjnqSc=
go.opentelemetry.io/otel/exporters/otlp/otlpmetric/otlpmetricgrpc v1.35.0/go.mod h1:CXIWhUomyWBG/oY2/r/kLp6K/cmx9e/7DLpBuuGdLCA=
go.opentelemetry.io/otel/exporters/stdout/stdoutmetric v1.29.0 h1:WDdP9acbMYjbKIyJUhTvtzj601sVJOqgWdUxSdR/Ysc=
go.opentelemetry.io/otel/exporters/stdout/stdoutmetric v1.29.0/go.mod h1:BLbf7zbNIONBLPwvFnwNHGj4zge8uTCM/UPIVW1Mq2I=
go.opentelemetry.io/otel/metric v1.34.0 h1:+eTR3U0MyfWjRDhmFMxe2SsW64QrZ84AOhvqS7Y+PoQ=
go.opentelemetry.io/otel/metric v1.34.0/go.mod h1:CEDrp0fy2D0MvkXE+dPV7cMi8tWZwX3dmaIhwPOaqHE=
go.opentelemetry.io/otel/sdk v1.34.0 h1:95zS4k/2GOy069d321O8jWgYsW3MzVV+KuSPKp7Wr1A=
go.opentelemetry.io/otel/sdk v1.34.0/go.mod h1:0e/pNiaMAqaykJGKbi+tSjWfNNHMTxoC9qANsCzbyxU=
go.opentelemetry.io/otel/sdk/metric v1.34.0 h1:5CeK9ujjbFVL5c1PhLuStg1wxA7vQv7ce1EK0Gyvahk=
go.opentelemetry.io/otel/sdk/metric v1.34.0/go.mod h1:jQ/r8Ze28zRKoNRdkjCZxfs6YvBTG1+YIqyFVFYec5w=
go.opentelemetry.io/otel/trace v1.34.0 h1:+ouXS2V8Rd4hp4580a8q23bg0azF2nI8cqLYnC8mh/k=
go.opentelemetry.io/otel/trace v1.34.0/go.mod h1:Svm7lSjQD7kG7KJ/MUHPVXSDGz2OX4h0M2jHBhmSfRE=
go.opentelemetry.io/otel/metric v1.35.0 h1:0znxYu2SNyuMSQT4Y9WDWej0VpcsxkuklLa4/siN90M=
go.opentelemetry.io/otel/metric v1.35.0/go.mod h1:nKVFgxBZ2fReX6IlyW28MgZojkoAkJGaE8CpgeAU3oE=
go.opentelemetry.io/otel/sdk v1.35.0 h1:iPctf8iprVySXSKJffSS79eOjl9pvxV9ZqOWT0QejKY=
go.opentelemetry.io/otel/sdk v1.35.0/go.mod h1:+ga1bZliga3DxJ3CQGg3updiaAJoNECOgJREo9KHGQg=
go.opentelemetry.io/otel/sdk/metric v1.35.0 h1:1RriWBmCKgkeHEhM7a2uMjMUfP7MsOF5JpUCaEqEI9o=
go.opentelemetry.io/otel/sdk/metric v1.35.0/go.mod h1:is6XYCUMpcKi+ZsOvfluY5YstFnhW0BidkR+gL+qN+w=
go.opentelemetry.io/otel/trace v1.35.0 h1:dPpEfJu1sDIqruz7BHFG3c7528f6ddfSWfFDVt/xgMs=
go.opentelemetry.io/otel/trace v1.35.0/go.mod h1:WUk7DtFp1Aw2MkvqGdwiXYDZZNvA/1J8o6xRXLrIkyc=
go.opentelemetry.io/proto/otlp v0.7.0/go.mod h1:PqfVotwruBrMGOCsRd/89rSnXhoiJIqeYNgFYFoEGnI=
go.opentelemetry.io/proto/otlp v0.15.0/go.mod h1:H7XAot3MsfNsj7EXtrA2q5xSNQ10UqI405h3+duxN4U=
go.opentelemetry.io/proto/otlp v0.19.0/go.mod h1:H7XAot3MsfNsj7EXtrA2q5xSNQ10UqI405h3+duxN4U=
Expand Down
9 changes: 9 additions & 0 deletions telemetry/metrics.go
Original file line number Diff line number Diff line change
Expand Up @@ -90,6 +90,15 @@ type Config struct {
// DatadogHostname defines the hostname to use when emitting metrics to
// Datadog. Only utilized if MetricsSink is set to "dogstatsd".
DatadogHostname string `mapstructure:"datadog-hostname"`

// Otlp Exporter fields
OtlpExporterEnabled bool `mapstructure:"otlp-exporter-enabled"`
OtlpCollectorEndpoint string `mapstructure:"otlp-collector-endpoint"`
OtlpCollectorMetricsURLPath string `mapstructure:"otlp-collector-metrics-url-path"`
OtlpUser string `mapstructure:"otlp-user"`
OtlpToken string `mapstructure:"otlp-token"`
OtlpServiceName string `mapstructure:"otlp-service-name"`
OtlpPushInterval time.Duration `mapstructure:"otlp-push-interval"`
}
Comment on lines +93 to 102
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Consider documenting defaults & ensuring zero‑value safety for new OTLP fields

The new fields are great, but:

  1. A zero OtlpPushInterval will cause the exporter loop to busy‑spin (time.Sleep(0) yields immediately).
  2. If OtlpCollectorEndpoint or OtlpCollectorMetricsURLPath are left empty, otlpmetrichttp.New will fail at runtime.

Please:
• Set sensible defaults (e.g. time.Second * 15).
• Add validation in New(...) to early‑return a descriptive error when mandatory fields are missing.


// Metrics defines a wrapper around application telemetry functionality. It allows
Expand Down
171 changes: 171 additions & 0 deletions telemetry/otlp_exporter.go
Original file line number Diff line number Diff line change
@@ -0,0 +1,171 @@
package telemetry

import (
"context"
"encoding/base64"
"fmt"
"log"
"math"
"time"

"github.com/prometheus/client_golang/prometheus"
dto "github.com/prometheus/client_model/go"
"go.opentelemetry.io/otel"
"go.opentelemetry.io/otel/attribute"
"go.opentelemetry.io/otel/exporters/otlp/otlpmetric/otlpmetrichttp"
otmetric "go.opentelemetry.io/otel/metric"
"go.opentelemetry.io/otel/sdk/metric"
"go.opentelemetry.io/otel/sdk/resource"
semconv "go.opentelemetry.io/otel/semconv/v1.21.0"
)

const meterName = "cosmos-sdk-otlp-exporter"

func StartOtlpExporter(cfg Config) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lets have a comment on what this does and how it works

ctx := context.Background()

exporter, err := otlpmetrichttp.New(ctx,
otlpmetrichttp.WithEndpoint(cfg.OtlpCollectorEndpoint),
otlpmetrichttp.WithURLPath(cfg.OtlpCollectorMetricsURLPath),
otlpmetrichttp.WithHeaders(map[string]string{
"Authorization": "Basic " + formatBasicAuth(cfg.OtlpUser, cfg.OtlpToken),
}),
Comment on lines +30 to +32
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This feels brittle in general. Couldn't some collectors expect auth in Http headers, some in grpc request directly, some without authz entirely?

)
if err != nil {
log.Fatalf("OTLP exporter setup failed: %v", err)
}

Comment on lines +34 to +37
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Avoid log.Fatalf inside library code

log.Fatalf terminates the entire node and makes graceful shutdown impossible. Surface the error instead:

- if err != nil {
-     log.Fatalf("OTLP exporter setup failed: %v", err)
- }
+ if err != nil {
+     return fmt.Errorf("OTLP exporter setup failed: %w", err)
+ }

…and bubble it up to the caller as noted in the server/start.go comment.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
if err != nil {
log.Fatalf("OTLP exporter setup failed: %v", err)
}
if err != nil {
return fmt.Errorf("OTLP exporter setup failed: %w", err)
}

res, _ := resource.New(ctx, resource.WithAttributes(
semconv.ServiceName(cfg.OtlpServiceName),
))
Comment on lines +38 to +40
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we need to check the ignored error here? if not, can we comment why we are able to ignore it?


Comment on lines +38 to +41
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Handle and log the error returned by resource.New

The second return value is currently discarded. If the resource cannot be created, the exporter will run with incomplete metadata.

-res, _ := resource.New(ctx, resource.WithAttributes(
-    semconv.ServiceName(cfg.OtlpServiceName),
-))
+res, rErr := resource.New(ctx, resource.WithAttributes(
+    semconv.ServiceName(cfg.OtlpServiceName),
+))
+if rErr != nil {
+    return fmt.Errorf("failed to initialise OTLP resource: %w", rErr)
+}
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
res, _ := resource.New(ctx, resource.WithAttributes(
semconv.ServiceName(cfg.OtlpServiceName),
))
res, rErr := resource.New(ctx, resource.WithAttributes(
semconv.ServiceName(cfg.OtlpServiceName),
))
if rErr != nil {
return fmt.Errorf("failed to initialise OTLP resource: %w", rErr)
}

meterProvider := metric.NewMeterProvider(
metric.WithReader(metric.NewPeriodicReader(exporter,
metric.WithInterval(cfg.OtlpPushInterval))),
metric.WithResource(res),
)
otel.SetMeterProvider(meterProvider)
meter := otel.Meter(meterName)

gauges := make(map[string]otmetric.Float64Gauge)
histograms := make(map[string]otmetric.Float64Histogram)

go func() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how does this behave when shutting a node down? Do we want any kind of graceful shutdown here?

for {
if err := scrapePrometheusMetrics(ctx, meter, gauges, histograms); err != nil {
log.Printf("error scraping metrics: %v", err)
}
time.Sleep(cfg.OtlpPushInterval)
Comment on lines +54 to +58
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

small nit

Suggested change
for {
if err := scrapePrometheusMetrics(ctx, meter, gauges, histograms); err != nil {
log.Printf("error scraping metrics: %v", err)
}
time.Sleep(cfg.OtlpPushInterval)
for ; ; time.Sleep(cfg.OtlpPushInterval){
if err := scrapePrometheusMetrics(ctx, meter, gauges, histograms); err != nil {
log.Printf("error scraping metrics: %v", err)
}

}
}()
}

func scrapePrometheusMetrics(ctx context.Context, meter otmetric.Meter, gauges map[string]otmetric.Float64Gauge, histograms map[string]otmetric.Float64Histogram) error {
metricFamilies, err := prometheus.DefaultGatherer.Gather()
if err != nil {
log.Printf("failed to gather prometheus metrics: %v", err)
return err
}

for _, mf := range metricFamilies {
name := mf.GetName()
for _, m := range mf.Metric {
switch mf.GetType() {
case dto.MetricType_GAUGE:
recordGauge(ctx, meter, gauges, name, mf.GetHelp(), m.Gauge.GetValue(), nil)

case dto.MetricType_COUNTER:
recordGauge(ctx, meter, gauges, name, mf.GetHelp(), m.Counter.GetValue(), nil)

case dto.MetricType_HISTOGRAM:
recordHistogram(ctx, meter, histograms, name, mf.GetHelp(), m.Histogram)

case dto.MetricType_SUMMARY:
recordSummary(ctx, meter, gauges, name, mf.GetHelp(), m.Summary)

default:
continue
}
}
}

return nil
}

func recordGauge(ctx context.Context, meter otmetric.Meter, gauges map[string]otmetric.Float64Gauge, name, help string, val float64, attrs []attribute.KeyValue) {
g, ok := gauges[name]
if !ok {
var err error
g, err = meter.Float64Gauge(name, otmetric.WithDescription(help))
if err != nil {
log.Printf("failed to create gauge %q: %v", name, err)
return
}
gauges[name] = g
}
g.Record(ctx, val, otmetric.WithAttributes(attrs...))
}

func recordHistogram(ctx context.Context, meter otmetric.Meter, histograms map[string]otmetric.Float64Histogram, name, help string, h *dto.Histogram) {
boundaries := make([]float64, 0, len(h.Bucket)-1) // excluding +Inf
bucketCounts := make([]uint64, 0, len(h.Bucket))

for _, bucket := range h.Bucket {
if math.IsInf(bucket.GetUpperBound(), +1) {
continue // Skip +Inf bucket boundary explicitly
}
Comment on lines +114 to +116
Copy link
Contributor

@technicallyty technicallyty Mar 31, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why do we skip this? i know it says +Inf boundary, but as someone who is unfamiliar with OTL, i am not sure what the significance is

boundaries = append(boundaries, bucket.GetUpperBound())
bucketCounts = append(bucketCounts, bucket.GetCumulativeCount())
}

hist, ok := histograms[name]
if !ok {
var err error
hist, err = meter.Float64Histogram(
name,
otmetric.WithDescription(help),
otmetric.WithExplicitBucketBoundaries(boundaries...),
)
if err != nil {
log.Printf("failed to create histogram %s: %v", name, err)
return
}
histograms[name] = hist
}

prevCount := uint64(0)
for i, count := range bucketCounts {
countInBucket := count - prevCount
prevCount = count

// Explicitly record the mid-point of the bucket as approximation:
var value float64
if i == 0 {
value = boundaries[0] / 2.0

Check notice

Code scanning / CodeQL

Floating point arithmetic Note

Floating point arithmetic operations are not associative and a possible source of non-determinism
} else {
value = (boundaries[i-1] + boundaries[i]) / 2.0

Check notice

Code scanning / CodeQL

Floating point arithmetic Note

Floating point arithmetic operations are not associative and a possible source of non-determinism

Check notice

Code scanning / CodeQL

Floating point arithmetic Note

Floating point arithmetic operations are not associative and a possible source of non-determinism
}

// Record `countInBucket` number of observations explicitly (approximation):
for j := uint64(0); j < countInBucket; j++ {
hist.Record(ctx, value)
}
Comment on lines +150 to +152
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
for j := uint64(0); j < countInBucket; j++ {
hist.Record(ctx, value)
}
for range countInBucket {
hist.Record(ctx, value)
}

}
}

func recordSummary(ctx context.Context, meter otmetric.Meter, gauges map[string]otmetric.Float64Gauge, name, help string, s *dto.Summary) {
recordGauge(ctx, meter, gauges, name+"_sum", help+" (summary sum)", s.GetSampleSum(), nil)
recordGauge(ctx, meter, gauges, name+"_count", help+" (summary count)", float64(s.GetSampleCount()), nil)

for _, q := range s.Quantile {
attrs := []attribute.KeyValue{
attribute.String("quantile", fmt.Sprintf("%v", q.GetQuantile())),
}
recordGauge(ctx, meter, gauges, name, help+" (summary quantile)", q.GetValue(), attrs)
}
}

func formatBasicAuth(username, token string) string {
auth := username + ":" + token
return base64.StdEncoding.EncodeToString([]byte(auth))
}
1 change: 1 addition & 0 deletions tests/go.mod
Original file line number Diff line number Diff line change
Expand Up @@ -121,6 +121,7 @@ require (
github.com/gorilla/websocket v1.5.3 // indirect
github.com/grpc-ecosystem/go-grpc-middleware v1.4.0 // indirect
github.com/grpc-ecosystem/grpc-gateway v1.16.0 // indirect
github.com/grpc-ecosystem/grpc-gateway/v2 v2.26.1 // indirect
github.com/gsterjov/go-libsecret v0.0.0-20161001094733-a6f4afe4910c // indirect
github.com/hashicorp/go-cleanhttp v0.5.2 // indirect
github.com/hashicorp/go-getter v1.7.8 // indirect
Expand Down
Loading