Add open telemetry metric support in Koog #1381
base: develop
Conversation
…butes and fix found issues
> - Description: Total token count per operation and token type
> - When emitted: after an LLM call finishes; recorded separately for input and output tokens
> - Key attributes:
>   - `gen_ai.operation.name` (required)
(?) `chat`; `generate_content`; `text_completion`
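For reference, a minimal sketch of how input and output tokens could be recorded separately under these attributes, using the OpenTelemetry Java API directly (attribute values are illustrative, not Koog's actual wiring):

```kotlin
import io.opentelemetry.api.common.Attributes
import io.opentelemetry.api.metrics.LongCounter

// Illustrative only: two data points of the same counter, distinguished
// by the gen_ai.token.type attribute.
fun recordTokenUsage(tokensCounter: LongCounter, inputTokens: Long, outputTokens: Long) {
    tokensCounter.add(
        inputTokens,
        Attributes.builder()
            .put("gen_ai.operation.name", "chat")
            .put("gen_ai.token.type", "input")
            .build()
    )
    tokensCounter.add(
        outputTokens,
        Attributes.builder()
            .put("gen_ai.operation.name", "chat")
            .put("gen_ai.token.type", "output")
            .build()
    )
}
```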
> - Recommended explicit bucket boundaries (seconds): 0.01, 0.02, 0.04, 0.08, 0.16, 0.32, 0.64, 1.28, 2.56, 5.12,
>   10.24, 20.48, 40.96, 81.92
> - Key attributes:
>   - `gen_ai.operation.name` (required)
- `execute_tool`
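A hedged sketch of declaring such a duration histogram with the recommended boundaries via the OpenTelemetry Java API; the metric name follows the gen_ai semantic conventions and may not match Koog's exact declaration:

```kotlin
import io.opentelemetry.api.metrics.Meter

// Illustrative declaration; the boundary values are the recommended ones quoted above.
fun declareOperationDurationHistogram(meter: Meter) =
    meter.histogramBuilder("gen_ai.client.operation.duration")
        .setDescription("GenAI operation duration")
        .setUnit("s")
        .setExplicitBucketBoundariesAdvice(
            listOf(
                0.01, 0.02, 0.04, 0.08, 0.16, 0.32, 0.64, 1.28,
                2.56, 5.12, 10.24, 20.48, 40.96, 81.92
            )
        )
        .build()
```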
> - When emitted: on tool completion, failure, or validation failure
> - Key attributes:
>   - `gen_ai.operation.name` (required) — `EXECUTE_TOOL`
>   - `gen_ai.tool.name` (recommended)
We should be cautious with dynamic values, since they can affect the cardinality and performance of the system as well as the monitoring backend. A general recommendation is that cardinality is not critical if it remains below 100. The current metric cardinality is approximately …
To address this, let's add a method to filter tool names for the attribute, enabling us to limit the cardinality. Additionally, let's document this approach.
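A minimal sketch of what such a filter could look like, assuming a configurable hook on the feature; the `toolNameFilter` property and the tool names are hypothetical, not an existing Koog API:

```kotlin
// Hypothetical hook: map unbounded tool names to a closed set before they are
// attached as the gen_ai.tool.name metric attribute, capping cardinality.
val allowedToolNames = setOf("calculator", "web_search", "file_reader")

install(OpenTelemetry) {
    toolNameFilter = { toolName ->
        if (toolName in allowedToolNames) toolName else "other"
    }
}
```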
sdubov left a comment:
I like the consistency of the current metrics support implementation. Thank you. Please check my comments about some metrics implementation details.
>     override val value: HiddenString = HiddenString(result.toString())
> }
>
> // gen_ai.tool.status
Why not report this as a custom attribute, or a special koog attribute? I think it would be cleaner and easier to separate attributes if we keep gen_ai attributes consistent with the ones defined in the OpenTelemetry documentation. WDYT?
Thanks, I added it as a custom koog attribute. Additionally, I extracted the metric `gen_ai.tool.call.count` to `koog.tool.count` as a custom metric name, to be consistent with the OpenTelemetry documentation as well.
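For illustration, the resulting attribute set might look like this; the exact custom key name is an assumption based on the `koog.` prefix mentioned above:

```kotlin
import io.opentelemetry.api.common.Attributes

// Standard gen_ai.* attributes stay aligned with the OpenTelemetry semantic
// conventions; the status lives under a custom koog.* key (assumed name).
val toolCallAttributes: Attributes = Attributes.builder()
    .put("gen_ai.operation.name", "execute_tool")
    .put("gen_ai.tool.name", "calculator")   // example value
    .put("koog.tool.status", "success")      // custom Koog attribute (assumed key)
    .build()
```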
>     }
> }
>
> install(OpenTelemetry)
This seems to me unrelated to the OpenTelemetry metrics updates. Is it a leftover from testing?
Yep, this is unrelated to the OpenTelemetry metrics feature. But I thought it could be useful to have at least one example with the metrics feature configured, to check that everything works as expected.
So yes, it's a leftover from testing, and I kept it intentionally to have one example to test with. Since it's in the examples project in koog, I thought it was fine to enable the feature there.
If it's better to remove it from the Calculator example, I can do so!
> * - gen_ai.tool.description (recommended)
> * - gen_ai.tool.name (recommended)
> * - gen_ai.token.type (required)
> * - gen_ai.tool.status (custom)
Please see the comment about the custom attribute below.
> message.metaInfo.inputTokensCount?.let { inputTokensCount ->
>     tokensCounter.add(
>         inputTokensCount.toLong(),
>         Attributes.builder()
Wouldn't it be safer and easier to use the existing wrapper and converter from Koog Attributes to SDK attributes here, which gives PII protection and automatic type conversion? Something like this:

```kotlin
...
listOf(
    GenAIAttributes.Operation.Name(GenAIAttributes.Operation.OperationNameType.TEXT_COMPLETION),
    GenAIAttributes.Provider.Name(provider),
    GenAIAttributes.Token.Type(GenAIAttributes.Token.TokenType.INPUT),
    GenAIAttributes.Response.Model(eventContext.model)
).toSdkAttributes(config.isVerbose)
...
```
Yes, it would!
Thank you, I replaced Attributes.builder with the listOf and toSdkAttributes functions; additionally, I passed isVerbose there so as not to propagate the config further.
Surprisingly, this way of creating attributes (via listOf().toSdkAttributes(config.isVerbose)) leads to wrong metric behaviour.
It can be observed on the screenshots (the first screenshot shows how it works with AttributesBuilder, the second via toSdkAttributes). For some reason, the second option makes the metric drop, even though it's a cumulative metric and cannot become less than it was before.
After exploring this issue thoroughly myself, and with the help of ChatGPT, we came to the conclusion that the reason for this behaviour is the custom wrapper around the Attributes class (the toSdkAttributes function). Its current implementation is not optimized and creates a new Attributes object each time. For the tracing feature that's totally fine, but not for metrics: the SDK should understand what has changed since the previous timeframe, and since toSdkAttributes creates a new Attributes instance each time, it cannot properly link the changes of the metric.
This behaviour can be observed by configuring a LoggingMetricExporter: each time it prints the metrics' state, the attributes have a different hash code and object reference in the system.
Since the problem appears not in the plugin (which I use for visualization purposes) but in the library (the SDK cannot link the attributes), I suggest reverting to AttributesBuilder.build().
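A sketch of one safe pattern for this situation: build the SDK attribute set once and reuse the same instance for every add(), so the SDK aggregates the cumulative counter under a stable attribute identity. The surrounding names are illustrative, not the PR's actual code:

```kotlin
import io.opentelemetry.api.common.Attributes
import io.opentelemetry.api.metrics.LongCounter

// Built once and reused: every add() call reports under the same attribute
// set, letting the SDK accumulate the counter across collection intervals.
val inputTokenAttributes: Attributes = Attributes.builder()
    .put("gen_ai.operation.name", "text_completion")
    .put("gen_ai.token.type", "input")
    .build()

fun recordInputTokens(tokensCounter: LongCounter, inputTokensCount: Long) {
    tokensCounter.add(inputTokensCount, inputTokenAttributes)
}
```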
> eventContext.responses.lastOrNull()?.let { message ->
>     message.metaInfo.inputTokensCount?.let { inputTokensCount ->
>         tokensCounter.add(
What do you think about extracting this logic into a separate extension method (similar to the builder methods that start and end spans) that receives the necessary parameters from eventContext and creates the attributes? It would make the feature code a bit cleaner. For example:

```kotlin
fun LongCounter.addInputTokensCounter(
    config: OpenTelemetryConfig,
    inputTokensCount: Long,
    provider: LLMProvider,
    model: LLModel
) {
    this.add(
        inputTokensCount,
        listOf<Attribute>(
            GenAIAttributes.Operation.Name(GenAIAttributes.Operation.OperationNameType.TEXT_COMPLETION),
            GenAIAttributes.Provider.Name(provider),
            GenAIAttributes.Token.Type(GenAIAttributes.Token.TokenType.INPUT),
            GenAIAttributes.Response.Model(model)
        ).toSdkAttributes(config.isVerbose)
    )
}
```
I guess that after implementing MetricCollector such extraction could be excessive, as the new MetricCollector class calls each add or record function only once (except operationDurationHistogram, which is called twice, but with a different set of attributes). Wdyt?
> @@ -0,0 +1,24 @@
> package ai.koog.agents.features.opentelemetry.metric
Would you expect other methods here? I would rename the file to tokenCounter.kt so it doesn't cause confusion when new methods are added, since this file aggregates different token counter top-level methods.
Makes sense, thanks! I have renamed it!
> @@ -0,0 +1,25 @@
> package ai.koog.agents.features.opentelemetry.metric
Same comment about the file name.
> @@ -0,0 +1,28 @@
> package ai.koog.agents.features.opentelemetry.metric
Same comment about the file name.
| get() = "Tool calls count" | ||
|
|
||
| override val unit: String | ||
| get() = "tool call" |
unit is "tool call"?
Yep, because this metric is a tool calls count. So, similarly to the token count, the unit of this metric is a tool call.
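For context, a sketch of how the counter and its unit might be declared with the OTel Meter API; the metric name and strings are taken from the diff above, while the exact Koog declaration may differ:

```kotlin
import io.opentelemetry.api.metrics.Meter

// Declares the custom tool-call counter; the unit string is the one
// under discussion in this thread.
fun declareToolCallCounter(meter: Meter) =
    meter.counterBuilder("koog.tool.count")
        .setDescription("Tool calls count")
        .setUnit("tool call")
        .build()
```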
> failedToolCall?.let { toolCall ->
>     toolCall.getDurationSec()?.let { sec ->
>         toolCallDurationHistogram.record(
Maybe it's worth combining those two steps, end + record. Let me propose an idea. Here, you want to create a new record with a specific metric. Maybe, instead of having a storage of events, you could have a metrics collector class (similar to the spans collector that we already have). This metrics collector would also be responsible for storing the needed metrics and using this data to create proper records. You would call the wrapper method record (or end), and it would create a record for the metric under the hood. So you would not need to perform it in two separate steps (end + record); you would do it as a single step in this collector instead. Wdyt?
Thank you, I implemented this approach as you described:
- Introduced a `MetricCollector` class, which is responsible for storing the necessary metrics and recording them based on the `MetricEvent`s that appear in the OpenTelemetry class, like this:

```kotlin
metricCollector.recordEvent(
    ToolCallEnded(
        id = eventContext.eventId,
        timestamp = System.currentTimeMillis(),
        toolName = eventContext.toolName,
        status = ToolCallStatus.FAILED
    )
)
```

- Under the hood, `MetricCollector` works with a `MetricEventStorage` that is responsible for storing matching events via the `startEvent` and `endEvent` methods. This approach allows storing paired events and computing some metrics based on the difference between them.
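A rough sketch of the pairing idea described above; the type and method names follow the reply, but the actual implementation in the PR may differ:

```kotlin
// Minimal pairing storage: start events are kept until a matching end event
// arrives, at which point a derived value (e.g. a duration) can be computed.
interface MetricEvent {
    val id: String
    val timestamp: Long
}

class MetricEventStorage {
    private val startedEvents = mutableMapOf<String, MetricEvent>()

    fun startEvent(event: MetricEvent) {
        startedEvents[event.id] = event
    }

    // Returns the matching start event, if any; the caller can then record
    // e.g. (end.timestamp - start.timestamp) as a duration metric.
    fun endEvent(event: MetricEvent): MetricEvent? = startedEvents.remove(event.id)
}
```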
> metricExporters.forEach { exporter ->
>     val reader = PeriodicMetricReader
>         .builder(exporter)
>         .setInterval(Duration.ofSeconds(1))
Does it make sense to move this interval duration into the configuration as well?
It does! Thanks, I added it.
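A sketch of the configurable reader, building on the diff above; the interval parameter name is an assumption, not the actual Koog config property:

```kotlin
import io.opentelemetry.sdk.metrics.export.MetricExporter
import io.opentelemetry.sdk.metrics.export.PeriodicMetricReader
import java.time.Duration

// metricsExportInterval is a hypothetical config property; it replaces the
// previously hard-coded Duration.ofSeconds(1).
fun buildMetricReader(
    exporter: MetricExporter,
    metricsExportInterval: Duration
): PeriodicMetricReader =
    PeriodicMetricReader.builder(exporter)
        .setInterval(metricsExportInterval)
        .build()
```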
> )
>
> // Store llm call
> eventCallStorage.addEventCall(eventContext.eventId, eventContext.model.provider.display)
I've also noticed that you use this EventCallStorage to get the start and end time of a metric through the getDuration() API. With the current implementation, the start and end time should match the timing of the corresponding span, so maybe you can reuse the duration from the span data instead. That would let you get rid of this storage class and the need to manage it entirely. Like this:

```kotlin
(inferenceSpan.span as? ReadableSpan)?.toSpanData()?.let { spanData ->
    val duration =
        (spanData.endEpochNanos - spanData.startEpochNanos).toDuration(DurationUnit.NANOSECONDS)
    toolCallDurationHistogram.record(
        duration.inWholeSeconds.toDouble(),
        // attributes
    )
}
```
Yep, I agree we could get rid of the storage class in this case.
However, after implementing this approach I figured out that the OpenTelemetry class, which has almost 900 lines, is no longer much cluttered with metrics handling, thanks to the convenient recordEvent method.
Additionally, MetricCollector separates the logic of AI agent pipeline handling from metric recording. Moreover, it's not coupled with the SpanCollector class and works separately (metrics for metrics, traces for traces).
One more potential benefit is the possibility to monitor state between two events. In the future, we could save some value at the starting event (for example, total spent tokens) and the value at the ending event, and record the difference between these states as a metric (tokens spent for this LLM call) or something like that.
If we reuse the logic from the span collector:
- we would need extra logic in the OpenTelemetry class to get the duration from the span, or we would have to add the span as an argument to the MetricCollector, which leads to coupled, dependent classes. Ideally, metrics and spans should not be dependent, as they provide different ways to observe system/agent behaviour
- we would need to follow a strict "span" -> "metric" matching in the future, otherwise we could not record the time property of an operation
…tegrate verbosity support
… status handling, and change attributes creation
…ests with MetricCollector validations
…elpers, and streamline attribute checks
Motivation and Context
This PR introduces metrics support based on the OpenTelemetry specification. These metrics allow listeners to observe the behaviour of an agent at run time. All changes are done within the OpenTelemetry feature.
Link to issue
Added configuration:
- `addMeterExporter(MetricExporter)` — a function within the OpenTelemetry feature that accepts a MetricExporter and sends library metrics via that exporter

Example of OpenTelemetry feature configuration:
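A minimal sketch, assuming the `install(OpenTelemetry)` DSL shown in the diff above and the new `addMeterExporter` function; `LoggingMetricExporter` is just an illustrative exporter choice:

```kotlin
import io.opentelemetry.exporter.logging.LoggingMetricExporter

install(OpenTelemetry) {
    // Send Koog's metrics through any OpenTelemetry MetricExporter.
    addMeterExporter(LoggingMetricExporter.create())
}
```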
Added metrics:
Visualisation (via the OpenTelemetry plugin): [two screenshots]
Breaking Changes
Type of the changes
Checklist
- `develop` as the base branch

Additional steps for pull requests adding a new feature