📋 Prerequisites
📝 Feature Summary
Add dedicated OTel spans for the memory subsystem (memory.write / memory.read / memory.embed / memory.consolidate / memory.evict), alongside the existing gen_ai.* spans on invoke_agent. Purely additive instrumentation — no behavior change, no new runtime dependencies.
❓ Problem Statement / Motivation
kagent's platform-level OTel pipeline is already excellent: A2A metadata propagates as span attributes (v0.9.3), the controller's invoke_agent span carries gen_ai.agent.* + gen_ai.provider.name per the OTel GenAI semconv (verified live against a v0.9.4 deployment, Go OTel SDK 1.43.0), and helm/kagent/values.yaml exposes a clean otel.tracing block.
What's missing is dedicated spans for the memory subsystem: save_memory, load_memory, prefetch_memory, and the auto-extractor that fires every 5th user message.
Without dedicated spans, memory operations are visible only as opaque HTTP POSTs against /api/sessions/{ctx-id}/events — which is fine for HTTP-level latency but makes it impossible to compare kagent against other agent-memory backends on retrieval latency, embedding cost, or write amplification.
Verified baseline (Dynatrace, 2026-05-20, kagent v0.9.4) — 48 h DQL scan in a live tenant:
| Surface | Emitted today |
|---|---|
| Agent invocation | invoke_agent span with gen_ai.agent.id, gen_ai.agent.name, gen_ai.operation.name=invoke_agent, gen_ai.provider.name=ollama. Service: kagent-controller v0.9.4, OTel Go SDK 1.43.0. |
| HTTP session/task surface | POST /api/sessions/{ctx-id}/events, POST /api/tasks, GET /api/agents via go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp v0.68.0. HTTP-standard attrs only. |
| A2A worker (Python runtime) | a2a.server.events.in_memory_queue_manager.* via Python A2A SDK auto-instrumentation (in-process queue, not user-facing memory). |
Confirmed gap: zero spans named memory.* across the window; the session-events HTTP span carries no memory.operation / memory.store.kind; no memory_* metric names.
Who benefits: kagent operators (retrieval-latency + embedding-cost visibility), the agent-memory benchmarking workstream (apples-to-apples comparison across OSS backends), and the OTel GenAI semconv WG (a reference implementation to point at when memory.* is proposed upstream).
💡 Proposed Solution
Spans (new)
| Span name | Kind | Required attributes | Optional |
|---|---|---|---|
| memory.write | INTERNAL | memory.operation, memory.store.kind | memory.tenant, memory.input.size_bytes, memory.extracted.facts_count |
| memory.read | INTERNAL | memory.operation, memory.store.kind, memory.query.k | memory.tenant, memory.results.count, memory.top_similarity |
| memory.embed | INTERNAL | memory.embedder.model | memory.embed.token_count |
| memory.consolidate | INTERNAL | memory.consolidate.kind | memory.consolidate.input_items |
| memory.evict | INTERNAL | memory.evict.reason | memory.evict.count |
memory.read.kind=prefetch distinguishes recall-before-LLM-dispatch reads from explicit load_memory tool calls.
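The span table above can be encoded as constants plus a required-attribute check, which would let a unit test catch a span emitted without its mandatory keys. This is a minimal sketch using only the standard library: `MissingRequired`, the `requiredAttrs` map, and the attribute values in `main` are hypothetical illustrations, not existing kagent code.

```go
package main

import (
	"fmt"
	"sort"
)

// Span names taken from the proposal's span table.
const (
	SpanMemoryWrite       = "memory.write"
	SpanMemoryRead        = "memory.read"
	SpanMemoryEmbed       = "memory.embed"
	SpanMemoryConsolidate = "memory.consolidate"
	SpanMemoryEvict       = "memory.evict"
)

// requiredAttrs mirrors the "Required attributes" column of the table.
var requiredAttrs = map[string][]string{
	SpanMemoryWrite:       {"memory.operation", "memory.store.kind"},
	SpanMemoryRead:        {"memory.operation", "memory.store.kind", "memory.query.k"},
	SpanMemoryEmbed:       {"memory.embedder.model"},
	SpanMemoryConsolidate: {"memory.consolidate.kind"},
	SpanMemoryEvict:       {"memory.evict.reason"},
}

// MissingRequired (hypothetical helper) returns the required attribute keys
// that are absent from attrs, sorted alphabetically. It returns an error when
// spanName is not one of the five proposed memory span names.
func MissingRequired(spanName string, attrs map[string]any) ([]string, error) {
	required, ok := requiredAttrs[spanName]
	if !ok {
		return nil, fmt.Errorf("unknown memory span name %q", spanName)
	}
	var missing []string
	for _, key := range required {
		if _, present := attrs[key]; !present {
			missing = append(missing, key)
		}
	}
	sort.Strings(missing)
	return missing, nil
}

func main() {
	// Attribute values here are illustrative assumptions.
	attrs := map[string]any{
		"memory.operation":  "load_memory",
		"memory.store.kind": "vector",
	}
	missing, err := MissingRequired(SpanMemoryRead, attrs)
	fmt.Println(missing, err) // prints: [memory.query.k] <nil>
}
```

Keeping the table in one map means the documentation and the check can be reviewed side by side when the convention changes.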
Resource attributes (set once per process)
- memory.sut.name=kagent
- memory.sut.architecture=vector
- memory.sut.store_backend=pgvector
- memory.sut.version (git SHA or release tag)
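One way to picture the resource attributes is as a small map built once at process start. The sketch below renders that map in the comma-separated key=value form used by the standard OTEL_RESOURCE_ATTRIBUTES environment variable. `MemoryResourceAttrs` and `FormatResourceAttrs` are hypothetical names, the version string is an example value, and the formatter assumes plain values that need no percent-encoding.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// MemoryResourceAttrs (hypothetical helper) returns the four proposed
// resource attributes; version is a git SHA or release tag.
func MemoryResourceAttrs(version string) map[string]string {
	return map[string]string{
		"memory.sut.name":          "kagent",
		"memory.sut.architecture":  "vector",
		"memory.sut.store_backend": "pgvector",
		"memory.sut.version":       version,
	}
}

// FormatResourceAttrs renders attrs as sorted, comma-separated key=value
// pairs. It assumes values contain no commas, equals signs, or spaces,
// so it performs no percent-encoding.
func FormatResourceAttrs(attrs map[string]string) string {
	keys := make([]string, 0, len(attrs))
	for k := range attrs {
		keys = append(keys, k)
	}
	sort.Strings(keys) // deterministic output regardless of map order

	pairs := make([]string, 0, len(keys))
	for _, k := range keys {
		pairs = append(pairs, k+"="+attrs[k])
	}
	return strings.Join(pairs, ",")
}

func main() {
	// "v0.9.4" is an example value for memory.sut.version.
	fmt.Println(FormatResourceAttrs(MemoryResourceAttrs("v0.9.4")))
}
```

Whether these attributes are set in code or injected through the environment is a deployment choice the proposal leaves open.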
Reuse of existing gen_ai.*
For embedding calls inside memory.embed and any LLM dispatch inside the auto-extractor we reuse the GenAI semconv kagent already emits on invoke_agent:
- gen_ai.system, gen_ai.request.model
- gen_ai.operation.name (extended with memory.write.extract, memory.read.rerank)
- gen_ai.usage.input_tokens, gen_ai.usage.output_tokens
No new GenAI conventions — we slot in alongside what's already there.
Parent-span hygiene
Memory-read spans emitted during a request must be children of the existing invoke_agent span when the recall happens before LLM dispatch. Keeps the trace tree connected with what users already see in Dynatrace / Honeycomb / Tempo and avoids orphan trees.
Files
- go/internal/memory/store.go: wrap save_memory, load_memory, prefetch_memory.
- go/internal/memory/extractor.go: memory.consolidate on the auto-extractor (fires every 5th message).
- Session-events handler behind POST /api/sessions/{ctx-id}/events: wrap with a child memory.write span when the payload is a memory-bearing event. This is the highest-leverage single change: the HTTP span already exists, so we only attach business-level semantics.
- go/pkg/telemetry/spans.go: add MemoryOperationName constants reusing the existing tracer (same provider that wires invoke_agent in v0.9.4).
- docs/observability/memory.md (NEW).
- helm/kagent/values.yaml: document new memory span names under the existing otel.tracing block.
Targets the declarative runtime. The BYO ADK runtime documents Memory API as unsupported and is out of scope.
Questions:
- Naming sanity check. Are memory.read/write/embed/consolidate/evict reasonable next to the existing gen_ai.* envelope on invoke_agent?
- PR shape. Single PR with all six files, or phased (handler-only first, then extractor + tools)?
- Runtime targeting. Confirm declarative runtime is the right target (BYO ADK out of scope per docs).
- Helm doc placement. OK to extend the existing otel.tracing block in values.yaml, or do you want a new otel.memory sub-block?
I'll wait for a signal here before creating any PR.
🔄 Alternatives Considered
- Wait for upstream OTel GenAI semconv to land memory.* natively. Viable but slow: the GenAI WG cadence has been ~quarterly. We'd rather ship a reference implementation kagent operators can use today and migrate when upstream solidifies. We're tracking memory-semconv v0.1.0 as the interim contract.
- HTTP-attribute overload on POST /api/sessions/{ctx-id}/events. Add memory.operation / memory.store.kind as attributes on the existing HTTP span instead of creating new spans. Rejected: doesn't model prefetch_memory (no HTTP boundary) or the auto-extractor (background tick); also conflates HTTP timing with retrieval timing in dashboards.
- Metrics-first (counters + histograms) instead of spans. Useful but insufficient: metrics can't show parent→child causality (which invoke_agent triggered which memory.read with which memory.query.k). Spans first; metrics derivable from span attributes later via OTel Collector connectors.
- Custom kagent-specific attribute namespace (kagent.memory.*). Rejected: locks operators into kagent-specific dashboard queries instead of OTel-portable ones. Following the gen_ai.* precedent kagent already adopts.
🎯 Affected Service(s)
Controller Service
📚 Additional Context
- Why now / context. These conventions are being drafted as memory-semconv v0.1.0 across six OSS agent-memory projects (MemPalace, kagent, sympozium, Graphiti, Mem0, Letta) so a benchmark harness can compare retrieval latency, write amplification, and bi-temporal invalidation churn apples-to-apples. The plan is to ship the convention in two implementations first, then propose it upstream to the OpenTelemetry Semantic Conventions GenAI WG, the same pattern gen_ai.* followed.
- Why kagent is the strongest CNCF-context candidate to land it first:
  - The OTel scaffolding is already empirically verified in production.
  - The gen_ai.* envelope already in place proves the maintainers accept OTel semconv as the canonical naming source.
  - The contribution is purely additive: no behavior change, no new runtime dependencies, OTel Go SDK 1.43.0 already in go.mod.
- Talk credibility note (informational). This contribution is referenced in an upcoming KubeCon + OSS Summit talk benchmarking OSS agent-memory solutions on Kubernetes (Cognee / MemOS / Honcho / kagent / sympozium / MemPalace). Whatever maintainers decide here is what gets cited; acceptance is not a precondition for the talk.
- Suggested labels: area/observability, kind/proposal. (Adding good first issue is optional, if maintainers feel it fits.)
🙋 Are you willing to contribute?