Refine recording errors documentation to include logs and avoid span events #3228

pellared · 2025-12-19T13:20:59Z

Towards OTEP: Span Event API deprecation plan

Towards open-telemetry/opentelemetry-specification#4429

Towards open-telemetry/opentelemetry-go-contrib#7254 (comment)
Towards open-telemetry/opentelemetry-go-contrib#7470

Supersedes #2296

"Prototype" (do this across all instrumentation libraries in Go): open-telemetry/opentelemetry-go-contrib#7254 (comment)

Note that this is needed for the stabilization of otelhttp, the OTel Go instrumentation library for HTTP servers and clients, because we want a clear and stable way to record errors.

Changes

Adds comprehensive guidance on recording errors via logs (event records) instead of span events
Clarifies the distinction between "error" and "failed operation" concepts
Deprecates the recommendation to use Span.RecordException for error recording
Establishes consistency requirements for error.type across all signals (spans, metrics, logs)
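
The consistency requirement in the last bullet can be pictured with a small sketch. It is a minimal illustration, not the wording of the proposed document: `error_type_of` and `record_failure` are hypothetical helpers, plain dictionaries stand in for span, metric, and log-record attributes, and using the exception's qualified class name as the `error.type` value is an assumption made for this example.

```python
def error_type_of(exc: BaseException) -> str:
    """Derive one error.type value from an exception (hypothetical helper)."""
    cls = type(exc)
    # Builtin exceptions are reported without the "builtins." prefix.
    if cls.__module__ == "builtins":
        return cls.__qualname__
    return f"{cls.__module__}.{cls.__qualname__}"


def record_failure(exc: BaseException) -> dict:
    """Return the attributes each signal would carry for the same error.

    Plain dicts stand in for span, metric, and log-record attributes;
    no OpenTelemetry API is used here.
    """
    error_type = error_type_of(exc)  # computed once, reused for every signal
    return {
        "span": {"error.type": error_type},
        "metric": {"error.type": error_type},
        "log": {"error.type": error_type},
    }


signals = record_failure(TimeoutError("upstream timed out"))
print(signals["span"]["error.type"])  # the same value appears on every signal
```

Computing the value once and passing it to each signal is what keeps spans, metrics, and logs joinable on `error.type`.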

Other

I was also considering deprecating Semantic conventions for exceptions on spans, but decided to hold off and better scope it to a separate PR.

As next steps (probably after stabilizing parts of this document), we could deprecate:

Changing "error" into "exception" terminology (which better aligns with the specification e.g. Span.RecordException and current semantic conventions) can be done in a separate PR

I was thinking about adding error.stacktrace per 2025-12-09 Logs SIG meeting, but I felt that it would be better to scope it to a separate PR as this is already big enough. But probably we would prefer to use exception.* attributes anyway.


Copilot

Pull request overview

This PR refines the error recording documentation to align with the Span Event API deprecation plan (OTEP 4430), which is necessary for stabilizing the otelhttp instrumentation library in OpenTelemetry Go.

Key changes:

Adds comprehensive guidance on recording errors via logs/events instead of span events
Clarifies the distinction between "error" and "failed operation" concepts
Deprecates the recommendation to use Span.RecordException for error recording
Establishes consistency requirements for error.type across all signals (spans, metrics, logs)

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 5 comments.

| File | Description |
| --- | --- |
| docs/general/recording-errors.md | Major restructure of error recording guidance: adds new sections for logs, clarifies when operations are failed vs encountering errors, removes Java exception handling example, discourages span events for error recording, and adds guidance for severity levels in log-based error recording |
| .chloggen/3228.yaml | Adds changelog entry documenting the enhancement to error recording documentation |

docs/general/recording-errors.md


Copilot

Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.

Copilot

Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated 4 comments.

docs/general/recording-errors.md

alexmojaki · 2026-01-07T12:45:52Z

docs/general/recording-errors.md

+
+When the instrumented operation failed, the instrumentation:
+
+- SHOULD set the span status code to `Error`,


By the definitions above, this would happen any time the span ends with an exception thrown by the instrumented operation. But the instrumentation may know that the exception doesn't actually represent an error, e.g. that the exception is expected to be handled outside the span (see pydantic/logfire#1361 for example).

We still want to be able to record info about these kinds of exceptions, including a traceback. But they shouldn't be marked as errors. That means that there's also no place here to store the exception message, since the span status description can't be set.

PTAL 6e232d4

alexmojaki · 2026-01-07T12:51:38Z

docs/general/recording-errors.md

+- SHOULD set [`SeverityNumber`][SeverityNumber].
+
+When the error happens during an operation,
+it is RECOMMENDED to set [`SeverityNumber`][SeverityNumber] below 9 (INFO).


I think what you're saying here is "when the error happens before the operation has ended and is handled, so the operation is still expected to complete successfully". That should be clarified.

In that situation, I find it surprising to consider this as severity below INFO. I would usually log this as a warning, it's concerning enough to warrant some attention. A user is likely to configure logging to filter out everything below INFO, meaning they wouldn't see any sign of transient errors at all. If there was internal retrying to overcome the error, you'd likely see a span taking mysteriously long and think there was a performance problem.
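
The filtering concern raised here can be made concrete with a minimal sketch. The record shape and the `visible` helper are hypothetical; the only fact taken from the OpenTelemetry logs data model is that severity number 9 corresponds to INFO, with smaller numbers for DEBUG and TRACE and larger ones for WARN and above.

```python
INFO = 9  # SeverityNumber of INFO in the OpenTelemetry logs data model


def visible(records, min_severity=INFO):
    """Keep only records at or above the configured minimum severity."""
    return [r for r in records if r["severity_number"] >= min_severity]


records = [
    # Transient error recorded below INFO (the DEBUG range is 5 to 8).
    {"severity_number": 5, "error.type": "timeout", "body": "attempt 1 failed, retrying"},
    # The retry then succeeds and is logged at INFO.
    {"severity_number": 9, "body": "request completed"},
]

# With an INFO threshold the transient error record is dropped, so only
# the slow but successful operation remains visible.
print([r["body"] for r in visible(records)])  # ['request completed']
```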

> I think what you're saying here is "when the error happens before the operation has ended and is handled, so the operation is still expected to complete successfully". That should be clarified.

It is also when it has ended and when it is not handled (operation fails).
This tries to address #3228 (comment)

> I would usually log this as a warning, it's concerning enough to warrant some attention.

I disagree. Transient errors are happening all the time. I would consider it log overuse.

> A user is likely to configure logging to filter out everything below INFO, meaning they wouldn't see any sign of transient errors at all. If there was internal retrying to overcome the error, you'd likely see a span taking mysteriously long and think there was a performance problem.

This is not true. Note that the user would get the information about the transient errors via metrics. Then, if users want to diagnose it further, they can always set even "TRACE" severity for a given error.type.

> I disagree. Transient errors are happening all the time. I would consider it log overuse.

Based on what? We record transient errors in our application as warnings and I'm glad we do.

> Note that the user would get the information about the transient errors via metrics.

Unless you're lucky, there would be no way to determine from metrics what happened for a specific given span. And the user might not even think to look at metrics at all, or know how.

> Then, if users want to diagnose it further

But they can't diagnose past events.

> they can always set even "TRACE" severity for a given error.type.

Is this particularly easy to do? In any case I think it would be better if the user had the option to set the severity of the log to something higher.
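
Both positions in this exchange amount to making the threshold configurable. A minimal sketch of a filter with a per-`error.type` override follows; `SeverityFilter` and its parameters are hypothetical and do not correspond to any existing OpenTelemetry SDK API.

```python
from dataclasses import dataclass, field
from typing import Optional

TRACE, INFO = 1, 9  # SeverityNumber values from the OpenTelemetry logs data model


@dataclass
class SeverityFilter:
    """Hypothetical filter: a default threshold plus per-error.type overrides."""

    default_min: int = INFO
    per_error_type: dict = field(default_factory=dict)

    def accepts(self, severity_number: int, error_type: Optional[str] = None) -> bool:
        # Use the override for this error.type if one is configured,
        # otherwise fall back to the default threshold.
        minimum = self.per_error_type.get(error_type, self.default_min)
        return severity_number >= minimum


# Lower the threshold to TRACE only for records whose error.type is "timeout".
flt = SeverityFilter(per_error_type={"timeout": TRACE})
print(flt.accepts(5, "timeout"))             # True: the override applies
print(flt.accepts(5, "connection_refused"))  # False: default INFO threshold
```

Such an override only affects records emitted after it is configured, which is the limitation noted above about diagnosing past events.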

Going to address it together with #3228 (comment)

Saying "above info" or "below info" without enough context will seem confusing.

PTAL 14f21c6


docs/general/recording-errors.md

alexmojaki · 2026-01-07T13:32:44Z

docs/general/recording-errors.md

+- SHOULD set the [`error.type`][ErrorType] attribute,
+- SHOULD set [`error.message`][ErrorMessage] attribute to add additional
+  information about the error, for example, an exception message,
+- SHOULD set the `error.stacktrace` attribute.


i thought you didn't want to introduce new attributes. why not exception.stacktrace?

> Changing "error" into "exception" terminology (which better aligns with the specification e.g. Span.RecordException and current semantic conventions) can be done in a separate PR.

There are too many open discussions around this and it is not clear yet which way we want to go.

You'd only set the stacktrace in the case of an exception, right?

This is implementation/language specific. E.g. errors in Go usually do not have stacktraces. However, they may have one, e.g. when using github.com/pkg/errors.

so will you add a new attribute to the registry?

Not in the scope of this PR or at least not until there is a clear agreement.

> There are too many open discussions around this and it is not clear yet which way we want to go.

Note that this document is in Development status.

alexmojaki · 2026-01-07T13:33:23Z

docs/general/recording-errors.md

+When the instrumented operation failed, the instrumentation:

-  It's NOT RECOMMENDED to duplicate status code or `error.type` in span status description.
+- SHOULD set the span status code to `Error` if this is a semantical error,


i think the condition needs to be elaborated

Any proposal for how? Here is an example of what constitutes a semantic error for HTTP spans: https://github.com/open-telemetry/semantic-conventions/blob/main/docs/http/http-spans.md#status. Maybe a hyperlink, as this is a good example?

SHOULD set the span status code to Error, unless this is an exception that doesn't actually represent an error, e.g. it is being used for control flow, or is expected to be handled gracefully outside the span. For example, a 400 HTTP status code is not an error on an HTTP server span because it indicates a problem with the user's request, not the application. But a 500 HTTP status code is an error.
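
The 400 versus 500 distinction in this suggestion follows the HTTP span status rules linked above, and can be sketched as a small function. This is only an illustration: `span_status_for_http` is a hypothetical name, and plain strings stand in for the real status code and span kind enums.

```python
def span_status_for_http(status_code: int, span_kind: str) -> str:
    """Map an HTTP response status code to a span status code name.

    span_kind is "server" or "client". Strings stand in for the real
    status code and span kind enums (an assumption of this sketch).
    """
    if status_code >= 500:
        return "Error"  # server-side failure: an error for both span kinds
    if 400 <= status_code < 500:
        # A 4xx response points at the caller's request, so a server span
        # leaves the status unset while a client span reports an error.
        return "Error" if span_kind == "client" else "Unset"
    return "Unset"  # 1xx, 2xx, 3xx


print(span_status_for_http(400, "server"))  # Unset
print(span_status_for_http(500, "server"))  # Error
print(span_status_for_http(400, "client"))  # Error
```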

Side note: it would probably be clearer once we change the exception terminology to error terminology.

Similar feedback in #3228 (comment)

PTAL 14f21c6


alexmojaki · 2026-01-07T14:09:56Z

docs/general/recording-errors.md

+- SHOULD set the span status code to `Error` if this is a semantic error,
+- SHOULD set the [`error.type`][ErrorType] attribute,
+- SHOULD set [`error.message`][ErrorMessage] attribute to add additional
+  information about the error, for example, an exception message,


This is incompatible with:

semantic-conventions/docs/registry/attributes/error.md

Lines 19 to 21 in 502202e

It is also NOT RECOMMENDED to duplicate the value of `exception.message` in `error.message`.

`error.message` is NOT RECOMMENDED for metrics or spans due to its unbounded cardinality and overlap with span status.

one of them needs to change


pellared · 2026-01-07T20:11:28Z

SIG meeting notes:
Some changes made here are seen as too big and too controversial.
Going to take smaller steps on the things that we agreed on.

Personal opinion: I think it was good, as this PR initiated some important discussions and helps us plan the next steps.

pellared · 2026-01-07T21:47:23Z

SIG meeting notes:
We also agreed that otelhttp should use neither the Record Exception API nor the "recording exception" guidelines. It should just follow the "recording errors on spans" and "recording errors on metrics" guidelines.
CC @open-telemetry/go-maintainers

Refine recording errors documentation towards Span Event API deprecation plan

ed8f108

pellared requested review from a team as code owners December 19, 2025 13:21

github-project-automation bot added this to Semantic Conventions Triage Dec 19, 2025

github-project-automation bot moved this to Untriaged in Semantic Conventions Triage Dec 19, 2025

pellared changed the title ~~Refine recording errors documentation towards Span Event API deprecat…~~ Refine recording errors documentation towards Span Event API deprecation plan Dec 19, 2025

pellared mentioned this pull request Dec 19, 2025

Record span-ending exceptions as span attributes instead of span event or log open-telemetry/opentelemetry-specification#4429

Open

pellared changed the title ~~Refine recording errors documentation towards Span Event API deprecation plan~~ Refine recording errors documentation to include logs and avoid span events Dec 19, 2025

add chlog entry

d96eedf

github-actions bot added the enhancement New feature or request label Dec 19, 2025

refine chlog

13f95d2

pellared requested a review from Copilot December 19, 2025 13:35

Copilot started reviewing on behalf of pellared December 19, 2025 13:35

Copilot AI reviewed Dec 19, 2025

pellared mentioned this pull request Dec 19, 2025

Do not use span.RecordError for terminating errors open-telemetry/opentelemetry-go-contrib#7470

Open

pellared added 2 commits December 19, 2025 14:50

Fix grammar and clarify logging recommendations in recording errors documentation

0baed85

Update links to OpenTelemetry specification for version 1.52.0 in recording errors documentation

6b5bc87

pellared requested a review from Copilot December 19, 2025 13:52

Copilot started reviewing on behalf of pellared December 19, 2025 13:52

pellared added this to Logs SIG Dec 19, 2025

pellared self-assigned this Dec 19, 2025

pellared moved this to In progress in Logs SIG Dec 19, 2025

Copilot AI reviewed Dec 19, 2025

pellared requested a review from Copilot December 19, 2025 14:50

Copilot started reviewing on behalf of pellared December 19, 2025 14:51

Copilot AI reviewed Dec 19, 2025

pellared added 2 commits December 19, 2025 16:15

Apply feedback

6390bfd

Refine recording errors section

b352235

cijothomas reviewed Dec 19, 2025

cijothomas reviewed Dec 19, 2025

pellared added 3 commits January 5, 2026 12:04

logs to use error.type as other signals

02744ca

example for operation failure in logging recommendations

aa38670

refine handling of retried or handled errors in logging recommendations

1be560d

pellared requested a review from lmolkova January 5, 2026 11:39

pellared added the changelog.opentelemetry.io label Jan 5, 2026

dashpole reviewed Jan 5, 2026

simplify error recording guidelines

36405a0

pellared requested a review from dashpole January 7, 2026 11:09

alexmojaki reviewed Jan 7, 2026

alexmojaki requested changes Jan 7, 2026

github-project-automation bot moved this from Untriaged to Blocked in Semantic Conventions Triage Jan 7, 2026

pellared added 2 commits January 7, 2026 14:18

clarify failed operation definition and update span status requirements

060846c

refine definition of failed operations and update error handling guidelines

6e232d4

pellared requested a review from alexmojaki January 7, 2026 13:27

alexmojaki reviewed Jan 7, 2026

Update docs/general/recording-errors.md

521e340

Co-authored-by: Alex Hall

pellared requested a review from alexmojaki January 7, 2026 13:41

alexmojaki reviewed Jan 7, 2026

pellared added 2 commits January 7, 2026 17:55

clarify error handling in span recording and adjust severity recommendations

14f21c6

Merge branch 'recording-errors' of github.com:pellared/semantic-conventions into recording-errors

c07f05c

pellared requested a review from alexmojaki January 7, 2026 16:58

dashpole reviewed Jan 7, 2026

dashpole approved these changes Jan 7, 2026

pellared added 2 commits January 7, 2026 18:27

refine error handling guidelines by removing the failed operation section and clarifying recording practices for retried errors

266c084

remove guidance on recording retried errors in metrics

c52e846

it is redundant with the line below

pellared marked this pull request as draft January 7, 2026 20:11

pellared closed this Jan 7, 2026

github-project-automation bot moved this from In progress to Done in Logs SIG Jan 7, 2026

