Description
Three LogError calls in DefaultMessageManager.InvokeHandler are emitted after the telemetry Activity opened by HandlerInvoker.InvokeAsync has already been disposed. As a result, those log records carry no trace context: in any observability platform that ingests logs via OTLP (Datadog, etc.), SpanId and TraceId are zeroed out, and operators cannot click from the log to the corresponding handler trace.
Steps to reproduce
- Register an SQS poller and a handler with AddAWSMessageBus.
- Wire OpenTelemetry tracing via AWS.Messaging.Telemetry.OpenTelemetry (tracing.AddAWSMessagingInstrumentation()), and ship logs through an OTLP exporter.
- Have the handler return MessageProcessStatus.Failed() for a message (or throw an exception that escapes HandlerInvoker, or fault its Task).
- Observe the "Message handling completed unsuccessfully for message ID {MessageId}" (or sibling) log record in the log backend.
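The steps above can be sketched as a minimal host. This is only an illustration, not the exact application the record was captured from: the OrderPlaced message type, the FailingHandler class, and the queue URL placeholder are hypothetical, and it assumes the AWS.Messaging, AWS.Messaging.Telemetry.OpenTelemetry, OpenTelemetry.Extensions.Hosting, and OpenTelemetry.Exporter.OpenTelemetryProtocol packages plus a reachable SQS queue and OTLP endpoint.

```csharp
using AWS.Messaging;
using AWS.Messaging.Telemetry.OpenTelemetry;
using Microsoft.Extensions.DependencyInjection;
using Microsoft.Extensions.Hosting;
using Microsoft.Extensions.Logging;
using OpenTelemetry.Logs;
using OpenTelemetry.Trace;

var builder = Host.CreateApplicationBuilder(args);

// Step 1: register an SQS poller and a handler.
builder.Services.AddAWSMessageBus(bus =>
{
    bus.AddSQSPoller("<queue-url>"); // placeholder, replace with a real queue URL
    bus.AddMessageHandler<FailingHandler, OrderPlaced>();
});

// Step 2: trace the framework and export traces and logs over OTLP.
builder.Services.AddOpenTelemetry()
    .WithTracing(tracing => tracing
        .AddAWSMessagingInstrumentation()
        .AddOtlpExporter());
builder.Logging.AddOpenTelemetry(logging => logging.AddOtlpExporter());

await builder.Build().RunAsync();

// Hypothetical message type used only for this reproduction.
public class OrderPlaced
{
    public string? OrderId { get; set; }
}

// Step 3: a handler that always reports failure, which should trigger the
// "Message handling completed unsuccessfully" log record.
public class FailingHandler : IMessageHandler<OrderPlaced>
{
    public Task<MessageProcessStatus> HandleAsync(
        MessageEnvelope<OrderPlaced> messageEnvelope,
        CancellationToken token = default)
        => Task.FromResult(MessageProcessStatus.Failed());
}
```

Publishing any OrderPlaced message to the queue should then produce the error record described under "Actual behavior".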
Expected behavior
The outcome log records should sit inside the same Activity that wraps the handler invocation, so they share TraceId / SpanId with the handler span and can be correlated with it.
Actual behavior
The records have SpanId: "0000000000000000" and TraceFlags: "None". Example (sanitized) record from a Datadog log ingested via OTLP:
{
  "Attributes": {
    "MessageId": "<message-id>",
    "{OriginalFormat}": "Message handling completed unsuccessfully for message ID {MessageId}"
  },
  "CategoryName": "AWS.Messaging.Services.DefaultMessageManager",
  "SeverityText": "Error",
  "SpanId": "0000000000000000",
  "TraceFlags": "None"
}
The MessageId attribute is the only correlator available; nothing ties the record to the handler trace.
Root cause
The Activity is opened inside HandlerInvoker.InvokeAsync and ends when its using block exits:
HandlerInvoker.cs#L45
using (var trace = _telemetryFactory.Trace("Processing message", messageEnvelope))
{
    // ... handler invocation ...
}
DefaultMessageManager.InvokeHandler calls that method, awaits the returned task, and then decides whether to log the outcome:
DefaultMessageManager.cs#L167-L208
private async Task<bool> InvokeHandler(MessageEnvelope messageEnvelope, SubscriberMapping subscriberMapping, CancellationToken cancelToken)
{
    var isSuccessful = false;
    var handlerTask = _handlerInvoker.InvokeAsync(messageEnvelope, subscriberMapping, cancelToken);
    try
    {
        await handlerTask;
    }
    catch (InvalidMessageHandlerSignatureException) { throw; }
    catch (AWSMessagingException) { /* swallowed */ }
    catch (Exception ex)
    {
        _logger.LogError(ex, "An exception has been thrown from handler '{HandlerType}' ...", ...); // L185
    }
    _inFlightMessageMetadata.Remove(messageEnvelope, out _);
    if (handlerTask.IsCompletedSuccessfully)
    {
        if (handlerTask.Result.IsSuccess) { /* delete */ }
        else
        {
            _logger.LogError("Message handling completed unsuccessfully for message ID {MessageId}", ...); // L200
            await _sqsMessageCommunication.ReportMessageFailureAsync(messageEnvelope);
        }
    }
    else if (handlerTask.IsFaulted)
    {
        _logger.LogError(handlerTask.Exception, "An exception has been thrown from handler '{HandlerType}' ...", ...); // L206
        await _sqsMessageCommunication.ReportMessageFailureAsync(messageEnvelope);
    }
    return isSuccessful;
}
By the time any of L185, L200, or L206 executes, HandlerInvoker.InvokeAsync has returned and the Activity has been disposed, so Activity.Current is no longer the handler activity (it is null or the poller's parent activity).
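The lifetime issue can be shown in isolation with System.Diagnostics alone. The following is a minimal sketch, not the library's code: the ActivitySource name "Demo.HandlerInvoker" and the local InvokeAsync function are hypothetical stand-ins for the telemetry factory and HandlerInvoker.InvokeAsync, and the ActivityListener only exists so that StartActivity returns a non-null Activity.

```csharp
using System;
using System.Diagnostics;
using System.Threading.Tasks;

// Hypothetical source name; stands in for the framework's telemetry factory.
var source = new ActivitySource("Demo.HandlerInvoker");

// StartActivity returns null unless some listener samples the source.
using var listener = new ActivityListener
{
    ShouldListenTo = _ => true,
    Sample = (ref ActivityCreationOptions<ActivityContext> options) =>
        ActivitySamplingResult.AllData,
};
ActivitySource.AddActivityListener(listener);

var insideHadActivity = false;

// Stand-in for HandlerInvoker.InvokeAsync: the activity lives only inside the using block.
async Task<bool> InvokeAsync()
{
    using (var activity = source.StartActivity("Processing message"))
    {
        await Task.Yield(); // simulate asynchronous handler work
        insideHadActivity = Activity.Current is not null;
        Console.WriteLine($"inside scope: SpanId = {Activity.Current?.SpanId}");
        return false; // simulate an unsuccessful outcome
    }
}

// Stand-in for DefaultMessageManager.InvokeHandler: the outcome is inspected after the await.
var handlerTask = InvokeAsync();
await handlerTask;

var afterHasActivity = Activity.Current is not null;
Console.WriteLine(afterHasActivity
    ? $"after await: SpanId = {Activity.Current!.SpanId}"
    : "after await: no current activity");
```

Run as a top-level program with no ambient parent activity, the first line should print a non-zero span ID and the second should report no current activity. A log record written at that second point has nothing to take its TraceId and SpanId from, which matches the zeroed record shown above.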
The same problem applies to all three call sites:
- DefaultMessageManager.cs:185: handler exception that escapes HandlerInvoker and is not InvalidMessageHandlerSignatureException or AWSMessagingException.
- DefaultMessageManager.cs:200: handler returned MessageProcessStatus.Failed().
- DefaultMessageManager.cs:206: handlerTask.IsFaulted branch.
Impact
- Outcome / failure logs cannot be correlated with the handler trace in any platform.
- For users who rely on click-through from log to trace as their primary debugging workflow, the most operationally interesting records (failures, faulted tasks) are exactly the ones with no trace context.
- The MessageId attribute is the only correlator, which forces a manual second query to find the related trace.
Environment
- AWS.Messaging version: 1.3.0
- AWS.Messaging.Telemetry.OpenTelemetry version: 1.0.0
- .NET 10
- Logs and traces exported via OTLP (Datadog backend in our case, but the issue is on the producer side and is platform-independent).