Description
Is there an existing issue for this?
- I have searched the existing issues
Describe the bug
When troubleshooting test failures with DistributedApplicationTestingBuilder
I've been facing problems where I'm missing logs - I'll get a container move to FailedToStart
, but no output to go alongside it - which makes fixing the failure very tricky.
After a bit of investigation, I worked out that by adding a Task.Delay
after an exception (but before disposing DistributedApplicationTestingBuilder
), the logs would come through, so it seems like DistributedApplicationTestingBuilder
needs to do some kind of log flushing before disposing itself.
Expected Behavior
Test output should include all stdout/stderr output from the container that failed to start - particularly the very last lines as this likely contains details of the error you need to fix.
Steps To Reproduce
The following example starts a container which writes "Some Error" to standard output, and then exists with exit 1234
to cause the container to fail to start.
- Run the below code
- Expect to see the text
Some Error
somewhere int he output (It is not there) - Uncomment the
Task.Delay()
and re-run the test - Now you do see
Some Error
in the logs
[Test] // NUnit
[CancelAfter(30_000)]
public async Task FlushTest(CancellationToken ct)
{
var builder = DistributedApplicationTestingBuilder.Create();
builder.Services.AddLogging(logging =>
{
logging
.SetMinimumLevel(LogLevel.Debug)
.AddFilter(builder.Environment.ApplicationName, LogLevel.Debug)
.AddFilter("Aspire.", LogLevel.Debug);
});
builder.AddContainer("failure", "mcr.microsoft.com/dotnet/runtime")
.WithEntrypoint("sh")
.WithArgs("-c", "echo Some Error; exit 1234")
.WithHttpEndpoint(targetPort: 8080)
.WithHttpHealthCheck();
var app = builder.Build();
await app.StartAsync(ct);
try
{
await app.ResourceNotifications.WaitForResourceHealthyAsync("failure", ct);
}
catch
{
// Try giving logs time to flush
//await Task.Delay(5_000);
throw;
}
}
Exceptions (if any)
Without the Task.Delay
, you get the following output
dbug: Microsoft.Extensions.Hosting.Internal.Host[1]
Hosting starting
info: Aspire.Hosting.DistributedApplication[0]
Aspire version: 9.2.0-preview.1.25163.7+543a7dada97c3ec6a7758e1ce876889fd005873a
info: Aspire.Hosting.DistributedApplication[0]
Distributed application starting.
info: Aspire.Hosting.DistributedApplication[0]
Application host directory is: S:\MYORG.Aspire\src\MYORG.Aspire.AppHost.Tests
dbug: Aspire.Hosting.Cli.CliBackchannel[0]
Aspire CLI backchannel socket path was not specified.
info: Aspire.Hosting.Dcp.DcpHost[0]
Starting DCP with arguments: start-apiserver --monitor 39540 --detach --kubeconfig "C:\Users\USERNAME\AppData\Local\Temp\aspire.rpstdru0.hpv\kubeconfig"
dbug: Aspire.Hosting.Dcp.DcpExecutor[0]
Watching over DCP Executable resources.
dbug: Aspire.Hosting.Dcp.DcpExecutor[0]
Watching over DCP Container resources.
dbug: Aspire.Hosting.Dcp.DcpExecutor[0]
Watching over DCP Service resources.
dbug: Aspire.Hosting.Dcp.DcpExecutor[0]
Watching over DCP Endpoint resources.
info: Aspire.Hosting.Dcp.dcp.start-apiserver.api-server[0]
Starting API server...
info: Aspire.Hosting.Dcp.dcp.start-apiserver.api-server[0]
API server started {"Address": "127.0.0.1", "Port": 57516}
info: Aspire.Hosting.Dcp.dcp.start-apiserver.dcp-host[0]
Starting DCP controller host
info: Aspire.Hosting.Dcp.dcp.start-apiserver.dcp-host[0]
Started all services {"count": 1}
info: Aspire.Hosting.Dcp.dcpctrl.ContainerOrchestrator[0]
runtime status {"ContainerRuntime": "", "runtime": "podman", "status": {"Installed":false,"Running":false,"Error":"exec: \"podman\": executable file not found in %PATH%\nfailed to start podman command 'Info'"}}
dbug: Aspire.Hosting.Dcp.KubernetesService[0]
Successfully read Kubernetes configuration from 'C:\Users\USERNAME\AppData\Local\Temp\aspire.rpstdru0.hpv\kubeconfig' after 3172 milliseconds.
info: Aspire.Hosting.Dcp.dcpctrl.ContainerOrchestrator[0]
runtime status {"ContainerRuntime": "", "runtime": "docker", "status": {"Installed":true,"Running":true,"Error":""}}
info: Aspire.Hosting.Dcp.dcpctrl[0]
starting controller manager
dbug: Aspire.Hosting.ApplicationModel.ResourceNotificationService[0]
Resource failure/failure-uxasffqy changed state: Starting
info: Aspire.Hosting.DistributedApplication[0]
Distributed application started. Press Ctrl+C to shut down.
dbug: Microsoft.Extensions.Hosting.Internal.Host[2]
Hosting started
dbug: Aspire.Hosting.ApplicationModel.ResourceNotificationService[0]
Waiting for resource 'failure' to enter the 'Healthy' state.
dbug: Aspire.Hosting.ApplicationModel.ResourceNotificationService[0]
Resource failure/failure-uxasffqy changed state: Starting
dbug: Aspire.Hosting.Dcp.DcpExecutor[0]
Starting log streaming for failure-uxasffqy.
info: Aspire.Hosting.Dcp.dcp.start-apiserver.api-server.container-logstreamer.ContainerOrchestrator[0]
runtime status {"ContainerRuntime": "", "runtime": "podman", "status": {"Installed":false,"Running":false,"Error":"exec: \"podman\": executable file not found in %PATH%\nfailed to start podman command 'Info'"}}
info: Aspire.Hosting.Dcp.dcpctrl.NetworkReconciler[0]
network created {"NetworkName": {"name":"default-aspire-network"}, "Reconciliation": 2, "Network": "default-aspire-network-csbs4hjo28"}
info: Aspire.Hosting.Dcp.dcp.start-apiserver.api-server.container-logstreamer.ContainerOrchestrator[0]
runtime status {"ContainerRuntime": "", "runtime": "docker", "status": {"Installed":true,"Running":true,"Error":""}}
info: MYORG.Aspire.AppHost.Tests.Resources.failure[0]
1: 2025-03-20T08:52:41.9090000Z f81b96b2a584facf8668015e8b4259f110235b76bf4e19cb6de0f3141574bb0b
info: Aspire.Hosting.Dcp.dcpctrl.ContainerReconciler[0]
Added new ContainerNetworkConnection {"Container": {"name":"failure-uxasffqy"}, "Reconciliation": 5, "Container": "f81b96b2a584facf8668015e8b4259f110235b76bf4e19cb6de0f3141574bb0b", "ContainerNetworkConnection": "/failure-uxasffqy-default-aspire-network-2h4auef9oo"}
fail: Aspire.Hosting.Dcp.dcpctrl.ContainerReconciler[0]
failed to start Container {"Container": {"name":"failure-uxasffqy"}, "Reconciliation": 7, "ContainerID": "f81b96b2a584facf8668015e8b4259f110235b76bf4e19cb6de0f3141574bb0b", "error": "container f81b96b2a584facf8668015e8b4259f110235b76bf4e19cb6de0f3141574bb0b start failed (exit code 210)\nobject not found\ncontainer not found"}
dbug: Aspire.Hosting.ApplicationModel.ResourceNotificationService[0]
Resource failure/failure-uxasffqy changed state: Starting -> FailedToStart
fail: Aspire.Hosting.ApplicationModel.ResourceNotificationService[0]
Stopped waiting for resource 'failure' to become healthy because it failed to start.
If you uncomment the Task.Delay()
, you will then get the following additional output
fail: Aspire.Hosting.Dcp.dcpproc.monitor[0]
error finding process {"pid": 43524, "error": "no process found with PID 43524"}
fail: Aspire.Hosting.Dcp.dcpproc.monitor[0]
child service process could not be monitored {"PID": 43524, "error": "no process found with PID 43524"}
fail: Aspire.Hosting.Dcp.dcpproc[0]
the program finished with an error {"ExitCode": 1, "error": "no process found with PID 43524"}
info: Trayport.Aspire.AppHost.Tests.Resources.failure[0]
2: 2025-03-20T09:03:45.5613832Z Some Error
.NET Version info
9.2.0-preview.1.25163.7+543a7dada97c3ec6a7758e1ce876889fd005873a
Anything else?
No response
Activity
Allow logs to fully flush before disposing of `DcpExecutor`.
DcpExecutor
. #8423Allow logs to fully flush before disposing of `DcpExecutor`.
Allow logs to fully flush before disposing of `DcpExecutor`.
afscrome commentedon Mar 31, 2025
I think I need to have to abandon my PR as this will need changes in DCP - I couldn't find a way to trigger DCP to shutdown in a way that the app host could fully read the logs before DCP shut down.
The test cases in the PR are still good though.