Skip to content

Final container logs missing when disposing DistributedApplicationTestingBuilder #8206

Open
@afscrome

Description

@afscrome

Is there an existing issue for this?

  • I have searched the existing issues

Describe the bug

When troubleshooting test failures with DistributedApplicationTestingBuilder I've been facing problems where I'm missing logs - I'll get a container move to FailedToStart, but no output to go alongside it - which makes fixing the failure very tricky.

After a bit of investigation, I worked out that by adding a Task.Delay after an exception (but before disposing DistributedApplicationTestingBuilder), the logs would come through, so it seems like DistributedApplicationTestingBuilder needs to do some kind of log flushing before disposing itself.

Expected Behavior

Test output should include all stdout/stderr output from the container that failed to start - particularly the very last lines as this likely contains details of the error you need to fix.

Steps To Reproduce

The following example starts a container which writes "Some Error" to standard output, and then exists with exit 1234 to cause the container to fail to start.

  1. Run the below code
  2. Expect to see the text Some Error somewhere int he output (It is not there)
  3. Uncomment the Task.Delay() and re-run the test
  4. Now you do see Some Error in the logs
   [Test]  // NUnit
   [CancelAfter(30_000)]
   public async Task FlushTest(CancellationToken ct)
   {
      var builder = DistributedApplicationTestingBuilder.Create();
      builder.Services.AddLogging(logging =>
      {
         logging
            .SetMinimumLevel(LogLevel.Debug)
            .AddFilter(builder.Environment.ApplicationName, LogLevel.Debug)
            .AddFilter("Aspire.", LogLevel.Debug);
      });

      builder.AddContainer("failure", "mcr.microsoft.com/dotnet/runtime")
         .WithEntrypoint("sh")
         .WithArgs("-c", "echo Some Error; exit 1234")
         .WithHttpEndpoint(targetPort: 8080)
         .WithHttpHealthCheck();

      var app = builder.Build();
      await app.StartAsync(ct);

      try
      {
         await app.ResourceNotifications.WaitForResourceHealthyAsync("failure", ct);
      }
      catch
      {
         // Try giving logs time to flush
         //await Task.Delay(5_000);
         throw;
      }
   }

Exceptions (if any)

Without the Task.Delay, you get the following output

dbug: Microsoft.Extensions.Hosting.Internal.Host[1]
      Hosting starting
info: Aspire.Hosting.DistributedApplication[0]
      Aspire version: 9.2.0-preview.1.25163.7+543a7dada97c3ec6a7758e1ce876889fd005873a
info: Aspire.Hosting.DistributedApplication[0]
      Distributed application starting.
info: Aspire.Hosting.DistributedApplication[0]
      Application host directory is: S:\MYORG.Aspire\src\MYORG.Aspire.AppHost.Tests
dbug: Aspire.Hosting.Cli.CliBackchannel[0]
      Aspire CLI backchannel socket path was not specified.
info: Aspire.Hosting.Dcp.DcpHost[0]
      Starting DCP with arguments: start-apiserver --monitor 39540 --detach --kubeconfig "C:\Users\USERNAME\AppData\Local\Temp\aspire.rpstdru0.hpv\kubeconfig"
dbug: Aspire.Hosting.Dcp.DcpExecutor[0]
      Watching over DCP Executable resources.
dbug: Aspire.Hosting.Dcp.DcpExecutor[0]
      Watching over DCP Container resources.
dbug: Aspire.Hosting.Dcp.DcpExecutor[0]
      Watching over DCP Service resources.
dbug: Aspire.Hosting.Dcp.DcpExecutor[0]
      Watching over DCP Endpoint resources.
info: Aspire.Hosting.Dcp.dcp.start-apiserver.api-server[0]
      Starting API server...
info: Aspire.Hosting.Dcp.dcp.start-apiserver.api-server[0]
      API server started	{"Address": "127.0.0.1", "Port": 57516}
info: Aspire.Hosting.Dcp.dcp.start-apiserver.dcp-host[0]
      Starting DCP controller host
info: Aspire.Hosting.Dcp.dcp.start-apiserver.dcp-host[0]
      Started all services	{"count": 1}
info: Aspire.Hosting.Dcp.dcpctrl.ContainerOrchestrator[0]
      runtime status	{"ContainerRuntime": "", "runtime": "podman", "status": {"Installed":false,"Running":false,"Error":"exec: \"podman\": executable file not found in %PATH%\nfailed to start podman command 'Info'"}}
dbug: Aspire.Hosting.Dcp.KubernetesService[0]
      Successfully read Kubernetes configuration from 'C:\Users\USERNAME\AppData\Local\Temp\aspire.rpstdru0.hpv\kubeconfig' after 3172 milliseconds.
info: Aspire.Hosting.Dcp.dcpctrl.ContainerOrchestrator[0]
      runtime status	{"ContainerRuntime": "", "runtime": "docker", "status": {"Installed":true,"Running":true,"Error":""}}
info: Aspire.Hosting.Dcp.dcpctrl[0]
      starting controller manager
dbug: Aspire.Hosting.ApplicationModel.ResourceNotificationService[0]
      Resource failure/failure-uxasffqy changed state: Starting
info: Aspire.Hosting.DistributedApplication[0]
      Distributed application started. Press Ctrl+C to shut down.
dbug: Microsoft.Extensions.Hosting.Internal.Host[2]
      Hosting started
dbug: Aspire.Hosting.ApplicationModel.ResourceNotificationService[0]
      Waiting for resource 'failure' to enter the 'Healthy' state.
dbug: Aspire.Hosting.ApplicationModel.ResourceNotificationService[0]
      Resource failure/failure-uxasffqy changed state: Starting
dbug: Aspire.Hosting.Dcp.DcpExecutor[0]
      Starting log streaming for failure-uxasffqy.
info: Aspire.Hosting.Dcp.dcp.start-apiserver.api-server.container-logstreamer.ContainerOrchestrator[0]
      runtime status	{"ContainerRuntime": "", "runtime": "podman", "status": {"Installed":false,"Running":false,"Error":"exec: \"podman\": executable file not found in %PATH%\nfailed to start podman command 'Info'"}}
info: Aspire.Hosting.Dcp.dcpctrl.NetworkReconciler[0]
      network created	{"NetworkName": {"name":"default-aspire-network"}, "Reconciliation": 2, "Network": "default-aspire-network-csbs4hjo28"}
info: Aspire.Hosting.Dcp.dcp.start-apiserver.api-server.container-logstreamer.ContainerOrchestrator[0]
      runtime status	{"ContainerRuntime": "", "runtime": "docker", "status": {"Installed":true,"Running":true,"Error":""}}
info: MYORG.Aspire.AppHost.Tests.Resources.failure[0]
      1: 2025-03-20T08:52:41.9090000Z f81b96b2a584facf8668015e8b4259f110235b76bf4e19cb6de0f3141574bb0b
info: Aspire.Hosting.Dcp.dcpctrl.ContainerReconciler[0]
      Added new ContainerNetworkConnection	{"Container": {"name":"failure-uxasffqy"}, "Reconciliation": 5, "Container": "f81b96b2a584facf8668015e8b4259f110235b76bf4e19cb6de0f3141574bb0b", "ContainerNetworkConnection": "/failure-uxasffqy-default-aspire-network-2h4auef9oo"}
fail: Aspire.Hosting.Dcp.dcpctrl.ContainerReconciler[0]
      failed to start Container	{"Container": {"name":"failure-uxasffqy"}, "Reconciliation": 7, "ContainerID": "f81b96b2a584facf8668015e8b4259f110235b76bf4e19cb6de0f3141574bb0b", "error": "container f81b96b2a584facf8668015e8b4259f110235b76bf4e19cb6de0f3141574bb0b start failed (exit code 210)\nobject not found\ncontainer not found"}
dbug: Aspire.Hosting.ApplicationModel.ResourceNotificationService[0]
      Resource failure/failure-uxasffqy changed state: Starting -> FailedToStart
fail: Aspire.Hosting.ApplicationModel.ResourceNotificationService[0]
      Stopped waiting for resource 'failure' to become healthy because it failed to start.

If you uncomment the Task.Delay(), you will then get the following additional output

    fail: Aspire.Hosting.Dcp.dcpproc.monitor[0]
          error finding process	{"pid": 43524, "error": "no process found with PID 43524"}
    fail: Aspire.Hosting.Dcp.dcpproc.monitor[0]
          child service process could not be monitored	{"PID": 43524, "error": "no process found with PID 43524"}
    fail: Aspire.Hosting.Dcp.dcpproc[0]
          the program finished with an error	{"ExitCode": 1, "error": "no process found with PID 43524"}
    info: Trayport.Aspire.AppHost.Tests.Resources.failure[0]
          2: 2025-03-20T09:03:45.5613832Z Some Error

.NET Version info

9.2.0-preview.1.25163.7+543a7dada97c3ec6a7758e1ce876889fd005873a

Anything else?

No response

Activity

added
area-app-modelIssues pertaining to the APIs in Aspire.Hosting, e.g. DistributedApplication
on Mar 20, 2025
changed the title Final logs missing when disposing `DistributedApplicationTestingBuilder` Final container logs missing when disposing `DistributedApplicationTestingBuilder` on Mar 20, 2025
added a commit that references this issue on Mar 30, 2025
fe13058
added 2 commits that reference this issue on Mar 30, 2025
978ccb0
1b678fd
added and removed
area-app-modelIssues pertaining to the APIs in Aspire.Hosting, e.g. DistributedApplication
on Mar 31, 2025
afscrome

afscrome commented on Mar 31, 2025

@afscrome
ContributorAuthor

I think I need to have to abandon my PR as this will need changes in DCP - I couldn't find a way to trigger DCP to shutdown in a way that the app host could fully read the logs before DCP shut down.

The test cases in the PR are still good though.

added this to the Backlog milestone on Mar 31, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Metadata

Metadata

Assignees

Type

No type

Projects

No projects

Relationships

None yet

Development

No branches or pull requests

Issue actions

    Final container logs missing when disposing `DistributedApplicationTestingBuilder` · Issue #8206 · dotnet/aspire