Update health check to ensure blob containers created at right time #9159

RussKie · 2025-05-08T08:30:25Z

Resolves #9139
Resolves #9145

The underlying issue is attributed to a lack of health checks for child resources (as those have no lifetime of their own). In a nutshell, whenever an Azurite emulator is starting up, the readiness of the emulator is indicated by the readiness of the "blobs" resource (represented by BlobServiceClient). Previously, child blob containers were created on ResourceReadyEvent, but this created an opportunity for a race condition: a client could attempt to connect after the resource reported healthy but before child resources were created. The flaky test highlighted this problem.

To fix the issue we're now creating an individual health check for each blob container resource. To make it simple, a blob container is being created within the health check itself.
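
A toy simulation of the ordering problem described above may help. It assumes nothing about Aspire's internals: the `containers` dictionary and the `ClientSeesContainerAsync` helper are hypothetical stand-ins for the emulator and a waiting client, and the unlucky interleaving is forced so that the outcome is deterministic.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

// Hypothetical in-memory stand-in for the emulator's set of blob containers.
var containers = new ConcurrentDictionary<string, bool>();

// Simulates one startup. With createBeforeHealthy = false the resource is
// reported healthy first and the container is created afterwards; with true,
// the container exists before "healthy" is signalled.
async Task<bool> ClientSeesContainerAsync(bool createBeforeHealthy)
{
    containers.Clear();
    var healthy = new TaskCompletionSource(TaskCreationOptions.RunContinuationsAsynchronously);
    var clientDone = new TaskCompletionSource(TaskCreationOptions.RunContinuationsAsynchronously);

    var startup = Task.Run(async () =>
    {
        if (createBeforeHealthy)
        {
            containers["mycontainer"] = true;
        }

        healthy.SetResult();

        if (!createBeforeHealthy)
        {
            // Force the unlucky interleaving: the client gets in first.
            await clientDone.Task;
            containers["mycontainer"] = true;
        }
    });

    // The client only waits for "healthy", then immediately uses the container.
    await healthy.Task;
    var seen = containers.ContainsKey("mycontainer");

    clientDone.SetResult();
    await startup;
    return seen;
}

Console.WriteLine(await ClientSeesContainerAsync(createBeforeHealthy: false)); // False
Console.WriteLine(await ClientSeesContainerAsync(createBeforeHealthy: true));  // True
```

In the real emulator the window is timing-dependent rather than forced, which is consistent with the test failing only intermittently.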

davidfowl · 2025-05-08T09:05:34Z

Hmm, we had this debate when @sebastienros did the database creation and I thought we decided to use the ResourceReady event (which runs after health checks).

eerhardt · 2025-05-08T14:56:22Z

Hmm, we had this debate when @sebastienros did the database creation and I thought we decided to use the ResourceReady event (which runs after health checks).

For databases, do we:

aspire/src/Aspire.Hosting.SqlServer/SqlServerBuilderExtensions.cs

Lines 58 to 77 in 88ff939

    
    builder.Eventing.Subscribe<ResourceReadyEvent>(sqlServer, async (@event, ct) =>
    {
        if (connectionString is null)
        {
            throw new DistributedApplicationException($"ResourceReadyEvent was published for the '{sqlServer.Name}' resource but the connection string was null.");
        }

        using var sqlConnection = new SqlConnection(connectionString);
        await sqlConnection.OpenAsync(ct).ConfigureAwait(false);

        if (sqlConnection.State != System.Data.ConnectionState.Open)
        {
            throw new InvalidOperationException($"Could not open connection to '{sqlServer.Name}'");
        }

        foreach (var sqlDatabase in sqlServer.DatabaseResources)
        {
            await CreateDatabaseAsync(sqlConnection, sqlDatabase, @event.Services, ct).ConfigureAwait(false);
        }
    });

aspire/src/Aspire.Hosting.PostgreSQL/PostgresBuilderExtensions.cs

Lines 66 to 90 in 88ff939

    
    builder.Eventing.Subscribe<ResourceReadyEvent>(postgresServer, async (@event, ct) =>
    {
        if (connectionString is null)
        {
            throw new DistributedApplicationException($"ResourceReadyEvent was published for the '{postgresServer.Name}' resource but the connection string was null.");
        }

        // Non-database scoped connection string
        using var npgsqlConnection = new NpgsqlConnection(connectionString + ";Database=postgres;");
        await npgsqlConnection.OpenAsync(ct).ConfigureAwait(false);

        if (npgsqlConnection.State != System.Data.ConnectionState.Open)
        {
            throw new InvalidOperationException($"Could not open connection to '{postgresServer.Name}'");
        }

        foreach (var name in postgresServer.Databases.Keys)
        {
            if (builder.Resources.FirstOrDefault(n => string.Equals(n.Name, name, StringComparisons.ResourceName)) is PostgresDatabaseResource postgreDatabase)
            {
                await CreateDatabaseAsync(npgsqlConnection, postgreDatabase, @event.Services, ct).ConfigureAwait(false);
            }
        }
    });

In general, I dislike using health checks to mutate state. They should just be used to check health, not make something "healthy".

RussKie · 2025-05-08T23:09:09Z

I am not super fond of this result, but I couldn't find a way to add a healthcheck for individual blob containers.
I originally added blob container creation within ResourceReadyEvent; however, it appears this makes the test flaky - there's a race condition, and the client may attempt to access a blob container before it gets created.
If I add a check alongside the blob storage, then ResourceReadyEvent never gets fired (since no containers exist yet). I couldn't add a healthcheck within the event - the service collection at this point is already locked.

Any suggestions?

eerhardt · 2025-05-09T16:16:20Z

but I couldn't find a way to add a healthcheck for individual blob containers.

I don't think we do healthchecks for child resources anywhere else. For example, CosmosDB seems to only have it for the whole service.

aspire/src/Aspire.Hosting.Azure.CosmosDB/AzureCosmosDBExtensions.cs

Lines 104 to 122 in e3d170c

    
    builder.ApplicationBuilder.Eventing.Subscribe<ResourceReadyEvent>(builder.Resource, async (@event, ct) =>
    {
        if (cosmosClient is null)
        {
            throw new InvalidOperationException("CosmosClient is not initialized.");
        }

        await cosmosClient.ReadAccountAsync().WaitAsync(ct).ConfigureAwait(false);

        foreach (var database in builder.Resource.Databases)
        {
            var db = (await cosmosClient.CreateDatabaseIfNotExistsAsync(database.DatabaseName, cancellationToken: ct).ConfigureAwait(false)).Database;

            foreach (var container in database.Containers)
            {
                await db.CreateContainerIfNotExistsAsync(container.ContainerName, container.PartitionKeyPath, cancellationToken: ct).ConfigureAwait(false);
            }
        }
    });

Because the ResourceReadyEvent blocks the resource's "healthy" state until all ResourceReadyEvent listeners complete, the parent resource won't be marked "healthy" until creating the child resources is complete. And the child resources won't be "healthy" until the parent resource is "healthy".
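
The ordering described above can be sketched in plain .NET. This is only an illustration of a "blocking publish", not Aspire's actual eventing code: `readySubscribers`, `MarkHealthyAsync`, and the log messages are hypothetical names made up for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

var log = new List<string>();

// Hypothetical "resource ready" listeners; here a single one that creates children.
var readySubscribers = new List<Func<Task>>
{
    async () =>
    {
        await Task.Delay(20); // pretend to create containers
        log.Add("child containers created");
    },
};

// The parent is reported healthy only after every listener has completed.
async Task MarkHealthyAsync()
{
    await Task.WhenAll(readySubscribers.Select(subscriber => subscriber()));
    log.Add("parent reported healthy");
}

await MarkHealthyAsync();
Console.WriteLine(string.Join(" -> ", log));
// prints "child containers created -> parent reported healthy"
```

Under this model, anything that waits on the parent's "healthy" state cannot observe missing children, which is the property the existing Sql, Postgres, and CosmosDB integrations rely on.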

I think we should be able to follow the existing patterns in Sql, Postgres, and in Azure CosmosDB. What doesn't work about the existing pattern?

Any other suggestions here @sebastienros or @mitchdenny ?

sebastienros · 2025-05-09T16:26:06Z

I don't think we do healthchecks for child resources anywhere else

SqlServer/Postgres databases have one. It's done by using their own connection string which has the Database= property in it so establishing the connection retrieved from ConnectionStringAvailableEvent is sufficient.
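
As a rough sketch of what a database-scoped connection string looks like, the snippet below derives one from a server-level string using `DbConnectionStringBuilder` from the base class library. The connection string values and the `ForDatabase` helper are placeholders invented for the example; the real integrations build their connection strings differently.

```csharp
using System;
using System.Data.Common;

// Hypothetical server-level connection string (placeholder values).
var serverConnectionString = "Server=localhost,1433;User ID=sa;Password=example";

// Returns a copy of the server-level string scoped to a single database.
string ForDatabase(string serverLevel, string databaseName)
{
    var builder = new DbConnectionStringBuilder { ConnectionString = serverLevel };
    builder["Database"] = databaseName;
    return builder.ConnectionString;
}

var childConnectionString = ForDatabase(serverConnectionString, "mydb");

// Parse it back to confirm the child string carries the database name.
var parsed = new DbConnectionStringBuilder { ConnectionString = childConnectionString };
Console.WriteLine(parsed["Database"]); // mydb
```

Opening a connection with such a string should only succeed once the database exists, which is why a plain connection attempt can double as the child resource's health check.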

RussKie · 2025-05-13T01:03:17Z

Thanks @sebastienros for help and guidance. How does this look now?

eerhardt

Looks good.

tests/Aspire.Hosting.Azure.Tests/AzureBicepResourceTests.cs

RussKie · 2025-05-15T22:28:48Z

src/Aspire.Hosting.Azure.Storage/AzureStorageExtensions.cs

-            {
-                throw new DistributedApplicationException($"BlobServiceClient was not created for the '{builder.Resource.Name}' resource.");
-            }
+            // This event is triggered when the health check is healthy.


Suggested change

-            // This event is triggered when the health check is healthy.
+            // This event is triggered when the emulator has started, and BlobServiceClient is marked as healthy.

I disagree with the change; that's not what I wanted to convey. I am saying that this event happens after the health check, and only if it's healthy. Yes, it implies the emulator has started (with more information than just "started"), and "BlobServiceClient is healthy" doesn't mean much. The storage itself is healthy; the client is not "marked" anything.

Fair enough, but "health check is healthy" isn't providing much information either. There are multiple health checks now; it would be good to clarify which health check is triggering this event.

RussKie · 2025-05-15T22:31:26Z

src/Aspire.Hosting.Azure.Storage/AzureStorageExtensions.cs

+        var healthCheckKey = $"{resource.Name}_check";
+
+        BlobServiceClient? blobServiceClient = null;
+        builder.ApplicationBuilder.Services.AddHealthChecks().AddAzureBlobStorage(sp =>
+        {
+            return blobServiceClient ??= CreateBlobServiceClient(connectionString ?? throw new InvalidOperationException("Connection string is not initialized."));
+        }, name: healthCheckKey);


Why do we need to duplicate the HC here? Or why do we need to keep the HC on lines 160-167?

Here it's on the Blobs resource. Line 160 is on the Emulator resource. Doing it on the storage is not sufficient, as WaitForHealthyAsync doesn't bubble up to the parent resources.

If it were just for the existing tests we could probably not have this specific one. But it's more consistent to keep it if we do it for containers.

radical · 2025-05-16T18:32:37Z

tests/Aspire.Hosting.Azure.Tests/AzureStorageEmulatorFunctionalTests.cs

@@ -135,9 +135,10 @@ public async Task VerifyAzureStorageEmulatorResource()

    [Fact]
    [RequiresDocker]
-    [QuarantinedTest("https://github.com/dotnet/aspire/issues/9139")]


Are we sure that this can be dropped? For quarantined tests we want to take it out after it has been green for a certain number of runs (~100 right now).

Will add it back then.

What else? Reopening the issues? Is tracking automatic or is there a process to follow to unquarantine like for aspnet?

Yes, re-open the issue, and it will be tracked automatically. I will take care of taking it out of quarantine for now. It will get semi-automated in the medium term.

RussKie requested review from davidfowl, mitchdenny and eerhardt May 8, 2025 08:30

RussKie self-assigned this May 8, 2025

github-actions bot added the area-integrations Issues pertaining to Aspire Integrations packages label May 8, 2025

Update health check to ensure blob containers created at right time

4ba607c

Resolves #9139 Resolves #9145

RussKie force-pushed the igveliko/fix_9139 branch from 538fc7d to dae66f7 Compare May 13, 2025 01:00

fixup! Update health check to ensure blob containers created at right time

ec3a2df

RussKie force-pushed the igveliko/fix_9139 branch from dae66f7 to ec3a2df Compare May 13, 2025 01:02

sebastienros approved these changes May 13, 2025

sebastienros reviewed May 13, 2025

src/Aspire.Hosting.Azure.Storage/AzureBlobStorageContainerHealthCheck.cs

fixup! Update health check to ensure blob containers created at right time

6aa390f

RussKie requested a review from radical as a code owner May 13, 2025 05:54

RussKie removed the request for review from radical May 13, 2025 05:54

fixup! Update health check to ensure blob containers created at right time

e993433

eerhardt reviewed May 13, 2025

src/Aspire.Hosting.Azure.Storage/AzureStorageExtensions.cs

sebastienros added 2 commits May 13, 2025 13:27

Prevent multiple container checks

eddebef

Use better variable name

5bd0c61

eerhardt reviewed May 13, 2025

src/Aspire.Hosting.Azure.Storage/AzureStorageExtensions.cs

eerhardt reviewed May 13, 2025

src/Aspire.Hosting.Azure.Storage/AzureStorageExtensions.cs

sebastienros added 4 commits May 13, 2025 15:22

Move container creation to RunAsEmulator

48e2d5d

Register single hc for blobs

9737260

Reuse blobserviceclient

80ce587

Fix registrations

4ebf0d7

sebastienros added 3 commits May 13, 2025 16:06

Remove custom heathcheck

a72f514

Remove custom healthcheck

c21f230

Remove specific health checks

53fac90

eerhardt reviewed May 14, 2025

...ents/Aspire.Azure.Storage.Blobs/AspireBlobStorageExtensions.BlobStorageContainerComponent.cs

eerhardt reviewed May 14, 2025

tests/Aspire.Hosting.Azure.Tests/AzureStorageEmulatorFunctionalTests.cs

sebastienros added 2 commits May 14, 2025 12:19

Add blob and container health checks

3242c5c

Fix test

ba67f13

sebastienros requested a review from eerhardt May 15, 2025 15:58

eerhardt reviewed May 15, 2025

src/Aspire.Hosting.Azure.Storage/AzureStorageExtensions.cs

eerhardt reviewed May 15, 2025

src/Aspire.Hosting.Azure.Storage/AzureStorageExtensions.cs

eerhardt reviewed May 15, 2025

src/Aspire.Hosting.Azure.Storage/AzureStorageExtensions.cs

eerhardt approved these changes May 15, 2025

Feedback

273233b

RussKie commented May 15, 2025

tests/Aspire.Hosting.Azure.Tests/AzureBicepResourceTests.cs

RussKie commented May 15, 2025

Improve comment

30641bd

sebastienros merged commit 7baf34b into main May 16, 2025
254 checks passed

sebastienros deleted the igveliko/fix_9139 branch May 16, 2025 18:21

radical reviewed May 16, 2025

sebastienros mentioned this pull request May 27, 2025

[release/9.3] Fix Blob Container Connection String Format Exception #9496

Merged

github-actions bot locked and limited conversation to collaborators Jun 16, 2025
