Description
Background:
During a random PPAF exercise, we found that when the following quorum-loss conditions are met and the user provides a cancellation token, the SDK honors the token but the write request does not hedge, even though the hedging threshold is much lower than the cancellation token's expiry time:
- Quorum loss is injected (3 out of 4 replicas are down).
- The primary replica is specifically down.
- A cancellation token with a timeout of 5 seconds is provided.
- An availability strategy with a predefined threshold of 1 second is provided.
Sample Code:
// container, requestOptions, random, and Comment are defined elsewhere in the test application.
CosmosClientOptions clientOptions = new CosmosClientOptions
{
    ApplicationName = "dkunda-ppaf-app",
    EnableContentResponseOnWrite = true,
    ApplicationPreferredRegions = new List<string> { Regions.NorthCentralUS, Regions.CentralUS, Regions.WestUS2 },
    ConnectionMode = ConnectionMode.Direct,
    ConsistencyLevel = Cosmos.ConsistencyLevel.Session,
    AvailabilityStrategy = AvailabilityStrategy.CrossRegionHedgingStrategy(
        threshold: TimeSpan.FromMilliseconds(1000), // The threshold is much lower than the cancellation token expiry time.
        thresholdStep: TimeSpan.FromMilliseconds(50),
        enableMultiWriteRegionHedge: true),
};

CancellationTokenSource cts = new CancellationTokenSource();
cts.CancelAfter(TimeSpan.FromSeconds(5)); // The cancellation token expires after 5 seconds.

Comment comment = new Comment(Guid.NewGuid().ToString(), "pk", random.Next().ToString(), "[email protected]", "This document is intended for ppaf testing demo.");
ItemResponse<Comment> writeResponse = await container.CreateItemAsync<Comment>(
    item: comment,
    partitionKey: new Cosmos.PartitionKey(comment.postId),
    requestOptions: requestOptions,
    cancellationToken: cts.Token
);
Observation:
- Write requests don't hedge to other regions.
Acceptance Criteria:
- Write requests should hedge to the other regions listed in the preferred regions.
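To make the expected behavior concrete, here is a minimal sketch of threshold-based hedging under a caller-supplied cancellation token. It is not the SDK's implementation: `HedgingSketch`, `SendWithHedgingAsync`, and the `send` delegate are hypothetical names used only for illustration, and the sketch assumes the first hedge starts after `threshold` and each later one after `thresholdStep`. It uses only base class library types.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

public static class HedgingSketch
{
    // Hypothetical helper, not an SDK API. Sends to regions[0] first, then
    // starts one extra request per remaining region each time the wait elapses.
    // The first request to complete (successfully or not) decides the outcome.
    public static async Task<T> SendWithHedgingAsync<T>(
        IReadOnlyList<string> regions,
        Func<string, CancellationToken, Task<T>> send,
        TimeSpan threshold,
        TimeSpan thresholdStep,
        CancellationToken cancellationToken)
    {
        if (regions.Count == 0)
        {
            throw new ArgumentException("At least one region is required.", nameof(regions));
        }

        // The linked source honors the caller's token and also lets us cancel
        // the losing requests once one attempt has completed.
        using CancellationTokenSource linked =
            CancellationTokenSource.CreateLinkedTokenSource(cancellationToken);
        List<Task<T>> inFlight = new List<Task<T>>();

        try
        {
            for (int i = 0; i < regions.Count; i++)
            {
                inFlight.Add(send(regions[i], linked.Token));

                // After the last region there is nothing left to hedge to, so
                // wait until a request completes or the caller cancels.
                TimeSpan wait = Timeout.InfiniteTimeSpan;
                if (i < regions.Count - 1)
                {
                    wait = i == 0 ? threshold : thresholdStep;
                }

                Task timer = Task.Delay(wait, linked.Token);
                List<Task> waitSet = new List<Task>(inFlight) { timer };
                Task finished = await Task.WhenAny(waitSet);

                if (finished != timer)
                {
                    return await (Task<T>)finished;
                }

                // The timer ended: either the threshold elapsed (hedge to the
                // next region) or the caller's token was canceled (stop here).
                cancellationToken.ThrowIfCancellationRequested();
            }

            throw new InvalidOperationException("No request completed.");
        }
        finally
        {
            linked.Cancel(); // Stop any requests that are still running.
        }
    }
}
```

With a first region that never responds, a second region that responds quickly, a 1 second threshold, and a 5 second cancellation token, this sketch returns the second region's result shortly after the threshold instead of waiting for the token to expire, which is the behavior the acceptance criteria ask for.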
Sample Diagnostics:
Diagnostics-1
{
"Summary": {
"GatewayCalls": {
"(200, 0)": 3
}
},
"name": "CreateItemAsync",
"start datetime": "2025-03-11T01:20:26.062Z",
"duration in milliseconds": 5002.7305,
"data": {
"Client Configuration": {
"Client Created Time Utc": "2025-03-11T01:19:53.2738777Z",
"MachineId": "hashedMachineName:dd823358-1397-c938-1a2d-e52a0b922240",
"NumberOfClientsCreated": 1,
"NumberOfActiveClients": 1,
"ConnectionMode": "Direct",
"User Agent": "cosmos-netstandard-sdk/3.47.2|1|X64|Microsoft Windows 10.0.26100|.NET 6.0.36|L|F1|dkunda-ppaf-app",
"ConnectionConfig": {
"gw": "(cps:50, urto:6, p:False, httpf: False)",
"rntbd": "(cto: 5, icto: -1, mrpc: 30, mcpe: 65535, erd: True, pr: ReuseUnicastPort)",
"other": "(ed:False, be:False)"
},
"ConsistencyConfig": "(consistency: Session, prgns:[North Central US, Central US, West US 2], apprgn: )",
"ProcessorCount": 12
}
},
"children": [
{
"name": "ItemSerialize",
"duration in milliseconds": 0.0684
},
{
"name": "Microsoft.Azure.Cosmos.Handlers.RequestInvokerHandler",
"duration in milliseconds": 5002.0814,
"children": [
{
"name": "Get Collection Cache",
"duration in milliseconds": 0.0015
},
{
"name": "Microsoft.Azure.Cosmos.Handlers.DiagnosticsHandler",
"duration in milliseconds": 5001.2278,
"children": [
{
"name": "Microsoft.Azure.Cosmos.Handlers.TelemetryHandler",
"duration in milliseconds": 5001.0083,
"children": [
{
"name": "Microsoft.Azure.Cosmos.Handlers.RetryHandler",
"duration in milliseconds": 5000.8221,
"children": [
{
"name": "Microsoft.Azure.Cosmos.Handlers.RouterHandler",
"duration in milliseconds": 5000.4928,
"children": [
{
"name": "Microsoft.Azure.Cosmos.Handlers.TransportHandler",
"duration in milliseconds": 5000.3789,
"children": [
{
"name": "Microsoft.Azure.Documents.ServerStoreModel Transport Request",
"duration in milliseconds": 5000.0713,
"data": {
"Client Side Request Stats": {
"Id": "AggregatedClientSideRequestStatistics",
"ContactedReplicas": [
{
"Count": 1,
"Uri": "rntbd://cdb-ms-test61-northcentralus1-be1.documents-test.windows-int.net:14008/apps/bad4fbdb-e7dd-45be-9480-7d9fd240a2a5/services/3385fbba-d551-4f50-a6a9-c92c49bd48f6/partitions/b04f9c0b-defd-46d9-a8ed-0f275fb96430/replicas/133859571086460706s/"
}
],
"RegionsContacted": [
],
"FailedReplicas": [
],
"ForceAddressRefresh": [
{
"No change to cache": [
"rntbd://cdb-ms-test61-northcentralus1-be1.documents-test.windows-int.net:14008/apps/bad4fbdb-e7dd-45be-9480-7d9fd240a2a5/services/3385fbba-d551-4f50-a6a9-c92c49bd48f6/partitions/b04f9c0b-defd-46d9-a8ed-0f275fb96430/replicas/133859571086460706s/"
]
},
{
"No change to cache": [
"rntbd://cdb-ms-test61-northcentralus1-be1.documents-test.windows-int.net:14008/apps/bad4fbdb-e7dd-45be-9480-7d9fd240a2a5/services/3385fbba-d551-4f50-a6a9-c92c49bd48f6/partitions/b04f9c0b-defd-46d9-a8ed-0f275fb96430/replicas/133859571086460706s/"
]
},
{
"No change to cache": [
"rntbd://cdb-ms-test61-northcentralus1-be1.documents-test.windows-int.net:14008/apps/bad4fbdb-e7dd-45be-9480-7d9fd240a2a5/services/3385fbba-d551-4f50-a6a9-c92c49bd48f6/partitions/b04f9c0b-defd-46d9-a8ed-0f275fb96430/replicas/133859571086460706s/"
]
}
],
"AddressResolutionStatistics": [
{
"StartTimeUTC": "2025-03-11T01:20:26.0641997Z",
"EndTimeUTC": "2025-03-11T01:20:26.1964295Z",
"TargetEndpoint": "https://dkunda-ppaf-session-northcentralus.documents-test.windows-int.net//addresses/?$resolveFor=dbs%2fLHw67A%3d%3d%2fcolls%2fLHw67K25IVs%3d%2fdocs&$filter=protocol eq rntbd&$partitionKeyRangeIds=0"
},
{
"StartTimeUTC": "2025-03-11T01:20:27.2010140Z",
"EndTimeUTC": "2025-03-11T01:20:27.3422914Z",
"TargetEndpoint": "https://dkunda-ppaf-session-northcentralus.documents-test.windows-int.net//addresses/?$resolveFor=dbs%2fLHw67A%3d%3d%2fcolls%2fLHw67K25IVs%3d%2fdocs&$filter=protocol eq rntbd&$partitionKeyRangeIds=0"
},
{
"StartTimeUTC": "2025-03-11T01:20:29.3649661Z",
"EndTimeUTC": "2025-03-11T01:20:29.7096044Z",
"TargetEndpoint": "https://dkunda-ppaf-session-northcentralus.documents-test.windows-int.net//addresses/?$resolveFor=dbs%2fLHw67A%3d%3d%2fcolls%2fLHw67K25IVs%3d%2fdocs&$filter=protocol eq rntbd&$partitionKeyRangeIds=0"
}
],
"StoreResponseStatistics": [
],
"HttpResponseStats": [
{
"StartTimeUTC": "2025-03-11T01:20:26.0642570Z",
"DurationInMs": 73.8448,
"RequestUri": "https://dkunda-ppaf-session-northcentralus.documents-test.windows-int.net//addresses/?$resolveFor=dbs%2fLHw67A%3d%3d%2fcolls%2fLHw67K25IVs%3d%2fdocs&$filter=protocol eq rntbd&$partitionKeyRangeIds=0",
"ResourceType": "Document",
"HttpMethod": "GET",
"ActivityId": "4fd31309-a2ac-48da-ab8a-3be945588581",
"StatusCode": "OK"
},
{
"StartTimeUTC": "2025-03-11T01:20:27.2010746Z",
"DurationInMs": 74.2997,
"RequestUri": "https://dkunda-ppaf-session-northcentralus.documents-test.windows-int.net//addresses/?$resolveFor=dbs%2fLHw67A%3d%3d%2fcolls%2fLHw67K25IVs%3d%2fdocs&$filter=protocol eq rntbd&$partitionKeyRangeIds=0",
"ResourceType": "Document",
"HttpMethod": "GET",
"ActivityId": "4fd31309-a2ac-48da-ab8a-3be945588581",
"StatusCode": "OK"
},
{
"StartTimeUTC": "2025-03-11T01:20:29.3650050Z",
"DurationInMs": 279.9272,
"RequestUri": "https://dkunda-ppaf-session-northcentralus.documents-test.windows-int.net//addresses/?$resolveFor=dbs%2fLHw67A%3d%3d%2fcolls%2fLHw67K25IVs%3d%2fdocs&$filter=protocol eq rntbd&$partitionKeyRangeIds=0",
"ResourceType": "Document",
"HttpMethod": "GET",
"ActivityId": "4fd31309-a2ac-48da-ab8a-3be945588581",
"StatusCode": "OK"
}
]
}
}
}
]
}
]
}
]
}
]
}
]
},
{
"name": "CosmosOperationCanceledException",
"duration in milliseconds": 0.0139,
"data": {
"Operation Cancelled Exception": "System.Threading.Tasks.TaskCanceledException: A task was canceled.\r\n at Microsoft.Azure.Documents.RequestRetryUtility.ProcessRequestAsync[TRequest,IRetriableResponse](Func`1 executeAsync, Func`1 prepareRequest, IRequestRetryPolicy`2 policy, CancellationToken cancellationToken, Func`1 inBackoffAlternateCallbackMethod, Nullable`1 minBackoffForInBackoffCallback)\r\n at Microsoft.Azure.Documents.RequestRetryUtility.ProcessRequestAsync[TRequest,IRetriableResponse](Func`1 executeAsync, Func`1 prepareRequest, IRequestRetryPolicy`2 policy, CancellationToken cancellationToken, Func`1 inBackoffAlternateCallbackMethod, Nullable`1 minBackoffForInBackoffCallback)\r\n at Microsoft.Azure.Documents.StoreClient.ProcessMessageAsync(DocumentServiceRequest request, CancellationToken cancellationToken, IRetryPolicy retryPolicy)\r\n at Microsoft.Azure.Cosmos.Handlers.TransportHandler.ProcessMessageAsync(RequestMessage request, CancellationToken cancellationToken) in D:\\stash\\azure-cosmos-dotnet-v3\\Microsoft.Azure.Cosmos\\src\\Handler\\TransportHandler.cs:line 122\r\n at Microsoft.Azure.Cosmos.Handlers.TransportHandler.SendAsync(RequestMessage request, CancellationToken cancellationToken) in D:\\stash\\azure-cosmos-dotnet-v3\\Microsoft.Azure.Cosmos\\src\\Handler\\TransportHandler.cs:line 33\r\n at Microsoft.Azure.Cosmos.Handlers.RouterHandler.SendAsync(RequestMessage request, CancellationToken cancellationToken) in D:\\stash\\azure-cosmos-dotnet-v3\\Microsoft.Azure.Cosmos\\src\\Handler\\RouterHandler.cs:line 42\r\n at Microsoft.Azure.Cosmos.RequestHandler.SendAsync(RequestMessage request, CancellationToken cancellationToken) in D:\\stash\\azure-cosmos-dotnet-v3\\Microsoft.Azure.Cosmos\\src\\Handler\\RequestHandler.cs:line 59\r\n at Microsoft.Azure.Cosmos.Handlers.AbstractRetryHandler.ExecuteHttpRequestAsync(Func`1 callbackMethod, Func`3 callShouldRetry, Func`3 callShouldRetryException, CancellationToken cancellationToken) in 
D:\\stash\\azure-cosmos-dotnet-v3\\Microsoft.Azure.Cosmos\\src\\Handler\\AbstractRetryHandler.cs:line 75\r\n at Microsoft.Azure.Cosmos.Handlers.AbstractRetryHandler.SendAsync(RequestMessage request, CancellationToken cancellationToken) in D:\\stash\\azure-cosmos-dotnet-v3\\Microsoft.Azure.Cosmos\\src\\Handler\\AbstractRetryHandler.cs:line 28\r\n at Microsoft.Azure.Cosmos.RequestHandler.SendAsync(RequestMessage request, CancellationToken cancellationToken) in D:\\stash\\azure-cosmos-dotnet-v3\\Microsoft.Azure.Cosmos\\src\\Handler\\RequestHandler.cs:line 59\r\n at Microsoft.Azure.Cosmos.Handlers.TelemetryHandler.SendAsync(RequestMessage request, CancellationToken cancellationToken) in D:\\stash\\azure-cosmos-dotnet-v3\\Microsoft.Azure.Cosmos\\src\\Handler\\TelemetryHandler.cs:line 28\r\n at Microsoft.Azure.Cosmos.RequestHandler.SendAsync(RequestMessage request, CancellationToken cancellationToken) in D:\\stash\\azure-cosmos-dotnet-v3\\Microsoft.Azure.Cosmos\\src\\Handler\\RequestHandler.cs:line 59\r\n at Microsoft.Azure.Cosmos.Handlers.DiagnosticsHandler.SendAsync(RequestMessage request, CancellationToken cancellationToken) in D:\\stash\\azure-cosmos-dotnet-v3\\Microsoft.Azure.Cosmos\\src\\Handler\\DiagnosticsHandler.cs:line 26\r\n at Microsoft.Azure.Cosmos.RequestHandler.SendAsync(RequestMessage request, CancellationToken cancellationToken) in D:\\stash\\azure-cosmos-dotnet-v3\\Microsoft.Azure.Cosmos\\src\\Handler\\RequestHandler.cs:line 59\r\n at Microsoft.Azure.Cosmos.Handlers.RequestInvokerHandler.BaseSendAsync(RequestMessage request, CancellationToken cancellationToken) in D:\\stash\\azure-cosmos-dotnet-v3\\Microsoft.Azure.Cosmos\\src\\Handler\\RequestInvokerHandler.cs:line 144\r\n at Microsoft.Azure.Cosmos.CrossRegionHedgingAvailabilityStrategy.RequestSenderAndResultCheckAsync(Func`3 sender, RequestMessage request, String hedgedRegion, CancellationToken cancellationToken, CancellationTokenSource cancellationTokenSource, ITrace trace) in 
D:\\stash\\azure-cosmos-dotnet-v3\\Microsoft.Azure.Cosmos\\src\\Routing\\AvailabilityStrategy\\CrossRegionHedgingAvailabilityStrategy.cs:line 298"
}
}
]
}
]
}