[Internal] Release: Adds Cherry-Pick Metadata PR Fix to release branch #5125
Merged
microsoft-github-policy-service[bot] merged 2 commits into releases/3.48.1 on Apr 12, 2025
…retried with a client cold start with only query requests (#5108)

# Pull Request Template

## Description

### Context

There is currently a bug in the SDK: on a cold start, and in rare edge cases involving regions being brought online or taken offline, when only query requests are made, the SDK does not retry certain status code responses from metadata requests, which causes the entire request to fail. The correct behavior is for the SDK to retry these metadata requests across regions.

This pull request includes several updates to error handling and retry logic in the Cosmos DB SDK. The changes mainly extend support for additional server error types and improve the retry policies for the scenarios above.

### Improvements to retry logic:

* [`ClientRetryPolicy.cs`](diffhunk://#diff-2b056512ca285b1d95e025e31f60345059fa92d958becc38f90a6fb54ce1bbb4R331-R341): Enhanced retry logic to handle `InternalServerError`, `DatabaseAccountNotFound`, and `LeaseNotFound` status codes.
* [`MetadataRequestThrottleRetryPolicy.cs`](diffhunk://#diff-a5ed5985909c3dcb6e4ce186cdd662d590dac5297ea14e68560c7d1eca307be4L26-R28): Refactored retry policy to handle additional status codes and renamed methods and constants to reflect the broader scope of endpoint unavailability.

### FaultInjection enhancements to error handling testing:

* [`FaultInjectionRuleBuilder.cs`](diffhunk://#diff-d827164a4a6a0d8737e6598f8132c915ef48a1fc01daaa6422706f770dada5d5L152-R156): Added support for additional server error types such as `DatabaseAccountNotFound`, `ServiceUnavailable`, `InternalServerError`, and `LeaseNotFound` for metadata requests.
* [`FaultInjectionServerErrorType.cs`](diffhunk://#diff-0c89faa9a48c428a7a98662d995474e34295618ac60e677ad9762fd048f33601L75-R82): Updated the `FaultInjectionServerErrorType` enum to include `LeaseNotFound` and corrected the status code for `DatabaseAccountNotFound`.
* [`FaultInjectionServerErrorResultInternal.cs`](diffhunk://#diff-1ae8256c6d505a8f3b0a350978a0cc9a08f6234f7328f28f1db14302ee691d72L473-R473): Added handling for `LeaseNotFound` and updated the status code for `DatabaseAccountNotFound` in the `GetInjectedServerError` method.

### Testing updates:

* [`CosmosItemIntegrationTests.cs`](diffhunk://#diff-16d429adf686a32936696d2014afab3fc8faf91f10c880850fb8b30f8b96bb33R153-R214): Added a new test method `MetadataEndpointUnavailableCrossRegionalRetryTest` to validate the retry logic for various server error types.
* [`ClientRetryPolicyTests.cs`](diffhunk://#diff-d3fdfdc5d4f4d8af2c2cc463d928285680e7695861422ef2e3330d1a956807e1L167-R176): Extended existing tests to cover additional status codes and substatus codes.

## Type of change

Please delete options that are not relevant.

- [] Bug fix (non-breaking change which fixes an issue)

## Closing issues

To automatically close an issue: closes #4710

---------

Co-authored-by: Kiran Kumar Kolli <kirankk@microsoft.com>
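For reviewers who want a quick mental model of the behavior described above, here is a minimal, language-neutral sketch in Python. It is not the SDK's C# code. Every name in it (`ServerError`, `ENDPOINT_UNAVAILABLE`, `Response`, `make_faulty_send`, `read_metadata`) is hypothetical, numeric status and substatus codes are deliberately left out, and treating exactly these four error types as "endpoint unavailable" is an assumption drawn from the list in the description. The toy fault injector only mimics the idea of forcing a region to return a chosen error.

```python
from dataclasses import dataclass
from enum import Enum, auto
from typing import Callable, Mapping, Optional, Sequence


class ServerError(Enum):
    """Error kinds named in the PR description (hypothetical enum, no numeric codes)."""
    INTERNAL_SERVER_ERROR = auto()
    SERVICE_UNAVAILABLE = auto()
    DATABASE_ACCOUNT_NOT_FOUND = auto()
    LEASE_NOT_FOUND = auto()
    BAD_REQUEST = auto()  # example of an error that another region would not fix


# Assumption: these errors are treated as "this regional endpoint is unavailable".
ENDPOINT_UNAVAILABLE = frozenset({
    ServerError.INTERNAL_SERVER_ERROR,
    ServerError.SERVICE_UNAVAILABLE,
    ServerError.DATABASE_ACCOUNT_NOT_FOUND,
    ServerError.LEASE_NOT_FOUND,
})


@dataclass(frozen=True)
class Response:
    region: str
    error: Optional[ServerError] = None  # None means the request succeeded


def make_faulty_send(faults: Mapping[str, ServerError]) -> Callable[[str], Response]:
    """Toy fault injector: `faults` maps a region name to the error it returns."""
    def send(region: str) -> Response:
        return Response(region, faults.get(region))
    return send


def read_metadata(regions: Sequence[str], send: Callable[[str], Response]) -> Response:
    """Try regions in preference order; move on only for endpoint-unavailable errors."""
    if not regions:
        raise ValueError("at least one region is required")
    response = None
    for region in regions:
        response = send(region)
        if response.error not in ENDPOINT_UNAVAILABLE:
            return response  # success, or an error that a retry elsewhere cannot fix
    return response  # every region reported an endpoint-unavailable error


# Usage: the first region is forced to fail, so the read lands on the second one.
send = make_faulty_send({"West US": ServerError.LEASE_NOT_FOUND})
result = read_metadata(["West US", "East US"], send)
print(result.region, result.error)  # East US None
```

The sketch keeps the retry decision in one place (membership in `ENDPOINT_UNAVAILABLE`), which mirrors the description's point that the fix is about which responses are considered retriable across regions, not about how requests are sent.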
kirankumarkolli approved these changes on Apr 12, 2025
Commit ad0583e merged into releases/3.48.1
26 checks passed