
[Per Partition Automatic Failover] Remove Environment Variable to Set PPAF at the SDK Layer and Add Support for Internal Client Options #5287

Closed
Copilot wants to merge 5 commits into master from copilot/fix-5277


Contributor

Copilot AI commented Jul 15, 2025

This PR implements the requirements to modernize Per Partition Automatic Failover (PPAF) configuration by removing deprecated environment variable support and adding new internal client options.

Changes Made

1. Remove Dependency on Environment Variable ❌

  • Removed AZURE_COSMOS_PARTITION_LEVEL_FAILOVER_ENABLED environment variable support
  • Removed ConfigurationManager.PartitionLevelFailoverEnabled constant
  • Removed ConfigurationManager.IsPartitionLevelFailoverEnabled() method
  • Updated DocumentClient.cs and GlobalEndpointManager.cs to rely exclusively on account metadata
  • PPAF is now controlled entirely through the account properties response, as originally intended

2. Create New Environment Variable for Circuit Breaker ✅

  • Added AZURE_COSMOS_PPCB_TIMEOUT_COUNTER_RESET_WINDOW_IN_MINUTES environment variable
  • Added ConfigurationManager.GetCircuitBreakerTimeoutCounterResetWindowInMinutes() method
  • Updated GlobalPartitionEndpointManagerCore to use configurable timeout counter reset window:
    this.TimeoutCounterResetWindowInMinutes = TimeSpan.FromMinutes(
        ConfigurationManager.GetCircuitBreakerTimeoutCounterResetWindowInMinutes(5));
  • Default value: 5 minutes (previously hardcoded to 1 minute)
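As a rough illustration of the lookup described above, the sketch below reads the environment variable and falls back to a caller-supplied default. The class and method names are hypothetical, and the choice to fall back on missing, non-numeric, or non-positive values is an assumption; the actual ConfigurationManager implementation may validate differently.

```csharp
using System;

// Hypothetical stand-in for the ConfigurationManager lookup; not the SDK's code.
public static class CircuitBreakerConfigSketch
{
    // Variable name taken from the PR description.
    public const string ResetWindowVariable =
        "AZURE_COSMOS_PPCB_TIMEOUT_COUNTER_RESET_WINDOW_IN_MINUTES";

    // Returns the configured window in minutes, or defaultValue when the
    // variable is unset, not an integer, or not positive (assumed policy).
    public static int GetResetWindowInMinutes(int defaultValue)
    {
        string raw = Environment.GetEnvironmentVariable(ResetWindowVariable);
        return int.TryParse(raw, out int parsed) && parsed > 0
            ? parsed
            : defaultValue;
    }

    // Convenience wrapper mirroring the TimeSpan.FromMinutes usage in the PR.
    public static TimeSpan GetResetWindow(int defaultMinutes)
    {
        return TimeSpan.FromMinutes(GetResetWindowInMinutes(defaultMinutes));
    }
}
```

Passing the default as an argument keeps the fallback value visible at the call site instead of burying it inside the helper.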

3. Add New Internal Client Options ✅

  • Added CosmosClientOptions.DisablePartitionLevelFailover internal property
  • Added ConnectionPolicy.DisablePartitionLevelFailover property
  • Enhanced PPAF logic to respect the new client option:
    // Apply the DisablePartitionLevelFailover setting to override PPAF if explicitly disabled
    if (this.ConnectionPolicy.DisablePartitionLevelFailover)
    {
        isPPafEnabled = false;
    }
  • When set to true, this option disables PPAF irrespective of account settings
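The precedence between the client option and the account setting can be sketched as a small pure function. This is a simplified model under stated assumptions, not the SDK's implementation: the class and method names are hypothetical, and the nullable account flag stands in for AccountProperties.EnablePartitionLevelFailover.

```csharp
// Hypothetical model of the PPAF decision; names are illustrative only.
public static class PpafResolutionSketch
{
    // disablePartitionLevelFailover: the internal client option.
    // accountEnablePartitionLevelFailover: the value from account metadata,
    // or null when the account response does not carry the flag.
    public static bool IsPpafEnabled(
        bool disablePartitionLevelFailover,
        bool? accountEnablePartitionLevelFailover)
    {
        // An explicit client-side disable wins over any account setting.
        if (disablePartitionLevelFailover)
        {
            return false;
        }

        // Otherwise follow the account; a missing flag means disabled.
        return accountEnablePartitionLevelFailover ?? false;
    }
}
```

In this model the client option can only turn PPAF off; it cannot enable PPAF on an account where the flag is absent or false.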

Testing

  • Added comprehensive unit tests for new environment variable functionality
  • Added tests for the new internal client option behavior
  • Added tests to verify removal of deprecated environment variable (using reflection)
  • Updated existing tests to remove usage of deprecated environment variable
  • Verified core library builds successfully without compilation errors

Backward Compatibility

This is a breaking change for any code that relied on the AZURE_COSMOS_PARTITION_LEVEL_FAILOVER_ENABLED environment variable. However, this environment variable was marked as deprecated and the preferred approach has always been to use account-level configuration.

Migration Path

  • Remove any usage of AZURE_COSMOS_PARTITION_LEVEL_FAILOVER_ENABLED environment variable
  • Use account-level PPAF configuration through Azure Cosmos DB account settings
  • Use the new internal DisablePartitionLevelFailover client option if explicit override is needed
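For the first migration step, an application could check at startup whether the removed variable is still set in its environment. The helper below is a hypothetical sketch for application code, not part of the SDK; only the variable name comes from this PR.

```csharp
using System;

// Hypothetical application-side check; not an SDK API.
public static class DeprecatedVariableCheck
{
    // Variable name taken from the PR description.
    public const string DeprecatedVariable =
        "AZURE_COSMOS_PARTITION_LEVEL_FAILOVER_ENABLED";

    // True when the removed variable still has a non-empty value, which
    // suggests a deployment configuration that needs cleaning up.
    public static bool IsDeprecatedVariableSet()
    {
        string raw = Environment.GetEnvironmentVariable(DeprecatedVariable);
        return !string.IsNullOrEmpty(raw);
    }
}
```

A caller might log a warning when this returns true, since the setting would no longer have any effect after this change.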

Fixes #5277.


github-actions bot left a comment

Please follow the required format: "[Internal] Category: (Adds|Fixes|Refactors|Removes) Description"

Internal should be used for PRs that have no customer impact. This flag is used to help generate the changelog to know which PRs should be included. Examples:
Diagnostics: Adds GetElapsedClientLatency to CosmosDiagnostics
PartitionKey: Fixes null reference when using default(PartitionKey)
[v4] Client Encryption: Refactors code to external project
[Internal] Query: Adds code generator for CosmosNumbers for easy additions in the future.

…counter reset window configuration

Co-authored-by: kirankumarkolli <6880899+kirankumarkolli@users.noreply.github.com>
Copilot AI changed the title from "[WIP] [Per Partition Automatic Failover] Remove Environment Variable to Set PPAF at the SDK Layer and Add Support for Internal Client Options" to "[Per Partition Automatic Failover] Remove Environment Variable to Set PPAF at the SDK Layer and Add Support for Internal Client Options" on Jul 15, 2025
Copilot AI requested a review from kirankumarkolli July 15, 2025 21:42
}

[TestMethod]
public void PartitionLevelFailoverEnvironmentVariableRemoved()
Member

This test is not needed.

Contributor Author

Removed the unnecessary test method as requested. 98fb329

}

[TestMethod]
public void IsPartitionLevelFailoverEnabledMethodRemoved()
Member

This test is not needed

Contributor Author

Removed the unnecessary test method as requested. 98fb329

@@ -495,8 +493,7 @@ public async Task TestPPAFClientAndServerEnablementCombinationScenariosAsync(
}
finally
Member

Unnecessary try-finally

Contributor Author

Removed the unnecessary try-finally block as requested. 98fb329

@kirankumarkolli
Member

Please follow the required format: "[Internal] Category: (Adds|Fixes|Refactors|Removes) Description"

Internal should be used for PRs that have no customer impact. This flag is used to help generate the changelog to know which PRs should be included. Examples:
Diagnostics: Adds GetElapsedClientLatency to CosmosDiagnostics
PartitionKey: Fixes null reference when using default(PartitionKey)
[v4] Client Encryption: Refactors code to external project
[Internal] Query: Adds code generator for CosmosNumbers for easy additions in the future.

@copilot fix the title

{
isPPafEnabled = this.accountServiceConfiguration.AccountProperties.EnablePartitionLevelFailover.Value;
bool isPPafEnabled = false;
if (this.accountServiceConfiguration != null && this.accountServiceConfiguration.AccountProperties.EnablePartitionLevelFailover.HasValue)
Member

This can be interpreted as !this.ConnectionPolicy.DisablePartitionLevelFailoverOverride && this.accountServiceConfiguration != null && this.accountServiceConfiguration.AccountProperties.EnablePartitionLevelFailover.HasValue

We can get rid of the local variable in that case: isPPafEnabled

Contributor Author

Simplified the PPAF logic by removing the local variable isPPafEnabled and using direct assignment as suggested. 6a1cfb9

…y block

Co-authored-by: kirankumarkolli <6880899+kirankumarkolli@users.noreply.github.com>
Environment.SetEnvironmentVariable(envVarName, null);
}
}

Member

Unnecessary new lines

Contributor Author

Removed unnecessary new lines as requested. f8ce129

if (databaseAccount.EnablePartitionLevelFailover.HasValue)
{
isPPafEnabled = databaseAccount.EnablePartitionLevelFailover.Value;
bool isPPafEnabled = false;
Member

This can be simplified as:

if (!this.connectionPolicy.DisablePartitionLevelFailover && databaseAccount.EnablePartitionLevelFailover.HasValue)
{
    this.connectionPolicy.EnablePartitionLevelFailover = databaseAccount.EnablePartitionLevelFailover.Value;
}

Contributor Author

Simplified the PPAF logic as suggested by removing the local variable and using direct assignment. f8ce129

[Owner("dkunda")]
public void CosmosClientOptions_WhenPartitionLevelFailoverEnabledAndPreferredRegionsSet_ShouldInitializeSuccessfully()
{
try
Member

Remove try-finally

Member

Also do the same in similar places wherever it's applicable in this PR.

Contributor Author

Removed unnecessary try-finally blocks in CosmosClientOptionsUnitTests.cs for the two test methods that only had cleanup comments with no actual environment variable cleanup. f8ce129

…eview

Co-authored-by: kundadebdatta <87335885+kundadebdatta@users.noreply.github.com>
Contributor Author

Copilot AI commented Jul 15, 2025

Please follow the required format: "[Internal] Category: (Adds|Fixes|Refactors|Removes) Description" Internal should be used for PRs that have no customer impact. This flag is used to help generate ...

I cannot directly modify the PR title through the available tools, but I understand it should follow the format "[Internal] Category: (Adds|Fixes|Refactors|Removes) Description". The title should be updated by someone with repository admin permissions.

…nally blocks, simplify PPAF logic

Co-authored-by: kirankumarkolli <6880899+kirankumarkolli@users.noreply.github.com>
@kundadebdatta
Member

This has already been taken care of by PR-5284.
