-
Notifications
You must be signed in to change notification settings - Fork 3k
Revert Dual Endpoint Tracking #40451
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
API change check API changes are not detected in this pull request. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Copilot reviewed 18 out of 18 changed files in this pull request and generated no comments.
Comments suppressed due to low confidence (3)
sdk/cosmos/azure-cosmos/tests/test_service_retry_policies_async.py:386
- Verify that the expected number of request endpoints (4) now reflects the intended in-region retry behavior after the dual endpoint removal and that no necessary retries are inadvertently omitted.
assert len(connection_retry_policy.request_endpoints) == 4
sdk/cosmos/azure-cosmos/tests/test_fault_injection_transport.py:349
- Confirm that enabling multiple write locations in this test fully exercises the new code path and does not impact tests expecting single write location behavior when not explicitly enabled.
use_multiple_write_locations=True,
sdk/cosmos/azure-cosmos/azure/cosmos/_location_cache.py:59
- Ensure that downstream components correctly handle regional routing contexts that now contain only a primary endpoint, without relying on any alternate endpoint logic.
def get_regional_routing_contexts_by_loc(new_locations):
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thanks @tvaron3
for endpoint in endpoints: | ||
if endpoint not in endpoints_attempted: | ||
if success_count >= 4: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we should still have a max health check count. Do you recall one internal customer using 100 regions. Let's still limit this to maybe 3 max.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes agree, but the endpoints now will be max 3 because there is no alternate endpoints. We grab first two write regions and the first two read regions. This should max get us three unique endpoints. I can add some unit tests for endpoints to health check to verify this behavior.
@@ -23,7 +23,7 @@ class TestServiceRetryPolicies(unittest.TestCase): | |||
REGION1 = "West US" | |||
REGION2 = "East US" | |||
REGION3 = "West US 2" | |||
REGIONAL_ENDPOINT = RegionalRoutingContext(host, host) | |||
REGIONAL_ENDPOINT = RegionalRoutingContext(host) | |||
|
|||
@classmethod |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Expand to test all operation types and with fault injection.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
just a small comment but looks good otherwise - thanks Tomas!
connection_policy = ConnectionPolicy() | ||
connection_policy.UseMultipleWriteLocations = use_multiple_write_locations |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
these two lines can be changed to just pass in the setting in the client initialization - multiple_write_locations=use_multiple_write_locations
Background
The SDK used to track two endpoints for writes. One was the global endpoint,
https://account.documents.azure.com
, and the other was the regional endpointhttps://account-eastus.documents.azure.com
. Gateway would return these randomly for the write region and the sdk would use that one to write and use the other endpoint as the backup. This was done for write availability in single write region accounts and for load balancing across compute federations.Changes
The dual endpoint logic is being abstracted away using a service called Azure Traffic Manager https://learn.microsoft.com/en-us/azure/traffic-manager/traffic-manager-overview. Gateway will now only send the regional endpoint to the sdk and the global endpoint should only be used for metadata calls in the sdk. This pr removes all the dual endpoint logic from the sdk and tests.
Other Changes
The dual endpoints introduced a regression for multiple write region accounts in the sdk where we would failover even if the client was not setup with multiple write regions enabled. This is fixed in this pr. Simplified health check logic as no alternate endpoints to check anymore.