Skip to content

[cosmos] Metadata retry: dropped caller future silently cancels cross-region failover #4253

@NaluTripician

Description

@NaluTripician

Summary

Same class of defect as Azure/azure-cosmos-dotnet-v3#5805 (fix in Azure/azure-cosmos-dotnet-v3#5806).

When a control-plane metadata read against an unhealthy preferred region is in the middle of internal HTTP timeout escalation, dropping the caller future (Rust's cancellation mechanism) immediately drops the in-flight attempt and exits the retry loop. The client retry policy's cross-region decision is therefore never reached — the customer sees a cancelled / timed-out future with no cross-region attempt made.

Files / lines

  • sdk/cosmos/azure_data_cosmos/src/pipeline/retry_handler.rs — retry loop (~lines 124-145)

The loop .awaits the wrapped future without shielding it from drop, and there is no retry-scoped cancellation token decoupled from the caller's future.

Repro (sketch)

  1. Fresh CosmosClient (cold metadata).
  2. Preferred regions [A, B]; A unreachable (blackhole / 503).
  3. Wrap the client call in tokio::time::timeout(Duration::from_secs(36), client.read_item(...)).
  4. Observe: timeout fires, caller future is dropped, no attempt against region B.

With region A healthy the path is not exercised and the call succeeds.

Expected behavior

Even when the caller future is dropped during an in-flight metadata attempt, the retry policy is consulted; if it indicates a cross-region retry, one bounded attempt against region B executes before the cancellation surfaces.

Proposed fix direction

  • Execute each retry attempt inside tokio::spawn so that dropping the caller's future does not immediately drop the in-flight attempt. Join the spawned task with a bounded timeout.
  • Bound the extra lifetime with an internal tokio::time::timeout matching the grace window (e.g., 10s, aligned with the .NET fix).
  • Alternatively, thread a tokio_util::sync::CancellationToken that is decoupled from the caller's future and is only tripped when the retry policy has already decided not to retry.

Either approach re-raises the original error (not the grace timeout) when the grace attempt fails or expires.

Cross-references

This defect class was identified in a cross-SDK investigation prompted by the .NET fix. Tracking issues:

/cc @NaluTripician

Metadata

Metadata

Assignees

Labels

No labels
No labels

Type

No type

Projects

Status

Untriaged

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions