Skip to content

Spec: hub-region-processing-only header for 404/1002 retries#4310

Closed
NaluTripician wants to merge 8 commits intoAzure:mainfrom
NaluTripician:spec/issue-4303-hub-region-header
Closed

Spec: hub-region-processing-only header for 404/1002 retries#4310
NaluTripician wants to merge 8 commits intoAzure:mainfrom
NaluTripician:spec/issue-4303-hub-region-header

Conversation

@NaluTripician
Copy link
Copy Markdown

Summary

Adds a design spec only (no implementation) proposing functional parity with .NET PR Azure/azure-cosmos-dotnet-v3#5447: emit x-ms-cosmos-hub-region-processing-only: true on the second-and-later 404 / sub-status 1002 retry for single-master Cosmos DB accounts. This mitigates session-token storms during region failback for session-consistency reads.

The spec is published at sdk/cosmos/azure_data_cosmos/docs/HUB_REGION_PROCESSING_HEADER_SPEC.md so it can be reviewed in repo before any code lands. Tracks #4303.

Why a spec PR first

The .NET fix is a small change with subtle semantics — a per-operation latch that only activates after a specific failure shape, plus a header that opts retries into hub-region-only processing. Mirroring it in Rust touches the retry policy, header constants, and CHANGELOG, and there are at least three judgement calls (latch threshold mapping, latch lifetime, scope vs. issue text). Reviewing the spec in isolation lets us settle those before code review.

What's in the spec

  • §1 Goals & Motivation — problem statement, goals, non-goals, primary target.
  • §2 Architectural OverviewBackOffRetryHandlerbefore_send_requestshould_retry flow with ASCII diagram.
  • §3 Behavior Specification — trigger conditions, outbound effect, invariants preserved, worked example (East US 2 / Central US failback).
  • §4 Code Changes — concrete Rust snippets for constants.rs (new cosmos_headers! entry), client_retry_policy.rs (latch field, before_send_request injection, should_retry_on_session_not_available set-point), and CHANGELOG.md.
  • §5 Side Effects & Risk — SE-001…SE-005 with severity and mitigation.
  • §6 Alternatives Considered — 4 alternatives with verdicts.
  • §7 Testing Strategy — 7-row matrix with suggested test names plus validation gates.
  • §8 Open Questions & Future Work — both pre-resolved (see below).

Pre-resolved open questions

  • OQ-1 — Skip-set rotation parity. Day 1 ships .NET parity only — header + latch. Skip-set rotation is explicitly out of scope and captured under Future Work for a follow-up if post-rollout telemetry indicates the header alone is insufficient.
  • OQ-2 — CHANGELOG version block. Verified ## 0.32.0 (Unreleased) already exists on main; spec instructs implementer to add a new ### Features Added subsection above the existing ### Breaking Changes, no new version block.

Convention compliance

Mirrors the in-repo spec convention established by FEED_OPERATIONS_SPEC.md (#4261): Markdown doc under <crate>/docs/, status header block, numbered TOC, numbered sections. Code snippets follow repo conventions verified against existing source — cosmos_headers! macro registration, lowercase "true" literal (matches ALLOW_TENTATIVE_WRITES precedent), test_ prefix on all tests (file-local convention in client_retry_policy.rs), CHANGELOG PR-link suffix.

Out of scope (explicitly)

  • Implementation. Phase 3 lands in a follow-up PR after spec review.
  • Skip-set rotation on 403/3 (NotWriteRegion). Future Work only.
  • Diagnostics surface for "hub-region-only retry." Future Work pending broader diagnostics design.

Tracks #4303.

Adds a design spec proposing parity with .NET PR
Azure/azure-cosmos-dotnet-v3#5447: emit
`x-ms-cosmos-hub-region-processing-only: true` on second-and-later
404/1002 retries for single-master accounts to mitigate session-token
storms during region failback.

Spec only — no implementation. Mirrors the in-repo convention
established by FEED_OPERATIONS_SPEC.md (Azure#4261): a Markdown design
document at `sdk/cosmos/<crate>/docs/<NAME>_SPEC.md` with status
header, numbered TOC, and numbered sections covering goals,
architecture, behavior, code changes, side effects, alternatives,
testing, and open questions.

Open questions are pre-resolved in §8: Day 1 ships .NET parity only
(no skip-set rotation), and the CHANGELOG entry will append under the
existing `## 0.32.0 (Unreleased)` block.

Tracks Azure#4303.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@NaluTripician NaluTripician requested a review from a team as a code owner April 29, 2026 19:01
Copilot AI review requested due to automatic review settings April 29, 2026 19:01
@github-actions github-actions Bot added the Cosmos The azure_cosmos crate label Apr 29, 2026
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds a design-only specification for introducing x-ms-cosmos-hub-region-processing-only: true on second-and-later 404/1002 retries (single-master) to mitigate session-token storms during region failback, mirroring the .NET behavior.

Changes:

  • Introduces a new Markdown design spec describing motivation, intended retry/latch behavior, and proposed code touchpoints.
  • Documents suggested implementation details (constants, retry policy latch + header injection), risks, alternatives, and test matrix.

Comment thread sdk/cosmos/azure_data_cosmos/docs/HUB_REGION_PROCESSING_HEADER_SPEC.md Outdated
Comment thread sdk/cosmos/azure_data_cosmos/docs/HUB_REGION_PROCESSING_HEADER_SPEC.md Outdated
Comment thread sdk/cosmos/azure_data_cosmos/docs/HUB_REGION_PROCESSING_HEADER_SPEC.md Outdated
@NaluTripician
Copy link
Copy Markdown
Author

AI Review Summary

Reviewed this spec PR against Azure/azure-cosmos-dotnet-v3#5447 and the actual Rust source on main (client_retry_policy.rs).

Headline: The central design — latch on the 2nd 1002, header rides the next retry — is structurally incompatible with Rust's current single-master retry budget. should_retry_on_session_not_available returns DoNotRetry the moment session_token_retry_count > 1 (line 294), so the latch flips on the same call that ends the operation; there is no subsequent before_send_request for the header to attach to. The spec's stated Goal #1 (.NET observable parity) is not achievable as written without also extending the read-retry budget — that change is currently absent from §4.2.

11 inline findings posted: 3 🔴 Blocking, 4 🟡 Recommendation, 3 🟢 Suggestion, 1 💬 Observation. Four of these reinforce existing unresolved comments by @Copilot and are tagged accordingly — they are intentional re-raises because the underlying design issues remain unaddressed.

Top items to resolve before code lands:

  1. Choose explicitly: extend read-retry budget for true .NET parity, or change trigger point and document the parity gap.
  2. Replace the per-request can_use_multiple_write_locations gate with an account-level accessor (per @Copilot).
  3. Reconcile the worked example and AC-3/AC-6/AC-7 tests with whichever option above is chosen.

⚠️ AI-generated review — may be incorrect. Agree? → resolve the conversation. Disagree? → reply with your reasoning.

Comment thread sdk/cosmos/azure_data_cosmos/docs/HUB_REGION_PROCESSING_HEADER_SPEC.md Outdated
Comment thread sdk/cosmos/azure_data_cosmos/docs/HUB_REGION_PROCESSING_HEADER_SPEC.md Outdated
Comment thread sdk/cosmos/azure_data_cosmos/docs/HUB_REGION_PROCESSING_HEADER_SPEC.md Outdated
Comment thread sdk/cosmos/azure_data_cosmos/docs/HUB_REGION_PROCESSING_HEADER_SPEC.md Outdated
Comment thread sdk/cosmos/azure_data_cosmos/docs/HUB_REGION_PROCESSING_HEADER_SPEC.md Outdated
Comment thread sdk/cosmos/azure_data_cosmos/docs/HUB_REGION_PROCESSING_HEADER_SPEC.md Outdated
Comment thread sdk/cosmos/azure_data_cosmos/docs/HUB_REGION_PROCESSING_HEADER_SPEC.md Outdated
Comment thread sdk/cosmos/azure_data_cosmos/docs/HUB_REGION_PROCESSING_HEADER_SPEC.md Outdated
Comment thread sdk/cosmos/azure_data_cosmos/docs/HUB_REGION_PROCESSING_HEADER_SPEC.md Outdated
Major spec revision in response to 15 review threads on PR Azure#4310.
All threads accepted; zero rejections — every comment was traced to
a specific code anchor and verified.

Key correctness fixes:

- Latch trigger moved from `session_token_retry_count >= 2` to
  `== 1`. The current single-master budget at line 294 returns
  `DoNotRetry` once the counter exceeds 1, so the previous trigger
  flipped the latch on the same call that ended the operation —
  the header would never have reached the wire. This is a
  deliberate parity gap with .NET; full retry-count parity now
  captured as ALT-5 / Future Work.

- Single-master gate switched from the per-request
  `can_use_multiple_write_locations` field (always false for
  reads) to the account-level
  `LocationCache::can_use_multiple_write_locations()`. The
  previous gate would have fired the latch on multi-master reads
  too.

- Header value changed from `"true"` to `"True"` to match
  .NET's `bool.TrueString` wire format. The local
  `ALLOW_TENTATIVE_WRITES` precedent is a different header with
  an independent backend parser; we cannot assume case-insensitive
  parsing without a contract reference.

- Sub-status label corrected: 1002 in the 404 context is
  `READ_SESSION_NOT_AVAILABLE`, not `PARTITION_KEY_RANGE_GONE`
  (the latter is 410/1002).

- §4.3 CHANGELOG instructions rewritten to add
  `### Features Added` under the existing
  `## 0.32.0 (Unreleased)` block, not create a new block. OQ-2
  and §4.3 no longer contradict.

Documentation/maintainability fixes:

- Problem Statement (§1.1) rewritten to accurately describe the
  current single-retry-via-write-locations behavior.
- Inline-comment requirement at the increment site to pin the
  trigger boundary against silent breakage.
- AC-3 added: boundary regression test at counter == 1.
- Plain-`bool` (vs .NET `volatile bool`) justification stated
  inline in §3.1.
- AC-3/AC-6/AC-7 (post-latch 503 / 410/1022 retries) intentionally
  dropped — those retries don't exist under the current budget;
  AC-6 covers connection-failure persistence which does still
  exist.
- SE-003 promoted from "Phase 3 implementer must verify" to a
  formal §7.4 Acceptance Gates block (AG-1..AG-4).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@NaluTripician
Copy link
Copy Markdown
Author

Review feedback addressed — spec revision pushed (4ae40ee76)

All 15 review threads accepted; zero rejections. Each item was traced to a specific code anchor on main and verified before fixing. Resolution map below — please mark threads resolved as you confirm.

Blockers (🔴) — fixed, please verify and resolve

Thread Path/Line Fix
Latch has no outbound effect (Findings #1 & #3 in AI summary) §3.1, §3.4, §4.2(c), §7.1 Trigger moved from session_token_retry_count >= 2 to == 1. Verified client_retry_policy.rs:294 returns DoNotRetry once > 1, so the prior trigger flipped on the call that ended the operation — header never reached the wire. New trigger latches on the same call that returns Retry for the one permitted retry; header rides that retry.
Wrong gate (can_use_multiple_write_locations per-request) §3.1 #3, §4.2(c) Replaced with account-level LocationCache::can_use_multiple_write_locations() (verified: location_cache.rs:381, write_endpoints().len() > 1). Spec now explicitly forbids the two per-request variants and explains why.
@Copilot reinforcement on the same gate §3.1 #3 Same fix — single change resolves both.

Recommendations (🟡) — fixed

Thread Fix
Inaccurate baseline in Problem Statement §1.1 rewritten to describe single-retry-via-write-locations behavior, citing the actual code (!can_use_multiple_write_locations branch returning DoNotRetry on > 1).
@Copilot reinforcement on same Same fix.
Wire-format mismatch ("true" vs "True") §3.2, §3.3, §4.2(b), AC-2 wording: header value now "True" (capitalized) matching .NET bool.TrueString. §3.3 explains why this deliberately diverges from the adjacent ALLOW_TENTATIVE_WRITES "true" precedent (different header, different parser, no shared contract).
Sub-status name wrong §4.2(d), SE-005: 404/1002 = READ_SESSION_NOT_AVAILABLE. The PARTITION_KEY_RANGE_GONE reference (which is 410/1002) is removed.
Trigger boundary fragile §4.2(c): mandatory inline comment at the session_token_retry_count += 1; site cross-referencing the spec. AC-3 added: boundary-pinning regression test that pins counter==1 → header, counter==2 → no header.

Suggestions (🟢) — fixed

Thread Fix
Latch placement ambiguity Moot under the new design — latch site is now a single if immediately after the increment, no branch ambiguity. §4.2(c) shows the exact code.
Threading model (plain bool vs volatile bool) One-sentence justification added in §3.1 and §4.2(a) doc comment: "plain bool is correct because ClientRetryPolicy is mutated through &mut self for the duration of an operation." Also surfaced as AG-4 in §7.4 acceptance gates.
@Copilot reinforcement on §4.3/OQ-2 contradiction §4.3 rewritten: insert ### Features Added above the existing ### Breaking Changes in the existing ## 0.32.0 (Unreleased) block. No new version block. Matches OQ-2.

Observation (💬) — fixed

Thread Fix
Promote SE-003 to acceptance gate New §7.4 Acceptance Gates block (AG-1..AG-4) makes per-operation construction (AG-1), cross-operation isolation test (AG-2), within-operation persistence (AG-3), and the &mut self-only invariant (AG-4) blocking for the implementation PR.

Architectural decision: parity vs retry-budget

The headline blocker (Finding #1 in the AI review) gave us two paths:

  • (A) Extend Rust's read-retry budget to mirror .NET's sessionTokenRetryCount > ReadEndpoints.Count and keep the latch on the second 1002.
  • (B) Latch on the first 1002 and accept that Rust's existing single-master read budget runs the header on the one permitted retry.

Picked (B). Rationale documented in the new §1.2 Goals (re-scoped to "header-emission parity within Rust's existing retry budget") and as a new ALT-5 in §6 with full reasoning. (A) is preserved as a Future Work item with a concrete implementation outline. Bundling a retry-budget extension with header injection couples two unrelated risks and changes the read contract for every read in the SDK, not just 404/1002 — the same scope-discipline argument that kept skip-set rotation out of OQ-1.

CI

The Build Analyze failure is the only red on this PR; all 5 build/test matrix jobs (windows_stable, ubuntu_stable, ubuntu_nightly, ubuntu_msrv, macos_stable) pass. Logs are on Azure DevOps and not fetchable from outside, but spec-only PRs with no Rust source change typically fail Build Analyze on workspace-wide compliance scans (no version bump in any crate, no Cargo.lock change). Per the original PR scope (spec only) I'm leaving this as known-noise; happy to investigate the actual log if you can paste it.

Diff stats

4ae40ee76: 1 file changed, 199 insertions(+), 64 deletions(-).

All four findings accepted; verified each against `main` source.

- New-1 (blocker): §4.2(c) gate accessor rewritten. Verified
  `global_endpoint_manager.rs:31` declares
  `location_cache: Mutex<LocationCache>`; an accessor that
  returns `&LocationCache` cannot outlive the lock guard, so
  the previous `location_cache().can_use_multiple_write_locations()`
  chain would not compile. Spec now mandates a forwarding method
  `GlobalEndpointManager::account_supports_multi_write()` that
  encapsulates `self.location_cache.lock().unwrap().can_use_multiple_write_locations()`,
  matching the `.lock().unwrap()` pattern already used at lines
  283 and 354.

- New-2 (blocker): §4.2(c) sample replaces `... == false` with
  `!...`. The previous form would trip clippy's
  `bool_comparison` lint, and §7.3 mandates
  `cargo clippy -- -D warnings`. Spec example now passes the
  same gate it requires of implementations.

- New-3 (recommendation): §4.2(c) placement instruction tightened
  from "before any return" to "after the discovery-enabled early
  return and the counter increment, before the budget/gating
  conditionals." Removes the contradiction with §3.1 condition Azure#2
  (`enable_endpoint_discovery` early-out at the top of the
  function).

- New-4 (suggestion): §3.2 and AC-6 row reworded —
  "non-counted under `MAX_RETRY_COUNT_ON_CONNECTION_FAILURE`"
  → "tracked under a separate budget,
  `MAX_RETRY_COUNT_ON_CONNECTION_FAILURE`, rather than the
  `session_token_retry_count` budget". Removes the misleading
  implication that connection-failure retries are unbounded.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@NaluTripician
Copy link
Copy Markdown
Author

Second-pass review addressed — 86dc5f167

All four findings accepted. Verified each against main source before applying.

# Severity Finding Resolution
New-1 🔴 Correctness location_cache: Mutex<LocationCache> so &LocationCache can't escape the lock guard §4.2(c) now mandates a forwarding method on GlobalEndpointManager: pub(crate) fn account_supports_multi_write(&self) -> bool { self.location_cache.lock().unwrap().can_use_multiple_write_locations() }. Pattern matches existing .lock().unwrap() callers at lines 283 and 354. The latch condition is now !self.global_endpoint_manager.account_supports_multi_write().
New-2 🔴 Lint ... == false would trip bool_comparison and §7.3 mandates -D warnings §4.2(c) sample uses !.... Spec example now passes the gate it requires of implementations.
New-3 🟡 Precision "before any return" contradicts the enable_endpoint_discovery early-return at the top of the function §4.2(c) tightened to "after the discovery-enabled early return and the counter increment, before the budget/gating conditionals."
New-4 🟢 Wording "non-counted under MAX_RETRY_COUNT_ON_CONNECTION_FAILURE" reads as if connection-failure retries are unbounded §3.2 and AC-6 row reworded to "tracked under a separate budget, MAX_RETRY_COUNT_ON_CONNECTION_FAILURE, rather than the session_token_retry_count budget."

Diff: 4ae40ee76..86dc5f167 (1 file, +40/-22). §3.4, §6 ALT-5, §7.1, §7.4 unchanged.

NaluTripician and others added 4 commits April 29, 2026 13:47
GitHub renders Mermaid natively; the box-drawing version had alignment drift in raw view (mismatched column widths from variable-width chars in the inner labels). Same nodes, same edges, same captions; just renders cleanly now.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- SE-003 severity changed from the undefined Foundational to the legend-defined Breaking - accepted via AG-1..AG-4. Removes a reviewer distraction; the AG cross-reference makes the acceptance path explicit.

- Added a Confidence line under Status: traced the ClientRetryPolicy call graph and the GlobalEndpointManager/LocationCache mutex layering against main, but explicitly call out that the backend's case-sensitivity for the new header value is assumed (matching .NET's bool.TrueString) and not verified against a contract.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Rework spec to acknowledge the azure_data_cosmos / azure_data_cosmos_driver
split that landed on main:

- Confidence statement now scopes the driver-resident forward target
- New `Crate (interim home)` and `Crate (forward target)` headers
- New §1.5 Two-crate landscape table mapping concerns to crates and
  documenting that azure_data_cosmos does not yet depend on the driver
- §2 grows a `Forward target` subsection with a second Mermaid diagram
  showing the driver pipeline shape and the future HubRegionHeaderPolicy
- §4 leads with a crate-boundary note explaining why all file paths are
  under the SDK
- New SE-006 (crate-boundary drift) under Side Effects & Risk
- New ALT-6 enumerating three concrete prerequisites the driver lacks
  today (status-code retry, account topology surface, per-operation
  state model) and a 1:1 cross-walk table for the eventual migration
- New AG-5 acceptance gate requiring a // TODO(driver-migration):
  anchor at the latch declaration and emission sites
- New `Driver migration` bullet under §8 Future Work, listed first

Wire output and Day-1 implementation are unchanged: the SDK is still
the only place where the trigger's prerequisites exist on main, so
landing in ClientRetryPolicy is correct today and the migration is a
refactor, not a behavior change.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Fixes from PR Deep Reviewer pass on commit 36fe8cc:

Blocking
  Driver-crate file paths now use the canonical
    sdk/cosmos/azure_data_cosmos_driver/src/ prefix
    (driver/cosmos_driver.rs, driver/transport/pipeline.rs,
    driver/transport/mod.rs). Path convention noted in confidence
    header so reviewers can grep with confidence
  ALT-6 prerequisite Azure#2 reworded from No account topology surface
    (overstated; Region exists) to No account-metadata fetch or
    topology tracking, with the underscore-prefix placeholder
    semantic called out
  ALT-6 cross-walk gains a Status column. Trigger and single-master
    gate rows are flagged Design-pending (gated on prerequisites Azure#1
    and Azure#2); emission, latch, and constant rows remain Mechanical.
    Removes the false impression that the migration is copy/paste
  Replaced the forward-target Mermaid in section 2 with an honest
    diagram showing the current driver pipeline plus the three
    prerequisite gaps. The previous diagram implied a 404/1002
    feedback hook that does not exist on main

Recommendations
  Section 1.5 dependency row split into its own logical row
    (SDK consumes driver / Not yet) so the - dashes stop reading
    as third-crate ambiguity
  SE-006 mitigation now explains why double-emission cannot occur
    (SDK to driver migration removes ClientRetryPolicy status-code
    retry as part of the same change set)
  Confidence header reconciled with section 8: header-emission
    relocation is straightforward; only the driver-side retry
    shape is design-pending

Suggestions
  Section 4 crate-boundary callout trimmed to one sentence
  AC-6 wording clarifies connection failure is the SDK term
  Mermaid arrow consistency restored (no more vs mix; new
    diagram uses straight arrows only)

Approach-level critique was a ratification: ship the SDK version,
migrate when the driver gains prerequisites. No structural change.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@NaluTripician
Copy link
Copy Markdown
Author

Driver-aligned rework (commits 36fe8ccca -> fb47493fc)

Per feedback that "transport and routing has moved into azure_data_cosmos_driver but this spec is mostly touching the SDK," the spec was reworked to acknowledge the two-crate landscape on main and then iterated through two passes of deep review.

Rework (commit 36fe8ccca)

  • New crate-boundary metadata (Crate (interim home) / Crate (forward target)) at top.
  • New section 1.5 Two-crate landscape with a responsibility table mapping each concern to SDK vs driver and noting the SDK does not yet depend on the driver crate on main.
  • New Forward target subsection in section 2 with a second Mermaid diagram for the driver pipeline.
  • New section 4 crate-boundary note prefacing all SDK file paths.
  • New SE-006 (crate-boundary drift) in Side Effects.
  • New ALT-6 enumerating three concrete driver prerequisites (no status-code retry, no account-metadata fetch / topology tracking, mismatched per-operation state model) with a 1:1 cross-walk for the eventual migration.
  • New AG-5 acceptance gate requiring // TODO(driver-migration): anchors at the latch declaration and emission sites.
  • New Driver migration bullet leading section 8 Future Work.

Wire output and Day-1 SDK implementation are unchanged: the SDK is still the only crate where the trigger's prerequisites exist on main.

Pass 1 deep-review findings (commit fb47493fc addresses)

# Severity Finding Resolution
1 Blocking Driver-crate citations used bare cosmos_driver.rs instead of canonical driver/cosmos_driver.rs All paths fixed; convention noted in confidence header
2 Blocking ALT-6 prerequisite #2 ("no account topology surface") was overstated since Region exists Reworded to "no account-metadata fetch or topology tracking"; underscore-prefix placeholder semantic called out
3 Blocking ALT-6 cross-walk hand-waved missing subsystem (driver retry hook) as if mechanical Added Status column; trigger and single-master-gate rows flagged Design-pending (gated on prerequisites #1/#2)
4 Blocking Forward-target Mermaid showed a 404/1002 feedback arrow that doesn't exist on main Replaced with honest diagram showing current driver pipeline + three prerequisite gaps as a separate Gap subgraph
5 Recommend section 1.5 SDK->Driver dependency row formatting was ambiguous Renamed to SDK consumes driver with explicit Not yet
6 Recommend SE-006 mitigation didn't explain why double-emission can't occur Added: SDK->driver migration removes ClientRetryPolicy's status-code retry as part of the same change set
7 Recommend Confidence header ("migration not yet designed") contradicted section 8 (which had a design) Split claim: header-emission relocation is straightforward; only the driver-side retry shape is design-pending
8 Suggest section 4 crate-boundary callout was redundant with section 1.5 Trimmed to one sentence
9 Suggest AC-6 mixed "connection failure" / "transport failure" terminology Clarified "connection failure" as the SDK's term for transport-layer failure
10 Suggest Arrow style inconsistency in Mermaid labels Resolved as part of #4 (ALT-6 diagram rewrite)

Pass 2 deep-review verdict

Reviewable. All four blockers resolved, all six lower-severity findings resolved, no new findings, no approach-level concerns. The reviewer's approach-level note: "The spec remains correctly scoped: a surgical SDK-side implementation with explicit forward-looking documentation of the driver-side migration, gated on prerequisites that are out of scope for this work item."

Ready for another team pass.

@NaluTripician NaluTripician changed the base branch from main to release/azure_data_cosmos-previews April 30, 2026 17:48
@NaluTripician NaluTripician changed the base branch from release/azure_data_cosmos-previews to main April 30, 2026 17:49
@NaluTripician
Copy link
Copy Markdown
Author

Closing in favor of #4320, which retargets this spec to release/azure_data_cosmos-previews. The Cosmos previews branch is where the feature actually needs to land and is 79 commits ahead of main. Deep-review findings against the previews-branch source will be posted as inline comments on #4320.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Cosmos The azure_cosmos crate

Projects

Status: Done

Development

Successfully merging this pull request may close these issues.

2 participants