-
Notifications
You must be signed in to change notification settings - Fork 533
[Internal] Diagnostics: Adds spec for CosmosDiagnostics compaction summary mode #5644
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Changes from all commits
f5f433a
982cd98
ebbc233
a19ffa6
dca524b
619b328
8dce13c
3556eef
56c449a
5f7c315
ccd17b5
f03c76c
4e4a030
43f6d9e
c2c356f
b24564a
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,99 @@ | ||
| # OpenSpec — Spec-Driven Development for the Azure Cosmos DB .NET SDK | ||
|
|
||
| This directory contains [OpenSpec](https://github.com/openspec-dev/openspec) artifacts for the Azure Cosmos DB .NET v3 SDK. OpenSpec provides a structured, AI-assisted workflow for proposing, specifying, designing, and implementing changes. | ||
|
|
||
| ## Why OpenSpec? | ||
|
|
||
| The Cosmos DB .NET SDK is a large, complex codebase (~1,400+ source files) with many interdependent subsystems — retry policies, handler pipelines, cross-region routing, change feed processing, query execution, and more. OpenSpec helps by: | ||
|
|
||
| 1. **Capturing behavioral contracts** — Specs define *what* a feature should do (invariants, edge cases, error handling), not *how* it's implemented. This makes them durable even as implementation evolves. | ||
| 2. **Guiding AI-assisted development** — AI agents use specs as context when proposing and implementing changes, leading to more accurate code generation. | ||
| 3. **Reducing tribal knowledge** — Complex features like PPAF, cross-region hedging, and the handler pipeline have subtle invariants that are easy to break. Specs make these invariants explicit and reviewable. | ||
|
|
||
| ## Directory Structure | ||
|
|
||
| ``` | ||
| openspec/ | ||
| ├── config.yaml # Project context and artifact rules | ||
| ├── README.md # This file | ||
| ├── specs/ # Main spec catalog (living documentation) | ||
| │ ├── README.md # Spec index organized by area | ||
| │ ├── retry-and-failover/ | ||
| │ │ └── spec.md | ||
| │ └── ... | ||
| ├── changes/ # Active changes (in-progress work) | ||
| │ └── archive/ # Completed changes | ||
| ``` | ||
|
|
||
| | Concept | Location | Purpose | | ||
| |---------|----------|---------| | ||
| | **Specs** | `openspec/specs/<feature>/spec.md` | Living behavioral contracts for major feature areas. | | ||
| | **Changes** | `openspec/changes/<name>/` | In-progress work with proposal, design, and task artifacts. | | ||
| | **Archive** | `openspec/changes/archive/` | Completed changes with full context preserved. | | ||
| | **Config** | `openspec/config.yaml` | Project context and per-artifact rules that guide AI. | | ||
|
|
||
| ## Workflow | ||
|
|
||
| ``` | ||
| Propose ──▶ Specs ──▶ Design ──▶ Tasks ──▶ Apply ──▶ Archive | ||
| ``` | ||
|
|
||
| | Command | Purpose | | ||
| |---------|---------| | ||
| | `/opsx:propose <name>` | Create a new change with proposal, design, and task artifacts | | ||
| | `/opsx:apply [name]` | Implement tasks from a change | | ||
| | `/opsx:explore [topic]` | Investigate ideas or problems without making code changes | | ||
| | `/opsx:archive [name]` | Archive a completed change | | ||
|
|
||
| ## Writing Good Specs | ||
|
|
||
| Specs capture **behavioral contracts** using [EARS notation](https://en.wikipedia.org/wiki/Easy_Approach_to_Requirements_Syntax) (WHEN/THEN/SHALL). They should answer: "What do I need to know to safely modify this feature?" | ||
|
|
||
| ### What a spec should include | ||
|
|
||
| 1. **Purpose** — One-paragraph summary of what the feature does | ||
| 2. **Public API surface** — Key types, methods, and their contracts (C# code blocks) | ||
| 3. **Requirements** — Behavioral requirements using EARS notation (`WHEN <condition>, THEN the SDK SHALL <behavior>`) | ||
| 4. **Reference tables** — Status code tables, configuration defaults, parameter matrices for dense reference data | ||
| 5. **Edge cases** — Non-obvious behaviors, race conditions, failure modes | ||
| 6. **Interactions** — How this feature relates to other SDK components (cross-spec links) | ||
| 7. **References** — Links to source files and existing design docs | ||
|
|
||
| ### What a spec should NOT include | ||
|
|
||
| - Implementation details (specific variable names, internal algorithms) | ||
| - Performance benchmarks (these change; use test projects instead) | ||
| - Step-by-step code walkthroughs (that's what `docs/SdkDesign.md` is for) | ||
|
|
||
| ## When to Create or Update Specs | ||
|
|
||
| **Create a new spec when:** | ||
| - Adding a new major feature to the SDK | ||
| - An area has complex invariants that are easy to break | ||
| - The same behavioral rules are explained in multiple PR reviews | ||
|
|
||
| **Update an existing spec when:** | ||
| - Your PR changes behavior covered by a spec | ||
| - A bug fix reveals an invariant that wasn't captured | ||
| - A design doc in `docs/` gets updated | ||
|
|
||
| **Don't need a spec for:** | ||
| - Pure refactoring with no behavioral change | ||
| - Test-only changes, documentation updates, dependency bumps | ||
|
|
||
| ## Best Practices | ||
|
|
||
| | ✅ Do | ❌ Don't | | ||
| |-------|---------| | ||
| | Be specific about invariants (status codes, timeouts) | Copy implementation details into specs | | ||
| | Use EARS notation for requirements | Create a spec per class (group by feature area) | | ||
| | Include cross-spec "Interactions" sections | Let specs go stale | | ||
| | Reference source files by path | Duplicate content from `docs/` | | ||
| | Update specs as part of behavioral change PRs | Skip `/opsx:explore` for complex changes | | ||
| | Review spec diffs in PRs like code | Archive before the PR is merged | | ||
|
|
||
| ## Related Documentation | ||
|
|
||
| - [Spec Index](specs/README.md) — All specs organized by area | ||
| - [SdkDesignGuidelines.md](../SdkDesignGuidelines.md) — Public API contract rules | ||
| - [docs/SdkDesign.md](../docs/SdkDesign.md) — SDK architecture overview |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,105 @@ | ||
| # Diagnostics Compaction — Design | ||
|
|
||
| ## Summary Compaction Algorithm | ||
|
|
||
| ### Data Collection | ||
|
|
||
| Walk the `ITrace` tree (same traversal as `SummaryDiagnostics.CollectSummaryFromTraceTree()`) to collect all `StoreResponseStatistics` and `HttpResponseStatistics` entries from every `ClientSideRequestStatisticsTraceDatum` in the trace hierarchy. | ||
|
|
||
| ### Region Grouping | ||
|
|
||
| Group collected entries by `Region` (string). Entries with a null/empty region are grouped under `"Unknown"`. | ||
|
|
||
| ### Per-Region Summary | ||
|
|
||
| For each region group (ordered chronologically by request start time): | ||
|
|
||
| 1. **First**: Full details of the chronologically first request | ||
| 2. **Last**: Full details of the chronologically last request (omitted if only 1 request) | ||
| 3. **Middle entries** (all except first and last): Group by `(StatusCode, SubStatusCode)`: | ||
| - **Count**: Number of requests in this group | ||
| - **TotalRequestCharge**: Sum of RU charges | ||
| - **MinDurationMs / MaxDurationMs / P50DurationMs / AvgDurationMs**: Latency statistics | ||
|
|
||
| ### Size Enforcement | ||
|
|
||
| 1. Serialize the summary JSON | ||
| 2. If `serializedBytes <= MaxDiagnosticsSummarySizeBytes` → return as-is | ||
| 3. If `serializedBytes > MaxDiagnosticsSummarySizeBytes` → return truncated output | ||
|
|
||
| ### Handling Both Direct and Gateway Requests | ||
|
|
||
| Both `StoreResponseStatistics` (direct mode) and `HttpResponseStatistics` (gateway mode) are collected and treated uniformly in the summary. The aggregated groups include entries from both transport paths. An optional `"TransportType"` field (`"Direct"` / `"Gateway"`) can be included in aggregated groups if needed to distinguish. | ||
|
|
||
| ## Request Flow | ||
|
|
||
| ```mermaid | ||
| flowchart TD | ||
| A["ToString(DiagnosticsVerbosity)"] --> B{Verbosity?} | ||
| B -->|Detailed| C["Existing TraceJsonWriter path"] | ||
| B -->|Summary| D["DiagnosticsSummaryWriter"] | ||
| D --> E["Walk ITrace tree"] | ||
| E --> F["Collect StoreResponseStatistics\n+ HttpResponseStatistics"] | ||
| F --> G["Group by Region"] | ||
| G --> H["Per region:\nFirst + Last + Aggregated Middle"] | ||
| H --> I["Serialize to JSON"] | ||
| I --> J{Size <= Max?} | ||
| J -->|Yes| K["Return summary JSON"] | ||
| J -->|No| L["Return truncated JSON"] | ||
| C --> M["Return full trace JSON"] | ||
| ``` | ||
|
|
||
| ## Files to Create | ||
|
|
||
| | File | Description | | ||
| |------|-------------| | ||
| | `Microsoft.Azure.Cosmos/src/Diagnostics/DiagnosticsVerbosity.cs` | `DiagnosticsVerbosity` enum | | ||
| | `Microsoft.Azure.Cosmos/src/Diagnostics/DiagnosticsSummaryWriter.cs` | Summary computation and JSON serialization logic | | ||
|
|
||
| ## Files to Modify | ||
|
|
||
| | File | Change | | ||
| |------|--------| | ||
| | `CosmosClientOptions.cs` | Add `DiagnosticsVerbosity` and `MaxDiagnosticsSummarySizeBytes` properties with validation | | ||
| | `CosmosDiagnostics.cs` | Add `ToString(DiagnosticsVerbosity)` abstract overload | | ||
| | `CosmosTraceDiagnostics.cs` | Implement `ToString(DiagnosticsVerbosity)` overload; delegate to `DiagnosticsSummaryWriter` when verbosity is `Summary` | | ||
| | `TraceWriter.TraceJsonWriter.cs` | Add summary serialization path that delegates to `DiagnosticsSummaryWriter` when verbosity is `Summary` | | ||
| | `SummaryDiagnostics.cs` | Extend `CollectSummaryFromTraceTree()` to support region-grouped collection with ordering | | ||
| | `ClientSideRequestStatisticsTraceDatum.cs` | Ensure `StoreResponseStatistics` and `HttpResponseStatistics` lists are accessible for summary computation | | ||
|
|
||
| ## Contract/Baseline Updates | ||
|
|
||
| | File | Change | | ||
| |------|--------| | ||
| | `ContractEnforcementTests.cs` baseline | Update public API contract for new enum and properties | | ||
|
|
||
| ## Alternatives Considered | ||
|
|
||
| ### Alternative 1: Emit summary alongside truncated trace tree | ||
| Instead of replacing the full trace, emit the summary _alongside_ the first + last children of the trace tree. | ||
|
|
||
| **Pros:** Preserves some trace structure for tooling that parses it. | ||
| **Cons:** Larger output size; complex to implement; defeats the purpose of compaction. | ||
| **Decision:** Rejected — summary replaces the full trace. The `First` and `Last` entries in each region summary provide the detailed bookends. | ||
|
|
||
| ### Alternative 2: Per-request verbosity via RequestOptions | ||
| Add a `DiagnosticsVerbosity` property to `RequestOptions` for per-request control. | ||
|
|
||
| **Pros:** More granular control. | ||
| **Cons:** Verbosity is a serialization concern, not a request concern. The `ToString(DiagnosticsVerbosity)` overload provides the same flexibility without complicating `RequestOptions`. | ||
| **Decision:** Deferred. Can be added later if needed. | ||
|
|
||
| ### Alternative 3: Transport type distinction in aggregated groups | ||
| Include a `TransportType` field (`"Direct"` / `"Gateway"`) in each aggregated group. | ||
|
|
||
| **Pros:** Helps distinguish transport-specific issues. | ||
| **Cons:** Increases output size; `StatusCode/SubStatusCode` is usually sufficient. | ||
| **Decision:** Deferred. Can add later if customer feedback warrants it. | ||
|
|
||
| ## Key References | ||
|
|
||
| - `Microsoft.Azure.Cosmos/src/Diagnostics/CosmosTraceDiagnostics.cs` — concrete diagnostics implementation | ||
| - `Microsoft.Azure.Cosmos/src/Tracing/TraceWriter.TraceJsonWriter.cs` — current trace serialization | ||
| - `Microsoft.Azure.Cosmos/src/Diagnostics/SummaryDiagnostics.cs` — existing summary aggregation (foundation) | ||
| - `Microsoft.Azure.Cosmos/src/Tracing/TraceData/ClientSideRequestStatisticsTraceDatum.cs` — stats data | ||
| - `docs/SdkDesign.md` — SDK architecture overview | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,71 @@ | ||
| # Diagnostics Compaction — Proposal | ||
|
|
||
| ## Problem | ||
|
|
||
| `CosmosDiagnostics.ToString()` produces a JSON trace that grows **unboundedly** with retries. Each retry attempt creates a new child `ITrace` node containing a full `ClientSideRequestStatisticsTraceDatum` with complete `StoreResponseStatistics` and `HttpResponseStatistics` entries. In pathological scenarios (sustained 429 throttling, transient failures, cross-region failovers), a single operation's diagnostics can grow to hundreds of KB. | ||
|
|
||
| **Impact:** | ||
| - **Log truncation** — monitoring systems (Application Insights, Azure Monitor, etc.) silently drop oversized log entries | ||
| - **Memory pressure** — large diagnostic strings increase GC overhead, especially at high throughput | ||
| - **Readability** — operators cannot quickly extract signal from noise when hundreds of identical retry entries are listed | ||
|
|
||
| **Example scenario:** A point read that encounters 50 retries due to 429 throttling in West US 2, then fails over to East US 2 with 10 more retries, produces ~60 full `StoreResponseStatistics` entries in the trace tree. With summary mode, this compacts to: first request + last request + 1 aggregated group per region. | ||
|
|
||
| ## Proposed Approach | ||
|
|
||
| Introduce a **`DiagnosticsVerbosity`** concept (modeled after [Azure/azure-sdk-for-rust#3592](https://github.com/Azure/azure-sdk-for-rust/pull/3592)) that controls how `CosmosDiagnostics.ToString()` serializes trace data: | ||
|
|
||
| | Mode | Behavior | Use Case | | ||
| |------|----------|----------| | ||
| | **Detailed** (default) | Current behavior — full trace tree output | Debugging, development | | ||
| | **Summary** | Region-grouped compaction with first/last + aggregated middle | Production logging, size-constrained environments | | ||
|
|
||
| **Key design principle:** The in-memory representation (`ITrace` tree, `ClientSideRequestStatisticsTraceDatum`) stays **unchanged**. Compaction only happens at **serialization time** in the `TraceJsonWriter` path. This preserves full programmatic access to diagnostics data while reducing serialized output size. | ||
|
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. 🟡 Recommendation · Completeness: Scope Gap Cross-partition query diagnostics growth not addressed @kirankumarkolli noted 'Cross partition queries are other bigger source of issues.' The proposal focuses on retry-heavy scenarios (429 throttling, failover), but cross-partition queries can produce large diagnostics due to fan-out across many physical partitions — regardless of retries. A query fanning out to 50 partitions produces 50 separate Suggestion: Add a 'Scope Limitations' note or expand 'Non-Goals' to acknowledge that cross-partition query fan-out diagnostics are not addressed in this phase, and note whether a future phase could address it. |
||
|
|
||
| ## SDK Area | ||
|
|
||
| - **Primary:** Diagnostics | ||
| - **Secondary:** Client-config (new options properties) | ||
|
|
||
| ## Preview vs GA | ||
|
|
||
| The `DiagnosticsVerbosity` enum and related options should ship as **GA** (non-preview) since it's an additive, backward-compatible feature with no impact when not opted into. | ||
|
|
||
| ## Backward Compatibility | ||
|
|
||
| - **Default is `Detailed`** — no behavioral change for existing users | ||
| - **No breaking changes** — `ToString()` output format only changes when `Summary` is explicitly opted into | ||
| - **Programmatic API unchanged** — `GetContactedRegions()`, `GetFailedRequestCount()`, etc. continue to work from the full in-memory trace regardless of verbosity | ||
|
|
||
| ## Rollout Strategy | ||
|
|
||
| 1. Ship with `Detailed` as default in initial release | ||
| 2. Document `Summary` mode in SDK documentation and changelog | ||
| 3. Consider making `Summary` the default in a future major version after customer feedback | ||
|
|
||
| ## Non-Goals | ||
|
|
||
| - Changing the in-memory `ITrace` tree structure | ||
| - Modifying the `Detailed` mode output format | ||
| - Adding new programmatic APIs beyond `ToString(DiagnosticsVerbosity)` overload | ||
| - Per-request verbosity override via `RequestOptions` (can be added later) | ||
|
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. 🟢 Suggestion · Completeness: Replica Information Consider preserving replica/endpoint information in summary mode @NaluTripician noted in PR comments: 'Also important, how can we preserve important information such as which replicas are contacted/failing?' This concern isn't addressed in the current spec. Summary mode compacts middle requests into aggregated stats, but replica-level detail (which specific replicas were contacted, which failed) is potentially lost. This information is critical for IcM debugging — knowing that replica X in region Y is the one failing is often the key diagnostic signal. Suggestion: Add a requirement or open question about preserving replica/endpoint information in summary mode output. At minimum, the |
||
|
|
||
| ## Resolved Questions | ||
|
|
||
| 1. **Should `AggregatedGroups` include an `AvgDurationMs` field?** The Rust SDK only includes min/max/P50. Adding avg is cheap to compute but adds to the output size. _Decision: Include avg. It's a single field and provides useful signal._ | ||
|
|
||
| 2. **Should the summary include the `children` trace tree at all?** Currently proposed as replacing the entire trace output. An alternative is to emit the summary _alongside_ a truncated trace tree (e.g., first + last children only). _Decision: Summary replaces the full trace. The `First` and `Last` entries in each region summary provide the detailed bookends._ | ||
|
|
||
| 3. **Gateway vs Direct distinction in aggregated groups.** Should each `AggregatedGroup` indicate whether it's from Direct or Gateway transport? _Decision: Defer. The `StatusCode/SubStatusCode` combination is usually sufficient. Can add a `TransportType` field later if needed._ | ||
|
|
||
| 4. **Caching.** The Rust SDK caches serialized JSON per verbosity level via `OnceLock`. Should the .NET SDK cache the summary JSON? _Decision: Yes, use `Lazy<string>` or similar. `ToString()` may be called multiple times (logging, telemetry, etc.)._ | ||
|
|
||
| 5. **Thread safety.** `CosmosDiagnostics.Verbosity` as a settable property on a potentially shared object needs consideration. _Decision: Use the `ToString(DiagnosticsVerbosity)` overload which avoids mutating state entirely. The property is set once from `CosmosClientOptions` during response creation and read during serialization._ | ||
|
|
||
| ## References | ||
|
|
||
| - **Rust SDK PR:** [Azure/azure-sdk-for-rust#3592](https://github.com/Azure/azure-sdk-for-rust/pull/3592) — `DiagnosticsContext` with `Summary` and `Detailed` modes | ||
| - **Current .NET diagnostics:** `Microsoft.Azure.Cosmos/src/Diagnostics/` and `Microsoft.Azure.Cosmos/src/Tracing/` | ||
| - **Existing summary:** `SummaryDiagnostics.cs` — aggregates `(StatusCode, SubStatusCode)` counts (foundation to build on) | ||
| - **Trace tree:** `ITrace` → `Trace` with recursive children and `ClientSideRequestStatisticsTraceDatum` data | ||
| - **Related spec:** `openspec/specs/diagnostics-and-observability/spec.md` | ||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🟡 Recommendation · Completeness: Missing Alternative
Hybrid encoding alternative from review discussion not captured
@kirankumarkolli raised 'JSON is verbose, thoughts on encoding which might help?' and @NaluTripician replied with thoughts from offline discussion: 'Consider a hybrid format with JSON text readable summary + encoded information that contains more details for debugging.'
This is a substantive alternative approach that emerged from review, but it's not listed in the Alternatives Considered section. Future readers won't know this option was discussed and evaluated.
Suggestion: Add 'Alternative 4: Hybrid format (readable summary + encoded details)' with pros/cons and decision rationale, even if the decision is to defer. This preserves the design context from the review discussion.