@keivenchang (Contributor) commented Jan 27, 2026

Overview:

Update Dynamo docs after the kvstats Prometheus names removal (DYN-1967).

Details:

  • Update docs

Where should the reviewer start?

docs/observability/metrics.md
docs/kubernetes/autoscaling.md

Related Issues: (use Closes / Fixes / Resolves / Relates to)

Relates to DYN-1967

/coderabbit profile chill

Summary by CodeRabbit

  • Documentation

    • Updated autoscaling guidance with the new SGLang KV cache hit rate metric for Decode service recommendations.
    • Renamed the KV Router metrics section and introduced the new kv_cache_events_applied counter metric with event tracking labels.
  • Chores

    • Updated code generation documentation and examples to reflect new metric naming conventions.

Signed-off-by: Keiven Chang <keivenchang@users.noreply.github.com>
@keivenchang requested a review from a team as a code owner January 27, 2026 23:10
@github-actions (bot) added the docs and documentation labels Jan 27, 2026
@keivenchang self-assigned this Jan 27, 2026
@coderabbitai (bot, Contributor) commented Jan 27, 2026

Walkthrough

The pull request updates KV cache and router metrics throughout documentation and code examples. The changes replace kvstats_gpu_cache_usage_percent with sglang:cache_hit_rate in autoscaling guidance, rename the KV Router section from "kvstats" to "kvrouter", and introduce the new dynamo_component_kv_cache_events_applied metric with associated labels in observability documentation.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Metrics Documentation: docs/kubernetes/autoscaling.md, docs/observability/metrics.md, fern/pages/kubernetes/autoscaling.md, fern/pages/observability/metrics.md | Replaced KV cache usage metric from kvstats_gpu_cache_usage_percent to sglang:cache_hit_rate in autoscaling context. Renamed KV Router section from "kvstats" to "kvrouter" and replaced old GPU cache metrics with new dynamo_component_kv_cache_events_applied counter metric featuring event_type and status labels. |
| Python Codegen Documentation: lib/bindings/python/codegen/README.md, lib/bindings/python/codegen/src/gen_python_prometheus_names.rs | Updated code examples and generated output text to reflect metric naming changes from kvstats to kvrouter (e.g., kvstats_active_blocks → kv_cache_events_applied). Removed reference to macro-expansion documentation. |
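
A rename like the one summarized above tends to leave stale references behind in docs. The following is a minimal sketch, not part of this PR, of how leftover `kvstats_*` names could be flagged in a block of text. The function name `find_stale_metric_names` and the sample string are hypothetical, and the regex assumes every retired metric starts with the `kvstats_` prefix.

```python
import re

# Assumption: every retired metric name starts with the "kvstats_" prefix.
STALE_PREFIX = re.compile(r"\bkvstats_\w+")


def find_stale_metric_names(text: str) -> list[str]:
    """Return kvstats_* identifiers in first-seen order, without duplicates."""
    seen: dict[str, None] = {}
    for match in STALE_PREFIX.finditer(text):
        seen.setdefault(match.group(0), None)
    return list(seen)


# Hypothetical documentation snippet used only for illustration.
sample = "Scale on kvstats_gpu_cache_usage_percent; see also kvstats_active_blocks."
print(find_stale_metric_names(sample))
```

Run against each Markdown file under the docs directories, a check like this would list any old names that the update missed.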

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~12 minutes

Poem

🐰 Hops through metrics with delight,
Old kvstats fading from sight,
Cache hit rates and routers anew,
Dynamo components shining through!
Documentation refreshed and bright!

🚥 Pre-merge checks | ✅ 2 | ❌ 1
❌ Failed checks (1 warning)
| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00%, which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |
✅ Passed checks (2 passed)
| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The PR title clearly and concisely summarizes the main change: updating metrics documentation after kvstats removal with the issue reference DYN-1967. |
| Description check | ✅ Passed | The description follows the required template with all key sections completed: Overview explains the kvstats removal context, Details outlines documentation updates, and specific files are called out for review with related issue reference. |

@coderabbitai (bot, Contributor) left a comment

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@fern/pages/observability/metrics.md`:
- Around line 125-131: The metric name documented as
`dynamo_component_kv_cache_events_applied` is missing the required `_total`
Prometheus suffix; update the documentation to
`dynamo_component_kv_cache_events_applied_total` wherever it appears
(referencing the KV Router metric entry and the `kv_cache_events_applied` metric
mention) to match team conventions (see `prometheus_names.rs` and PR 3035) while
leaving the existing label set (`event_type`, `status`) unchanged.
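
The suffix rule in the comment above can be illustrated with a small sketch. The helper name `ensure_total_suffix` is hypothetical and does not come from `prometheus_names.rs`. It only shows the documented name being brought in line with the `_total` convention for counters.

```python
def ensure_total_suffix(counter_name: str) -> str:
    """Return the counter name ending in `_total`, appending it only if absent."""
    suffix = "_total"
    if counter_name.endswith(suffix):
        return counter_name
    return counter_name + suffix


# Name as documented before the review comment; labels are not affected.
documented = "dynamo_component_kv_cache_events_applied"
print(ensure_total_suffix(documented))
```

The function is idempotent, so applying it to a name that already carries the suffix leaves the name unchanged.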

Signed-off-by: Keiven Chang <keivenchang@users.noreply.github.com>
@keivenchang force-pushed the keivenchang/DYN-1967__kvstats-document-outdated branch from 28a4b85 to d924fae January 28, 2026 00:10
@keivenchang marked this pull request as ready for review January 28, 2026 00:11
@biswapanda (Contributor) left a comment

lgtm from k8s perspective

@jh-nv (Contributor) left a comment

👍

@keivenchang merged commit 8e72fb6 into main Jan 28, 2026
49 of 50 checks passed
@keivenchang deleted the keivenchang/DYN-1967__kvstats-document-outdated branch January 28, 2026 01:24
keivenchang added a commit that referenced this pull request Jan 28, 2026
Signed-off-by: Keiven Chang <keivenchang@users.noreply.github.com>
Co-authored-by: Keiven Chang <keivenchang@users.noreply.github.com>
keivenchang added a commit that referenced this pull request Jan 28, 2026
Signed-off-by: Keiven Chang <keivenchang@users.noreply.github.com>
Co-authored-by: Keiven Chang <keivenchang@users.noreply.github.com>
Labels

docs, documentation (Improvements or additions to documentation), size/M

4 participants