Rate limiting observability (metrics and tracing) PR A#5701
Open
Sanskarzz wants to merge 1 commit into
Signed-off-by: Sanskarzz <sanskar.gur@gmail.com>
Codecov Report
✅ All modified and coverable lines are covered by tests.
@@           Coverage Diff           @@
##             main    #5701   +/-   ##
=======================================
  Coverage   70.66%   70.66%
=======================================
  Files         667      667
  Lines       67607    67625    +18
=======================================
+ Hits        47775    47790    +15
- Misses      16361    16365     +4
+ Partials     3471     3470     -1
Summary
RejectionSource contract and Decision.RejectedBy with stable identifiers for shared server, shared tool, per-user server, and per-user tool buckets.
ratelimit.Allow adapter and RateLimitedError, which are already used by MCPServer and vMCP.
exhausted.
Part of #4553
Type of change
Test plan
task test-e2e)
API Compatibility
v1beta1 API, OR the api-break-allowed label is applied and the migration guidance is described above.
Changes
pkg/ratelimit/limiter.go
pkg/ratelimit/errors.go
pkg/ratelimit/limiter_test.go
pkg/vmcp/ratelimit/decorator_test.go
Does this introduce a user-facing change?
No. Rate-limit enforcement, responses, Redis keys, retry timing, and fail-open behavior are unchanged.
Implementation plan
Approved implementation plan
RejectionSource type.
Decision.RejectedBy from the first rejected index returned by the existing atomic Lua check.
ratelimit.Allow helper returns a RateLimitedError.
RejectedBy empty for allowed decisions.
first-rejection ordering.
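As a rough illustration of the plan above, the sketch below maps a first-rejected bucket index to a rejection source. It is not the code in this PR: the constant names, their string values, the simplified Decision struct, the decisionFromIndex helper, and the convention that a negative index means "nothing rejected" are all assumptions made for the example.

```go
package main

import "fmt"

// RejectionSource names the bucket that rejected a request.
// The string values are hypothetical placeholders, not the PR's identifiers.
type RejectionSource string

const (
	RejectedByNone          RejectionSource = "" // empty for allowed decisions
	RejectedBySharedServer  RejectionSource = "shared_server"
	RejectedBySharedTool    RejectionSource = "shared_tool"
	RejectedByPerUserServer RejectionSource = "per_user_server"
	RejectedByPerUserTool   RejectionSource = "per_user_tool"
)

// Decision is a simplified stand-in for the limiter's result type.
type Decision struct {
	Allowed    bool
	RejectedBy RejectionSource
}

// decisionFromIndex converts the first rejected bucket index into a Decision.
// sources lists the buckets in the order they were checked. A negative index
// is assumed to mean that every bucket had capacity.
func decisionFromIndex(sources []RejectionSource, rejected int) (Decision, error) {
	if rejected < 0 {
		return Decision{Allowed: true, RejectedBy: RejectedByNone}, nil
	}
	if rejected >= len(sources) {
		return Decision{}, fmt.Errorf("rejected index %d out of range (%d buckets)", rejected, len(sources))
	}
	return Decision{Allowed: false, RejectedBy: sources[rejected]}, nil
}

func main() {
	order := []RejectionSource{RejectedBySharedServer, RejectedBySharedTool, RejectedByPerUserServer}

	d, err := decisionFromIndex(order, 1)
	fmt.Println(d.Allowed, d.RejectedBy, err)

	d, err = decisionFromIndex(order, -1)
	fmt.Println(d.Allowed, d.RejectedBy == RejectedByNone, err)
}
```

Because the source is derived from the index of a single check, the first-rejection ordering follows directly from the order in which the buckets are passed in.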
Follow-up PRs
This is the first PR in the rate-limit observability work. To keep each review
focused, the remaining changes will be opened sequentially after the preceding
PR merges:
Core rate-limit metrics
Add OpenTelemetry instruments for allowed/rejected decisions, Redis errors,
and Lua check latency. Decision metrics will use the rejection metadata from
this PR to distinguish shared versus per-user and server versus tool checks.
Redis failures will be classified into bounded error types, and latency will
cover each attempted atomic Lua check without adding tool names or user IDs
as high-cardinality labels.
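To make "bounded error types" concrete, here is one possible shape for the classification, using only the standard library. The label values (none, canceled, timeout, network, other) and the classifyRedisError name are assumptions for illustration; the follow-up PR may choose a different set.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"net"
)

// classifyRedisError maps an arbitrary error onto a small, fixed set of
// label values so the metric's cardinality stays bounded.
// The label names are hypothetical.
func classifyRedisError(err error) string {
	var netErr net.Error
	switch {
	case err == nil:
		return "none"
	case errors.Is(err, context.Canceled):
		return "canceled"
	case errors.Is(err, context.DeadlineExceeded):
		return "timeout"
	case errors.As(err, &netErr) && netErr.Timeout():
		return "timeout"
	case errors.As(err, &netErr):
		return "network"
	default:
		return "other"
	}
}

// Sample errors used to exercise the classifier.
var (
	errWrappedDeadline = fmt.Errorf("lua check: %w", context.DeadlineExceeded)
	errWrappedCancel   = fmt.Errorf("lua check: %w", context.Canceled)
	errDial            = &net.OpError{Op: "dial", Net: "tcp", Err: errors.New("connection refused")}
	errUnknown         = errors.New("unexpected script reply")
)

func main() {
	for _, err := range []error{nil, errWrappedDeadline, errWrappedCancel, errDial, errUnknown} {
		fmt.Printf("%-45v -> %s\n", err, classifyRedisError(err))
	}
}
```

Anything the classifier does not recognize falls into a single catch-all value, so raw error strings never become label values.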
That follow-up PR will include focused metric-reader unit tests, update MCPServer
middleware ordering so telemetry is active when rate limiting executes, and
extend the existing vMCP rate-limit E2E to enable Prometheus, send allowed
and rejected traffic, scrape /metrics, and verify non-zero decision and
latency series. The new metric names, labels, and counting semantics will be
documented in the same PR.
Fail-open metric
Add a dedicated counter for requests that proceed because Redis was
unavailable. This is separate from the Redis-error counter: the error metric
reports the dependency failure, while the fail-open metric reports its
effect on enforcement. The implementation will preserve the current policy
in both production paths: MCPServer HTTP middleware and the vMCP
core.VMCP decorator continue forwarding requests after infrastructure errors.
Focused tests will verify one increment per affected request, no increments
for normal allows or rate-limit rejections, repeated outage behavior, and
continued delegation through both enforcement paths. Its operational
meaning will be added to the metric documentation in the same PR.
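The counting rule described above can be sketched with a plain HTTP middleware. This is a simplified stand-in, not the MCPServer middleware: the checkFunc hook, the in-process counter (in place of a real metric instrument), the 429 response, and the /mcp path are all assumptions for the example.

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
	"net/http/httptest"
	"sync/atomic"
)

// checkFunc is a hypothetical limiter hook: allowed carries the decision,
// err reports an infrastructure failure such as Redis being unreachable.
type checkFunc func(r *http.Request) (allowed bool, err error)

// failOpenTotal stands in for the dedicated fail-open counter.
var failOpenTotal int64

func failOpenCount() int64 { return atomic.LoadInt64(&failOpenTotal) }

// rateLimit forwards the request when the check errors (fail-open) and
// counts exactly one increment for each request that proceeds that way.
func rateLimit(check checkFunc, next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		allowed, err := check(r)
		switch {
		case err != nil:
			atomic.AddInt64(&failOpenTotal, 1)
			next.ServeHTTP(w, r)
		case !allowed:
			// The status code is an assumption; rejected calls never reach next.
			http.Error(w, "rate limit exceeded", http.StatusTooManyRequests)
		default:
			next.ServeHTTP(w, r)
		}
	})
}

// Sample checks covering the three outcomes.
func allowAll(*http.Request) (bool, error)  { return true, nil }
func rejectAll(*http.Request) (bool, error) { return false, nil }
func redisDown(*http.Request) (bool, error) { return false, errors.New("redis unavailable") }

// run sends one request through the middleware and returns the status code.
func run(check checkFunc) int {
	next := http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) {
		w.WriteHeader(http.StatusOK)
	})
	rec := httptest.NewRecorder()
	rateLimit(check, next).ServeHTTP(rec, httptest.NewRequest(http.MethodPost, "/mcp", nil))
	return rec.Code
}

func main() {
	fmt.Println(run(allowAll), run(rejectAll), run(redisDown), run(redisDown))
	fmt.Println("fail-open requests:", failOpenCount())
}
```

Keeping the increment next to the branch that forwards the request is what separates this counter from a Redis-error counter, which would be incremented where the failure is observed.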
Request-span attributes
Annotate the existing request span rather than creating a second span.
Allowed requests will report rate_limit.decision=allowed,
rate_limit.rejected_by=none, and rate_limit.fail_open=false. Rejected
requests will identify the bucket preserved by this PR. Redis failures will
report an allowed decision with rate_limit.fail_open=true, making it
possible to distinguish a normal allow from an unenforced request during an
outage.
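The attribute contract above could be expressed as a small pure function, sketched here without a tracing SDK. The outcome struct and spanAttributes helper are hypothetical, and two values are assumptions not stated above: rejected requests are given rate_limit.decision=rejected with rate_limit.fail_open=false, and fail-open requests are given rate_limit.rejected_by=none.

```go
package main

import (
	"fmt"
	"strconv"
)

// outcome is a hypothetical summary of one rate-limit check.
type outcome struct {
	Allowed    bool
	RejectedBy string // bucket identifier; only meaningful when Allowed is false
	FailOpen   bool   // true when the request proceeded because Redis failed
}

// spanAttributes returns the attributes to set on the existing request span.
// A fail-open request is reported as allowed with fail_open=true.
func spanAttributes(o outcome) map[string]string {
	decision, source := "allowed", "none"
	if !o.Allowed {
		decision = "rejected" // assumed value for rejected requests
		source = o.RejectedBy
	}
	return map[string]string{
		"rate_limit.decision":    decision,
		"rate_limit.rejected_by": source,
		"rate_limit.fail_open":   strconv.FormatBool(o.FailOpen),
	}
}

func main() {
	fmt.Println(spanAttributes(outcome{Allowed: true}))
	fmt.Println(spanAttributes(outcome{Allowed: false, RejectedBy: "per_user_tool"}))
	fmt.Println(spanAttributes(outcome{Allowed: true, FailOpen: true}))
}
```

With a real tracer, the same key/value pairs would be attached to the span already present in the request context rather than to a new span.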
Span-recorder tests will cover allowed, rejected, and fail-open outcomes
through both the MCPServer middleware and vMCP decorator, including the
invariant that rejected calls never reach the next handler or inner
core.VMCP. The tracing attribute contract will be documented in that PR.
Each follow-up will contain the tests and documentation as needed.
Special notes for reviewers
The new metadata uses the rejected bucket index already returned by
bucket.ConsumeAll; it does not introduce another rate-limit implementation or perform a second Redis check. It is propagated through the existing ratelimit.Allow and RateLimitedError path, while the vMCP core.VMCP decorator implementation remains unchanged.
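For reviewers who want a mental model of that propagation, the sketch below shows how a caller could recover the rejection source from a typed error without a second check. The types are simplified stand-ins: the RejectedBy field being a string, and the allow, callTool, and rejectionSource helpers, are assumptions for the example rather than the package's actual signatures.

```go
package main

import (
	"errors"
	"fmt"
)

// RateLimitedError is a simplified stand-in for the existing error type.
// The string-typed RejectedBy field is an assumption for this sketch.
type RateLimitedError struct {
	RejectedBy string
}

func (e *RateLimitedError) Error() string {
	return "rate limited by " + e.RejectedBy
}

// allow is a hypothetical adapter: it turns a rejected decision into a
// typed error and returns nil for an allowed one.
func allow(allowed bool, source string) error {
	if allowed {
		return nil
	}
	return &RateLimitedError{RejectedBy: source}
}

// callTool wraps the adapter error the way an intermediate layer might.
func callTool(allowed bool, source string) error {
	if err := allow(allowed, source); err != nil {
		return fmt.Errorf("call tool: %w", err)
	}
	return nil
}

// rejectionSource recovers the bucket identifier from anywhere in the
// error chain, so callers do not need to re-run the check.
func rejectionSource(err error) (string, bool) {
	var rl *RateLimitedError
	if errors.As(err, &rl) {
		return rl.RejectedBy, true
	}
	return "", false
}

func main() {
	src, ok := rejectionSource(callTool(false, "shared_server"))
	fmt.Printf("%q %v\n", src, ok)

	src, ok = rejectionSource(callTool(true, ""))
	fmt.Printf("%q %v\n", src, ok)
}
```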