feat(worker): add token bucket rate limiter Durable Object #5504
Conversation
Implement a production-grade token bucket rate limiter for the AI Gateway using Cloudflare Durable Objects for consistent enforcement at high concurrency.

Key features:
- Token bucket algorithm with lazy refill (no background timers)
- Supports request-based and cost-based (cents) limiting
- Segment types: global, per-user (Helicone-User-Id), per-property
- Policy header format: [quota];w=[window];u=[unit];s=[segment]
- Atomic operations via the DO's single-threaded execution model
- Configurable fail-open/fail-closed behavior on errors
- Policy change detection and graceful handling

Files added:
- TokenBucketRateLimiterDO: core DO with bucket state management
- policyParser: robust policy string parsing with validation
- segmentExtractor: header-based segment identifier extraction
- tokenBucketClient: Worker integration layer

90 unit tests covering:
- Policy parsing edge cases
- Token bucket math (refill, consume, reset time)
- Segment extraction and DO key generation

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
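The lazy-refill bucket math described above can be sketched roughly as follows. This is a minimal TypeScript sketch, not the PR's actual code; `BucketState`, `refill`, and `tryConsume` are illustrative names.

```typescript
// Hypothetical sketch of a lazy-refill token bucket: instead of a
// background timer topping up tokens, each request credits tokens
// proportional to the time elapsed since the last refill.
interface BucketState {
  tokens: number;       // tokens currently available
  lastRefillMs: number; // epoch ms of the last refill
}

function refill(
  state: BucketState,
  quota: number,     // bucket capacity (tokens per window)
  windowSec: number, // window length in seconds
  nowMs: number
): BucketState {
  const elapsedMs = Math.max(0, nowMs - state.lastRefillMs);
  const ratePerMs = quota / (windowSec * 1000);
  // Never exceed capacity: the bucket caps at `quota`.
  const tokens = Math.min(quota, state.tokens + elapsedMs * ratePerMs);
  return { tokens, lastRefillMs: nowMs };
}

// Consume `cost` tokens: 1 for request-based limiting, or the request
// cost in cents for cost-based (u=cents) limiting.
function tryConsume(
  state: BucketState,
  quota: number,
  windowSec: number,
  cost: number,
  nowMs: number
): { allowed: boolean; state: BucketState } {
  const refilled = refill(state, quota, windowSec, nowMs);
  if (refilled.tokens >= cost) {
    return {
      allowed: true,
      state: { ...refilled, tokens: refilled.tokens - cost },
    };
  }
  return { allowed: false, state: refilled };
}
```

Because a Durable Object processes events one at a time, running this read-modify-write inside the DO makes consumption atomic without explicit locks.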
This pull request has been ignored for the connected project Preview Branches by Supabase.
Claude finished @replicas-connector[bot]'s task. View job

PR Review: Token Bucket Rate Limiter
Score: 8/10. A well-implemented, production-grade rate limiter with excellent test coverage and thoughtful design decisions.
Detailed Review

Strengths:
- Excellent algorithm design
- Strong security & robustness
- Excellent test coverage
- Production-ready integration
CI/CD Failure - Resolved
Workflow
Fix: Fixed lint error
Commit: bebeff5

CI/CD Failure
Workflow
- Add eslint-disable comment for control-regex in segmentExtractor
- Fix prettier formatting issues in policyParser, segmentExtractor, tokenBucketClient
- Remove unused SegmentExtractionError import
- Change single quotes to double quotes for string literal

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
CI/CD Failure - Resolved
Workflow
Root Cause: The
Fix: Regenerated
Commits:
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
CI/CD Failure - Resolved
Workflow
Root Cause: This workflow ran against an intermediate commit that still had a wrangler version mismatch in the generated types.
Fix: The fix was already pushed in commit
Status: The new workflow run
Use the same wrangler version as CI (4.53.0) to ensure the generated worker-configuration.d.ts matches exactly.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Integrates the TokenBucketRateLimiterDO into the proxy request handler:
- Add checkTokenBucketRateLimit call in ProxyForwarder.ts after existing rate limit checks
- Add addTokenBucketRateLimitHeaders method to ResponseBuilder
- Rate limiting is triggered by the Helicone-RateLimit-Policy header
- Uses fail-open behavior to preserve availability on errors
- Adds rate limit response headers (Limit, Remaining, Policy, Reset)
- Returns HTTP 429 when rate limited

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
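The fail-open pattern this commit describes can be sketched like so. This is an assumption-laden sketch, not the PR's `ProxyForwarder` code: `RateLimitResult`, `enforceRateLimit`, and the `Helicone-RateLimit-*` header names are illustrative.

```typescript
// Hypothetical shape of a rate-limit check result returned by the DO client.
interface RateLimitResult {
  allowed: boolean;
  limit: number;     // quota for the window
  remaining: number; // tokens left
  resetSec: number;  // seconds until the bucket refills enough
  policy: string;    // the policy string that was enforced
}

// Wrap the DO call so that any error (DO unreachable, timeout, etc.)
// lets the request through: availability over strict enforcement.
async function enforceRateLimit(
  check: () => Promise<RateLimitResult>
): Promise<{ blocked: Response | null; headers: Record<string, string> }> {
  try {
    const r = await check();
    const headers: Record<string, string> = {
      "Helicone-RateLimit-Limit": String(r.limit),
      "Helicone-RateLimit-Remaining": String(r.remaining),
      "Helicone-RateLimit-Policy": r.policy,
      "Helicone-RateLimit-Reset": String(r.resetSec),
    };
    // When over quota, short-circuit with HTTP 429 and the same headers.
    const blocked = r.allowed
      ? null
      : new Response("rate limited", { status: 429, headers });
    return { blocked, headers };
  } catch {
    // Fail open: if the rate limiter errors, allow the request and
    // attach no rate-limit headers rather than blocking traffic.
    return { blocked: null, headers: {} };
  }
}
```

When `blocked` is null the caller forwards to the upstream and merges `headers` into the eventual response; otherwise it returns the 429 directly.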
CI/CD Failure - Unrelated to PR Changes
Workflow
Findings: The failure is in
This is a pre-existing flaky test issue with the Cloudflare Vitest pool workers' Durable Object storage isolation, not related to the token bucket rate limiter changes in this PR.
Evidence:
Recommendation: Re-run the workflow or investigate the flaky test separately.
The rate limit filter was looking up a property filter by label, which failed
when the Helicone-Rate-Limit-Status property hadn't been used yet. This caused
the filter node to be an empty object ({}) that matched all requests instead
of only rate-limited ones.
Fixed by building the filter node directly using the known property structure.
Use an empty object {} when not filtering (a valid FilterNode type) instead of
the "all" string, which causes backend validation errors.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Force-pushed from 8d04c3c to 72a8abf
The rate limit filter was looking up a property filter by label, which failed
when the Helicone-Rate-Limit-Status property hadn't been used yet. This caused
the filter node to be an empty object ({}) that matched all requests instead
of only rate-limited ones.
Fixed by building the filter node directly using the known property structure
with the correct value "bucket_rate_limited" (not "rate_limited").
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Changed the chart's userFilters to use the correct property value "bucket_rate_limited" instead of "rate_limited". Also simplified the filter structure to avoid validation errors with nested "all" strings.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Resolved conflicts, keeping the bucket rate limiter implementation while merging DataDog tracer imports from main.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

Includes the tracer.setOrgId() call that was in the main branch's rate limit tracking, kept for correlation purposes.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
CI/CD Failure - Unresolved (Flaky Test)
Workflow
Findings: Same pre-existing flaky test issue in
Root Cause: This is a known issue with
All rate-limit tests passed successfully. The failing test is unrelated to the token bucket rate limiter implementation.
Recommendation: This flaky test should be addressed separately; it has been failing intermittently across multiple CI runs and predates this PR.
- Add tracer and traceContext parameters to checkBucketRateLimit
- Add tracer and traceContext parameters to recordBucketUsage
- Add spans with metrics: remaining, rate_limited, quota_limit, time_window_seconds, rate_limit_unit, segment info
- Pass tracer/traceContext from ProxyForwarder to bucket functions

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
CI/CD Failure - Unresolved (Flaky Test)
Workflow
Findings: Same pre-existing flaky test issue in
Root Cause: This is a known issue with
All 1256 other tests passed successfully, including all rate-limit tests. The failing test is unrelated to the token bucket rate limiter implementation.
Recommendation: This flaky test should be addressed separately; consider increasing the timeout for this specific test or investigating the DO storage isolation issue in the test framework.
Summary

- Rate limiting is triggered by the Helicone-RateLimit-Policy header
- Per-user limits key on the Helicone-User-Id header
- Per-property limits key on Helicone-Property-[Name] headers (e.g., organization, tenant)

Key Design Decisions
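The header-based segmentation above, and the per-segment Durable Object key it implies, can be sketched like this. The type, function, and key shapes are assumptions for illustration; only the header names come from the PR.

```typescript
// Hypothetical segment extraction: which bucket a request belongs to is
// derived entirely from request headers.
type Segment =
  | { kind: "global" }
  | { kind: "user"; id: string }
  | { kind: "property"; name: string; value: string };

function extractSegment(headers: Headers, segment: string): Segment {
  if (segment === "global") return { kind: "global" };
  if (segment === "user") {
    const id = headers.get("Helicone-User-Id");
    if (!id) throw new Error("s=user requires the Helicone-User-Id header");
    return { kind: "user", id };
  }
  // Any other segment name maps to a Helicone-Property-[Name] header.
  const value = headers.get(`Helicone-Property-${segment}`);
  if (!value) {
    throw new Error(`s=${segment} requires the Helicone-Property-${segment} header`);
  }
  return { kind: "property", name: segment, value };
}

// One Durable Object per (org, segment) pair, so each segment's bucket is
// enforced atomically by a single DO instance. Key format is illustrative.
function doKey(orgId: string, seg: Segment): string {
  switch (seg.kind) {
    case "global":
      return `${orgId}:global`;
    case "user":
      return `${orgId}:user:${seg.id}`;
    case "property":
      return `${orgId}:prop:${seg.name}:${seg.value}`;
  }
}
```

Deriving the DO id from this key (e.g. via `idFromName`) routes all requests for a segment to the same object, which is what makes enforcement consistent at high concurrency.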
Policy Header Format

[quota];w=[window];u=[unit];s=[segment]

Examples:
- 1000;w=3600: 1000 requests per hour, global
- 5000;w=86400;u=cents: $50 per day, global
- 100;w=60;s=user: 100 requests per minute, per user
- 10000;w=3600;s=organization: 10000 requests per hour, per organization

Files Added
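A parser for this format can be sketched as below. This is a simplified stand-in for the PR's policyParser; the `RateLimitPolicy` field names and the default window of 60 seconds are assumptions.

```typescript
// Hypothetical parse of "[quota];w=[window];u=[unit];s=[segment]".
interface RateLimitPolicy {
  quota: number;                // max tokens per window
  windowSec: number;            // window length in seconds
  unit: "request" | "cents";    // what a token counts
  segment: string;              // "global", "user", or a property name
}

function parsePolicy(header: string): RateLimitPolicy {
  const [quotaPart, ...params] = header.split(";").map((p) => p.trim());
  const quota = Number(quotaPart);
  if (!Number.isInteger(quota) || quota <= 0) {
    throw new Error(`invalid quota: ${quotaPart}`);
  }

  // Defaults (assumed): 60s window, request-based, global segment.
  const policy: RateLimitPolicy = {
    quota,
    windowSec: 60,
    unit: "request",
    segment: "global",
  };

  for (const param of params) {
    const [k, v] = param.split("=");
    if (k === "w") {
      const w = Number(v);
      if (!Number.isInteger(w) || w <= 0) throw new Error(`invalid window: ${v}`);
      policy.windowSec = w;
    } else if (k === "u") {
      if (v === "request" || v === "cents") policy.unit = v;
      else throw new Error(`invalid unit: ${v}`);
    } else if (k === "s") {
      if (!v) throw new Error("empty segment");
      policy.segment = v;
    } else {
      throw new Error(`unknown parameter: ${param}`);
    }
  }
  return policy;
}
```

Rejecting unknown parameters keeps malformed policies from silently degrading to a global limit.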
- TokenBucketRateLimiterDO.ts
- policyParser.ts
- segmentExtractor.ts
- tokenBucketClient.ts
- test/rate-limit/*.spec.ts

Test plan
- Manual testing via wrangler dev (can test global/user/property isolation)

🤖 Generated with Claude Code