Commit 15f5505

replicas-connector[bot], claude, H2Shami, and Justin Torre authored
feat(worker): add token bucket rate limiter Durable Object (#5504)
* feat(worker): add token bucket rate limiter Durable Object

  Implement a production-grade token bucket rate limiter for the AI Gateway using Cloudflare Durable Objects for consistent enforcement at high concurrency.

  Key features:
  - Token bucket algorithm with lazy refill (no background timers)
  - Supports request-based and cost-based (cents) limiting
  - Segment types: global, per-user (Helicone-User-Id), per-property
  - Policy header format: [quota];w=[window];u=[unit];s=[segment]
  - Atomic operations via DO's single-threaded execution model
  - Configurable fail-open/fail-closed on errors
  - Policy change detection and graceful handling

  Files added:
  - TokenBucketRateLimiterDO: Core DO with bucket state management
  - policyParser: Robust policy string parsing with validation
  - segmentExtractor: Header-based segment identifier extraction
  - tokenBucketClient: Worker integration layer

  90 unit tests covering:
  - Policy parsing edge cases
  - Token bucket math (refill, consume, reset time)
  - Segment extraction and DO key generation

  Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* fix(worker): fix lint errors in rate limiter modules

  - Add eslint-disable comment for control-regex in segmentExtractor
  - Fix prettier formatting issues in policyParser, segmentExtractor, tokenBucketClient
  - Remove unused SegmentExtractionError import
  - Change single quotes to double quotes for string literal

  Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* chore(worker): regenerate types with wrangler for token bucket DO

  Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* chore(worker): regenerate types with wrangler 4.53.0 to match CI

  Use the same wrangler version as CI (4.53.0) to ensure the generated worker-configuration.d.ts matches exactly.

  Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* feat(worker): integrate token bucket rate limiter into request flow

  Integrates the TokenBucketRateLimiterDO into the proxy request handler:
  - Add checkTokenBucketRateLimit call in ProxyForwarder.ts after existing rate limit checks
  - Add addTokenBucketRateLimitHeaders method to ResponseBuilder
  - Rate limiting is triggered by the Helicone-RateLimit-Policy header
  - Uses fail-open behavior to preserve availability on errors
  - Adds rate limit response headers (Limit, Remaining, Policy, Reset)
  - Returns HTTP 429 when rate limited

  Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* hook in new rate limiter

* add more tests + fix cents based rate limiting

* rename rate limiter + remove console logs

* chore(worker): regenerate types with --strict-vars false to fix lint

  The previous commit regenerated worker-configuration.d.ts without the --strict-vars false flag, causing literal string types that caused TypeScript errors in src/index.ts.

  Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* change max request cost

* e2e test fixes

* e2e updates

* fix: Rate limit tab now correctly filters to only rate-limited requests

  The rate limit filter was looking up a property filter by label, which failed when the Helicone-Rate-Limit-Status property hadn't been used yet. This caused the filter node to be an empty object ({}) that matched all requests instead of only rate-limited ones.

  Fixed by building the filter node directly using the known property structure. Use empty object {} when not filtering (valid FilterNode type) instead of "all" string which causes backend validation errors.

  Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* fix: Rate limit tab now correctly filters to only rate-limited requests

  The rate limit filter was looking up a property filter by label, which failed when the Helicone-Rate-Limit-Status property hadn't been used yet. This caused the filter node to be an empty object ({}) that matched all requests instead of only rate-limited ones.

  Fixed by building the filter node directly using the known property structure with the correct value "bucket_rate_limited" (not "rate_limited").

  Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* fix: Update rate limit chart filter to use bucket_rate_limited

  Changed the chart's userFilters to use the correct property value "bucket_rate_limited" instead of "rate_limited". Also simplified the filter structure to avoid validation errors with nested "all" strings.

  Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* Add DataDog org_id tracking to bucket rate limiter

  Includes tracer.setOrgId() call that was in the main branch's rate limit tracking for correlation purposes.

  Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* Add DataDog tracing to bucket rate limiter

  - Add tracer and traceContext parameters to checkBucketRateLimit
  - Add tracer and traceContext parameters to recordBucketUsage
  - Add spans with metrics: remaining, rate_limited, quota_limit, time_window_seconds, rate_limit_unit, segment info
  - Pass tracer/traceContext from ProxyForwarder to bucket functions

  Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

---------

Co-authored-by: replicas-connector[bot] <replicas-connector[bot]@users.noreply.github.com>
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
Co-authored-by: Hammad Shami <hammad@helicone.ai>
Co-authored-by: Justin Torre <justin@Justins-MacBook-Pro.local>
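
The commit message describes a token bucket with lazy refill and a policy header of the form [quota];w=[window];u=[unit];s=[segment]. The sketch below illustrates that idea only; it is not the code added in this commit. All names (`parsePolicy`, `TokenBucket`, `tryConsume`) are hypothetical, and it assumes that `u` and `s` are optional with defaults, that the bucket refills from empty to full over one window, and that time is passed in by the caller.

```typescript
// Illustrative sketch only; names and defaults are assumptions, not the commit's code.
type Unit = "request" | "cents";

interface Policy {
  quota: number; // bucket capacity
  windowSeconds: number; // time to refill from empty to full (assumed)
  unit: Unit;
  segment: string; // e.g. "global" or "user"
}

// Parses "[quota];w=[window];u=[unit];s=[segment]".
function parsePolicy(header: string): Policy {
  const [quotaPart, ...rest] = header.split(";").map((p) => p.trim());
  const quota = Number(quotaPart);
  if (!Number.isFinite(quota) || quota <= 0) {
    throw new Error(`invalid quota: ${quotaPart}`);
  }
  const params = new Map<string, string>();
  for (const part of rest) {
    const idx = part.indexOf("=");
    if (idx <= 0) throw new Error(`invalid parameter: ${part}`);
    params.set(part.slice(0, idx), part.slice(idx + 1));
  }
  const windowSeconds = Number(params.get("w"));
  if (!Number.isFinite(windowSeconds) || windowSeconds <= 0) {
    throw new Error("invalid or missing window");
  }
  const rawUnit = params.get("u") ?? "request"; // default is an assumption
  let unit: Unit;
  if (rawUnit === "request" || rawUnit === "cents") {
    unit = rawUnit;
  } else {
    throw new Error(`invalid unit: ${rawUnit}`);
  }
  return { quota, windowSeconds, unit, segment: params.get("s") ?? "global" };
}

interface ConsumeResult {
  allowed: boolean;
  remaining: number;
  resetSeconds: number; // seconds until the bucket is full again
}

class TokenBucket {
  private readonly policy: Policy;
  private tokens: number;
  private lastRefillMs: number;

  constructor(policy: Policy, nowMs: number) {
    this.policy = policy;
    this.tokens = policy.quota; // start full
    this.lastRefillMs = nowMs;
  }

  // Lazy refill: credit tokens for the elapsed time on each call; no timers.
  private refill(nowMs: number): void {
    const elapsedSeconds = Math.max(0, nowMs - this.lastRefillMs) / 1000;
    const earned = (elapsedSeconds * this.policy.quota) / this.policy.windowSeconds;
    this.tokens = Math.min(this.policy.quota, this.tokens + earned);
    this.lastRefillMs = Math.max(this.lastRefillMs, nowMs);
  }

  // cost is 1 for request-based limits, or the request cost in cents.
  tryConsume(cost: number, nowMs: number): ConsumeResult {
    this.refill(nowMs);
    const allowed = this.tokens >= cost;
    if (allowed) this.tokens -= cost;
    const missing = this.policy.quota - this.tokens;
    const resetSeconds = Math.ceil(
      (missing * this.policy.windowSeconds) / this.policy.quota
    );
    return { allowed, remaining: Math.floor(this.tokens), resetSeconds };
  }
}
```

With the policy `10;w=60;u=request;s=user`, ten calls to `tryConsume(1, now)` at the same instant succeed and the eleventh is rejected; six seconds later one token has been earned back. Keeping one such bucket per Durable Object instance would rely on the single-threaded execution model mentioned above for atomicity, so the sketch does no locking of its own.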
1 parent 7b92b89 commit 15f5505

21 files changed: +3286 -105 lines changed

.claude/settings.local.json

Lines changed: 3 additions & 0 deletions
@@ -105,7 +105,10 @@
       "WebFetch(domain:ai.google.dev)",
       "Bash(npx tsoa:*)",
       "Bash(python3:*)",
+      "Bash(git mv:*)",
+      "Bash(npm run test:rate-limit:*)",
       "Bash(npx eslint:*)",
+      "Bash(npm test:*)",
       "Bash(./run_all_workers.sh:*)"
     ],
     "deny": []

e2e/lib/constants.ts

Lines changed: 3 additions & 0 deletions
@@ -5,6 +5,9 @@
 // Service URLs
 export const AI_GATEWAY_URL =
   process.env.AI_GATEWAY_URL || "http://localhost:8793";
+// OpenAI proxy URL - uses BYOK flow (no wallet/credits needed)
+export const OPENAI_PROXY_URL =
+  process.env.OPENAI_PROXY_URL || "http://localhost:8787";
 export const WORKER_API_URL =
   process.env.WORKER_API_URL || "http://localhost:8788";
 export const JAWN_URL = process.env.JAWN_URL || "http://localhost:8585";

e2e/package.json

Lines changed: 2 additions & 1 deletion
@@ -8,7 +8,8 @@
     "test:watch": "jest --watch",
     "test:coverage": "jest --coverage",
     "test:gateway": "jest tests/gateway",
-    "test:integration": "jest --testPathPattern=integration"
+    "test:integration": "jest --testPathPattern=integration",
+    "test:rate-limit": "jest tests/on-push/rate-limit"
   },
   "dependencies": {
     "axios": "^1.6.7",
