feat(worker): add token bucket rate limiter Durable Object by replicas-connector[bot] · Pull Request #5504 · Helicone/helicone

replicas-connector · 2026-01-13T21:16:55Z

Summary

Implement production-grade token bucket rate limiter using Cloudflare Durable Objects
Support request-based and cost-based (cents) limiting via Helicone-RateLimit-Policy header
Three segment types for rate limiting isolation:
- Global: One shared bucket per organization
- Per-user: Bucket per Helicone-User-Id
- Per-property: Bucket per Helicone-Property-[Name] (e.g., organization, tenant)
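
A minimal sketch of how the three segment types could map to separate bucket keys. Only the header names come from the summary above; the key layout, the `bucketKey` and `sanitize` helpers, and the character allow-list are assumptions made for illustration and are not taken from the PR's code.

```typescript
type Segment =
  | { kind: "global" }
  | { kind: "user" }
  | { kind: "property"; name: string };

// Hypothetical sanitizer: keep a conservative allow-list and cap the length
// so header values cannot smuggle separators into the key.
function sanitize(raw: string): string {
  return raw.trim().replace(/[^A-Za-z0-9_.@-]/g, "_").slice(0, 128);
}

// Hypothetical key layout. `headers` is expected to use lower-cased names.
// Returns null when the header needed by the segment is missing.
function bucketKey(
  orgId: string,
  segment: Segment,
  headers: Record<string, string | undefined>
): string | null {
  switch (segment.kind) {
    case "global":
      return `${orgId}:global`;
    case "user": {
      const userId = headers["helicone-user-id"];
      return userId ? `${orgId}:user:${sanitize(userId)}` : null;
    }
    case "property": {
      const name = segment.name.toLowerCase();
      const value = headers[`helicone-property-${name}`];
      return value ? `${orgId}:prop:${sanitize(name)}:${sanitize(value)}` : null;
    }
  }
}
```

Because every key is prefixed with the organization ID, two organizations that send the same user ID would still land in different buckets under this layout.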

Key Design Decisions

Token bucket algorithm with lazy refill - no background timers, tokens computed on demand
Durable Objects for enforcement - guarantees atomic operations and consistency at scale
Configurable failure mode - fail-open (default, preserves availability) or fail-closed (preserves cost control)
Policy change detection - gracefully handles updates to rate limit policies
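
To illustrate the lazy-refill idea, here is a small sketch written as pure functions. The type and function names (`BucketState`, `BucketConfig`, `refill`, `tryConsume`) are hypothetical and the PR's Durable Object may structure this differently; the point is only that tokens are recomputed from elapsed time when a request arrives, so no timer is needed.

```typescript
interface BucketState {
  tokens: number; // tokens currently available
  lastRefillMs: number; // when `tokens` was last brought up to date
}

interface BucketConfig {
  capacity: number; // quota from the policy
  windowMs: number; // time window in milliseconds
}

// Bring the bucket up to date for `nowMs`. Nothing runs between requests.
function refill(state: BucketState, cfg: BucketConfig, nowMs: number): BucketState {
  const elapsedMs = Math.max(0, nowMs - state.lastRefillMs);
  const earned = (elapsedMs * cfg.capacity) / cfg.windowMs;
  return {
    tokens: Math.min(cfg.capacity, state.tokens + earned),
    // Never move the timestamp backwards if the clock jumps.
    lastRefillMs: Math.max(nowMs, state.lastRefillMs),
  };
}

// Refill first, then deduct `cost` only if enough tokens are available.
function tryConsume(
  state: BucketState,
  cfg: BucketConfig,
  nowMs: number,
  cost = 1
): { allowed: boolean; state: BucketState } {
  const current = refill(state, cfg, nowMs);
  if (current.tokens < cost) return { allowed: false, state: current };
  return { allowed: true, state: { ...current, tokens: current.tokens - cost } };
}
```

With a capacity of 10 over a 10-second window, an empty bucket in this sketch earns back one token per second, regardless of how many requests were rejected in between.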

Policy Header Format

Helicone-RateLimit-Policy: [quota];w=[time_window];u=[unit];s=[segment]

Examples:

1000;w=3600 - 1000 requests per hour, global
5000;w=86400;u=cents - $50 per day, global
100;w=60;s=user - 100 requests per minute, per user
10000;w=3600;s=organization - 10000 requests per hour, per organization
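
A rough sketch of parsing the header format above. This is not the PR's `policyParser.ts`: the `parsePolicy` name, the `RateLimitPolicy` shape, the `request` unit literal, and the defaults (request unit, global segment) are assumptions inferred from the examples.

```typescript
type Unit = "request" | "cents";

interface RateLimitPolicy {
  quota: number;
  windowSeconds: number;
  unit: Unit;
  segment: string; // "global", "user", or a property name
}

// Returns null for anything that does not match [quota];w=...;u=...;s=...
function parsePolicy(header: string): RateLimitPolicy | null {
  const parts = header.split(";").map((p) => p.trim());
  const quotaPart = parts[0] ?? "";
  if (!/^\d+(\.\d+)?$/.test(quotaPart)) return null;
  const quota = Number(quotaPart);
  if (quota <= 0) return null;

  let windowSeconds: number | null = null;
  let unit: Unit = "request"; // assumed default when u= is omitted
  let segment = "global"; // assumed default when s= is omitted

  for (const param of parts.slice(1)) {
    const eq = param.indexOf("=");
    if (eq < 0) return null;
    const key = param.slice(0, eq).trim().toLowerCase();
    const value = param.slice(eq + 1).trim();
    if (key === "w") {
      const w = Number(value);
      if (!Number.isInteger(w) || w <= 0) return null;
      windowSeconds = w;
    } else if (key === "u") {
      if (value !== "request" && value !== "cents") return null;
      unit = value;
    } else if (key === "s") {
      if (value === "") return null;
      segment = value;
    } else {
      return null; // unknown parameter
    }
  }

  if (windowSeconds === null) return null; // w= is required in every example
  return { quota, windowSeconds, unit, segment };
}
```

Under these assumptions, `parsePolicy("5000;w=86400;u=cents")` yields a quota of 5000 cents over 86400 seconds on the global segment, matching the "$50 per day" example.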

Files Added

| File | Description |
| --- | --- |
| `TokenBucketRateLimiterDO.ts` | Core Durable Object with bucket state management |
| `policyParser.ts` | Policy string parsing with validation |
| `segmentExtractor.ts` | Header-based segment identifier extraction |
| `tokenBucketClient.ts` | Worker integration layer |
| `test/rate-limit/*.spec.ts` | 90 unit tests |

Test plan

All 90 unit tests pass
Manual testing with wrangler dev (can test global/user/property isolation)
Integration testing in staging environment

🤖 Generated with Claude Code

Implement a production-grade token bucket rate limiter for the AI Gateway using Cloudflare Durable Objects for consistent enforcement at high concurrency.

Key features:
- Token bucket algorithm with lazy refill (no background timers)
- Supports request-based and cost-based (cents) limiting
- Segment types: global, per-user (Helicone-User-Id), per-property
- Policy header format: [quota];w=[window];u=[unit];s=[segment]
- Atomic operations via DO's single-threaded execution model
- Configurable fail-open/fail-closed on errors
- Policy change detection and graceful handling

Files added:
- TokenBucketRateLimiterDO: Core DO with bucket state management
- policyParser: Robust policy string parsing with validation
- segmentExtractor: Header-based segment identifier extraction
- tokenBucketClient: Worker integration layer

90 unit tests covering:
- Policy parsing edge cases
- Token bucket math (refill, consume, reset time)
- Segment extraction and DO key generation

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

vercel · 2026-01-13T21:17:00Z

The latest updates on your projects. Learn more about Vercel for GitHub.

3 Skipped Deployments

| Project | Deployment | Updated (UTC) |
| --- | --- | --- |
| helicone | Skipped | Jan 16, 2026 11:23pm |
| helicone-bifrost | Skipped | Jan 16, 2026 11:23pm |
| helicone-eu | Skipped | Jan 16, 2026 11:23pm |

supabase · 2026-01-13T21:17:06Z

This pull request has been ignored for the connected project bolqqmqbrciybnypvklh because there are no changes detected in supbase/supabase directory. You can change this behaviour in Project Integrations Settings ↗︎.

Preview Branches by Supabase.
Learn more about Supabase Branching ↗︎.

supabase · 2026-01-13T21:17:09Z

This pull request has been ignored for the connected project lmahfbbnchpworytrrqk because there are no changes detected in .supabase/supabase directory. You can change this behaviour in Project Integrations Settings ↗︎.

Preview Branches by Supabase.
Learn more about Supabase Branching ↗︎.

claude · 2026-01-13T21:17:15Z

Claude finished @replicas-connector[bot]'s task.

PR Review: Token Bucket Rate Limiter

Score: 8/10 - Well-implemented production-grade rate limiter with excellent test coverage and thoughtful design decisions.

Todo List:

Analyze core rate limiter implementation
Review policy parsing and validation logic
Examine segment extraction for security issues
Check integration with existing codebase
Assess test coverage and quality
Evaluate performance implications
Provide final review with confidence score

Detailed Review

✅ Strengths

Excellent Algorithm Design

Token bucket with lazy refill is correctly implemented - no background timers, efficient computation
Smart dual-strategy for request vs cost-based limiting handles the "unknown cost upfront" problem elegantly
Policy change detection with graceful handling (clamp existing tokens to new capacity)
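
The "clamp existing tokens to new capacity" behavior can be sketched as follows. The `Bucket` shape and the `applyPolicy` name are hypothetical, and comparing the raw quota and window is a simplification of the hash-based change detection the review mentions.

```typescript
interface Bucket {
  tokens: number;
  capacity: number;
  windowSeconds: number;
}

// When the policy changes, keep the accumulated balance but never let it
// exceed the new capacity. An unchanged policy leaves the bucket untouched.
function applyPolicy(bucket: Bucket, capacity: number, windowSeconds: number): Bucket {
  const changed = bucket.capacity !== capacity || bucket.windowSeconds !== windowSeconds;
  if (!changed) return bucket;
  return { tokens: Math.min(bucket.tokens, capacity), capacity, windowSeconds };
}
```

In this sketch, lowering a quota from 1000 to 500 while 800 tokens remain leaves 500 tokens, while raising the quota to 2000 keeps the balance at 800 and lets refill close the gap over time.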

Strong Security & Robustness

Input sanitization in segmentExtractor.ts:172-180 prevents injection attacks in DO keys
Cost normalization with reasonable limits (max $10,000/request) prevents abuse
Fail-open default preserves availability, configurable fail-closed for cost control
Comprehensive validation of policy parameters
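
As an illustration of the fail-open versus fail-closed choice, the sketch below wraps an arbitrary check. The `checkWithFailureMode` helper, the `FailureMode` literals, and the `degraded` flag are assumptions for this example and do not reflect the PR's client API.

```typescript
type FailureMode = "fail-open" | "fail-closed";

interface Decision {
  allowed: boolean;
  degraded: boolean; // true when the limiter itself failed
}

// Run the rate limit check; if it throws, fall back to the configured mode.
async function checkWithFailureMode(
  check: () => Promise<boolean>,
  mode: FailureMode
): Promise<Decision> {
  try {
    return { allowed: await check(), degraded: false };
  } catch {
    // fail-open preserves availability, fail-closed preserves cost control
    return { allowed: mode === "fail-open", degraded: true };
  }
}
```

Surfacing a `degraded` flag (or an equivalent log line or trace tag) makes limiter outages visible even when requests keep flowing under fail-open.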

Excellent Test Coverage

90+ comprehensive unit tests across 3 test files (1,200+ lines of tests)
Tests cover edge cases, algorithm correctness, and error conditions
Pure function testing approach for bucket algorithm verification

Production-Ready Integration

DataDog tracing integration for monitoring
Proper error handling and logging
Clean separation of concerns (parser, extractor, client, DO)

⚠️ Minor Issues & Suggestions

1. Hash Function Quality (BucketRateLimiterDO.ts:317-326)

Uses simple string hash for policy change detection
Consider crypto.subtle.digest() for better collision resistance, though current approach is likely adequate

2. Cost Validation Gap

normalizeCost() clamps to 1M cents but doesn't validate minimum cost
Suggest adding minimum cost validation (e.g., >= 0.01 cents) to catch precision issues

3. Error Logging Enhancement

Rate limiter errors are caught and silently handled in some places
Consider structured logging for debugging production issues

4. Documentation

Excellent inline documentation, but could benefit from architecture decision records (ADRs) for the dual-strategy approach

🔧 Suggested Improvements

Enhanced Monitoring:

// Add more detailed metrics in bucketClient.ts
tracer.setTag(spanId, "tokens_before", bucketState.tokens);
tracer.setTag(spanId, "refill_amount", tokensToAdd);

Improved Error Handling:

// In ProxyForwarder.ts:210-212, add specific error logging
} catch (error) {
  console.error("Rate limit check failed:", error);
  // Current fail-open behavior is correct
}

Policy Validation:

// In policyParser.ts, add minimum cost validation
if (policy.unit === "cents" && quota < 0.01) {
  return err({ field: "quota", message: "Minimum cost is 0.01 cents" });
}

✅ Architecture Decisions

The dual-strategy approach for request vs cost-based limiting is particularly well thought out:

Request-based: Preemptive deduction prevents race conditions
Cost-based: Check-only pre-request + post-request deduction handles unknown costs
This handles the fundamental challenge that LLM costs are only known after completion
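
A compact sketch of the dual strategy described above. The function names and the choice to clamp the balance at zero after a cost deduction are assumptions; the PR may let a bucket go negative or handle overshoot differently.

```typescript
type Unit = "request" | "cents";

interface Bucket {
  tokens: number;
}

// Before the upstream call. Request policies deduct immediately because the
// cost (one request) is known. Cents policies only check for remaining budget.
function preRequest(bucket: Bucket, unit: Unit): { allowed: boolean; bucket: Bucket } {
  if (unit === "request") {
    if (bucket.tokens < 1) return { allowed: false, bucket };
    return { allowed: true, bucket: { tokens: bucket.tokens - 1 } };
  }
  return { allowed: bucket.tokens > 0, bucket };
}

// After the response, once the real cost is known.
function postRequest(bucket: Bucket, unit: Unit, costCents: number): Bucket {
  if (unit === "request") return bucket; // already charged up front
  // Assumption: clamp at zero rather than carrying a negative balance.
  return { tokens: Math.max(0, bucket.tokens - costCents) };
}
```

One consequence of the check-only path is that a single expensive request can overshoot the remaining budget; the limit then applies to the requests that follow.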

🚀 Performance Analysis

Durable Objects provide strong consistency with good performance
Lazy refill algorithm is O(1) time complexity
Key structure ensures efficient bucket isolation
Single storage operation per request is optimal

Suggestions Summary

Critical Fixes: None - this is ready for production

Recommended Improvements:

Enhanced error logging for debugging
Add minimum cost validation for cents-based policies
Consider stronger hash function for policy change detection
Add more detailed monitoring metrics

Nice-to-Have:

Architecture decision records documenting the dual-strategy approach
Performance benchmarks under load

This is a well-architected, thoroughly tested implementation that demonstrates strong understanding of distributed rate limiting challenges. The code quality is high and the approach is production-ready.

replicas-connector · 2026-01-13T21:17:40Z

CI/CD Failure - Resolved

Workflow Worker AI Gateway Tests failed on commit 3d1bc10.

Fix: Fixed lint error no-control-regex in segmentExtractor.ts by adding an eslint-disable comment for the control character regex pattern (used for sanitizing input). Also fixed prettier formatting issues and removed an unused import.

Commit: bebeff5

replicas-connector · 2026-01-13T21:18:54Z

CI/CD Failure

Workflow Worker Build Precheck failed on commit 3d1bc10. Investigating.

- Add eslint-disable comment for control-regex in segmentExtractor
- Fix prettier formatting issues in policyParser, segmentExtractor, tokenBucketClient
- Remove unused SegmentExtractionError import
- Change single quotes to double quotes for string literal

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

replicas-connector · 2026-01-13T21:21:40Z

CI/CD Failure - Resolved

Workflow Worker Build Precheck failed on commit bebeff5.

Root Cause: The worker-configuration.d.ts file was generated with a different wrangler version locally (4.59.1) than what CI uses (4.53.0). Different wrangler versions embed different workerd runtimes, which produce slightly different runtime type definitions.

Fix: Regenerated worker-configuration.d.ts using npx wrangler@4.53.0 types --strict-vars false to match the CI environment exactly.

Commits:

ea294836b - Initial regeneration (wrong workerd version)
91d1d0aee - Regenerated with wrangler 4.53.0 to match CI

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

replicas-connector · 2026-01-13T21:24:02Z

CI/CD Failure - Resolved

Workflow Worker Build Precheck failed on commit ea29483.

Root Cause: This workflow ran against an intermediate commit that still had wrangler version mismatch in the generated types.

Fix: The fix was already pushed in commit 91d1d0aee which regenerates worker-configuration.d.ts using wrangler@4.53.0 to match CI exactly.

Status: The new workflow run 20973131541 on commit 91d1d0aee has passed successfully.

Use the same wrangler version as CI (4.53.0) to ensure the generated worker-configuration.d.ts matches exactly. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

Integrates the TokenBucketRateLimiterDO into the proxy request handler:
- Add checkTokenBucketRateLimit call in ProxyForwarder.ts after existing rate limit checks
- Add addTokenBucketRateLimitHeaders method to ResponseBuilder
- Rate limiting is triggered by the Helicone-RateLimit-Policy header
- Uses fail-open behavior to preserve availability on errors
- Adds rate limit response headers (Limit, Remaining, Policy, Reset)
- Returns HTTP 429 when rate limited

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
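
For readers unfamiliar with this kind of integration, here is a sketch of turning a limiter result into response headers and a status code. The full header names, the `RateLimitResult` shape, and both helper functions are hypothetical; the commit message only names the header suffixes (Limit, Remaining, Policy, Reset) and the 429 status.

```typescript
interface RateLimitResult {
  allowed: boolean;
  limit: number;
  remaining: number;
  resetSeconds: number; // seconds until the bucket is expected to be full again
  policy: string; // the policy string that was applied
}

// Hypothetical header names built from the suffixes listed in the commit.
function rateLimitHeaders(r: RateLimitResult): Record<string, string> {
  return {
    "Helicone-RateLimit-Limit": String(r.limit),
    "Helicone-RateLimit-Remaining": String(Math.max(0, Math.floor(r.remaining))),
    "Helicone-RateLimit-Policy": r.policy,
    "Helicone-RateLimit-Reset": String(Math.ceil(r.resetSeconds)),
  };
}

// Returns 429 when the request was rate limited, otherwise null so the
// caller keeps the upstream status.
function rateLimitStatus(r: RateLimitResult): number | null {
  return r.allowed ? null : 429;
}
```

Rounding the remaining count down and the reset time up keeps the advertised values conservative from the client's point of view.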

replicas-connector · 2026-01-13T23:49:23Z

CI/CD Failure - Unrelated to PR Changes

Workflow Worker AI Gateway Tests failed on commit a939584.

Findings: The failure is in registry-ts.spec.ts test "openai - gpt-4o - PTB direct" with error:

Failed to pop isolated storage stack frame in registry-ts.spec.ts's test "openai - gpt-4o - PTB direct".
In particular, we were unable to pop Durable Objects storage.

This is a pre-existing flaky test issue with the Cloudflare Vitest pool workers' Durable Object storage isolation, not related to the token bucket rate limiter changes in this PR.

Evidence:

All 90 rate-limit tests passed successfully (policyParser: 34, segmentExtractor: 27, tokenBucket: 29)
The main branch also has intermittent failures with this same test (see workflow runs on 2026-01-08)
The error is in test infrastructure (@cloudflare/vitest-pool-workers) not in application code

Recommendation: Re-run the workflow or investigate the flaky test separately.

The rate limit filter was looking up a property filter by label, which failed when the Helicone-Rate-Limit-Status property hadn't been used yet. This caused the filter node to be an empty object ({}) that matched all requests instead of only rate-limited ones. Fixed by building the filter node directly using the known property structure. Use empty object {} when not filtering (valid FilterNode type) instead of "all" string which causes backend validation errors. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

The rate limit filter was looking up a property filter by label, which failed when the Helicone-Rate-Limit-Status property hadn't been used yet. This caused the filter node to be an empty object ({}) that matched all requests instead of only rate-limited ones. Fixed by building the filter node directly using the known property structure with the correct value "bucket_rate_limited" (not "rate_limited"). Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

Changed the chart's userFilters to use the correct property value "bucket_rate_limited" instead of "rate_limited". Also simplified the filter structure to avoid validation errors with nested "all" strings. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

Resolved conflicts keeping bucket rate limiter implementation while merging DataDog tracer imports from main. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

Includes tracer.setOrgId() call that was in the main branch's rate limit tracking for correlation purposes. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

replicas-connector · 2026-01-16T21:07:32Z

CI/CD Failure - Unresolved (Flaky Test)

Workflow Worker AI Gateway Tests failed on commit 170e4ab.

Findings: Same pre-existing flaky test issue in registry-ts.spec.ts:

Failed to pop isolated storage stack frame in registry-ts.spec.ts's test "openai - gpt-4o - PTB direct".
In particular, we were unable to pop Durable Objects storage.

Root Cause: This is a known issue with @cloudflare/vitest-pool-workers Durable Objects storage isolation, not related to the rate limiter changes in this PR.

All rate-limit tests passed successfully. The failing test is unrelated to token bucket rate limiter implementation.

Recommendation: This flaky test should be addressed separately - it's been failing intermittently across multiple CI runs and predates this PR.

- Add tracer and traceContext parameters to checkBucketRateLimit
- Add tracer and traceContext parameters to recordBucketUsage
- Add spans with metrics: remaining, rate_limited, quota_limit, time_window_seconds, rate_limit_unit, segment info
- Pass tracer/traceContext from ProxyForwarder to bucket functions

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

replicas-connector · 2026-01-16T23:24:51Z

CI/CD Failure - Unresolved (Flaky Test)

Workflow Worker AI Gateway Tests failed on commit 7956bc9.

Findings: Same pre-existing flaky test issue in registry-ts.spec.ts:

FAIL registry-ts.spec.ts > Registry Tests > PTB Tests > with sufficient credits > openai - gpt-4o - PTB direct
Error: Test timed out in 10000ms.
AssertionError: Isolated storage failed. There should be additional logs above.

Root Cause: This is a known issue with @cloudflare/vitest-pool-workers Durable Objects storage isolation. The test times out and then the isolated storage cleanup fails. This has been occurring across multiple CI runs and predates this PR.

All 1256 other tests passed successfully, including all rate-limit tests. The failing test is unrelated to the token bucket rate limiter implementation.

Recommendation: This flaky test should be addressed separately - consider increasing the timeout for this specific test or investigating the DO storage isolation issue in the test framework.

greptile-apps bot reviewed Jan 13, 2026

chore(worker): regenerate types with wrangler for token bucket DO

ea29483

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

chore(worker): regenerate types with wrangler 4.53.0 to match CI

91d1d0a

hook in new rate limiter

8952788

chitalian approved these changes Jan 15, 2026

chitalian force-pushed the new-rate-limter branch from 8d04c3c to 72a8abf on January 15, 2026 23:58

H2Shami and others added 2 commits January 16, 2026 11:28

Merge main into new-rate-limter

b860152

Add DataDog org_id tracking to bucket rate limiter

351029f

chitalian approved these changes Jan 16, 2026

H2Shami merged commit 15f5505 into main Jan 16, 2026
11 of 12 checks passed

H2Shami deleted the new-rate-limter branch January 16, 2026 23:37

H2Shami merged 19 commits into main from new-rate-limter
