Commit b8800bb

♻️ refactor(models): remove redundant burst field from Limit model (#408)
## Summary

- Remove the redundant `burst` field from `Limit` and `burst_milli` from `BucketState`. `capacity` now serves as the bucket ceiling (max tokens), which is what `burst` was actually doing in all bucket math.
- Drop the `bx` attribute from DynamoDB writes (bucket and config records); the read path uses a `max(cp, bx)` fallback for backward compatibility with existing records.
- Update `refill_bucket()` parameter naming (`burst_milli` → `capacity_milli`), aggregator processor, CLI display, exception serialization, migration logic, examples, and all documentation.
- Add inline period suffix support to the CLI `-l` flag: `name:rate[/period][:burst]`, where period is `/[N]sec`, `/[N]min` (default), `/[N]hour`, or `/[N]day`. Fully backward compatible.

## Test plan

- [x] `uv run pytest tests/unit/`: all unit tests pass with updated assertions (models, bucket, limiter, repository, CLI, exceptions, migrations, processor)
- [x] `uv run pytest tests/integration/`: integration tests pass with LocalStack
- [ ] `uv run pytest tests/e2e/`: end-to-end workflows pass
- [x] `hatch run generate-sync` output matches committed sync files
- [x] `pre-commit run --all-files` passes (lint, type check, sync verification)

Closes #406
Closes #410

🤖 Generated with [Claude Code](https://claude.ai/code)
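
For readers who want to see the `-l` grammar from the summary in concrete terms, here is a minimal sketch of a parser for `name:rate[/period][:burst]`. It is not the CLI's actual implementation: `ParsedLimit`, `parse_limit_spec`, and the choice to accept only integer rates are assumptions made for illustration.

```python
import re
from typing import NamedTuple, Optional

# Seconds per unit for the period suffixes named in the commit message.
_UNIT_SECONDS = {"sec": 1, "min": 60, "hour": 3600, "day": 86400}

# name:rate[/[N]unit][:burst]  (integer-only rate and burst are an assumption)
_SPEC_RE = re.compile(
    r"^(?P<name>[^:/]+):(?P<rate>\d+)"
    r"(?:/(?P<count>\d+)?(?P<unit>sec|min|hour|day))?"
    r"(?::(?P<burst>\d+))?$"
)


class ParsedLimit(NamedTuple):
    """Hypothetical result type; not part of the library."""

    name: str
    rate: int
    period_seconds: int
    burst: Optional[int]


def parse_limit_spec(spec: str) -> ParsedLimit:
    """Parse one limit spec; the period defaults to one minute when omitted."""
    m = _SPEC_RE.match(spec)
    if m is None:
        raise ValueError(f"invalid limit spec: {spec!r}")
    count = int(m.group("count") or 1)
    if count < 1:
        raise ValueError(f"period multiplier must be >= 1: {spec!r}")
    unit = m.group("unit") or "min"
    burst = m.group("burst")
    return ParsedLimit(
        name=m.group("name"),
        rate=int(m.group("rate")),
        period_seconds=count * _UNIT_SECONDS[unit],
        burst=int(burst) if burst is not None else None,
    )
```

Under this sketch the pre-existing `name:rate[:burst]` form still parses unchanged, which is the backward-compatibility property the summary claims.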
2 parents 32429c9 + c889d38 commit b8800bb

40 files changed

Lines changed: 471 additions & 522 deletions

CLAUDE.md

Lines changed: 3 additions & 3 deletions
@@ -536,7 +536,7 @@ Primary mitigation: cascade defaults to `False`.
 ### Token Bucket Algorithm
 - Buckets can go **negative** for post-hoc reconciliation
 - Refill is calculated lazily on each access
-- `burst >= capacity` allows controlled bursting
+- `capacity` is the bucket ceiling; factory methods accept `burst` to set `capacity > refill_amount`
 
 ### DynamoDB Single Table Design
 - All entities, buckets, limits, usage in one table
@@ -826,12 +826,12 @@ Limit configs use composite items (v0.8.0+, ADR-114 for configs). All limits for
 
 | Level | PK | SK | Attributes |
 |-------|----|----|------------|
-| System | `{ns}/SYSTEM#` | `#CONFIG` | `on_unavailable`, `l_rpm_cp`, `l_rpm_bx`, `l_rpm_ra`, `l_rpm_rp`, ... |
+| System | `{ns}/SYSTEM#` | `#CONFIG` | `on_unavailable`, `l_rpm_cp`, `l_rpm_ra`, `l_rpm_rp`, ... |
 | Resource | `{ns}/RESOURCE#{res}` | `#CONFIG` | `resource`, `l_rpm_cp`, ... |
 | Entity | `{ns}/ENTITY#{id}` | `#CONFIG#{resource}` | `entity_id`, `resource`, `l_rpm_cp`, ... |
 
 **Limit attribute format:** `l_{limit_name}_{field}` where field is one of:
-- `cp` (capacity), `bx` (burst), `ra` (refill_amount), `rp` (refill_period_seconds)
+- `cp` (capacity), `ra` (refill_amount), `rp` (refill_period_seconds)
 
 **Config fields:**
 - `config_version` (int): Atomic counter for cache invalidation
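
The commit message says the read path keeps a `max(cp, bx)` fallback so that records written before this change still resolve to the right ceiling. A minimal sketch of that rule follows, assuming the item has already been deserialized into a plain mapping of attribute names to integers; `read_limit_capacity` and its `prefix` parameter are hypothetical names, not the repository's API.

```python
from typing import Mapping


def read_limit_capacity(
    item: Mapping[str, int], limit_name: str, prefix: str = "l"
) -> int:
    """Resolve the bucket ceiling for one limit from a composite item.

    `prefix` is "l" for config records and "b" for bucket records,
    following the `l_{limit_name}_{field}` / `b_{limit_name}_{field}`
    attribute formats shown in the diff.
    """
    cp = item[f"{prefix}_{limit_name}_cp"]
    # `bx` is present only on records written before this commit.
    bx = item.get(f"{prefix}_{limit_name}_bx")
    return cp if bx is None else max(cp, bx)
```

For a legacy record with `l_tpm_cp=10000` and `l_tpm_bx=15000` this returns 15000; for a record written after the change, which carries only `l_tpm_cp`, it returns `cp` unchanged.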

README.md

Lines changed: 1 addition & 2 deletions
@@ -41,8 +41,7 @@ sync_limiter = SyncRateLimiter(repository=sync_repo)
 # Define default limits (can be overridden per-entity)
 default_limits = [
 Limit.per_minute("rpm", 100),
-# Token bucket with burst capacity
-Limit.per_minute("tpm", 10_000, burst=50_000),
+Limit.per_minute("tpm", 10_000),
 ]
 
 async with limiter.acquire(

docs/adr/114-composite-bucket-items.md

Lines changed: 1 addition & 1 deletion
@@ -26,7 +26,7 @@ already skips META reads on cache hits.
 All limits for an entity+resource must be stored in a single composite DynamoDB
 item with SK `#BUCKET#{resource}`. Per-limit attributes must use the prefix
 `b_{limit_name}_{field}` with short field names: `tk` (tokens), `cp` (capacity),
-`bx` (burst), `ra` (refill amount), `rp` (refill period), `tc` (total consumed).
+`ra` (refill amount), `rp` (refill period), `tc` (total consumed).
 GSI2SK must be per-entity (`BUCKET#{entity_id}`), not per-limit.
 
 ## Consequences

docs/adr/115-add-based-writes-lazy-refill.md

Lines changed: 1 addition & 1 deletion
@@ -26,7 +26,7 @@ limits simultaneously.
 
 Writers must use DynamoDB ADD to atomically decrement token balances and increment
 consumption counters. Refill must not be stored in `tk`; instead, effective tokens
-must be computed at read time as `min(stored_tk + elapsed * rate, burst)`. A single
+must be computed at read time as `min(stored_tk + elapsed * rate, capacity)`. A single
 shared `rf` attribute must serve as both the refill baseline and the optimistic
 lock. The repository must implement four write paths: Create (PutItem with
 `attribute_not_exists`), Normal (ADD with refill+consumption, condition `rf =
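
The read-time formula in this ADR, `min(stored_tk + elapsed * rate, capacity)`, can be sketched with integer arithmetic. The units here (millitokens and millisecond timestamps) and the function name `effective_tokens_milli` are assumptions for illustration; this is not the library's `refill_bucket()`.

```python
def effective_tokens_milli(
    stored_tk_milli: int,
    rf_ms: int,
    now_ms: int,
    capacity_milli: int,
    refill_amount_milli: int,
    refill_period_ms: int,
) -> int:
    """Compute effective tokens at read time without writing the refill back.

    `rf_ms` is the refill baseline. The stored balance may be negative (debt),
    so only the upper bound is clamped.
    """
    elapsed_ms = max(0, now_ms - rf_ms)  # guard against clock skew
    refilled = elapsed_ms * refill_amount_milli // refill_period_ms
    return min(stored_tk_milli + refilled, capacity_milli)
```

With a 10,000 tokens/minute limit, an empty bucket read 30 seconds after its baseline has 5,000 effective tokens, and a bucket holding 9,000 tokens is clamped to the 10,000 ceiling rather than reaching 14,000.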

docs/api/exceptions.md

Lines changed: 0 additions & 2 deletions
@@ -207,7 +207,6 @@ The `as_dict()` method returns a dictionary suitable for API responses:
 "resource": "api",
 "limit_name": "rpm",
 "capacity": 100,
-"burst": 100,
 "available": -5,
 "requested": 10,
 "exceeded": True,
@@ -218,7 +217,6 @@ The `as_dict()` method returns a dictionary suitable for API responses:
 "resource": "api",
 "limit_name": "tpm",
 "capacity": 10000,
-"burst": 10000,
 "available": 8500,
 "requested": 500,
 "exceeded": False,
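
Because `as_dict()` no longer emits a `burst` key, a consumer that reads it directly would fail on new payloads. One tolerant way to read the ceiling from either payload shape is sketched below. `bucket_ceiling` is a hypothetical helper, and treating the larger of the two values as the ceiling is an assumption that mirrors the `max(cp, bx)` read fallback described in the commit message.

```python
from typing import Any, Mapping


def bucket_ceiling(limit_status: Mapping[str, Any]) -> int:
    """Return the bucket ceiling from a per-limit status dict.

    Works for payloads with the legacy `burst` key and for payloads
    that carry only `capacity`.
    """
    return max(limit_status["capacity"], limit_status.get("burst", 0))
```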

docs/contributing/architecture.md

Lines changed: 1 addition & 14 deletions
@@ -165,7 +165,6 @@ See: [Issue #168](https://github.com/zeroae/zae-limiter/issues/168)
 "SK": "#CONFIG", # or #CONFIG#{resource} for entity level
 "resource": "gpt-4",
 "l_tpm_cp": 100000, # capacity for tpm limit
-"l_tpm_bx": 100000, # burst for tpm limit
 "l_tpm_ra": 100000, # refill_amount for tpm limit
 "l_tpm_rp": 60, # refill_period_seconds for tpm limit
 "config_version": 1 # Atomic counter for cache invalidation
@@ -304,18 +303,6 @@ new_tokens_milli = refill.new_tokens_milli - (amount * 1000)
 
 The debt is repaid as tokens refill over time. A bucket at -1500 millitokens needs 1.5 minutes to reach 0 (at 1000 tokens/minute).
 
-### Burst Capacity
-
-Burst allows temporary exceeding of sustained rate:
-
-```python
-# Sustained: 10k tokens/minute
-# Burst: 15k tokens (one-time)
-Limit.per_minute("tpm", 10_000, burst=15_000)
-```
-
-When `burst > capacity`, users can consume up to `burst` tokens immediately, then sustain at `capacity` rate.
-
 ### Design Decisions
 
 | Decision | Rationale |
@@ -461,7 +448,7 @@ The aggregator processes DynamoDB Stream records in each batch to:
 ```
 Aggregator refill flow (per composite bucket):
 1. Aggregate tc deltas + last NewImage across stream batch
-2. For each limit: refill_bucket(tk, rf, now, bx, ra, rp)
+2. For each limit: refill_bucket(tk, rf, now, cp, ra, rp)
 +- refill_delta = new_tk - current_tk
 +- projected = new_tk after refill
 +- consumption_estimate = max(0, accumulated tc_delta)

docs/getting-started.md

Lines changed: 3 additions & 3 deletions
@@ -275,10 +275,10 @@ Limit.custom("requests", capacity=50, refill_amount=50, refill_period_seconds=30
 | Parameter | Description |
 |-----------|-------------|
 | `name` | Unique identifier (e.g., "rpm", "tpm") |
-| `capacity` | Tokens that refill per period (sustained rate) |
-| `burst` | Maximum bucket size (defaults to capacity) |
+| `rate` | Sustained tokens per period (positional) |
+| `burst` | Optional burst ceiling (defaults to `rate`) |
 
-See [Token Bucket Algorithm](guide/token-bucket.md) for details on how capacity, burst, and refill work together.
+See [Token Bucket Algorithm](guide/token-bucket.md) for details on how rate, burst, and refill work together.
 
 ## Handling Rate Limit Errors
 
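
The `rate` and `burst` parameters in the table above map onto the stored fields roughly as follows. This is a sketch under assumptions: `LimitSketch` is a hypothetical stand-in for the library's `Limit`, and only the mapping stated in the diff (`burst` defaults to `rate`, `capacity` is the ceiling, `refill_amount` is the sustained rate) is modeled.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class LimitSketch:
    """Hypothetical stand-in for `Limit`; field names follow the diff."""

    name: str
    capacity: int  # bucket ceiling (max tokens)
    refill_amount: int  # tokens added per period (sustained rate)
    refill_period_seconds: int

    @classmethod
    def per_minute(
        cls, name: str, rate: int, burst: Optional[int] = None
    ) -> "LimitSketch":
        # Without `burst`, the ceiling equals the sustained rate.
        capacity = rate if burst is None else burst
        return cls(name, capacity, rate, 60)
```

Under this mapping, `per_minute("tpm", 10_000)` yields a 10,000-token ceiling, while `per_minute("tpm", 10_000, burst=15_000)` yields a 15,000-token ceiling that still refills at 10,000 tokens per minute.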

docs/guide/basic-usage.md

Lines changed: 2 additions & 2 deletions
@@ -66,7 +66,7 @@ When using stored config, configure multiple limits at setup time:
 
 ## Burst Capacity
 
-Allow temporary bursts above the sustained rate:
+Allow temporary bursts above the sustained rate by setting `burst` higher than the rate:
 
 ```python
 # Sustain 10k tokens/minute, but allow bursts up to 15k
@@ -75,7 +75,7 @@ limits = [
 ]
 ```
 
-The bucket starts full at `burst` capacity and refills at `capacity` tokens per period. See [Token Bucket Algorithm](token-bucket.md#capacity-and-burst) for details on how burst and capacity interact.
+The bucket starts full at `burst` capacity and refills at `rate` tokens per period. See [Token Bucket Algorithm](token-bucket.md#capacity-and-burst) for details on how burst and rate interact.
 
 ## Adjusting Consumption

docs/guide/token-bucket.md

Lines changed: 12 additions & 12 deletions
@@ -23,7 +23,7 @@ flowchart LR
 Check -->|No| Reject[Reject + retry_after]
 ```
 
-This creates a natural rate limit: requests can burst up to the bucket's capacity, but sustained traffic is limited by the refill rate.
+This creates a natural rate limit: requests can consume up to the bucket's capacity, but sustained traffic is limited by the refill rate.
 
 ## How zae-limiter Implements It
 
@@ -46,14 +46,14 @@ These modifications enable:
 
 ### Capacity and Burst
 
-Every limit has two key parameters:
-
-- **Capacity**: The sustained rate (tokens that refill per period)
-- **Burst**: The maximum bucket size (can be larger than capacity)
+Every limit has a **rate** (sustained throughput) and an optional **burst** (the bucket ceiling):
 
 ```python
-# 10,000 tokens/minute sustained, 15,000 burst
-Limit.per_minute("tpm", capacity=10_000, burst=15_000)
+# 10,000 tokens/minute sustained, bucket holds up to 10k
+Limit.per_minute("tpm", 10_000)
+
+# 10,000 tokens/minute sustained, but allow bursts up to 15k
+Limit.per_minute("tpm", 10_000, burst=15_000)
 ```
 
 ```mermaid
@@ -71,9 +71,9 @@ graph TD
 style E fill:#87CEEB
 ```
 
-**Key insight**: The bucket is larger (15k) but refills at the same rate (10k/minute). After fully depleting the burst, it takes **1.5 minutes** to return to full capacity—not 1 minute.
+**Key insight**: When `burst` is set, the bucket is larger than the refill rate. After fully depleting a 15k burst bucket that refills at 10k/minute, it takes **1.5 minutes** to return to full capacity. Without `burst`, the bucket ceiling equals the rate.
 
-**When to use burst > capacity:**
+**When to use burst:**
 
 - **Startup surge**: Handle initial traffic before steady state
 - **Bursty workloads**: Allow temporary spikes followed by quiet periods
@@ -192,9 +192,9 @@ except RateLimitExceeded as e:
 
 ### Choosing the right limits
 
-| Scenario | Capacity | Burst | Rationale |
-|----------|----------|-------|-----------|
-| Steady API traffic | 100 rpm | 100 | No bursting needed |
+| Scenario | Rate | Burst | Rationale |
+|----------|------|-------|-----------|
+| Steady API traffic | 100 rpm | -- | No bursting needed |
 | Bursty batch jobs | 100 rpm | 500 | Allow 5x burst, then sustain |
 | LLM tokens | 10k tpm | 15k | Handle variable response sizes |
 | Database queries | 1k rows/min | 5k | Allow large result sets occasionally |
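
The "1.5 minutes" figure in the key insight above follows from dividing the missing tokens by the refill rate. A small worked sketch makes the arithmetic explicit; `seconds_to_refill` is a hypothetical helper, not a library function.

```python
def seconds_to_refill(
    capacity: float,
    refill_amount: float,
    refill_period_seconds: float,
    current_tokens: float = 0.0,
) -> float:
    """Seconds until the bucket is back at `capacity`, starting from `current_tokens`."""
    deficit = max(0.0, capacity - current_tokens)
    return deficit / refill_amount * refill_period_seconds
```

A depleted 15k bucket refilling at 10k/minute needs `15_000 / 10_000 * 60 = 90` seconds. With no burst (ceiling equal to the rate) the same bucket is full again after 60 seconds, and the "Bursty batch jobs" row (ceiling 500 at 100 rpm) needs 300 seconds.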

docs/index.md

Lines changed: 2 additions & 2 deletions
@@ -28,7 +28,7 @@ A rate limiting library backed by DynamoDB using the token bucket algorithm.
 
 ## Features
 
-- **Token Bucket Algorithm** - Precise rate limiting with configurable burst capacity
+- **Token Bucket Algorithm** - Precise rate limiting with configurable capacity and refill rates
 - **Multiple Limits** - Track requests per minute, tokens per minute, etc. in a single call
 - **Hierarchical Entities** - Two-level hierarchy (project → API keys) with cascade mode
 - **Atomic Transactions** - Multi-key updates via DynamoDB TransactWriteItems
@@ -51,7 +51,7 @@ limiter = RateLimiter(repository=repo)
 # Define default limits (can be overridden per-entity)
 default_limits = [
 Limit.per_minute("rpm", 100),
-Limit.per_minute("tpm", 10_000, burst=50_000), # Token bucket with burst
+Limit.per_minute("tpm", 10_000),
 ]
 
 async with limiter.acquire(
