Problem or Use Case
Today, operators manage rate limits imperatively through individual CLI commands (zae-limiter system set-defaults, resource set-defaults, entity set-limits) or Python API calls (repo.set_system_defaults(), repo.set_resource_defaults(), repo.set_limits()). This works for one-off changes but creates challenges at scale:
No audit trail of desired state -- operators know what limits are but not what they should be
Drift detection is impossible -- no way to compare live limits against a source-of-truth document
Multi-namespace rollouts are tedious -- setting the same limits across tenant-alpha, tenant-beta, etc. requires repeated CLI invocations
No GitOps workflow -- limits cannot be version-controlled and reviewed through normal PR processes
No idempotent apply -- running the same CLI commands twice may produce warnings or unexpected behavior
Operators need a way to declare their desired limits configuration in version-controlled YAML files and have those declarations applied idempotently, similar to how Terraform manages infrastructure state.
Proposed Solution
YAML Configuration Format
One YAML file per namespace. Namespaces are auto-registered on first apply if they don't exist.
Limit shorthand: Only capacity is required. Defaults: burst = capacity, refill_amount = capacity, refill_period = 60 (matches Limit.per_minute()). Override any field explicitly for custom refill behavior.
Map keys for limits: Limits are keyed by name (map, not list) to prevent duplicate limit names in a single declaration.
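The shorthand rule can be illustrated with a minimal Python sketch. The helper name `expand_limit`, the plain-dict representation, and the `tpm` limit name are assumptions made for illustration; only the field names and default values come from the description above.

```python
from typing import Any


def expand_limit(spec: dict[str, Any]) -> dict[str, int]:
    """Fill in shorthand defaults for a single limit (hypothetical helper)."""
    if "capacity" not in spec:
        raise ValueError("capacity is required")
    capacity = spec["capacity"]
    return {
        "capacity": capacity,
        "burst": spec.get("burst", capacity),
        "refill_amount": spec.get("refill_amount", capacity),
        "refill_period": spec.get("refill_period", 60),
    }


# Limits are a map keyed by name, so each name can appear only once.
limits = {
    "rpm": {"capacity": 1000},                  # shorthand: capacity only
    "tpm": {"capacity": 50000, "burst": 60000}, # explicit burst override
}
expanded = {name: expand_limit(spec) for name, spec in limits.items()}
print(expanded["rpm"])
```

Because the limits are a mapping rather than a list, a second `rpm` entry cannot coexist with the first in the parsed structure.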
Architecture
Both CLI and CloudFormation paths converge at a single Lambda provisioner:
YAML → CLI parse → Lambda invoke ──→ diff vs #PROVISIONER ──→ Repository API ──→ DynamoDB
CFN event ─────────→ Lambda ────────→ diff vs #PROVISIONER ──→ Repository API ──→ DynamoDB
The Lambda is the only writer. No direct DynamoDB access from the CLI. LocalStack supports Lambda containers, so no fallback is needed.
CLI Commands
# Preview changes (dry-run, like terraform plan)
zae-limiter limits plan -f tenant-alpha.limits.yaml
# Apply limits from YAML (idempotent, like terraform apply)
zae-limiter limits apply -f tenant-alpha.limits.yaml
# Show diff between YAML and live DynamoDB state (detect out-of-band drift)
zae-limiter limits diff -f tenant-alpha.limits.yaml
# Generate a CloudFormation template from YAML
zae-limiter limits cfn-template -f tenant-alpha.limits.yaml
All commands inherit --name and --region (and --endpoint-url) from the parent CLI.
Lambda Provisioner
A new Lambda function in the main stack that handles both CLI invocations and CFN custom resource events.
State Tracking (Terraform-style)
The provisioner tracks "managed" vs "unmanaged" limits to avoid clobbering manual overrides:
| Scenario | Behavior |
| --- | --- |
| In YAML, not in DynamoDB | Create it |
| In YAML, in DynamoDB (managed) | Update it if changed |
| Removed from YAML, was managed | Delete it |
| In DynamoDB, never in YAML | Leave it alone |
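The four scenarios can be read as a small decision function. The sketch below is an illustration only: the `Action` enum, the `classify` name, and the `changed` flag are assumptions, and an item that is in the YAML but not tracked as managed is treated as a create.

```python
from enum import Enum


class Action(Enum):
    CREATE = "create"
    UPDATE = "update"
    DELETE = "delete"
    NOOP = "noop"


def classify(in_yaml: bool, was_managed: bool, changed: bool = False) -> Action:
    """Map one limit's situation to the behavior in the scenario table."""
    if in_yaml and not was_managed:
        return Action.CREATE
    if in_yaml and was_managed:
        # Managed items are only rewritten when the declaration changed.
        return Action.UPDATE if changed else Action.NOOP
    if was_managed:
        # Removed from YAML but previously managed.
        return Action.DELETE
    # Present in DynamoDB but never declared in YAML: unmanaged, leave alone.
    return Action.NOOP
```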
State is stored in a DynamoDB record:
| Field | Description |
| --- | --- |
| PK | {ns}/SYSTEM# |
| SK | #PROVISIONER |
| managed_system | Boolean: whether system defaults are managed |
| managed_resources | List of resource names managed by this manifest |
| managed_entities | Map of {entity_id: [resource_list]} managed by this manifest |
| last_applied | ISO timestamp of last apply |
| applied_hash | SHA-256 of the YAML content (for drift detection) |
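As a rough sketch of how such a record might be assembled, the Python below builds the item as a plain dict. The function `build_state_record` and the `sha256:` prefix on `applied_hash` are assumptions; `sk_provisioner()` is named in the proposal, but its body here is a guess based on the SK value in the table.

```python
import hashlib
from datetime import datetime, timezone


def sk_provisioner() -> str:
    """Sort key of the provisioner state record (body assumed from the table)."""
    return "#PROVISIONER"


def build_state_record(
    namespace: str,
    yaml_text: str,
    managed_system: bool,
    managed_resources: list[str],
    managed_entities: dict[str, list[str]],
) -> dict:
    """Assemble the state item as a plain dict (hypothetical helper)."""
    digest = hashlib.sha256(yaml_text.encode("utf-8")).hexdigest()
    return {
        "PK": f"{namespace}/SYSTEM#",
        "SK": sk_provisioner(),
        "managed_system": managed_system,
        "managed_resources": sorted(managed_resources),
        "managed_entities": {e: sorted(r) for e, r in managed_entities.items()},
        "last_applied": datetime.now(timezone.utc).isoformat(),
        # Assumption: hash is stored with an algorithm prefix.
        "applied_hash": f"sha256:{digest}",
    }


record = build_state_record(
    "tenant-alpha", "namespace: tenant-alpha\n", True, ["gpt-4"], {"user-123": ["gpt-4"]}
)
print(record["PK"], record["SK"])
```

Hashing the raw YAML text means whitespace-only edits change the hash; hashing a normalized form of the manifest would be an alternative design.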
Known limitation: The #PROVISIONER record is a single DynamoDB item (400KB max). This limits managed entities to ~5,000 per namespace. Documented as a known limit; solvable later with sharding or item-level tagging if needed.
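The ~5,000 figure implies a budget of roughly 80 bytes per managed entity (400 KB divided by 5,000). A pre-flight check along these lines could warn before the record outgrows the item limit. This is a rough sketch: the function names and the 10% headroom are assumptions, the limit is taken as 400 * 1024 bytes, and compact JSON length is only a proxy for DynamoDB's own size accounting.

```python
import json

ITEM_LIMIT_BYTES = 400 * 1024  # DynamoDB item size limit, taken here as 400 KiB


def estimate_state_size(managed_entities: dict[str, list[str]]) -> int:
    """Approximate the serialized size of managed_entities in bytes."""
    compact = json.dumps(managed_entities, separators=(",", ":"))
    return len(compact.encode("utf-8"))


def fits_in_one_item(
    managed_entities: dict[str, list[str]],
    limit: int = ITEM_LIMIT_BYTES,
    headroom: float = 0.9,
) -> bool:
    """Return True if the estimate leaves headroom below the item limit."""
    return estimate_state_size(managed_entities) <= limit * headroom


entities = {f"user-{i}": ["gpt-4"] for i in range(5000)}
print(estimate_state_size(entities), fits_in_one_item(entities))
```

Long entity IDs or many resources per entity lower the practical ceiling, which is why the per-namespace figure is approximate.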
New Package Structure
src/zae_limiter_provisioner/
├── __init__.py # Re-exports handler, types
├── handler.py # Lambda entry: on_event() for CLI + CFN
├── manifest.py # LimitsManifest dataclass, YAML parsing, validation
├── differ.py # Diff engine: manifest vs #PROVISIONER record
└── applier.py # Applies changes via Repository API
src/zae_limiter/
├── cli.py # New `limits` command group (plan, apply, diff, cfn-template)
├── schema.py # New key builder: sk_provisioner()
└── repository.py # New methods: get_provisioner_state(), put_provisioner_state()
Apply Algorithm
apply(manifest: LimitsManifest, repo: Repository):
  1. Auto-register namespace if it doesn't exist
  2. Read current #PROVISIONER record from DynamoDB (or empty if first apply)
  3. Compute diff:
     SYSTEM:
       manifest has system + previous didn't → set_system_defaults()
       manifest has system + previous did → set_system_defaults() (overwrite)
       manifest lacks system + previous had → delete_system_defaults()
     RESOURCES:
       in manifest, not in previous_managed → set_resource_defaults()
       in manifest, in previous_managed → set_resource_defaults() (overwrite)
       not in manifest, in previous_managed → delete_resource_defaults()
       not in manifest, not in previous_managed → skip (unmanaged)
     ENTITIES (same pattern):
       in manifest, not in previous_managed → set_limits()
       in manifest, in previous_managed → set_limits() (overwrite)
       not in manifest, in previous_managed → delete_limits()
       not in manifest, not in previous_managed → skip (unmanaged)
  4. Write updated #PROVISIONER record with new managed set + hash
Plan mode runs steps 1-3 and returns the diff without applying.
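Step 3 can be sketched as set arithmetic over the manifest and the previous managed state. The code below is a simplified illustration, not the proposed differ.py: the `Change` dataclass, the `compute_diff` name, and the dict shapes are assumptions (the manifest shape follows the CLI payload, the previous state follows the #PROVISIONER fields). It always emits a "set" for declared items, mirroring the overwrite rows, and does not detect no-op updates.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Change:
    action: str  # "set" or "delete"
    level: str   # "system", "resource", or "entity"
    target: str


def compute_diff(manifest: dict, previous: dict) -> list[Change]:
    """Compare a manifest with the previous managed state (sketch)."""
    changes: list[Change] = []

    # SYSTEM: set when declared, delete only if previously managed.
    if "system" in manifest:
        changes.append(Change("set", "system", "system"))
    elif previous.get("managed_system", False):
        changes.append(Change("delete", "system", "system"))

    # RESOURCES: anything neither declared nor managed is never touched.
    wanted = set(manifest.get("resources", {}))
    managed = set(previous.get("managed_resources", []))
    changes += [Change("set", "resource", r) for r in sorted(wanted)]
    changes += [Change("delete", "resource", r) for r in sorted(managed - wanted)]

    # ENTITIES: same pattern over (entity_id, resource) pairs.
    wanted_pairs = {
        (e, r)
        for e, spec in manifest.get("entities", {}).items()
        for r in spec.get("resources", {})
    }
    managed_pairs = {
        (e, r)
        for e, rs in previous.get("managed_entities", {}).items()
        for r in rs
    }
    changes += [Change("set", "entity", f"{e}/{r}") for e, r in sorted(wanted_pairs)]
    changes += [
        Change("delete", "entity", f"{e}/{r}")
        for e, r in sorted(managed_pairs - wanted_pairs)
    ]
    return changes


manifest = {
    "namespace": "tenant-alpha",
    "system": {"limits": {"rpm": {"capacity": 10000}}},
    "resources": {"gpt-4": {"limits": {"rpm": {"capacity": 1000}}}},
    "entities": {"user-123": {"resources": {"gpt-4": {"limits": {"rpm": {"capacity": 500}}}}}},
}
previous = {"managed_system": True, "managed_resources": ["gpt-3.5-turbo"], "managed_entities": {}}
for change in compute_diff(manifest, previous):
    print(change)
```

Plan mode would return this list as-is; apply mode would translate each entry into the corresponding Repository call and then write the new managed sets.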
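Cost and Performance
The operations considered are limits plan (dry-run), limits apply (no changes), and limits apply (with changes). This is an administrative operation, not on the hot path -- cost is negligible.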
Alternatives Considered
CLI direct to DynamoDB (no Lambda) -- simpler but requires operators to have DynamoDB credentials, cannot be used as a CloudFormation Custom Resource
CLI fallback when Lambda unavailable -- adds a second code path to maintain. LocalStack supports Lambda containers, so no fallback needed
List-style limits ([{name: rpm, capacity: 1000}]) -- allows duplicate limit names by accident. Map style (rpm: {capacity: 1000}) prevents this
Multi-namespace YAML files -- adds complexity. One file per namespace is cleaner, matches one CFN stack per namespace
Full sync (delete everything not in YAML) -- would clobber limits set manually via CLI/API. Terraform-style state tracking preserves unmanaged items
External state file (like terraform.tfstate) -- adds file/S3 management complexity. DynamoDB #PROVISIONER record is self-contained
Acceptance Criteria
YAML schema defined and validated (Pydantic or dataclass) covering namespace, system, resources, and entities sections with map-style limits
Shorthand limit syntax supported: only capacity required, defaults to Limit.per_minute() behavior
zae-limiter limits plan -f <file> invokes Lambda and prints a human-readable diff without modifying DynamoDB
zae-limiter limits apply -f <file> invokes Lambda and applies limits idempotently -- running twice with the same file produces zero changes on second run
zae-limiter limits diff -f <file> shows differences between YAML declaration and live DynamoDB state
zae-limiter limits cfn-template -f <file> generates a CloudFormation template with Custom::ZaeLimiterLimits resource
#PROVISIONER state record (PK={ns}/SYSTEM#, SK=#PROVISIONER) tracks managed resources, entities, and applied hash
Removing a limit from YAML and running apply deletes it from DynamoDB (only if previously managed per #PROVISIONER record)
Limits set manually via set_system_defaults/set_resource_defaults/set_limits are not deleted by apply unless tracked in #PROVISIONER
Namespaces are auto-registered on first apply if they don't exist
Lambda provisioner function handles both CLI invocations and CFN Custom Resource events (Create/Update/Delete)
Main CloudFormation stack exports LimitsProvisionerArn for cross-stack reference
Unit tests cover YAML parsing, diff computation, and state tracking logic
Lambda Provisioner
CLI invocation payload:
{
  "action": "plan|apply",
  "manifest": {
    "namespace": "tenant-alpha",
    "system": {
      "on_unavailable": "allow",
      "limits": { "rpm": { "capacity": 10000 } }
    },
    "resources": {
      "gpt-4": { "limits": { "rpm": { "capacity": 1000 } } }
    },
    "entities": {
      "user-123": {
        "resources": {
          "gpt-4": { "limits": { "rpm": { "capacity": 500 } } }
        }
      }
    }
  }
}
Response:
{
  "status": "applied|planned",
  "changes": [
    {"action": "create", "level": "resource", "target": "gpt-4", "limits": {"rpm": {"capacity": 1000}}},
    {"action": "update", "level": "entity", "target": "user-123/gpt-4", "limits": {"rpm": {"capacity": 500}}},
    {"action": "delete", "level": "resource", "target": "gpt-3.5-turbo"}
  ],
  "manifest_hash": "sha256:abc..."
}
CFN lifecycle mapping:
| CFN event | Provisioner action |
| --- | --- |
| Create | apply (first apply, empty provisioner state) |
| Update | apply (diff against previous provisioner state) |
| Delete | apply with empty manifest (deletes all managed items) |
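One way a single handler could serve both entry points is to normalize each event into the CLI payload shape before diffing. The sketch below is an assumption-laden illustration: `to_provisioner_request` is a hypothetical helper, and the `Manifest` property name inside `ResourceProperties` is invented for the example. `RequestType` and `ResourceProperties` are the standard fields of a CloudFormation custom resource request.

```python
def to_provisioner_request(event: dict) -> dict:
    """Normalize a CLI or CFN event into {"action", "manifest"} (sketch)."""
    request_type = event.get("RequestType")

    if request_type is None:
        # CLI invocation: the payload already carries action and manifest.
        action = event["action"]
        if action not in ("plan", "apply"):
            raise ValueError(f"unsupported action: {action}")
        return {"action": action, "manifest": event["manifest"]}

    # CFN custom resource event. "Manifest" is an assumed property name.
    manifest = event.get("ResourceProperties", {}).get("Manifest", {})
    if request_type in ("Create", "Update"):
        return {"action": "apply", "manifest": manifest}
    if request_type == "Delete":
        # Empty manifest: keep only the namespace so all managed items are removed.
        return {"action": "apply", "manifest": {"namespace": manifest.get("namespace")}}
    raise ValueError(f"unsupported RequestType: {request_type}")


cfn_delete = {
    "RequestType": "Delete",
    "ResourceProperties": {"Manifest": {"namespace": "tenant-alpha", "resources": {"gpt-4": {}}}},
}
print(to_provisioner_request(cfn_delete))
```

Create and Update need no separate handling in this sketch because the diff against the provisioner state (empty on first apply) already distinguishes them.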
CloudFormation Integration
The main stack exports the provisioner Lambda ARN as LimitsProvisionerArn.
The tenant stack is generated by limits cfn-template, one per namespace.