Status: DRAFT
Owner: Orchestrator
Related: TEST_STRATEGY.md, ACCEPTANCE.md
To validate ratelord as a resilient constraint control plane, simple load testing is insufficient. We must model complex, adversarial, and high-entropy scenarios that reflect real-world distributed systems.
This document defines the Advanced Simulation Framework, extending ratelord-sim to support deterministic, multi-agent scenarios with configurable topologies, policies, and failure modes.
- Behavioral Validation: Prove that
ratelordcorrectly shapes traffic (e.g., smoothing bursts, enforcing priorities) under stress. - Risk Modeling: Verify that "time-to-exhaustion" predictions hold true when usage patterns change abruptly.
- Resilience: Ensure the daemon recovers gracefully from state drift (e.g., missed webhooks, partition) and protects the provider.
The ratelord-sim tool must start supporting a Scenario Manifest (JSON/YAML) rather than just CLI flags.
A scenario defines:
- Topology: Hierarchy of identities, groups, and limits.
- Agents: Distinct actors with specific behaviors (greedy, periodic, bursty).
- Phases: Time-bounded stages (e.g., "Warmup", "Attack", "Cooldown").
- Invariants: Success criteria (e.g., "Provider never returns 429").
All scenarios must be replayable. Random number generators for jitter, request timing, and "sabotage" must accept a fixed seed.
- Description: 100+ agents wake up simultaneously (e.g., cron job alignment).
- Topology: Flat, single shared pool.
- Behavior: All agents request
POST /intentwithin a 100ms window. - Expectation:
- Daemon queues/throttles requests (no 500s).
- Approvals are distributed over time (smoothing).
- Provider rate limit is not exceeded.
- Description: The daemon's local state diverges from the provider (simulated by "sabotage" or external usage).
- Topology: Single identity.
- Behavior:
- Agents consume quota normally.
- "Saboteur" makes hidden requests directly to Provider (bypassing daemon).
- Provider usage increases faster than Daemon tracks.
- Expectation:
- Daemon detects drift on next poll/webhook.
- Forecast adapts (Time-to-Exhaustion drops).
- Daemon tightens policy (rejects/throttles) to prevent actual exhaustion.
- Description: High-volume low-priority traffic threatens to block critical operations.
- Topology:
Group A(Low Priority, High Volume).Group B(High Priority, Low Volume).
- Behavior:
Group Afloods the system.Group Battempts sporadic requests. - Expectation:
Group Ais throttled aggressively.Group Brequests are approved immediately (reserved capacity or priority queue).
- Description: One misbehaving agent in a shared pool affects others? Or is isolated?
- Topology:
Agent Bad: Greedy, infinite loop.Agent Good: Periodic, highly critical.- Variant A: Both in Shared Pool.
- Variant B: Separate Isolated Pools.
- Expectation:
- Variant A:
Agent Goodmay see increased wait times, but fairness logic should prevent starvation. - Variant B:
Agent Goodis completely unaffected byAgent Bad's exhaustion.
- Variant A:
- Description: Provider goes down (503s) or effectively blocks all requests (429s).
- Behavior: Provider returns error for all Polls/Intents.
- Expectation:
- Daemon enters "Circuit Breaker" mode.
- Auto-denies local intents without hitting provider (fail fast).
- Exponential backoff on polling.
- Self-heals when Provider recovers.
- Refactor
ratelord-simto load scenarios from file. - Implement
AgentBehaviorinterface (Greedy, Poisson, Periodic). - Add structured result reporting (JSON output).
- Define
S-01andS-02in JSON/YAML. - Implement support for modifying topology (creating identities/limits) as part of setup.
- Run selected scenarios as part of the build pipeline.
- Assert pass/fail based on
Invariants.