@@ -60,7 +60,7 @@ bash skills/mcp_restart.sh # after editing MCP
6060| ` cli/commands/ ` | cmd2 ` CommandSet ` subpkg, auto-discovered by ` cli/registry.py ` . |
6161| ` core/ ` | Canonical ` Config ` , crypto, validators, ` typing.Protocol ` interfaces. No framework imports. |
6262| ` scripts/ ` | Build/maintenance: ` build_command_index.py ` , ` patch_playbook_atomic_ids.py ` . |
63- | ` tests/ ` | 66 files, ~ 1860 tests. No mocking of C2 or daemon. |
63+ | ` tests/ ` | 68 files, ~ 1940 tests. No mocking of C2 or daemon. |
6464| ` lazyown-docker/ ` ` lazygui/ ` | Docker + desktop GUI. |
6565| ` docs/ ` | GH Pages site — auto-generated by ` DEPLOY.sh ` , don't edit HTML. |
6666| ` lazyc2/ ` | Security validators (` validate_route_path ` , ` validate_template_name ` , ` is_safe_template_path ` ). |
@@ -95,6 +95,11 @@ Critical keys:
9595| ` llm_model_groq ` | Model identifier passed to the Groq API (default ` llama-3.3-70b-versatile ` ) |
9696| ` llm_model_ollama ` | Model identifier passed to the Ollama API (default ` deepseek-r1:1.5b ` ) |
9797| ` ollama_host ` | Base URL of the Ollama daemon (default ` http://localhost:11434 ` ) |
98+ | ` llm_daily_budget_usd ` | Daily cost cap the LLM budget proxy enforces (default ` 1.0 ` ) |
99+ | ` llm_per_call_token_cap ` | Per-call input token cap the proxy enforces (default ` 8000 ` ) |
100+ | ` llm_budget_enabled ` | When ` false ` the proxy passes calls through without recording (default ` true ` ) |
101+ | ` llm_reset_at_utc ` | UTC time the ledger rolls over (default ` 00:00 ` ) |
102+ | ` llm_model_prices ` | Per-model price table in United States dollars per million tokens |
98103| ` c2_daily_limit ` , ` c2_hour_limit ` , ` c2_login_limit ` | flask-limiter strings |
99104| ` targets ` | Multi-target list (status, ports, tags, notes) |
100105| ` scope ` | Authorized engagement scope: list of CIDR/IP/hostname entries (` *. ` wildcards). Empty = scope guard dormant |
@@ -601,6 +606,47 @@ contract is pinned by `tests/test_ci_strict.py`.
601606
602607---
603608
609+ ## 15i. LLM budget cap
610+
611+ ` core/llm_budget.py ` implements the daily LLM cost cap and the
612+ per-call token cap that the operator configures through
613+ ` payload.json ` . The factory wraps every backend with the budget
614+ proxy, so the caps are enforced at the single chokepoint the framework
615+ uses. The proxy raises ` BudgetExceeded ` only when a cap is breached
616+ and does not otherwise interrupt a call. It persists the spend to
617+ ` sessions/llm_budget.json ` so that a process restart does not reset
618+ the counter within the same calendar day.
619+
620+ | Key | Default | Purpose |
621+ | -----| ---------| ---------|
622+ | ` llm_daily_budget_usd ` | ` 1.0 ` | Maximum spend per calendar day in United States dollars. |
623+ | ` llm_per_call_token_cap ` | ` 8000 ` | Maximum input tokens per call. |
624+ | ` llm_budget_enabled ` | ` true ` | When ` false ` the proxy passes the call through without recording. |
625+ | ` llm_reset_at_utc ` | ` 00:00 ` | UTC time the ledger rolls over. |
626+ | ` llm_model_prices ` | Groq and Ollama defaults | Per-model price table in United States dollars per million tokens. |
627+
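The cap and ledger behaviour described above can be sketched roughly as follows. This is a minimal illustration, not the contents of `core/llm_budget.py`: only `BudgetExceeded` and the `sessions/llm_budget.json` path come from the text, while `BudgetLedger`, its methods, and the JSON layout are hypothetical. The sketch also assumes the default `00:00` UTC rollover and ignores `llm_reset_at_utc`.

```python
import json
from datetime import datetime, timezone
from pathlib import Path


class BudgetExceeded(RuntimeError):
    """Raised when a call would breach a configured cap."""


class BudgetLedger:
    """Hypothetical ledger: tracks spend per UTC day in a small JSON file."""

    def __init__(self, path, daily_budget_usd=1.0, per_call_token_cap=8000):
        self.path = Path(path)
        self.daily_budget_usd = daily_budget_usd
        self.per_call_token_cap = per_call_token_cap

    def _today(self):
        # Assumes the ledger rolls over at 00:00 UTC.
        return datetime.now(timezone.utc).strftime("%Y-%m-%d")

    def _load(self):
        try:
            data = json.loads(self.path.read_text())
        except (OSError, ValueError):
            data = {}  # missing or corrupt file: start from an empty ledger
        if not isinstance(data, dict) or data.get("day") != self._today():
            data = {"day": self._today(), "spent_usd": 0.0}
        data["spent_usd"] = float(data.get("spent_usd", 0.0))
        return data

    def check(self, input_tokens):
        """Raise BudgetExceeded if either cap is already breached."""
        if input_tokens > self.per_call_token_cap:
            raise BudgetExceeded("per-call token cap exceeded")
        if self._load()["spent_usd"] >= self.daily_budget_usd:
            raise BudgetExceeded("daily budget exhausted")

    def record(self, cost_usd):
        """Add the cost of a finished call and persist the new total."""
        data = self._load()
        data["spent_usd"] += cost_usd
        self.path.parent.mkdir(parents=True, exist_ok=True)
        self.path.write_text(json.dumps(data))
        return data["spent_usd"]
```

A wrapper around a backend would call `check()` before forwarding the prompt and `record()` afterwards. Because the total is re-read from disk on every call, a restarted process picks up the same day's spend.
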
628+ Pricing follows the OpenAI-style model, with separate input and output
629+ rates. Groq ` llama-3.3-70b-versatile ` ships at 0.59 United States
630+ dollars per million input tokens and 0.79 United States dollars per
631+ million output tokens. Ollama models ship at zero because the operator
632+ runs them on the local host. The operator may override every value through ` payload.json ` .
633+
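To make the per-million-token arithmetic concrete, here is a small sketch of how a call's cost could be derived from such a price table. The Groq rates and the model names are the defaults quoted in this document. The dictionary layout, the `estimate_cost_usd` helper, and the choice to price unknown models at zero are assumptions for illustration only.

```python
# Illustrative price table in USD per million tokens.
# The nested {"input": ..., "output": ...} layout is an assumption.
MODEL_PRICES = {
    "llama-3.3-70b-versatile": {"input": 0.59, "output": 0.79},
    "deepseek-r1:1.5b": {"input": 0.0, "output": 0.0},  # local Ollama model
}


def estimate_cost_usd(model, input_tokens, output_tokens, prices=MODEL_PRICES):
    """Return the estimated cost of one call in USD.

    Unknown models are priced at zero here; a real implementation
    might prefer to reject them instead.
    """
    entry = prices.get(model, {"input": 0.0, "output": 0.0})
    total = input_tokens * entry["input"] + output_tokens * entry["output"]
    return total / 1_000_000
```

With these rates, a call at the default 8000-token input cap that returns 500 output tokens costs roughly half a cent, so the default daily budget of `1.0` covers on the order of two hundred such calls.
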
634+ The proxy is import-tolerant. When the ` tiktoken ` dependency is
635+ missing, the estimator falls back to a whitespace tokeniser that still
636+ produces a deterministic count. When the ` core.llm_budget ` module
637+ fails to import, the factory returns the raw backend, so a missing
638+ dependency never blocks the operator.
639+
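The optional-dependency fallback described above might look like the sketch below. The function names and the choice of the `cl100k_base` encoding are assumptions, and the real estimator may differ. The point is only that a missing or unusable `tiktoken` degrades to a whitespace split rather than an error.

```python
try:
    import tiktoken  # optional dependency
except ImportError:
    tiktoken = None


def whitespace_count(text):
    """Deterministic fallback: count whitespace-separated chunks."""
    return len(text.split())


def estimate_tokens(text):
    """Estimate the token count, preferring tiktoken when it is usable."""
    if tiktoken is not None:
        try:
            # The encoding name is an assumption for this sketch.
            encoding = tiktoken.get_encoding("cl100k_base")
            return len(encoding.encode(text))
        except Exception:
            pass  # e.g. encoding data unavailable on an offline host
    return whitespace_count(text)
```

The whitespace count usually undercounts relative to a subword tokeniser, so a cap enforced with the fallback is looser than one enforced with `tiktoken`.
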
640+ The CLI command ` llm_budget ` reports the budget status. It accepts
641+ three forms: ` llm_budget ` for the human-readable block,
642+ ` llm_budget json ` for the structured object, and
643+ ` llm_budget reset ` to clear the ledger after the operator confirms
644+ the action. The MCP tool ` lazyown_get_llm_budget ` returns the same
645+ structured object the JSON subcommand prints. Tests live in
646+ ` tests/test_llm_budget.py ` and pin the contract the spec declares.
647+
648+ ---
649+
604650## 16. Read next
605651
606652- ` QUICKSTART.md ` — start here for a new operator session.