grisuno
diff --git a/.gitignore b/.gitignore
Lines changed: 1 addition & 0 deletions
diff --git a/CHEATSHEET.md b/CHEATSHEET.md
Lines changed: 3 additions & 0 deletions
diff --git a/CLAUDE.md b/CLAUDE.md
Lines changed: 47 additions & 1 deletion
@@ -53,3 +53,4 @@ prompt.md
 knowledge_base_vuln.json
 graphify-out/cache/
 graphify-out/.graphify_*
+skills/claude_md_orchestrator/runs/
@@ -155,6 +155,9 @@ This is the second-level reference. Start with `ESSENTIALS.md` first. Use this w
 | Query knowledge base | `parquet_query` |
 | Semantic search sessions | `rag_query` |
 | Threat model | `threat_model` |
+| LLM daily cost cap and per-call token cap | `llm_budget` |
+| LLM budget as JSON for scripts | `llm_budget json` |
+| Reset LLM ledger after confirmation | `llm_budget reset` |
 
 ---
 
 
@@ -60,7 +60,7 @@ bash skills/mcp_restart.sh                                   # after editing MCP
 | `cli/commands/` | cmd2 `CommandSet` subpkg, auto-discovered by `cli/registry.py`. |
 | `core/` | Canonical `Config`, crypto, validators, `typing.Protocol` interfaces. No framework imports. |
 | `scripts/` | Build/maintenance: `build_command_index.py`, `patch_playbook_atomic_ids.py`. |
-| `tests/` | 66 files, ~1860 tests. No mocking of C2 or daemon. |
+| `tests/` | 68 files, ~1940 tests. No mocking of C2 or daemon. |
 | `lazyown-docker/` `lazygui/` | Docker + desktop GUI. |
 | `docs/` | GH Pages site — auto-generated by `DEPLOY.sh`, don't edit HTML. |
 | `lazyc2/` | Security validators (`validate_route_path`, `validate_template_name`, `is_safe_template_path`). |
@@ -95,6 +95,11 @@ Critical keys:
 | `llm_model_groq` | Model identifier passed to the Groq API (default `llama-3.3-70b-versatile`) |
 | `llm_model_ollama` | Model identifier passed to the Ollama API (default `deepseek-r1:1.5b`) |
 | `ollama_host` | Base URL of the Ollama daemon (default `http://localhost:11434`) |
+| `llm_daily_budget_usd` | Daily cost cap the LLM budget proxy enforces (default `1.0`) |
+| `llm_per_call_token_cap` | Per-call input token cap the proxy enforces (default `8000`) |
+| `llm_budget_enabled` | When `false`, the proxy passes calls through without recording them (default `true`) |
+| `llm_reset_at_utc` | UTC time the ledger rolls over (default `00:00`) |
+| `llm_model_prices` | Per-model price table in United States dollars per million tokens |
 | `c2_daily_limit`, `c2_hour_limit`, `c2_login_limit` | flask-limiter strings |
 | `targets` | Multi-target list (status, ports, tags, notes) |
 | `scope` | Authorized engagement scope: list of CIDR/IP/hostname entries (`*.` wildcards). Empty = scope guard dormant |
@@ -601,6 +606,47 @@ contract is pinned by `tests/test_ci_strict.py`.
 
 ---
 
+## 15i. LLM budget cap
+
+`core/llm_budget.py` is the production implementation of the daily
+LLM cost cap and the per-call token cap that the operator configures
+through `payload.json`. The factory wraps every backend with the
+proxy, so the cap is enforced at the single chokepoint the framework
+uses. The proxy never fails a call for any other reason: it raises
+`BudgetExceeded` only when a cap is breached. The proxy persists
+the spend to `sessions/llm_budget.json`, so a process restart does
+not reset the counter inside the same calendar day.
+
+| Key | Default | Purpose |
+|-----|---------|---------|
+| `llm_daily_budget_usd` | `1.0` | Maximum spend per calendar day in United States dollars. |
+| `llm_per_call_token_cap` | `8000` | Maximum input tokens per call. |
+| `llm_budget_enabled` | `true` | When `false`, the proxy passes the call through without recording it. |
+| `llm_reset_at_utc` | `00:00` | UTC time the ledger rolls over. |
+| `llm_model_prices` | Groq and Ollama defaults | Per-model price table in United States dollars per million tokens. |
+
+Pricing follows the OpenAI-style model, with separate rates for input
+and output tokens. Groq `llama-3.3-70b-versatile` ships at 0.59 United
+States dollars per million input tokens and 0.79 per million output
+tokens. Ollama models ship at zero because the operator runs the model
+on the local host. The operator may override every value through `payload.json`.
+
+The proxy is import-tolerant. When the `tiktoken` dependency is
+missing, the estimator falls back to a whitespace tokeniser that still
+produces a deterministic count. When the `core.llm_budget` module
+fails to import, the factory returns the raw backend, so a missing
+dependency never blocks the operator.
+
+The CLI command `llm_budget` reports the budget status. It has
+three forms: `llm_budget` prints the human-readable block,
+`llm_budget json` prints the structured object, and
+`llm_budget reset` clears the ledger after the operator confirms
+the action. The MCP tool `lazyown_get_llm_budget` returns the same
+structured object the JSON subcommand prints. Tests live in
+`tests/test_llm_budget.py` and pin the contract the spec declares.
+
+---
+
 ## 16. Read next
 
 - `QUICKSTART.md` — start here for a new operator session.
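
Reviewer note on section 15i: the arithmetic behind the two caps can be sketched as follows. This is a minimal illustration under stated assumptions, not the code in `core/llm_budget.py`. The function names `estimate_tokens`, `call_cost_usd`, and `check_call`, the shape of the price table, and the `expected_output_tokens` parameter are all hypothetical. Only the `BudgetExceeded` name, the default caps, the Groq prices, and the zero Ollama price come from the diff. The sketch always uses the whitespace fallback and does not attempt the `tiktoken` path.

```python
# Hypothetical sketch of the cap arithmetic described in section 15i.
# Prices are in United States dollars per million tokens (values from the diff).
PRICES = {
    "llama-3.3-70b-versatile": {"input": 0.59, "output": 0.79},
    "deepseek-r1:1.5b": {"input": 0.0, "output": 0.0},  # local Ollama model
}


class BudgetExceeded(Exception):
    """Raised when the per-call token cap or the daily cost cap is breached."""


def estimate_tokens(text: str) -> int:
    # Whitespace fallback: deterministic, but only an approximation
    # of what a real tokeniser would count.
    return len(text.split())


def call_cost_usd(model: str, input_tokens: int, output_tokens: int,
                  prices: dict = PRICES) -> float:
    # Input and output tokens are priced separately, then scaled
    # from "per million tokens" down to a per-call cost.
    price = prices[model]
    total = input_tokens * price["input"] + output_tokens * price["output"]
    return total / 1_000_000


def check_call(prompt: str, spent_usd: float, model: str,
               daily_budget_usd: float = 1.0,
               per_call_token_cap: int = 8000,
               expected_output_tokens: int = 0) -> float:
    """Return the projected daily spend, or raise BudgetExceeded."""
    tokens = estimate_tokens(prompt)
    if tokens > per_call_token_cap:
        raise BudgetExceeded(f"{tokens} input tokens > cap {per_call_token_cap}")
    projected = spent_usd + call_cost_usd(model, tokens, expected_output_tokens)
    if projected > daily_budget_usd:
        raise BudgetExceeded(f"projected spend {projected:.6f} > {daily_budget_usd}")
    return projected
```

With the default prices, a Groq call that consumes one million input tokens and one million output tokens would cost 0.59 + 0.79 = 1.38 United States dollars, which already exceeds the default daily cap of `1.0`.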