
Commit 43197fd

bug fixing
1 parent 3d15357 commit 43197fd

25 files changed

Lines changed: 1753 additions & 964 deletions

.gitignore

Lines changed: 1 addition & 0 deletions
@@ -53,3 +53,4 @@ prompt.md
 knowledge_base_vuln.json
 graphify-out/cache/
 graphify-out/.graphify_*
+skills/claude_md_orchestrator/runs/

CHEATSHEET.md

Lines changed: 3 additions & 0 deletions
@@ -155,6 +155,9 @@ This is the second-level reference. Start with `ESSENTIALS.md` first. Use this w
 | Query knowledge base | `parquet_query` |
 | Semantic search sessions | `rag_query` |
 | Threat model | `threat_model` |
+| LLM daily cost cap and per call token cap | `llm_budget` |
+| LLM budget as JSON for scripts | `llm_budget json` |
+| Reset LLM ledger after confirmation | `llm_budget reset` |
 
 ---

CLAUDE.md

Lines changed: 47 additions & 1 deletion
@@ -60,7 +60,7 @@ bash skills/mcp_restart.sh # after editing MCP
 | `cli/commands/` | cmd2 `CommandSet` subpkg, auto-discovered by `cli/registry.py`. |
 | `core/` | Canonical `Config`, crypto, validators, `typing.Protocol` interfaces. No framework imports. |
 | `scripts/` | Build/maintenance: `build_command_index.py`, `patch_playbook_atomic_ids.py`. |
-| `tests/` | 66 files, ~1860 tests. No mocking of C2 or daemon. |
+| `tests/` | 68 files, ~1940 tests. No mocking of C2 or daemon. |
 | `lazyown-docker/` `lazygui/` | Docker + desktop GUI. |
 | `docs/` | GH Pages site — auto-generated by `DEPLOY.sh`, don't edit HTML. |
 | `lazyc2/` | Security validators (`validate_route_path`, `validate_template_name`, `is_safe_template_path`). |
@@ -95,6 +95,11 @@ Critical keys:
 | `llm_model_groq` | Model identifier passed to the Groq API (default `llama-3.3-70b-versatile`) |
 | `llm_model_ollama` | Model identifier passed to the Ollama API (default `deepseek-r1:1.5b`) |
 | `ollama_host` | Base URL of the Ollama daemon (default `http://localhost:11434`) |
+| `llm_daily_budget_usd` | Daily cost cap the LLM budget proxy enforces (default `1.0`) |
+| `llm_per_call_token_cap` | Per call input token cap the proxy enforces (default `8000`) |
+| `llm_budget_enabled` | When `false` the proxy passes calls through without recording (default `true`) |
+| `llm_reset_at_utc` | UTC time the ledger rolls over (default `00:00`) |
+| `llm_model_prices` | Per model price table expressed in United States dollars per million tokens |
 | `c2_daily_limit`, `c2_hour_limit`, `c2_login_limit` | flask-limiter strings |
 | `targets` | Multi-target list (status, ports, tags, notes) |
 | `scope` | Authorized engagement scope: list of CIDR/IP/hostname entries (`*.` wildcards). Empty = scope guard dormant |
@@ -601,6 +606,47 @@ contract is pinned by `tests/test_ci_strict.py`.
 
 ---
 
+## 15i. LLM budget cap
+
+`core/llm_budget.py` is the production implementation of the daily
+LLM cost cap and the per call token cap the operator configures
+through `payload.json`. The factory wraps every backend with the
+proxy so both caps are enforced at the single chokepoint the
+framework uses. The proxy raises `BudgetExceeded` only when a cap is
+breached and does not otherwise fail a call. The proxy persists
+the spend to `sessions/llm_budget.json` so a process restart does
+not reset the counter inside the same calendar day.
+
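The wrap, check, forward, persist flow described in this paragraph can be sketched as below. This is a minimal illustration and not the code in `core/llm_budget.py`: the class name `BudgetProxy`, the `generate` method, the ledger layout (`day`, `spent_usd`), the whitespace token count, and the single input price are all assumptions made for the example. Only `BudgetExceeded`, the two caps, and the idea of a JSON ledger come from the text.

```python
import json
from datetime import datetime, timezone
from pathlib import Path


class BudgetExceeded(Exception):
    """Raised when the daily cost cap or the per call token cap is breached."""


class BudgetProxy:
    """Hypothetical wrapper: check both caps, forward the call, persist spend."""

    def __init__(self, backend, ledger_path, daily_budget_usd=1.0,
                 per_call_token_cap=8000, usd_per_million_input=0.59):
        self.backend = backend  # any callable taking a prompt string
        self.ledger_path = Path(ledger_path)
        self.daily_budget_usd = daily_budget_usd
        self.per_call_token_cap = per_call_token_cap
        self.usd_per_million_input = usd_per_million_input

    def _load(self):
        """Read the ledger, starting a fresh one when the UTC day changed."""
        today = datetime.now(timezone.utc).date().isoformat()
        try:
            ledger = json.loads(self.ledger_path.read_text())
        except (OSError, ValueError):
            ledger = {}  # missing or unreadable file: start from zero
        if ledger.get("day") != today:
            ledger = {"day": today, "spent_usd": 0.0}
        return ledger

    def generate(self, prompt):
        # Crude whitespace count, used here only to keep the sketch short.
        tokens = len(prompt.split())
        if tokens > self.per_call_token_cap:
            raise BudgetExceeded(f"{tokens} input tokens exceed the per call cap")
        ledger = self._load()
        cost = tokens * self.usd_per_million_input / 1_000_000
        if ledger["spent_usd"] + cost > self.daily_budget_usd:
            raise BudgetExceeded("daily budget would be exceeded")
        reply = self.backend(prompt)
        # Record the spend only after the backend call succeeded.
        ledger["spent_usd"] += cost
        self.ledger_path.write_text(json.dumps(ledger))
        return reply
```

Because the ledger is re-read from disk on every call, a second `BudgetProxy` created after a restart sees the spend recorded by the first one, which is the property the paragraph attributes to `sessions/llm_budget.json`.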
+| Key | Default | Purpose |
+|-----|---------|---------|
+| `llm_daily_budget_usd` | `1.0` | Maximum spend per calendar day in United States dollars. |
+| `llm_per_call_token_cap` | `8000` | Maximum input tokens per call. |
+| `llm_budget_enabled` | `true` | When `false` the proxy passes the call through without recording. |
+| `llm_reset_at_utc` | `00:00` | UTC time the ledger rolls over. |
+| `llm_model_prices` | Groq and Ollama defaults | Per model price table in United States dollars per million tokens. |
+
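One way to read the table is as a set of defaults that `payload.json` overlays, plus a rollover time that decides which budget day a call belongs to. The sketch below assumes exactly that; the helper names `budget_settings` and `ledger_day` are hypothetical, `llm_model_prices` is left out because the diff does not show its full default table, and treating `llm_reset_at_utc` as an offset that shifts the day boundary is an interpretation, not something the diff confirms.

```python
from datetime import datetime, timedelta, timezone

# Defaults copied from the table above; llm_model_prices is omitted on purpose.
DEFAULTS = {
    "llm_daily_budget_usd": 1.0,
    "llm_per_call_token_cap": 8000,
    "llm_budget_enabled": True,
    "llm_reset_at_utc": "00:00",
}


def budget_settings(payload):
    """Overlay the budget keys found in `payload` on top of the defaults."""
    return {key: payload.get(key, default) for key, default in DEFAULTS.items()}


def ledger_day(now_utc, reset_at_utc="00:00"):
    """Return the ISO date of the budget day that `now_utc` falls in.

    Assumption: a budget day starts at `reset_at_utc` (HH:MM) and is named
    after the calendar date on which it started.
    """
    hours, minutes = (int(part) for part in reset_at_utc.split(":"))
    shifted = now_utc - timedelta(hours=hours, minutes=minutes)
    return shifted.date().isoformat()
```

With a reset time of `06:00`, a call made at 05:59 UTC on 2 January would still count against the budget day that began on 1 January under this reading.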
+Pricing follows the OpenAI style model. Groq llama-3.3-70b-versatile
+ships at 0.59 United States dollars per million input tokens and 0.79
+United States dollars per million output tokens. Ollama models ship
+at zero because the operator runs the model on the local host. The
+operator may override every value through `payload.json`.
+
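The arithmetic behind these prices is short enough to spell out. The sketch below assumes a price table keyed by model name with `input` and `output` rates in United States dollars per million tokens; that shape, the constant `PRICES`, and the function `call_cost_usd` are illustrative, since the diff does not show the actual layout of `llm_model_prices`.

```python
# Rates in United States dollars per million tokens, taken from the text above.
# The dictionary layout is an assumption made for this example.
PRICES = {
    "llama-3.3-70b-versatile": {"input": 0.59, "output": 0.79},
    "deepseek-r1:1.5b": {"input": 0.0, "output": 0.0},  # local Ollama model
}


def call_cost_usd(model, input_tokens, output_tokens, prices=PRICES):
    """Return the cost of one call; raises KeyError for an unpriced model."""
    rate = prices[model]
    total = input_tokens * rate["input"] + output_tokens * rate["output"]
    return total / 1_000_000
```

For a Groq call that uses the full default cap of 8000 input tokens and returns 1000 output tokens, this gives (8000 × 0.59 + 1000 × 0.79) / 1,000,000 = 0.00551 United States dollars, so the default daily budget of `1.0` covers roughly 180 such calls.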
+The proxy is import tolerant. When the `tiktoken` dependency is
+missing the estimator falls back to a whitespace tokeniser that still
+produces a deterministic count. When the `core.llm_budget` module
+fails to import the factory returns the raw backend so a missing
+dependency never blocks the operator.
+
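Both fallbacks follow the usual optional import pattern, sketched below. The function names `whitespace_count`, `estimate_tokens`, and `wrap_backend` are hypothetical, the `cl100k_base` encoding is an arbitrary choice for the example, and `llm_budget.wrap` stands in for whatever entry point the real module exposes, which the diff does not show.

```python
try:
    import tiktoken  # optional dependency
except ImportError:
    tiktoken = None


def whitespace_count(text):
    """Deterministic fallback: count whitespace separated chunks."""
    return len(text.split())


def estimate_tokens(text, encoding_name="cl100k_base"):
    """Use tiktoken when it is importable and usable, else the fallback."""
    if tiktoken is not None:
        try:
            return len(tiktoken.get_encoding(encoding_name).encode(text))
        except Exception:
            # For example, the encoding data cannot be loaded offline.
            pass
    return whitespace_count(text)


def wrap_backend(backend):
    """Return the budgeted backend, or the raw backend if the module is missing."""
    try:
        import core.llm_budget as llm_budget
    except ImportError:
        return backend  # a missing module never blocks the operator
    return llm_budget.wrap(backend)  # hypothetical entry point, not confirmed
```

The whitespace count is usually lower than a real tokeniser's, so a cap enforced with the fallback is looser than one enforced with `tiktoken`; the count is still stable for a given input, which is all the paragraph claims.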
+The CLI command `llm_budget` shows the structured status block. The
+command has three forms: `llm_budget` for the human readable
+block, `llm_budget json` for the structured object, and
+`llm_budget reset` to clear the ledger after the operator confirms
+the action. The MCP tool `lazyown_get_llm_budget` returns the same
+structured object the JSON form prints. Tests live in
+`tests/test_llm_budget.py` and pin the contract the spec declares.
+
+---
+
 ## 16. Read next
 
 - `QUICKSTART.md` — start here for a new operator session.
