Architecture
This page describes how Agent Life Space is structured at runtime, what the data flow looks like, and why each layer exists. It is the canonical reference — the per-module details are on the Modules page, the security boundaries are on Security, and the on-disk formats are on Vault and Logging.
                              ┌──────────────────────────────────┐
                              │        AgentOrchestrator         │
                              │  (lifecycle + dependency wire)   │
                              └────────────┬─────────────────────┘
                                           │
     ┌──────────────┬──────────────────────┼──────────────────────┬───────────────┐
     │              │                      │                      │               │
┌────▼────┐   ┌─────▼────┐         ┌───────▼────────┐      ┌──────▼──────┐  ┌─────▼─────┐
│  Brain  │   │  Memory  │         │  Tasks / Work  │      │    Build    │  │  Review   │
│ pipeline│   │ store +  │         │  queue + proj  │      │  pipeline   │  │ pipeline  │
│ (9 layer│   │ RAG +    │         │                │      │ (codegen+   │  │ (audit +  │
│ cascade)│   │ persist  │         │                │      │  Docker)    │  │  PR + rls)│
└────┬────┘   └────┬─────┘         └────────┬───────┘      └──────┬──────┘  └─────┬─────┘
     │             │                        │                     │               │
     └─────────────┴────────┬───────────────┴──────┬──────────────┴───────────────┘
                            │                      │
                    ┌───────▼──────┐       ┌───────▼──────────┐
                    │ Control Plane│       │  Governance      │
                    │  ─ policy    │       │  ─ ToolPolicy    │
                    │  ─ intake    │       │  ─ ApprovalQueue │
                    │  ─ gateway   │       │  ─ OperatorCtrl  │
                    │  ─ state     │       │  ─ StatusModel   │
                    │  ─ reporting │       │  ─ ExplanationLog│
                    └───────┬──────┘       └──────────────────┘
                            │
           ┌────────────────┼─────────────────┐
           │                │                 │
    ┌──────▼─────┐   ┌──────▼─────┐    ┌──────▼─────┐
    │ Vault (v2) │   │    Logs    │    │  Finance   │
    │  AES-128   │   │   tiered   │    │  budget +  │
    │ +HMAC-256  │   │ long+short │    │  approval  │
    └────────────┘   └────────────┘    └────────────┘
The orchestrator (agent/core/agent.py::AgentOrchestrator) is the only place that wires modules together. Everything else holds references that were injected at construction time, never reaches across boundaries with getattr hacks, and never imports the orchestrator back. This is enforced by architecture invariant tests.
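The wiring rule can be illustrated with a minimal sketch. The class names below (`MemoryStore`, `Brain`, `Orchestrator`) are hypothetical stand-ins, not the real modules; the point is only that references flow downward at construction time and nothing imports the orchestrator back.

```python
from dataclasses import dataclass


class MemoryStore:
    """Hypothetical module that knows nothing about its consumers."""

    def recall(self, query: str) -> list[str]:
        return []


@dataclass(frozen=True)
class Brain:
    """Receives collaborators at construction time and never looks them up."""

    memory: MemoryStore

    def answer(self, text: str) -> str:
        hits = self.memory.recall(text)
        return hits[0] if hits else "no cached answer"


class Orchestrator:
    """The single place that builds modules and hands references downward."""

    def __init__(self) -> None:
        self.memory = MemoryStore()
        self.brain = Brain(memory=self.memory)
```

Because `Brain` is frozen and only holds what it was given, an invariant test can assert that no module carries a back-reference to the orchestrator.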
The brain (agent/core/brain.py::AgentBrain) is channel-agnostic. It takes an IncomingMessage and returns a string. Every message enters the same nine-layer cascade and traverses it in order. Layers 1 to 4 may return early; once a message passes layer 4, layers 5 to 9 (including the 5.5 runtime-facts step) all run.
process(IncomingMessage)
│
├─ try / finally — status always resets to IDLE on exit
│
└─ _process_inner(message)
│
Layer 1 Multi-task detection → work queue
───── Strict rules: explicit intent header (urob:, todo:, ...) OR
clean numbered list with no surrounding prose. Anti-echo guard
rejects pasted assistant text. → early return if multi-task
│
Layer 2 Internal dispatcher (deterministic, no LLM)
───── status / health / tasks / budget / identity / skills.
→ early return if handled
│
Layer 3 Semantic cache lookup
───── sentence-transformers similarity ≥ 0.90 → early return on hit
│
Layer 4 RAG retrieval
───── knowledge base embedding search.
"direct" → early return. "augment" → context injected into prompt.
│
Layer 5 Task classification + model selection
───── classify_task() → tier (FAST/BALANCED/POWERFUL) → model.
Learning-based override (adapt_model). Channel enforcement.
Telegram + CLI + sandbox-only deny guard (fail-closed).
│
Layer 5.5 Runtime facts injection (anti-confabulation)
───── Real CPU/RAM/uptime/budget injected so the model has verified
ground truth even when it can't call agent tools.
│
Layer 6 LLM call via provider abstraction
───── API backend → ToolUseLoop (multi-turn function calling).
CLI backend → direct generate (with channel-enforced file access).
│
Layer 7 Post-routing quality escalation
───── assess_quality(). If response is generic and budget allows,
re-run with stronger model. Skipped for tool-loop responses
to preserve tool context.
│
Layer 8 Learning feedback + skill auto-update
───── process_outcome(model, task, reply) →
confidence adjustment, prompt augmentation hints, skill discovery.
│
Layer 9 Channel policy filter + explanation log
───── classify_response() → can_send_response().
ExplanationLog records routing signals, policy decisions,
learning context, memory provenance breakdown.
│
return reply
Key invariant: the LLM is the most expensive layer. Every cheaper layer that can answer must run first. Most messages never reach layer 6 because they were handled by dispatcher, cache, or RAG.
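The cascade shape (cheap layers first, early return, status reset in `finally`) might be sketched as follows. This is an illustration under assumptions, not the code in agent/core/brain.py: `CascadeBrain`, `Status`, `dispatcher`, and `fake_llm` are hypothetical names, and messages are reduced to plain strings.

```python
import asyncio
from enum import Enum
from typing import Awaitable, Callable, Optional

# A cheap layer returns a reply to short-circuit, or None to fall through.
Layer = Callable[[str], Awaitable[Optional[str]]]


class Status(Enum):
    IDLE = "idle"
    BUSY = "busy"


class CascadeBrain:
    def __init__(self, cheap_layers: list[Layer],
                 llm: Callable[[str], Awaitable[str]]) -> None:
        self.status = Status.IDLE
        self._cheap_layers = cheap_layers
        self._llm = llm

    async def process(self, text: str) -> str:
        self.status = Status.BUSY
        try:
            return await self._process_inner(text)
        finally:
            # Runs on success and on exceptions alike.
            self.status = Status.IDLE

    async def _process_inner(self, text: str) -> str:
        for layer in self._cheap_layers:
            reply = await layer(text)
            if reply is not None:
                return reply  # early return: the LLM is never called
        return await self._llm(text)


async def dispatcher(text: str) -> Optional[str]:
    """Deterministic layer: answers only the literal 'status' command."""
    return "status: ok" if text == "status" else None


async def fake_llm(text: str) -> str:
    """Stand-in for the expensive provider call."""
    return f"llm({text})"
```

With `CascadeBrain([dispatcher], fake_llm)`, the message `"status"` is answered by the dispatcher and `"hello"` falls through to the stand-in LLM; in both cases the status is back to `IDLE` afterwards.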
| Path | Purpose | LOC (approx) |
|---|---|---|
| agent/core/ | Orchestrator, brain pipeline, LLM provider, tool policy, approval, status, explanation, models, sandbox executor, cron loops, paths | ~9,400 |
| agent/build/ | Build service, codegen, capabilities, verification, acceptance criteria, storage, models, Docker executor | ~6,200 |
| agent/social/ | Telegram bot + handler, Agent HTTP API, dashboard, channel policy, request identity | ~5,800 |
| agent/control/ | Policy, intake, gateway, state, reporting, evidence export, llm_runtime, settlement, recurring workflows, pipelines, storage | ~5,400 |
| agent/brain/ | Internal dispatcher, semantic router, programmer, learning, decision engine, tool router, skills, knowledge | ~3,200 |
| agent/review/ | Review service, analyzers, verifier, redaction, quality, storage, models | ~2,900 |
| agent/memory/ | 4-type store + provenance, persistent conversation, RAG, semantic cache, consolidation, inspection | ~2,400 |
| agent/finance/ | Tracker, budget policy, risk templates, approval flow, settlement requests | ~1,300 |
| agent/logs/ | Structured logging, secret redaction, tiered routing, retention manager | ~620 |
| agent/tasks/ | Task lifecycle (CRUD, priority queue) | ~410 |
| agent/work/ | SQLite-backed workspaces, audit trail, recovery, hash chain | ~470 |
| agent/projects/ | Project scoping | ~330 |
| agent/vault/ | Encrypted secrets (v2 single-file format) | ~470 |
| operator/ | TypeScript contracts for the operator dashboard | (separate package) |
Total Python in agent/: ~70,000 LOC across 112 source files. Full per-file inventory: Modules.
The brain pipeline is the 9-layer cascade described above. See agent/core/brain.py::AgentBrain.process for the entry point and _process_inner for the body. Each layer is unit-tested in tests/test_brain_core.py.
operator → /build or /intake (telegram or HTTP)
│
intake.qualify → plan → submit
│
BuildService.run_build()
│
├─ workspace setup (isolated, hash-chained audit trail)
│
├─ codegen (LLM call → BuildOperation[])
│ │
│ └─ AUDIT_MARKER_ONLY guard: refuse to pass verify if codegen failed
│
├─ apply mutations (10 types: create_file, edit_file, copy_file, ...)
│
├─ verification (test/lint/typecheck plan, discovered or explicit)
│
├─ Docker isolation (256MB, no network, image whitelist)
│
├─ acceptance evaluation (auto + verify + review)
│
├─ artifacts persisted via BuildStorage (WAL SQLite)
│
└─ delivery package (preview → approve → handed off)
Full detail: Build pipeline.
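The Docker isolation step can be pictured as building a locked-down `docker run` command line. The sketch below is an assumption about shape only: `docker_argv`, `ALLOWED_IMAGES`, and the mount layout are hypothetical, while the flags themselves (`--network none`, `--memory`, `--read-only`, `--tmpfs`) are standard Docker CLI options.

```python
# Hypothetical whitelist; the real list lives in the build configuration.
ALLOWED_IMAGES = frozenset({"python:3.11-slim"})


def docker_argv(image: str, command: list[str], workspace: str) -> list[str]:
    """Build a `docker run` argv for one isolated verification step."""
    if image not in ALLOWED_IMAGES:
        # Fail closed: unknown images are refused before Docker is invoked.
        raise PermissionError(f"image not whitelisted: {image}")
    return [
        "docker", "run", "--rm",
        "--network", "none",                        # no network access
        "--memory", "256m",                         # memory limit
        "--read-only",                              # read-only root filesystem
        "--tmpfs", "/tmp",                          # scratch space for tools
        "--volume", f"{workspace}:/workspace:ro",   # workspace mounted read-only
        "--workdir", "/workspace",
        image,
        *command,
    ]
```

The returned list can be passed to `subprocess.run` without a shell, which keeps workspace paths and commands from being reinterpreted.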
operator → /review or /intake
│
ReviewIntake (validated)
│
ReviewService.run_review() →
│
├─ repo audit (RepoStructureAnalyzer + SecurityAnalyzer)
├─ pr_review (DiffAnalyzer + security pass on changed files)
└─ release_review (audit + release-specific checks)
│
verifier → false-positive reduction
│
ReviewReport (verdict, findings, executive summary, open questions, assumptions)
│
artifacts (markdown report, finding list JSON, reviewer handoff pack)
│
evidence_export (internal or client_safe; redacts paths/secrets dynamically)
│
delivery package
Full detail: Review pipeline.
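As a rough illustration of the two export modes, a redactor could leave internal exports untouched and scrub client-facing ones. Everything here is hypothetical (`export_finding`, the regular expressions, the placeholder strings); the real rules in agent/review/ are likely broader.

```python
import re

# Illustrative patterns only; a production redactor needs a wider rule set.
_SECRET = re.compile(r"(?i)\b(api[_-]?key|token|password)\s*[=:]\s*\S+")
# Absolute POSIX paths; the lookbehind avoids matching inside URLs.
_ABS_PATH = re.compile(r"(?<![\w.:/])/(?:[\w.-]+/)+[\w.-]+")


def export_finding(text: str, mode: str) -> str:
    """Return finding text suitable for the given export mode."""
    if mode == "internal":
        return text
    if mode != "client_safe":
        raise ValueError(f"unknown export mode: {mode}")
    # Scrub secrets first, then filesystem paths.
    text = _SECRET.sub(lambda m: f"{m.group(1)}=[REDACTED]", text)
    return _ABS_PATH.sub("[PATH]", text)
```

Rejecting unknown modes, rather than defaulting to one, keeps a typo from silently producing an unredacted client export.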
intake (/intake)
│
qualify_operator_intake → preview_operator_intake → submit_operator_intake
│
policy.evaluate_runtime_action (deterministic, deny-by-default)
│
budget check (hard cap / stop-loss / approval cap)
│
├─ approved → product job (build or review or analysis)
└─ blocked → structured denial with category + reason
│
on completion:
- control_plane.record_trace (RELEASE | BUILD | REVIEW | DELIVERY)
- cost ledger entry
- operator inbox surface
- settlement attention if 402 was triggered
Policies live in agent/control/policy.py. Intake routing in agent/control/intake.py. Trace + cost storage in agent/control/state.py.
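A deny-by-default gate followed by a budget check might look like the sketch below. The names (`evaluate`, `Decision`, `ALLOWED_ACTIONS`) and the cap semantics are assumptions made for illustration; the stop-loss check is omitted for brevity, and the real logic lives in agent/control/policy.py.

```python
from dataclasses import dataclass

# Hypothetical allow-list: anything not named here is denied.
ALLOWED_ACTIONS = frozenset({"build", "review", "analysis"})


@dataclass(frozen=True)
class Decision:
    allowed: bool
    category: str
    reason: str


def evaluate(action: str, estimated_cost: float, spent: float,
             hard_cap: float, approval_cap: float) -> Decision:
    """Deterministic policy check: no LLM, no I/O, same input gives same output."""
    if action not in ALLOWED_ACTIONS:
        return Decision(False, "policy", f"unknown action: {action}")
    if spent + estimated_cost > hard_cap:
        return Decision(False, "budget", "hard cap exceeded")
    if estimated_cost > approval_cap:
        return Decision(False, "approval", "cost requires operator approval")
    return Decision(True, "ok", "within policy and budget")
```

Every denial carries a category and a reason, which is what allows a blocked job to surface as a structured denial instead of a bare failure.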
Every vault write is one atomic operation:
set_secret(name, value)
│
_load() → fail-fast on InvalidToken (VaultDecryptionError)
│
secrets[name] = value
│
_save(secrets):
│
├─ token = self._fernet.encrypt(orjson.dumps(secrets))
│
├─ v2_blob = b"ALSv2\n" + self._current_salt + token
│
└─ _atomic_write(secrets_file, v2_blob):
│
├─ open secrets.enc.tmp with O_CREAT|O_WRONLY|O_TRUNC mode 0600
│
├─ os.write all bytes
│
├─ os.fsync(fd) ← contents durable
│
├─ os.close(fd)
│
├─ os.replace(tmp, secrets_file) ← POSIX atomic rename
│
└─ os.fsync(parent_dir) ← rename durable
A SIGKILL between any two of these steps leaves the vault in exactly one of two states: the previous good blob, or the new good blob. Never a partial / mismatched mix. Full spec: Vault.
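The write sequence above maps directly onto standard-library calls. The following is a minimal POSIX-only sketch of that sequence, not the vault's own `_atomic_write`; the function name and the `.tmp` suffix handling are illustrative.

```python
import os
import tempfile


def atomic_write(path: str, data: bytes) -> None:
    """Replace `path` with `data` so readers see the old or the new file, never a mix."""
    tmp = path + ".tmp"
    fd = os.open(tmp, os.O_CREAT | os.O_WRONLY | os.O_TRUNC, 0o600)
    try:
        view = memoryview(data)
        while view:
            # os.write may write fewer bytes than requested; loop until done.
            written = os.write(fd, view)
            view = view[written:]
        os.fsync(fd)  # file contents are durable before the rename
    finally:
        os.close(fd)
    os.replace(tmp, path)  # atomic rename on POSIX filesystems
    dir_fd = os.open(os.path.dirname(os.path.abspath(path)), os.O_RDONLY)
    try:
        os.fsync(dir_fd)  # the rename itself is durable
    finally:
        os.close(dir_fd)


if __name__ == "__main__":
    target = os.path.join(tempfile.mkdtemp(), "secrets.enc")
    atomic_write(target, b"ALSv2\n" + b"demo-bytes")
    print(os.path.getsize(target))
```

Syncing the parent directory matters because `os.replace` only updates the directory entry; without it, a power loss could roll the rename back even though the new file's bytes were already on disk.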
structlog event
│
processors: add_log_level + TimeStamper + StackInfoRenderer + format_exc_info + JSONRenderer
│
stdlib LoggerFactory (BoundLogger)
│
root logger handlers: _TierRouter
│
├─ resolve_tier(level, event) → "long" or "short"
│
├─ long → TimedRotatingFileHandler (daily, agent-long.log)
│
└─ short → TimedRotatingFileHandler (hourly, agent-short.log)
│
cron loop (hourly):
│
└─ LogRetentionManager.prune_all()
│
├─ long files older than AGENT_LOG_LONG_RETENTION_HOURS → delete
└─ short files older than AGENT_LOG_SHORT_RETENTION_HOURS → delete
Full spec: Tiered logging.
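A tier router can be written as an ordinary `logging.Handler` that forwards each record to one of two downstream handlers. The sketch below assumes a made-up routing rule (warnings and `audit.` events are long-lived); the real `resolve_tier` rules are in the Tiered logging spec, and `ListHandler` exists only to make the demo observable.

```python
import logging


def resolve_tier(level: int, event: str) -> str:
    """Hypothetical rule: warnings and above, plus audit events, go to 'long'."""
    if level >= logging.WARNING or event.startswith("audit."):
        return "long"
    return "short"


class TierRouter(logging.Handler):
    """Forwards each record to exactly one of two downstream handlers."""

    def __init__(self, long_handler: logging.Handler,
                 short_handler: logging.Handler) -> None:
        super().__init__()
        self._targets = {"long": long_handler, "short": short_handler}

    def emit(self, record: logging.LogRecord) -> None:
        tier = resolve_tier(record.levelno, record.getMessage())
        self._targets[tier].handle(record)


class ListHandler(logging.Handler):
    """Demo stand-in for a file handler: collects messages in memory."""

    def __init__(self) -> None:
        super().__init__()
        self.messages: list[str] = []

    def emit(self, record: logging.LogRecord) -> None:
        self.messages.append(record.getMessage())
```

In a real setup the two targets would be `logging.handlers.TimedRotatingFileHandler` instances, for example with `when="midnight"` for the long tier and `when="H"` for the short tier.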
| Layer | Choice | Why |
|---|---|---|
| Language | Python 3.11+ | Async first-class, structural pattern matching, mature crypto |
| LLM | Provider-agnostic (Claude CLI, Anthropic API, OpenAI-compatible API) | No lock-in. Operator picks per session. |
| Database | SQLite (aiosqlite + sqlite3 with WAL mode) | Single file per concern, no separate server, durable, fast enough |
| Serialization | orjson | 5–10× faster than stdlib json, strict UTF-8 |
| Validation | Pydantic v2 + jsonschema | Pydantic for runtime models, jsonschema for LLM-output validation |
| Logging | structlog (JSON via stdlib) | Structured events, tier-routable, secret-redactable |
| Encryption | cryptography (Fernet AES-128-CBC + HMAC-SHA256, PBKDF2 480K iterations) | Audited primitives, no DIY crypto |
| Sandbox | Docker (read-only, no-network, resource limits, image whitelist) | Real isolation, well-understood blast radius |
| Embeddings | sentence-transformers (paraphrase-multilingual-MiniLM-L12-v2) | Local, no API, multilingual (EN + SK) |
| HTTP | aiohttp (server + client) | One library for both sides, async-native |
| Scheduling | Plain asyncio loops with await asyncio.sleep | No APScheduler footgun, deterministic, observable |
| Process supervision | psutil | Cross-platform, battle-tested |
| Type checking | mypy strict on the whole agent/ tree | Catch wiring bugs at CI time |
| Lint / format | ruff | Fast, opinionated, replaces flake8 + isort |
We deliberately avoid: APScheduler, Celery, Redis, RabbitMQ, Kubernetes, vendor SDKs that pull in 50+ transitive deps. The whole agent fits in pip install -e . with a tiny pyproject.toml.
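The "plain asyncio loop" approach to scheduling is small enough to show in full. This is a sketch under assumptions: `run_every` and its `max_runs` parameter are hypothetical (the bound exists so the loop can be exercised in tests), and the real cron loops live in agent/core/.

```python
import asyncio
import logging
from typing import Awaitable, Callable, Optional

log = logging.getLogger("cron_demo")


async def run_every(interval_s: float, job: Callable[[], Awaitable[None]],
                    max_runs: Optional[int] = None) -> int:
    """Run `job`, sleep, repeat. Returns the number of runs when bounded."""
    runs = 0
    while max_runs is None or runs < max_runs:
        try:
            await job()
        except Exception:
            # One failed run must not kill the loop; cancellation still propagates
            # because asyncio.CancelledError is not an Exception subclass.
            log.exception("job failed; loop continues")
        runs += 1
        await asyncio.sleep(interval_s)
    return runs
```

A loop like this is started with `asyncio.create_task(run_every(3600, prune_logs))` and stopped by cancelling the task, so there is no scheduler state to inspect beyond the task itself (`prune_logs` is a placeholder name).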
<AGENT_DATA_DIR>/ ← .agent_runtime/ by default
├── memory/
│ ├── memories.db ← 4-type memory store + provenance
│ ├── conversations.db ← persistent conversation context
│ └── rag/ ← embedding index cache
├── tasks/
│ └── tasks.db
├── finance/
│ └── finance.db ← propose/approve/complete + budget snapshots
├── projects/
│ └── projects.db
├── workspaces/
│ ├── <workspace_id>/ ← per-job isolated workspace
│ └── workspaces.db ← audit trail with hash chain
├── build/
│ └── builds.db ← jobs + artifacts (WAL mode)
├── review/
│ └── reviews.db ← jobs + artifacts (WAL mode)
├── control/
│ ├── control.db ← plans, traces, cost ledger, settlement
│ └── llm_runtime.json ← operator runtime LLM override
├── approval/
│ └── approvals.db ← multi-step approval queue
├── identity/
│ └── owner_profile.json ← agent + owner identity (post-onboarding)
└── logs/ ← AGENT_LOG_DIR (default: <data_dir>/logs)
    ├── long/
    │   └── agent-long.log[.YYYY-MM-DD]
    └── short/
        └── agent-short.log[.YYYY-MM-DD-HH]
<AGENT_PROJECT_ROOT>/agent/vault/
└── secrets.enc ← v2 single-file (header + salt + Fernet token)
AGENT_DATA_DIR defaults to .agent_runtime/ for fresh installs and agent/ for legacy installs (so existing operators don't have data move under their feet). The vault deliberately stays in the project tree because it's the only file that's both encrypted and required at boot.
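The resolution order (explicit environment variable, then legacy detection, then the fresh-install default) could be sketched as below. `resolve_data_dir` and the legacy marker it checks are assumptions; the document does not say how a legacy install is detected.

```python
import tempfile
from pathlib import Path
from typing import Mapping


def resolve_data_dir(project_root: Path, env: Mapping[str, str]) -> Path:
    """Pick the data directory: env override > legacy layout > .agent_runtime."""
    explicit = env.get("AGENT_DATA_DIR")
    if explicit:
        return Path(explicit)
    legacy = project_root / "agent"
    # Hypothetical legacy marker: a memory database already living under agent/.
    if (legacy / "memory" / "memories.db").exists():
        return legacy
    return project_root / ".agent_runtime"


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        print(resolve_data_dir(Path(root), {}))
```

Passing the environment in as a mapping (for example `os.environ`) instead of reading it inside the function keeps the resolution order testable without touching process state.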
python -m agent
│
1. load .env (operator-managed, gitignored)
2. resolve data_dir (env > legacy detection > .agent_runtime)
3. pin AGENT_DATA_DIR + AGENT_LOG_DIR + AGENT_PIDFILE_PATH into env
4. setup_tiered_logging() — installs _TierRouter on root logger,
switches structlog to stdlib BoundLogger
5. _check_pidfile() — refuse to start if another instance is running
6. AgentOrchestrator(data_dir).initialize()
│
├─ memory store (open SQLite, replay WAL)
├─ task manager
├─ finance tracker (asyncio.Lock per tx)
├─ project manager
├─ workspace manager (recover orphaned workspaces from SQLite)
├─ build storage (WAL mode)
├─ review storage
├─ approval queue
├─ control plane state (plans/traces/cost/settlement)
├─ runtime model + LLM runtime control
├─ gateway (provider routes)
├─ build service + review service
├─ intake router
├─ recurring workflows + pipeline orchestrator
├─ settlement service
├─ reporting service
├─ vault (open secrets.enc, migrate v1→v2 if needed)
├─ message router
├─ watchdog
├─ job runner (12 cron jobs registered)
├─ tool executor (with operator controls)
├─ agent brain (wires tool executor)
├─ telegram bot + handler
└─ HTTP API + dashboard
│
7. signal handlers (SIGINT, SIGTERM)
8. enter run loop
If any step fails, the process exits with a clear error message and a non-zero exit code. There is no silent degradation.
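Step 5 of the boot sequence, the pidfile guard, is a good example of failing loudly. The sketch below is POSIX-only and uses `os.kill(pid, 0)` as the liveness probe, which is a substitution made for a dependency-free example; the function names are hypothetical and the real `_check_pidfile` may rely on psutil instead.

```python
import os
import sys
import tempfile
from pathlib import Path
from typing import Optional


def pid_is_alive(pid: int) -> bool:
    """POSIX liveness probe: signal 0 checks existence without delivering anything."""
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def check_pidfile(path: Path) -> None:
    """Exit with a clear message if another live instance owns the pidfile."""
    if path.exists():
        old_pid: Optional[int]
        try:
            old_pid = int(path.read_text().strip())
        except ValueError:
            old_pid = None  # corrupt pidfile: treat it as stale
        if old_pid is not None and old_pid != os.getpid() and pid_is_alive(old_pid):
            # sys.exit with a string prints it to stderr and exits with code 1.
            sys.exit(f"another instance is running (pid {old_pid})")
    path.write_text(str(os.getpid()))


if __name__ == "__main__":
    check_pidfile(Path(tempfile.gettempdir()) / "agent-demo.pid")
    print("started")
```

A stale pidfile left behind by a crash is overwritten, while a live owner stops the boot before any database is opened.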
| Principle | Enforced by |
|---|---|
| Anti-stochastika: LLM only when no cheaper layer answered | test_brain_core.py, test_routing_eval.py |
| Deny-by-default: unknown tools and unknown channels are blocked | test_tool_governance.py, test_policy_regression.py, test_security_invariants.py |
| Fail-fast: wrong vault key, missing config, corrupt state surface immediately | test_vault.py::TestVaultWrongKeyWriteFailFast, test_security_audit.py |
| Human-in-the-loop for money + host access + external writes | test_finance_approval.py, test_multi_step_approval.py, test_approval_queue.py |
| Persistent state: SQLite everywhere, survives crashes | test_workspace_recovery.py, test_persistent_conversation.py, test_control_plane.py |
| Crash-safe vault writes: single atomic os.replace per write | test_vault.py::TestVaultV2Format, TestVaultV2MigrationCrashSafety |
| Explainability: every decision recorded | test_explanation.py, test_action_envelope.py |
| Sovereign by default: no telemetry leaks | test_security_audit.py::TestNoSecretsInLogs, gateway audit |
- Adding a feature? Start at Modules to find the right home for the code, then Testing for the test pyramid.
- Operating it? Deployment for first install, then Operator Handbook for daily ops.
- Debugging it? Troubleshooting for common issues, Tiered logging for where the logs live.
- Reviewing security? Security for the model, Vault for crypto details.
v1.35.0 · Latest Release