Architecture

This page describes how Agent Life Space is structured at runtime, what the data flow looks like, and why each layer exists. It is the canonical reference — the per-module details are on the Modules page, the security boundaries are on Security, and the on-disk formats are on Vault and Logging.

System overview

                                  ┌──────────────────────────────────┐
                                  │       AgentOrchestrator          │
                                  │   (lifecycle + dependency wire)  │
                                  └────────────┬─────────────────────┘
                                               │
        ┌──────────────┬──────────────────────┼──────────────────────┬───────────────┐
        │              │                      │                      │               │
   ┌────▼────┐   ┌─────▼────┐         ┌───────▼────────┐      ┌──────▼──────┐  ┌─────▼─────┐
   │ Brain   │   │ Memory   │         │ Tasks / Work   │      │ Build       │  │ Review    │
   │ pipeline│   │ store +  │         │ queue + proj   │      │ pipeline    │  │ pipeline  │
   │ (9 layer│   │ RAG +    │         │                │      │ (codegen+   │  │ (audit +  │
   │  cascade│   │ persist  │         │                │      │  Docker)    │  │  PR + rls)│
   └────┬────┘   └────┬─────┘         └────────┬───────┘      └──────┬──────┘  └─────┬─────┘
        │             │                        │                     │               │
        └─────────────┴────────┬───────────────┴──────┬──────────────┴───────────────┘
                               │                      │
                       ┌───────▼──────┐       ┌───────▼──────────┐
                       │ Control Plane│       │  Governance      │
                       │  ─ policy    │       │  ─ ToolPolicy    │
                       │  ─ intake    │       │  ─ ApprovalQueue │
                       │  ─ gateway   │       │  ─ OperatorCtrl  │
                       │  ─ state     │       │  ─ StatusModel   │
                       │  ─ reporting │       │  ─ ExplanationLog│
                       └───────┬──────┘       └──────────────────┘
                               │
              ┌────────────────┼─────────────────┐
              │                │                 │
       ┌──────▼─────┐   ┌──────▼─────┐    ┌──────▼─────┐
       │ Vault (v2) │   │  Logs      │    │  Finance   │
       │  AES-128   │   │  tiered    │    │  budget +  │
       │  +HMAC-256 │   │  long+short│    │  approval  │
       └────────────┘   └────────────┘    └────────────┘

The orchestrator (agent/core/agent.py::AgentOrchestrator) is the only place that wires modules together. Everything else holds references that were injected at construction time, never reaches across boundaries with getattr hacks, and never imports the orchestrator back. This is enforced by architecture invariant tests.
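One way an architecture invariant test of this kind could be written is to parse each module's source and flag any import of the orchestrator. The sketch below is illustrative only: the function name `imports_orchestrator`, the dotted path `agent.core.agent` (inferred from the file path above), and the `agent.memory.store` example are assumptions, not the project's actual test code.

```python
import ast

# Assumed dotted path of the orchestrator module (inferred from agent/core/agent.py).
ORCHESTRATOR_MODULE = "agent.core.agent"


def imports_orchestrator(source: str) -> bool:
    """Return True if the given Python source imports the orchestrator module."""
    tree = ast.parse(source)
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            if any(alias.name == ORCHESTRATOR_MODULE for alias in node.names):
                return True
        elif isinstance(node, ast.ImportFrom):
            # level == 0 means an absolute import; relative imports are not resolved here.
            if node.level == 0 and node.module == ORCHESTRATOR_MODULE:
                return True
    return False


# Hypothetical module sources used only to exercise the check.
ok_source = "from agent.memory.store import MemoryStore\n"
bad_source = "from agent.core.agent import AgentOrchestrator\n"

print(imports_orchestrator(ok_source))   # module keeps the boundary
print(imports_orchestrator(bad_source))  # module imports the orchestrator back
```

A real test would presumably walk every file under agent/ except the orchestrator itself and assert that the check returns False for each.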

The 9-layer brain pipeline

The brain (agent/core/brain.py::AgentBrain) is channel-agnostic. It takes an IncomingMessage and returns a string. Every message goes through the same nine layers, in order. Layers 1 to 4 may return early; layers 5 to 9, including the 5.5 runtime-facts step, always run together.

process(IncomingMessage)
  │
  ├─ try / finally — status always resets to IDLE on exit
  │
  └─ _process_inner(message)
        │
        Layer 1   Multi-task detection → work queue
        ─────    Strict rules: explicit intent header (urob:, todo:, ...) OR
                  clean numbered list with no surrounding prose. Anti-echo guard
                  rejects pasted assistant text. → early return if multi-task
        │
        Layer 2   Internal dispatcher (deterministic, no LLM)
        ─────    status / health / tasks / budget / identity / skills.
                  → early return if handled
        │
        Layer 3   Semantic cache lookup
        ─────    sentence-transformers similarity ≥ 0.90 → early return on hit
        │
        Layer 4   RAG retrieval
        ─────    knowledge base embedding search.
                  "direct" → early return.  "augment" → context injected into prompt.
        │
        Layer 5   Task classification + model selection
        ─────    classify_task() → tier (FAST/BALANCED/POWERFUL) → model.
                  Learning-based override (adapt_model). Channel enforcement.
                  Telegram + CLI + sandbox-only deny guard (fail-closed).
        │
        Layer 5.5 Runtime facts injection (anti-confabulation)
        ─────    Real CPU/RAM/uptime/budget injected so the model has verified
                  ground truth even when it can't call agent tools.
        │
        Layer 6   LLM call via provider abstraction
        ─────    API backend → ToolUseLoop (multi-turn function calling).
                  CLI backend → direct generate (with channel-enforced file access).
        │
        Layer 7   Post-routing quality escalation
        ─────    assess_quality(). If response is generic and budget allows,
                  re-run with stronger model. Skipped for tool-loop responses
                  to preserve tool context.
        │
        Layer 8   Learning feedback + skill auto-update
        ─────    process_outcome(model, task, reply) →
                  confidence adjustment, prompt augmentation hints, skill discovery.
        │
        Layer 9   Channel policy filter + explanation log
        ─────    classify_response() → can_send_response().
                  ExplanationLog records routing signals, policy decisions,
                  learning context, memory provenance breakdown.
        │
        return reply

Key invariant: the LLM is the most expensive layer, so every cheaper layer that can answer must run first. Most messages never reach layer 6 because the dispatcher, the cache, or RAG has already handled them.
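The early-return behaviour behind this invariant can be illustrated with a small cascade. This is a minimal sketch, not the code in brain.py: `run_cascade`, `dispatcher`, `cache`, and `fake_llm` are hypothetical stand-ins, and the cache uses exact matching rather than embeddings.

```python
from typing import Callable, List, Optional

# Each cheap layer returns a reply string, or None to pass the message on.
Layer = Callable[[str], Optional[str]]


def run_cascade(message: str, cheap_layers: List[Layer], llm: Callable[[str], str]) -> str:
    """Try cheap layers in order; call the LLM only if none of them answered."""
    for layer in cheap_layers:
        reply = layer(message)
        if reply is not None:
            return reply  # early return, the LLM is never invoked
    return llm(message)


def dispatcher(message: str) -> Optional[str]:
    # Stand-in for the deterministic internal dispatcher (layer 2).
    return "status: IDLE" if message.strip().lower() == "status" else None


def cache(message: str) -> Optional[str]:
    # Stand-in for the semantic cache (layer 3); exact match instead of embeddings.
    return {"what is 2+2?": "4"}.get(message.lower())


llm_calls: List[str] = []


def fake_llm(message: str) -> str:
    # Records every call so the cost of reaching the LLM is visible.
    llm_calls.append(message)
    return f"LLM answer to: {message}"


layers = [dispatcher, cache]
print(run_cascade("status", layers, fake_llm))
print(run_cascade("What is 2+2?", layers, fake_llm))
print(run_cascade("explain WAL", layers, fake_llm))
print(llm_calls)
```

Only the third message reaches the stand-in LLM; the first two are answered by cheaper layers.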

Module map

Path	Purpose	LOC (approx)
`agent/core/`	Orchestrator, brain pipeline, LLM provider, tool policy, approval, status, explanation, models, sandbox executor, cron loops, paths	~9,400
`agent/build/`	Build service, codegen, capabilities, verification, acceptance criteria, storage, models, Docker executor	~6,200
`agent/social/`	Telegram bot + handler, Agent HTTP API, dashboard, channel policy, request identity	~5,800
`agent/control/`	Policy, intake, gateway, state, reporting, evidence export, llm_runtime, settlement, recurring workflows, pipelines, storage	~5,400
`agent/brain/`	Internal dispatcher, semantic router, programmer, learning, decision engine, tool router, skills, knowledge	~3,200
`agent/review/`	Review service, analyzers, verifier, redaction, quality, storage, models	~2,900
`agent/memory/`	4-type store + provenance, persistent conversation, RAG, semantic cache, consolidation, inspection	~2,400
`agent/finance/`	Tracker, budget policy, risk templates, approval flow, settlement requests	~1,300
`agent/logs/`	Structured logging, secret redaction, tiered routing, retention manager	~620
`agent/tasks/`	Task lifecycle (CRUD, priority queue)	~410
`agent/work/`	SQLite-backed workspaces, audit trail, recovery, hash chain	~470
`agent/projects/`	Project scoping	~330
`agent/vault/`	Encrypted secrets (v2 single-file format)	~470
`operator/`	TypeScript contracts for the operator dashboard	(separate package)

Total Python in agent/: ~70,000 LOC across 112 source files. Full per-file inventory: Modules.

Data flow

Brain pipeline

This is the 9-layer cascade described above. See agent/core/brain.py::AgentBrain.process for the entry point and _process_inner for the body. Each layer is unit-tested in tests/test_brain_core.py.
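The semantic cache in layer 3 returns early when similarity reaches 0.90. A minimal sketch of such a threshold lookup follows, assuming cosine similarity over plain Python lists. The real cache uses sentence-transformers embeddings; the similarity metric, the `cache_lookup` name, and the entry layout here are assumptions for illustration.

```python
import math
from typing import List, Optional, Sequence, Tuple

SIMILARITY_THRESHOLD = 0.90  # threshold quoted for the semantic cache


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def cache_lookup(query: Sequence[float],
                 entries: List[Tuple[Sequence[float], str]]) -> Optional[str]:
    """Return the cached reply with the highest similarity at or above the threshold."""
    best_reply: Optional[str] = None
    best_score = -1.0
    for vector, reply in entries:
        score = cosine_similarity(query, vector)
        if score >= SIMILARITY_THRESHOLD and score > best_score:
            best_reply, best_score = reply, score
    return best_reply


# Toy 2-dimensional "embeddings"; real embeddings have hundreds of dimensions.
entries = [([1.0, 0.0], "cached answer A"), ([0.0, 1.0], "cached answer B")]
print(cache_lookup([0.99, 0.05], entries))  # close to A, so a hit
print(cache_lookup([1.0, 1.0], entries))    # equally far from both, so a miss
```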

Build pipeline

operator → /build or /intake (telegram or HTTP)
  │
intake.qualify → plan → submit
  │
BuildService.run_build()
  │
  ├─ workspace setup (isolated, hash-chained audit trail)
  │
  ├─ codegen (LLM call → BuildOperation[])
  │     │
  │     └─ AUDIT_MARKER_ONLY guard: refuse to pass verify if codegen failed
  │
  ├─ apply mutations (10 types: create_file, edit_file, copy_file, ...)
  │
  ├─ verification (test/lint/typecheck plan, discovered or explicit)
  │
  ├─ Docker isolation (256MB, no network, image whitelist)
  │
  ├─ acceptance evaluation (auto + verify + review)
  │
  ├─ artifacts persisted via BuildStorage (WAL SQLite)
  │
  └─ delivery package (preview → approve → handed off)

Full detail: Build pipeline.
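The workspace audit trail is described as hash-chained. The general technique can be sketched as follows: each entry stores the hash of the previous entry, so editing any earlier record invalidates everything after it. The record layout, the `GENESIS_HASH` sentinel, and the function names are assumptions; the project's actual on-disk format is not documented on this page.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64  # assumed sentinel used as "previous hash" of the first entry


def _digest(prev_hash: str, event: dict) -> str:
    """SHA-256 over the previous hash plus the canonical JSON form of the event."""
    payload = json.dumps({"prev": prev_hash, "event": event}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def append_entry(chain: list, event: dict) -> dict:
    """Append an event, linking it to the hash of the current last entry."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS_HASH
    entry = {"prev": prev_hash, "event": event, "hash": _digest(prev_hash, event)}
    chain.append(entry)
    return entry


def verify_chain(chain: list) -> bool:
    """Recompute every link; any edited or reordered entry breaks verification."""
    prev_hash = GENESIS_HASH
    for entry in chain:
        if entry["prev"] != prev_hash or entry["hash"] != _digest(prev_hash, entry["event"]):
            return False
        prev_hash = entry["hash"]
    return True


trail: list = []
append_entry(trail, {"op": "create_file", "path": "app.py"})
append_entry(trail, {"op": "edit_file", "path": "app.py"})
print(verify_chain(trail))           # intact chain verifies
trail[0]["event"]["path"] = "evil.py"
print(verify_chain(trail))           # tampering with an early entry is detected
```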

Review pipeline

operator → /review or /intake
  │
ReviewIntake (validated)
  │
ReviewService.run_review() →
  │
  ├─ repo audit (RepoStructureAnalyzer + SecurityAnalyzer)
  ├─ pr_review  (DiffAnalyzer + security pass on changed files)
  └─ release_review (audit + release-specific checks)
  │
verifier → false-positive reduction
  │
ReviewReport (verdict, findings, executive summary, open questions, assumptions)
  │
artifacts (markdown report, finding list JSON, reviewer handoff pack)
  │
evidence_export (internal or client_safe; redacts paths/secrets dynamically)
  │
delivery package

Full detail: Review pipeline.

Control plane

intake (/intake)
  │
qualify_operator_intake → preview_operator_intake → submit_operator_intake
  │
policy.evaluate_runtime_action (deterministic, deny-by-default)
  │
budget check (hard cap / stop-loss / approval cap)
  │
  ├─ approved → product job (build or review or analysis)
  └─ blocked  → structured denial with category + reason
  │
on completion:
  - control_plane.record_trace (RELEASE | BUILD | REVIEW | DELIVERY)
  - cost ledger entry
  - operator inbox surface
  - settlement attention if 402 was triggered

Policies live in agent/control/policy.py. Intake routing in agent/control/intake.py. Trace + cost storage in agent/control/state.py.
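The deny-by-default evaluation followed by a budget check can be sketched as below. This is a simplified illustration, not the logic in agent/control/policy.py: the `Decision` type, the `evaluate` signature, and the category strings are hypothetical, and the stop-loss check is omitted for brevity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Decision:
    allowed: bool
    category: str  # hypothetical category labels: "ok", "policy", "budget", "approval"
    reason: str


# Only explicitly listed actions pass; everything else is denied by default.
ALLOWED_ACTIONS = frozenset({"build", "review", "analysis"})


def evaluate(action: str, cost: float, spent: float,
             hard_cap: float, approval_cap: float) -> Decision:
    """Deterministic check: policy first, then hard cap, then approval cap."""
    if action not in ALLOWED_ACTIONS:
        return Decision(False, "policy", f"unknown action: {action}")
    if spent + cost > hard_cap:
        return Decision(False, "budget", "hard cap would be exceeded")
    if cost > approval_cap:
        return Decision(False, "approval", "cost above approval cap, operator approval required")
    return Decision(True, "ok", "within policy and budget")


print(evaluate("build", cost=2.0, spent=5.0, hard_cap=20.0, approval_cap=5.0))
print(evaluate("deploy", cost=0.0, spent=0.0, hard_cap=20.0, approval_cap=5.0))
print(evaluate("review", cost=8.0, spent=5.0, hard_cap=20.0, approval_cap=5.0))
```

Because the function has no randomness and no I/O, the same inputs always yield the same structured denial, which is what makes policy regressions testable.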

Vault writes

Every vault write is one atomic operation:

set_secret(name, value)
  │
_load() → fail-fast on InvalidToken (VaultDecryptionError)
  │
secrets[name] = value
  │
_save(secrets):
  │
  ├─ token = self._fernet.encrypt(orjson.dumps(secrets))
  │
  ├─ v2_blob = b"ALSv2\n" + self._current_salt + token
  │
  └─ _atomic_write(secrets_file, v2_blob):
        │
        ├─ open secrets.enc.tmp with O_CREAT|O_WRONLY|O_TRUNC mode 0600
        │
        ├─ os.write all bytes
        │
        ├─ os.fsync(fd)        ← contents durable
        │
        ├─ os.close(fd)
        │
        ├─ os.replace(tmp, secrets_file)   ← POSIX atomic rename
        │
        └─ os.fsync(parent_dir)            ← rename durable

A SIGKILL between any two of these steps leaves the vault in exactly one of two states: the previous good blob or the new good blob, never a partial or mismatched mix. Full spec: Vault.
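The write sequence above maps closely onto standard library calls. The following is a minimal POSIX-oriented sketch of the same steps, assuming a free function named `atomic_write` and placeholder payload bytes instead of a real Fernet token; it is not the vault's own `_atomic_write`.

```python
import os
import tempfile


def atomic_write(path: str, data: bytes) -> None:
    """Write data to path so readers see either the old file or the new one."""
    tmp_path = path + ".tmp"
    fd = os.open(tmp_path, os.O_CREAT | os.O_WRONLY | os.O_TRUNC, 0o600)
    try:
        view = memoryview(data)
        while len(view) > 0:
            written = os.write(fd, view)  # os.write may write fewer bytes than asked
            view = view[written:]
        os.fsync(fd)                      # file contents are durable
    finally:
        os.close(fd)
    os.replace(tmp_path, path)            # atomic rename over the target
    # fsync the parent directory so the rename itself survives a crash (POSIX).
    dir_fd = os.open(os.path.dirname(os.path.abspath(path)), os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)


with tempfile.TemporaryDirectory() as workdir:
    target = os.path.join(workdir, "secrets.enc")
    atomic_write(target, b"ALSv2\n" + b"first")   # placeholder payloads
    atomic_write(target, b"ALSv2\n" + b"second")
    with open(target, "rb") as fh:
        final_blob = fh.read()
    leftover_tmp = os.path.exists(target + ".tmp")

print(final_blob, leftover_tmp)
```

The temporary file lives in the same directory as the target because a rename is only atomic within a single filesystem.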

Tiered logging

structlog event
  │
processors: add_log_level + TimeStamper + StackInfoRenderer + format_exc_info + JSONRenderer
  │
stdlib LoggerFactory (BoundLogger)
  │
root logger handlers: _TierRouter
  │
  ├─ resolve_tier(level, event)  → "long" or "short"
  │
  ├─ long  → TimedRotatingFileHandler (daily, agent-long.log)
  │
  └─ short → TimedRotatingFileHandler (hourly, agent-short.log)
  │
cron loop (hourly):
  │
  └─ LogRetentionManager.prune_all()
        │
        ├─ long  files older than AGENT_LOG_LONG_RETENTION_HOURS  → delete
        └─ short files older than AGENT_LOG_SHORT_RETENTION_HOURS → delete

Full spec: Tiered logging.
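The routing step can be sketched with a standard library handler that forwards each record to one of two downstream handlers. The tier rule used here (WARNING and above go to the long tier) is an assumption made for the example; the real `resolve_tier` also inspects the event and its rules are defined in the Tiered logging spec. `TierRouter` and `ListHandler` are hypothetical names.

```python
import logging


def resolve_tier(level: int) -> str:
    # Assumed rule for illustration: warnings and errors are kept longer.
    return "long" if level >= logging.WARNING else "short"


class TierRouter(logging.Handler):
    """Forwards each record to the handler registered for its tier."""

    def __init__(self, long_handler: logging.Handler, short_handler: logging.Handler) -> None:
        super().__init__()
        self._targets = {"long": long_handler, "short": short_handler}

    def emit(self, record: logging.LogRecord) -> None:
        self._targets[resolve_tier(record.levelno)].handle(record)


class ListHandler(logging.Handler):
    """In-memory sink standing in for a TimedRotatingFileHandler."""

    def __init__(self) -> None:
        super().__init__()
        self.messages: list = []

    def emit(self, record: logging.LogRecord) -> None:
        self.messages.append(record.getMessage())


long_sink, short_sink = ListHandler(), ListHandler()
demo_logger = logging.getLogger("tier_demo")
demo_logger.setLevel(logging.DEBUG)
demo_logger.propagate = False  # keep the demo isolated from the root logger
demo_logger.addHandler(TierRouter(long_sink, short_sink))

demo_logger.info("heartbeat")
demo_logger.error("vault decrypt failed")
print(long_sink.messages, short_sink.messages)
```

In a deployment the two sinks would be rotating file handlers, which is what lets the retention manager prune each tier on its own schedule.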

Technology stack

Layer	Choice	Why
Language	Python 3.11+	Async first-class, structural pattern matching, mature crypto
LLM	Provider-agnostic (Claude CLI, Anthropic API, OpenAI-compatible API)	No lock-in. Operator picks per session.
Database	SQLite (aiosqlite + sqlite3 with WAL mode)	Single file per concern, no separate server, durable, fast enough
Serialization	`orjson`	5–10× faster than stdlib `json`, strict UTF-8
Validation	Pydantic v2 + `jsonschema`	Pydantic for runtime models, jsonschema for LLM-output validation
Logging	`structlog` (JSON via stdlib)	Structured events, tier-routable, secret-redactable
Encryption	`cryptography` (Fernet AES-128-CBC + HMAC-SHA256, PBKDF2 480K iterations)	Audited primitives, no DIY crypto
Sandbox	Docker (read-only, no-network, resource limits, image whitelist)	Real isolation, well-understood blast radius
Embeddings	`sentence-transformers` (paraphrase-multilingual-MiniLM-L12-v2)	Local, no API, multilingual (EN + SK)
HTTP	`aiohttp` (server + client)	One library for both sides, async-native
Scheduling	Plain `asyncio` loops with `await asyncio.sleep`	No APScheduler footgun, deterministic, observable
Process supervision	`psutil`	Cross-platform, battle-tested
Type checking	mypy strict on the whole `agent/` tree	Catch wiring bugs at CI time
Lint / format	`ruff`	Fast, opinionated, replaces flake8 + isort

We deliberately avoid: APScheduler, Celery, Redis, RabbitMQ, Kubernetes, and vendor SDKs that pull in 50+ transitive deps. The whole agent installs with pip install -e . from a tiny pyproject.toml.
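As an illustration of the scheduling choice in the table, a periodic job built on plain asyncio needs only a loop and a sleep. The sketch below is bounded by an `iterations` argument so that it terminates; a long-running cron loop would presumably use `while True` instead. The names `run_periodically` and `tick` are hypothetical.

```python
import asyncio
from typing import Awaitable, Callable, List


async def run_periodically(job: Callable[[], Awaitable[None]],
                           interval_s: float, iterations: int) -> None:
    """Run job, sleep, repeat. A failing job is reported and does not stop the loop."""
    for _ in range(iterations):
        try:
            await job()
        except Exception as exc:  # keep the loop alive; report the failure
            print(f"job failed: {exc!r}")
        await asyncio.sleep(interval_s)


ticks: List[int] = []


async def tick() -> None:
    # Stand-in for real work such as pruning old log files.
    ticks.append(len(ticks))


asyncio.run(run_periodically(tick, interval_s=0.01, iterations=3))
print(ticks)
```

The whole schedule is visible in one function, which is the "deterministic, observable" property the table refers to.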

Storage layout

<AGENT_DATA_DIR>/                  ← .agent_runtime/ by default
├── memory/
│   ├── memories.db                ← 4-type memory store + provenance
│   ├── conversations.db           ← persistent conversation context
│   └── rag/                       ← embedding index cache
├── tasks/
│   └── tasks.db
├── finance/
│   └── finance.db                 ← propose/approve/complete + budget snapshots
├── projects/
│   └── projects.db
├── workspaces/
│   ├── <workspace_id>/            ← per-job isolated workspace
│   └── workspaces.db              ← audit trail with hash chain
├── build/
│   └── builds.db                  ← jobs + artifacts (WAL mode)
├── review/
│   └── reviews.db                 ← jobs + artifacts (WAL mode)
├── control/
│   ├── control.db                 ← plans, traces, cost ledger, settlement
│   └── llm_runtime.json           ← operator runtime LLM override
├── approval/
│   └── approvals.db               ← multi-step approval queue
├── identity/
│   └── owner_profile.json         ← agent + owner identity (post-onboarding)
└── logs/                          ← AGENT_LOG_DIR (default: <data_dir>/logs)
    ├── long/
    │   └── agent-long.log[.YYYY-MM-DD]
    └── short/
        └── agent-short.log[.YYYY-MM-DD-HH]

<AGENT_PROJECT_ROOT>/agent/vault/
└── secrets.enc                    ← v2 single-file (header + salt + Fernet token)

AGENT_DATA_DIR defaults to .agent_runtime/ for fresh installs and agent/ for legacy installs (so existing operators don't have data move under their feet). The vault deliberately stays in the project tree because it's the only file that's both encrypted and required at boot.
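The resolution order (environment variable, then legacy detection, then the default) could look roughly like the sketch below. How a legacy install is detected is not specified here, so the marker file agent/memory/memories.db is an assumption, as is the `resolve_data_dir` name.

```python
import tempfile
from pathlib import Path
from typing import Mapping


def resolve_data_dir(project_root: Path, env: Mapping[str, str]) -> Path:
    """Resolve the data directory: env override > legacy layout > default."""
    override = env.get("AGENT_DATA_DIR")
    if override:
        return Path(override)
    # Assumed legacy marker: a pre-existing memory database under agent/.
    if (project_root / "agent" / "memory" / "memories.db").exists():
        return project_root / "agent"
    return project_root / ".agent_runtime"


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    fresh = resolve_data_dir(root, {}).name

    legacy_db = root / "agent" / "memory" / "memories.db"
    legacy_db.parent.mkdir(parents=True)
    legacy_db.touch()
    legacy = resolve_data_dir(root, {}).name

    overridden = resolve_data_dir(root, {"AGENT_DATA_DIR": "/srv/agent-data"})

print(fresh, legacy, overridden)
```

Passing the environment as an argument instead of reading os.environ directly keeps the function easy to test.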

Boot sequence

python -m agent
  │
1.  load .env (operator-managed, gitignored)
2.  resolve data_dir (env > legacy detection > .agent_runtime)
3.  pin AGENT_DATA_DIR + AGENT_LOG_DIR + AGENT_PIDFILE_PATH into env
4.  setup_tiered_logging() — installs _TierRouter on root logger,
     switches structlog to stdlib BoundLogger
5.  _check_pidfile() — refuse to start if another instance is running
6.  AgentOrchestrator(data_dir).initialize()
       │
       ├─ memory store (open SQLite, replay WAL)
       ├─ task manager
       ├─ finance tracker (asyncio.Lock per tx)
       ├─ project manager
       ├─ workspace manager (recover orphaned workspaces from SQLite)
       ├─ build storage (WAL mode)
       ├─ review storage
       ├─ approval queue
       ├─ control plane state (plans/traces/cost/settlement)
       ├─ runtime model + LLM runtime control
       ├─ gateway (provider routes)
       ├─ build service + review service
       ├─ intake router
       ├─ recurring workflows + pipeline orchestrator
       ├─ settlement service
       ├─ reporting service
       ├─ vault (open secrets.enc, migrate v1→v2 if needed)
       ├─ message router
       ├─ watchdog
       ├─ job runner (12 cron jobs registered)
       ├─ tool executor (with operator controls)
       ├─ agent brain (wires tool executor)
       ├─ telegram bot + handler
       └─ HTTP API + dashboard
  │
7.  signal handlers (SIGINT, SIGTERM)
8.  enter run loop

If any step fails, the process exits with a clear error message and a non-zero exit code. There is no silent degradation.
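Step 5 is a good example of this fail-fast behaviour. A minimal POSIX sketch of a pidfile check follows; it uses `os.kill(pid, 0)` from the standard library as the liveness probe, which is an assumption for the example (the project lists psutil for process supervision), and the `check_pidfile` and `pid_is_running` names are hypothetical.

```python
import os
import tempfile


def pid_is_running(pid: int) -> bool:
    """POSIX liveness probe: signal 0 checks existence without delivering a signal."""
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # process exists but belongs to another user
    return True


def check_pidfile(path: str) -> None:
    """Refuse to start if the pidfile names a live process; otherwise claim it."""
    if os.path.exists(path):
        with open(path, "r", encoding="utf-8") as fh:
            raw = fh.read().strip()
        old_pid = int(raw) if raw.isdigit() else None  # garbage is treated as stale
        if old_pid is not None and old_pid != os.getpid() and pid_is_running(old_pid):
            # SystemExit with a message prints it to stderr and exits with status 1.
            raise SystemExit(f"another instance is running (pid {old_pid})")
    with open(path, "w", encoding="utf-8") as fh:
        fh.write(str(os.getpid()))


with tempfile.TemporaryDirectory() as workdir:
    pidfile = os.path.join(workdir, "agent.pid")
    check_pidfile(pidfile)  # no pidfile yet, so this instance claims it
    with open(pidfile, "r", encoding="utf-8") as fh:
        claimed_pid = int(fh.read())

print(claimed_pid == os.getpid())
```

Raising instead of logging and continuing is what turns a second accidental start into an immediate, visible error.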

Design principles (and the tests that enforce them)

Principle	Enforced by
Anti-stochastika — LLM only when no cheaper layer answered	`test_brain_core.py`, `test_routing_eval.py`
Deny-by-default — unknown tools and unknown channels are blocked	`test_tool_governance.py`, `test_policy_regression.py`, `test_security_invariants.py`
Fail-fast — wrong vault key, missing config, corrupt state surface immediately	`test_vault.py::TestVaultWrongKeyWriteFailFast`, `test_security_audit.py`
Human-in-the-loop for money + host access + external writes	`test_finance_approval.py`, `test_multi_step_approval.py`, `test_approval_queue.py`
Persistent state — SQLite everywhere, survives crashes	`test_workspace_recovery.py`, `test_persistent_conversation.py`, `test_control_plane.py`
Crash-safe vault writes — single atomic `os.replace` per write	`test_vault.py::TestVaultV2Format`, `TestVaultV2MigrationCrashSafety`
Explainability — every decision recorded	`test_explanation.py`, `test_action_envelope.py`
Sovereign by default — no telemetry leaks	`test_security_audit.py::TestNoSecretsInLogs`, gateway audit

Where to read next

Adding a feature? Start at Modules to find the right home for the code, then Testing for the test pyramid.
Operating it? Deployment for first install, then Operator Handbook for daily ops.
Debugging it? Troubleshooting for common issues, Tiered logging for where the logs live.
Reviewing security? Security for the model, Vault for crypto details.

Repo · CHANGELOG · Releases · Issues · MIT License

Agent Life Space

v1.35.0 · Latest Release
