Operator Handbook

This is the daily-operation playbook. For first install see Deployment. For threat model see Security. For specific subsystems see Vault, Tiered logging, Runtime LLM Control, Build pipeline, Review pipeline.

Audience. You are the operator. You own the server, the vault key, and the .env file. The agent works for you and only for you.

Daily checks (5 minutes)

# 1. Status
.venv/bin/python -m agent --status

# 2. Health (CPU, RAM, disk, modules)
.venv/bin/python -m agent --health

# 3. Operator inbox (failed jobs, pending approvals, settlement attention)
.venv/bin/python -m agent --report

# 4. Tail the long-tier log for anything ERROR / CRITICAL
tail -F .agent_runtime/logs/long/agent-long.log | grep -E '"level":"(error|critical)"'

If --report shows anything in inbox or settlement_attention, handle it before doing anything else.
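
When you want to post-process the matches instead of eyeballing them, the grep in step 4 can be written in Python. This is a minimal sketch, assuming each line of the long-tier log is a JSON object with a lowercase "level" field, as the grep pattern implies. The helper name `severe_entries` and the "event" field in the sample data are illustrative assumptions, not part of the agent.

```python
import json
from typing import Iterable, Iterator

SEVERE_LEVELS = {"error", "critical"}

def severe_entries(lines: Iterable[str]) -> Iterator[dict]:
    """Yield parsed log entries whose level is error or critical."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip partial or non-JSON lines
        if isinstance(entry, dict) and str(entry.get("level", "")).lower() in SEVERE_LEVELS:
            yield entry

# Sample lines in the assumed format (the "event" key is hypothetical).
sample = [
    '{"level":"info","event":"boot"}',
    '{"level":"error","event":"job_failed"}',
    "not json",
    '{"level":"critical","event":"vault_decryption_failed"}',
]
found = [entry["event"] for entry in severe_entries(sample)]
print(found)
```

To run it against the real log, pass an open file handle for .agent_runtime/logs/long/agent-long.log in place of `sample`.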

Telegram cheat sheet

The most common commands you'll send. Owner only unless marked.

Command	When
`/status`	"Is it alive?"
`/health`	"Is the host OK?"
`/budget`	"How much did I spend today?"
`/jobs`	"What did I run recently?"
`/report`	Operator inbox + budget + cost summary
`/runtime`	"Switch backend / detach LLM right now"
`/intake build "..."`	Schedule a build job
`/intake review <path>`	Schedule a review job
`/deliver <job_id>`	Approve or reject a delivery
`/sandbox <code>`	Run code in Docker (fast Python eval)
`/web <url>`	Fetch a URL through the rate-limited HTTP client
`/wallet`	ETH/BTC balance + receive addresses (no send)
`/help`	List of commands

For the full list see API Reference.

Operating the dashboard

Open http://localhost:8420/dashboard. The browser will ask for a Bearer token — paste your AGENT_API_KEY. The dashboard remembers it in localStorage for the session.

Panels:

Status — agent identity, version, modules, watchdog, vault, last boot
Inbox — failed jobs, pending approvals, settlements awaiting decision
Jobs — build / review jobs with filter
Deliveries — delivery packages with approve/reject
Settlements — 402/top-up requests with approve/deny/execute
LLM Runtime — runtime override panel (see Runtime LLM Control)
Cost ledger — per-job cost entries
Audit traces — control plane traces by kind

The dashboard does not accept query-string auth. The ?key= fallback was removed in v1.35.0. If your old bookmarks don't work, paste the token into the login page instead.

When something is wrong

"The agent is not responding to Telegram"

# 1. Is the process alive?
sudo systemctl status agent-life-space
pgrep -af "python -m agent"

# 2. Did it crash recently?
sudo journalctl -u agent-life-space --since "10 minutes ago"

# 3. Is the vault unlocked?
.venv/bin/python -m agent --setup-doctor | jq .vault

# 4. Is the LLM detached?
.venv/bin/python -m agent --llm-runtime-status

If --llm-runtime-status shows "enabled": false, somebody (you?) detached the LLM. Re-enable:

.venv/bin/python -m agent --llm-runtime-enable --llm-runtime-backend cli

"Programming task hangs forever"

You're hitting the Telegram + Claude CLI deny guard. The brain catches this and returns a deterministic message — but if you're on an old version, it might still hang.

Two unblock paths:

# Option A: switch to API backend (recommended for daemon hosts)
.venv/bin/python -m agent --llm-runtime-enable --llm-runtime-backend api --llm-runtime-provider anthropic

# Option B: explicit host opt-in (only if you trust the operator + sandbox)
echo 'AGENT_SANDBOX_ONLY=0' >> .env
sudo systemctl restart agent-life-space

Option A is reversible and doesn't relax security. Option B is a real opt-in to host file access — use only if you're sure.

"I typed the wrong vault key"

The vault fails fast on writes made with the wrong key. Reads return None instead, so the agent still boots, logs warnings, and lets you fix .env.

# 1. Check the fingerprint of the current key against your password manager
#    (AGENT_VAULT_KEY must be set in this shell; it normally lives in .env)
printf '%s' "$AGENT_VAULT_KEY" | sha256sum | cut -c1-16

# 2. If it doesn't match, fix .env and reboot
nano .env
sudo systemctl restart agent-life-space

# 3. Check the long log for vault decryption failures
grep vault_decryption_failed .agent_runtime/logs/long/agent-long.log* | tail -10

You will not lose data — the wrong-key write path raises VaultDecryptionError and never touches secrets.enc. Full spec: Vault.
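
The fingerprint in step 1 is the first 16 hex characters of the SHA-256 digest of the key. The same value can be computed in Python, which helps on hosts without sha256sum. This is a sketch of that calculation only; `key_fingerprint` is a hypothetical helper, not an agent API.

```python
import hashlib

def key_fingerprint(key: str, length: int = 16) -> str:
    """Return the first `length` hex characters of SHA-256(key)."""
    return hashlib.sha256(key.encode("utf-8")).hexdigest()[:length]

# "abc" is a well-known SHA-256 test vector, used here instead of a real key.
print(key_fingerprint("abc"))  # ba7816bf8f01cfea
```

In practice, read the key from os.environ["AGENT_VAULT_KEY"] or a getpass prompt rather than typing it as a literal, so it stays out of shell history.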

"Disk is filling up"

# 1. What's eating the disk?
du -sh .agent_runtime/* | sort -rh | head -10

# 2. Most likely culprits:
#    - .agent_runtime/logs/long/   ← long-tier log files
#    - .agent_runtime/logs/short/  ← short-tier log files (should be tiny if cron ran)
#    - .agent_runtime/build/builds.db   ← if you ran many builds
#    - workspaces/                  ← per-job workspaces

# 3. Check whether the cron prune sweep has run recently
grep cron_log_retention_pruned .agent_runtime/logs/long/agent-long.log | tail -5

# 4. If short tier is huge, check that AGENT_LOG_DIR matches between __main__ and the cron sweep
.venv/bin/python -m agent --setup-doctor | jq .logs

# 5. Manually prune workspaces older than 14 days
.venv/bin/python -m agent --prune-expired-retained-artifacts

The cron loop runs LogRetentionManager.prune_all() hourly. If it's not running, the cron loop is dead — check the long log for cron_log_retention_error events and restart the agent.
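
Step 1 can also be done in Python when du is unavailable or you want the numbers in a script. The sketch below ranks the immediate children of a directory by total file size. It reports apparent file sizes rather than disk blocks, so the figures can differ slightly from du. The function names are illustrative, and the demo builds a throwaway directory instead of touching .agent_runtime.

```python
import os
import tempfile
from pathlib import Path

def dir_size(path: Path) -> int:
    """Total size in bytes of regular files under path (symlinks are skipped)."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            file_path = os.path.join(root, name)
            try:
                if not os.path.islink(file_path):
                    total += os.path.getsize(file_path)
            except OSError:
                pass  # file vanished or is unreadable
    return total

def largest_entries(base: Path, top: int = 10) -> list:
    """Return (name, bytes) for the largest children of base, biggest first."""
    sizes = []
    for child in base.iterdir():
        if child.is_dir() and not child.is_symlink():
            size = dir_size(child)
        else:
            size = child.lstat().st_size
        sizes.append((child.name, size))
    return sorted(sizes, key=lambda item: item[1], reverse=True)[:top]

# Demo on a temporary directory with two subdirectories of known size.
with tempfile.TemporaryDirectory() as tmp:
    base = Path(tmp)
    (base / "logs").mkdir()
    (base / "logs" / "agent-long.log").write_bytes(b"x" * 4096)
    (base / "build").mkdir()
    (base / "build" / "builds.db").write_bytes(b"x" * 1024)
    report = largest_entries(base)
print(report)
```

Point it at the real runtime directory with `largest_entries(Path(".agent_runtime"))`.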

"Build is stuck in PROPOSED"

# 1. List jobs
.venv/bin/python -m agent --list-persisted-jobs | jq '.[] | select(.status=="proposed")'

# 2. Check the budget
.venv/bin/python -m agent --report | jq .budget

# 3. Check whether the job is awaiting approval
.venv/bin/python -m agent --report | jq .inbox

If the budget is over the hard cap, propose a budget reset or wait until midnight UTC.
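
If jq is not installed, the filter in step 1 is easy to reproduce in Python. This sketch assumes that --list-persisted-jobs prints a JSON array of objects with a "status" field, as the jq expression implies. The "id" field and the "completed" status in the sample are made-up placeholders.

```python
import json

def jobs_with_status(jobs_json: str, status: str) -> list:
    """Return the jobs from a JSON array whose "status" equals `status`."""
    jobs = json.loads(jobs_json)
    return [job for job in jobs if job.get("status") == status]

# Hypothetical sample output; real job records will carry more fields.
sample = json.dumps([
    {"id": "job-1", "status": "proposed"},
    {"id": "job-2", "status": "completed"},
])
stuck = jobs_with_status(sample, "proposed")
print([job["id"] for job in stuck])
```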

"Approval queue is full"

# Pending approvals
.venv/bin/python -m agent --report | jq .inbox.pending_approvals

# Approve one (via Telegram or HTTP)
curl -X POST -H "Authorization: Bearer $AGENT_API_KEY" \
  http://localhost:8420/api/operator/deliveries/<id>/approve

Or use the dashboard → Inbox panel.

Lockdown drill

If anything looks suspicious — unexpected job, unexpected gateway call, unexpected log event — the kill switch is one command away:

# Disable every external tool immediately
curl -X POST -H "Authorization: Bearer $AGENT_API_KEY" \
  http://localhost:8420/api/operator/lockdown

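# Or from Python. Lockdown state is in-memory, so this only affects the
# process that runs it; use the HTTP call above for a running agent.
.venv/bin/python -c "
from agent.core.operator import OperatorControls
ctrl = OperatorControls()
ctrl.lockdown()
print(ctrl.is_locked_down())  # True
"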

After lockdown:

Read the explanation log for the last 10 minutes
Read the long-tier log for ERROR / WARNING events
Read the audit trace store (--list-traces) for unexpected entries
Check finance (/budget) for unauthorized proposals

When you've confirmed it's safe:

curl -X POST -H "Authorization: Bearer $AGENT_API_KEY" \
  http://localhost:8420/api/operator/unlock

Lockdown is in-memory only and resets on agent restart. If you want a permanent disable for a specific tool, add it to OperatorControls.disabled_tools in code (and add a test).
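
The lockdown and unlock curl calls can be scripted with the Python standard library, for example from a monitoring job. The sketch below only builds the request, using the endpoints and Bearer header shown in the curl commands. The helper name `operator_request` is an assumption, and the response format is not documented here, so the code does not parse it.

```python
import os
import urllib.request

BASE_URL = "http://localhost:8420"

def operator_request(action: str, api_key: str, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build a POST request for /api/operator/lockdown or /api/operator/unlock."""
    if action not in {"lockdown", "unlock"}:
        raise ValueError(f"unsupported action: {action}")
    return urllib.request.Request(
        url=f"{base_url}/api/operator/{action}",
        method="POST",
        headers={"Authorization": f"Bearer {api_key}"},
    )

# "example-key" is a placeholder used only when AGENT_API_KEY is not set.
req = operator_request("lockdown", os.environ.get("AGENT_API_KEY", "example-key"))
print(req.get_method(), req.full_url)

# To send it against a running agent:
#   with urllib.request.urlopen(req, timeout=5) as resp:
#       print(resp.status)
```

Building the request separately from sending it keeps the kill switch easy to dry-run before you need it.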

Restart safely

# Graceful restart preserves state and finishes in-flight jobs
sudo systemctl restart agent-life-space

# Watch the boot sequence
sudo journalctl -u agent-life-space -f

The boot sequence (see Architecture) runs the setup doctor, opens every database in WAL mode, replays workspace recovery from SQLite, migrates legacy state, and only then accepts Telegram messages. If any step fails, the process exits with a clear error.
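
For readers unfamiliar with WAL mode, this is roughly what "opening a database in WAL mode" means for SQLite. The sketch uses only the standard sqlite3 module. `open_wal` is a hypothetical helper and is not claimed to match the agent's own boot code.

```python
import sqlite3
import tempfile
from pathlib import Path

def open_wal(db_path: Path) -> sqlite3.Connection:
    """Open a SQLite database and switch it to write-ahead logging."""
    conn = sqlite3.connect(str(db_path))
    # The PRAGMA returns the journal mode that is actually in effect.
    mode = conn.execute("PRAGMA journal_mode=WAL").fetchone()[0]
    if mode.lower() != "wal":
        conn.close()
        raise RuntimeError(f"could not enable WAL mode, got {mode!r}")
    return conn

# Demo on a temporary file; in-memory databases cannot use WAL.
with tempfile.TemporaryDirectory() as tmp:
    conn = open_wal(Path(tmp) / "example.db")
    journal_mode = conn.execute("PRAGMA journal_mode").fetchone()[0]
    conn.close()
print(journal_mode)
```

WAL mode is stored in the database file, so later connections to the same file stay in WAL mode until it is changed explicitly.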

Vault operations

Add a secret

There is no Telegram or HTTP write surface for the vault. Secrets go in from a local Python session only, so they don't leak into chat transcripts.

.venv/bin/python -c "
import getpass
import os
from agent.vault.secrets import SecretsManager
v = SecretsManager(vault_dir='agent/vault', master_key=os.environ['AGENT_VAULT_KEY'])
# Prompt for the value so it never lands in shell history or the process list
v.set_secret('NEW_API_KEY', getpass.getpass('Secret value: '))
"

List secret names

.venv/bin/python -c "
from agent.vault.secrets import SecretsManager
import os
v = SecretsManager(vault_dir='agent/vault', master_key=os.environ['AGENT_VAULT_KEY'])
for name in v.list_secrets():
    print(name)
"

(Lists names only — never values.)

Rotate the master key

See Vault → Rotating the master key.

Backups

Daily cron entry recommended:

# Keep the entry on one line: cron does not support backslash line continuations
0 3 * * * tar czf /backup/agent-$(date +\%Y\%m\%d).tar.gz /home/youruser/Agent_Life_Space/.agent_runtime /home/youruser/Agent_Life_Space/agent/vault/secrets.enc && find /backup -name 'agent-*.tar.gz' -mtime +30 -delete

The vault file alone is useless without .env — store the master key in a real password manager (1Password, Bitwarden, KeePassXC, pass).
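
Before trusting the -delete at the end of the cron entry, it helps to preview which archives it would remove. The sketch below mirrors `find -name 'agent-*.tar.gz' -mtime +30`, which matches files whose age in whole days is greater than 30. The function name is illustrative, and the demo uses a temporary directory with placeholder file names rather than /backup.

```python
import os
import tempfile
import time
from pathlib import Path
from typing import Optional

SECONDS_PER_DAY = 86400

def expired_backups(backup_dir: Path, max_age_days: int = 30, now: Optional[float] = None) -> list:
    """List agent-*.tar.gz files whose age in whole days exceeds max_age_days."""
    now = time.time() if now is None else now
    expired = []
    for path in sorted(backup_dir.glob("agent-*.tar.gz")):
        age_days = int((now - path.stat().st_mtime) // SECONDS_PER_DAY)
        if age_days > max_age_days:
            expired.append(path)
    return expired

# Demo: one archive 40 days old, one 5 days old.
with tempfile.TemporaryDirectory() as tmp:
    backup_dir = Path(tmp)
    now = time.time()
    for name, age_days in (("agent-old.tar.gz", 40), ("agent-new.tar.gz", 5)):
        path = backup_dir / name
        path.write_bytes(b"")
        stamp = now - age_days * SECONDS_PER_DAY
        os.utime(path, (stamp, stamp))  # backdate the modification time
    to_delete = [p.name for p in expired_backups(backup_dir, now=now)]
print(to_delete)
```

The function only lists candidates. Deleting them is left to the caller, so a dry run is always safe.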

Things you should NOT do

Don't paste vault values into a chat assistant. The vault key, the API key, the OAuth token — none of them belong in a chat transcript. We learned this the hard way; see docs/SECURITY_INCIDENT_2026-04-07.md.
Don't run with AGENT_SANDBOX_ONLY=0 casually. This gives the LLM host file access. Only opt in if you fully understand what you're letting it touch.
Don't disable the cron loop. Log retention, dead-man-switch, memory consolidation, and approval expiry all live there. If you really need to disable it for debugging, restart immediately afterwards.
Don't share .env between machines. Each agent instance should have its own master key, its own API key, its own database. Sharing state defeats the sovereignty model.
Don't rm -rf .agent_runtime/ without backing up first. You will lose every job, finance entry, memory, and approval history.
Don't push secrets to git. .env, secrets.enc, *.pem, id_rsa*, *.p12 are all in .gitignore. Don't fight the gitignore.

Quick reference card

# Start / stop / restart
sudo systemctl start    agent-life-space
sudo systemctl stop     agent-life-space
sudo systemctl restart  agent-life-space
sudo systemctl status   agent-life-space

# Live logs
sudo journalctl -u agent-life-space -f
tail -F .agent_runtime/logs/long/agent-long.log | jq .
tail -F .agent_runtime/logs/short/agent-short.log | jq .

# Quick health
.venv/bin/python -m agent --status
.venv/bin/python -m agent --health
.venv/bin/python -m agent --report

# Setup audit
.venv/bin/python -m agent --setup-doctor

# LLM control
.venv/bin/python -m agent --llm-runtime-status
.venv/bin/python -m agent --llm-runtime-disable --llm-runtime-note "maintenance"
.venv/bin/python -m agent --llm-runtime-enable --llm-runtime-backend cli

# Lockdown
curl -X POST -H "Authorization: Bearer $AGENT_API_KEY" http://localhost:8420/api/operator/lockdown
curl -X POST -H "Authorization: Bearer $AGENT_API_KEY" http://localhost:8420/api/operator/unlock

# Operator surfaces
.venv/bin/python -m agent --list-persisted-jobs
.venv/bin/python -m agent --list-deliveries
.venv/bin/python -m agent --list-cost-ledger
.venv/bin/python -m agent --list-traces

# Tests (sanity before pulling new code)
.venv/bin/python -m pytest tests/ -q

Repo · CHANGELOG · Releases · Issues · MIT License
