Calciforge Host-Agent (Adapter-First)

mTLS RPC server providing safe VM-to-host delegation via an adapter-first architecture.

v4 adds: unified /host/op dispatch endpoint, five adapters (ZFS, Systemd, PCT, Git, Exec/Ansible stub), per-adapter validation, and policy-driven approval flows.

v4 Quick Start — /host/op

# -k disables server certificate verification; outside of testing, use --cacert <ca-file> instead.

# ZFS list
curl -s --cert client.crt --key client.key -k \
  -X POST https://host:18443/host/op \
  -H "Content-Type: application/json" \
  -d '{"kind":"zfs","args":["list"]}'

# Systemd status
curl -s --cert client.crt --key client.key -k \
  -X POST https://host:18443/host/op \
  -H "Content-Type: application/json" \
  -d '{"kind":"systemd","resource":"nginx.service","args":["status"]}'

# PCT container status
curl -s --cert client.crt --key client.key -k \
  -X POST https://host:18443/host/op \
  -H "Content-Type: application/json" \
  -d '{"kind":"pct","resource":"101","args":["status"]}'

# Git repo status (repo must be in allowed_repos)
curl -s --cert client.crt --key client.key -k \
  -X POST https://host:18443/host/op \
  -H "Content-Type: application/json" \
  -d '{"kind":"git","resource":"/srv/myapp","args":["status"]}'

v4 Adapter Config

# Git adapter: repo allowlist (empty = allow any absolute path)
[git]
allowed_repos = ["/srv", "/opt", "/home"]

# Exec adapter: disabled by default
[exec]
enabled = false                              # must be true to activate
allowed_commands = ["/usr/bin/uptime"]       # absolute paths only
# ansible_job_queue = "/var/lib/clash/jobs"  # for Ansible stub
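
The allowlist semantics in the config comments can be sketched as follows. This is a minimal illustration, not the adapter's actual code: the function names `repo_allowed` and `exec_allowed` are hypothetical, and rejecting `..` components is an extra assumption that the config comments do not state.

```rust
use std::path::{Component, Path};

/// Returns true if `repo` may be used by the Git adapter.
/// Assumed semantics: the path must be absolute, must not contain `..`
/// (an assumption, not stated in the config), and must sit under one of the
/// allowlisted prefixes. An empty allowlist accepts any absolute path.
fn repo_allowed(repo: &str, allowed_repos: &[&str]) -> bool {
    let path = Path::new(repo);
    if !path.is_absolute() {
        return false;
    }
    // Reject parent-directory components so "/srv/../etc" cannot escape a prefix.
    if path.components().any(|c| matches!(c, Component::ParentDir)) {
        return false;
    }
    if allowed_repos.is_empty() {
        return true;
    }
    // Path::starts_with compares whole components, so "/srvdata" does not match "/srv".
    allowed_repos.iter().any(|prefix| path.starts_with(prefix))
}

/// Exec adapter check: inactive unless `enabled` is true, and only exact
/// absolute command paths from the allowlist are accepted.
fn exec_allowed(enabled: bool, command: &str, allowed_commands: &[&str]) -> bool {
    enabled
        && Path::new(command).is_absolute()
        && allowed_commands.iter().any(|c| *c == command)
}
```

With the config above, `repo_allowed("/srv/myapp", &["/srv", "/opt", "/home"])` is true, while `exec_allowed(false, "/usr/bin/uptime", &["/usr/bin/uptime"])` is false because the adapter is disabled by default.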

v4 Default Policy Rules

Adapter dispatch respects the same [[rules]] config as v3:

# Systemd — read-only ops: no approval needed
[[rules]]
operation = "systemd-status"
approval_required = false

# PCT — start/stop requires approval
[[rules]]
operation = "pct-start"
approval_required = true

# PCT — destroy always requires admin approval
[[rules]]
operation = "pct-destroy"
approval_required = true
always_ask = true
approval_admin_only = true

# Git — checkout requires approval
[[rules]]
operation = "git-checkout"
approval_required = true
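
How a /host/op request might be matched against these rules is sketched below. Two things here are assumptions rather than documented behavior: the operation name is taken to be `<kind>-<first arg>` (which fits names such as `pct-start` and `git-checkout`), and an operation with no matching rule is treated as requiring approval. The `Rule` struct mirrors only the fields shown in the examples, and the meaning of `always_ask` is not modeled.

```rust
/// Mirrors the fields shown in the rule examples; not the crate's real type.
#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq)]
struct Rule {
    operation: String,
    approval_required: bool,
    always_ask: bool,
    approval_admin_only: bool,
}

/// Hypothetical naming scheme: "<kind>-<first arg>", e.g. kind "pct" with
/// args ["start"] becomes "pct-start". Returns None when args is empty.
fn operation_name(kind: &str, args: &[&str]) -> Option<String> {
    args.first().map(|verb| format!("{}-{}", kind, verb))
}

/// Finds the first rule whose operation matches exactly.
fn find_rule<'a>(rules: &'a [Rule], operation: &str) -> Option<&'a Rule> {
    rules.iter().find(|r| r.operation == operation)
}

/// Assumption: operations without a matching rule fail closed
/// (approval is required).
fn needs_approval(rules: &[Rule], operation: &str) -> bool {
    find_rule(rules, operation).map_or(true, |r| r.approval_required)
}
```

Under these assumptions, `systemd-status` passes without approval, `pct-destroy` requires it, and an operation that has no rule at all also requires it.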

Legacy API (still supported)

Security Features (SDD Round 2)

P0 — Authentication & Authorization ✅

  1. Real mTLS auth middleware — CN extracted from TLS session, ClientIdentity injected
  2. No HTTP fallback — TLS failure is fatal, no plaintext server (P0-2)
  3. Caller identity passed to ZFS — All operations use sudo -u <identity> (P0-3)
  4. Config approval rules enforced: requires_approval() checked at runtime (P0-4)

P1 — Token & Security Hardening ✅

  1. 16-character token entropy — ~80 bits, cryptographically secure (P1-5)
  2. Token hash logging — Only SHA-256 hashes logged, never plaintext (P1-6)
  3. Filtered /pending endpoint — Returns only caller's pending approvals (P1-7)
  4. Real UID lookup — Uses nix::unistd::User::from_name() / getpwnam() (P1-8)
  5. CRL support — Certificate revocation list checking in TLS (P1-9)
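
The "16 characters, ~80 bits" figure works out exactly if each character is drawn from a 32-symbol alphabet (5 bits per character, so 10 random bytes become 16 characters). The sketch below shows that arithmetic using the RFC 4648 base32 alphabet. The alphabet and the function names are assumptions, since the agent's actual token format is not stated here, and `/dev/urandom` is used only to keep the example free of external crates.

```rust
use std::fs::File;
use std::io::Read;

/// RFC 4648 base32 alphabet: 32 symbols, so each character carries 5 bits.
const ALPHABET: &[u8; 32] = b"ABCDEFGHIJKLMNOPQRSTUVWXYZ234567";

/// Encodes 10 bytes (80 bits) as 16 base32 characters.
fn encode_token(bytes: &[u8; 10]) -> String {
    let mut out = String::with_capacity(16);
    for chunk in bytes.chunks(5) {
        // Pack 5 bytes into the low 40 bits of a u64.
        let v = chunk.iter().fold(0u64, |acc, b| (acc << 8) | u64::from(*b));
        // Emit eight 5-bit groups, most significant first.
        for i in 0..8 {
            let idx = ((v >> (35 - 5 * i)) & 0x1f) as usize;
            out.push(ALPHABET[idx] as char);
        }
    }
    out
}

/// Reads 10 bytes from the kernel CSPRNG (Unix only) and encodes them.
fn generate_token() -> std::io::Result<String> {
    let mut buf = [0u8; 10];
    File::open("/dev/urandom")?.read_exact(&mut buf)?;
    Ok(encode_token(&buf))
}
```

Calling `generate_token()` yields a 16-character string. Per the token hash logging item above, only a SHA-256 hash of such a value would be written to the audit log.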

P2 — Operational Readiness ✅

  1. Async ZFS commands — Uses tokio::process::Command (P2-10)
  2. Install script: install.sh for one-command deployment (P2-11)
  3. Config reload — SIGHUP handler support (P2-12)
  4. Prometheus metrics: /metrics endpoint on configurable port (P2-13)
  5. Audit log rotation — Daily rotation with retention (P2-14)
  6. Clean code — Compiler warnings addressed

P3 — ZeroClaw & Agent Integration 🔄 (Partial)

  1. ZeroClaw integration framework — Policy engine trait defined, ready for ZeroClaw connection
  2. Agent adapter framework — CN → agent identity mapping, ACPX support
  3. Unified approvals — Signal webhook integration for human confirmation

mTLS Security Features (unchanged from v3)

P0 — Authentication & Authorization ✅

  1. Real mTLS auth middleware — CN extracted from TLS session, ClientIdentity injected
  2. No HTTP fallback — TLS failure is fatal, no plaintext server
  3. Caller identity passed to operations — All operations use sudo -u <identity>
  4. Config approval rules enforced — policy checked at runtime

P1 — Token & Security Hardening ✅

  1. 16-character token entropy — ~80 bits, cryptographically secure
  2. Token hash logging — Only SHA-256 hashes logged, never plaintext
  3. Filtered /pending endpoint — Returns only caller's pending approvals
  4. Real UID lookup — Uses nix::unistd::User::from_name() / getpwnam()

Quick Start

Build

cd /root/projects/calciforge
cargo build --release -p host-agent

Install on Target System

cd /root/projects/calciforge/crates/host-agent
scp target/release/clash-host-agent root@host.example.invalid:/tmp/
ssh root@host.example.invalid
cd /tmp
./clash-host-agent --help

# Or use the install script:
./install.sh

Ansible Deployment

cd /root/.openclaw/workspace/infra/ansible
ansible-playbook -i inventories/toy-vm.yml playbooks/host-agent-deploy.yml

Configuration

[server]
bind = "0.0.0.0:18443"
cert = "/etc/clash/certs/server.crt"
key = "/etc/clash/certs/server.key"
client_ca = "/etc/clash/certs/ca.crt"
crl_file = "/etc/clash/certs/ca.crl"  # Optional

[audit]
log_path = "/var/log/clash/audit.jsonl"
rotation = "daily"  # daily, hourly, never
retention_days = 90

[approval]
ttl_seconds = 300
token_entropy_bits = 80
signal_webhook = "https://signal.example.com/webhook"
allowed_approvers = ["+15555550001"]
# Required for admin endpoints such as /admin/pending and /admin/warn-permissions.
# Leave unset to fail closed until a dedicated admin client certificate exists.
admin_cn_pattern = "admin-*"

[metrics]
enabled = true
bind = "127.0.0.1:19090"

[[agent]]
cn_pattern = "calciforge-agent"
agent_type = "generic"
unix_user = "clash-agent"
autonomy = "supervised"
allowed_operations = ["zfs-list", "zfs-snapshot"]
requires_approval_for = ["zfs-destroy"]

[[rule]]
operation = "zfs-destroy"
approval_required = true
pattern = "tank/.*"

API Endpoints

Health Check

curl -k --cert client.pem https://host:18443/health

ZFS List

curl -k --cert client.pem -X POST \
  -H "Content-Type: application/json" \
  -d '{"dataset": "tank", "type": "snapshot"}' \
  https://host:18443/zfs/list

ZFS Snapshot

curl -k --cert client.pem -X POST \
  -H "Content-Type: application/json" \
  -d '{"dataset": "tank/media", "snapname": "daily-2024-01-15"}' \
  https://host:18443/zfs/snapshot

ZFS Destroy (Requires Approval)

# Request approval
curl -k --cert client.pem -X POST \
  -H "Content-Type: application/json" \
  -d '{"dataset": "tank/media@old", "approval_token": null}' \
  https://host:18443/zfs/destroy

# Response: {"pending_approval": true, "approval_id": "...", "message": "Reply CONFIRM <code>"}

# Confirm via API (or Signal webhook)
curl -k --cert client.pem -X POST \
  -H "Content-Type: application/json" \
  -d '{"approval_id": "...", "token": "<approval-token>"}' \
  https://host:18443/approve

# Execute with token
curl -k --cert client.pem -X POST \
  -H "Content-Type: application/json" \
  -d '{"dataset": "tank/media@old", "approval_token": "<approval-token>"}' \
  https://host:18443/zfs/destroy
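
The three-step flow above (request, confirm, execute) can be pictured as a small state machine with a time limit. The sketch below is an illustration under stated assumptions, not the agent's implementation: the `Approval` type and its methods are hypothetical, time is passed in as Unix seconds to keep the logic testable, and the 300-second lifetime is taken from `ttl_seconds` in the configuration section.

```rust
#[derive(Debug, PartialEq)]
enum ApprovalState {
    Pending,
    Approved,
    Expired,
}

struct Approval {
    token: String,
    created_at: u64, // Unix seconds when the request was made
    ttl_seconds: u64,
    confirmed: bool,
}

impl Approval {
    fn new(token: &str, created_at: u64, ttl_seconds: u64) -> Self {
        Approval { token: token.to_string(), created_at, ttl_seconds, confirmed: false }
    }

    /// Expiry takes precedence over confirmation.
    fn state(&self, now: u64) -> ApprovalState {
        if now.saturating_sub(self.created_at) > self.ttl_seconds {
            ApprovalState::Expired
        } else if self.confirmed {
            ApprovalState::Approved
        } else {
            ApprovalState::Pending
        }
    }

    /// Step 2 (/approve): succeeds only while pending and with the right token.
    fn confirm(&mut self, token: &str, now: u64) -> bool {
        if self.state(now) == ApprovalState::Pending && token == self.token {
            self.confirmed = true;
            true
        } else {
            false
        }
    }

    /// Step 3 (retry with approval_token): allowed only once approved and unexpired.
    fn may_execute(&self, token: &str, now: u64) -> bool {
        self.state(now) == ApprovalState::Approved && token == self.token
    }
}
```

A real implementation would likely compare tokens in constant time and decide whether a token can be reused after the first execution. Neither point is specified above, so both are left out of the sketch.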

Prometheus Metrics

curl http://localhost:19090/metrics

Pending Approvals

# Caller-scoped view: returns only approvals requested by this client cert CN.
curl -k --cert client.pem https://host:18443/pending

# Admin view: returns all pending approvals only when this client cert CN matches
# [approval].admin_cn_pattern. If the pattern is unset, this endpoint is 403.
curl -k --cert admin-client.pem https://host:18443/admin/pending
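
The fail-closed behavior of the admin view can be sketched as a pattern check on the client certificate CN. This assumes `admin_cn_pattern` is a glob in which `*` matches any run of characters, as the `admin-*` example suggests; the real matcher may differ, and both function names are hypothetical.

```rust
/// Minimal glob matcher supporting only `*` (any run of characters).
fn glob_match(pattern: &str, text: &str) -> bool {
    let p: Vec<char> = pattern.chars().collect();
    let t: Vec<char> = text.chars().collect();
    let (mut pi, mut ti) = (0usize, 0usize);
    let mut star: Option<usize> = None; // position of the last `*` seen
    let mut mark = 0usize; // text position that `*` is currently matched up to
    while ti < t.len() {
        if pi < p.len() && p[pi] == '*' {
            star = Some(pi);
            mark = ti;
            pi += 1;
        } else if pi < p.len() && p[pi] == t[ti] {
            pi += 1;
            ti += 1;
        } else if let Some(s) = star {
            // Backtrack: let the last `*` absorb one more character.
            pi = s + 1;
            mark += 1;
            ti = mark;
        } else {
            return false;
        }
    }
    // Any trailing `*` in the pattern can match the empty string.
    while pi < p.len() && p[pi] == '*' {
        pi += 1;
    }
    pi == p.len()
}

/// Fail closed: with no pattern configured, nobody is an admin.
fn is_admin(admin_cn_pattern: Option<&str>, cn: &str) -> bool {
    match admin_cn_pattern {
        Some(pattern) => glob_match(pattern, cn),
        None => false,
    }
}
```

With `admin_cn_pattern = "admin-*"`, a certificate with CN `admin-alice` would see the admin view and `calciforge-agent` would not. With the pattern unset, every caller is refused, which corresponds to the 403 described above.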

Security Model

  1. mTLS is mandatory — No plaintext HTTP fallback
  2. Client certificates required — Must present valid cert signed by CA
  3. Identity from CN — Unix user resolved from certificate Common Name
  4. Operations as user — All ZFS commands run as the authenticated user
  5. Approval for destruction — Destroy operations require human confirmation
  6. Admin views fail closed: /admin/pending and /admin/warn-permissions require approval.admin_cn_pattern
  7. Audit everything — All operations logged with hashes (no plaintext tokens)
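
Items 3 and 4 above imply that the resolved Unix user ends up on a `sudo -u` command line, so the name needs validating before it is used. The sketch below builds such an argument vector without running it. The validation rule (a conservative lowercase user-name pattern), the `--` separator, and both function names are assumptions for illustration, not the agent's documented behavior.

```rust
/// Accepts conservative Unix user names only: first character a-z or `_`,
/// remaining characters a-z, 0-9, `_` or `-`, at most 32 bytes in total.
fn valid_unix_user(name: &str) -> bool {
    let mut chars = name.chars();
    match chars.next() {
        Some(c) if c.is_ascii_lowercase() || c == '_' => {}
        _ => return false,
    }
    name.len() <= 32
        && chars.all(|c| c.is_ascii_lowercase() || c.is_ascii_digit() || c == '_' || c == '-')
}

/// Builds `sudo -u <user> -- <program> <args...>` as an argument vector.
/// Nothing is passed through a shell, so metacharacters in args are not interpreted.
fn sudo_argv(unix_user: &str, program: &str, args: &[&str]) -> Result<Vec<String>, String> {
    if !valid_unix_user(unix_user) {
        return Err(format!("invalid unix user: {:?}", unix_user));
    }
    let mut argv = vec![
        "sudo".to_string(),
        "-u".to_string(),
        unix_user.to_string(),
        "--".to_string(), // end of sudo options
        program.to_string(),
    ];
    argv.extend(args.iter().map(|a| a.to_string()));
    Ok(argv)
}
```

For example, `sudo_argv("clash-agent", "zfs", &["list", "tank"])` yields the vector for `sudo -u clash-agent -- zfs list tank`, while a name such as `-x` or `root; rm -rf /` is rejected before any command is built.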

Testing

Unit Tests

cargo test -p host-agent

Integration Tests (on the target host, .50)

# Deploy
ansible-playbook -i inventories/toy-vm.yml playbooks/host-agent-deploy.yml

# Copy client cert locally
scp root@host.example.invalid:/etc/clash/certs/client-bundle.pem ./

# Test health
curl -k --cert client-bundle.pem https://host.example.invalid:18443/health

# Test ZFS operations
curl -k --cert client-bundle.pem -X POST \
  -H "Content-Type: application/json" \
  -d '{"dataset": "tank"}' \
  https://host.example.invalid:18443/zfs/list

Architecture

┌─────────────┐      mTLS       ┌────────────────────────────────────────┐
│ Client      │ ───────────────▶│ Host-Agent                             │
│ (cert: CN)  │                 │  ┌─────────┐  ┌──────────┐  ┌────────┐ │
└─────────────┘                 │  │ mTLS    │─▶│ Identity │─▶│ Policy │ │
                                │  │ Layer   │  │ Resolver │  │ Engine │ │
                                │  └─────────┘  └──────────┘  └───┬────┘ │
                                │                                 │      │
                                │  ┌─────────┐  ┌──────────┐      │      │
                                │  │ ZFS     │◀─│  Sudo    │◀─────┘      │
                                │  │ Executor│  │  -u CN   │             │
                                │  └─────────┘  └──────────┘             │
                                │                                        │
                                │  ┌─────────┐  ┌──────────┐             │
                                │  │ Audit   │  │ Signal   │             │
                                │  │ Logger  │  │ Webhook  │             │
                                │  └─────────┘  └──────────┘             │
                                └────────────────────────────────────────┘

Troubleshooting

Service won't start

journalctl -u clash-host-agent -f
# Check certificate permissions
ls -la /etc/clash/certs/
# Check config syntax
cat /etc/clash/host-agent.toml

mTLS handshake fails

# Test with verbose curl
curl -v -k --cert client.pem https://host:18443/health
# Check cert is signed by CA
openssl verify -CAfile /etc/clash/certs/ca.crt /etc/clash/certs/client.crt

ZFS permission denied

# Check ZFS delegation
zfs allow tank
# Check sudoers
sudo -u clash-agent sudo -u root zfs list tank

License

MIT