🦞 + 🪢 ClawReins

Runtime safety and human approval infrastructure for computer-using agents.

github.com/pegasi-ai/clawreins

License: Apache 2.0 · TypeScript · Node.js >= 18.0.0

OpenClaw is powerful. That's the problem. ClawReins is the watchdog layer.

ClawReins sits between an AI agent and the real world, and protects computer-using agents at two stages:

  • Before runtime → security scanning
  • During runtime → action interception

Think of it as sudo for AI agents. The first production integration is OpenClaw. ClawReins plugs into the before_tool_call event so you can:

  • Prevent destructive actions before they execute
  • Pause for human approval with YES / ALLOW / CONFIRM flows
  • Prove what happened with durable audit logs and post-incident review

OpenClaw cannot be its own watchdog. Neither can any other computer-using agent (CUA).

Demo

Hero example: an OpenClaw agent tries to bulk-delete 4,382 Gmail messages. ClawReins blocks it before execution.

That is the core runtime story:

  • destructive action detected
  • execution paused before side effects
  • human approval required
  • decision written to the audit trail

Runtime Interception

Runtime interception is the enforcement layer. It is what stops an agent mid-trajectory when the action is destructive, irreversible, or operating under risky browser state.

Core capabilities:

  • Browser-state awareness for CAPTCHA, 2FA, and challenge walls
  • Irreversibility scoring for risky versus catastrophic actions
  • Runtime intervention across terminal and messaging approval channels
  • ToolShield-aligned hardening for new tool rollouts
  • Full audit logging for every approval decision

Security Scan

ClawReins includes a security scanner that audits the local OpenClaw environment for high-signal misconfigurations before runtime problems turn into incidents.

The clawreins scan command runs this audit against a local OpenClaw installation, writes an HTML report to ~/Downloads/scan-report.html, and prints a file:// link to the report in the terminal.

Usage:

# Run the 13-check audit and save the HTML report
clawreins scan

# Save the report and try to open it automatically
clawreins scan --html

# Machine-readable output for CI
clawreins scan --json

# Apply supported auto-fixes after confirmation
clawreins scan --fix

# Apply supported auto-fixes without prompting
clawreins scan --fix --yes

# Compare against the last saved baseline and alert on drift
clawreins scan --monitor

# Compare against the baseline and invoke a notifier when drift is detected
clawreins scan --monitor --alert-command "/path/to/send-openclaw-alert.sh"

Supported auto-fixes:

  • Rebinding gateway host from 0.0.0.0 to 127.0.0.1
  • Tightening config file permissions to 600
  • Injecting a default tools.exec.safeBins allowlist
  • Disabling authBypass / skipAuth / disableAuth style flags

Before any fix is applied, ClawReins creates a timestamped backup in ~/.scan-backup/.

Drift Monitoring

Drift monitoring is opt-in and designed for scheduled runs.

Default monitoring behavior:

  • disabled by default
  • run every 24 hours when scheduled
  • compare against ~/.openclaw/clawreins/scan-state.json
  • alert only on worsened posture: verdict worsening, new WARN, or new FAIL
  • no background auto-fix
  • HTML report still written to ~/Downloads/scan-report.html

Manual run:

clawreins scan --monitor

The first run creates a baseline. Later runs compare the current report against that saved baseline and only alert when posture worsens.

If you want scheduled jobs to notify through your own transport, add --alert-command. This command runs only when drift is detected. ClawReins exports these environment variables to the notifier:

  • CLAWREINS_SCAN_SUMMARY
  • CLAWREINS_SCAN_VERDICT
  • CLAWREINS_SCAN_REPORT_PATH
  • CLAWREINS_SCAN_REPORT_URL
  • CLAWREINS_SCAN_STATE_PATH
  • CLAWREINS_SCAN_WORSENED_CHECKS

That makes it easy to route alerts through:

  • an OpenClaw messaging wrapper
  • a webhook sender
  • email, Slack, Telegram, or WhatsApp bridge scripts

Notifier example:

clawreins scan --monitor \
  --alert-command "$HOME/bin/send-openclaw-alert.sh"

The alert hook is generic on purpose. The scan CLI does not directly call the in-process OpenClaw plugin API from cron or system schedulers, so the notifier command is the bridge if you want alerts to land through OpenClaw-managed messaging.
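
As a rough illustration of what a notifier could do with the CLAWREINS_SCAN_* variables listed above, the sketch below formats them into a single alert message. Only the variable names come from this README. The function name buildAlertMessage, the comma-separated format assumed for CLAWREINS_SCAN_WORSENED_CHECKS, and the fallback strings are assumptions made for illustration.

```typescript
// Hypothetical notifier helper: turns the CLAWREINS_SCAN_* variables into one message.
// Only the variable names are taken from the README; everything else is assumed.
type Env = Record<string, string | undefined>;

function buildAlertMessage(env: Env): string {
  const verdict = env.CLAWREINS_SCAN_VERDICT ?? "UNKNOWN";
  const summary = env.CLAWREINS_SCAN_SUMMARY ?? "(no summary provided)";
  // Assumption: worsened checks arrive as a comma-separated list of check names.
  const worsened = (env.CLAWREINS_SCAN_WORSENED_CHECKS ?? "")
    .split(",")
    .map((name) => name.trim())
    .filter((name) => name.length > 0);

  const lines = [`ClawReins drift detected: ${verdict}`, summary];
  if (worsened.length > 0) {
    lines.push(`Worsened checks: ${worsened.join(", ")}`);
  }
  if (env.CLAWREINS_SCAN_REPORT_URL) {
    lines.push(`Report: ${env.CLAWREINS_SCAN_REPORT_URL}`);
  }
  return lines.join("\n");
}
```

In a real notifier script you would pass process.env to the helper and hand the resulting string to your own transport (webhook, email, or a messaging bridge).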

Scheduled Runs

Recommended operating model:

  • run once per day
  • use --monitor so each run compares against the last saved baseline
  • add --alert-command if you want drift notifications delivered outside the terminal
  • never use --fix in scheduled jobs

What happens on scheduled runs:

  1. The first scheduled run creates the baseline in ~/.openclaw/clawreins/scan-state.json.
  2. Later runs compare the current ScanReport against that saved baseline.
  3. ClawReins alerts only when posture worsens: verdict gets worse, a check changes from PASS to WARN, or a check changes from PASS or WARN to FAIL.
  4. Every run still writes ~/Downloads/scan-report.html so the latest full report is easy to inspect.
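
The comparison rule in step 3 can be sketched as a small function. This is an illustration of the rule as described, not the scanner's actual code: the ReportSketch shape is hypothetical (the real ScanReport schema is not shown in this README), and treating a check that is missing from the baseline as previously PASS is an assumption.

```typescript
// Hypothetical shapes: the real ScanReport schema is not shown in this README.
type Status = "PASS" | "WARN" | "FAIL";
type Verdict = "SECURE" | "NEEDS ATTENTION" | "EXPOSED";

interface ReportSketch {
  verdict: Verdict;
  checks: Record<string, Status>; // check name -> status
}

const STATUS_RANK: Record<Status, number> = { PASS: 0, WARN: 1, FAIL: 2 };
const VERDICT_RANK: Record<Verdict, number> = {
  SECURE: 0,
  "NEEDS ATTENTION": 1,
  EXPOSED: 2,
};

// Returns the names of checks that got worse, plus whether the verdict worsened.
function detectDrift(baseline: ReportSketch, current: ReportSketch) {
  const worsenedChecks = Object.keys(current.checks).filter((name) => {
    // Assumption: a check absent from the baseline counts as previously PASS.
    const before: Status = baseline.checks[name] ?? "PASS";
    const after: Status = current.checks[name] ?? "PASS";
    // A higher rank covers PASS -> WARN, PASS -> FAIL, and WARN -> FAIL.
    return STATUS_RANK[after] > STATUS_RANK[before];
  });
  const verdictWorsened =
    VERDICT_RANK[current.verdict] > VERDICT_RANK[baseline.verdict];
  return {
    worsenedChecks,
    verdictWorsened,
    shouldAlert: verdictWorsened || worsenedChecks.length > 0,
  };
}
```

Ranking statuses numerically keeps the rule one-directional: improvements such as WARN to PASS never trigger an alert.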

Recommended scheduler settings:

  • frequency: every 24 hours
  • stdout/stderr: append to a dedicated log file such as ~/.openclaw/clawreins/scan-monitor.log
  • environment: set HOME and OPENCLAW_HOME explicitly
  • notifier: use --alert-command for OpenClaw wrappers, webhooks, or messaging bridges

Example daily job with drift logging only:

# Keep the whole entry on one line: standard cron does not support backslash line continuations
0 9 * * * /usr/bin/env HOME="$HOME" OPENCLAW_HOME="$HOME/.openclaw" /usr/local/bin/clawreins scan --monitor >> "$HOME/.openclaw/clawreins/scan-monitor.log" 2>&1

Example daily job with drift alert delivery:

# Keep the whole entry on one line: standard cron does not support backslash line continuations
0 9 * * * /usr/bin/env HOME="$HOME" OPENCLAW_HOME="$HOME/.openclaw" /usr/local/bin/clawreins scan --monitor --alert-command "$HOME/bin/send-openclaw-alert.sh" >> "$HOME/.openclaw/clawreins/scan-monitor.log" 2>&1

If you want the scheduled job to fail loudly for automation, the exit codes stay the same in monitor mode:

  • 0 for SECURE
  • 1 for NEEDS ATTENTION
  • 2 for EXPOSED

That makes scheduled monitoring usable from cron, systemd, CI, or any wrapper that reacts to non-zero exit codes.
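
A wrapper that reacts to these exit codes might look like the sketch below. The code-to-verdict mapping is the one documented above. The function names and the choice to fail a pipeline only on EXPOSED (or on an undocumented code) are assumptions for illustration.

```typescript
// Maps the documented scan exit codes to verdicts.
// Function names are hypothetical; the code-to-verdict table comes from the README.
type ScanVerdict = "SECURE" | "NEEDS ATTENTION" | "EXPOSED";

function verdictFromExitCode(code: number): ScanVerdict | undefined {
  switch (code) {
    case 0:
      return "SECURE";
    case 1:
      return "NEEDS ATTENTION";
    case 2:
      return "EXPOSED";
    default:
      return undefined; // any other code is not a documented verdict
  }
}

// Example policy (an assumption): fail the pipeline only when the install is EXPOSED,
// and treat undocumented exit codes as failures too.
function shouldFailPipeline(code: number): boolean {
  const verdict = verdictFromExitCode(code);
  return verdict === undefined || verdict === "EXPOSED";
}
```

A stricter team could change shouldFailPipeline to fail on NEEDS ATTENTION as well; the mapping itself stays the same.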

Security Checks

| Check | Severity | Detects | Auto-fix |
| --- | --- | --- | --- |
| GATEWAY_BINDING | Critical | Gateway listening on 0.0.0.0 or missing localhost binding | Yes |
| API_KEYS_EXPOSURE | Critical | Plaintext API keys, tokens, passwords, or secrets stored directly in config files | No |
| FILE_PERMISSIONS | Critical | Config files readable by group or other users instead of 600 | Yes |
| HTTPS_TLS | Warning | Missing HTTPS/TLS or certificate-related configuration | No |
| SHELL_COMMAND_ALLOWLIST | Critical | Missing safeBins or equivalent shell allowlist / unrestricted shell execution | Yes |
| SENSITIVE_DIRECTORIES | Warning | Agent environment can still reach directories like ~/.ssh, ~/.gnupg, ~/.aws, or /etc/shadow | No |
| WEBHOOK_AUTH | Warning | Webhook endpoints configured without auth tokens or shared secrets | No |
| SANDBOX_ISOLATION | Warning | No Docker or sandbox isolation detected | No |
| DEFAULT_WEAK_CREDENTIALS | Critical | Default, weak, undefined, or missing gateway credentials | No |
| RATE_LIMITING | Warning | No gateway throttling or rate limit configuration | No |
| NODEJS_VERSION | Critical | Node.js versions affected by CVE-2026-21636 permission-model bypass window | No |
| CONTROL_UI_AUTH | Critical | Control UI authentication bypass flags enabled | Yes |
| BROWSER_UNSANDBOXED | Critical | Browser skill config missing headless: true or sandbox: true protection | No |
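
To make one of these checks concrete, here is a minimal sketch of the idea behind FILE_PERMISSIONS: a config file passes only if no group or other permission bits are set. This is not the scanner's implementation. The function names are hypothetical, and the mode value is assumed to come from something like fs.statSync(path).mode in Node.js.

```typescript
// Sketch of the idea behind FILE_PERMISSIONS, not the scanner's actual code.
// `mode` is a POSIX mode such as the one returned by fs.statSync(path).mode in Node.js.
function isTooPermissive(mode: number): boolean {
  // 0o077 masks the group and other permission bits; mode 600 leaves all of them clear.
  return (mode & 0o077) !== 0;
}

function formatMode(mode: number): string {
  // Keep only the permission bits and print them as three octal digits, e.g. "644".
  const octal = (mode & 0o777).toString(8);
  return ("00" + octal).slice(-3);
}
```

Masking with 0o077 also ignores the file-type bits that stat includes in the mode, so the function can take the raw value directly.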

Exit codes:

  • 0 = SECURE
  • 1 = NEEDS ATTENTION
  • 2 = EXPOSED

Why?

OpenClaw can execute shell commands, modify files, and access your APIs. OS-level isolation (containers, VMs) protects your host machine, but it doesn't protect the services your agent has access to.

ClawReins solves this by hooking into OpenClaw's before_tool_call plugin event. Before any dangerous action executes (writes, deletes, shell commands, API calls), the agent pauses and waits for your decision. In a terminal, you get an interactive prompt. On messaging channels (WhatsApp, Telegram), the agent asks for YES, NO, or ALLOW, or for an explicit CONFIRM token when the action is irreversible, and relays your reply through a dedicated clawreins_respond tool. Every choice is logged to an immutable audit trail. Think of it as sudo for your AI agent: nothing dangerous happens without your explicit permission.

Features

  • Prevent: Stop destructive actions before execution, score irreversibility, detect risky browser state, and harden tool rollout with ToolShield-aligned guardrails.
  • Pause: Route high-impact actions through terminal or messaging approval flows, including explicit CONFIRM-* tokens for catastrophic operations.
  • Prove: Preserve audit logs, approval decisions, security scan findings, and post-fix artifacts so incidents are reviewable after the fact.

Destructive Action Intercept (Pre-Execution)

ClawReins now applies deterministic pre-execution gating for destructive actions.

  • Destructive calls are intercepted before execution and forced through HITL approval
  • HIGH severity supports YES / ALLOW
  • CATASTROPHIC severity requires explicit CONFIRM-* token
  • Fail-secure behavior: if approval tooling is unavailable, action stays blocked

Environment toggles:

CLAWREINS_DESTRUCTIVE_GATING=on   # default on
CLAWREINS_BULK_THRESHOLD=20       # default 20
CLAWREINS_CONFIRM_THRESHOLD=80    # optional, irreversibility confirm threshold
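
One way to picture how these toggles could feed the HIGH versus CATASTROPHIC split is the sketch below. It is a guess at the shape of the logic, not ClawReins' real scoring: the inputs (an item count and a 0-100 irreversibility score), the "off" value for disabling gating, and 80 as the fallback confirm threshold are all assumptions. Only the variable names and the stated defaults for gating and bulk threshold come from this README.

```typescript
// Hypothetical sketch of severity gating; ClawReins' real scoring is not shown in this README.
type Severity = "NONE" | "HIGH" | "CATASTROPHIC";

interface GatingConfig {
  enabled: boolean; // CLAWREINS_DESTRUCTIVE_GATING
  bulkThreshold: number; // CLAWREINS_BULK_THRESHOLD
  confirmThreshold: number; // CLAWREINS_CONFIRM_THRESHOLD
}

function loadGatingConfig(env: Record<string, string | undefined>): GatingConfig {
  const toNumber = (raw: string | undefined, fallback: number): number => {
    if (raw === undefined || raw.trim() === "") return fallback;
    const parsed = Number(raw);
    return isFinite(parsed) ? parsed : fallback;
  };
  return {
    // Assumption: any value other than "off" leaves gating enabled.
    enabled: (env.CLAWREINS_DESTRUCTIVE_GATING ?? "on") !== "off",
    bulkThreshold: toNumber(env.CLAWREINS_BULK_THRESHOLD, 20),
    // Assumption: 80 is used when the optional threshold is unset.
    confirmThreshold: toNumber(env.CLAWREINS_CONFIRM_THRESHOLD, 80),
  };
}

// Assumed inputs: how many items the action touches and a 0-100 irreversibility score.
function classify(itemCount: number, irreversibility: number, cfg: GatingConfig): Severity {
  if (!cfg.enabled) return "NONE";
  if (irreversibility >= cfg.confirmThreshold) return "CATASTROPHIC";
  if (itemCount >= cfg.bulkThreshold) return "HIGH";
  return "NONE";
}

function requiredApproval(severity: Severity): string {
  if (severity === "CATASTROPHIC") return "CONFIRM-* token";
  if (severity === "HIGH") return "YES / ALLOW";
  return "none";
}
```

Under these assumptions, a bulk delete of 4,382 messages with a high irreversibility score lands in CATASTROPHIC and needs a CONFIRM-* token, while a 25-item operation with a low score only needs YES or ALLOW.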

Demo script (GIF-friendly):

npm run demo:destructive

Quick Start

Prerequisites

  • Node.js >= 18.0.0
  • OpenClaw installed

Installation

# Install plugin
openclaw plugins install clawreins@beta

# Run setup
node ~/.openclaw/extensions/clawreins/dist/cli/index.js init

# Reload gateway
openclaw gateway restart

Done! ClawReins is now protecting your OpenClaw instance.

Building from Source

Use this to run ClawReins from a local clone instead of the published npm package.

# Clone and build
git clone https://github.com/pegasi-ai/clawreins
cd clawreins
npm install
npm run build

# Register as a linked plugin (loads from local source)
openclaw plugins install --link .

# Run setup
node dist/cli/index.js init

# Reload gateway
openclaw gateway restart

After any code change, run npm run build and openclaw gateway restart. No re-registration is needed.

clawreins init now enables ToolShield by default:

  • Uses bundled ToolShield core from this repo first (src/core/toolshield)
  • Falls back to auto-install via pip only if bundled core is unavailable
  • Syncs bundled experiences into OpenClaw AGENTS.md
  • Keeps ClawReins runtime interception + ToolShield instruction hardening aligned

ToolShield Sync (One Command)

If you use ToolShield for instruction-level hardening, sync it directly into your OpenClaw AGENTS.md through ClawReins:

clawreins toolshield-sync

What it does:

  • Uses bundled ToolShield core from src/core/toolshield when available
  • Falls back to installed/pip ToolShield if bundled core is unavailable
  • Removes previously injected ToolShield guidelines by default (idempotent sync)
  • Imports bundled experiences into OpenClaw instructions (AGENTS.md)

ToolShield project reference: CHATS-lab/ToolShield

Useful overrides:

# Use a different bundled model
clawreins toolshield-sync --model claude-sonnet-4.5

# Custom OpenClaw home/profile
OPENCLAW_HOME=~/.openclaw-profile-a clawreins toolshield-sync

# Target a custom AGENTS.md path
clawreins toolshield-sync --agents-file /path/to/AGENTS.md

# Force a specific bundled ToolShield source root
clawreins toolshield-sync --bundled-dir /path/to/toolshield-root

# Do not auto-install ToolShield (fail if missing)
clawreins toolshield-sync --no-install

# Append without unloading existing ToolShield section
clawreins toolshield-sync --append

How It Works

Terminal Mode (TTY)

Agent calls tool: write('/etc/passwd', 'hacked')
  → before_tool_call hook fires
  → ClawReins checks policy: write = ASK
  → Interactive prompt:
    ┌─────────────────────────────────────┐
    │ 🦞 CLAWREINS SECURITY ALERT         │
    │                                     │
    │ Module: FileSystem                  │
    │ Method: write                       │
    │ Args: ["/etc/passwd", "hacked"]     │
    │                                     │
    │ ❯ ✓ Approve                         │
    │   ✗ Reject                          │
    └─────────────────────────────────────┘
  → You reject → { block: true }
  → Decision logged to audit trail

Channel Mode (WhatsApp / Telegram)

Agent calls tool: bash('rm -rf /tmp/data')
  → before_tool_call → policy = ASK → blocked (pending approval)
  → Agent asks user for approval (or explicit token for irreversible actions)

User replies YES (normal risk):
  → Agent calls clawreins_respond({ decision: "yes" })
  → before_tool_call intercepts → approves pending entry
  → Agent retries bash('rm -rf /tmp/data') → approved ✓

User replies NO:
  → Agent calls clawreins_respond({ decision: "no" })
  → before_tool_call intercepts → denies pending entry
  → Agent does NOT retry → cancelled ✓

For high irreversibility actions:
  → ClawReins returns token requirement (e.g. CONFIRM-AB12CD)
  → Agent calls clawreins_respond({ decision: "confirm", confirmation: "CONFIRM-AB12CD" })
  → Retry proceeds only after token match ✓

The clawreins_respond tool is registered automatically via api.registerTool() when the gateway supports it. It accepts four decisions: yes, no, allow, and confirm.
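
The channel-mode flow above can be modeled as a small state machine over pending entries. The sketch below is a hypothetical in-memory model, not the real ApprovalQueue: the class name, the action keys, and the decision to treat allow the same as yes are assumptions for illustration.

```typescript
// Hypothetical in-memory model of the channel-mode flow; not the real ApprovalQueue.
type Decision = "yes" | "no" | "allow" | "confirm";
type State = "pending" | "approved" | "denied";

interface PendingEntry {
  state: State;
  requiredToken?: string; // set only for high-irreversibility actions
}

class ApprovalSketch {
  private entries = new Map<string, PendingEntry>();

  // Called when before_tool_call blocks an action that needs approval.
  request(actionKey: string, requiredToken?: string): void {
    this.entries.set(actionKey, { state: "pending", requiredToken });
  }

  // Called when the agent relays the user's reply via clawreins_respond.
  respond(actionKey: string, decision: Decision, confirmation?: string): State {
    const entry = this.entries.get(actionKey);
    if (!entry || entry.state !== "pending") return "denied"; // fail secure
    if (decision === "no") {
      entry.state = "denied";
    } else if (entry.requiredToken !== undefined) {
      // Token-gated actions ignore yes/allow and need an exact token match.
      const ok = decision === "confirm" && confirmation === entry.requiredToken;
      entry.state = ok ? "approved" : "pending";
    } else {
      entry.state = "approved";
    }
    return entry.state;
  }

  // The retried tool call proceeds only if its entry was approved.
  mayProceed(actionKey: string): boolean {
    return this.entries.get(actionKey)?.state === "approved";
  }
}
```

Note that an unknown or already-resolved action key yields "denied", which mirrors the fail-secure behavior described for the destructive action intercept.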

Memory-Aware Pre-Turn Forecasting

Before execution, ClawReins now evaluates accumulated session memory and predicts high-risk turn N+1 trajectories.

Signals:

  • Drift score: semantic drift from initial intent to current trajectory
  • Salami index: low-risk looking steps composing into a harmful chain
  • Commitment creep: rising irreversibility and narrowing rollback options

When memory trajectory risk crosses threshold, ClawReins escalates to HITL before execution and includes predicted next-step danger paths in the approval summary.
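
The README names the three signals but not how they are combined, so the following is only one plausible shape for the escalation decision. It assumes each signal is normalized to the range 0 to 1, and both thresholds (0.8 for any single signal, 0.6 for the mean) are invented for illustration.

```typescript
// Hypothetical scoring: the README names the signals but not how they are combined.
interface TrajectorySignals {
  driftScore: number; // 0..1, drift from initial intent (assumed scale)
  salamiIndex: number; // 0..1, low-risk steps composing into a harmful chain
  commitmentCreep: number; // 0..1, rising irreversibility
}

// Assumption: escalate when any single signal, or the mean of all three, crosses a threshold.
function shouldEscalate(
  signals: TrajectorySignals,
  singleThreshold = 0.8,
  meanThreshold = 0.6
): boolean {
  const values = [signals.driftScore, signals.salamiIndex, signals.commitmentCreep];
  const highest = Math.max(...values);
  const mean = values.reduce((sum, value) => sum + value, 0) / values.length;
  return highest >= singleThreshold || mean >= meanThreshold;
}
```

Checking the mean as well as the maximum reflects the salami idea: three moderately elevated signals can justify escalation even when none is alarming on its own.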

Security Policies

ClawReins uses three decision types:

| Policy | Behavior |
| --- | --- |
| ALLOW | Execute immediately (e.g., file reads) |
| ASK | Prompt for approval (e.g., file writes) |
| DENY | Block automatically (e.g., file deletes) |

Default policy (Balanced):

  • FileSystem: read=ALLOW, write=ASK, delete=DENY
  • Shell: bash=ASK, exec=ASK
  • Browser: screenshot=ALLOW, navigate/click/type/evaluate=ASK
  • Gateway: sendMessage=ASK
  • Network: fetch=ASK, request=ASK
  • Everything else: ASK (fail-secure default)

Customizable: Every rule is editable. Policies are stored as plain JSON at ~/.openclaw/clawreins/policy.json. See Customizing Security Policies for the full schema, path filtering, and examples.
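
To show how a lookup against the Balanced defaults might behave, here is a minimal sketch. The PolicySketch shape is hypothetical and is not the policy.json schema (see Customizing Security Policies for that). The rule values mirror the default policy listed above, and defaultAction is the fail-secure fallback.

```typescript
type Action = "ALLOW" | "ASK" | "DENY";

// Hypothetical in-memory shape; not the real policy.json schema.
interface PolicySketch {
  defaultAction: Action;
  modules: Record<string, Record<string, Action>>;
}

// Values mirror the Balanced defaults listed in this README.
const balanced: PolicySketch = {
  defaultAction: "ASK",
  modules: {
    FileSystem: { read: "ALLOW", write: "ASK", delete: "DENY" },
    Shell: { bash: "ASK", exec: "ASK" },
    Browser: { screenshot: "ALLOW", navigate: "ASK", click: "ASK", type: "ASK", evaluate: "ASK" },
    Gateway: { sendMessage: "ASK" },
    Network: { fetch: "ASK", request: "ASK" },
  },
};

function evaluate(policy: PolicySketch, moduleName: string, method: string): Action {
  // Own-property checks keep inherited keys such as "constructor" from matching a rule.
  const has = (obj: object, key: string): boolean =>
    Object.prototype.hasOwnProperty.call(obj, key);
  if (has(policy.modules, moduleName) && has(policy.modules[moduleName], method)) {
    return policy.modules[moduleName][method];
  }
  return policy.defaultAction; // fail-secure fallback for anything unmapped
}
```

Anything without an explicit rule, including a whole unknown module, resolves to ASK rather than ALLOW.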

CLI Commands

clawreins init        # Interactive setup wizard
clawreins configure   # Alias for init (OpenClaw configure entrypoint)
clawreins configure --non-interactive --json  # Automation-friendly machine output
clawreins policy      # Manage security policies
clawreins stats       # View statistics
clawreins audit       # View decision history
clawreins reset       # Reset statistics
clawreins disable     # Temporarily disable
clawreins enable      # Re-enable
clawreins toolshield-sync  # Sync ToolShield guardrails into AGENTS.md
clawreins upgrade     # Reinstall latest clawreins@beta in OpenClaw + restart gateway
clawreins update      # Alias for upgrade
clawreins scan        # Run 13 security checks and save an HTML report
clawreins scan --fix  # Backup config and apply supported remediations
clawreins scan --monitor  # Compare with the last baseline and alert on drift
clawreins scan --monitor --alert-command "/path/to/notifier.sh"  # Run a notifier on drift

Example: View Audit Trail

$ clawreins audit --lines 5

16:05:00 | FileSystem.read              | ALLOWED    |   0.0s
16:06:00 | FileSystem.write             | APPROVED   |   3.5s (human)
16:07:00 | Shell.bash                   | REJECTED   |   1.2s (human)
16:08:00 | FileSystem.delete            | BLOCKED    |   0.0s - Policy: DENY

Example: View Statistics

$ clawreins stats

📊 ClawReins Statistics

Total Calls:    142

Decisions:
  ✅ Allowed:      35 (24.6%)
  ✅ Approved:     89 (62.7%) - by user
  ❌ Rejected:     12 (8.5%)  - by user
  🚫 Blocked:       6 (4.2%)  - by policy

Average Decision Time: 2.8s

Data Storage

All data stored in ~/.openclaw/clawreins/:

~/.openclaw/clawreins/
├── policy.json       # Your security rules
├── decisions.jsonl   # Audit trail (append-only)
├── stats.json        # Statistics
├── scan-state.json   # Last drift-monitoring baseline
├── browser-sessions.json  # Encrypted persistent browser auth/session state
└── clawreins.log          # Application logs
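
Because decisions.jsonl is a JSON Lines file (one JSON object per line), it is easy to post-process. The sketch below counts outcomes from such a file's contents. The field names tool and outcome are assumptions, since the record schema is not documented in this README; the four outcome labels are the ones shown in the audit example.

```typescript
type Outcome = "ALLOWED" | "APPROVED" | "REJECTED" | "BLOCKED";

// Hypothetical record shape; the real decisions.jsonl fields are not documented here.
interface DecisionRecord {
  tool: string;
  outcome: Outcome;
}

// Counts outcomes in JSON Lines text: one JSON object per non-empty line.
function summarize(jsonl: string): Record<Outcome, number> {
  const counts: Record<Outcome, number> = {
    ALLOWED: 0,
    APPROVED: 0,
    REJECTED: 0,
    BLOCKED: 0,
  };
  for (const line of jsonl.split("\n")) {
    if (line.trim() === "") continue; // tolerate blank lines and the trailing newline
    const record = JSON.parse(line) as DecisionRecord;
    // Ignore records whose outcome is not one of the four known labels.
    if (Object.prototype.hasOwnProperty.call(counts, record.outcome)) {
      counts[record.outcome] += 1;
    }
  }
  return counts;
}
```

In Node.js you could feed it the file contents read from ~/.openclaw/clawreins/decisions.jsonl.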

Use as a Library

import { Interceptor, createToolCallHook } from 'clawreins';

// Create interceptor with default policy
const interceptor = new Interceptor();

// Create a hook handler for OpenClaw's before_tool_call event
const hook = createToolCallHook(interceptor);

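// Register with the OpenClaw plugin API.
// `api` is not defined in this snippet; it is assumed to be the plugin API object
// that OpenClaw provides to your plugin.
api.on('before_tool_call', hook);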

Protected Tools

ClawReins intercepts every tool mapped in TOOL_TO_MODULE:

  • FileSystem: read, write, edit, glob
  • Shell: bash, exec
  • Browser: navigate, screenshot, click, type, evaluate
  • Network: fetch, request, webhook, download
  • Gateway: listSessions, listNodes, sendMessage

Any unmapped tool falls through to defaultAction (ASK by default).

Architecture

src/
├── core/
│   ├── Interceptor.ts    # Policy evaluation engine
│   ├── Arbitrator.ts     # Human-in-the-loop (TTY prompt / channel queue)
│   ├── ApprovalQueue.ts  # In-memory approval state for channel mode
│   ├── MemoryRiskForecaster.ts  # Drift/salami/commitment pre-turn forecasting
│   ├── toolshield/       # Bundled ToolShield core used for default sync
│   └── Logger.ts         # Winston-based logging
├── plugin/
│   ├── index.ts              # Plugin entry point (hook + tool registration)
│   ├── tool-interceptor.ts   # before_tool_call handler + clawreins_respond intercept
│   └── config-manager.ts     # OpenClaw config management (register/unregister)
├── storage/        # Persistence (PolicyStore, DecisionLog, StatsTracker)
├── cli/            # Command-line interface
├── toolshield/     # ToolShield sync integration helpers
├── types.ts        # TypeScript definitions
└── config.ts       # Default policies

Development

# Clone repo
git clone https://github.com/pegasi-ai/clawreins
cd clawreins

# Install dependencies
npm install

# Build
npm run build

# Test CLI locally
node dist/cli/index.js init

# Link for global testing
npm link
clawreins --help

Security Guarantees

  • ✅ Zero Trust - Every action evaluated
  • ✅ Synchronous Blocking - Agent waits for approval
  • ✅ No Bypass - Plugin hooks intercept all tool calls
  • ✅ Immutable Audit - JSON Lines append-only format
  • ✅ Human Authority - Critical decisions need approval
  • ✅ Fail Secure - Unknown actions default to ASK/DENY

Contributing

We believe in safe AI. PRs welcome!

  1. Fork the repo
  2. Create your feature branch: git checkout -b feature/amazing
  3. Commit changes: git commit -m 'Add amazing feature'
  4. Push: git push origin feature/amazing
  5. Open a Pull Request

See CONTRIBUTING.md for details.

License

Apache 2.0 - See LICENSE for details.

Acknowledgments

  • Built for OpenClaw agents
  • ToolShield methodology and implementation from CHATS-lab/ToolShield
  • Inspired by the need for human oversight in AI systems
  • Thanks to the AI safety community

Built with ❤️ for a safer AI future.
