🦞 + 🪢 ClawReins

Runtime safety and human approval infrastructure for computer-using agents.

OpenClaw is powerful. That's the problem. ClawReins is the watchdog layer.

ClawReins sits between an AI agent and the real world and protects agents at two stages:

Before runtime → security scanning
During runtime → action interception

Think of it as sudo for AI agents. The first production integration is OpenClaw, where ClawReins plugs into the before_tool_call event so you can:

Prevent destructive actions before they execute
Pause for human approval with YES / ALLOW / CONFIRM flows
Prove what happened with durable audit logs and post-incident review

OpenClaw cannot be its own watchdog. Neither can any CUA.

Demo

Hero example: an OpenClaw agent tries to bulk-delete 4,382 Gmail messages. ClawReins blocks it before execution.

That is the core runtime story:

destructive action detected
execution paused before side effects
human approval required
decision written to the audit trail

In The News

TechCrunch (February 23, 2026): A Meta AI security researcher said an OpenClaw agent ran amok on her inbox

Intercept Example

Runtime Interception

Runtime interception is the enforcement layer. It is what stops an agent mid-trajectory when the action is destructive, irreversible, or operating under risky browser state.

Core capabilities:

Browser-state awareness for CAPTCHA, 2FA, and challenge walls
Irreversibility scoring for risky versus catastrophic actions
Runtime intervention across terminal and messaging approval channels
ToolShield-aligned hardening for new tool rollouts
Full audit logging for every approval decision

Security Scan

ClawReins includes a security scanner that audits the local OpenClaw installation for high-signal misconfigurations before runtime problems turn into incidents.

Running clawreins scan writes an HTML report to ~/Downloads/scan-report.html and prints a file:// link directly in the terminal.

Usage:

# Run the 13-check audit and save the HTML report
clawreins scan

# Save the report and try to open it automatically
clawreins scan --html

# Machine-readable output for CI
clawreins scan --json

# Apply supported auto-fixes after confirmation
clawreins scan --fix

# Apply supported auto-fixes without prompting
clawreins scan --fix --yes

# Compare against the last saved baseline and alert on drift
clawreins scan --monitor

# Compare against the baseline and invoke a notifier when drift is detected
clawreins scan --monitor --alert-command "/path/to/send-openclaw-alert.sh"

Supported auto-fixes:

Rebinding gateway host from 0.0.0.0 to 127.0.0.1
Tightening config file permissions to 600
Injecting a default tools.exec.safeBins allowlist
Disabling authBypass / skipAuth / disableAuth style flags

Before any fix is applied, ClawReins creates a timestamped backup in ~/.scan-backup/.
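The permission fix listed above targets mode 600. As a minimal sketch of what such a check amounts to, assuming a POSIX mode number like the one returned by stat: both function names are hypothetical and are not part of the ClawReins API.

```typescript
// Hypothetical helpers for illustration only; not the ClawReins implementation.

// Returns true when a POSIX file mode grants any access to group or others.
function isGroupOrWorldAccessible(mode: number): boolean {
  // 0o077 masks the group (0o070) and other (0o007) permission bits.
  return (mode & 0o077) !== 0;
}

// Returns the mode with all permission bits replaced by owner read/write,
// which is what `chmod 600` produces. File-type bits from stat() are kept.
function tightenTo600(mode: number): number {
  const fileTypeBits = mode & ~0o7777;
  return fileTypeBits | 0o600;
}
```

In a Node script, the mode would typically come from fs.statSync(path).mode, and the tightened value would be applied with fs.chmodSync(path, 0o600).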

Drift Monitoring

Drift monitoring is opt-in and designed for scheduled runs.

Default monitoring behavior:

disabled by default
run every 24 hours when scheduled
compare against ~/.openclaw/clawreins/scan-state.json
alert only on worsened posture: verdict worsening, new WARN, or new FAIL
no background auto-fix
HTML report still written to ~/Downloads/scan-report.html

Manual run:

clawreins scan --monitor

The first run creates a baseline. Later runs compare the current report against that saved baseline and only alert when posture worsens.

If you want scheduled jobs to notify through your own transport, add --alert-command. This command runs only when drift is detected. ClawReins exports these environment variables to the notifier:

CLAWREINS_SCAN_SUMMARY
CLAWREINS_SCAN_VERDICT
CLAWREINS_SCAN_REPORT_PATH
CLAWREINS_SCAN_REPORT_URL
CLAWREINS_SCAN_STATE_PATH
CLAWREINS_SCAN_WORSENED_CHECKS

That makes it easy to route alerts through:

an OpenClaw messaging wrapper
a webhook sender
email, Slack, Telegram, or WhatsApp bridge scripts
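A notifier only has to read those variables and hand a message to a transport. The sketch below shows one way to assemble the message. The variable names come from the list above; the function name and message layout are assumptions. The format of CLAWREINS_SCAN_WORSENED_CHECKS is not documented here, so the value is passed through as an opaque string.

```typescript
// Hypothetical message builder for a notifier script; not shipped with ClawReins.
type NotifierEnv = { [key: string]: string | undefined };

function buildDriftAlert(env: NotifierEnv): string {
  const verdict = env["CLAWREINS_SCAN_VERDICT"] ?? "UNKNOWN";
  const summary = env["CLAWREINS_SCAN_SUMMARY"] ?? "(no summary provided)";
  const worsened = env["CLAWREINS_SCAN_WORSENED_CHECKS"] ?? "";
  const reportUrl = env["CLAWREINS_SCAN_REPORT_URL"] ?? "";

  const lines = ["ClawReins drift detected: " + verdict, summary];
  // Optional fields are only included when the variable is set and non-empty.
  if (worsened !== "") lines.push("Worsened checks: " + worsened);
  if (reportUrl !== "") lines.push("Report: " + reportUrl);
  return lines.join("\n");
}
```

In a Node-based notifier you would call buildDriftAlert(process.env) and pass the result to whichever webhook or messaging bridge you use.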

Notifier example:

clawreins scan --monitor \
  --alert-command "$HOME/bin/send-openclaw-alert.sh"

The alert hook is generic on purpose. The scan CLI does not directly call the in-process OpenClaw plugin API from cron or system schedulers, so the notifier command is the bridge if you want alerts to land through OpenClaw-managed messaging.

Scheduled Runs

Recommended operating model:

run once per day
use --monitor so each run compares against the last saved baseline
add --alert-command if you want drift notifications delivered outside the terminal
never use --fix in scheduled jobs

What happens on scheduled runs:

The first scheduled run creates the baseline in ~/.openclaw/clawreins/scan-state.json.
Later runs compare the current ScanReport against that saved baseline.
ClawReins alerts only when posture worsens: verdict gets worse, a check changes from PASS to WARN, or a check changes from PASS or WARN to FAIL.
Every run still writes ~/Downloads/scan-report.html so the latest full report is easy to inspect.
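The alert rule above boils down to a rank comparison between the baseline and the current run. The sketch below illustrates that rule only. The snapshot shape is hypothetical (the real scan-state.json schema may differ), and treating a check that is missing from the baseline as PASS is an assumption.

```typescript
type CheckStatus = "PASS" | "WARN" | "FAIL";
type ScanVerdict = "SECURE" | "NEEDS ATTENTION" | "EXPOSED";

// Hypothetical snapshot shape, for illustration only.
interface ScanSnapshot {
  verdict: ScanVerdict;
  checks: { [checkId: string]: CheckStatus };
}

const STATUS_RANK: { [S in CheckStatus]: number } = { PASS: 0, WARN: 1, FAIL: 2 };
const VERDICT_RANK: { [V in ScanVerdict]: number } = {
  SECURE: 0,
  "NEEDS ATTENTION": 1,
  EXPOSED: 2,
};

function findWorsened(baseline: ScanSnapshot, current: ScanSnapshot) {
  const worsenedChecks: string[] = [];
  for (const id of Object.keys(current.checks)) {
    // Assumption: a check absent from the baseline is compared against PASS.
    const before: CheckStatus | undefined = baseline.checks[id];
    if (STATUS_RANK[current.checks[id]] > STATUS_RANK[before ?? "PASS"]) {
      worsenedChecks.push(id);
    }
  }
  const verdictWorsened =
    VERDICT_RANK[current.verdict] > VERDICT_RANK[baseline.verdict];
  return {
    verdictWorsened,
    worsenedChecks,
    // Improvements and unchanged results never trigger an alert.
    shouldAlert: verdictWorsened || worsenedChecks.length > 0,
  };
}
```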

Recommended scheduler settings:

frequency: every 24 hours
stdout/stderr: append to a dedicated log file such as ~/.openclaw/clawreins/scan-monitor.log
environment: set HOME and OPENCLAW_HOME explicitly
notifier: use --alert-command for OpenClaw wrappers, webhooks, or messaging bridges

Example daily job with drift logging only:

# crontab entries must stay on one line; cron does not support backslash continuation
0 9 * * * /usr/bin/env HOME=$HOME OPENCLAW_HOME=$HOME/.openclaw /usr/local/bin/clawreins scan --monitor >> $HOME/.openclaw/clawreins/scan-monitor.log 2>&1

Example daily job with drift alert delivery:

# crontab entries must stay on one line; cron does not support backslash continuation
0 9 * * * /usr/bin/env HOME=$HOME OPENCLAW_HOME=$HOME/.openclaw /usr/local/bin/clawreins scan --monitor --alert-command "$HOME/bin/send-openclaw-alert.sh" >> $HOME/.openclaw/clawreins/scan-monitor.log 2>&1

If you want the scheduled job to fail loudly for automation, the exit codes stay the same in monitor mode:

0 for SECURE
1 for NEEDS ATTENTION
2 for EXPOSED

That makes scheduled monitoring usable from cron, systemd, CI, or any wrapper that reacts to non-zero exit codes.
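A wrapper that reacts to those exit codes could be sketched as follows. The code-to-verdict mapping is taken from the list above. The tolerateWarnings switch and both function names are hypothetical and only illustrate one possible policy.

```typescript
type ScanExitVerdict = "SECURE" | "NEEDS ATTENTION" | "EXPOSED";

// Mapping taken from the documented exit codes.
function verdictFromExitCode(code: number): ScanExitVerdict | undefined {
  switch (code) {
    case 0:
      return "SECURE";
    case 1:
      return "NEEDS ATTENTION";
    case 2:
      return "EXPOSED";
    default:
      // Any other code is treated here as "the scan did not complete".
      return undefined;
  }
}

// Hypothetical wrapper policy: fail the job unless the posture is acceptable.
function shouldFailJob(code: number, tolerateWarnings: boolean): boolean {
  const verdict = verdictFromExitCode(code);
  if (verdict === undefined) return true;
  if (verdict === "SECURE") return false;
  if (verdict === "NEEDS ATTENTION") return !tolerateWarnings;
  return true; // EXPOSED always fails
}
```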

Security Checks

Check	Severity	Detects	Auto-fix
`GATEWAY_BINDING`	Critical	Gateway listening on `0.0.0.0` or missing localhost binding	Yes
`API_KEYS_EXPOSURE`	Critical	Plaintext API keys, tokens, passwords, or secrets stored directly in config files	No
`FILE_PERMISSIONS`	Critical	Config files readable by group or other users instead of `600`	Yes
`HTTPS_TLS`	Warning	Missing HTTPS/TLS or certificate-related configuration	No
`SHELL_COMMAND_ALLOWLIST`	Critical	Missing `safeBins` or equivalent shell allowlist / unrestricted shell execution	Yes
`SENSITIVE_DIRECTORIES`	Warning	Agent environment can still reach directories like `~/.ssh`, `~/.gnupg`, `~/.aws`, or `/etc/shadow`	No
`WEBHOOK_AUTH`	Warning	Webhook endpoints configured without auth tokens or shared secrets	No
`SANDBOX_ISOLATION`	Warning	No Docker or sandbox isolation detected	No
`DEFAULT_WEAK_CREDENTIALS`	Critical	Default, weak, undefined, or missing gateway credentials	No
`RATE_LIMITING`	Warning	No gateway throttling or rate limit configuration	No
`NODEJS_VERSION`	Critical	Node.js versions affected by CVE-2026-21636 permission-model bypass window	No
`CONTROL_UI_AUTH`	Critical	Control UI authentication bypass flags enabled	Yes
`BROWSER_UNSANDBOXED`	Critical	Browser skill config missing `headless: true` or `sandbox: true` protection	No

Exit codes:

0 = SECURE
1 = NEEDS ATTENTION
2 = EXPOSED

Why?

OpenClaw can execute shell commands, modify files, and access your APIs. OS-level isolation (containers, VMs) protects your host machine, but it doesn't protect the services your agent has access to.

ClawReins solves this by hooking into OpenClaw's before_tool_call plugin event. Before any dangerous action executes (writes, deletes, shell commands, API calls), the agent pauses and waits for your decision. In a terminal, you get an interactive prompt. On messaging channels (WhatsApp, Telegram), the agent asks for YES, NO, or ALLOW, or for an explicit CONFIRM token when the action is irreversible, and relays your reply through a dedicated clawreins_respond tool. Every choice is logged to an immutable audit trail. Think of it as sudo for your AI agent: nothing happens without your explicit permission.

Features

Prevent: Stop destructive actions before execution, score irreversibility, detect risky browser state, and harden tool rollout with ToolShield-aligned guardrails.
Pause: Route high-impact actions through terminal or messaging approval flows, including explicit CONFIRM-* tokens for catastrophic operations.
Prove: Preserve audit logs, approval decisions, security scan findings, and post-fix artifacts so incidents are reviewable after the fact.

Destructive Action Intercept (Pre-Execution)

ClawReins now applies deterministic pre-execution gating for destructive actions.

Destructive calls are intercepted before execution and forced through HITL approval
HIGH severity supports YES / ALLOW
CATASTROPHIC severity requires explicit CONFIRM-* token
Fail-secure behavior: if approval tooling is unavailable, action stays blocked

Environment toggles:

CLAWREINS_DESTRUCTIVE_GATING=on   # default on
CLAWREINS_BULK_THRESHOLD=20       # default 20
CLAWREINS_CONFIRM_THRESHOLD=80    # optional, irreversibility confirm threshold
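To make the gating rules concrete, here is a minimal sketch of how the toggles and severity tiers could fit together. The variable names and defaults come from the block above. Everything else is an assumption: the type and function names, the 0-100 irreversibility scale, "off" as the disabling value, and the way the bulk threshold feeds into gating. None of it is the actual ClawReins logic.

```typescript
interface GateConfig {
  gatingOn: boolean;
  bulkThreshold: number;
  confirmThreshold: number;
}

function readThreshold(raw: string | undefined, fallback: number): number {
  const parsed = parseInt(raw === undefined ? "" : raw, 10);
  return isNaN(parsed) ? fallback : parsed;
}

function parseGateConfig(env: { [key: string]: string | undefined }): GateConfig {
  return {
    // Assumption: any value other than "off" leaves gating enabled.
    gatingOn: (env["CLAWREINS_DESTRUCTIVE_GATING"] ?? "on") !== "off",
    bulkThreshold: readThreshold(env["CLAWREINS_BULK_THRESHOLD"], 20),
    confirmThreshold: readThreshold(env["CLAWREINS_CONFIRM_THRESHOLD"], 80),
  };
}

type GateOutcome = "PASS" | "NEEDS_YES_OR_ALLOW" | "NEEDS_CONFIRM_TOKEN" | "BLOCKED";

// Hypothetical description of a pending tool call.
interface ActionFacts {
  destructive: boolean;
  itemCount: number;       // e.g. number of messages in a bulk delete
  irreversibility: number; // assumed 0-100 score
  approvalAvailable: boolean;
}

function gateAction(action: ActionFacts, config: GateConfig): GateOutcome {
  // PASS means "not handled by this gate"; regular policy would still apply.
  if (!config.gatingOn) return "PASS";
  const gated = action.destructive || action.itemCount >= config.bulkThreshold;
  if (!gated) return "PASS";
  // Fail-secure: without approval tooling the action stays blocked.
  if (!action.approvalAvailable) return "BLOCKED";
  return action.irreversibility >= config.confirmThreshold
    ? "NEEDS_CONFIRM_TOKEN" // CATASTROPHIC tier
    : "NEEDS_YES_OR_ALLOW"; // HIGH tier
}
```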

Demo script (GIF-friendly):

npm run demo:destructive

Quick Start

Prerequisites

Node.js >= 18.0.0
OpenClaw installed

Installation

# Install plugin
openclaw plugins install clawreins@beta

# Run setup
node ~/.openclaw/extensions/clawreins/dist/cli/index.js init

# Reload gateway
openclaw gateway restart

Done! ClawReins is now protecting your OpenClaw instance.

Building from Source

Use this to run ClawReins from a local clone instead of the published npm package.

# Clone and build
git clone https://github.com/pegasi-ai/clawreins
cd clawreins
npm install
npm run build

# Register as a linked plugin (loads from local source)
openclaw plugins install --link .

# Run setup
node dist/cli/index.js init

# Reload gateway
openclaw gateway restart

After any code change, run npm run build and openclaw gateway restart. No re-registration is needed.

clawreins init now enables ToolShield by default:

Uses bundled ToolShield core from this repo first (src/core/toolshield)
Falls back to auto-install via pip only if bundled core is unavailable
Syncs bundled experiences into OpenClaw AGENTS.md
Keeps ClawReins runtime interception + ToolShield instruction hardening aligned

ToolShield Sync (One Command)

If you use ToolShield for instruction-level hardening, sync it directly into your OpenClaw AGENTS.md through ClawReins:

clawreins toolshield-sync

What it does:

Uses bundled ToolShield core from src/core/toolshield when available
Falls back to installed/pip ToolShield if bundled core is unavailable
Removes previously injected ToolShield guidelines by default (idempotent sync)
Imports bundled experiences into OpenClaw instructions (AGENTS.md)

ToolShield project reference: CHATS-lab/ToolShield

Useful overrides:

# Use a different bundled model
clawreins toolshield-sync --model claude-sonnet-4.5

# Custom OpenClaw home/profile
OPENCLAW_HOME=~/.openclaw-profile-a clawreins toolshield-sync

# Target a custom AGENTS.md path
clawreins toolshield-sync --agents-file /path/to/AGENTS.md

# Force a specific bundled ToolShield source root
clawreins toolshield-sync --bundled-dir /path/to/toolshield-root

# Do not auto-install ToolShield (fail if missing)
clawreins toolshield-sync --no-install

# Append without unloading existing ToolShield section
clawreins toolshield-sync --append

How It Works

Terminal Mode (TTY)

Agent calls tool: write('/etc/passwd', 'hacked')
  → before_tool_call hook fires
  → ClawReins checks policy: write = ASK
  → Interactive prompt:
    ┌─────────────────────────────────────┐
    │ 🦞 CLAWREINS SECURITY ALERT         │
    │                                     │
    │ Module: FileSystem                  │
    │ Method: write                       │
    │ Args: ["/etc/passwd", "hacked"]     │
    │                                     │
    │ ❯ ✓ Approve                         │
    │   ✗ Reject                          │
    └─────────────────────────────────────┘
  → You reject → { block: true }
  → Decision logged to audit trail

Channel Mode (WhatsApp / Telegram)

Agent calls tool: bash('rm -rf /tmp/data')
  → before_tool_call → policy = ASK → blocked (pending approval)
  → Agent asks user for approval (or explicit token for irreversible actions)

User replies YES (normal risk):
  → Agent calls clawreins_respond({ decision: "yes" })
  → before_tool_call intercepts → approves pending entry
  → Agent retries bash('rm -rf /tmp/data') → approved ✓

User replies NO:
  → Agent calls clawreins_respond({ decision: "no" })
  → before_tool_call intercepts → denies pending entry
  → Agent does NOT retry → cancelled ✓

For high irreversibility actions:
  → ClawReins returns token requirement (e.g. CONFIRM-AB12CD)
  → Agent calls clawreins_respond({ decision: "confirm", confirmation: "CONFIRM-AB12CD" })
  → Retry proceeds only after token match ✓

The clawreins_respond tool is registered automatically via api.registerTool() when the gateway supports it. It accepts four decisions: yes, no, allow, and confirm.
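The block, respond, and retry cycle above can be pictured as a small in-memory state machine. The sketch below is illustrative only. The class name, the string call key, and the single-use approval rule are assumptions, and the sketch treats yes and allow identically because the flow above does not spell out how they differ.

```typescript
type ResponseDecision = "yes" | "no" | "allow" | "confirm";
type RespondResult = "pending" | "approved" | "denied" | "unknown";

interface PendingApproval {
  state: "pending" | "approved";
  requiredToken: string | undefined;
}

// Hypothetical stand-in for channel-mode approval state.
class ApprovalQueueSketch {
  private entries: { [callKey: string]: PendingApproval } = {};

  // Called from the before_tool_call path. Returns true if the call may run.
  checkCall(callKey: string, requiredToken?: string): boolean {
    const entry = this.lookup(callKey);
    if (entry === undefined) {
      // First attempt: record a pending entry and block.
      this.entries[callKey] = { state: "pending", requiredToken };
      return false;
    }
    if (entry.state === "approved") {
      delete this.entries[callKey]; // assumption: an approval is single-use
      return true;
    }
    return false;
  }

  // Called when the agent relays the user's reply.
  respond(callKey: string, decision: ResponseDecision, confirmation?: string): RespondResult {
    const entry = this.lookup(callKey);
    if (entry === undefined) return "unknown";
    if (decision === "no") {
      delete this.entries[callKey];
      return "denied";
    }
    if (entry.requiredToken !== undefined) {
      // Token-gated entries accept only a matching confirm response.
      if (decision === "confirm" && confirmation === entry.requiredToken) {
        entry.state = "approved";
      }
    } else if (decision === "yes" || decision === "allow") {
      entry.state = "approved";
    }
    return entry.state;
  }

  private lookup(callKey: string): PendingApproval | undefined {
    return Object.prototype.hasOwnProperty.call(this.entries, callKey)
      ? this.entries[callKey]
      : undefined;
  }
}
```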

Memory-Aware Pre-Turn Forecasting

Before execution, ClawReins now evaluates accumulated session memory and predicts high-risk turn N+1 trajectories.

Signals:

Drift score: semantic drift from initial intent to current trajectory
Salami index: low-risk looking steps composing into a harmful chain
Commitment creep: rising irreversibility and narrowing rollback options

When memory trajectory risk crosses threshold, ClawReins escalates to HITL before execution and includes predicted next-step danger paths in the approval summary.
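How the three signals are scored and combined is not specified here, so the following is a sketch under stated assumptions rather than the ClawReins algorithm. It assumes each signal is normalized to the range 0 to 1, takes the maximum as the trajectory risk so that a single strong signal is enough, and uses an illustrative threshold of 0.7. All names are hypothetical.

```typescript
// All names, the 0..1 scale, and the 0.7 default are illustrative assumptions.
interface TrajectorySignals {
  driftScore: number;      // semantic drift from the initial intent
  salamiIndex: number;     // low-risk steps composing into a harmful chain
  commitmentCreep: number; // rising irreversibility, narrowing rollback options
}

interface ForecastResult {
  risk: number;
  escalate: boolean;
  triggeredBy: string[];
}

function forecastTurnRisk(signals: TrajectorySignals, threshold: number = 0.7): ForecastResult {
  const labels = ["driftScore", "salamiIndex", "commitmentCreep"];
  const values = [signals.driftScore, signals.salamiIndex, signals.commitmentCreep];

  let risk = 0;
  const triggeredBy: string[] = [];
  for (let i = 0; i < values.length; i++) {
    if (values[i] > risk) risk = values[i];
    // Record which signals crossed the threshold for the approval summary.
    if (values[i] >= threshold) triggeredBy.push(labels[i]);
  }
  return { risk, escalate: risk >= threshold, triggeredBy };
}
```

Taking the maximum rather than an average keeps one alarming signal from being diluted by two quiet ones. A weighted sum would be an equally plausible choice.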

Security Policies

ClawReins uses three decision types:

Policy	Behavior
ALLOW	Execute immediately (e.g., file reads)
ASK	Prompt for approval (e.g., file writes)
DENY	Block automatically (e.g., file deletes)

Default policy (Balanced):

FileSystem: read=ALLOW, write=ASK, delete=DENY
Shell: bash=ASK, exec=ASK
Browser: screenshot=ALLOW, navigate/click/type/evaluate=ASK
Gateway: sendMessage=ASK
Network: fetch=ASK, request=ASK
Everything else: ASK (fail-secure default)

Customizable: Every rule is editable. Policies are stored as plain JSON at ~/.openclaw/clawreins/policy.json. See Customizing Security Policies for the full schema, path filtering, and examples.
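The Balanced defaults above can be read as a lookup table with a fail-secure fallback. The sketch below mirrors those defaults, but the object shape is hypothetical and is not the documented policy.json schema. See Customizing Security Policies for the real one.

```typescript
type PolicyDecision = "ALLOW" | "ASK" | "DENY";

// Hypothetical in-memory shape, for illustration only.
interface PolicySketch {
  defaultAction: PolicyDecision;
  modules: { [moduleName: string]: { [method: string]: PolicyDecision } };
}

const balancedPolicy: PolicySketch = {
  defaultAction: "ASK",
  modules: {
    FileSystem: { read: "ALLOW", write: "ASK", delete: "DENY" },
    Shell: { bash: "ASK", exec: "ASK" },
    Browser: { screenshot: "ALLOW", navigate: "ASK", click: "ASK", type: "ASK", evaluate: "ASK" },
    Gateway: { sendMessage: "ASK" },
    Network: { fetch: "ASK", request: "ASK" },
  },
};

function hasOwn(obj: object, key: string): boolean {
  // Guards against inherited keys such as "toString" or "constructor".
  return Object.prototype.hasOwnProperty.call(obj, key);
}

function decide(policy: PolicySketch, moduleName: string, method: string): PolicyDecision {
  if (!hasOwn(policy.modules, moduleName)) return policy.defaultAction;
  const rules = policy.modules[moduleName];
  // Anything without an explicit rule falls back to the fail-secure default.
  return hasOwn(rules, method) ? rules[method] : policy.defaultAction;
}
```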

CLI Commands

clawreins init        # Interactive setup wizard
clawreins configure   # Alias for init (OpenClaw configure entrypoint)
clawreins configure --non-interactive --json  # Automation-friendly machine output
clawreins policy      # Manage security policies
clawreins stats       # View statistics
clawreins audit       # View decision history
clawreins reset       # Reset statistics
clawreins disable     # Temporarily disable
clawreins enable      # Re-enable
clawreins toolshield-sync  # Sync ToolShield guardrails into AGENTS.md
clawreins upgrade     # Reinstall latest clawreins@beta in OpenClaw + restart gateway
clawreins update      # Alias for upgrade
clawreins scan        # Run 13 security checks and save an HTML report
clawreins scan --fix  # Backup config and apply supported remediations
clawreins scan --monitor  # Compare with the last baseline and alert on drift
clawreins scan --monitor --alert-command "/path/to/notifier.sh"  # Run a notifier on drift

Example: View Audit Trail

$ clawreins audit --lines 5

16:05:00 | FileSystem.read              | ALLOWED    |   0.0s
16:06:00 | FileSystem.write             | APPROVED   |   3.5s (human)
16:07:00 | Shell.bash                   | REJECTED   |   1.2s (human)
16:08:00 | FileSystem.delete            | BLOCKED    |   0.0s - Policy: DENY

Example: View Statistics

$ clawreins stats

📊 ClawReins Statistics

Total Calls:    142

Decisions:
  ✅ Allowed:      35 (24.6%)
  ✅ Approved:     89 (62.7%) - by user
  ❌ Rejected:     12 (8.5%)  - by user
  🚫 Blocked:       6 (4.2%)  - by policy

Average Decision Time: 2.8s

Data Storage

All data stored in ~/.openclaw/clawreins/:

~/.openclaw/clawreins/
├── policy.json       # Your security rules
├── decisions.jsonl   # Audit trail (append-only)
├── stats.json        # Statistics
├── scan-state.json   # Last drift-monitoring baseline
├── browser-sessions.json  # Encrypted persistent browser auth/session state
└── clawreins.log          # Application logs
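decisions.jsonl uses the JSON Lines convention: one JSON object per line, appended and never rewritten. A minimal sketch of writing and reading that format follows. The record fields are hypothetical, and only the outcome labels are taken from the audit example above.

```typescript
// Hypothetical record shape; the real decisions.jsonl fields may differ.
interface AuditRecord {
  timestamp: string;
  tool: string; // e.g. "FileSystem.write"
  outcome: "ALLOWED" | "APPROVED" | "REJECTED" | "BLOCKED";
  decisionSeconds: number;
}

// One record becomes exactly one line, so appending never touches old entries.
function toJsonLine(record: AuditRecord): string {
  return JSON.stringify(record) + "\n";
}

// Parses a whole JSON Lines document, skipping blank lines.
function parseJsonLines(text: string): AuditRecord[] {
  return text
    .split("\n")
    .filter((line) => line.trim() !== "")
    .map((line) => JSON.parse(line) as AuditRecord);
}
```

In Node, a line produced by toJsonLine would typically be written with fs.appendFileSync, which adds to the end of the file without rewriting earlier entries.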

Use as a Library

import { Interceptor, createToolCallHook } from 'clawreins';

// Create interceptor with default policy
const interceptor = new Interceptor();

// Create a hook handler for OpenClaw's before_tool_call event
const hook = createToolCallHook(interceptor);

// Register with the OpenClaw plugin API
api.on('before_tool_call', hook);

Protected Tools

ClawReins intercepts every tool mapped in TOOL_TO_MODULE:

FileSystem: read, write, edit, glob
Shell: bash, exec
Browser: navigate, screenshot, click, type, evaluate
Network: fetch, request, webhook, download
Gateway: listSessions, listNodes, sendMessage

Any unmapped tool falls through to defaultAction (ASK by default).
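As an illustration of that fallback, the sketch below rebuilds a tool-to-module table from the list above and resolves a tool name to the Module.method label used in the audit trail. The table layout and function name are assumptions. The real TOOL_TO_MODULE constant may be structured differently.

```typescript
// Rebuilt from the list above for illustration; not the actual constant.
const TOOL_TO_MODULE_SKETCH: { [tool: string]: string } = {
  read: "FileSystem", write: "FileSystem", edit: "FileSystem", glob: "FileSystem",
  bash: "Shell", exec: "Shell",
  navigate: "Browser", screenshot: "Browser", click: "Browser", type: "Browser", evaluate: "Browser",
  fetch: "Network", request: "Network", webhook: "Network", download: "Network",
  listSessions: "Gateway", listNodes: "Gateway", sendMessage: "Gateway",
};

// Returns a label such as "Shell.bash", or undefined for an unmapped tool.
// A caller would apply defaultAction (ASK by default) in the undefined case.
function resolveToolLabel(tool: string): string | undefined {
  // hasOwnProperty avoids matching inherited keys such as "toString".
  if (!Object.prototype.hasOwnProperty.call(TOOL_TO_MODULE_SKETCH, tool)) {
    return undefined;
  }
  return TOOL_TO_MODULE_SKETCH[tool] + "." + tool;
}
```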

Architecture

src/
├── core/
│   ├── Interceptor.ts    # Policy evaluation engine
│   ├── Arbitrator.ts     # Human-in-the-loop (TTY prompt / channel queue)
│   ├── ApprovalQueue.ts  # In-memory approval state for channel mode
│   ├── MemoryRiskForecaster.ts  # Drift/salami/commitment pre-turn forecasting
│   ├── toolshield/       # Bundled ToolShield core used for default sync
│   └── Logger.ts         # Winston-based logging
├── plugin/
│   ├── index.ts              # Plugin entry point (hook + tool registration)
│   ├── tool-interceptor.ts   # before_tool_call handler + clawreins_respond intercept
│   └── config-manager.ts     # OpenClaw config management (register/unregister)
├── storage/        # Persistence (PolicyStore, DecisionLog, StatsTracker)
├── cli/            # Command-line interface
├── toolshield/     # ToolShield sync integration helpers
├── types.ts        # TypeScript definitions
└── config.ts       # Default policies

Development

# Clone repo
git clone https://github.com/pegasi-ai/clawreins
cd clawreins

# Install dependencies
npm install

# Build
npm run build

# Test CLI locally
node dist/cli/index.js init

# Link for global testing
npm link
clawreins --help

Security Guarantees

✅ Zero Trust - Every action evaluated
✅ Synchronous Blocking - Agent waits for approval
✅ No Bypass - Plugin hooks intercept all tool calls
✅ Immutable Audit - JSON Lines append-only format
✅ Human Authority - Critical decisions need approval
✅ Fail Secure - Unknown actions default to ASK/DENY

Contributing

We believe in safe AI. PRs welcome!

Fork the repo
Create your feature branch: git checkout -b feature/amazing
Commit changes: git commit -m 'Add amazing feature'
Push: git push origin feature/amazing
Open a Pull Request

See CONTRIBUTING.md for details.

License

Apache 2.0 - See LICENSE for details.

Acknowledgments

Built for OpenClaw agents
ToolShield methodology and implementation from CHATS-lab/ToolShield
Inspired by the need for human oversight in AI systems
Thanks to the AI safety community

Built with ❤️ for a safer AI future.
