Policy and Approval System

Wazuh Autopilot implements an enterprise-grade policy engine that controls all automated actions. This document explains how policies work and how to configure them.

Core Principles

Deny by Default - Actions require explicit enablement
Separation of Duties - Different agents have different permissions
Audit Trail - Every decision is logged with reason codes
Configurable Autonomy - Balance automation with human oversight
Inline Enforcement - Policies are enforced at the Runtime Service level, not just by agents

Enforcement Architecture

Policy enforcement operates at two levels:

1. Runtime-Level (Inline) — Primary

The Runtime Service enforces policies/policy.yaml at the following enforcement points:

| Enforcement Point | API Endpoint | What's Checked |
|---|---|---|
| Plan Creation | `POST /api/plans` | Each action validated against `actions.allowlist`: must be `enabled` and must meet `min_confidence`. Time window check for the `response_planning` operation. |
| Plan Approval | `POST /api/plans/:id/approve` | Approver validated against `approvers.groups`: must have the action in `can_approve`, and the risk level must be within `max_risk_level`. |
| Plan Execution (pre-loop) | `POST /api/plans/:id/execute` | Evidence count validated against `min_evidence_items`. Time window check for the `action_execution` operation; denies the entire plan if outside the window. |
| Plan Execution (per-action) | `POST /api/plans/:id/execute` | Each action checked for idempotency (duplicate action+target within window) and rate limits (per-action and global hourly/daily). Denied actions are skipped individually. |

Fail modes:

Production mode (AUTOPILOT_MODE=production): Fail-closed — denies if policy cannot be loaded
Bootstrap mode (AUTOPILOT_MODE=bootstrap): Fail-open — warns but allows (easier testing)
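
The difference between the two fail modes can be illustrated with a minimal sketch. This is not the Runtime Service's code: the function name `decide_without_policy`, the `POLICY_UNAVAILABLE` reason string, and the choice to treat any unrecognised mode as fail-closed are all assumptions made for illustration.

```python
def decide_without_policy(mode: str) -> dict:
    """Decision to apply when policy.yaml could not be loaded.

    Hypothetical helper: the name, the return shape, and the
    POLICY_UNAVAILABLE reason are illustrative, not the real API.
    """
    if mode == "bootstrap":
        # Fail-open: allow the operation, but attach a warning so the
        # missing policy is visible in logs.
        return {"allowed": True, "warning": "policy unavailable (bootstrap mode)"}
    # Fail-closed: production denies. Treating unknown modes the same
    # way is an assumption of this sketch.
    return {"allowed": False, "reason": "POLICY_UNAVAILABLE"}


# Usage: the mode would typically come from the AUTOPILOT_MODE variable.
print(decide_without_policy("production"))
print(decide_without_policy("bootstrap"))
```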

2. Agent-Level (Supplementary) — Advisory

The Policy Guard agent is still triggered via webhook when a plan is created. It performs supplementary LLM-based analysis (blast radius assessment, context-aware evaluation) that complements the rule-based inline checks. Its findings are advisory — the Runtime's inline enforcement is authoritative.

Autonomy Levels

Autopilot supports three autonomy levels:

Read-Only (Default for most agents)

Can query Wazuh data via MCP
Can create and update cases
Can post to Slack
Cannot execute response actions

Approval (Default for response actions)

All capabilities of read-only
Can propose response plans
Can request approvals
Executes actions only after approval

Limited-Auto (Optional)

Executes pre-approved safe actions automatically
Still requires approval for risky actions
Must be explicitly enabled

Policy Configuration

All policies are defined in policies/policy.yaml. This file is the source of truth for all policy decisions.

Autonomy Settings

autonomy:
  default_level: approval

  operations:
    triage:
      level: read-only
      auto_execute: true

    response_planning:
      level: approval
      auto_execute: false

    action_execution:
      level: approval
      auto_execute: false

Slack Allowlists

Control where Autopilot can operate:

slack:
  workspace_allowlist:
    - id: "T0123456789"
      name: "Security Team"
      enabled: true

  channels:
    alerts:
      allowlist:
        - id: "C0123456789"
          name: "#security-alerts"
      deny_action: log_and_skip

    approvals:
      allowlist:
        - id: "C1234567890"
          name: "#security-approvals"
      deny_action: log_and_skip

Approver Configuration

Define who can approve what:

approvers:
  groups:
    standard:
      members:
        - slack_id: "U0123456789"
          name: "Security Analyst"
      can_approve:
        - block_ip
        - quarantine_file
      max_risk_level: medium

    elevated:
      members:
        - slack_id: "U1234567890"
          name: "Senior Engineer"
      can_approve:
        - block_ip
        - isolate_host
        - kill_process
      max_risk_level: high

    admin:
      members:
        - slack_id: "U2345678901"
          name: "Security Director"
      can_approve:
        - block_ip
        - quarantine_file
        - isolate_host
        - kill_process
        - disable_user
        - firewall_drop
        - host_deny
        - active_response
        - restart_wazuh
      max_risk_level: critical

  self_approval:
    allowed: false
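
To make the approver rules concrete, here is a small sketch of how a check against this configuration might look. It is illustrative only: the function `can_approve` is hypothetical, members are flattened to bare Slack IDs, and the risk ordering low < medium < high < critical is an assumption.

```python
# Assumed ordering of risk levels, lowest to highest.
RISK_ORDER = ["low", "medium", "high", "critical"]

# A trimmed copy of the approvers section, with members reduced to Slack IDs.
APPROVERS = {
    "groups": {
        "standard": {
            "members": ["U0123456789"],
            "can_approve": ["block_ip", "quarantine_file"],
            "max_risk_level": "medium",
        },
        "elevated": {
            "members": ["U1234567890"],
            "can_approve": ["block_ip", "isolate_host", "kill_process"],
            "max_risk_level": "high",
        },
    },
    "self_approval": {"allowed": False},
}


def can_approve(approvers, slack_id, action, risk_level, requester_id=None):
    """Return True if slack_id may approve this action at this risk level."""
    # Self-approval is rejected unless the policy explicitly allows it.
    if slack_id == requester_id and not approvers["self_approval"]["allowed"]:
        return False
    wanted = RISK_ORDER.index(risk_level)
    for group in approvers["groups"].values():
        if slack_id not in group["members"]:
            continue
        if action not in group["can_approve"]:
            continue
        # The group's ceiling must be at or above the plan's risk level.
        if wanted <= RISK_ORDER.index(group["max_risk_level"]):
            return True
    return False


print(can_approve(APPROVERS, "U0123456789", "block_ip", "low"))
print(can_approve(APPROVERS, "U0123456789", "isolate_host", "medium"))
```

In this sketch a user needs only one group that satisfies both conditions; whether the real runtime combines groups the same way is not stated in this document.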

Action Allowlists

Control which actions are permitted:

actions:
  enabled: true  # Actions enabled - individual actions still require approval

  allowlist:
    block_ip:
      enabled: true
      risk_level: low
      requires_approval: true
      min_approver_group: standard
      min_confidence: 0.7
      min_evidence_items: 2

    isolate_host:
      enabled: true
      risk_level: medium
      requires_approval: true
      min_approver_group: elevated
      min_confidence: 0.8
      min_evidence_items: 3

    disable_user:
      enabled: true
      risk_level: high
      requires_approval: true
      min_approver_group: admin
      min_confidence: 0.9
      min_evidence_items: 5

  deny_unlisted: true
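
The plan-creation check against this allowlist can be sketched as follows. The function `check_action` is a hypothetical name, the policy is represented as a plain dictionary rather than parsed YAML, and the behaviour for unlisted actions when `deny_unlisted` is false is an assumption. The reason codes are the ones listed under Deny Reason Codes.

```python
# A trimmed copy of the actions section as a plain dictionary.
POLICY = {
    "actions": {
        "enabled": True,
        "deny_unlisted": True,
        "allowlist": {
            "block_ip": {"enabled": True, "min_confidence": 0.7},
            "isolate_host": {"enabled": True, "min_confidence": 0.8},
        },
    }
}


def check_action(policy, action, confidence):
    """Return (allowed, reason_code) for one proposed action."""
    actions = policy["actions"]
    # Global switch: nothing runs if actions are disabled as a whole.
    if not actions.get("enabled", False):
        return (False, "ACTION_NOT_ALLOWED")
    entry = actions["allowlist"].get(action)
    if entry is None:
        # Unlisted actions are denied when deny_unlisted is true.
        # Allowing them otherwise is an assumption of this sketch.
        if actions.get("deny_unlisted", True):
            return (False, "ACTION_NOT_ALLOWED")
        return (True, None)
    if not entry.get("enabled", False):
        return (False, "ACTION_NOT_ALLOWED")
    if confidence < entry.get("min_confidence", 0.0):
        return (False, "LOW_CONFIDENCE")
    return (True, None)


print(check_action(POLICY, "block_ip", 0.85))
print(check_action(POLICY, "block_ip", 0.5))
print(check_action(POLICY, "restart_wazuh", 0.99))
```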

Asset Criticality

Different rules for different asset types:

assets:
  classifications:
    critical:
      patterns:
        hostnames:
          - "^prod-.*"
          - "^db-.*"
        ips:
          - "10.0.1.0/24"
      requires_approver_group: admin
      extra_evidence_required: 2

    production:
      patterns:
        hostnames:
          - "^app-.*"
          - "^web-.*"
      requires_approver_group: elevated

    development:
      patterns:
        hostnames:
          - "^dev-.*"
          - "^test-.*"
      requires_approver_group: standard

  default_classification: production
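
As an illustration of how these patterns could be applied, the sketch below classifies an asset by hostname regex or IP range. The function `classify_asset` is hypothetical, and first-match-wins in file order is an assumption; this document does not say how overlapping patterns are resolved.

```python
import ipaddress
import re

# Patterns copied from the configuration above, in file order.
CLASSIFICATIONS = {
    "critical": {"hostnames": [r"^prod-.*", r"^db-.*"], "ips": ["10.0.1.0/24"]},
    "production": {"hostnames": [r"^app-.*", r"^web-.*"], "ips": []},
    "development": {"hostnames": [r"^dev-.*", r"^test-.*"], "ips": []},
}
DEFAULT_CLASSIFICATION = "production"


def classify_asset(hostname=None, ip=None):
    """Return the first classification whose patterns match the asset."""
    for name, patterns in CLASSIFICATIONS.items():
        if hostname and any(re.search(p, hostname) for p in patterns["hostnames"]):
            return name
        if ip and any(
            ipaddress.ip_address(ip) in ipaddress.ip_network(net)
            for net in patterns["ips"]
        ):
            return name
    # Nothing matched: fall back to default_classification.
    return DEFAULT_CLASSIFICATION


print(classify_asset(hostname="db-primary-01"))
print(classify_asset(ip="10.0.1.15"))
print(classify_asset(hostname="laptop-42"))
```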

Thresholds

Minimum requirements for different operations:

thresholds:
  evidence:
    action_execution:
      min_items: 3

  confidence:
    action_execution:
      min: 0.7
    critical_action:
      min: 0.9

Time Windows (Optional)

Restrict operations to certain times. Enforced at runtime — the runtime checks policyCheckTimeWindow() before plan creation and execution.

time_windows:
  enabled: true  # false by default — set to true to activate

  operations:
    action_execution:
      windows:
        - days: [mon, tue, wed, thu, fri]
          start: "06:00"
          end: "22:00"
          timezone: UTC
      outside_window_action: deny  # "deny" blocks, "allow" permits

    response_planning:
      windows:
        - days: [mon, tue, wed, thu, fri, sat, sun]
          start: "00:00"
          end: "23:59"
          timezone: UTC
      outside_window_action: allow

  emergency_override:
    enabled: true
    requires_approver_group: admin
    max_duration_hours: 4

Enforcement behavior:

response_planning is checked during POST /api/plans (plan creation). If denied, the plan is not created and the API returns 400.
action_execution is checked during POST /api/plans/:id/execute (before the action loop). If denied, the entire plan is marked FAILED — no actions execute.
When outside_window_action: allow, operations outside the window are permitted with a log warning.
When time_windows.enabled: false (the default), all time window checks are no-ops.
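
The behaviour described above can be sketched in a few lines. This is not the source of policyCheckTimeWindow(); the function `check_time_window` is a hypothetical stand-in. The sketch assumes UTC-only windows, an inclusive end time, and that operations with no configured windows are allowed. It also ignores `emergency_override`.

```python
from datetime import datetime, time, timezone

DAY_NAMES = ["mon", "tue", "wed", "thu", "fri", "sat", "sun"]

TIME_WINDOWS = {
    "enabled": True,
    "operations": {
        "action_execution": {
            "windows": [
                {"days": ["mon", "tue", "wed", "thu", "fri"],
                 "start": "06:00", "end": "22:00"},
            ],
            "outside_window_action": "deny",
        },
    },
}


def check_time_window(config, operation, now):
    """Return True if the operation may proceed at the given UTC datetime."""
    # Disabled time windows make every check a no-op.
    if not config.get("enabled", False):
        return True
    op = config.get("operations", {}).get(operation)
    if op is None:
        return True  # assumption: unconfigured operations are unrestricted
    day = DAY_NAMES[now.weekday()]  # Monday is 0
    current = now.time()
    for window in op["windows"]:
        start = time.fromisoformat(window["start"])
        end = time.fromisoformat(window["end"])
        if day in window["days"] and start <= current <= end:
            return True
    # Outside every window: "allow" permits, anything else denies.
    return op.get("outside_window_action", "deny") == "allow"


tuesday_morning = datetime(2026, 2, 17, 10, 30, tzinfo=timezone.utc)
print(check_time_window(TIME_WINDOWS, "action_execution", tuesday_morning))
```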

Rate Limits

Control how many actions can execute per time period. Enforced at runtime — the runtime checks policyCheckActionRateLimit() before each action in the execution loop.

rate_limits:
  # Per-action rate limits
  actions:
    block_ip:
      max_per_hour: 100
      max_per_day: 500
    isolate_host:
      max_per_hour: 20
      max_per_day: 50
    disable_user:
      max_per_hour: 10
      max_per_day: 30

  # Global rate limits (across all action types)
  global:
    max_actions_per_hour: 200
    max_actions_per_day: 1000

Enforcement behavior:

Counters increment only after successful MCP tool execution (failed actions don't consume budget)
Per-action and global limits are checked independently — either can deny
When a rate limit is exceeded, the individual action is skipped with status: "denied" in the execution results; the plan continues with remaining actions
Counter windows auto-reset when they expire (hourly/daily)
Stale counter entries are evicted every 5 minutes
Actions not listed in rate_limits.actions are still subject to global limits
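
A minimal sketch of the counting behaviour described above follows. It is not the source of policyCheckActionRateLimit(); the classes `WindowCounter` and `RateLimiter` are hypothetical, timestamps are plain seconds, and the periodic eviction of stale entries is omitted. It uses fixed windows that start at first use, which is one plausible reading of "auto-reset when they expire".

```python
HOUR, DAY = 3600, 86400


class WindowCounter:
    """Fixed-window counter that resets itself once the window has expired."""

    def __init__(self, length_s):
        self.length_s = length_s
        self.started_at = None
        self.count = 0

    def current(self, now):
        if self.started_at is None or now - self.started_at >= self.length_s:
            self.started_at = now
            self.count = 0
        return self.count

    def increment(self, now):
        self.current(now)  # roll the window over first if needed
        self.count += 1


class RateLimiter:
    def __init__(self, config):
        self.config = config
        self.counters = {}  # (scope, window length) -> WindowCounter

    def _counter(self, scope, length_s):
        return self.counters.setdefault((scope, length_s), WindowCounter(length_s))

    def check(self, action, now):
        """Return a deny reason code, or None if the action may run."""
        per_action = self.config.get("actions", {}).get(action, {})
        for key, length in (("max_per_hour", HOUR), ("max_per_day", DAY)):
            limit = per_action.get(key)
            if limit is not None and self._counter(action, length).current(now) >= limit:
                return "action_rate_limited"
        # Global limits apply to every action, listed or not.
        global_limits = self.config.get("global", {})
        for key, length in (("max_actions_per_hour", HOUR), ("max_actions_per_day", DAY)):
            limit = global_limits.get(key)
            if limit is not None and self._counter("*", length).current(now) >= limit:
                return "global_rate_limited"
        return None

    def record_success(self, action, now):
        """Call only after a successful execution, so failures cost nothing."""
        for length in (HOUR, DAY):
            self._counter(action, length).increment(now)
            self._counter("*", length).increment(now)
```

A caller would run `check()` before each action and `record_success()` only after the tool call succeeds, which matches the rule that failed actions do not consume budget.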

Idempotency / Duplicate Detection

Prevent the same action from executing repeatedly on the same target. Enforced at runtime — the runtime checks policyCheckIdempotency() before each action in the execution loop.

idempotency:
  enabled: true  # true by default

  # State checks (declarative labels for documentation)
  checks:
    block_ip:
      check_method: verify_ip_not_blocked
      deny_if_exists: true
      deny_reason: ALREADY_BLOCKED
    isolate_host:
      check_method: verify_host_not_isolated
      deny_if_exists: true
      deny_reason: ALREADY_ISOLATED

  # Duplicate request detection (enforced at runtime)
  duplicate_detection:
    enabled: true
    window_minutes: 60     # Deny same action+target within this window
    deny_reason: DUPLICATE_REQUEST

Enforcement behavior:

The runtime tracks action_type:target pairs with timestamps
If the same action+target was successfully executed within window_minutes, the action is denied with DUPLICATE_REQUEST
Different targets for the same action type are allowed (e.g., block_ip:10.0.0.1 and block_ip:10.0.0.2 are independent)
Denied actions are skipped individually with status: "denied" — the plan continues
Dedup entries are recorded only after successful execution
Stale entries are evicted every 5 minutes
Maximum 10,000 dedup entries tracked (LRU eviction)
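
The duplicate-detection rules above can be sketched with an ordered dictionary. This is not the source of policyCheckIdempotency(); the class `DedupTracker` and its method names are hypothetical, timestamps are plain seconds, and the 5-minute background eviction is replaced by lazy expiry on lookup.

```python
from collections import OrderedDict


class DedupTracker:
    """Tracks "action:target" keys with the time of their last success."""

    def __init__(self, window_minutes=60, max_entries=10_000):
        self.window_s = window_minutes * 60
        self.max_entries = max_entries
        self.entries = OrderedDict()  # key -> timestamp, oldest first

    def is_duplicate(self, action, target, now):
        key = f"{action}:{target}"
        recorded_at = self.entries.get(key)
        if recorded_at is None:
            return False
        if now - recorded_at >= self.window_s:
            # The entry has aged out of the window: drop it and allow.
            del self.entries[key]
            return False
        self.entries.move_to_end(key)  # mark as recently used
        return True

    def record_success(self, action, target, now):
        """Record only after successful execution, as described above."""
        key = f"{action}:{target}"
        self.entries[key] = now
        self.entries.move_to_end(key)
        # Evict least recently used entries beyond the cap.
        while len(self.entries) > self.max_entries:
            self.entries.popitem(last=False)


tracker = DedupTracker(window_minutes=60)
tracker.record_success("block_ip", "10.0.0.1", now=0)
print(tracker.is_duplicate("block_ip", "10.0.0.1", now=600))
print(tracker.is_duplicate("block_ip", "10.0.0.2", now=600))
```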

Approval Workflow

1. Response Planner Creates Plan

When a case reaches high/critical severity, the Response Planner agent generates a plan:

{
  "plan_id": "PLAN-20260217-abc12345",
  "case_id": "CASE-20260217-abc12345",
  "actions": [
    {
      "action": "block_ip",
      "target": "192.168.1.100",
      "risk_level": "low"
    }
  ],
  "risk_assessment": {...},
  "blast_radius": {...}
}

2. Inline Policy Enforcement (Automatic)

The Runtime Service enforces policy rules before the plan is stored:

Inline Enforcement (plan creation):
1. ✓ Time window check (response_planning within allowed hours)
2. ✓ Action allowlist (block_ip enabled)
3. ✓ Confidence threshold (0.85 >= 0.7)
4. ✓ deny_unlisted check (action is listed)

Result: ALLOW (plan created, webhook dispatched to Policy Guard)

2b. Policy Guard Evaluates (Supplementary)

The Policy Guard agent receives a webhook and performs LLM-based analysis:

Supplementary Analysis:
1. ✓ Asset criticality (dev system, standard ok)
2. ✓ Evidence threshold (3 items >= 2 required)
3. ✓ Blast radius assessment
4. ✓ Context-aware risk evaluation

Result: ADVISORY — findings added to case

Note: Time window, rate limit, and idempotency checks are now enforced by the Runtime Service (not the Policy Guard agent). The Policy Guard provides supplementary LLM analysis only.

3. Approval Request Posted

An approval request is posted to Slack:

🚨 Approval Request

Case: CASE-20260217-abc12345
Severity: High
Confidence: 85%

Proposed Actions:
1. Block IP 192.168.1.100 (risk: low)

Risk Assessment:
- Blast radius: 1 host affected
- Reversible: Yes

Evidence:
- 47 brute force attempts
- 3 source IPs correlated
- Pattern matches known attack

Required Approver: standard or higher

[Approve] [Deny] [Request Changes]

4. Approval Token Generated

A single-use, time-limited token is created:

{
  "token": "abc123...",
  "plan_id": "PLAN-20260217-abc12345",
  "case_id": "CASE-20260217-abc12345",
  "expires_at": "2026-02-17T11:00:00Z",
  "used": false
}

5. Approver Responds

The approver clicks Approve or uses:

/wazuh approve PLAN-20260217-abc12345

6. Token Validated and Consumed

The Runtime Service verifies:

Token is valid
Token not expired
Token not already used
Approver is authorized
Approver is not the requester (self-approval prevention)
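
The token checks in this list could look roughly like the sketch below. The functions `issue_token` and `consume_token` are hypothetical, the 60-minute lifetime is an assumed default, and returning `INVALID_APPROVAL_TOKEN` for an already-used token is an assumption because this document lists no dedicated code for that case. Approver authorization and self-approval are separate checks and are not shown here.

```python
import secrets
from datetime import datetime, timedelta, timezone


def issue_token(plan_id, case_id, now, ttl_minutes=60):
    """Create a single-use, time-limited approval token record."""
    return {
        "token": secrets.token_urlsafe(32),
        "plan_id": plan_id,
        "case_id": case_id,
        "expires_at": now + timedelta(minutes=ttl_minutes),
        "used": False,
    }


def consume_token(record, presented, now):
    """Return a deny reason code, or None if the token was accepted."""
    # Constant-time comparison avoids leaking how much of the token matched.
    if record is None or not secrets.compare_digest(
        record["token"].encode(), presented.encode()
    ):
        return "INVALID_APPROVAL_TOKEN"
    if record["used"]:
        return "INVALID_APPROVAL_TOKEN"  # assumed code for replayed tokens
    if now >= record["expires_at"]:
        return "EXPIRED_APPROVAL"
    record["used"] = True  # consume: a second attempt will be rejected
    return None


now = datetime(2026, 2, 17, 10, 0, tzinfo=timezone.utc)
record = issue_token("PLAN-20260217-abc12345", "CASE-20260217-abc12345", now)
print(consume_token(record, record["token"], now))
print(consume_token(record, record["token"], now))
```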

7. Action Executed (If Enabled)

If the Responder agent is enabled:

Action is executed via MCP
Result is verified
Evidence pack is updated
Confirmation posted to Slack

Deny Reason Codes

Every policy denial includes a structured reason code:

| Code | Description | Enforcement Level |
|---|---|---|
| `WORKSPACE_NOT_ALLOWED` | Slack workspace not in allowlist | Slack layer |
| `CHANNEL_NOT_ALLOWED` | Slack channel not in allowlist | Slack layer |
| `APPROVER_NOT_AUTHORIZED` | Approver lacks permission for this action | Runtime (plan approval) |
| `ACTION_NOT_ALLOWED` | Action type not in allowlist | Runtime (plan creation) |
| `CRITICAL_ASSET_ELEVATED_APPROVAL` | Critical asset requires admin approval | Policy Guard (advisory) |
| `INSUFFICIENT_EVIDENCE` | Not enough evidence items | Runtime (plan execution) |
| `LOW_CONFIDENCE` | Confidence score below threshold | Runtime (plan creation) |
| `time_window_denied` | Operation outside allowed hours | Runtime (plan creation/execution) |
| `action_rate_limited` | Per-action hourly/daily limit exceeded | Runtime (per-action execution) |
| `global_rate_limited` | Global hourly/daily limit exceeded | Runtime (per-action execution) |
| `duplicate_action` | Same action+target within dedup window | Runtime (per-action execution) |
| `EXPIRED_APPROVAL` | Approval token has expired | Runtime |
| `INVALID_APPROVAL_TOKEN` | Token is invalid or malformed | Runtime |

Metrics

Policy decisions are tracked via Prometheus metrics:

autopilot_policy_denies_total{reason="INSUFFICIENT_EVIDENCE"}
autopilot_policy_denies_total{reason="APPROVER_NOT_AUTHORIZED"}
autopilot_policy_denies_total{reason="ACTION_NOT_ALLOWED"}
autopilot_policy_denies_total{reason="time_window_denied"}
autopilot_policy_denies_total{reason="action_rate_limited"}
autopilot_policy_denies_total{reason="global_rate_limited"}
autopilot_policy_denies_total{reason="duplicate_action"}

Best Practices

Start Restrictive

Begin with conservative settings:

# In policy.yaml — actions require individual enablement and human approval
actions:
  enabled: true
  # Each action in the allowlist must have enabled: true to be available
  # All actions require human approval regardless of this flag

autonomy:
  default_level: approval

Additionally, keep the responder capability disabled until ready:

# In .env — blocks execution even after human approval
AUTOPILOT_RESPONDER_ENABLED=false

Test in Bootstrap Mode

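Use bootstrap mode (AUTOPILOT_MODE=bootstrap) for testing without Tailscale requirements. Because bootstrap mode fails open when the policy cannot be loaded, switch to production mode before handling real incidents.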

Review Deny Rates

Monitor autopilot_policy_denies_total to identify:

Over-restrictive policies
Training needs for approvers
Potential configuration issues

Regular Policy Review

Schedule quarterly reviews of:

Approver lists
Action allowlists
Threshold values
Time windows

Document Exceptions

When making policy exceptions:

Document the business justification
Set an expiration date
Review during next policy audit

Troubleshooting

"Action not allowed"

Check actions.enabled is true
Verify action is in allowlist
Check action's enabled is true

"Approver not authorized"

Verify approver's Slack ID in policy
Check approver is in correct group
Verify group can approve this action type

"Insufficient evidence"

Lower threshold temporarily for testing
Ensure triage/investigation completed
Review evidence collection in playbook

Self-approval issues

If there is a legitimate need for self-approval:

approvers:
  self_approval:
    allowed: true
    exception_groups:
      - admin

This is not recommended because it breaks separation of duties.
