
Neurosymbolic Guardrails: Verifiable Agent Decisions

Combines LLM flexibility with symbolic rules for verifiable, constrained decision-making in AI agents.

Diagram showing LLM flexibility combined with symbolic rule enforcement

The Problem

Research (ATA: Autonomous Trustworthy Agents, 2024) shows that agents hallucinate when business rules are expressed only in natural language prompts:

  • Parameter errors: Agent calls book_hotel(guests=15) despite "Maximum 10 guests" in docstring
  • Completeness errors: Agent executes bookings without required payment verification
  • Tool bypass behavior: Agent confirms success without calling validation tools

Why prompt engineering fails: Prompts are suggestions, not constraints. Agents can ignore docstring instructions because they're processed as text, not executable rules.

The Solution: Neurosymbolic Guardrails for AI Agents with Strands Agents Hooks

Neurosymbolic integration combines:

  • Neural (LLM): Understands natural language, interprets intent, selects tools
  • Symbolic (Rules): Validates constraints, enforces prerequisites, blocks invalid operations
  • Strands Hooks: Intercept tool calls before execution to enforce rules

User Query → LLM (understands) → Tool Selection → Hook (validates) → Execute or Block

Strands Agents makes this simple: Just create a hook, define your rules, and attach it to your agent. The framework handles the rest.

Quick Start

Prerequisites

Model

This demo uses OpenAI with GPT-4o-mini by default (requires OPENAI_API_KEY environment variable).

You can swap the model for any provider supported by Strands — Amazon Bedrock, Anthropic, Ollama, etc. See Strands Model Providers for configuration.
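
For example, swapping to Amazon Bedrock is a small change when constructing the agent (a sketch; the model class and model_id shown are illustrative and assume configured AWS credentials):

from strands import Agent
from strands.models import BedrockModel

# Illustrative model_id; any Strands-supported provider works the same way.
model = BedrockModel(model_id="anthropic.claude-3-5-sonnet-20241022-v2:0")
agent = Agent(model=model)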

Setup

uv venv && uv pip install -r requirements.txt
uv run test_neurosymbolic_hooks.py

How It Works with Strands Agents

1. Define Rules (rules.py)

BOOKING_RULES = [
    Rule(
        name="max_guests",
        condition=lambda ctx: ctx.get("guests", 1) <= 10,
        message="Maximum 10 guests per booking"
    ),
]
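
This assumes a small Rule type and a validate helper in rules.py. A minimal sketch consistent with the usage above (the repo's actual definitions may differ):

from dataclasses import dataclass
from typing import Callable

@dataclass
class Rule:
    name: str                          # identifier for the constraint
    condition: Callable[[dict], bool]  # True when the context satisfies the rule
    message: str                       # violation text surfaced to the agent

def validate(rules: list[Rule], ctx: dict) -> tuple[bool, list[str]]:
    """Check every rule against the context; collect messages for failures."""
    violations = [r.message for r in rules if not r.condition(ctx)]
    return (len(violations) == 0, violations)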

2. Create Validation Hook

from strands.hooks import HookProvider, HookRegistry, BeforeToolCallEvent

class NeurosymbolicHook(HookProvider):
    def register_hooks(self, registry: HookRegistry) -> None:
        registry.add_callback(BeforeToolCallEvent, self.validate)
    
    def validate(self, event: BeforeToolCallEvent) -> None:
        tool_name = event.tool_use["name"]
        ctx = self._build_context(tool_name, event.tool_use["input"])
        # validate() here is the rule-checking helper from rules.py, not this method
        passed, violations = validate(self.rules[tool_name], ctx)

        if not passed:
            # Cancelling surfaces the violation message to the LLM instead of running the tool
            event.cancel_tool = f"BLOCKED: {', '.join(violations)}"

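The excerpt omits __init__ and _build_context. One plausible shape for them inside NeurosymbolicHook, assuming rules are keyed by tool name and shared agent state (like payment flags) is merged into the validation context:

    def __init__(self, state: dict):
        self.state = state  # shared mutable state, e.g. {"payment_verified": True}
        # Hypothetical mapping; in this repo the rules live in rules.py.
        self.rules = {"book_hotel": BOOKING_RULES}

    def _build_context(self, tool_name: str, tool_input: dict) -> dict:
        # Rules see both the tool's arguments and the agent's state.
        return {**self.state, **tool_input, "tool": tool_name}
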
3. Clean Tools (no validation logic needed)

@tool
def book_hotel(hotel: str, check_in: str, check_out: str, guests: int = 1) -> str:
    """Book a hotel room."""
    return f"SUCCESS: Booked {hotel} for {guests} guests"

4. Attach Hook to Agent

hook = NeurosymbolicHook(STATE)
agent = Agent(tools=[book_hotel, ...], hooks=[hook])

That's it! Strands Agents handles the interception, validation, and blocking automatically.
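
To see the guardrail in action, try one request inside the limit and one outside it (illustrative; the exact wording the LLM returns after a blocked call will vary):

# Passes the max_guests rule; book_hotel executes normally.
agent("Book the Grand Hotel for 4 guests, Jan 5 to Jan 7")

# Violates max_guests; the hook cancels the call and the model sees
# "BLOCKED: Maximum 10 guests per booking" instead of a tool result.
agent("Book the Grand Hotel for 15 guests, Jan 5 to Jan 7")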

Key Insight

Strands Agents makes neurosymbolic integration effortless:

  1. LLM (Neural): Handles natural language understanding and tool selection
  2. Rules (Symbolic): Enforce business logic with verifiable code
  3. Hooks (Integration): Strands automatically intercepts and validates before execution
  4. Clean Separation: Tools stay simple, validation stays centralized

The agent uses the LLM to understand "Confirm booking BK001 for me", but the hook validates that payment was verified before allowing the tool to execute. The LLM cannot bypass these rules.
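
The same mechanism covers prerequisites: a payment check is just another rule, reading shared state instead of tool arguments (a sketch; CONFIRMATION_RULES and the payment_verified flag are illustrative names):

CONFIRMATION_RULES = [
    Rule(
        name="payment_verified",
        # Reads agent state merged into the context by _build_context.
        condition=lambda ctx: ctx.get("payment_verified", False),
        message="Payment must be verified before confirming a booking"
    ),
]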

Why Strands Hooks?

  • Simple API: Just implement HookProvider and register callbacks
  • Centralized validation: One hook validates all tools
  • Clean tools: No validation logic mixed with business logic
  • Type-safe: Strongly-typed event objects
  • Composable: Multiple hooks can work together (see the sketch below)
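
Composition is just a list of providers; Strands registers each one's callbacks (audit_hook is a hypothetical second provider):

# Guardrails and auditing stay independent; both run before each tool call.
agent = Agent(tools=[book_hotel], hooks=[NeurosymbolicHook(STATE), audit_hook])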

References

  • ATA: Autonomous Trustworthy Agents (2024)

Contributing

Contributions are welcome! See CONTRIBUTING for more information.


Security

If you discover a potential security issue in this project, notify AWS/Amazon Security via the vulnerability reporting page. Please do not create a public GitHub issue.


License

This library is licensed under the MIT-0 License. See the LICENSE file for details.