This guide helps you migrate from prompt-based AI safety to Agent OS kernel-based enforcement.
| Prompt-Based Safety | Kernel-Based Enforcement |
|---|---|
| LLM decides whether to comply | Kernel enforces before execution |
| Can be bypassed with jailbreaks | Cannot be bypassed (deterministic) |
| Non-deterministic results | 100% consistent enforcement |
| Difficult to audit | Complete audit trail |
| Hope-based security | Guarantee-based security |
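The difference can be pictured in a few lines of plain Python. This is a conceptual sketch only, not the Agent OS API: prompt-based safety asks the model to refuse, while kernel-based enforcement intercepts every action before it runs, so the verdict is deterministic regardless of what the LLM says.

```python
# Conceptual sketch: a kernel-style gate that checks every action
# BEFORE execution, independent of what the LLM decides.
BLOCKED_ACTIONS = {"drop_table", "read_secrets"}

def kernel_execute(action: str, handler, *args):
    """Deterministically allow or deny an action before it runs."""
    if action in BLOCKED_ACTIONS:
        return {"success": False, "error": f"Policy blocked: {action}"}
    return {"success": True, "result": handler(*args)}

# The same input always yields the same verdict - no jailbreak can change it
result = kernel_execute("drop_table", lambda: None)
print(result["error"])  # Policy blocked: drop_table
```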
List all prompt-based safety instructions you're currently using:
```python
# BEFORE: Prompt-based safety
system_prompt = """
You are a helpful assistant.
IMPORTANT:
- Never reveal API keys or passwords
- Do not access files outside /workspace
- Do not run destructive SQL commands like DROP or DELETE
- Always ask for confirmation before making changes
"""
```

Convert each prompt instruction to an Agent OS policy:
| Prompt Instruction | Agent OS Policy |
|---|---|
| "Never reveal API keys" | `Policy.no_secrets()` |
| "Do not access files outside /workspace" | `Policy.file_access("/workspace")` |
| "No destructive SQL" | `Policy.no_destructive_sql()` |
| "Ask for confirmation" | `Policy.require_approval()` |
Install the package:

```shell
pip install agent-control-plane
```

Then configure kernel-based enforcement:

```python
# AFTER: Kernel-based enforcement
from agent_control_plane import ControlPlane, PolicyEngine

# Create control plane with policies
plane = ControlPlane()

# Add policies (equivalent to your prompt instructions)
plane.policy_engine.add_constraint("agent", [
    "read_file",
    "write_file",
    "query_database",
])

# Configure file access restrictions
plane.policy_engine.protected_paths = [
    "/etc/",
    "/sys/",
    "C:\\Windows\\",
]

# Configure SQL protection
# (Built-in: blocks DROP, DELETE without WHERE, TRUNCATE)
```

Before (LangChain example):
```python
from langchain.agents import AgentExecutor

agent = AgentExecutor(
    agent=my_agent,
    tools=tools,
)

# Run with prompt-based safety only
result = agent.invoke({"input": user_query})
```

After (with Agent OS):
```python
from langchain.agents import AgentExecutor
from agent_control_plane.langchain_adapter import GovernedAgentExecutor

# Wrap with Agent OS governance
governed_agent = GovernedAgentExecutor(
    agent=my_agent,
    tools=tools,
    control_plane=plane,
)

# Run with kernel-based enforcement
result = governed_agent.invoke({"input": user_query})
# All tool calls pass through the policy engine BEFORE execution
```

Run your existing test cases and verify:
- Safe actions still work:

  ```python
  # Should succeed
  result = governed_agent.invoke({"input": "Read the config file"})
  assert result["success"]
  ```

- Dangerous actions are blocked:

  ```python
  # Should be blocked by policy, not by hoping the LLM refuses
  result = governed_agent.invoke({"input": "Delete all user data"})
  assert not result["success"]
  assert "Policy" in result.get("error", "")
  ```

- Audit logs capture everything:

  ```python
  # All actions are logged
  logs = plane.flight_recorder.get_recent_entries(100)
  for log in logs:
      print(f"{log['action']}: {log['verdict']}")
  ```

Once kernel-based enforcement is verified, you can simplify your prompts:
```python
# BEFORE: Long safety-focused prompt
system_prompt = """
You are a helpful assistant.
IMPORTANT:
- Never reveal API keys or passwords
- Do not access files outside /workspace
- Do not run destructive SQL commands
- Always ask for confirmation
- Never execute code that could harm the system
- Do not access sensitive user data without permission
...
"""

# AFTER: Focus on the task, not safety
system_prompt = """
You are a helpful assistant for data analysis tasks.
"""
# Safety is enforced by the kernel, not by prompts
```

LangChain adapter:
```python
from agent_control_plane.langchain_adapter import GovernedAgentExecutor

governed = GovernedAgentExecutor(
    agent=agent,
    tools=tools,
    control_plane=plane,
    shadow_mode=False,  # Set True for testing without blocking
)
```

CrewAI adapter:
```python
from agent_control_plane.crewai_adapter import GovernedCrew

crew = GovernedCrew(
    agents=[agent1, agent2],
    tasks=[task1, task2],
    control_plane=plane,
)
```

OpenAI Assistants adapter:
```python
from agent_control_plane.openai_adapter import GovernedAssistant

assistant = GovernedAssistant(
    assistant_id="asst_xxx",
    control_plane=plane,
)
```

Direct kernel integration:
```python
# Direct kernel integration
@plane.kernel.register
def my_tool(query: str) -> str:
    # This function is governed by the kernel
    return execute_query(query)

# Calls are validated before execution
result = plane.kernel.execute(my_tool, "SELECT * FROM users")
```

Test your policies without blocking actions:
```python
# Enable shadow mode
plane.shadow_mode = True

# Run your agent - actions execute but violations are logged
result = governed_agent.invoke({"input": "user request"})

# Review what WOULD have been blocked
violations = plane.get_shadow_violations()
for v in violations:
    print(f"Would block: {v['action']} - {v['reason']}")

# Tune policies until satisfied, then disable shadow mode
plane.shadow_mode = False
```

You can keep prompts for user experience while relying on the kernel for security:
```python
# Prompts for UX (optional, not for security)
system_prompt = """
You are a helpful assistant. When you can't perform an action,
explain why in a friendly way.
"""

# Kernel for actual enforcement (required)
plane = ControlPlane(policies=[...])
```

If issues arise, you can temporarily disable enforcement:
```python
# Emergency: disable enforcement (NOT RECOMMENDED for production)
plane.enforcement_enabled = False

# Better: use shadow mode to log without blocking
plane.shadow_mode = True
```

Migration checklist:

- All prompt safety instructions converted to policies
- Agent wrapped with appropriate adapter
- Safe actions verified to work
- Dangerous actions verified to be blocked
- Audit logs capturing all actions
- Shadow mode testing completed
- Prompts simplified (safety removed)
- Team trained on new architecture
To find out which policy blocked an action, check the audit log:

```python
logs = plane.flight_recorder.get_recent_entries(10)
for log in logs:
    if log["verdict"] == "blocked":
        print(f"Blocked by: {log['policy']} - {log['reason']}")
```

To allow a new tool, add it to the constraint graph:
```python
plane.policy_engine.add_constraint("agent_role", [
    "existing_tool",
    "new_tool_to_allow",  # Add here
])
```

For context-dependent policies, use ABAC (Attribute-Based Access Control):
```python
plane.policy_engine.set_agent_context("agent_id", {
    "user_role": "admin",
    "department": "finance",
})
```

Next steps:

- Read Architecture Overview
- Explore Policy Templates
- Review Security Specification