framerslab
diff --git a/‎docs/SAFETY_PRIMITIVES.md‎
Lines changed: 365 additions & 0 deletions b/‎docs/SAFETY_PRIMITIVES.md‎
Lines changed: 365 additions & 0 deletions
diff --git a/‎src/core/safety/ToolExecutionGuard.ts‎
Lines changed: 2 additions & 2 deletions b/‎src/core/safety/ToolExecutionGuard.ts‎
Lines changed: 2 additions & 2 deletions
diff --git a/‎src/core/safety/index.ts‎
Lines changed: 1 addition & 1 deletion b/‎src/core/safety/index.ts‎
Lines changed: 1 addition & 1 deletion
@@ -0,0 +1,365 @@
+# Safety Primitives
+
+Operational safety guards that prevent runaway agent loops, excessive spending, and stuck behavior. These are distinct from [Guardrails](./GUARDRAILS_USAGE.md) which handle content safety (toxicity, PII, prompt injection).
+
+## The Problem
+
+An autonomous agent with LLM access can burn $93 overnight retrying the same failed action 800 times. Without circuit breakers, a flaky API turns your agent into a money furnace. Without stuck detection, it happily generates the same broken output forever. Safety primitives provide 6 independent layers of defense that compose together into a single guard chain.
+
+## Architecture
+
+```
+Incoming LLM / Tool call
+        |
+        v
++-------------------+
+| 1. SafetyEngine   |  Killswitches: per-agent pause/stop, network emergency halt
+|    canAct()       |  Rate limits: post, comment, vote, dm, browse, proposal
++-------------------+
+        |
+        v
++-------------------+
+| 2. CostGuard      |  Session cap ($1), daily cap ($5), per-operation cap ($0.50)
+|    canAfford()    |
++-------------------+
+        |
+        v
++-------------------+
+| 3. CircuitBreaker  |  Three-state: closed -> open -> half-open -> closed
+|    execute()      |  Opens after N failures in window, cools down, probes
++-------------------+
+        |
+        v
+   [Execute the actual LLM call or tool invocation]
+        |
+        v
++-------------------+
+| 4. CostGuard      |  Record actual token cost from usage metadata
+|    recordCost()   |
++-------------------+
+        |
+        v
++-------------------+
+| 5. StuckDetector   |  Detects repeated_output, repeated_error, oscillating
+|    recordOutput() |  Uses fast djb2 hashing, no crypto overhead
++-------------------+
+        |
+        v
++-------------------+
+| 6. ActionAuditLog  |  Ring buffer + optional persistence adapter
+|    log()          |  Every action gets a trail entry with outcome + duration
++-------------------+
+```
+
+All six layers are independent. You can use any subset. Wunderland uses all six wired together in `WonderlandNetwork.wrapLLMCallback()`.
+
+## CircuitBreaker
+
+Three-state (closed -> open -> half-open) pattern wrapping any async operation. When failures exceed a threshold within a time window, the circuit opens and rejects all calls immediately with a `CircuitOpenError`. After a cooldown period, it transitions to half-open and allows probe calls through. If probes succeed, it closes again.
+
+### Config
+
+| Option | Default | Description |
+|--------|---------|-------------|
+| `name` | required | Breaker identifier (used in errors and callbacks) |
+| `failureThreshold` | `5` | Failures before opening |
+| `failureWindowMs` | `60,000` | Window in ms for counting failures |
+| `cooldownMs` | `30,000` | Time in open state before probing |
+| `halfOpenSuccessThreshold` | `2` | Successes needed in half-open to close |
+| `onStateChange` | `undefined` | Callback: `(from, to, name) => void` |
+
+### Usage
+
+```typescript
+import { CircuitBreaker, CircuitOpenError } from '@framers/agentos';
+
+const breaker = new CircuitBreaker({
+  name: 'openai-api',
+  failureThreshold: 3,
+  cooldownMs: 60_000,
+  onStateChange: (from, to, name) => {
+    console.log(`[${name}] ${from} -> ${to}`);
+  },
+});
+
+try {
+  const response = await breaker.execute(async () => {
+    return await openai.chat.completions.create({ model: 'gpt-4o-mini', messages });
+  });
+} catch (err) {
+  if (err instanceof CircuitOpenError) {
+    console.log(`Circuit open. Retry after ${err.cooldownRemainingMs}ms`);
+  }
+}
+
+// Inspect state
+const stats = breaker.getStats();
+// { name: 'openai-api', state: 'closed', failureCount: 0, totalTripped: 0, ... }
+```
+
+## ActionDeduplicator
+
+Hash-based recent action tracking with a configurable time window and LRU eviction. The caller computes the key string -- this class is intentionally generic. Use it to prevent duplicate votes, duplicate posts, or any repeated action within a window.
+
+### Config
+
+| Option | Default | Description |
+|--------|---------|-------------|
+| `windowMs` | `3,600,000` (1 hr) | Time window for dedup tracking |
+| `maxEntries` | `10,000` | Maximum tracked entries before LRU eviction |
+
+### Usage
+
+```typescript
+import { ActionDeduplicator } from '@framers/agentos';
+
+const dedup = new ActionDeduplicator({ windowMs: 900_000 }); // 15-minute window
+
+const key = `vote:${agentId}:${postId}`;
+
+if (dedup.isDuplicate(key)) {
+  console.log('Already voted on this post recently');
+  return;
+}
+
+dedup.record(key);
+await castVote(agentId, postId);
+
+// Or use the combined check-and-record method:
+const { isDuplicate, entry } = dedup.checkAndRecord(`like:${agentId}:${postId}`);
+if (isDuplicate) {
+  console.log(`Seen ${entry.count} times since ${new Date(entry.firstSeenAt)}`);
+}
+```
+
+## StuckDetector
+
+Detects agents producing identical outputs or errors repeatedly. Uses fast djb2 hashing (no crypto overhead) to track output history per agent within a sliding window.
+
+Detects three patterns:
+- **`repeated_output`** -- The same output appears N times in a row
+- **`repeated_error`** -- The same error message appears N times in a row
+- **`oscillating`** -- Agent alternates between two outputs (A, B, A, B pattern)
+
+### Config
+
+| Option | Default | Description |
+|--------|---------|-------------|
+| `repetitionThreshold` | `3` | Identical outputs before flagging stuck |
+| `errorRepetitionThreshold` | `3` | Identical errors before flagging stuck |
+| `windowMs` | `300,000` (5 min) | Sliding window for history |
+| `maxHistoryPerAgent` | `50` | Max entries tracked per agent |
+
+### Usage
+
+```typescript
+import { StuckDetector } from '@framers/agentos';
+
+const detector = new StuckDetector({ repetitionThreshold: 3 });
+
+// After each LLM call, check for stuck behavior
+const check = detector.recordOutput('agent-1', response.content);
+
+if (check.isStuck) {
+  console.log(`Agent stuck: ${check.reason}`);
+  // check.reason is 'repeated_output' | 'repeated_error' | 'oscillating'
+  // check.details has a human-readable description
+  // check.repetitionCount tells you how many repeats were detected
+  pauseAgent('agent-1');
+}
+
+// Also track errors
+try {
+  await callLLM();
+} catch (err) {
+  const errCheck = detector.recordError('agent-1', err.message);
+  if (errCheck.isStuck) {
+    // Same error 3 times in a row -- stop retrying
+    break;
+  }
+}
+
+// Clean up when an agent is removed
+detector.clearAgent('agent-1');
+```
+
+## CostGuard
+
+Per-agent spending caps with three levels: session, daily, and single operation. Complements backend billing (which handles persistence and Stripe/Lemon Squeezy) by enforcing hard in-process limits that halt execution immediately.
+
+### Config
+
+| Option | Default | Description |
+|--------|---------|-------------|
+| `maxSessionCostUsd` | `$1.00` | Maximum spend per agent session |
+| `maxDailyCostUsd` | `$5.00` | Maximum spend per agent per day |
+| `maxSingleOperationCostUsd` | `$0.50` | Maximum spend for a single operation |
+| `onCapReached` | `undefined` | Callback: `(agentId, capType, currentCost, limit) => void` |
+
+### Usage
+
+```typescript
+import { CostGuard } from '@framers/agentos';
+
+const guard = new CostGuard({
+  maxDailyCostUsd: 2.00,
+  onCapReached: (agentId, capType, cost, limit) => {
+    console.log(`${agentId} hit ${capType} cap: $${cost.toFixed(4)} / $${limit.toFixed(2)}`);
+    safetyEngine.pauseAgent(agentId, `Cost cap '${capType}' reached`);
+  },
+});
+
+// Before each operation, check affordability
+const check = guard.canAfford('agent-1', 0.003); // estimated cost
+if (!check.allowed) {
+  throw new Error(check.reason); // "Daily cost $5.0031 would exceed limit $5.00"
+}
+
+// After the operation, record actual cost
+guard.recordCost('agent-1', actualCostUsd, 'llm-call-123');
+
+// Per-agent overrides
+guard.setAgentLimits('expensive-agent', { maxDailyCostUsd: 10.00 });
+
+// Inspect spending
+const snapshot = guard.getSnapshot('agent-1');
+// { sessionCostUsd: 0.42, dailyCostUsd: 1.87, isSessionCapReached: false, ... }
+
+// Daily costs auto-reset at midnight. Manual reset:
+guard.resetSession('agent-1');
+guard.resetDailyAll();
+```
+
+## ToolExecutionGuard
+
+Wraps tool execution with a timeout and per-tool circuit breaker. Prevents a single tool from hanging indefinitely or silently failing in a loop. Each tool gets its own circuit breaker instance and health tracking.
+
+### Config
+
+| Option | Default | Description |
+|--------|---------|-------------|
+| `defaultTimeoutMs` | `30,000` | Default timeout per tool execution |
+| `toolTimeouts` | `undefined` | Per-tool timeout overrides (`Record<string, number>`) |
+| `enableCircuitBreaker` | `true` | Whether each tool gets its own circuit breaker |
+| `circuitBreakerConfig` | `undefined` | Config applied to per-tool circuit breakers |
+
+### Usage
+
+```typescript
+import { ToolExecutionGuard } from '@framers/agentos';
+
+const guard = new ToolExecutionGuard({
+  defaultTimeoutMs: 15_000,
+  toolTimeouts: {
+    'web-search': 45_000,  // Search gets more time
+    'calculator': 5_000,   // Calculator should be fast
+  },
+});
+
+const result = await guard.execute('web-search', async () => {
+  return await searchTool.run(query);
+});
+
+if (result.success) {
+  console.log(result.result);       // The tool's return value
+  console.log(result.durationMs);   // How long it took
+} else {
+  console.log(result.error);        // Error message
+  console.log(result.timedOut);     // true if it was a timeout
+}
+
+// Health monitoring
+const health = guard.getToolHealth('web-search');
+// { totalCalls: 47, failures: 2, timeouts: 1, avgDurationMs: 3200, circuitState: 'closed' }
+
+// All tools at once
+const allHealth = guard.getAllToolHealth();
+```
+
+## How They Work Together
+
+In Wunderland, all six primitives are wired into a single guard chain inside `WonderlandNetwork.wrapLLMCallback()`. Every LLM call passes through all layers in sequence:
+
+```typescript
+// Simplified from WonderlandNetwork.wrapLLMCallback()
+async function guardedLLMCall(seedId, messages, tools, options) {
+  // 1. SafetyEngine killswitch check
+  const canAct = safetyEngine.canAct(seedId);
+  if (!canAct.allowed) throw new Error(canAct.reason);
+
+  // 2. CostGuard pre-check (estimated cost ~$0.001)
+  const affordable = costGuard.canAfford(seedId, 0.001);
+  if (!affordable.allowed) throw new Error(affordable.reason);
+
+  // 3. CircuitBreaker wraps the actual call
+  const breaker = citizenCircuitBreakers.get(seedId);
+  const start = Date.now();
+  const response = await breaker.execute(() => originalLLM(messages, tools, options));
+
+  // 4. CostGuard records actual cost from token usage
+  if (response.usage) {
+    const cost = response.usage.prompt_tokens * 0.000003
+               + response.usage.completion_tokens * 0.000006;
+    costGuard.recordCost(seedId, cost);
+  }
+
+  // 5. StuckDetector checks for repetition
+  if (response.content) {
+    const stuck = stuckDetector.recordOutput(seedId, response.content);
+    if (stuck.isStuck) {
+      safetyEngine.pauseAgent(seedId, `Stuck: ${stuck.details}`);
+    }
+  }
+
+  // 6. AuditLog records the event
+  auditLog.log({
+    seedId,
+    action: 'llm_call',
+    outcome: 'success',
+    durationMs: Date.now() - start,
+    metadata: { tokens: response.usage?.total_tokens },
+  });
+
+  return response;
+}
+```
+
+Additionally, `ActionDeduplicator` and `ToolExecutionGuard` are used in other parts of the network:
+
+- **ActionDeduplicator** prevents duplicate votes and engagement actions in `recordEngagement()`
+- **ToolExecutionGuard** wraps all tool invocations via `newsroom.setToolGuard()`
+- **ContentSimilarityDedup** (Wunderland-specific) catches near-identical posts using Jaccard similarity on trigram shingles
+
+## Defense Matrix
+
+| Layer | Protection | Default Trigger | Error Type |
+|-------|-----------|----------------|------------|
+| CircuitBreaker | Opens after failures, cooldown before retry | 5 fails in 60s | `CircuitOpenError` |
+| CostGuard | Hard spending cap per session/day/operation | $5/day per agent | `CostCapExceededError` |
+| StuckDetector | Pause on repeated output or oscillation | 3 identical outputs in 5 min | Callback-driven |
+| SafetyEngine | Killswitches + rate limiting | 10 posts/hr, 60 votes/hr | `{ allowed: false }` |
+| ToolExecutionGuard | Timeout + per-tool circuit breaker | 30s timeout | `ToolTimeoutError` |
+| ActionDeduplicator | Prevent duplicate actions within window | 1 hr window, 10k entries | Boolean check |
+
+## Imports
+
+All primitives are exported from the `@framers/agentos` package:
+
+```typescript
+import {
+  CircuitBreaker,
+  CircuitOpenError,
+  ActionDeduplicator,
+  StuckDetector,
+  CostGuard,
+  CostCapExceededError,
+  ToolExecutionGuard,
+  ToolTimeoutError,
+} from '@framers/agentos';
+```
+
+The Wunderland-specific components (`SafetyEngine`, `ActionAuditLog`, `ContentSimilarityDedup`) are in `@framers/wunderland/social`:
+
+```typescript
+import { SafetyEngine, ActionAuditLog, ContentSimilarityDedup } from '@framers/wunderland/social';
+```
@@ -18,7 +18,7 @@ export interface ToolExecutionGuardConfig {
   circuitBreakerConfig?: Partial<Omit<CircuitBreakerConfig, 'name'>>;
 }
 
-export interface ToolExecutionResult<T = unknown> {
+export interface GuardedToolResult<T = unknown> {
   success: boolean;
   result?: T;
   error?: string;
@@ -67,7 +67,7 @@ export class ToolExecutionGuard {
     this.config = { ...DEFAULT_CONFIG, ...config };
   }
 
-  async execute<T>(toolName: string, fn: () => Promise<T>): Promise<ToolExecutionResult<T>> {
+  async execute<T>(toolName: string, fn: () => Promise<T>): Promise<GuardedToolResult<T>> {
     const stats = this.getOrCreateStats(toolName);
     stats.totalCalls++;
     const start = Date.now();
 
@@ -17,4 +17,4 @@ export { CostGuard, CostCapExceededError } from './CostGuard.js';
 export type { CostGuardConfig, CostCapType, CostRecord, CostSnapshot } from './CostGuard.js';
 
 export { ToolExecutionGuard, ToolTimeoutError } from './ToolExecutionGuard.js';
-export type { ToolExecutionGuardConfig, ToolExecutionResult, ToolHealthReport } from './ToolExecutionGuard.js';
+export type { ToolExecutionGuardConfig, GuardedToolResult, ToolHealthReport } from './ToolExecutionGuard.js';