
Commit 3df90ff

itomek, kovtcharov, and claude authored
Add tool execution guardrails with confirmation popup (#438) (#565)
## Summary

- Adds a blocking confirmation popup in the Agent UI before `run_shell_command` executes, so users can Allow, Deny, or Always Allow each shell command the agent wants to run
- CLI path (`gaia chat`) is unaffected; it auto-approves as before
- Reuses the existing `PermissionPrompt.tsx` / `GaiaNotification` / `notificationStore` infrastructure already in the codebase

## Architecture

```
Agent thread (sync)                    SSE consumer (async)        Frontend
────────────────────                   ────────────────────        ────────
_execute_tool("run_shell_command")
  → console.confirm_tool_execution()
    → emit {"type":"tool_confirm"} ──→ yields SSE event ──→        ChatView.onAgentEvent
    → threading.Event.wait(60s)                                      → checks localStorage
         ↑                                                           → shows PermissionPrompt
         │                             POST /api/chat/confirm ←──    → user clicks Allow/Deny
         └── Event.set() ───────────   resolve_confirmation()
    → execute or return denial
```

## Files Changed

| File | Change |
|------|--------|
| `agents/base/console.py` | `confirm_tool_execution()` on `OutputHandler` (default `True`) |
| `agents/base/agent.py` | `TOOLS_REQUIRING_CONFIRMATION` + guardrail in `_execute_tool()` |
| `ui/sse_handler.py` | Blocking `confirm_tool_execution()` + `resolve_confirmation()` |
| `ui/server.py` | `app.state.active_sse_handlers` registry |
| `ui/_chat_helpers.py` | Register/unregister handler; pass `http_request` |
| `ui/routers/chat.py` | `POST /api/chat/confirm` endpoint |
| `ui/models.py` | `ToolConfirmRequest` model |
| `webui/src/types/index.ts` | `tool_confirm` event type + fields |
| `webui/src/services/api.ts` | `confirmToolExecution()` + event routing |
| `webui/src/components/ChatView.tsx` | Handle `tool_confirm`, localStorage auto-approve |
| `webui/src/stores/notificationStore.ts` | HTTP fallback, Always Allow persistence |

## Files Reused (no changes)

- `PermissionPrompt.tsx`: full modal UI with countdown, Allow/Deny/Always Allow, keyboard shortcuts
- `GaiaNotification` type: already has `tool`, `toolArgs`, `timeoutSeconds` fields

## Test plan

Manual (Agent UI):

1. `gaia chat --ui` → ask "run ls /tmp" → confirm popup appears with command shown
2. Click **Allow** → command executes, output visible in chat
3. Click **Deny** → agent responds gracefully without executing
4. Check **Remember** + **Allow** → reload page → same prompt skips popup
5. Do nothing for 60 s → popup auto-closes, command denied with warning

CLI regression:

- `gaia chat` → same shell command prompt → no popup, executes immediately

Fixes #438

---------

Co-authored-by: kovtcharov <kalin@extropolis.ai>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
1 parent 8c2d24a commit 3df90ff
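
The `ui/sse_handler.py` changes are not shown on this page, so the following is only a minimal sketch of the blocking handshake from the Architecture diagram, assuming one `threading.Event` per pending request keyed by `confirm_id`. The class name `BlockingConfirmHandler`, the `emit` callback, and the `_pending` dictionary are hypothetical; only the method names `confirm_tool_execution()` / `resolve_confirmation()` and the event field names come from the commit.

```python
import threading
import uuid
from typing import Any, Callable, Dict


class BlockingConfirmHandler:
    """Hypothetical stand-in for the SSE output handler (not the real class)."""

    def __init__(self, emit: Callable[[Dict[str, Any]], None], timeout: float = 60.0):
        self._emit = emit  # pushes an event toward the frontend (assumed callback)
        self._timeout = timeout
        self._lock = threading.Lock()
        self._pending: Dict[str, Dict[str, Any]] = {}

    def confirm_tool_execution(self, tool_name: str, tool_args: Dict[str, Any]) -> bool:
        """Runs on the agent thread; blocks until resolved or timed out."""
        confirm_id = uuid.uuid4().hex
        entry = {"event": threading.Event(), "allowed": False}
        # Register before emitting so a fast reply cannot miss the entry.
        with self._lock:
            self._pending[confirm_id] = entry
        self._emit({
            "type": "tool_confirm",
            "confirm_id": confirm_id,
            "tool": tool_name,
            "args": tool_args,
            "timeout_seconds": self._timeout,
        })
        answered = entry["event"].wait(self._timeout)
        with self._lock:
            self._pending.pop(confirm_id, None)
        # A timeout is treated the same as an explicit denial.
        return answered and entry["allowed"]

    def resolve_confirmation(self, confirm_id: str, allowed: bool) -> bool:
        """Runs on the HTTP side; returns False for unknown or expired IDs."""
        with self._lock:
            entry = self._pending.get(confirm_id)
        if entry is None:
            return False
        entry["allowed"] = allowed
        entry["event"].set()
        return True
```

In this sketch a timeout and an explicit Deny both make `confirm_tool_execution()` return `False`, which lines up with step 5 of the test plan (no response for 60 s means the command is denied).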

15 files changed

Lines changed: 640 additions & 52 deletions


docs/guides/agent-ui.mdx

Lines changed: 1 addition & 16 deletions
@@ -10,17 +10,6 @@ GAIA Agent UI is a desktop interface for running AI agents **100% locally** on y
 **Ready to install?** See the [Quickstart](/quickstart#agent-ui-fastest) for installation instructions.
 </Info>

-<Warning>
-**Tested Configuration:** The Agent UI has been tested exclusively on **AMD Ryzen AI MAX+ 395** processors running the **Qwen3-Coder-30B-A3B-Instruct-GGUF** model via Lemonade Server. Other hardware or model combinations may work but are not officially verified.
-
-If you encounter issues on a different configuration, please [open a GitHub issue](https://github.com/amd/gaia/issues/new) and include:
-- Your processor model (e.g., Ryzen AI 9 HX 370, Ryzen AI MAX+ 395)
-- RAM and available memory
-- The LLM model you are using
-- Operating system and version
-- Steps to reproduce the issue
-</Warning>
-
 ---

 ## What You Can Do
@@ -85,11 +74,7 @@ See the [Agent UI MCP Server guide](/guides/mcp/agent-ui) for setup instructions

 <Accordion title="Port 4200 already in use">
 ```bash
-# npm CLI
-gaia-ui --port 8080
-
-# Python CLI
-gaia --ui-port 8080
+gaia --ui --ui-port 8080
 ```
 </Accordion>

docs/sdk/sdks/agent-ui.mdx

Lines changed: 8 additions & 14 deletions
@@ -19,10 +19,6 @@ from gaia.ui.models import SystemStatus, ChatRequest, SessionResponse, DocumentR

 **See also:** [User Guide](/guides/agent-ui) | [Agent SDK](/sdk/sdks/chat) | [API Specification](/spec/agent-ui-server)

-<Warning>
-**Tested Configuration:** The Agent UI has been tested on **AMD Ryzen AI MAX+ 395** with **Qwen3-Coder-30B-A3B-Instruct-GGUF**. Other configurations are not officially verified. See the [User Guide](/guides/agent-ui) for full details and how to report issues on other hardware.
-</Warning>
-
 ---

 ## Overview
@@ -350,7 +346,6 @@ class MessageResponse(BaseModel):
     content: str
     created_at: str
     rag_sources: Optional[List[SourceInfo]] = None
-    agent_steps: Optional[List[AgentStepResponse]] = None

 class MessageListResponse(BaseModel):
     messages: List[MessageResponse]
@@ -370,7 +365,6 @@ class DocumentResponse(BaseModel):
     indexed_at: str
     last_accessed_at: Optional[str] = None
     sessions_using: int = 0
-    indexing_status: str = "complete"  # pending | indexing | complete | failed | cancelled | missing

 class DocumentListResponse(BaseModel):
     documents: List[DocumentResponse]
@@ -865,8 +859,8 @@ from gaia.rag.sdk import RAGSDK, RAGConfig

 config = RAGConfig()
 rag = RAGSDK(config)
-result = rag.index_document(filepath)
-chunk_count = result.get("num_chunks", 0)
+result = rag.index_file(filepath)
+chunk_count = result.get("chunk_count", 0)
 ```

 ---
@@ -879,16 +873,16 @@ GAIA Agent UI is also available as an npm package for quick installation:
 npm install -g @amd-gaia/agent-ui
 ```

-This provides the `gaia-ui` CLI command:
+This provides the `gaia` CLI command:

 ```bash
-gaia-ui              # Start Python backend + open browser
-gaia-ui --serve      # Serve frontend only (Node.js static server)
-gaia-ui --port 8080  # Custom port
-gaia-ui --version    # Show version
+gaia              # Start Python backend + open browser
+gaia --serve      # Serve frontend only (Node.js static server)
+gaia --port 8080  # Custom port
+gaia --version    # Show version
 ```

-On first run, `gaia-ui` automatically installs the Python backend (uv, Python 3.12, amd-gaia) if not already present. On subsequent runs, it auto-updates if the version doesn't match.
+On first run, `gaia` automatically installs the Python backend (uv, Python 3.12, amd-gaia) if not already present.

 ### Package Contents

src/gaia/agents/base/agent.py

Lines changed: 15 additions & 0 deletions
@@ -31,6 +31,11 @@
 CHUNK_TRUNCATION_THRESHOLD = 5000
 CHUNK_TRUNCATION_SIZE = 2500

+# Tools that require explicit user confirmation before execution.
+# Adding a tool name here causes _execute_tool() to call
+# console.confirm_tool_execution() and block until the user responds.
+TOOLS_REQUIRING_CONFIRMATION = {"run_shell_command"}
+

 class Agent(abc.ABC):
     """
@@ -1148,6 +1153,16 @@ def _execute_tool(self, tool_name: str, tool_args: Dict[str, Any]) -> Any:
             logger.error(f"Tool '{tool_name}' not found in registry")
             return {"status": "error", "error": f"Tool '{tool_name}' not found"}

+        # Guardrail: require explicit user confirmation for high-risk tools.
+        # The SSEOutputHandler overrides this to block until the frontend
+        # responds; the default implementation auto-approves (CLI path).
+        if tool_name in TOOLS_REQUIRING_CONFIRMATION:
+            if not self.console.confirm_tool_execution(tool_name, tool_args):
+                return {
+                    "status": "denied",
+                    "error": f"Tool '{tool_name}' was denied by the user.",
+                }
+
         tool = _TOOL_REGISTRY[tool_name]["function"]
         sig = inspect.signature(tool)
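
To see what a caller gets back from the guardrail above, here is a small standalone sketch. It assumes a stub tool and two stand-in console classes; `DenyAllConsole`, `AutoApproveConsole`, the free function `execute_tool`, and the stub registry entry are hypothetical and only mirror the control flow shown in the hunk.

```python
from typing import Any, Callable, Dict

TOOLS_REQUIRING_CONFIRMATION = {"run_shell_command"}

# Hypothetical registry holding a stub; the real tool would run a shell command.
_TOOL_REGISTRY: Dict[str, Dict[str, Callable[..., Any]]] = {
    "run_shell_command": {
        "function": lambda command: {"status": "success", "command": command},
    },
}


class AutoApproveConsole:
    """Stand-in for the default handler: always approves (CLI behaviour)."""

    def confirm_tool_execution(self, tool_name: str, tool_args: Dict[str, Any]) -> bool:
        return True


class DenyAllConsole:
    """Stand-in for a handler whose user clicked Deny or let the prompt time out."""

    def confirm_tool_execution(self, tool_name: str, tool_args: Dict[str, Any]) -> bool:
        return False


def execute_tool(console: Any, tool_name: str, tool_args: Dict[str, Any]) -> Any:
    """Simplified version of the guarded dispatch, without signature inspection."""
    if tool_name not in _TOOL_REGISTRY:
        return {"status": "error", "error": f"Tool '{tool_name}' not found"}
    if tool_name in TOOLS_REQUIRING_CONFIRMATION:
        if not console.confirm_tool_execution(tool_name, tool_args):
            return {
                "status": "denied",
                "error": f"Tool '{tool_name}' was denied by the user.",
            }
    return _TOOL_REGISTRY[tool_name]["function"](**tool_args)
```

Because a denial is returned as an ordinary result dictionary rather than raised, the agent loop can pass it back to the model like any other tool output, which is presumably what lets the agent "respond gracefully" in the test plan.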

src/gaia/agents/base/console.py

Lines changed: 8 additions & 0 deletions
@@ -205,6 +205,14 @@ def print_header(self, text: str): # pylint: disable=unused-argument
         """Print header. Optional - default no-op."""
         ...

+    def confirm_tool_execution(
+        self,
+        tool_name: str,  # pylint: disable=unused-argument
+        tool_args: Dict[str, Any],  # pylint: disable=unused-argument
+    ) -> bool:
+        """Request user confirmation before executing a tool. Returns True to proceed."""
+        return True
+
     def print_separator(self, length: int = 50):  # pylint: disable=unused-argument
         """Print separator. Optional - default no-op."""
         ...
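
Since the hook defaults to `True`, any output handler can opt in to stricter behaviour by overriding it. The sketch below is purely illustrative and is not something this commit adds: `PromptingOutputHandler` and its injectable `ask` callable are hypothetical, and `OutputHandler` is reduced here to the single method from the hunk above.

```python
from typing import Any, Callable, Dict


class OutputHandler:
    """Reduced stand-in for the base class; only the hook added in this diff."""

    def confirm_tool_execution(self, tool_name: str, tool_args: Dict[str, Any]) -> bool:
        return True


class PromptingOutputHandler(OutputHandler):
    """Hypothetical subclass that asks on the terminal instead of auto-approving."""

    def __init__(self, ask: Callable[[str], str] = input):
        # `ask` is injectable so the handler can be exercised without a TTY.
        self._ask = ask

    def confirm_tool_execution(self, tool_name: str, tool_args: Dict[str, Any]) -> bool:
        answer = self._ask(f"Allow {tool_name} with {tool_args}? [y/N] ")
        # Anything other than an explicit yes counts as a denial.
        return answer.strip().lower() in {"y", "yes"}
```

Defaulting to "deny unless explicitly approved" keeps the override consistent with the timeout behaviour described for the Agent UI.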

src/gaia/apps/webui/src/components/ChatView.tsx

Lines changed: 41 additions & 0 deletions
@@ -5,6 +5,8 @@ import { useEffect, useRef, useCallback, useState } from 'react';
 import { Edit3, Paperclip, Download, Send, Upload, MessageSquare, Square, ArrowDown, Lock, FileText, FolderSearch, CheckCircle2, X, Link } from 'lucide-react';
 import { MessageBubble } from './MessageBubble';
 import { useChatStore } from '../stores/chatStore';
+import { useNotificationStore, ALWAYS_ALLOW_TOOLS_KEY } from '../stores/notificationStore';
+import type { GaiaNotification } from '../types/agent';
 import * as api from '../services/api';
 import { log } from '../utils/logger';
 import { getSessionHash } from '../utils/format';
@@ -129,6 +131,8 @@ export function ChatView({ sessionId }: ChatViewProps) {
     systemStatus,
   } = useChatStore();

+  const { addNotification } = useNotificationStore();
+
   const session = sessions.find((s) => s.id === sessionId);
   const [input, setInput] = useState('');
   const [editingTitle, setEditingTitle] = useState(false);
@@ -598,6 +602,43 @@ export function ChatView({ sessionId }: ChatViewProps) {
         }
       },
       onAgentEvent: (event) => {
+        // ── Tool confirmation popup ──────────────────────────────
+        if (event.type === 'tool_confirm') {
+          if (!event.confirm_id) {
+            console.error('[ChatView] tool_confirm event missing confirm_id, ignoring');
+            return;
+          }
+          const toolName = event.tool || '';
+          const alwaysAllowed: string[] = JSON.parse(
+            localStorage.getItem(ALWAYS_ALLOW_TOOLS_KEY) || '[]'
+          );
+          if (alwaysAllowed.includes(toolName)) {
+            // Auto-approve without showing the modal
+            api.confirmToolExecution(sessionId, event.confirm_id, 'allow', false).catch(
+              (err) => console.error('[ChatView] auto-confirm failed:', err)
+            );
+            return;
+          }
+          // Show the PermissionPrompt modal via notificationStore
+          const notification: GaiaNotification = {
+            id: event.confirm_id,
+            type: 'permission_request',
+            agentId: 'chat',
+            agentName: 'GAIA',
+            title: `Allow ${toolName}?`,
+            message: `The agent wants to execute: ${toolName}`,
+            timestamp: Date.now(),
+            read: false,
+            dismissed: false,
+            priority: 'high',
+            tool: toolName,
+            toolArgs: event.args as Record<string, unknown> | undefined,
+            timeoutSeconds: event.timeout_seconds ?? 60,
+          };
+          addNotification(notification);
+          return;
+        }
+
         // Tool completion updates the last TOOL step (not just the last step,
         // since thinking/status events may have been interleaved during execution)
         if (event.type === 'tool_end') {

src/gaia/apps/webui/src/services/api.ts

Lines changed: 13 additions & 1 deletion
@@ -143,7 +143,7 @@ export interface StreamCallbacks {
 /** Agent event types that represent activity rather than content. */
 const AGENT_EVENT_TYPES = new Set([
   'status', 'step', 'thinking', 'plan',
-  'tool_start', 'tool_end', 'tool_result', 'tool_args', 'agent_error',
+  'tool_start', 'tool_end', 'tool_result', 'tool_args', 'tool_confirm', 'agent_error',
 ]);

 export function sendMessageStream(
@@ -277,6 +277,18 @@ export function sendMessageStream(
   return controller;
 }

+// -- Tool Confirmation ---------------------------------------------------------
+
+/** Resolve a pending tool execution confirmation (Allow or Deny). */
+export async function confirmToolExecution(
+  sessionId: string,
+  confirmId: string,
+  action: 'allow' | 'deny',
+  remember: boolean,
+): Promise<void> {
+  return apiFetch('POST', '/chat/confirm', { session_id: sessionId, confirm_id: confirmId, action, remember });
+}
+
 // -- Documents -----------------------------------------------------------------

 export async function listDocuments(): Promise<{ documents: Document[]; total: number; total_size_bytes: number; total_chunks: number }> {
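
The server side of this call (`ToolConfirmRequest` in `ui/models.py` and the `POST /api/chat/confirm` route) is listed in the commit message but not shown on this page. As a rough illustration of the payload that `confirmToolExecution()` sends, the sketch below validates the same four fields with a plain dataclass. `ToolConfirmPayload` and `parse_confirm_payload` are hypothetical names, and the real model may well be defined differently.

```python
from dataclasses import dataclass
from typing import Any, Dict


@dataclass(frozen=True)
class ToolConfirmPayload:
    """Hypothetical mirror of the JSON body sent by confirmToolExecution()."""

    session_id: str
    confirm_id: str
    action: str  # 'allow' or 'deny'
    remember: bool = False


def parse_confirm_payload(body: Dict[str, Any]) -> ToolConfirmPayload:
    """Validate a decoded JSON body and return a typed payload."""
    action = body.get("action")
    if action not in ("allow", "deny"):
        raise ValueError(f"action must be 'allow' or 'deny', got {action!r}")
    for key in ("session_id", "confirm_id"):
        if not isinstance(body.get(key), str) or not body[key]:
            raise ValueError(f"{key} must be a non-empty string")
    return ToolConfirmPayload(
        session_id=body["session_id"],
        confirm_id=body["confirm_id"],
        action=action,
        remember=bool(body.get("remember", False)),
    )
```

Rejecting any `action` other than `'allow'` or `'deny'` keeps the server from treating a malformed request as an approval.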

src/gaia/apps/webui/src/stores/notificationStore.ts

Lines changed: 32 additions & 4 deletions
@@ -11,12 +11,17 @@

 import { create } from 'zustand';
 import type { GaiaNotification, NotificationType } from '../types/agent';
+import { confirmToolExecution } from '../services/api';
+import { useChatStore } from './chatStore';

 // ── Constants ────────────────────────────────────────────────────────────

 /** Maximum notifications kept in the center to prevent unbounded growth. */
 const MAX_NOTIFICATIONS = 500;

+/** localStorage key for the "always allow" tool list. */
+export const ALWAYS_ALLOW_TOOLS_KEY = 'gaia_always_allow_tools';
+
 // ── State Interface ──────────────────────────────────────────────────────

 interface NotificationState {
@@ -78,18 +83,41 @@ export const useNotificationStore = create<NotificationState>((set, get) => ({
   setTypeFilter: (type) => set({ typeFilter: type }),

   respondToPermission: async (id, action, remember) => {
-    const api = window.gaiaAPI;
-    if (api) {
+    const electronApi = window.gaiaAPI;
+    if (electronApi) {
+      // Electron path: route via IPC
       try {
-        await api.notification.respondPermission(id, action, remember);
+        await electronApi.notification.respondPermission(id, action, remember);
       } catch (err) {
         console.error('[notificationStore] Failed to send permission response via IPC:', err);
         // Don't update local state — the agent didn't receive the response.
         // The permission prompt remains actionable so the user can retry.
         return;
       }
+    } else {
+      // Web path: route via HTTP to /api/chat/confirm
+      const sessionId = useChatStore.getState().currentSessionId;
+      if (sessionId) {
+        try {
+          await confirmToolExecution(sessionId, id, action, remember);
+        } catch (err) {
+          console.error('[notificationStore] Failed to send permission response via HTTP:', err);
+          return;
+        }
+      }
+    }
+    // Persist "always allow" preference in localStorage
+    if (action === 'allow' && remember) {
+      const notification = get().notifications.find((n) => n.id === id);
+      if (notification?.tool) {
+        const existing: string[] = JSON.parse(localStorage.getItem(ALWAYS_ALLOW_TOOLS_KEY) || '[]');
+        if (!existing.includes(notification.tool)) {
+          existing.push(notification.tool);
+          localStorage.setItem(ALWAYS_ALLOW_TOOLS_KEY, JSON.stringify(existing));
+        }
+      }
     }
-    // Update local state only after IPC succeeds (or if no IPC is available)
+    // Update local state after response is delivered
     set((state) => ({
       notifications: state.notifications.map((n) =>
         n.id === id

src/gaia/apps/webui/src/styles/index.css

Lines changed: 17 additions & 0 deletions
@@ -348,6 +348,23 @@ textarea:focus-visible {
   line-height: 1.3;
 }

+/* ── Beta Badge ─────────────────────────────────────────────────── */
+
+.beta-badge {
+  display: inline-block;
+  font-size: 10px;
+  font-weight: 800;
+  font-family: var(--font-mono);
+  text-transform: uppercase;
+  letter-spacing: 1px;
+  padding: 2px 6px;
+  border-radius: 3px;
+  background: var(--accent-yellow);
+  color: #111;
+  vertical-align: middle;
+  line-height: 1.3;
+}
+
 /* ── Modal Base ──────────────────────────────────────────────────── */

 .modal-overlay {

src/gaia/apps/webui/src/types/index.ts

Lines changed: 18 additions & 13 deletions
@@ -197,19 +197,20 @@ export interface AgentStep {

 /** Extended SSE event types for agent communication. */
 export type StreamEventType =
-  | 'chunk'       // Text content chunk
-  | 'done'        // Stream complete
-  | 'error'       // Error
-  | 'status'      // Agent state change
-  | 'step'        // Step progress
-  | 'thinking'    // Agent reasoning
-  | 'plan'        // Agent plan
-  | 'tool_start'  // Tool execution started
-  | 'tool_end'    // Tool execution completed
-  | 'tool_result' // Tool result summary
-  | 'tool_args'   // Tool arguments detail
-  | 'answer'      // Final answer from agent
-  | 'agent_error';// Agent-level error (non-fatal)
+  | 'chunk'        // Text content chunk
+  | 'done'         // Stream complete
+  | 'error'        // Error
+  | 'status'       // Agent state change
+  | 'step'         // Step progress
+  | 'thinking'     // Agent reasoning
+  | 'plan'         // Agent plan
+  | 'tool_start'   // Tool execution started
+  | 'tool_end'     // Tool execution completed
+  | 'tool_result'  // Tool result summary
+  | 'tool_args'    // Tool arguments detail
+  | 'tool_confirm' // Tool requires user confirmation (blocking)
+  | 'answer'       // Final answer from agent
+  | 'agent_error'; // Agent-level error (non-fatal)

 export interface StreamEvent {
   type: StreamEventType;
@@ -243,6 +244,10 @@ export interface StreamEvent {
     duration_seconds?: number;
     truncated?: boolean;
   };
+  /** Confirmation ID (for tool_confirm events). */
+  confirm_id?: string;
+  /** Timeout in seconds (for tool_confirm events). */
+  timeout_seconds?: number;
   /** Structured result data (for tool_result with search results, file lists, etc.). */
   result_data?: {
     type: string;
