Skip to content

fix(agent): eliminate TOCTOU race in approval thread resolution#1543

Closed
zmanian wants to merge 2 commits intostagingfrom
fix/1486-approval-toctou-race
Closed

fix(agent): eliminate TOCTOU race in approval thread resolution#1543
zmanian wants to merge 2 commits intostagingfrom
fix/1486-approval-toctou-race

Conversation

@zmanian
Copy link
Copy Markdown
Collaborator

@zmanian zmanian commented Mar 21, 2026

Summary

  • Combines 4 separate lock acquisitions in process_approval() into a single atomic scope: check thread state, take pending approval, verify request_id, set auto-approve, and transition to Processing all happen under one lock
  • Eliminates the race window where concurrent operations could modify session state between check and mutation
  • Fixes rejection path to warn when thread disappears instead of silently succeeding

Closes #1486

Test plan

  • 2 regression tests: request_id mismatch preserves pending state, approval with always sets auto-approve + Processing atomically
  • All 358 agent tests pass
  • Clippy clean with --all-features
  • Both default and libsql features compile

Restructure process_approval() to hold the session lock for the entire
check-take-verify-mutate sequence. Previously, the lock was dropped after
taking the pending approval and re-acquired to verify request_id and set
state, creating a window where concurrent operations could modify thread
state or lose the pending approval.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@github-actions github-actions bot added scope: agent Agent core (agent loop, router, scheduler) size: L 200-499 changed lines risk: medium Business logic, config, or moderate-risk modules contributor: core 20+ merged PRs labels Mar 21, 2026
@gemini-code-assist
Copy link
Copy Markdown
Contributor

Summary of Changes

Hello, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request addresses a critical race condition in the agent's approval process, ensuring that session state is modified atomically. It also improves error handling and adds regression tests to prevent future occurrences of the issue. The changes enhance the reliability and security of the agent.

Highlights

  • TOCTOU Race Condition Fix: This PR eliminates a Time-of-Check Time-of-Use (TOCTOU) race condition in the agent's approval thread resolution process.
  • Atomic Operations: Combines multiple lock acquisitions into a single atomic scope within process_approval() to ensure consistent session state.
  • Improved Error Handling: Enhances the rejection path to provide warnings when a thread disappears during approval rejection, instead of silently succeeding.
  • Regression Tests: Adds two new regression tests to verify that request_id mismatch preserves pending state and that approval with 'always' atomically sets auto-approve and Processing state.
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

The pull request effectively addresses the TOCTOU (Time-of-Check to Time-of-Use) race condition in the process_approval function by consolidating multiple lock acquisitions into a single atomic scope. This significantly improves the reliability and correctness of thread state management during approval flows. The addition of a warning for disappeared threads during rejection paths enhances observability, and the new regression tests provide solid coverage for the atomic operations, ensuring the fix is robust.

Comment thread src/agent/thread_ops.rs
Comment on lines +891 to +933
let pending = match thread.take_pending_approval() {
Some(p) => p,
None => {
tracing::debug!(
%thread_id,
"Ignoring stale approval: no pending approval found"
);
return Ok(SubmissionResult::ok_with_message(""));
}
};

// Verify request ID if provided
if let Some(req_id) = request_id
&& req_id != pending.request_id
{
// Put it back and return error
let mut sess = session.lock().await;
if let Some(thread) = sess.threads.get_mut(&thread_id) {
// Verify request ID while still holding the lock
if let Some(req_id) = request_id
&& req_id != pending.request_id
{
// Restore the pending approval atomically without releasing the lock
thread.await_approval(pending);
return Ok(SubmissionResult::error(
"Request ID mismatch. Use the correct request ID.",
));
}
return Ok(SubmissionResult::error(
"Request ID mismatch. Use the correct request ID.",
));
}

if approved {
// If always, add to auto-approved set
if always {
let mut sess = session.lock().await;
sess.auto_approve_tool(&pending.tool_name);
// If approved with "always", set auto-approve while we still have the lock.
// Drop the thread borrow before calling sess methods to satisfy the borrow checker.
if approved && always {
let tool_name = pending.tool_name.clone();
sess.auto_approve_tool(&tool_name);
tracing::info!(
"Auto-approved tool '{}' for session {}",
pending.tool_name,
tool_name,
sess.id
);
}

// Reset thread state to processing
// Set thread state to Processing for approved requests.
// Re-borrow the thread since the previous borrow was dropped.
if approved
&& let Some(thread) = sess.threads.get_mut(&thread_id)
{
let mut sess = session.lock().await;
if let Some(thread) = sess.threads.get_mut(&thread_id) {
thread.state = ThreadState::Processing;
}
thread.state = ThreadState::Processing;
}

pending
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

critical

This refactoring correctly consolidates multiple state-modifying operations under a single lock acquisition. This is a critical change that eliminates the TOCTOU race condition, ensuring atomicity for pending approval management, request ID verification, auto-approval, and thread state transitions. This directly addresses the core issue outlined in the pull request.

References
  1. To prevent a time-of-check-to-time-of-use (TOCTOU) race condition when checking a condition under a read lock before performing a write, re-verify the condition after acquiring the write lock but before executing the write. This is known as the double-checked locking pattern.

Comment thread src/agent/thread_ops.rs
Comment on lines +1477 to +1480
tracing::warn!(
%thread_id,
"Thread disappeared during approval rejection"
);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Adding a tracing::warn! message when a thread unexpectedly disappears during the approval rejection process is a valuable improvement. This enhances the system's observability by logging an unusual state, which can be crucial for debugging and understanding unexpected behavior in production.

Comment thread src/agent/thread_ops.rs
Comment on lines +2023 to +2081
/// Regression test for #1486: verify that request_id mismatch atomically
/// restores the pending approval (no TOCTOU window where it could be lost).
#[test]
fn test_request_id_mismatch_preserves_pending_state() {
use crate::agent::session::{PendingApproval, Session, Thread, ThreadState};
use uuid::Uuid;

let session_id = Uuid::new_v4();
let thread_id = Uuid::new_v4();
let correct_request_id = Uuid::new_v4();
let wrong_request_id = Uuid::new_v4();

let mut thread = Thread::with_id(thread_id, session_id);
let pending = PendingApproval {
request_id: correct_request_id,
tool_name: "shell".to_string(),
parameters: serde_json::json!({"command": "rm -rf /"}),
display_parameters: serde_json::json!({"command": "[REDACTED]"}),
description: "Execute dangerous command".to_string(),
tool_call_id: "call_42".to_string(),
context_messages: vec![],
deferred_tool_calls: vec![],
user_timezone: None,
allow_always: false,
};
thread.await_approval(pending);

let mut session = Session::new("test-user");
session.threads.insert(thread_id, thread);

// Verify initial state
assert_eq!(
session.threads[&thread_id].state,
ThreadState::AwaitingApproval
);
assert!(session.threads[&thread_id].pending_approval.is_some());

// Simulate the atomic check-take-verify sequence from the fixed code:
// take the approval, then verify request_id, and restore on mismatch.
let thread = session.threads.get_mut(&thread_id).unwrap();
let taken = thread.take_pending_approval().unwrap();
assert!(thread.pending_approval.is_none(), "take should clear it");

// Wrong request_id -- restore atomically (within same "lock scope" in production)
assert_ne!(wrong_request_id, taken.request_id);
thread.await_approval(taken);

// Verify the pending approval is fully restored
assert_eq!(
session.threads[&thread_id].state,
ThreadState::AwaitingApproval
);
let restored = session.threads[&thread_id]
.pending_approval
.as_ref()
.expect("pending approval must be restored after request_id mismatch");
assert_eq!(restored.request_id, correct_request_id);
assert_eq!(restored.tool_name, "shell");
assert_eq!(restored.tool_call_id, "call_42");
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The new regression test test_request_id_mismatch_preserves_pending_state is excellent. It specifically validates the atomic restoration of pending approval state when a request ID mismatch occurs, directly confirming the fix for the TOCTOU race condition. This test is crucial for ensuring the stability of the approval mechanism.

References
  1. To prevent a time-of-check-to-time-of-use (TOCTOU) race condition when checking a condition under a read lock before performing a write, re-verify the condition after acquiring the write lock but before executing the write. This is known as the double-checked locking pattern.

Comment thread src/agent/thread_ops.rs
Comment on lines +2084 to +2141
/// Regression test for #1486: verify that approval with "always" flag
/// correctly transitions thread state and registers auto-approve in a
/// single lock scope.
#[test]
fn test_approval_with_always_sets_auto_approve_and_processing() {
use crate::agent::session::{PendingApproval, Session, Thread, ThreadState};
use uuid::Uuid;

let session_id = Uuid::new_v4();
let thread_id = Uuid::new_v4();
let request_id = Uuid::new_v4();

let mut thread = Thread::with_id(thread_id, session_id);
let pending = PendingApproval {
request_id,
tool_name: "web_fetch".to_string(),
parameters: serde_json::json!({"url": "https://example.com"}),
display_parameters: serde_json::json!({"url": "https://example.com"}),
description: "Fetch a web page".to_string(),
tool_call_id: "call_99".to_string(),
context_messages: vec![],
deferred_tool_calls: vec![],
user_timezone: None,
allow_always: true,
};
thread.await_approval(pending);

let mut session = Session::new("test-user");
session.threads.insert(thread_id, thread);

// Simulate the combined lock scope: take + verify + auto_approve + state transition
// Simulate the combined lock scope: take + verify + auto_approve + state transition.
// Use separate scopes to mirror the borrow pattern in production code.
let taken = {
let thread = session.threads.get_mut(&thread_id).unwrap();
thread.take_pending_approval().unwrap()
};

// Request ID matches
assert_eq!(taken.request_id, request_id);

// Auto-approve the tool (thread borrow dropped, so session is accessible)
session.auto_approve_tool(&taken.tool_name);

// Re-borrow thread and set state to Processing
{
let thread = session.threads.get_mut(&thread_id).unwrap();
thread.state = ThreadState::Processing;
}

// Verify both mutations happened atomically
assert!(session.is_tool_auto_approved("web_fetch"));
assert_eq!(
session.threads[&thread_id].state,
ThreadState::Processing
);
// Pending approval should be consumed (not restored)
assert!(session.threads[&thread_id].pending_approval.is_none());
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The test_approval_with_always_sets_auto_approve_and_processing regression test provides strong validation for the atomic execution of auto-approval and thread state transition. This test is well-designed to cover the combined effects of the changes, reinforcing the robustness of the process_approval function.

References
  1. To prevent a time-of-check-to-time-of-use (TOCTOU) race condition when checking a condition under a read lock before performing a write, re-verify the condition after acquiring the write lock but before executing the write. This is known as the double-checked locking pattern.

@zmanian
Copy link
Copy Markdown
Collaborator Author

zmanian commented Mar 23, 2026

Superseded by #1591 (consolidated TOCTOU + error handling fix)

@zmanian zmanian closed this Mar 23, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

contributor: core 20+ merged PRs risk: medium Business logic, config, or moderate-risk modules scope: agent Agent core (agent loop, router, scheduler) size: L 200-499 changed lines

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[CRITICAL] TOCTOU race condition in approval thread resolution

2 participants