Skip to content
Closed
Show file tree
Hide file tree
Changes from 3 commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
119 changes: 119 additions & 0 deletions crates/goose/src/agents/moim.rs
Original file line number Diff line number Diff line change
Expand Up @@ -35,6 +35,12 @@ pub async fn inject_moim(
let has_unexpected_issues = issues.iter().any(|issue| {
!issue.contains("Merged consecutive user messages")
&& !issue.contains("Merged consecutive assistant messages")
&& !issue.contains("Merged text content")
&& !issue.contains("Removed orphaned tool response")
&& !issue.contains("Removed orphaned tool request")
&& !issue.contains("Removed empty message")
&& !issue.contains("Removed leading assistant message")
&& !issue.contains("Removed trailing assistant message")
Comment thread
wpfleger96 marked this conversation as resolved.
Comment thread
wpfleger96 marked this conversation as resolved.
Comment on lines +42 to +43
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Keep assistant-removal issues out of MOIM allowlist

Allowlisting "Removed leading assistant message" and "Removed trailing assistant message" here makes inject_moim accept fix_conversation outputs that delete assistant turns instead of falling back to the original conversation. In compacted sessions, the compaction summary/continuation is stored in assistant message(s) (compact_messages builds and merges them in context_mgmt/mod.rs), so once MOIM is inserted and fix_conversation trims a leading/trailing assistant, this path now returns a context with that compaction payload removed, which can drop critical summary/tool-loop guidance for the next provider call.

Useful? React with 👍 / 👎.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Remove assistant-trim issues from MOIM allowlist

Allowlisting "Removed leading assistant message"/"Removed trailing assistant message" here lets inject_moim accept fix_conversation outputs that delete assistant turns instead of falling back to the original conversation. In recovery compaction when no trailing text-only user message is preserved, compact_messages builds the compacted context as assistant summary + assistant continuation (crates/goose/src/context_mgmt/mod.rs:151-167), so accepting a trailing-assistant removal drops the only compacted payload before the next provider call. Fresh evidence: those two compaction assistant messages are merged into one assistant turn there, so removing that turn deletes both the summary and tool-loop continuation guidance.

Useful? React with 👍 / 👎.

Comment on lines +42 to +43
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Remove assistant-trim issues from MOIM allowlist

Allowing "Removed leading assistant message" and "Removed trailing assistant message" here makes inject_moim accept fix_conversation results that delete assistant turns instead of falling back to the original conversation. Fresh evidence: compact_messages constructs the compacted payload as assistant summary + assistant continuation (crates/goose/src/context_mgmt/mod.rs around lines 151-167), and in the no-preserved-user path fix_lead_trail removes both assistant endpoints after MOIM inserts a user message between them, so the next provider call can lose the entire compacted summary/tool-loop guidance.

Useful? React with 👍 / 👎.

});

if has_unexpected_issues {
Expand Down Expand Up @@ -140,4 +146,117 @@ mod tests {
"MOIM should be in message before latest assistant message"
);
}

// After the allowlist fix: MOIM now applies the fix when fix_conversation detects an
// orphaned tool request, removing it from the conversation.
#[tokio::test]
async fn test_moim_fixes_orphaned_tool_request() {
let temp_dir = tempfile::tempdir().unwrap();
let em = ExtensionManager::new_without_provider(temp_dir.path().to_path_buf());
let working_dir = PathBuf::from("/test/dir");

let broken_conv = Conversation::new_unvalidated(vec![
Message::user().with_text("Do something"),
Message::assistant()
.with_text("I'll call a tool")
.with_tool_request("orphan_tool_1", Ok(CallToolRequestParams::new("some_tool"))),
]);

// Control: fix_conversation alone correctly removes the orphaned tool request.
let (_, issues) = fix_conversation(Conversation::new_unvalidated(
broken_conv.messages().clone(),
));
assert!(
issues
.iter()
.any(|i| i.contains("Removed orphaned tool request")),
"fix_conversation alone should detect the orphan, but issues were: {:?}",
issues
);

// inject_moim now applies the fix (orphan removal is in the allowlist).
let result = inject_moim("test-session-id", broken_conv.clone(), &em, &working_dir).await;
let msgs = result.messages();

// The orphaned ToolRequest should be removed.
let has_orphan = msgs.iter().any(|m| {
m.content.iter().any(|c| {
matches!(c, crate::conversation::message::MessageContent::ToolRequest(tr) if tr.id == "orphan_tool_1")
})
});
assert!(
!has_orphan,
"Orphaned tool request should have been removed by MOIM's fix_conversation"
);

// MOIM should have been injected (conversation was fixed).
let has_moim = msgs.iter().any(|m| {
m.content
.iter()
.any(|c| c.as_text().is_some_and(|t| t.contains("<info-msg>")))
});
assert!(
has_moim,
"MOIM should have been injected after fixing the orphan"
);
}

// After the allowlist fix: MOIM now fixes the cancellation scenario — removes the
// orphaned tool request and empty user message, preventing the Anthropic 400 error.
#[tokio::test]
async fn test_moim_fixes_cancellation_orphan() {
let temp_dir = tempfile::tempdir().unwrap();
let em = ExtensionManager::new_without_provider(temp_dir.path().to_path_buf());
let working_dir = PathBuf::from("/test/dir");

// Simulates what the agent produces after cancellation:
// - Valid prior exchange
// - Assistant issues a tool call
// - Pre-allocated user message was never populated (cancellation fired first)
let broken_conv = Conversation::new_unvalidated(vec![
Message::user().with_text("Search for something"),
Message::assistant().with_text("I searched and found results"),
Message::user().with_text("Now do something else"),
Message::assistant()
.with_text("I'll call a tool")
.with_tool_request(
"cancelled_tool",
Ok(CallToolRequestParams::new("some_tool")),
),
// Pre-allocated Message::user() never populated due to cancellation.
Message::user(),
]);

let result = inject_moim("test-session-id", broken_conv.clone(), &em, &working_dir).await;
let msgs = result.messages();

// The orphaned ToolRequest should be removed.
let has_orphan = msgs.iter().any(|m| {
m.content.iter().any(|c| {
matches!(c, crate::conversation::message::MessageContent::ToolRequest(tr) if tr.id == "cancelled_tool")
})
});
assert!(
!has_orphan,
"Orphaned tool request should have been removed by MOIM's fix_conversation"
);

// The empty user message should be removed.
let has_empty = msgs.iter().any(|m| m.content.is_empty());
assert!(
!has_empty,
"Empty user message should have been removed by MOIM's fix_conversation"
);

// MOIM should have been injected.
let has_moim = msgs.iter().any(|m| {
m.content
.iter()
.any(|c| c.as_text().is_some_and(|t| t.contains("<info-msg>")))
});
assert!(
has_moim,
"MOIM should have been injected after fixing the cancellation orphan"
);
}
}
98 changes: 98 additions & 0 deletions crates/goose/src/context_mgmt/mod.rs
Original file line number Diff line number Diff line change
Expand Up @@ -799,4 +799,102 @@ mod tests {
let result = tool_id_to_summarize(&updated_conversation, 3);
assert!(result.is_none(), "Nothing left to summarize");
}

// Control test: after compaction + appending new tool calls, fix_conversation
// produces zero issues. Proves compaction output is structurally clean and
// compatible with the fix pipeline when new tool pairs are added.
#[tokio::test]
async fn test_compaction_output_with_new_tool_pair_is_valid() {
let response_message = Message::assistant().with_text("<mock summary>");
let provider = MockProvider::new(response_message, 1000);

// Pre-compaction conversation: user text, 3 tool pairs, user text at end
let mut messages = vec![Message::user().with_text("Help me with a task")];
for i in 0..3 {
messages.push(Message::assistant().with_tool_request(
format!("old_tool_{}", i),
Ok(CallToolRequestParams::new("some_tool")),
));
messages.push(Message::user().with_tool_response(
format!("old_tool_{}", i),
Ok(rmcp::model::CallToolResult::success(vec![
RawContent::text(format!("result {}", i)).no_annotation(),
])),
));
}
messages.push(Message::user().with_text("Now do something else"));

let conversation = Conversation::new_unvalidated(messages);
let (compacted, _usage) =
compact_messages(&provider, "test-session-id", &conversation, false)
.await
.unwrap();

// Simulate post-compaction agent loop: LLM makes a new tool call, tool executes
let new_asst = Message::assistant()
.with_text("I'll run a new tool")
.with_tool_request("new_tool_1", Ok(CallToolRequestParams::new("new_tool")));
let new_tool_resp = Message::user().with_tool_response(
"new_tool_1",
Ok(rmcp::model::CallToolResult::success(vec![
RawContent::text("new result").no_annotation(),
])),
);
// Rebuild conversation from compacted messages + new tool pair.
// Can't use Conversation::push() because it merges by ID, and we need
// the messages to be separate entries.
let mut all_msgs = compacted.messages().clone();
all_msgs.push(new_asst);
all_msgs.push(new_tool_resp);
let compacted = Conversation::new_unvalidated(all_msgs);

// fix_conversation should produce zero issues — the conversation is valid
let (fixed, issues) = crate::conversation::fix_conversation(compacted.clone());

assert!(
issues.is_empty(),
"Compacted conversation + new tool pair should have no issues, but found: {:?}",
issues
);

// The new tool pair should be intact in the visible messages
let visible = fixed.agent_visible_messages();
let has_new_tool_request = visible.iter().any(|m| {
m.content
.iter()
.any(|c| matches!(c, MessageContent::ToolRequest(tr) if tr.id == "new_tool_1"))
});
let has_new_tool_response = visible.iter().any(|m| {
m.content
.iter()
.any(|c| matches!(c, MessageContent::ToolResponse(tr) if tr.id == "new_tool_1"))
});
assert!(
has_new_tool_request,
"New tool request should be in visible messages"
);
assert!(
has_new_tool_response,
"New tool response should be in visible messages"
);

// The old tool pairs should still exist as non-visible messages
let all_msgs = fixed.messages();
let non_visible_tool_count = all_msgs
.iter()
.filter(|m| {
!m.metadata.agent_visible
&& m.content.iter().any(|c| {
matches!(
c,
MessageContent::ToolRequest(_) | MessageContent::ToolResponse(_)
)
})
})
.count();
assert!(
non_visible_tool_count > 0,
"Old tool pairs should be preserved as non-visible messages"
);
}
}
26 changes: 0 additions & 26 deletions ui/desktop/package-lock.json

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

Loading