Skip to content

Commit 2b84ddd

Browse files
mltheuserMalte Heuser
andauthored
KG-545 Fix requestLLMOnlyCallingTools ignoring tool calls after reasoning message (#1198)
## Motivation and Context Related to [KG-545](https://youtrack.jetbrains.com/issue/KG-545). Currently, `requestLLMOnlyCallingTools` relies on `executeSingle`, which returns the first message received from the LLM. When using models that output Chain of Thought or "Thinking" blocks (e.g., Nova, Claude) prior to calling a tool, the response sequence is often `[Message.Assistant(Thinking), Message.Tool.Call]`. As a result: 1. The method returns the "Thinking" text message instead of the expected Tool Call. 2. The actual Tool Call is discarded and never executed or saved to history. **Changes:** * Introduced `requestLLMMultipleOnlyCallingTools()` in `AIAgentLLMSession` to allow retrieving the full list of messages while enforcing `ToolChoice.Required`. * Updated `requestLLMOnlyCallingTools` to use this new method. It now persists **all** messages (preserving the reasoning context in the session history) but filters the return value to ensure the caller receives the `Message.Tool.Call`. ## Breaking Changes None. This is a behavioral fix to ensure the method contract (returning a tool call) is honored when the LLM is "chatty" or provides reasoning. --- #### Type of the changes - [ ] New feature (non-breaking change which adds functionality) - [x] Bug fix (non-breaking change which fixes an issue) - [ ] Breaking change (fix or feature that would cause existing functionality to change) - [ ] Documentation update - [ ] Tests improvement - [ ] Refactoring #### Checklist - [x] The pull request has a description of the proposed change - [x] I read the [Contributing Guidelines](https://github.com/JetBrains/koog/blob/main/CONTRIBUTING.md) before opening the pull request - [x] The pull request uses **`develop`** as the base branch - [x] Tests for the changes have been added - [x] All new and existing tests passed ##### Additional steps for pull requests adding a new feature - [x] An issue describing the proposed change exists - [x] The pull request includes a link to the issue - [x] The change was discussed and approved in the issue - [ ] Docs have been added / updated --------- Co-authored-by: Malte Heuser <[email protected]>
1 parent a15b6fc commit 2b84ddd

File tree

3 files changed

+146
-15
lines changed

3 files changed

+146
-15
lines changed

agents/agents-core/src/commonMain/kotlin/ai/koog/agents/core/agent/session/AIAgentLLMSession.kt

Lines changed: 30 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -146,17 +146,43 @@ public sealed class AIAgentLLMSession(
146146
/**
147147
* Sends a request to the language model that enforces the usage of tools and retrieves the response.
148148
*
149-
* This method updates the session's prompt configuration to mark tool usage as required before
150-
* executing the request. Additionally, it ensures the session is active before proceeding.
149+
* This method:
150+
* 1. Validates that the session is active.
151+
* 2. Updates the prompt configuration to mark tool usage as required (`ToolChoice.Required`).
152+
* 3. Retrieves all generated messages (including potential Chain of Thought/Reasoning blocks).
153+
* 4. Filters the result to return the first [Message.Tool.Call].
151154
*
152-
* @return The response from the language model after executing the request with enforced tool usage.
155+
* If no tool call is found (e.g., the model refused or asked a question), this method throws an exception.
156+
*
157+
* @return The tool call response from the language model.
153158
*/
154159
public open suspend fun requestLLMOnlyCallingTools(): Message.Response {
160+
validateSession()
161+
// We use the multiple-response method to ensure we capture all context (e.g. thinking)
162+
// even though we only return the specific tool call.
163+
val responses = requestLLMMultipleOnlyCallingTools()
164+
return responses.firstOrNull { it is Message.Tool.Call }
165+
?: error("requestLLMOnlyCallingTools expected at least one Tool.Call but received: ${responses.map { it::class.simpleName }}")
166+
}
167+
168+
/**
169+
* Sends a request to the language model that enforces the usage of tools and retrieves all responses.
170+
*
171+
* This is useful when the LLM returns multiple messages, such as a "thinking" block followed by tool calls,
172+
* or multiple parallel tool calls.
173+
*
174+
* This method:
175+
* 1. Validates that the session is active.
176+
* 2. Updates the prompt configuration to mark tool usage as required (`ToolChoice.Required`).
177+
*
178+
* @return A list of responses from the language model.
179+
*/
180+
public open suspend fun requestLLMMultipleOnlyCallingTools(): List<Message.Response> {
155181
validateSession()
156182
val promptWithOnlyCallingTools = prompt.withUpdatedParams {
157183
toolChoice = LLMParams.ToolChoice.Required
158184
}
159-
return executeSingle(promptWithOnlyCallingTools, tools)
185+
return executeMultiple(promptWithOnlyCallingTools, tools)
160186
}
161187

162188
/**

agents/agents-core/src/commonMain/kotlin/ai/koog/agents/core/agent/session/AIAgentLLMWriteSession.kt

Lines changed: 11 additions & 11 deletions
Original file line numberDiff line numberDiff line change
@@ -349,9 +349,7 @@ public class AIAgentLLMWriteSession internal constructor(
349349
*/
350350
override suspend fun requestLLMMultipleWithoutTools(): List<Message.Response> {
351351
return super.requestLLMMultipleWithoutTools().also { responses ->
352-
appendPrompt {
353-
responses.forEach { message(it) }
354-
}
352+
appendPrompt { messages(responses) }
355353
}
356354
}
357355

@@ -368,13 +366,17 @@ public class AIAgentLLMWriteSession internal constructor(
368366
}
369367

370368
/**
371-
* Requests a response from the Language Learning Model (LLM) while also processing
372-
* the response by updating the current prompt with the received message.
369+
* Requests a response from the Language Model (LLM) enforcing tool usage (`ToolChoice.Required`),
370+
* validates the session, and processes all returned messages (e.g. thinking + tool call).
371+
*
372+
* Crucially, this method appends **all** received messages to the prompt history to preserve context.
373373
*
374-
* @return The response received from the Language Learning Model (LLM).
374+
* @return A list of responses received from the Language Model (LLM).
375375
*/
376-
override suspend fun requestLLMOnlyCallingTools(): Message.Response {
377-
return super.requestLLMOnlyCallingTools().also { response -> appendPrompt { message(response) } }
376+
override suspend fun requestLLMMultipleOnlyCallingTools(): List<Message.Response> {
377+
return super.requestLLMMultipleOnlyCallingTools().also { responses ->
378+
appendPrompt { messages(responses) }
379+
}
378380
}
379381

380382
/**
@@ -419,9 +421,7 @@ public class AIAgentLLMWriteSession internal constructor(
419421
*/
420422
override suspend fun requestLLMMultiple(): List<Message.Response> {
421423
return super.requestLLMMultiple().also { responses ->
422-
appendPrompt {
423-
responses.forEach { message(it) }
424-
}
424+
appendPrompt { messages(responses) }
425425
}
426426
}
427427

agents/agents-core/src/commonTest/kotlin/ai/koog/agents/core/agent/session/AIAgentLLMWriteSessionTest.kt

Lines changed: 105 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -369,4 +369,109 @@ class AIAgentLLMWriteSessionTest {
369369
val response = session.requestLLM()
370370
assertEquals("Changed params response", response.content)
371371
}
372+
373+
@Test
374+
fun testRequestLLMMultipleOnlyCallingTools() = runTest {
375+
val thinkingContent = "<thinking>I need to use a tool</thinking>"
376+
val testTool = TestTool()
377+
378+
val mockExecutor = getMockExecutor(clock = testClock) {
379+
// Simulate [Assistant, ToolCall] sequence
380+
mockLLMMixedResponse(
381+
toolCalls = listOf(testTool to TestTool.Args("test")),
382+
responses = listOf(thinkingContent)
383+
) onCondition { true }
384+
}
385+
386+
val session = createSession(mockExecutor, listOf(testTool))
387+
388+
val responses = session.requestLLMMultipleOnlyCallingTools()
389+
390+
assertEquals(2, responses.size)
391+
assertEquals(thinkingContent, (responses[0] as Message.Assistant).content)
392+
assertEquals("test-tool", (responses[1] as Message.Tool.Call).tool)
393+
394+
// Verify that BOTH messages were appended to the prompt history in correct order
395+
val lastTwoMessages = session.prompt.messages.takeLast(2)
396+
assertEquals(thinkingContent, (lastTwoMessages[0] as Message.Assistant).content)
397+
assertEquals("test-tool", (lastTwoMessages[1] as Message.Tool.Call).tool)
398+
}
399+
400+
@Test
401+
fun testRequestLLMOnlyCallingToolsWithThinking() = runTest {
402+
val thinkingContent = "<thinking>Checking file...</thinking>"
403+
val testTool = TestTool()
404+
405+
val mockExecutor = getMockExecutor(clock = testClock) {
406+
mockLLMMixedResponse(
407+
toolCalls = listOf(testTool to TestTool.Args("test")),
408+
responses = listOf(thinkingContent)
409+
) onCondition { true }
410+
}
411+
412+
val session = createSession(mockExecutor, listOf(testTool))
413+
414+
val response = session.requestLLMOnlyCallingTools()
415+
416+
// It should strictly return the ToolCall (fixing the bug), skipping the thinking message
417+
assertTrue(response is Message.Tool.Call, "Expected response to be a Tool Call, not the thinking message")
418+
assertEquals("test-tool", response.tool)
419+
420+
// It should still persist the "Thinking" message in history in correct order
421+
val lastTwoMessages = session.prompt.messages.takeLast(2)
422+
assertEquals(thinkingContent, (lastTwoMessages[0] as Message.Assistant).content)
423+
assertEquals("test-tool", (lastTwoMessages[1] as Message.Tool.Call).tool)
424+
}
425+
426+
@Test
427+
fun testRequestLLMOnlyCallingToolsNoToolCallThrowsException() = runTest {
428+
val mockExecutor = getMockExecutor(clock = testClock) {
429+
// Simulate model refusing to use tools and just responding with text
430+
mockLLMAnswer("I cannot use tools for this request.").asDefaultResponse
431+
}
432+
433+
val session = createSession(mockExecutor, listOf(TestTool()))
434+
435+
val exception = kotlin.runCatching {
436+
session.requestLLMOnlyCallingTools()
437+
}.exceptionOrNull()
438+
439+
assertNotNull(exception, "Expected an exception when no tool call is found")
440+
assertTrue(
441+
exception is IllegalStateException,
442+
"Expected IllegalStateException but got ${exception::class.simpleName}"
443+
)
444+
assertTrue(
445+
exception.message?.contains("expected at least one Tool.Call") == true,
446+
"Exception message should indicate missing tool call"
447+
)
448+
}
449+
450+
@Test
451+
fun testRequestLLMOnlyCallingToolsWithMultipleToolCalls() = runTest {
452+
val testTool = TestTool()
453+
454+
val mockExecutor = getMockExecutor(clock = testClock) {
455+
// Simulate model returning multiple tool calls (parallel tool calling)
456+
mockLLMMixedResponse(
457+
toolCalls = listOf(
458+
testTool to TestTool.Args("first"),
459+
testTool to TestTool.Args("second")
460+
),
461+
responses = emptyList()
462+
) onCondition { true }
463+
}
464+
465+
val session = createSession(mockExecutor, listOf(testTool))
466+
467+
val response = session.requestLLMOnlyCallingTools()
468+
469+
// Should return the first tool call
470+
assertTrue(response is Message.Tool.Call, "Expected response to be a Tool Call")
471+
assertEquals("test-tool", response.tool)
472+
473+
// Both tool calls should be in history
474+
val lastTwoMessages = session.prompt.messages.takeLast(2)
475+
assertTrue(lastTwoMessages.all { it is Message.Tool.Call })
476+
}
372477
}

0 commit comments

Comments
 (0)