Nested agents required for agent swarm/orchestration doesn't work #6381

Open
@js0926861

Description

We are trying to build a multi-agent system. Below is the code being used.

import { openai } from '@ai-sdk/openai';
import {
  convertToCoreMessages,
  createDataStreamResponse,
  streamText,
  tool,
} from 'ai';
import { z } from 'zod';

export async function POST(req: Request) {
  const { messages } = await req.json();

  return createDataStreamResponse({
    execute: async dataStream => {
      // step 1 example: forced tool call
      const result1 = streamText({
        maxSteps: 3,
        model: openai('gpt-4o-mini', { structuredOutputs: true }),
        system: 'Extract the user goal from the conversation.',
        messages,
        toolChoice: 'required', // force the model to call a tool
        tools: {
          extractGoal: tool({
            parameters: z.object({ goal: z.string() }),
            execute: async ({ goal }) => {
              const result2 = streamText({
                // different system prompt, different model, no tools:
                model: openai('gpt-4o'),
                maxSteps: 3,
                system:
                  'You are a helpful assistant with a different system prompt. Repeat the extract user goal in your answer.',
                // continue the workflow stream with the messages from the previous step:
                messages: [
                  ...convertToCoreMessages(messages),
                ],
              });

              // forward the 2nd result to the client (incl. the finish event):
              result2.mergeIntoDataStream(dataStream, {
                experimental_sendStart: false, // omit the start event
              });
              return await result2.text;
            },
          }),
        },
      });

      // forward the initial result to the client without the finish event:
      result1.mergeIntoDataStream(dataStream, {
        experimental_sendFinish: false, // omit the finish event
      });

      // note: you can use any programming construct here, e.g. if-else, loops, etc.
      // workflow programming is normal programming with this approach.

      // example: continue stream with forced tool call from previous step
    },
  });
}

Credit to @nicoalbanese for the code in #6269.

Here's the response.

f:{"messageId":"msg-pfvWDPQ7fJfuDN0ffSlpW8Vw"}
9:{"toolCallId":"call_ssBg7o0aO5gVCC4fHElLlpIj","toolName":"extractGoal","args":{"goal":"The user wants to hear a joke."}}
f:{"messageId":"msg-qcNW6hM1MJFfnFAOmEOfoGa4"}
0:"You"
0:" asked"
0:" for"
0:" a"
0:" joke"
0:"."
0:" Here's"
0:" one"
0:" for"
0:" you"
0:":\n\n"
0:"Why"
0:" don't"
0:" scientists"
0:" trust"
0:" atoms"
0:"?"
0:"  \n"
0:"Because"
0:" they"
0:" make"
0:" up"
0:" everything"
0:"!"
e:{"finishReason":"stop","usage":{"promptTokens":33,"completionTokens":25},"isContinued":false}
d:{"finishReason":"stop","usage":{"promptTokens":33,"completionTokens":25}}
a:{"toolCallId":"call_ssBg7o0aO5gVCC4fHElLlpIj","result":"You asked for a joke. Here's one for you:\n\nWhy don't scientists trust atoms?  \nBecause they make up everything!"}
e:{"finishReason":"tool-calls","usage":{"promptTokens":47,"completionTokens":22},"isContinued":false}
f:{"messageId":"msg-elV1eu2amCJJyoLlO46wiDD7"}
9:{"toolCallId":"call_YFR1m7xh4lmOEZWrt0BIWF7H","toolName":"extractGoal","args":{"goal":"The user wants to hear a joke."}}
f:{"messageId":"msg-irXUUBiXPCdmQKByZGx02U8C"}
0:"You"
0:" want"
0:" to"
0:" hear"
0:" a"
0:" joke"
0:"."
0:" Here's"
0:" one"
0:" for"
0:" you"
0:":\n\n"
0:"Why"
0:" don't"
0:" scientists"
0:" trust"
0:" atoms"
0:"?\n\n"
0:"Because"
0:" they"
0:" make"
0:" up"
0:" everything"
0:"!"
e:{"finishReason":"stop","usage":{"promptTokens":33,"completionTokens":25},"isContinued":false}
d:{"finishReason":"stop","usage":{"promptTokens":33,"completionTokens":25}}
a:{"toolCallId":"call_YFR1m7xh4lmOEZWrt0BIWF7H","result":"You want to hear a joke. Here's one for you:\n\nWhy don't scientists trust atoms?\n\nBecause they make up everything!"}
e:{"finishReason":"tool-calls","usage":{"promptTokens":104,"completionTokens":22},"isContinued":false}
f:{"messageId":"msg-0YgrMV9O4fvIYZqkmYY6RuCt"}
9:{"toolCallId":"call_S0IetHPhVpy0dMFFyg9d7Fut","toolName":"extractGoal","args":{"goal":"The user is requesting a joke."}}
f:{"messageId":"msg-3l9qWNRyuY6CNVridyFg7oDd"}
0:"You"
0:" asked"
0:" for"
0:" a"
0:" joke"
0:"."
0:" Here"
0:"’s"
0:" one"
0:" for"
0:" you"
0:":\n\n"
0:"Why"
0:" don"
0:"’t"
0:" scientists"
0:" trust"
0:" atoms"
0:"?"
0:"  \n"
0:"Because"
0:" they"
0:" make"
0:" up"
0:" everything"
0:"!"
e:{"finishReason":"stop","usage":{"promptTokens":33,"completionTokens":27},"isContinued":false}
d:{"finishReason":"stop","usage":{"promptTokens":33,"completionTokens":27}}
a:{"toolCallId":"call_S0IetHPhVpy0dMFFyg9d7Fut","result":"You asked for a joke. Here’s one for you:\n\nWhy don’t scientists trust atoms?  \nBecause they make up everything!"}
e:{"finishReason":"tool-calls","usage":{"promptTokens":161,"completionTokens":21},"isContinued":false}
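To check how often the tool fired without reading the whole trace by eye, the response can be scanned for tool-call parts. The sketch below assumes only what is visible in the response above: each line has the form `CODE:JSON`, and lines with code `9` carry a `toolName`. The helper name `toolCallNames` and the abbreviated sample (with placeholder IDs) are made up for illustration.

```typescript
// Extracts the tool names from all tool-call parts (code "9") of a raw stream.
function toolCallNames(raw: string): string[] {
  const names: string[] = [];
  for (const line of raw.split('\n')) {
    const sep = line.indexOf(':');
    if (sep <= 0) continue; // skip blank or malformed lines
    if (line.slice(0, sep) !== '9') continue; // only tool-call parts
    const payload = JSON.parse(line.slice(sep + 1)) as { toolName?: string };
    if (payload.toolName) names.push(payload.toolName);
  }
  return names;
}

// Abbreviated sample in the same shape as the response (IDs are placeholders).
const sample = [
  'f:{"messageId":"msg-1"}',
  '9:{"toolCallId":"call_1","toolName":"extractGoal","args":{"goal":"joke"}}',
  'a:{"toolCallId":"call_1","result":"..."}',
  'e:{"finishReason":"tool-calls","isContinued":false}',
  'f:{"messageId":"msg-2"}',
  '9:{"toolCallId":"call_2","toolName":"extractGoal","args":{"goal":"joke"}}',
  'a:{"toolCallId":"call_2","result":"..."}',
  'e:{"finishReason":"tool-calls","isContinued":false}',
  'f:{"messageId":"msg-3"}',
  '9:{"toolCallId":"call_3","toolName":"extractGoal","args":{"goal":"joke"}}',
  'a:{"toolCallId":"call_3","result":"..."}',
  'e:{"finishReason":"tool-calls","isContinued":false}',
].join('\n');

const names = toolCallNames(sample);
console.log(names.length); // 3
```

Run against the full response, this kind of scan should report three `extractGoal` calls, matching the three blocks in the rendering.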

Rendering

**user:** tell me a joke

extractGoal
You asked for a joke. Here's one for you:
Why don't scientists trust atoms?
Because they make up everything!

extractGoal
You want to hear a joke. Here's one for you:
Why don't scientists trust atoms?
Because they make up everything!

extractGoal
You asked for a joke. Here’s one for you:
Why don’t scientists trust atoms?
Because they make up everything!

Notice that the tool is called three times for a single user message, which is not correct. We are building a multi-model swarm and orchestration system (inspired by the OpenAI Agents SDK), and this is a critical feature required to make that system work.
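One possible reading of the trace, offered as a guess rather than a confirmed diagnosis: the outer call combines `toolChoice: 'required'` with `maxSteps: 3`, so every step may be forced to call a tool and the loop only stops when the step budget runs out. Each outer step in the response does end with `finishReason: "tool-calls"`, which fits that reading. The toy model below is not the SDK's implementation; `stepCallsTool` and `countToolCalls` are hypothetical names, and the `'auto'` branch is an assumed behavior included only for contrast.

```typescript
type ToolChoice = 'auto' | 'required';

// Hypothetical stand-in for one model step: true when the step calls a tool.
function stepCallsTool(toolChoice: ToolChoice, hasResult: boolean): boolean {
  if (toolChoice === 'required') return true; // a tool call is forced on every step
  return !hasResult; // assumption: with 'auto', the model answers once a result exists
}

// Counts tool calls made by a simplified multi-step loop.
function countToolCalls(toolChoice: ToolChoice, maxSteps: number): number {
  let calls = 0;
  let hasResult = false;
  for (let step = 0; step < maxSteps; step++) {
    if (!stepCallsTool(toolChoice, hasResult)) break; // a text-only step ends the loop
    calls++;
    hasResult = true;
  }
  return calls;
}

console.log(countToolCalls('required', 3)); // 3
console.log(countToolCalls('required', 1)); // 1
console.log(countToolCalls('auto', 3)); // 1
```

If this reading holds, limiting the outer call to a single step, or not forcing the tool on later steps, would be the things to try first.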

AI SDK Version

"@ai-sdk/react": "^1.1.20"
