docs: unlist lead/worker blog posts instead of deleting, revert casing churn
Restore deleted blog posts and mark them as unlisted with a danger
admonition pointing to Planning Mode. Revert unrelated Goose→goose
casing changes to keep the PR focused on lead/worker removal.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Signed-off-by: Angie Jones <jones.angie@gmail.com>
title: "Treating LLMs Like Tools in a Toolbox: A Multi-Model Approach to Smarter AI Agents"
description: How Goose uses multiple LLMs within a single task, optimizing for speed, cost, and reliability in AI agent workflows
unlisted: true
authors:
- mic
- angie
---

:::danger Outdated
Lead/Worker mode has been removed from goose. It has been replaced by [Planning Mode](/docs/guides/creating-plans), which uses a dedicated planner model with the `/plan` command. See the [multi-model guide](/docs/guides/multi-model/) for current workflows.
:::

Not every task needs a genius. And not every step should cost a fortune.

That's something we've learned while scaling Goose, our open source AI agent. The same model that's great at unpacking a planning request might totally fumble a basic shell command, or worse - it might burn through your token budget doing it.

So we asked ourselves: what if we could mix and match models in a single session?

Not just switching based on user commands, but building Goose with an actual system for routing tasks between different models, each playing to their strengths.

This is the gap the lead/worker model is designed to fill.

<!-- truncate -->

## The Problem with Single-Model Sessions

Originally, every Goose session used a single model from start to finish. That worked fine for short tasks, but longer sessions were harder to tune:

* Go too cheap, and the model might miss nuance or break tools.
* Go too premium, and your cost graph starts looking like a ski slope.

There was no built-in way to adapt on the fly.

We saw this tension in real usage where agents would start strong, then stall out when the model struggled to follow through. Sometimes users would manually switch models mid-session. But that's not scalable, and definitely not agent-like.

## Designing the Lead/Worker System

The core idea is simple:

* Start the session with a lead model that's strong at reasoning and planning.
* After a few back and forths between you and the model (what we call "turns"), hand off to a worker model that's faster and cheaper, but still capable.
* If the worker gets stuck, Goose can detect the failure and temporarily bring the lead back in.

You can configure how many turns the lead handles upfront (`GOOSE_LEAD_TURNS`), how many consecutive failures trigger fallback (`GOOSE_LEAD_FAILURE_THRESHOLD`), and how long the fallback lasts before Goose retries the worker.
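
Pulled together, that configuration looks something like this in a shell profile. This is a sketch with illustrative values: the model names and numbers are assumptions, not recommendations, and the fallback-duration setting is omitted since its variable name isn't shown here.

```shell
# Illustrative lead/worker tuning -- adjust models and numbers to taste.
export GOOSE_LEAD_MODEL="gpt-4o"        # lead: handles the planning-heavy early turns
export GOOSE_MODEL="claude-4-sonnet"    # worker: handles routine execution
export GOOSE_LEAD_TURNS=3               # lead runs the first 3 turns of the session
export GOOSE_LEAD_FAILURE_THRESHOLD=2   # 2 consecutive failures bring the lead back
```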

This gives you a flexible, resilient setup where each model gets used where it shines.

One of the trickiest parts of this feature was defining what failure looks like.

We didn't want Goose to swap models just because an API timed out. Instead, we focused on real task failures:

* Tool execution errors
* Syntax mistakes in generated code
* File not found or permission errors
* User corrections like "that's wrong" or "try again"

Goose tracks these signals and knows when to escalate. And once the fallback model stabilizes things, it switches back without missing a beat.
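
To make the escalation rule concrete, here's a toy sketch of that logic (not Goose's actual implementation): a consecutive-failure counter that escalates to the lead once a threshold is hit, and resets back to the worker on the next success.

```shell
# Toy model of the fallback rule -- a sketch, not Goose's real code.
FAILURE_THRESHOLD=2   # assumed value, standing in for GOOSE_LEAD_FAILURE_THRESHOLD
failures=0
model="worker"

record_result() {
  # $1 is "ok" or "fail" for the worker's latest turn
  if [ "$1" = "fail" ]; then
    failures=$((failures + 1))
    # enough consecutive failures: bring the lead back in
    if [ "$failures" -ge "$FAILURE_THRESHOLD" ]; then
      model="lead"
    fi
  else
    # a success resets the streak and hands control back to the worker
    failures=0
    model="worker"
  fi
}
```

The key design choice mirrored here is that only *consecutive* failures count, so one transient hiccup never triggers a swap.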

## The Value of Multi-Model Design

We've found that this multi-model design unlocks new workflows:

* **Cross-provider setups** (Claude for planning, OpenAI for execution)
* **Lower-friction defaults** for teams worried about LLM spend

It also opens the door for even smarter routing in the future with things like switching based on tasks, ensemble voting, or maybe even letting Goose decide which model to call based on tool context.
## Try It Out

Lead/worker mode is already available in Goose. To enable, export these variables with two models that have already been configured in Goose:

```bash
export GOOSE_LEAD_MODEL="gpt-4o"
export GOOSE_MODEL="claude-4-sonnet"
```

From there, Goose takes care of the handoff, the fallback, and the recovery. You just... keep vibing.

If you're curious how it all works under the hood, we've got a [full tutorial](/docs/tutorials/lead-worker).

---

If you're experimenting with multi-model setups, [share what's working and what
<meta property="og:title" content="Treating LLMs Like Tools in a Toolbox: A Multi-Model Approach to Smarter AI Agents" />
<meta property="og:description" content="How Goose uses multiple LLMs within a single task, optimizing for speed, cost, and reliability in AI agent workflows" />
<meta name="twitter:title" content="Treating LLMs Like Tools in a Toolbox: A Multi-Model Approach to Smarter AI Agents" />
<meta name="twitter:description" content="How Goose uses multiple LLMs within a single task, optimizing for speed, cost, and reliability in AI agent workflows" />
description: Dive into Goose's Lead/Worker model where one LLM plans while another executes - a game-changing approach to AI collaboration that can save costs and boost efficiency.
unlisted: true
authors:
- ebony
---

:::danger Outdated
Lead/Worker mode has been removed from goose. It has been replaced by [Planning Mode](/docs/guides/creating-plans), which uses a dedicated planner model with the `/plan` command. See the [multi-model guide](/docs/guides/multi-model/) for current workflows.
:::

Ever wondered what happens when you let two AI models work together like a tag team? That’s exactly what we tested in our latest livestream—putting Goose’s Lead/Worker model to work on a real project. Spoiler: it’s actually pretty great.

The Lead/Worker model is one of those features that sounds simple on paper but delivers some amazing benefits in practice. Think of it like having a project manager and a developer working in perfect harmony - one does the strategic thinking, the other gets their hands dirty with the actual implementation.

<!-- truncate -->

## What's This Lead/Worker Thing All About?

Instead of asking one LLM to do everything, Lead/Worker lets you split the load. Your lead model takes care of the thinking, decision-making, and big-picture planning, while your worker model focuses on execution: writing code, running commands, and making the plan happen. The magic is in the balance: you can put a more powerful (and sometimes more expensive) model in the lead and let a faster, more cost-effective one handle the heavy lifting.

Popular model pairings people are loving:

- GPT-4 + Claude Sonnet – Balanced intelligence and efficiency.
- Claude Opus + GPT-3.5 – Creative planning with quick execution.
- GPT-4o + Local models – Privacy-focused builds where data stays in-house.

## Why You'll Love This Setup

- 💰 **Cost Optimization**
  Use cheaper models for execution while keeping the premium models for strategic planning. Your wallet will thank you.

- ⚡ **Speed Boost**
  Get solid plans from capable models, then let optimized execution models fly through the implementation.

- 🔄 **Mix and Match Providers**
  This is where it gets really cool - you can use Claude for reasoning and OpenAI for execution, or any combination that works for your workflow.

- 🏃‍♂️ **Handle Long Dev Sessions**
  Perfect for those marathon coding sessions where you need sustained performance without breaking the bank.

## [Setting It Up](/docs/tutorials/lead-worker#configuration)

Getting started with the Lead/Worker model is surprisingly straightforward. In the Goose desktop app, you just need to:

1. **Enable the feature** - Look for the enable button in your settings
2. **Choose your lead model** - Pick something powerful for planning (like GPT-4)
3. **Select your worker model** - Go with something efficient for execution (like Claude Sonnet)
4. **Configure the behavior** - Set how many turns the worker gets before consulting the lead

The default settings work great for most people, but you can customize things like:

- **Number of turns**: How many attempts the worker model gets before pulling in the lead
- **Failure handling**: What happens when things don't go as planned
- **Fallback behavior**: How the system recovers from issues
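
If you'd rather work from the CLI than the desktop app, the same choices map onto environment variables. This is a sketch: the model names are placeholders and the turn count is an illustrative value.

```shell
# CLI-side sketch of the desktop setup above; values are placeholders.
export GOOSE_LEAD_MODEL="gpt-4o"        # lead model for planning
export GOOSE_MODEL="claude-4-sonnet"    # worker model for execution
export GOOSE_LEAD_TURNS=3               # lead turns before the hand-off
```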

## Real-World Magic in Action

During our [livestream](https://www.youtube.com/embed/IbBDBv9Chvg), we tackled a real project: adding install buttons to the MCP servers documentation page. What made this interesting wasn't just the end result, but watching how the two models collaborated.

The lead model would analyze the requirements, understand the existing codebase structure, and create a plan. Then the worker model would jump in and start implementing, making the actual code changes and handling the technical details.

### The Project: Documentation Enhancement

We wanted to add install buttons to our MCP server cards, similar to what we already had on our extensions page. We needed to figure out how to add this functionality without breaking existing workflows.

Here's what the Lead/Worker model helped us accomplish:

- **Analyzed the existing documentation structure**
- **Identified the best approach** (creating a custom page vs. modifying existing ones)
- **Implemented the solution** with proper routing and styling
- **Handled edge cases** like maintaining tutorial links while adding install functionality

## The Developer Experience

One thing that really stood out was how natural the interaction felt. You're not constantly switching contexts or managing different tools. You just describe what you want, and the system figures out the best way to divide the work.

The lead model acts as your strategic partner, while the worker model becomes your implementation buddy. It's like pair programming, but with AI models that never get tired or need coffee breaks.

## Pro Tips from Our Session

### Start with Good Goose Hints
We always recommend setting up your [goosehints](/docs/guides/context-engineering/using-goosehints) to give context about your project. It saves you from re-explaining the same things over and over.

### Don't Micromanage
Let the lead model do its planning thing. Sometimes the best results come from giving high-level direction and letting the system figure out the details.

### Use Git for Safety
Always work in a branch when experimenting. The models are smart, but having that safety net means you can be more adventurous with your requests.

### Visual Feedback Helps
While the desktop UI doesn't show the model switching as clearly as the CLI does, you can still follow along by expanding the tool outputs to see what's happening under the hood.

## The Results Speak for Themselves

By the end of our session, we had:

- ✅ Successfully added install buttons to our MCP server documentation
- ✅ Maintained all existing functionality (tutorial links still worked)
- ✅ Improved the user experience with better visual hierarchy
- ✅ Organized content into logical sections (community vs. built-in servers)

The best part? The models made smart decisions we hadn't even thought of, like automatically categorizing the servers and improving the overall page layout.

## Ready to Try It Yourself?

The [Lead/Worker model](/docs/tutorials/lead-worker) is available now in Goose. Whether you're working on documentation, building features, or tackling complex refactoring, having two specialized models working together can be a game changer.

Want to see it in action? Check out the full stream where we built this feature live:

<iframe class="aspect-ratio" width="560" height="315" src="https://www.youtube.com/embed/IbBDBv9Chvg" title="LLM Tag Team: Who Plans, Who Executes?" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture" allowfullscreen></iframe>

Got questions or want to share your own Lead/Worker success stories? Join us in our [Discord community](https://discord.gg/goose-oss) - we'd love to hear what you're building!

<head>
<meta property="og:title" content="LLM Tag Team: Who Plans, Who Executes?" />
<meta property="og:description" content="Dive into Goose's Lead/Worker model where one LLM plans while another executes - a game-changing approach to AI collaboration that can save costs and boost efficiency." />
<meta name="twitter:title" content="LLM Tag Team: Who Plans, Who Executes?" />
<meta name="twitter:description" content="Dive into Goose's Lead/Worker model where one LLM plans while another executes - a game-changing approach to AI collaboration that can save costs and boost efficiency." />
</head>