Merged
@@ -1,11 +1,16 @@
---
title: "Treating LLMs Like Tools in a Toolbox: A Multi-Model Approach to Smarter AI Agents"
description: How Goose uses multiple LLMs within a single task, optimizing for speed, cost, and reliability in AI agent workflows
unlisted: true
authors:
- mic
- angie
---

:::danger Outdated
Lead/Worker mode has been removed from goose. It has been replaced by [Planning Mode](/docs/guides/creating-plans), which uses a dedicated planner model with the `/plan` command. See the [multi-model guide](/docs/guides/multi-model/) for current workflows.
:::

![blog cover](multi-model-ai-agent.png)


@@ -79,7 +84,7 @@ export GOOSE_MODEL="claude-4-sonnet"

From there, Goose takes care of the handoff, the fallback, and the recovery. You just... keep vibing.

If you're curious how it all works under the hood, we've got a [full tutorial](/docs/tutorials/lead-worker).
If you're curious how it all works under the hood, see the [planning guide](/docs/guides/creating-plans).
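If you want to try the replacement setup right away, here's a minimal sketch — the `GOOSE_PLANNER_*` variable names come from goose's config reference, while the specific model names are just placeholders:

```shell
# Planning Mode sketch: a dedicated planner model for strategy,
# your regular default model for execution.
export GOOSE_PLANNER_MODEL="claude-4-opus"   # placeholder: strong reasoning model, used by /plan
export GOOSE_MODEL="claude-4-sonnet"         # default model that executes the plan

# Then, inside `goose session`, type /plan followed by your task
# to route that turn through the planner model.
echo "planner=$GOOSE_PLANNER_MODEL default=$GOOSE_MODEL"
```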

---

@@ -1,10 +1,15 @@
---
title: "LLM Tag Team: Who Plans, Who Executes?"
description: Dive into Goose's Lead/Worker model where one LLM plans while another executes - a game-changing approach to AI collaboration that can save costs and boost efficiency.
unlisted: true
authors:
- ebony
---

:::danger Outdated
Lead/Worker mode has been removed from goose. It has been replaced by [Planning Mode](/docs/guides/creating-plans), which uses a dedicated planner model with the `/plan` command. See the [multi-model guide](/docs/guides/multi-model/) for current workflows.
:::

![blog cover](header-image.png)

Ever wondered what happens when you let two AI models work together like a tag team? That’s exactly what we tested in our latest livestream—putting Goose’s Lead/Worker model to work on a real project. Spoiler: it’s actually pretty great.
@@ -37,7 +42,7 @@ This is where it gets really cool - you can use Claude for reasoning and OpenAI
- 🏃‍♂️ Handle Long Dev Sessions
Perfect for those marathon coding sessions where you need sustained performance without breaking the bank.

## [Setting It Up](/docs/tutorials/lead-worker#configuration)
## [Setting It Up](/docs/guides/creating-plans)

Getting started with the Lead/Worker model is surprisingly straightforward. In the Goose desktop app, you just need to:

@@ -97,9 +102,9 @@ By the end of our session, we had:

The best part? The models made smart decisions we hadn't even thought of, like automatically categorizing the servers and improving the overall page layout.

## Ready to Try It Yourself?
## Ready to Try Multi-Model Workflows?

The [Lead/Worker model](/docs/tutorials/lead-worker) is available now in Goose. Whether you're working on documentation, building features, or tackling complex refactoring, having two specialized models working together can be a game changer.
Lead/Worker mode has been removed, but goose now supports [Planning Mode](/docs/guides/creating-plans) for multi-model workflows. Whether you're working on documentation, building features, or tackling complex refactoring, pairing a strong planner model with a fast execution model can be a game changer.

Want to see it in action? Check out the full stream where we built this feature live:

@@ -87,9 +87,9 @@ The [Memory extension](https://goose-docs.ai/docs/mcp/memory-mcp) stores importa

Keep individual sessions focused on specific tasks. When you complete a task or reach a natural stopping point, start a new session. This prevents context window bloat from accumulated conversation history and ensures your tokens are spent on current, relevant work.

**7. Lead/worker model**
**7. Planner model + focused execution**

The [Lead/Worker model](https://goose-docs.ai/docs/tutorials/lead-worker) splits work between two models. The lead model handles high-level planning and decision-making, while the worker model executes the detailed implementation. This optimizes costs by using expensive models for strategic thinking and cheaper models for routine execution tasks.
Use a dedicated [planner model](/docs/guides/creating-plans) for complex reasoning and keep your default model focused on execution. This gives you control over cost and quality while keeping model behavior explicit and predictable.
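As a rough sketch of that split (the variable names come from the goose config reference; the specific models are illustrative):

```shell
# Cheap, fast model for everyday execution...
export GOOSE_PROVIDER="openai"
export GOOSE_MODEL="gpt-4o-mini"             # illustrative choice

# ...and a reasoning-heavy planner, invoked only when you run /plan.
export GOOSE_PLANNER_PROVIDER="anthropic"    # illustrative choice
export GOOSE_PLANNER_MODEL="claude-4-opus"   # illustrative choice
```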

---

@@ -55,7 +55,7 @@ Browse and select from a wide range of options, including:
- **Specialized models** optimized for different use cases

:::tip Protip
Want the best of both worlds? Use goose’s [Lead/Worker configuration](/docs/tutorials/lead-worker) to combine a powerful frontier model with a faster open-weight model. Let your Lead handle the high-level thinking while Workers take care of the repetitive tasks—saving you both time and credits.
Want the best of both worlds? Use a dedicated [planner model](/docs/guides/creating-plans) for complex strategy and a faster default model for execution. Pair this with `/plan` to get strong reasoning only when you need it—saving both time and credits.
:::

---
4 changes: 2 additions & 2 deletions documentation/docs/getting-started/providers.md

@@ -1208,8 +1208,8 @@ This method simplifies authentication and enhances security for enterprise envir

Beyond single-model setups, goose supports [multi-model configurations](/docs/guides/multi-model/) that can use different models and providers for specialized tasks:

- **Lead/Worker Model** - Automatic switching between a lead model for initial turns and a worker model for execution tasks
- **Planning Mode** - Manual planning phase using a dedicated model to create detailed project breakdowns before execution
- **Planning Mode** - Use a dedicated planner model to create detailed project breakdowns before execution
- **Subagents** - Delegate scoped tasks to isolated sessions to keep your primary workflow focused and efficient

## Gemini 3 Thinking Levels

16 changes: 8 additions & 8 deletions documentation/docs/guides/cli-providers.md

@@ -37,7 +37,7 @@ CLI providers are useful if you:
#### Workflow Integration
- **Recipe compatibility**: Use CLI providers in automated goose recipes
- **Scheduling support**: Include in scheduled tasks and workflows
- **Hybrid configurations**: Combine with LLM providers using lead/worker patterns
- **Hybrid configurations**: Combine with planning mode and model-specific workflows

#### Interface Consistency
- **Unified commands**: Use the same `goose session` interface across all providers
@@ -260,16 +260,16 @@ Once configured, you can start a goose session using these providers just like a
goose session
```

### Combining with Other Models
### Combining with Planner Models

CLI providers work well in combination with other models using goose's [lead/worker pattern](/docs/tutorials/lead-worker):
CLI providers also work well with planning mode when you want one model for strategy and another for execution:

```bash
# Use Claude Code as lead model, GPT-4o as worker
export GOOSE_LEAD_PROVIDER=claude-code
export GOOSE_PROVIDER=openai
export GOOSE_MODEL=gpt-4o
export GOOSE_LEAD_MODEL=default
# Use Claude Code for execution, OpenAI for planning
export GOOSE_PROVIDER=claude-code
export GOOSE_MODEL=default
export GOOSE_PLANNER_PROVIDER=openai
export GOOSE_PLANNER_MODEL=gpt-4o

goose session
```
2 changes: 0 additions & 2 deletions documentation/docs/guides/config-files.md

@@ -37,8 +37,6 @@ The following settings can be configured at the root level of your config.yaml f
| `GOOSE_MAX_TOKENS` | Maximum number of tokens for each model response (truncates longer responses) | Positive integer | Model-specific | No |
| `GOOSE_MODE` | [Tool execution behavior](/docs/guides/goose-permissions) | "auto", "approve", "chat", "smart_approve" | "auto" | No |
| `GOOSE_MAX_TURNS` | [Maximum number of turns](/docs/guides/sessions/smart-context-management#maximum-turns) allowed without user input | Integer (e.g., 10, 50, 100) | 1000 | No |
| `GOOSE_LEAD_PROVIDER` | Provider for lead model in [lead/worker mode](/docs/guides/environment-variables#leadworker-model-configuration) | Same as `GOOSE_PROVIDER` options | Falls back to `GOOSE_PROVIDER` | No |
| `GOOSE_LEAD_MODEL` | Lead model for lead/worker mode | Model name | None | No |
| `GOOSE_PLANNER_PROVIDER` | Provider for [planning mode](/docs/guides/creating-plans) | Same as `GOOSE_PROVIDER` options | Falls back to `GOOSE_PROVIDER` | No |
| `GOOSE_PLANNER_MODEL` | Model for planning mode | Model name | Falls back to `GOOSE_MODEL` | No |
| `GOOSE_TOOLSHIM` | Enable tool interpretation | true/false | false | No |
2 changes: 1 addition & 1 deletion documentation/docs/guides/creating-plans.md

@@ -35,7 +35,7 @@ The goose CLI plan mode uses two configuration values:
- `GOOSE_PLANNER_MODEL`: Which model to use for planning

:::tip Multi-Model Alternative to Plan Mode
goose also supports automatic model switching with [Lead/Worker mode](/docs/guides/environment-variables#leadworker-model-configuration), which provides turn-based switching between two models to help balance model capabilities with cost and speed.
You can combine planning mode with a different default execution model to balance cost, speed, and quality.
:::
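For example, one way to point plan mode at a stronger model than your default (the fallback behavior is documented in the config reference; the model names here are placeholders):

```shell
# Plan with a stronger model; execute with your default.
export GOOSE_PLANNER_PROVIDER="anthropic"     # if unset, falls back to GOOSE_PROVIDER
export GOOSE_PLANNER_MODEL="claude-4-opus"    # if unset, falls back to GOOSE_MODEL

# Start a session and type /plan to use the planner model:
# goose session
echo "planner: ${GOOSE_PLANNER_PROVIDER}/${GOOSE_PLANNER_MODEL}"
```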

:::tip Customize Plan Format
40 changes: 0 additions & 40 deletions documentation/docs/guides/environment-variables.md

@@ -130,40 +130,6 @@ export GOOSE_PREDEFINED_MODELS='[

Custom context limits and request parameters are applied when the model is used. Custom context limits are displayed in goose CLI's [token usage indicator](/docs/guides/sessions/smart-context-management#token-usage).

### Lead/Worker Model Configuration

These variables configure a [lead/worker model pattern](/docs/tutorials/lead-worker) where a powerful lead model handles initial planning and complex reasoning, then switches to a faster/cheaper worker model for execution. The switch happens automatically based on your settings.

| Variable | Purpose | Values | Default |
|----------|---------|---------|---------|
| `GOOSE_LEAD_MODEL` | **Required to enable lead mode.** Name of the lead model | Model name (e.g., "gpt-4o", "claude-sonnet-4-20250514") | None |
| `GOOSE_LEAD_PROVIDER` | Provider for the lead model | [See available providers](/docs/getting-started/providers#available-providers) | Falls back to `GOOSE_PROVIDER` |
| `GOOSE_LEAD_TURNS` | Number of initial turns using the lead model before switching to the worker model | Integer | 3 |
| `GOOSE_LEAD_FAILURE_THRESHOLD` | Consecutive failures before falling back to the lead model | Integer | 2 |
| `GOOSE_LEAD_FALLBACK_TURNS` | Number of turns to use the lead model in fallback mode | Integer | 2 |

A _turn_ is one complete prompt-response interaction. Here's how it works with the default settings:
- Use the lead model for the first 3 turns
- Use the worker model starting on the 4th turn
- Fallback to the lead model if the worker model struggles for 2 consecutive turns
- Use the lead model for 2 turns and then switch back to the worker model

The lead model and worker model names are displayed at the start of the goose CLI session. If you don't export a `GOOSE_MODEL` for your session, the worker model defaults to the `GOOSE_MODEL` in your [configuration file](/docs/guides/config-files).

**Examples**

```bash
# Basic lead/worker setup
export GOOSE_LEAD_MODEL="o4"

# Advanced lead/worker configuration
export GOOSE_LEAD_MODEL="claude4-opus"
export GOOSE_LEAD_PROVIDER="anthropic"
export GOOSE_LEAD_TURNS=5
export GOOSE_LEAD_FAILURE_THRESHOLD=3
export GOOSE_LEAD_FALLBACK_TURNS=2
```

### Claude Thinking Configuration

These variables control Claude's reasoning behavior. Supported on Anthropic and Databricks providers.
@@ -350,8 +316,6 @@ These variables allow you to override the default context window size (token lim
|----------|---------|---------|---------|
| `GOOSE_CONTEXT_LIMIT` | Override context limit for the main model | Integer (number of tokens) | Model-specific default or 128,000 |
| `GOOSE_INPUT_LIMIT` | Override input prompt limit for ollama requests (maps to `num_ctx`) | Integer (number of tokens) | Falls back to `GOOSE_CONTEXT_LIMIT` or model default |
| `GOOSE_LEAD_CONTEXT_LIMIT` | Override context limit for the lead model in [lead/worker mode](/docs/tutorials/lead-worker) | Integer (number of tokens) | Falls back to `GOOSE_CONTEXT_LIMIT` or model default |
| `GOOSE_WORKER_CONTEXT_LIMIT` | Override context limit for the worker model in lead/worker mode | Integer (number of tokens) | Falls back to `GOOSE_CONTEXT_LIMIT` or model default |
| `GOOSE_PLANNER_CONTEXT_LIMIT` | Override context limit for the [planner model](/docs/guides/creating-plans) | Integer (number of tokens) | Falls back to `GOOSE_CONTEXT_LIMIT` or model default |

**Examples**
@@ -362,10 +326,6 @@ export GOOSE_CONTEXT_LIMIT=200000
# Override ollama input prompt limit
export GOOSE_INPUT_LIMIT=32000

# Set different context limits for lead/worker models
export GOOSE_LEAD_CONTEXT_LIMIT=500000 # Large context for planning
export GOOSE_WORKER_CONTEXT_LIMIT=128000 # Smaller context for execution

# Set context limit for planner
export GOOSE_PLANNER_CONTEXT_LIMIT=1000000
```
21 changes: 7 additions & 14 deletions documentation/docs/guides/multi-model/index.mdx

@@ -19,9 +19,9 @@ import VideoCarousel from '@site/src/components/VideoCarousel';
<h2 className={styles.categoryTitle}>📚 Documentation & Guides</h2>
<div className={styles.cardGrid}>
<Card
title="Lead/Worker Multi-Model Setup"
description="Automatic switching between models using a lead model for initial turns and a worker model for execution."
link="/docs/tutorials/lead-worker"
title="Planner + Execution Model Setup"
description="Use a dedicated planner model for strategic reasoning and a separate default model for execution."
link="/docs/guides/creating-plans"
/>
<Card
title="Creating Plans Before Working"
@@ -45,9 +45,9 @@ import VideoCarousel from '@site/src/components/VideoCarousel';
link="/blog/2025/06/16/multi-model-in-goose"
/>
<Card
title="LLM Tag Team: Who Plans, Who Executes?"
description="Learn how lead/worker model configuration creates an effective AI tag team, with one model for planning and another for execution."
link="/blog/2025/08/11/llm-tag-team-lead-worker-model"
title="The AI Skeptic's Guide to Context Windows"
description="Learn practical ways to manage context windows and token usage in long-running sessions."
link="/blog/2025/08/18/understanding-context-windows"
/>
</div>
</div>
@@ -62,15 +62,8 @@ import VideoCarousel from '@site/src/components/VideoCarousel';
type: 'iframe',
src: 'https://www.youtube.com/embed/ZyhUTsChFUw',
title: 'goose\'s Multi-Model Setup',
description: 'Learn about lead/worker mode, from configuration to best practices',
description: 'Learn practical multi-model workflows in goose',
duration: '5:01'
},
{
type: 'iframe',
src: 'https://www.youtube.com/embed/SJ6EZpyCKrk',
title: 'Livestream - LLM Tag Team: Who Plans, Who Executes?',
description: 'Using lead/worker mode to add features to the goose docs in real time',
duration: '9:36'
}
]}
/>
13 changes: 5 additions & 8 deletions documentation/docs/guides/sessions/smart-context-management.md

@@ -293,8 +293,6 @@ Context limits are automatically detected based on your model name, but goose pr
| Model | Description | Best For | Setting |
|-------|-------------|----------|---------|
| **Main** | Set context limit for the main model (also serves as fallback for other models) | LiteLLM proxies, custom models with non-standard names | `GOOSE_CONTEXT_LIMIT` |
| **Lead** | Set larger context for planning in [lead/worker mode](/docs/tutorials/lead-worker) | Complex planning tasks requiring more context | `GOOSE_LEAD_CONTEXT_LIMIT` |
| **Worker** | Set smaller context for execution in lead/worker mode | Cost optimization during execution phase | `GOOSE_WORKER_CONTEXT_LIMIT` |
| **Planner** | Set context for [planner models](/docs/guides/creating-plans) | Large planning tasks requiring extensive context | `GOOSE_PLANNER_CONTEXT_LIMIT` |

:::info
@@ -311,7 +309,7 @@ This feature is particularly useful with:
goose resolves context limits with the following precedence (highest to lowest):

1. Explicit context_limit in model configuration (if set programmatically)
2. Specific environment variable (e.g., `GOOSE_LEAD_CONTEXT_LIMIT`)
2. Specific environment variable (e.g., `GOOSE_PLANNER_CONTEXT_LIMIT`)
3. Global environment variable (`GOOSE_CONTEXT_LIMIT`)
4. Model-specific default based on name pattern matching
5. Global default (128,000 tokens)
Expand Down Expand Up @@ -348,13 +346,12 @@ export GOOSE_MODEL="my-custom-gpt4-proxy"
export GOOSE_CONTEXT_LIMIT=200000 # Override the 32k default
```

2. Lead/worker setup with different context limits
2. Planner setup with a different context limit

```bash
# Different context limits for planning vs execution
export GOOSE_LEAD_MODEL="claude-opus-custom"
export GOOSE_LEAD_CONTEXT_LIMIT=500000 # Large context for planning
export GOOSE_WORKER_CONTEXT_LIMIT=128000 # Smaller context for execution
# Set a larger context window for planning
export GOOSE_PLANNER_MODEL="claude-opus-custom"
export GOOSE_PLANNER_CONTEXT_LIMIT=500000
```

3. Planner with large context
4 changes: 2 additions & 2 deletions documentation/docs/guides/tips.md

@@ -57,8 +57,8 @@ goose Desktop lets you [customize the sidebar](/docs/guides/desktop-navigation)
### Keep goose updated
Regularly [update](/docs/guides/updating-goose) goose to benefit from the latest features, bug fixes, and performance improvements.

### Pair Two Models to Save Money
Use [lead/worker model](/docs/tutorials/lead-worker/) to have goose use a "lead" model for early planning before handing the task to a lower-cost "worker" model for execution.
### Use a Dedicated Planner Model
Use [planning mode](/docs/guides/creating-plans) with a dedicated planner model for complex reasoning, while keeping a faster default model for everyday execution.

### Make Recipes Safe to Re-run
Write [recipes](/docs/guides/recipes/session-recipes) that check your current state before acting, so they can be run multiple times without causing any errors or duplication.
17 changes: 0 additions & 17 deletions documentation/docs/troubleshooting/known-issues.md

@@ -114,23 +114,6 @@ For detailed steps on updating your LLM provider, refer to the [Installation][in

If you encounter errors when configuring GitHub Copilot as your provider, try these workarounds for common scenarios.

#### OAuth Error with Lead/Worker Models

If the [lead/worker model](/docs/tutorials/lead-worker) feature is configured in your environment, you might see the following error during GitHub Copilot setup. This feature conflicts with the OAuth flow to connect to the provider.
```
Failed to authenticate: Execution error: OAuth configuration not supported by this provider
```

To resolve:
1. Temporarily comment out or remove lead/worker model variables from the main config file (`~/.config/goose/config.yaml`):
```yaml
# GOOSE_LEAD_MODEL: your-model
# GOOSE_WORKER_MODEL: your-model
```
2. Run `goose configure` again to set up GitHub Copilot
3. Complete the OAuth authentication flow
4. Re-enable your lead/worker model settings as needed

#### Container and Keyring Issues

goose tries to use the system keyring (typically via Secret Service over DBus) to securely store your GitHub Copilot token. In containerized or headless environments, DBus and/or a desktop keyring service may not be available (and some setups fail with X11-based DBus autolaunch errors), so keyring access can fail.
6 changes: 3 additions & 3 deletions documentation/docs/tutorials/headless-goose.md

@@ -302,9 +302,9 @@ export GOOSE_CLI_MIN_PRIORITY=0.2 # Reduce verbose output
### Advanced Configuration

```bash
# For complex workflows requiring different models
export GOOSE_LEAD_MODEL=gpt-4o # For planning
export GOOSE_WORKER_MODEL=gpt-4o-mini # For execution
# For complex workflows requiring dedicated planning
export GOOSE_PLANNER_PROVIDER=openai
export GOOSE_PLANNER_MODEL=gpt-4o

# Security and permissions
export GOOSE_ALLOWLIST=https://company.com/allowed-extensions.json