Merged
@@ -1,11 +1,16 @@
---
title: "Treating LLMs Like Tools in a Toolbox: A Multi-Model Approach to Smarter AI Agents"
description: How Goose uses multiple LLMs within a single task, optimizing for speed, cost, and reliability in AI agent workflows
unlisted: true
authors:
- mic
- angie
---

:::danger Outdated
Lead/Worker mode has been removed from goose. It has been replaced by [Planning Mode](/docs/guides/creating-plans), which uses a dedicated planner model with the `/plan` command. See the [multi-model guide](/docs/guides/multi-model/) for current workflows.
:::

![blog cover](multi-model-ai-agent.png)


@@ -79,7 +84,7 @@ export GOOSE_MODEL="claude-4-sonnet"

From there, Goose takes care of the handoff, the fallback, and the recovery. You just... keep vibing.

If you're curious how it all works under the hood, we've got a [full tutorial](/docs/tutorials/lead-worker).
If you're curious how it all works under the hood, see the [planning guide](/docs/guides/creating-plans).
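If you want to try the replacement setup right away, here's a minimal sketch — the `GOOSE_PLANNER_*` variable names come from goose's config reference, while the specific model names are just placeholders:

```shell
# Planning Mode sketch: a dedicated planner model for strategy,
# your regular default model for execution.
export GOOSE_PLANNER_MODEL="claude-4-opus"   # placeholder: strong reasoning model, used by /plan
export GOOSE_MODEL="claude-4-sonnet"         # default model that executes the plan

# Then, inside `goose session`, type /plan followed by your task
# to route that turn through the planner model.
echo "planner=$GOOSE_PLANNER_MODEL default=$GOOSE_MODEL"
```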

---

@@ -1,10 +1,15 @@
---
title: "LLM Tag Team: Who Plans, Who Executes?"
description: Dive into Goose's Lead/Worker model where one LLM plans while another executes - a game-changing approach to AI collaboration that can save costs and boost efficiency.
unlisted: true
authors:
- ebony
---

:::danger Outdated
Lead/Worker mode has been removed from goose. It has been replaced by [Planning Mode](/docs/guides/creating-plans), which uses a dedicated planner model with the `/plan` command. See the [multi-model guide](/docs/guides/multi-model/) for current workflows.
:::

![blog cover](header-image.png)

Ever wondered what happens when you let two AI models work together like a tag team? That’s exactly what we tested in our latest livestream—putting Goose’s Lead/Worker model to work on a real project. Spoiler: it’s actually pretty great.
@@ -37,7 +42,7 @@ This is where it gets really cool - you can use Claude for reasoning and OpenAI
- 🏃‍♂️ Handle Long Dev Sessions
Perfect for those marathon coding sessions where you need sustained performance without breaking the bank.

## [Setting It Up](/docs/tutorials/lead-worker#configuration)
## [Setting It Up](/docs/guides/creating-plans)

Getting started with the Lead/Worker model is surprisingly straightforward. In the Goose desktop app, you just need to:

@@ -97,9 +102,9 @@ By the end of our session, we had:

The best part? The models made smart decisions we hadn't even thought of, like automatically categorizing the servers and improving the overall page layout.

## Ready to Try It Yourself?
## Ready to Try Multi-Model Workflows?

The [Lead/Worker model](/docs/tutorials/lead-worker) is available now in Goose. Whether you're working on documentation, building features, or tackling complex refactoring, having two specialized models working together can be a game changer.
Lead/Worker mode has been removed, but goose now supports [Planning Mode](/docs/guides/creating-plans) for multi-model workflows. Whether you're working on documentation, building features, or tackling complex refactoring, pairing a strong planner model with a fast execution model can be a game changer.

Want to see it in action? Check out the full stream where we built this feature live:

@@ -87,9 +87,9 @@ The [Memory extension](https://goose-docs.ai/docs/mcp/memory-mcp) stores importa

Keep individual sessions focused on specific tasks. When you complete a task or reach a natural stopping point, start a new session. This prevents context window bloat from accumulated conversation history and ensures your tokens are spent on current, relevant work.

**7. Lead/worker model**
**7. Planner model + focused execution**

The [Lead/Worker model](https://goose-docs.ai/docs/tutorials/lead-worker) splits work between two models. The lead model handles high-level planning and decision-making, while the worker model executes the detailed implementation. This optimizes costs by using expensive models for strategic thinking and cheaper models for routine execution tasks.
Use a dedicated [planner model](/docs/guides/creating-plans) for complex reasoning and keep your default model focused on execution. This gives you control over cost and quality while keeping model behavior explicit and predictable.
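As a rough sketch of that split (the variable names come from the goose config reference; the specific models are illustrative):

```shell
# Cheap, fast model for everyday execution...
export GOOSE_PROVIDER="openai"
export GOOSE_MODEL="gpt-4o-mini"             # illustrative choice

# ...and a reasoning-heavy planner, invoked only when you run /plan.
export GOOSE_PLANNER_PROVIDER="anthropic"    # illustrative choice
export GOOSE_PLANNER_MODEL="claude-4-opus"   # illustrative choice
```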

---

@@ -55,7 +55,7 @@ Browse and select from a wide range of options, including:
- **Specialized models** optimized for different use cases

:::tip Protip
Want the best of both worlds? Use goose’s [Lead/Worker configuration](/docs/tutorials/lead-worker) to combine a powerful frontier model with a faster open-weight model. Let your Lead handle the high-level thinking while Workers take care of the repetitive tasks—saving you both time and credits.
Want the best of both worlds? Use a dedicated [planner model](/docs/guides/creating-plans) for complex strategy and a faster default model for execution. Pair this with `/plan` to get strong reasoning only when you need it—saving both time and credits.
:::

---
4 changes: 2 additions & 2 deletions documentation/docs/getting-started/providers.md

@@ -1208,8 +1208,8 @@ This method simplifies authentication and enhances security for enterprise envir

Beyond single-model setups, goose supports [multi-model configurations](/docs/guides/multi-model/) that can use different models and providers for specialized tasks:

- **Lead/Worker Model** - Automatic switching between a lead model for initial turns and a worker model for execution tasks
- **Planning Mode** - Manual planning phase using a dedicated model to create detailed project breakdowns before execution
- **Planning Mode** - Use a dedicated planner model to create detailed project breakdowns before execution
- **Subagents** - Delegate scoped tasks to isolated sessions to keep your primary workflow focused and efficient

## Gemini 3 Thinking Levels

16 changes: 8 additions & 8 deletions documentation/docs/guides/cli-providers.md

@@ -37,7 +37,7 @@ CLI providers are useful if you:
#### Workflow Integration
- **Recipe compatibility**: Use CLI providers in automated goose recipes
- **Scheduling support**: Include in scheduled tasks and workflows
- **Hybrid configurations**: Combine with LLM providers using lead/worker patterns
- **Hybrid configurations**: Combine with planning mode and model-specific workflows

#### Interface Consistency
- **Unified commands**: Use the same `goose session` interface across all providers
@@ -260,16 +260,16 @@ Once configured, you can start a goose session using these providers just like a
goose session
```

### Combining with Other Models
### Combining with Planner Models

CLI providers work well in combination with other models using goose's [lead/worker pattern](/docs/tutorials/lead-worker):
CLI providers also work well with planning mode when you want one model for strategy and another for execution:

```bash
# Use Claude Code as lead model, GPT-4o as worker
export GOOSE_LEAD_PROVIDER=claude-code
export GOOSE_PROVIDER=openai
export GOOSE_MODEL=gpt-4o
export GOOSE_LEAD_MODEL=default
# Use Claude Code for execution, OpenAI for planning
export GOOSE_PROVIDER=claude-code
export GOOSE_MODEL=default
export GOOSE_PLANNER_PROVIDER=openai
export GOOSE_PLANNER_MODEL=gpt-4o

goose session
```
2 changes: 0 additions & 2 deletions documentation/docs/guides/config-files.md

@@ -37,8 +37,6 @@ The following settings can be configured at the root level of your config.yaml f
| `GOOSE_MAX_TOKENS` | Maximum number of tokens for each model response (truncates longer responses) | Positive integer | Model-specific | No |
| `GOOSE_MODE` | [Tool execution behavior](/docs/guides/goose-permissions) | "auto", "approve", "chat", "smart_approve" | "auto" | No |
| `GOOSE_MAX_TURNS` | [Maximum number of turns](/docs/guides/sessions/smart-context-management#maximum-turns) allowed without user input | Integer (e.g., 10, 50, 100) | 1000 | No |
| `GOOSE_LEAD_PROVIDER` | Provider for lead model in [lead/worker mode](/docs/guides/environment-variables#leadworker-model-configuration) | Same as `GOOSE_PROVIDER` options | Falls back to `GOOSE_PROVIDER` | No |
| `GOOSE_LEAD_MODEL` | Lead model for lead/worker mode | Model name | None | No |
| `GOOSE_PLANNER_PROVIDER` | Provider for [planning mode](/docs/guides/creating-plans) | Same as `GOOSE_PROVIDER` options | Falls back to `GOOSE_PROVIDER` | No |
| `GOOSE_PLANNER_MODEL` | Model for planning mode | Model name | Falls back to `GOOSE_MODEL` | No |
| `GOOSE_TOOLSHIM` | Enable tool interpretation | true/false | false | No |
2 changes: 1 addition & 1 deletion documentation/docs/guides/creating-plans.md

@@ -35,7 +35,7 @@ The goose CLI plan mode uses two configuration values:
- `GOOSE_PLANNER_MODEL`: Which model to use for planning

:::tip Multi-Model Alternative to Plan Mode
goose also supports automatic model switching with [Lead/Worker mode](/docs/guides/environment-variables#leadworker-model-configuration), which provides turn-based switching between two models to help balance model capabilities with cost and speed.
You can combine planning mode with a different default execution model to balance cost, speed, and quality.
:::
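For example, one way to point plan mode at a stronger model than your default (the fallback behavior is documented in the config reference; the model names here are placeholders):

```shell
# Plan with a stronger model; execute with your default.
export GOOSE_PLANNER_PROVIDER="anthropic"     # if unset, falls back to GOOSE_PROVIDER
export GOOSE_PLANNER_MODEL="claude-4-opus"    # if unset, falls back to GOOSE_MODEL

# Start a session and type /plan to use the planner model:
# goose session
echo "planner: ${GOOSE_PLANNER_PROVIDER}/${GOOSE_PLANNER_MODEL}"
```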

:::tip Customize Plan Format
40 changes: 0 additions & 40 deletions documentation/docs/guides/environment-variables.md

@@ -130,40 +130,6 @@ export GOOSE_PREDEFINED_MODELS='[

Custom context limits and request parameters are applied when the model is used. Custom context limits are displayed in goose CLI's [token usage indicator](/docs/guides/sessions/smart-context-management#token-usage).

### Lead/Worker Model Configuration

These variables configure a [lead/worker model pattern](/docs/tutorials/lead-worker) where a powerful lead model handles initial planning and complex reasoning, then switches to a faster/cheaper worker model for execution. The switch happens automatically based on your settings.

| Variable | Purpose | Values | Default |
|----------|---------|---------|---------|
| `GOOSE_LEAD_MODEL` | **Required to enable lead mode.** Name of the lead model | Model name (e.g., "gpt-4o", "claude-sonnet-4-20250514") | None |
| `GOOSE_LEAD_PROVIDER` | Provider for the lead model | [See available providers](/docs/getting-started/providers#available-providers) | Falls back to `GOOSE_PROVIDER` |
| `GOOSE_LEAD_TURNS` | Number of initial turns using the lead model before switching to the worker model | Integer | 3 |
| `GOOSE_LEAD_FAILURE_THRESHOLD` | Consecutive failures before falling back to the lead model | Integer | 2 |
| `GOOSE_LEAD_FALLBACK_TURNS` | Number of turns to use the lead model in fallback mode | Integer | 2 |

A _turn_ is one complete prompt-response interaction. Here's how it works with the default settings:
- Use the lead model for the first 3 turns
- Use the worker model starting on the 4th turn
- Fallback to the lead model if the worker model struggles for 2 consecutive turns
- Use the lead model for 2 turns and then switch back to the worker model

The lead model and worker model names are displayed at the start of the goose CLI session. If you don't export a `GOOSE_MODEL` for your session, the worker model defaults to the `GOOSE_MODEL` in your [configuration file](/docs/guides/config-files).

**Examples**

```bash
# Basic lead/worker setup
export GOOSE_LEAD_MODEL="o4"

# Advanced lead/worker configuration
export GOOSE_LEAD_MODEL="claude4-opus"
export GOOSE_LEAD_PROVIDER="anthropic"
export GOOSE_LEAD_TURNS=5
export GOOSE_LEAD_FAILURE_THRESHOLD=3
export GOOSE_LEAD_FALLBACK_TURNS=2
```

### Claude Thinking Configuration

These variables control Claude's reasoning behavior. Supported on Anthropic and Databricks providers.
@@ -350,8 +316,6 @@ These variables allow you to override the default context window size (token lim
|----------|---------|---------|---------|
| `GOOSE_CONTEXT_LIMIT` | Override context limit for the main model | Integer (number of tokens) | Model-specific default or 128,000 |
| `GOOSE_INPUT_LIMIT` | Override input prompt limit for ollama requests (maps to `num_ctx`) | Integer (number of tokens) | Falls back to `GOOSE_CONTEXT_LIMIT` or model default |
| `GOOSE_LEAD_CONTEXT_LIMIT` | Override context limit for the lead model in [lead/worker mode](/docs/tutorials/lead-worker) | Integer (number of tokens) | Falls back to `GOOSE_CONTEXT_LIMIT` or model default |
| `GOOSE_WORKER_CONTEXT_LIMIT` | Override context limit for the worker model in lead/worker mode | Integer (number of tokens) | Falls back to `GOOSE_CONTEXT_LIMIT` or model default |
| `GOOSE_PLANNER_CONTEXT_LIMIT` | Override context limit for the [planner model](/docs/guides/creating-plans) | Integer (number of tokens) | Falls back to `GOOSE_CONTEXT_LIMIT` or model default |

**Examples**
@@ -362,10 +326,6 @@ export GOOSE_CONTEXT_LIMIT=200000
# Override ollama input prompt limit
export GOOSE_INPUT_LIMIT=32000

# Set different context limits for lead/worker models
export GOOSE_LEAD_CONTEXT_LIMIT=500000 # Large context for planning
export GOOSE_WORKER_CONTEXT_LIMIT=128000 # Smaller context for execution

# Set context limit for planner
export GOOSE_PLANNER_CONTEXT_LIMIT=1000000
```
21 changes: 7 additions & 14 deletions documentation/docs/guides/multi-model/index.mdx

@@ -19,9 +19,9 @@ import VideoCarousel from '@site/src/components/VideoCarousel';
<h2 className={styles.categoryTitle}>📚 Documentation & Guides</h2>
<div className={styles.cardGrid}>
<Card
title="Lead/Worker Multi-Model Setup"
description="Automatic switching between models using a lead model for initial turns and a worker model for execution."
link="/docs/tutorials/lead-worker"
title="Planner + Execution Model Setup"
description="Use a dedicated planner model for strategic reasoning and a separate default model for execution."
link="/docs/guides/creating-plans"
/>
<Card
title="Creating Plans Before Working"
@@ -45,9 +45,9 @@ import VideoCarousel from '@site/src/components/VideoCarousel';
link="/blog/2025/06/16/multi-model-in-goose"
/>
<Card
title="LLM Tag Team: Who Plans, Who Executes?"
description="Learn how lead/worker model configuration creates an effective AI tag team, with one model for planning and another for execution."
link="/blog/2025/08/11/llm-tag-team-lead-worker-model"
title="The AI Skeptic's Guide to Context Windows"
description="Learn practical ways to manage context windows and token usage in long-running sessions."
link="/blog/2025/08/18/understanding-context-windows"
/>
</div>
</div>
@@ -62,15 +62,8 @@ import VideoCarousel from '@site/src/components/VideoCarousel';
type: 'iframe',
src: 'https://www.youtube.com/embed/ZyhUTsChFUw',
title: 'goose\'s Multi-Model Setup',
description: 'Learn about lead/worker mode, from configuration to best practices',
description: 'Learn practical multi-model workflows in goose',
duration: '5:01'
},
{
type: 'iframe',
src: 'https://www.youtube.com/embed/SJ6EZpyCKrk',
title: 'Livestream - LLM Tag Team: Who Plans, Who Executes?',
description: 'Using lead/worker mode to add features to the goose docs in real time',
duration: '9:36'
}
]}
/>
13 changes: 5 additions & 8 deletions documentation/docs/guides/sessions/smart-context-management.md

@@ -293,8 +293,6 @@ Context limits are automatically detected based on your model name, but goose pr
| Model | Description | Best For | Setting |
|-------|-------------|----------|---------|
| **Main** | Set context limit for the main model (also serves as fallback for other models) | LiteLLM proxies, custom models with non-standard names | `GOOSE_CONTEXT_LIMIT` |
| **Lead** | Set larger context for planning in [lead/worker mode](/docs/tutorials/lead-worker) | Complex planning tasks requiring more context | `GOOSE_LEAD_CONTEXT_LIMIT` |
| **Worker** | Set smaller context for execution in lead/worker mode | Cost optimization during execution phase | `GOOSE_WORKER_CONTEXT_LIMIT` |
| **Planner** | Set context for [planner models](/docs/guides/creating-plans) | Large planning tasks requiring extensive context | `GOOSE_PLANNER_CONTEXT_LIMIT` |

:::info
@@ -311,7 +309,7 @@ This feature is particularly useful with:
goose resolves context limits with the following precedence (highest to lowest):

1. Explicit context_limit in model configuration (if set programmatically)
2. Specific environment variable (e.g., `GOOSE_LEAD_CONTEXT_LIMIT`)
2. Specific environment variable (e.g., `GOOSE_PLANNER_CONTEXT_LIMIT`)
3. Global environment variable (`GOOSE_CONTEXT_LIMIT`)
4. Model-specific default based on name pattern matching
5. Global default (128,000 tokens)
Expand Down Expand Up @@ -348,13 +346,12 @@ export GOOSE_MODEL="my-custom-gpt4-proxy"
export GOOSE_CONTEXT_LIMIT=200000 # Override the 32k default
```

2. Lead/worker setup with different context limits
2. Planner setup with a different context limit

```bash
# Different context limits for planning vs execution
export GOOSE_LEAD_MODEL="claude-opus-custom"
export GOOSE_LEAD_CONTEXT_LIMIT=500000 # Large context for planning
export GOOSE_WORKER_CONTEXT_LIMIT=128000 # Smaller context for execution
# Set a larger context window for planning
export GOOSE_PLANNER_MODEL="claude-opus-custom"
export GOOSE_PLANNER_CONTEXT_LIMIT=500000
```

3. Planner with large context
4 changes: 2 additions & 2 deletions documentation/docs/guides/tips.md

@@ -57,8 +57,8 @@ goose Desktop lets you [customize the sidebar](/docs/guides/desktop-navigation)
### Keep goose updated
Regularly [update](/docs/guides/updating-goose) goose to benefit from the latest features, bug fixes, and performance improvements.

### Pair Two Models to Save Money
Use [lead/worker model](/docs/tutorials/lead-worker/) to have goose use a "lead" model for early planning before handing the task to a lower-cost "worker" model for execution.
### Use a Dedicated Planner Model
Use [planning mode](/docs/guides/creating-plans) with a dedicated planner model for complex reasoning, while keeping a faster default model for everyday execution.

### Make Recipes Safe to Re-run
Write [recipes](/docs/guides/recipes/session-recipes) that check your current state before acting, so they can be run multiple times without causing any errors or duplication.
17 changes: 0 additions & 17 deletions documentation/docs/troubleshooting/known-issues.md

@@ -114,23 +114,6 @@ For detailed steps on updating your LLM provider, refer to the [Installation][in

If you encounter errors when configuring GitHub Copilot as your provider, try these workarounds for common scenarios.

#### OAuth Error with Lead/Worker Models

If the [lead/worker model](/docs/tutorials/lead-worker) feature is configured in your environment, you might see the following error during GitHub Copilot setup. This feature conflicts with the OAuth flow to connect to the provider.
```
Failed to authenticate: Execution error: OAuth configuration not supported by this provider
```

To resolve:
1. Temporarily comment out or remove lead/worker model variables from the main config file (`~/.config/goose/config.yaml`):
```yaml
# GOOSE_LEAD_MODEL: your-model
# GOOSE_WORKER_MODEL: your-model
```
2. Run `goose configure` again to set up GitHub Copilot
3. Complete the OAuth authentication flow
4. Re-enable your lead/worker model settings as needed

#### Container and Keyring Issues

goose tries to use the system keyring (typically via Secret Service over DBus) to securely store your GitHub Copilot token. In containerized or headless environments, DBus and/or a desktop keyring service may not be available (and some setups fail with X11-based DBus autolaunch errors), so keyring access can fail.
6 changes: 3 additions & 3 deletions documentation/docs/tutorials/headless-goose.md

@@ -302,9 +302,9 @@ export GOOSE_CLI_MIN_PRIORITY=0.2 # Reduce verbose output
### Advanced Configuration

```bash
# For complex workflows requiring different models
export GOOSE_LEAD_MODEL=gpt-4o # For planning
export GOOSE_WORKER_MODEL=gpt-4o-mini # For execution
# For complex workflows requiring dedicated planning
export GOOSE_PLANNER_PROVIDER=openai
export GOOSE_PLANNER_MODEL=gpt-4o

# Security and permissions
export GOOSE_ALLOWLIST=https://company.com/allowed-extensions.json