diff --git a/documentation/blog/2025-06-16-multi-model-in-goose/index.md b/documentation/blog/2025-06-16-multi-model-in-goose/index.md
index 1cc7a562289e..7bbf5d36a5d2 100644
--- a/documentation/blog/2025-06-16-multi-model-in-goose/index.md
+++ b/documentation/blog/2025-06-16-multi-model-in-goose/index.md
@@ -1,11 +1,16 @@
 ---
 title: "Treating LLMs Like Tools in a Toolbox: A Multi-Model Approach to Smarter AI Agents"
 description: How Goose uses multiple LLMs within a single task, optimizing for speed, cost, and reliability in AI agent workflows
+unlisted: true
 authors:
 - mic
 - angie
 ---
+:::danger Outdated
+Lead/Worker mode has been removed from goose. It has been replaced by [Planning Mode](/docs/guides/creating-plans), which uses a dedicated planner model with the `/plan` command. See the [multi-model guide](/docs/guides/multi-model/) for current workflows.
+:::
+
 
@@ -79,7 +84,7 @@ export GOOSE_MODEL="claude-4-sonnet"
 
 From there, Goose takes care of the hand off, the fallback, and the recovery. You just... keep vibing.
 
-If you're curious how it all works under the hood, we've got a [full tutorial](/docs/tutorials/lead-worker).
+If you're curious how it all works under the hood, see the [planning guide](/docs/guides/creating-plans).
 
 ---
diff --git a/documentation/blog/2025-08-11-llm-tag-team-lead-worker-model/index.md b/documentation/blog/2025-08-11-llm-tag-team-lead-worker-model/index.md
index 462a0170bbe7..8885843a9d6c 100644
--- a/documentation/blog/2025-08-11-llm-tag-team-lead-worker-model/index.md
+++ b/documentation/blog/2025-08-11-llm-tag-team-lead-worker-model/index.md
@@ -1,10 +1,15 @@
 ---
 title: "LLM Tag Team: Who Plans, Who Executes?"
 description: Dive into Goose's Lead/Worker model where one LLM plans while another executes - a game-changing approach to AI collaboration that can save costs and boost efficiency.
+unlisted: true
 authors:
 - ebony
 ---
+:::danger Outdated
+Lead/Worker mode has been removed from goose. It has been replaced by [Planning Mode](/docs/guides/creating-plans), which uses a dedicated planner model with the `/plan` command. See the [multi-model guide](/docs/guides/multi-model/) for current workflows.
+:::
+
 
 Ever wondered what happens when you let two AI models work together like a tag team? That’s exactly what we tested in our latest livestream—putting Goose’s Lead/Worker model to work on a real project. Spoiler: it’s actually pretty great.
 
@@ -37,7 +42,7 @@ This is where it gets really cool - you can use Claude for reasoning and OpenAI
 - 🏃♂️ Handle Long Dev Sessions
 Perfect for those marathon coding sessions where you need sustained performance without breaking the bank.
 
-## [Setting It Up](/docs/tutorials/lead-worker#configuration)
+## [Setting It Up](/docs/guides/creating-plans)
 
 Getting started with the Lead/Worker model is surprisingly straightforward. In the Goose desktop app, you just need to:
 
@@ -97,9 +102,9 @@ By the end of our session, we had:
 
 The best part? The models made smart decisions we hadn't even thought of, like automatically categorizing the servers and improving the overall page layout.
 
-## Ready to Try It Yourself?
+## Ready to Try Multi-Model Workflows?
 
-The [Lead/Worker model](/docs/tutorials/lead-worker) is available now in Goose. Whether you're working on documentation, building features, or tackling complex refactoring, having two specialized models working together can be a game changer.
+Lead/Worker mode has been removed, but goose now supports [Planning Mode](/docs/guides/creating-plans) for multi-model workflows. Whether you're working on documentation, building features, or tackling complex refactoring, pairing a strong planner model with a fast execution model can be a game changer.
 
 Want to see it in action? Check out the full stream where we built this feature live:
diff --git a/documentation/blog/2025-08-18-understanding-context-windows/index.md b/documentation/blog/2025-08-18-understanding-context-windows/index.md
index 093a08644492..204c76626c8e 100644
--- a/documentation/blog/2025-08-18-understanding-context-windows/index.md
+++ b/documentation/blog/2025-08-18-understanding-context-windows/index.md
@@ -87,9 +87,9 @@ The [Memory extension](https://goose-docs.ai/docs/mcp/memory-mcp) stores importa
 
 Keep individual sessions focused on specific tasks. When you complete a task or reach a natural stopping point, start a new session. This prevents context window bloat from accumulated conversation history and ensures your tokens are spent on current, relevant work.
 
-**7. Lead/worker model**
+**7. Planner model + focused execution**
 
-The [Lead/Worker model](https://goose-docs.ai/docs/tutorials/lead-worker) splits work between two models. The lead model handles high-level planning and decision-making, while the worker model executes the detailed implementation. This optimizes costs by using expensive models for strategic thinking and cheaper models for routine execution tasks.
+Use a dedicated [planner model](/docs/guides/creating-plans) for complex reasoning and keep your default model focused on execution. This gives you control over cost and quality while keeping model behavior explicit and predictable.
 
 ---
diff --git a/documentation/blog/2025-08-27-get-started-for-free-with-tetrate/index.md b/documentation/blog/2025-08-27-get-started-for-free-with-tetrate/index.md
index 11cecbc32802..35d3e39903dd 100644
--- a/documentation/blog/2025-08-27-get-started-for-free-with-tetrate/index.md
+++ b/documentation/blog/2025-08-27-get-started-for-free-with-tetrate/index.md
@@ -55,7 +55,7 @@ Browse and select from a wide range of options, including:
 - **Specialized models** optimized for different use cases
 
 :::tip Protip
- Want the best of both worlds? Use goose’s [Lead/Worker configuration](/docs/tutorials/lead-worker) to combine a powerful frontier model with a faster open-weight model. Let your Lead handle the high-level thinking while Workers take care of the repetitive tasks—saving you both time and credits.
+ Want the best of both worlds? Use a dedicated [planner model](/docs/guides/creating-plans) for complex strategy and a faster default model for execution. Pair this with `/plan` to get strong reasoning only when you need it—saving both time and credits.
 :::
 
 ---
diff --git a/documentation/docs/getting-started/providers.md b/documentation/docs/getting-started/providers.md
index 2152597f31db..0bd217a4a00b 100644
--- a/documentation/docs/getting-started/providers.md
+++ b/documentation/docs/getting-started/providers.md
@@ -1208,8 +1208,8 @@ This method simplifies authentication and enhances security for enterprise envir
 
 Beyond single-model setups, goose supports [multi-model configurations](/docs/guides/multi-model/) that can use different models and providers for specialized tasks:
 
-- **Lead/Worker Model** - Automatic switching between a lead model for initial turns and a worker model for execution tasks
-- **Planning Mode** - Manual planning phase using a dedicated model to create detailed project breakdowns before execution
+- **Planning Mode** - Use a dedicated planner model to create detailed project breakdowns before execution
+- **Subagents** - Delegate scoped tasks to isolated sessions to keep your primary workflow focused and efficient
 
 ## Gemini 3 Thinking Levels
diff --git a/documentation/docs/guides/cli-providers.md b/documentation/docs/guides/cli-providers.md
index 00665c3b530d..ad9128231b88 100644
--- a/documentation/docs/guides/cli-providers.md
+++ b/documentation/docs/guides/cli-providers.md
@@ -37,7 +37,7 @@ CLI providers are useful if you:
 #### Workflow Integration
 - **Recipe compatibility**: Use CLI providers in automated goose recipes
 - **Scheduling support**: Include in scheduled tasks and workflows
-- **Hybrid configurations**: Combine with LLM providers using lead/worker patterns
+- **Hybrid configurations**: Combine with planning mode and model-specific workflows
 
 #### Interface Consistency
 - **Unified commands**: Use the same `goose session` interface across all providers
@@ -260,16 +260,16 @@ Once configured, you can start a goose session using these providers just like a
 goose session
 ```
 
-### Combining with Other Models
+### Combining with Planner Models
 
-CLI providers work well in combination with other models using goose's [lead/worker pattern](/docs/tutorials/lead-worker):
+CLI providers also work well with planning mode when you want one model for strategy and another for execution:
 
 ```bash
-# Use Claude Code as lead model, GPT-4o as worker
-export GOOSE_LEAD_PROVIDER=claude-code
-export GOOSE_PROVIDER=openai
-export GOOSE_MODEL=gpt-4o
-export GOOSE_LEAD_MODEL=default
+# Use Claude Code for execution, OpenAI for planning
+export GOOSE_PROVIDER=claude-code
+export GOOSE_MODEL=default
+export GOOSE_PLANNER_PROVIDER=openai
+export GOOSE_PLANNER_MODEL=gpt-4o
 goose session
 ```
diff --git a/documentation/docs/guides/config-files.md b/documentation/docs/guides/config-files.md
index d8c02003dc91..ef6206131686 100644
--- a/documentation/docs/guides/config-files.md
+++ b/documentation/docs/guides/config-files.md
@@ -37,8 +37,6 @@ The following settings can be configured at the root level of your config.yaml f
 | `GOOSE_MAX_TOKENS` | Maximum number of tokens for each model response (truncates longer responses) | Positive integer | Model-specific | No |
 | `GOOSE_MODE` | [Tool execution behavior](/docs/guides/goose-permissions) | "auto", "approve", "chat", "smart_approve" | "auto" | No |
 | `GOOSE_MAX_TURNS` | [Maximum number of turns](/docs/guides/sessions/smart-context-management#maximum-turns) allowed without user input | Integer (e.g., 10, 50, 100) | 1000 | No |
-| `GOOSE_LEAD_PROVIDER` | Provider for lead model in [lead/worker mode](/docs/guides/environment-variables#leadworker-model-configuration) | Same as `GOOSE_PROVIDER` options | Falls back to `GOOSE_PROVIDER` | No |
-| `GOOSE_LEAD_MODEL` | Lead model for lead/worker mode | Model name | None | No |
 | `GOOSE_PLANNER_PROVIDER` | Provider for [planning mode](/docs/guides/creating-plans) | Same as `GOOSE_PROVIDER` options | Falls back to `GOOSE_PROVIDER` | No |
 | `GOOSE_PLANNER_MODEL` | Model for planning mode | Model name | Falls back to `GOOSE_MODEL` | No |
 | `GOOSE_TOOLSHIM` | Enable tool interpretation | true/false | false | No |
diff --git a/documentation/docs/guides/creating-plans.md b/documentation/docs/guides/creating-plans.md
index a01bc9aa0c76..b5dae558a4ca 100644
--- a/documentation/docs/guides/creating-plans.md
+++ b/documentation/docs/guides/creating-plans.md
@@ -35,7 +35,7 @@ The goose CLI plan mode uses two configuration values:
 - `GOOSE_PLANNER_MODEL`: Which model to use for planning
 
 :::tip Multi-Model Alternative to Plan Mode
-goose also supports automatic model switching with [Lead/Worker mode](/docs/guides/environment-variables#leadworker-model-configuration), which provides turn-based switching between two models to help balance model capabilities with cost and speed.
+You can combine planning mode with a different default execution model to balance cost, speed, and quality.
 :::
 
 :::tip Customize Plan Format
diff --git a/documentation/docs/guides/environment-variables.md b/documentation/docs/guides/environment-variables.md
index e007dd6b9235..7eeaf99ee455 100644
--- a/documentation/docs/guides/environment-variables.md
+++ b/documentation/docs/guides/environment-variables.md
@@ -130,40 +130,6 @@ export GOOSE_PREDEFINED_MODELS='[
 
 Custom context limits and request parameters are applied when the model is used. Custom context limits are displayed in goose CLI's [token usage indicator](/docs/guides/sessions/smart-context-management#token-usage).
 
-### Lead/Worker Model Configuration
-
-These variables configure a [lead/worker model pattern](/docs/tutorials/lead-worker) where a powerful lead model handles initial planning and complex reasoning, then switches to a faster/cheaper worker model for execution. The switch happens automatically based on your settings.
-
-| Variable | Purpose | Values | Default |
-|----------|---------|---------|---------|
-| `GOOSE_LEAD_MODEL` | **Required to enable lead mode.** Name of the lead model | Model name (e.g., "gpt-4o", "claude-sonnet-4-20250514") | None |
-| `GOOSE_LEAD_PROVIDER` | Provider for the lead model | [See available providers](/docs/getting-started/providers#available-providers) | Falls back to `GOOSE_PROVIDER` |
-| `GOOSE_LEAD_TURNS` | Number of initial turns using the lead model before switching to the worker model | Integer | 3 |
-| `GOOSE_LEAD_FAILURE_THRESHOLD` | Consecutive failures before falling back to the lead model | Integer | 2 |
-| `GOOSE_LEAD_FALLBACK_TURNS` | Number of turns to use the lead model in fallback mode | Integer | 2 |
-
-A _turn_ is one complete prompt-response interaction. Here's how it works with the default settings:
-- Use the lead model for the first 3 turns
-- Use the worker model starting on the 4th turn
-- Fallback to the lead model if the worker model struggles for 2 consecutive turns
-- Use the lead model for 2 turns and then switch back to the worker model
-
-The lead model and worker model names are displayed at the start of the goose CLI session. If you don't export a `GOOSE_MODEL` for your session, the worker model defaults to the `GOOSE_MODEL` in your [configuration file](/docs/guides/config-files).
-
-**Examples**
-
-```bash
-# Basic lead/worker setup
-export GOOSE_LEAD_MODEL="o4"
-
-# Advanced lead/worker configuration
-export GOOSE_LEAD_MODEL="claude4-opus"
-export GOOSE_LEAD_PROVIDER="anthropic"
-export GOOSE_LEAD_TURNS=5
-export GOOSE_LEAD_FAILURE_THRESHOLD=3
-export GOOSE_LEAD_FALLBACK_TURNS=2
-```
-
 ### Claude Thinking Configuration
 
 These variables control Claude's reasoning behavior. Supported on Anthropic and Databricks providers.
@@ -350,8 +316,6 @@ These variables allow you to override the default context window size (token lim
 |----------|---------|---------|---------|
 | `GOOSE_CONTEXT_LIMIT` | Override context limit for the main model | Integer (number of tokens) | Model-specific default or 128,000 |
 | `GOOSE_INPUT_LIMIT` | Override input prompt limit for ollama requests (maps to `num_ctx`) | Integer (number of tokens) | Falls back to `GOOSE_CONTEXT_LIMIT` or model default |
-| `GOOSE_LEAD_CONTEXT_LIMIT` | Override context limit for the lead model in [lead/worker mode](/docs/tutorials/lead-worker) | Integer (number of tokens) | Falls back to `GOOSE_CONTEXT_LIMIT` or model default |
-| `GOOSE_WORKER_CONTEXT_LIMIT` | Override context limit for the worker model in lead/worker mode | Integer (number of tokens) | Falls back to `GOOSE_CONTEXT_LIMIT` or model default |
 | `GOOSE_PLANNER_CONTEXT_LIMIT` | Override context limit for the [planner model](/docs/guides/creating-plans) | Integer (number of tokens) | Falls back to `GOOSE_CONTEXT_LIMIT` or model default |
 
 **Examples**
@@ -362,10 +326,6 @@ export GOOSE_CONTEXT_LIMIT=200000
 # Override ollama input prompt limit
 export GOOSE_INPUT_LIMIT=32000
 
-# Set different context limits for lead/worker models
-export GOOSE_LEAD_CONTEXT_LIMIT=500000 # Large context for planning
-export GOOSE_WORKER_CONTEXT_LIMIT=128000 # Smaller context for execution
-
 # Set context limit for planner
 export GOOSE_PLANNER_CONTEXT_LIMIT=1000000
 ```
diff --git a/documentation/docs/guides/multi-model/index.mdx b/documentation/docs/guides/multi-model/index.mdx
index 61d3b345922f..dc3392891893 100644
--- a/documentation/docs/guides/multi-model/index.mdx
+++ b/documentation/docs/guides/multi-model/index.mdx
@@ -19,9 +19,9 @@ import VideoCarousel from '@site/src/components/VideoCarousel';