fix: add strict:false to Responses API tools and gpt-5.4 to known models #8636
Conversation
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: ee17f24905
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you:
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
```rust
("gpt-5.4", 400_000),
("gpt-5.4-mini", 400_000),
("gpt-5.4-nano", 128_000),
("gpt-5.4-pro", 400_000),
```
Correct GPT-5.4 family context limits in known models
OPEN_AI_KNOWN_MODELS assigns gpt-5.4 and gpt-5.4-pro a 400_000 limit and gpt-5.4-nano a 128_000 limit, but the current model specs are 1,050,000 for gpt-5.4/gpt-5.4-pro and 400,000 for gpt-5.4-nano. Because these values are propagated into ModelConfig when provider defaults are applied, Goose will compact/truncate conversation history far earlier than necessary for these models, materially reducing available long-context behavior.
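The mechanics the reviewer describes can be sketched with a minimal, hypothetical lookup. The entries mirror the PR's diff, but the table constant and helper here are illustrative, not the repo's actual code:

```rust
// Hypothetical sketch of a hard-coded known-models table and lookup; the
// entries mirror the PR's values, but the helper itself is illustrative.
const KNOWN_MODELS: &[(&str, usize)] = &[
    ("gpt-5.4", 400_000),
    ("gpt-5.4-mini", 400_000),
    ("gpt-5.4-nano", 128_000),
    ("gpt-5.4-pro", 400_000),
];

/// Return the hard-coded context limit for a model, if known.
fn context_limit(model: &str) -> Option<usize> {
    KNOWN_MODELS
        .iter()
        .find(|(name, _)| *name == model)
        .map(|(_, limit)| *limit)
}

fn main() {
    // Whatever literal sits in the table is what downstream config inherits,
    // so an undersized value silently caps the usable context window.
    assert_eq!(context_limit("gpt-5.4"), Some(400_000));
    assert_eq!(context_limit("unknown-model"), None);
}
```

Because the literal is the single source of truth once provider defaults are applied, correcting it in one place fixes every downstream consumer.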
💡 Codex Review
Reviewed commit: 12e2304d68
```rust
("gpt-5.4", 400_000),
("gpt-5.4-mini", 400_000),
("gpt-5.4-nano", 128_000),
("gpt-5.4-pro", 400_000),
```
Correct gpt-5.4 family context limits
OPEN_AI_KNOWN_MODELS hard-codes gpt-5.4/gpt-5.4-pro at 400_000 and gpt-5.4-nano at 128_000, but the repo’s canonical model table (crates/goose/src/providers/canonical/data/canonical_models.json) lists 1_050_000 for gpt-5.4/gpt-5.4-pro and 400_000 for gpt-5.4-nano. Because OpenAiProvider::metadata() publishes these literal values and ProviderEntry::normalize_model_config falls back to provider metadata when canonical lookup is unavailable, this underreports/undersizes context windows for these models.
Force-pushed from 12e2304 to 36a396a
💡 Codex Review
Reviewed commit: 36a396a3ec
```rust
let base = normalized
    .strip_prefix("databricks-")
    .unwrap_or(&normalized);
```
Handle goose-prefixed GPT-5.4 models in responses routing
DatabricksProvider::is_responses_model removes only the databricks- prefix before matching GPT-5 families, so names like goose-gpt-5.4/goose-gpt-5.4-high are treated as non-Responses models. In stream(), that sends these models through the Chat Completions path instead of create_responses_request, which can break tool-calling for GPT-5.4 variants. This is inconsistent with the rest of the codebase (for example ModelConfig::is_openai_reasoning_model and extract_reasoning_effort) that already recognizes goose- prefixes.
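A minimal sketch of the fix the reviewer suggests, assuming the goal is simply to peel a `goose-` alias off after the `databricks-` prefix (the helper name is hypothetical):

```rust
/// Sketch of prefix normalization, assuming models may arrive with either a
/// "databricks-" or a "goose-" alias prefix. Helper name is hypothetical,
/// not the repo's actual function.
fn strip_known_prefixes(normalized: &str) -> &str {
    let base = normalized.strip_prefix("databricks-").unwrap_or(normalized);
    base.strip_prefix("goose-").unwrap_or(base)
}

fn main() {
    assert_eq!(strip_known_prefixes("goose-gpt-5.4-high"), "gpt-5.4-high");
    assert_eq!(strip_known_prefixes("databricks-gpt-5.4"), "gpt-5.4");
    assert_eq!(strip_known_prefixes("gpt-5.4"), "gpt-5.4");
}
```

With both prefixes stripped before the family match, `goose-gpt-5.4` variants would hit the same Responses-routing branch as the bare model names.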
Force-pushed from 36a396a to da0141d
💡 Codex Review
Reviewed commit: da0141deeb
```rust
/// automatically.
pub fn is_openai_responses_model(model_name: &str) -> bool {
    static RE: OnceLock<Regex> = OnceLock::new();
    let re = RE.get_or_init(|| Regex::new(r"(?i)^o\d|^gpt-5").unwrap());
```
Keep o1-mini models out of Responses routing
is_openai_responses_model now matches every o* model via ^o\d, so o1-mini/o1-preview are treated as Responses models and routed through OpenAiProvider::stream's Responses branch instead of Chat Completions. That bypasses the existing o1-mini safeguard in providers/formats/openai.rs::create_request (and the analogous Databricks formatter guard), so models that don't support tool calling are no longer rejected early and instead fail only when the API call is made at runtime; this is especially risky because the canonical table marks openai/o1-mini as "tool_call": false in canonical_models.json.
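One way to keep the legacy models out is an explicit deny-list before the family match, sketched here with plain string checks rather than the repo's regex (illustrative only):

```rust
/// Illustrative classifier, not the repo's regex: route o-series and gpt-5
/// families to the Responses API, but keep the legacy o1-mini/o1-preview
/// chat-only models out, since the canonical table marks them tool_call: false.
fn is_responses_model(model: &str) -> bool {
    let m = model.to_ascii_lowercase();
    // Explicit carve-out for the models with no tool-calling support.
    if m.starts_with("o1-mini") || m.starts_with("o1-preview") {
        return false;
    }
    // "o" followed by a digit (o1, o3, o4-mini, ...), or any gpt-5 family name.
    let o_series =
        m.starts_with('o') && m[1..].chars().next().is_some_and(|c| c.is_ascii_digit());
    o_series || m.starts_with("gpt-5")
}

fn main() {
    assert!(is_responses_model("o3"));
    assert!(is_responses_model("gpt-5.4-pro"));
    assert!(!is_responses_model("o1-mini"));
    assert!(!is_responses_model("gpt-4.1"));
}
```

The carve-out keeps the early rejection behavior the Chat Completions formatter used to provide, instead of deferring the failure to the API call.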
Force-pushed from da0141d to fc414a3
💡 Codex Review
Reviewed commit: fc414a3ab7
```rust
pub fn is_openai_responses_model(model_name: &str) -> bool {
    static RE: OnceLock<Regex> = OnceLock::new();
    let re =
        RE.get_or_init(|| Regex::new(r"(?i)(?:^|[-/])(?:o\d+(?:$|-)|gpt-5(?:$|[-.]))").unwrap());
```
Keep o1-mini and o1-preview out of Responses routing
The new is_openai_responses_model regex matches any o\d family, which now classifies o1-mini and o1-preview as Responses models and routes them through the Responses path (OpenAiProvider::stream/DatabricksProvider::stream) instead of the Chat Completions formatter path that contains the existing early o1-mini safeguard. That means tool-enabled requests for these models now fail only at runtime API calls; this is inconsistent with the canonical capabilities table, which still marks openai/o1-mini and openai/o1-preview as tool_call: false in canonical_models.json (lines 80475-80510).
Force-pushed from efcfea7 to 8530161
💡 Codex Review
Reviewed commit: 8530161bb6
```rust
pub fn is_openai_responses_model(model_name: &str) -> bool {
    static RE: OnceLock<Regex> = OnceLock::new();
    let re =
        RE.get_or_init(|| Regex::new(r"(?i)(?:^|[-/])(?:o\d+(?:$|-)|gpt-5(?:$|[-.]))").unwrap());
```
Keep o-series chat fallback on OpenAI-compatible hosts
is_openai_responses_model now matches every o* family, which makes OpenAiProvider::should_use_responses_api route o1/o3/o4 models to /v1/responses whenever base_path is the default v1/chat/completions. This breaks OpenAI-compatible hosts that expose only Chat Completions (a common setup in declarative providers using chat-completions URLs), because stream() enters the Responses branch and has no 404 fallback back to chat. The regression is input-dependent (custom host + o* model), but it can turn previously working requests into hard endpoint errors.
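The fallback the reviewer asks for can be sketched as a small decision helper; the enum, names, and the "host lacks Responses" flag are all hypothetical, standing in for whatever signal (e.g. a prior 404) the provider would actually track:

```rust
/// Hypothetical sketch of a chat fallback: prefer the Responses endpoint for
/// o-series/gpt-5 models, but degrade to Chat Completions when the host is
/// known not to expose /v1/responses.
#[derive(Debug, PartialEq)]
enum Endpoint {
    Responses,
    ChatCompletions,
}

fn pick_endpoint(model_wants_responses: bool, host_lacks_responses: bool) -> Endpoint {
    if model_wants_responses && !host_lacks_responses {
        Endpoint::Responses
    } else {
        Endpoint::ChatCompletions
    }
}

fn main() {
    assert_eq!(pick_endpoint(true, false), Endpoint::Responses);
    // An OpenAI-compatible host without /v1/responses keeps working.
    assert_eq!(pick_endpoint(true, true), Endpoint::ChatCompletions);
    assert_eq!(pick_endpoint(false, false), Endpoint::ChatCompletions);
}
```

The point is that routing stays a pure function of both the model family and the host's capabilities, so a chat-only compatible host never receives a hard endpoint error.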
Force-pushed from 8530161 to 319a400
💡 Codex Review
Reviewed commit: 319a4001f1
```rust
if !is_openai_responses_model(model_name) {
    return (model_name.to_string(), None);
```
Decouple reasoning-effort extraction from Responses routing
extract_reasoning_effort now bails out whenever is_openai_responses_model is false, which reclassifies plain GPT-5 families (gpt-5, gpt-5-mini, gpt-5-nano, etc.) as non-reasoning. That change propagates into providers/formats/openai.rs::create_request (and Databricks equivalent), causing these models to use system role + max_tokens and to allow temperature instead of the reasoning path (reasoning_effort + max_completion_tokens), so requests for GPT-5 chat-completions models can be malformed or behaviorally downgraded.
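A decoupled version of the suffix extraction can be sketched like this; the function shape is illustrative and makes no reference to Responses routing at all:

```rust
/// Illustrative sketch of suffix-based effort extraction decoupled from
/// Responses routing: "-low"/"-medium"/"-high" suffixes are peeled off
/// regardless of which API endpoint the base model is routed to.
fn extract_reasoning_effort(model_name: &str) -> (String, Option<String>) {
    for effort in ["low", "medium", "high"] {
        if let Some(base) = model_name.strip_suffix(&format!("-{effort}")) {
            return (base.to_string(), Some(effort.to_string()));
        }
    }
    // No explicit suffix: return None and let the API apply its own default.
    (model_name.to_string(), None)
}

fn main() {
    assert_eq!(
        extract_reasoning_effort("gpt-5-mini-high"),
        ("gpt-5-mini".to_string(), Some("high".to_string()))
    );
    assert_eq!(extract_reasoning_effort("gpt-5"), ("gpt-5".to_string(), None));
}
```

Keeping the effort parse independent of the routing predicate means plain GPT-5 chat-completions models still take the reasoning path when they carry a suffix.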
```rust
/// work without provider-specific normalization.
pub fn is_openai_responses_model(model_name: &str) -> bool {
    static RE: OnceLock<Regex> = OnceLock::new();
    let re = RE.get_or_init(|| Regex::new(r"(?i)(?:^|[-/])(?:o\d+(?:$|-)|gpt-5[.-]\d)").unwrap());
```
Route gpt-5-codex through Responses API
The new Responses-model regex only matches gpt-5 names with a numeric segment (gpt-5.2, gpt-5-4, etc.), so gpt-5-codex no longer matches and OpenAiProvider::should_use_responses_api now sends it to chat-completions. This is a regression from the prior codex-specific routing and can break tool-calling for users selecting the supported gpt-5-codex model on providers that require /v1/responses for codex variants.
Force-pushed from 29ab29a to 217b67f
Force-pushed from 217b67f to 4bab639
…ing effort lookup

- Unify is_openai_responses_model() regex for routing all o-series and gpt-5 family models to the Responses API across OpenAI and Databricks providers.
- extract_reasoning_effort() no longer defaults to "medium"; it omits the reasoning field when no explicit suffix is provided, letting the API apply its own per-model default (avoids 400s on gpt-5-pro, which only accepts "high").
- Add context window limits for new gpt-5 model variants.
- Set strict: false on Responses API tool schemas (MCP schemas are not strict-compatible).
- Add Responses API mock to session_id_propagation_test.
- Switch test fixtures from responses-routed models (gpt-5-nano, o4-mini) to gpt-4.1 for Chat Completions mock compatibility.

TODO: add Responses API mock to OpenAiFixture for full coverage.

Signed-off-by: Bradley Axen <baxen@squareup.com>
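The strict: false change described above can be sketched as follows; the JSON here is assembled by hand purely to show the shape of a tool definition, not how the repo actually builds it:

```rust
/// Minimal sketch of the strict flag on a Responses-API tool definition.
/// The JSON is hand-assembled for illustration; the real code presumably
/// builds it with a JSON library.
fn tool_definition(name: &str, params_schema: &str) -> String {
    format!(
        r#"{{"type":"function","name":"{name}","parameters":{params_schema},"strict":false}}"#
    )
}

fn main() {
    let tool = tool_definition("read_file", r#"{"type":"object"}"#);
    // With strict:false the API accepts MCP schemas that omit strict-mode
    // requirements such as additionalProperties:false.
    assert!(tool.contains(r#""strict":false"#));
}
```

Setting the flag explicitly matches the non-strict behavior Chat Completions applied by default to the same runtime-generated MCP schemas.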
Force-pushed from 4bab639 to c5e8fbe
💡 Codex Review
Reviewed commit: c5e8fbe998
```diff
     let session_id = &session_b.session_id().0;
     match via {
-        SetModelVia::Dedicated => conn.set_model(session_id, "o4-mini").await.unwrap(),
+        SetModelVia::Dedicated => conn.set_model(session_id, "gpt-4.1").await.unwrap(),
```
Use a non-default model in model-switch ACP tests
TEST_MODEL was changed to gpt-4.1, so this set_model call now sets Session B to the same model it already has from TestConnectionConfig::default(). That makes run_model_set/run_config_option_model_set effectively no-ops for model switching, so they can pass even if the session model update path regresses. Switch Session B to a different chat-completions model to keep validating the model-change behavior.
```diff
     // responses-routed models like o4-mini here.
     let config = TestConnectionConfig {
-        current_model: "o4-mini".to_string(),
+        current_model: "gpt-4.1".to_string(),
```
Restore actual mismatch in prompt-model-mismatch test
This test is intended to start with a server model different from TEST_MODEL, but after TEST_MODEL was updated to gpt-4.1, setting current_model to gpt-4.1 removes the mismatch entirely. The branch that forces set_model(TEST_MODEL) before prompting is no longer exercised, so regressions in mismatch handling can slip through undetected.
…els (aaif-goose#8636)

Signed-off-by: Bradley Axen <baxen@squareup.com>
* main:
  - feat: extend goose2 context window ux with auto-compaction (#8721)
  - improve goose2 agent management flows (#8737)
  - alexhancock/tui-improvements (#8736)
  - fix: add strict:false to Responses API tools and gpt-5.4 to known models (#8636)
  - persist and reliably apply chat model selection (#8734)
  - merge goose-acp crate into goose (#8726)
  - docs: AGENTS.md section on goose2 desktop backend architecture (#8732)
  - feat: goose2 message bubble + action tray (#8720)
  - consolidate provider ACP methods onto inventory (#8710)
  - ci: declare and enforce MSRV of 1.91.1 (#8670)
  - fix(ui): correct grammar in apps view description (#8668) (#8679)
  - Stop load openai fast model for openapi compatible custom endpoint (#8644)

* main: (41 commits)
  - removed the specific code owner for documentation change (#8749)
  - fix(providers): handle missing delta field in streaming chunks (#8700)
  - refactor(providers): extract http_status module and rename handle_status_openai_compat (#8620)
  - fix(providers/openai): accept streaming chunks with both reasoning fields (#8715)
  - feat: associate threads with projects (#8745)
  - upgrade goose sdk and tui to be compatible with the latest agentclientprotocol/sdk package (#8667)
  - feat: extend goose2 context window ux with auto-compaction (#8721)
  - improve goose2 agent management flows (#8737)
  - alexhancock/tui-improvements (#8736)
  - fix: add strict:false to Responses API tools and gpt-5.4 to known models (#8636)
  - persist and reliably apply chat model selection (#8734)
  - merge goose-acp crate into goose (#8726)
  - docs: AGENTS.md section on goose2 desktop backend architecture (#8732)
  - feat: goose2 message bubble + action tray (#8720)
  - consolidate provider ACP methods onto inventory (#8710)
  - ci: declare and enforce MSRV of 1.91.1 (#8670)
  - fix(ui): correct grammar in apps view description (#8668) (#8679)
  - Stop load openai fast model for openapi compatible custom endpoint (#8644)
  - feat(hooks): add Husky git hooks for ui/goose2 (#8577)
  - fix: links in chat could not be opened (#8544)
  - ...
Category: fix

User Impact: Users can now use OpenAI's gpt-5.4 model with goose for tool-calling workflows that previously failed silently or returned API errors.

Problem: GPT-5.4 requires the Responses API (`/v1/responses`), which defaults to `strict: true` for function tool definitions. MCP tool schemas are dynamically generated at runtime and don't conform to strict-mode requirements (e.g., missing `additionalProperties: false`), causing the API to reject tool calls. Additionally, gpt-5.4 was missing from the known models list, so it wouldn't get a proper context limit.

Solution: Explicitly set `strict: false` on tool definitions in `create_responses_request` so the Responses API accepts MCP tool schemas as-is, matching the non-strict default that Chat Completions used. Also added gpt-5.4 to `OPEN_AI_KNOWN_MODELS` with a 400k context limit. Both changes were developed using red-green TDD.

File changes
crates/goose/src/providers/formats/openai_responses.rs

Added `"strict": false` to each tool definition in `create_responses_request`. This prevents the Responses API from applying strict schema validation to MCP tool schemas that can't guarantee compliance. Also added a test (`test_responses_tools_include_strict_false`) that constructs a request with a tool and asserts the `strict` field is present and set to `false`.

crates/goose/src/providers/openai.rs

Added `("gpt-5.4", 400_000)` to `OPEN_AI_KNOWN_MODELS` so the model gets a proper context limit. Added a test (`gpt_5_4_is_in_known_models`) that asserts gpt-5.4 is present in the list.

Reproduction Steps
gpt-5.4.