Conversation

@SDGLBL
Contributor

@SDGLBL SDGLBL commented Aug 6, 2025

Description

Add reasoning content support to the OpenAI adapter's chat_output function, enabling proper handling of reasoning content from models that emit it, such as the gpt-oss series.

Why This Change?

OpenAI's gpt-oss series models on OpenRouter return reasoning tokens that provide insight into the model's thinking process. Without this support, the CodeCompanion chat buffer would lose this valuable reasoning content during streaming responses.

Data Structure Changes

Before:

output = {
  role = delta.role,
  content = delta.content,
}

After:

output = {
  role = delta.role,
}

if delta.reasoning then
  output.reasoning = {
    content = delta.reasoning,
  }
elseif delta.content then
  output.content = delta.content
end
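
For illustration, here is a minimal sketch of how a decoded streaming chunk carrying reasoning flows through this logic. The chunk contents below are invented for the example; only the reasoning field name comes from this PR.

-- A minimal sketch (values invented) of a decoded streaming chunk carrying reasoning
local chunk = {
  choices = {
    { delta = { role = "assistant", reasoning = "The user said hi; a short greeting will do." } },
  },
}

local delta = chunk.choices[1].delta
local output = { role = delta.role }

if delta.reasoning then
  output.reasoning = { content = delta.reasoning }
elseif delta.content then
  output.content = delta.content
end

-- output.reasoning now carries the thinking tokens separately from output.content,
-- so the chat buffer can render them under their own "Reasoning" section.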

Related Issue(s)

None - This is a feature enhancement to support new model capabilities.

Screenshots

[screenshot]

Checklist

  • I've read the contributing guidelines and have adhered to them in this PR
  • I've updated CodeCompanion.has in the init.lua file for my new feature (N/A - reasoning is an adapter enhancement, not a user-facing feature)
  • I've added test coverage for this fix/feature
  • I've updated the README and/or relevant docs pages (N/A - reasoning models already documented in adapters docs)
  • I've run make all to ensure docs are generated, tests pass and my formatting is applied

SDGLBL and others added 2 commits August 6, 2025 14:26
Add reasoning support to the OpenAI adapter's chat_output function
similar to the DeepSeek adapter implementation. This allows the
OpenAI adapter to handle reasoning content from models that support
it, such as o1 and o3 series models.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>
Add the content field to the output table when a delta contains reasoning. Previously, the output table omitted the content field in this case, which caused the caller to receive incomplete responses. This patch restores the correct behavior by setting `output.content` when `delta.content` is provided after handling reasoning.
@olimorris
Owner

olimorris commented Aug 6, 2025

Thanks for this. I'll need test coverage (as we have for DeepSeek) before I can merge this. Also, please stick to the PR template in the future - it just reduces the visual debt when you're trawling through lots of PRs.

olimorris's review comment was marked as resolved.

@olimorris olimorris added the reviewed-by-AI (The CodeCompanion agent reviewed this PR) label Aug 6, 2025
@olimorris olimorris dismissed their stale review August 6, 2025 09:09

Not required

Add a stub file containing OpenAI reasoning stream chunks and a corresponding test case that verifies reasoning content is correctly accumulated during streaming. The new test confirms that reasoning tokens are captured and that the final assistant message is produced as expected.
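
As a rough sketch of the shape such a test could take (the stub path, the use of vim.fn.readfile, and the accumulation loop are illustrative assumptions, not the actual test file added in this commit):

-- Rough sketch only: stub path and accumulation logic are illustrative assumptions
local adapter = require("codecompanion.adapters").extend("openai", {})
local lines = vim.fn.readfile("tests/adapters/stubs/openai_reasoning_streaming.txt")

local reasoning, content = {}, {}
for _, line in ipairs(lines) do
  local result = adapter.handlers.chat_output(adapter, line)
  if result and result.output then
    if result.output.reasoning then
      table.insert(reasoning, result.output.reasoning.content)
    elseif result.output.content then
      table.insert(content, result.output.content)
    end
  end
end

assert(#reasoning > 0, "reasoning tokens should be accumulated during streaming")
assert(#content > 0, "the final assistant message should still be produced")
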
@SDGLBL
Contributor Author

SDGLBL commented Aug 6, 2025

> Thanks for this. I'll need test coverage (as we have for DeepSeek) before I can merge this. Also, please stick to the PR template in the future - it just reduces the visual debt when you're trawling through lots of PRs.

I've updated it according to the template.

@Davidyz (Contributor) commented on these lines in the diff:

role = delta.role,
}

if delta.reasoning then

Can we make the name of the reasoning field configurable (maybe via opts)? A lot of providers (deepseek, llama.cpp, etc.) implement a slightly modified openai-api that sends the reasoning under a different name. For example, the DeepSeek API calls it reasoning_content. Since a lot of the other adapters reuse code from the openai adapter, it's probably worth it to make it more "feature-complete" so that it'll be easier to maintain/create other adapters.

@znculee

znculee commented Aug 7, 2025

Thanks for implementing this! Is it compatible with all openai_compatible adapters rather than just openai, so that it'd be more flexible to extend?

@SDGLBL
Contributor Author

SDGLBL commented Aug 7, 2025

> Thanks for implementing this! Is it compatible with all openai_compatible adapters rather than just openai, so that it'd be more flexible to extend?

This modification was actually made so that openai_compatible could be used to call OpenRouter, which in turn returns the thought process through the reasoning field. I actually considered writing a dedicated openrouter_compatible adapter, separately implementing chat_output and providing a configurable reasoning field to support this capability. However, considering the complexity and redundancy, I decided against it.

If you're open to implementing a configurable delta.custom_field adapter mechanism that supports the OpenRouter API for inference, perhaps it would be better if you handled it. @Davidyz @znculee

@znculee

znculee commented Aug 7, 2025

@SDGLBL Thanks! I'm mainly using local models with this plugin, and I just confirmed that this branch is already working with the local Ollama (gpt-oss:20b) as follows.

## Me

hi

## CodeCompanion (OpenAI Compatible)

### Reasoning

We need to respond as ChatGPT - generic friendly start. Probably the user says "hi". We respond politely.

### Response

Hi! How can I help you today?

## Me


@SDGLBL
Contributor Author

SDGLBL commented Aug 7, 2025

> @SDGLBL Thanks! I'm mainly using local models with this plugin, and I just confirmed that this branch is already working with the local Ollama (gpt-oss:20b) as follows.
>
> ## Me
>
> hi
>
> ## CodeCompanion (OpenAI Compatible)
>
> ### Reasoning
>
> We need to respond as ChatGPT - generic friendly start. Probably the user says "hi". We respond politely.
>
> ### Response
>
> Hi! How can I help you today?
>
> ## Me

This is the adapter config I'm currently using with OpenRouter (extending ollama):

openrouter = require("codecompanion.adapters").extend("ollama", {
  -- comment
  env = {
    api_key = os.getenv "OPENROUTER_API_KEY",
  },
  opts = {},
  url = os.getenv "OPENROUTER_API_BASE" .. "/chat/completions",
  schema = {
    model = {
      default = "openai/gpt-oss-120b",
      choices = {
        ["google/gemini-2.5-pro"] = { opts = { can_reason = true, can_use_tools = true } },
        ["google/gemini-2.5-flash"] = { opts = { can_reason = false, can_use_tools = true } },
        ["anthropic/claude-4-sonnet"] = { opts = { can_reason = true, can_use_tools = true } },
        ["openai/gpt-4.1"] = { opts = { can_reason = true, can_use_tools = true } },
        ["openai/gpt-oss-20b"] = { opts = { can_reason = true, can_use_tools = true } },
        ["openai/gpt-oss-120b"] = { opts = { can_reason = true, can_use_tools = true } },
      },
    },
  },
  handlers = {
    ---@param self CodeCompanion.Adapter
    ---@return boolean
    setup = function(self)
      -- Reuse the OpenAI adapter's setup handler before adding OpenRouter provider params
      local openai_setup = require("codecompanion.adapters.openai").handlers.setup
      if openai_setup then
        openai_setup(self)
      end

      self.parameters.provider = {
        order = {
          "Groq",
        },
        allow_fallbacks = true,
      }

      return true
    end,
  },
}),

But I'm using OpenRouter, which apparently can't be driven through the Ollama adapter. Ollama supports this feature simply because the Ollama adapter already handles the reasoning field (see here).

@Davidyz
Contributor

Davidyz commented Aug 7, 2025

> If you're open to implementing a configurable delta.custom_field adapter mechanism that supports the OpenRouter API for inference, perhaps it would be better if you handled it.

The thing is, the custom handler will probably be 99% identical to the openai adapter, with the custom field being the only thing that sets them apart. I think it'd be nice to just make the openai adapter more versatile. But at the end of the day, there's no standardized way of doing this, so it's ok if you don't want to include this here.

@znculee

znculee commented Aug 8, 2025

Agree with @Davidyz. @SDGLBL, rather than extending ollama, you may try extending openai_compatible, which should be more flexible. I haven't tried it with OpenRouter, though:

    adapters = {
      ollama = function()
        return require("codecompanion.adapters").extend("openai_compatible", {
          name = "ollama",
          env = {
            url = "http://127.0.0.1:11434",
          },
          schema = {
            model = {
              default = os.getenv("CODECOMPANION_OLLAMA_MODEL"),
            },
          },
        })
      end,
    },

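For OpenRouter specifically, the same pattern would presumably look something like the sketch below (untested; the env and schema field names mirror the example above, and how openai_compatible builds the final request URL from env.url is an assumption here):

-- Untested sketch: the same pattern pointed at OpenRouter instead of a local Ollama
openrouter = function()
  return require("codecompanion.adapters").extend("openai_compatible", {
    name = "openrouter",
    env = {
      url = "https://openrouter.ai/api",
      api_key = "OPENROUTER_API_KEY",
    },
    schema = {
      model = {
        default = "openai/gpt-oss-120b",
      },
    },
  })
end,
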
@bassamsdata
Contributor

Hello everyone, just wanted to share my two cents. I believe openai_compatible doesn't support function calling, so it's better to extend the OpenAI adapter directly (which is what I do with OpenRouter). I'm not entirely sure about the reasoning handling, though, especially since openai_compatible won't be supported anymore as per this.

@Davidyz
Contributor

Davidyz commented Aug 8, 2025

> I believe openai_compatible doesn't support function calling, so it's better to extend the OpenAI adapter directly (which is what I do with OpenRouter).

openai_compatible does support tool calling as is. I've used it with various providers (both cloud and local).

@Davidyz
Contributor

Davidyz commented Aug 24, 2025

After some further research, I'd like to re-raise my point on the configurable reasoning output field name. llama.cpp, vllm and Alibaba (Qwen) support listing models via the /v1/models endpoint and stream the reasoning content in message.reasoning_content, which is outside of the OpenAI API specs and is different from the openrouter format. I think this is the same as the deepseek format, but the deepseek adapter hardcodes schema.model.choices so it's actually not very convenient to use. Also, there's a chance that OpenAI will refine its API and define its own reasoning format. Having a flexible design in the first place will save us the trouble in the future.

Just to clarify, I'm suggesting something like:

adapter.opts = {
  stream = true,
  tools = true,
  vision = true,
  -- users can override this to fit their needs
  reasoning_field = 'reasoning'
}

and in the chat_output handler, we do:

output = {
  role = delta.role,
}

if delta[opts.reasoning_field] then
  output.reasoning = {
    content = delta[opts.reasoning_field],
  }
elseif delta.content then
  output.content = delta.content
end
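
As a usage sketch of that idea, a DeepSeek-style provider would then only need to override the field name (the extend call mirrors the adapter examples earlier in this thread; the adapter name is hypothetical):

-- Hedged usage sketch: a DeepSeek-style provider only overrides the field name
deepseek_style = require("codecompanion.adapters").extend("openai", {
  opts = {
    reasoning_field = "reasoning_content",
  },
})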

@olimorris any thoughts on this?

@github-actions
Contributor

This PR is stale because it has been open for 30 days with no activity.

@github-actions github-actions bot added the Stale label Sep 23, 2025
@olimorris olimorris added the P1 (High impact, high urgency) and P4 (Negligible impact and urgency) labels and removed the Stale, reviewed-by-AI (The CodeCompanion agent reviewed this PR) and P1 (High impact, high urgency) labels Sep 23, 2025
@github-actions
Contributor

This PR is stale because it has been open for 30 days with no activity.

@github-actions github-actions bot added the Stale label Oct 24, 2025
@Davidyz
Contributor

Davidyz commented Oct 25, 2025

Any updates on this? @olimorris I'm tempted to add reasoning output to the gemini adapter, which might introduce some changes to the openai adapter and might have conflicts with this PR here. It'll be easier if we could decide what happens with this PR first.

@olimorris
Owner

Hi All,

Apologies for not having the chance to look at this sooner. Actually reviewing a PR such as this is non-trivial and I am tightly time-boxing my time on CodeCompanion these days.

If a PR like this doesn't include links to OpenAI documentation, then I have to manually review every single line alongside their docs: "How are we processing the reasoning output?", "Does it need to be passed back to OpenAI on the next turn?", and all the other questions that came up when I added support for the Responses API.

After a quick review, I can't accept this PR without adding reasoning support for when streaming is turned off. I believe OpenAI still have some models on the completions endpoint that don't allow streaming and there's always the chance they release new ones that don't support it anyway.

@Davidyz
Contributor

Davidyz commented Oct 26, 2025

> If a PR like this doesn't include links to OpenAI documentation

I don't think the standard OpenAI API contains anything about reasoning output or even a summary. OpenAI-compatible providers that actually provide reasoning output tend to come up with their own implementation for this. DeepSeek, for example, uses an extra field in the delta to transfer the string (some other providers follow this convention). Google Gemini takes a different approach (see #2306).
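
For illustration, the shape being described looks roughly like this (a sketch of the convention, not copied from DeepSeek's documentation; the values are invented):

-- Illustrative shape only: a DeepSeek-style streamed delta
local delta = {
  role = "assistant",
  reasoning_content = "Consider what the user actually asked before answering...",
  -- `content` arrives in later chunks once the reasoning has finished
}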

As for the non-streaming format, I'm not sure about the openrouter format, but I'm pretty sure the deepseek format works with non-streaming requests. That's why I proposed these changes: it's non-intrusive when used with the official OpenAI endpoint, and since it's just the deepseek format if you set opts.reasoning_field to reasoning_content, we can probably just reuse the deepseek stubs for testing. This is similar to openai-python's implementation, where they just allow extra fields in their BaseModel, which is inherited by almost all the data structures they implement for the openai API, including the choices in chat completions, streaming or not.
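
To make that concrete, a hedged sketch of how the non-streaming path could reuse the same opts.reasoning_field idea (choice and opts are used as in the streaming sketch above; assuming the full choice.message mirrors the streamed delta, per the DeepSeek convention rather than anything in the OpenAI spec):

-- Hedged sketch: the non-streaming path under the same opts.reasoning_field idea
local message = choice.message
local output = { role = message.role, content = message.content }

local reasoning = message[opts.reasoning_field or "reasoning"]
if reasoning then
  output.reasoning = { content = reasoning }
end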

Update: Examples of deepseek and openrouter using the official openai-python package to work with reasoning tokens.

@github-actions
Contributor

This PR is stale because it has been open for 30 days with no activity.

@github-actions github-actions bot added the Stale label Nov 26, 2025
@Davidyz
Contributor

Davidyz commented Nov 28, 2025

I believe #2359 superseded this PR? Should be safe to close this imo.

@SDGLBL
Contributor Author

SDGLBL commented Nov 28, 2025

Duplicate of #2359. Thanks to @Davidyz for his contribution.

@SDGLBL SDGLBL closed this Nov 28, 2025