Python: Incompatibility with gpt-oss on NIM and OpenAIResponsesClient #1423

Description

@pamelafox

I have deployed gpt-oss using the NIM image on Azure Container Apps with a GPU, following this guide:
https://learn.microsoft.com/en-us/azure/ai-foundry/openai/how-to/responses?tabs=python-key

I am able to use gpt-oss via the OpenAI Responses API endpoint, and I can use agent-framework with that endpoint as long as I specify no tools. However, as soon as I specify a tool, I get this error:

File "/Users/pamelafox/python-ai-agents-demos/.venv/lib/python3.12/site-packages/agent_framework/openai/_responses_client.py", line 115, in _inner_get_response
raise ServiceResponseException(
agent_framework.exceptions.ServiceResponseException: <class 'agent_framework.openai._responses_client.OpenAIResponsesClient'> service failed to complete the prompt: Error code: 400 - {'error': {'message': '[{'type': 'string_type', 'loc': ('body', 'input', 'str'), 'msg': 'Input should be a valid string', 'input': [{'role': 'user', 'content': [{'type': 'input_text', 'text': 'Whats weather today in San Francisco?'}]}, {'call_id': 'call_befa29227d8b475aa84e20a4db7c8134', 'id': 'ft_befa29227d8b475aa84e20a4db7c8134', 'type': 'function_call', 'name': 'get_weather', 'arguments': '{"city":"San Francisco"}'}, {'role': 'assistant', 'content': [{'type': 'reasoning', 'summary': {'type': 'summary_text', 'text': 'We need to call function get_weather with city "San Francisco".'}}]}, {'call_id': 'call_befa29227d8b475aa84e20a4db7c8134', 'id': 'ft_befa29227d8b475aa84e20a4db7c8134', 'type': 'function_call_output', 'output': '{"temperature": 60, "description": "Rainy"}'}]}, {'type': 'string_type', 'loc': ('body', 'input', 'list[union[EasyInputMessageParam,Message,ResponseOutputMessageParam,ResponseFileSearchToolCallParam,ResponseComputerToolCallParam,ComputerCallOutput,ResponseFunctionWebSearchParam,ResponseFunctionToolCallParam,FunctionCallOutput,ResponseReasoningItemParam,ImageGenerationCall,ResponseCodeInterpreterToolCallParam,LocalShellCall,LocalShellCallOutput,McpListTools,McpApprovalRequest,McpApprovalResponse,McpCall,ResponseCustomToolCallOutputParam,ResponseCustomToolCallParam,ItemReference,ResponseReasoningItem,ResponseFunctionToolCall]]', 2, 'EasyInputMessageParam', 'content', 'str'), 'msg': 'Input should be a valid string', 'input': [{'type': 'reasoning', 'summary': {'type': 'summary_text', 'text': 'We need to call function get_weather with city "San Francisco".'}}]}, {'type': 'missing', 'loc': ('body', 'input', 'list[union[EasyInputMessageParam,Message,ResponseOutputMessageParam,ResponseFileSearchToolCallParam,ResponseComputerToolCallParam,ComputerCallOutput,ResponseFunctionWebSearchParam,ResponseFunctionToolCallParam,FunctionCallOutput,ResponseReasoningItemParam,ImageGenerationCall,ResponseCodeInterpreterToolCallParam,LocalShellCall,LocalShellCallOutput,McpListTools,McpApprovalRequest,McpApprovalResponse,McpCall,ResponseCustomToolCallOutputParam,ResponseCustomToolCallParam,ItemReference,ResponseReasoningItem,ResponseFunctionToolCall]]', 2, 'EasyInputMessageParam', 'content', 'list[union[...,...,...]]', 0, 'ResponseInputTextParam', 'text'), 'msg': 'Field required', 'input': {'type': 'reasoning', 'summary': {'type': 'summary_text', 'text': 'We need to call function get_weather with city "San Francisco".'}}},

...The error continues like this for a very long time, repeating the same content, so I've truncated it here.
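
For what it's worth, the Pydantic errors point at how the conversation is replayed on the second turn: the client sends the model's reasoning summary back as a content part of an assistant message ({'type': 'reasoning', ...}), and NIM's schema only accepts text-style parts there, per the union in the error. If it helps triage, here is an untested sketch that should reproduce the 400 without agent-framework in the loop, using the raw OpenAI SDK against the same endpoint (the env var names match my script below; the payload shape is copied from the error above):

import os

from openai import OpenAI

raw = OpenAI(base_url=os.environ["OLLAMA_ENDPOINT"], api_key="none")

# The assistant message mirrors what agent-framework replays on turn two;
# the nested 'reasoning' content part is what NIM's validation rejects.
raw.responses.create(
    model=os.environ["OLLAMA_MODEL"],
    input=[
        {"role": "user", "content": [{"type": "input_text", "text": "Whats weather today in San Francisco?"}]},
        {
            "role": "assistant",
            "content": [
                {"type": "reasoning", "summary": {"type": "summary_text", "text": 'We need to call function get_weather with city "San Francisco".'}},
            ],
        },
    ],
)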

Here's the code:

import asyncio
import logging
import os
import random
from typing import Annotated

from agent_framework.openai import OpenAIResponsesClient
from dotenv import load_dotenv
from pydantic import Field
from rich import print
from rich.logging import RichHandler

# Setup logging with rich
logging.basicConfig(level=logging.WARNING, format="%(message)s", datefmt="[%X]", handlers=[RichHandler()])

load_dotenv(override=True)
API_HOST = os.getenv("API_HOST", "github")

# Point these env vars at the OpenAI-compatible endpoint serving gpt-oss
# (my NIM deployment, in this case).
client = OpenAIResponsesClient(
    base_url=os.environ.get("OLLAMA_ENDPOINT", "http://localhost:11434/v1"),
    api_key="none",
    model_id=os.environ.get("OLLAMA_MODEL", "llama3.1:latest"),
)

def get_weather(
    city: Annotated[str, Field(description="The city to get the weather for.")],
) -> dict:
    """Returns weather data for a given city, a dictionary with temperature and description."""
    if random.random() < 0.05:
        return {
            "temperature": 72,
            "description": "Sunny",
        }
    else:
        return {
            "temperature": 60,
            "description": "Rainy",
        }



async def main():
    response = await client.get_response(
        "Whats weather today in San Francisco?",
        tools=get_weather
    )
    print(response.text)


if __name__ == "__main__":
    asyncio.run(main())
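
In case a workaround helps anyone hitting this: the Chat Completions path doesn't replay Responses-API reasoning items, so routing the same tool through OpenAIChatClient might sidestep the 400. A sketch, appended to the script above; it assumes OpenAIChatClient accepts the same base_url/api_key/model_id arguments as OpenAIResponsesClient and that the NIM image also serves /v1/chat/completions (I haven't verified either against this deployment):

from agent_framework.openai import OpenAIChatClient

# Hypothetical workaround: same endpoint and tool, but via Chat Completions,
# so no Responses-API reasoning items are replayed on the second turn.
chat_client = OpenAIChatClient(
    base_url=os.environ.get("OLLAMA_ENDPOINT", "http://localhost:11434/v1"),
    api_key="none",
    model_id=os.environ.get("OLLAMA_MODEL", "llama3.1:latest"),
)

async def main_chat():
    # Same prompt and tool as main() above.
    response = await chat_client.get_response(
        "Whats weather today in San Francisco?",
        tools=get_weather,
    )
    print(response.text)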

I can also provide the endpoint if you message me on Microsoft Teams. Thanks!

Labels

model clients, python
