Open
Description
I'm using vscode-openai 1.6.18 with code-server 4.100.3 (Code 1.100.3), working through a LiteLLM proxy (1.52.0). When I try to use the conversation feature, the message "waiting for response" appears, but no response ever arrives.
The only log message (Log level is set to Debug) that appears is:
2025-06-23 16:08:28.234 [info] chat-completion - event properties
{
"service_provider": "Custom-OpenAI",
"default_model": "gpt-4o-2024-08-06",
"tokens_prompt": "0",
"tokens_completion": "0",
"tokens_total": "12",
"tokens_session": "12"
}
The server reports receiving the POST and gives a 200 response. When the command
curl -X POST \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $OPENAI_API_KEY" \
-d '{
"model": "openai/gpt-4o",
"messages": [
{"role": "user", "content": "What models do you have?"}
]
}' \
https://bearborg.berkeley.edu:4433/v1/chat/completions
is run from the command line, I get this response:
{"id":"chatcmpl-Ble48wFvPUsFLABp0SA46tbQVAOPY","choices":[{"finish_reason":"stop","index":0,"message":{"content":"I am based on OpenAI's GPT-3.5 architecture, which is designed to understand and generate human-like text. If you're referring to specific versions or variations within the GPT models, I primarily operate using the capabilities and updates provided up to the GPT-3.5 series. If you have questions about specific functionalities or need assistance with something, feel free to ask!","role":"assistant","tool_calls":null,"function_call":null}}],"created":1750695128,"model":"gpt-4o-2024-08-06","object":"chat.completion","system_fingerprint":"fp_ee1d74bde0","usage":{"completion_tokens":76,"prompt_tokens":13,"total_tokens":89,"completion_tokens_details":{"accepted_prediction_tokens":0,"audio_tokens":0,"reasoning_tokens":0,"rejected_prediction_tokens":0},"prompt_tokens_details":{"audio_tokens":0,"cached_tokens":0}},"service_tier":null,"prompt_filter_results":[{"prompt_index":0,"content_filter_results":{"hate":{"filtered":false,"severity":"safe"},"jailbreak":{"filtered":false,"detected":false},"self_harm":{"filtered":false,"severity":"safe"},"sexual":{"filtered":false,"severity":"safe"},"violence":{"filtered":false,"severity":"safe"}}}]}
The settings for vscode-openai are:
{
"vscode-openai.serviceProvider": "Custom-OpenAI",
"vscode-openai.baseUrl": "https://bearborg.berkeley.edu:4433/v1/",
// "vscode-openai.defaultModel": "openai/gpt-4o" // Set this to the *exact* name of the model you defined in your LiteLLM config, e.g., "gpt-4o" or "gemini-1.5-pro-litellm"
"vscode-openai.defaultModel": "gpt-4o-2024-08-06",
"vscode-openai.logLevel": "Debug",
"vscode-openai.conversation-configuration.max-tokens": 50,
"vscode-openai.conversation-configuration.summary-threshold": 3
}
My guess is that vscode-openai isn't detecting termination of the response for some reason.
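One way to test this guess, assuming the extension requests a streamed response (nothing above confirms that), is to capture the proxy's streaming output and check how it ends. An OpenAI-style stream sends `data: {...}` chunks and finishes with a `data: [DONE]` line. The sketch below only parses such lines; `summarize_sse` and `StreamSummary` are hypothetical names made up for this illustration, and the sample lines are invented, not output from this server.

```python
import json
from typing import Iterable, NamedTuple, Optional


class StreamSummary(NamedTuple):
    text: str                     # concatenated delta content
    finish_reason: Optional[str]  # last non-null finish_reason seen
    saw_done: bool                # True if the "data: [DONE]" sentinel arrived


def summarize_sse(lines: Iterable[str]) -> StreamSummary:
    """Scan OpenAI-style SSE lines and report how the stream ended."""
    parts = []
    finish_reason = None
    saw_done = False
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators and SSE comment lines
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            saw_done = True
            break
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            delta = choice.get("delta") or {}
            parts.append(delta.get("content") or "")
            finish_reason = choice.get("finish_reason") or finish_reason
    return StreamSummary("".join(parts), finish_reason, saw_done)


# Invented sample of a well-formed stream (not captured from the proxy).
sample = [
    'data: {"choices":[{"index":0,"delta":{"role":"assistant","content":"Hi"},"finish_reason":null}]}',
    "",
    'data: {"choices":[{"index":0,"delta":{"content":" there"},"finish_reason":null}]}',
    "",
    'data: {"choices":[{"index":0,"delta":{},"finish_reason":"stop"}]}',
    "",
    "data: [DONE]",
]

if __name__ == "__main__":
    print(summarize_sse(sample))
```

To use it against the real proxy, the curl command above could be rerun with `"stream": true` added to the JSON body and curl's `-N` (no-buffer) flag, saving the output to a file and passing its lines to `summarize_sse`. If `saw_done` is false or `finish_reason` is missing, that would point at the stream never being terminated; if both look normal, the problem is more likely elsewhere.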