You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
docs: Update Responses API docs for cross-provider support (#1685)
* docs: Update Responses API docs for cross-provider support
The Responses API endpoint is now available for all configured models.
For OpenAI-compatible providers, Spice automatically adapts between
Chat Completions and Responses API formats. The responses_api parameter
now controls Chat Completions backend selection rather than Responses
API availability.
Source: spiceai/spiceai#10724
* trigger CI re-check
---------
Co-authored-by: claudespice <claudespice@users.noreply.github.com>
|`azure_entra_token`| The Azure Entra token for authentication. | - |
17
-
|`responses_api`|`enabled` or `disabled`. Whether to enable invoking this model from the `/v1/responses`HTTP endpoint |`disabled`|
17
+
|`responses_api`|`enabled` or `disabled`. Controls the Chat Completions backend: `enabled` proxies `/v1/chat/completions` through the backend Responses API; `disabled` uses the standard Chat Completions backend. The `/v1/responses` endpoint is available for all models regardless of this setting.|`disabled`|
18
18
|`azure_openai_responses_tools`| Comma-separated list of OpenAI-hosted tools exposed via the Responses API for this model. These hosted tools are **not** available from the `/v1/chat/completions` HTTP endpoint. Supported tools: `code_interpreter`, `web_search`. | - |
Copy file name to clipboardExpand all lines: website/docs/components/models/openai/deployment.md
+5-3Lines changed: 5 additions & 3 deletions
Display the source diff
Display the rich diff
Original file line number
Diff line number
Diff line change
@@ -26,7 +26,7 @@ API keys must be sourced from a [secret store](../../secret-stores/) in producti
26
26
27
27
### OpenAI-Compatible Providers
28
28
29
-
Set `endpoint` to target any OpenAI-compatible provider (Azure OpenAI, xAI, Groq, Together, on-prem vLLM, etc.). See the [OpenAI model reference](./) for provider-specific configuration examples. When using the Responses API (`responses_api: enabled`), confirm the target provider implements OpenAI's Responses API.
29
+
Set `endpoint` to target any OpenAI-compatible provider (Azure OpenAI, xAI, Groq, Together, on-prem vLLM, etc.). See the [OpenAI model reference](./) for provider-specific configuration examples. The `/v1/responses` endpoint works with all OpenAI-compatible providers through automatic format adaptation. When setting `responses_api: enabled` (which proxies Chat Completions through the Responses API backend), confirm the target provider implements OpenAI's Responses API natively.
30
30
31
31
## Resilience Controls
32
32
@@ -64,9 +64,11 @@ The built-in rate controller queues and paces outbound requests to stay within t
64
64
65
65
### Responses API
66
66
67
+
All configured models are registered for the `/v1/responses` endpoint. For OpenAI-compatible providers, Spice automatically adapts between Chat Completions and Responses API formats, so `/v1/responses` works even when the backend only supports `/v1/chat/completions`.
|`responses_api`|`disabled`|`enabled` routes`/v1/responses` traffic through the OpenAI Responses API. |
71
+
|`responses_api`|`disabled`|Controls the Chat Completions backend. `disabled` proxies`/v1/chat/completions` to the backend's `/v1/chat/completions`. `enabled` proxies `/v1/chat/completions` to the backend's `/v1/responses`, which can improve tool-use and reasoning for providers that natively support the Responses API.|
70
72
|`openai_responses_tools`| - | Comma-separated list of OpenAI-hosted tools (`code_interpreter`, `web_search`) exposed via Responses. |
71
73
72
74
Note: Responses API hosted tools are **not** available from the `/v1/chat/completions` endpoint.
@@ -118,5 +120,5 @@ Chat and Responses operations emit these [task history](../../../reference/task_
118
120
|`401 Unauthorized`| Wrong / revoked API key. | Rotate the key, update the secret store. |
119
121
|`429 rate_limit_exceeded`| Tier budget too low or burst exceeds concurrency. | Raise `openai_usage_tier`, reduce `max_concurrency`, or upgrade the OpenAI tier. |
120
122
|`429 tokens_per_min` errors despite pacing | Spice rate-limits requests, not tokens. | Reduce per-request token budget; throttle via `max_concurrency`. |
121
-
| Responses API returns `404`| Provider does not implement Responses. | Set `responses_api: disabled`; or point at a Responses-capable endpoint. |
123
+
| Responses API returns `404`| Provider does not implement Responses natively and `responses_api: enabled` is set. | Set `responses_api: disabled` (default) to use the automatic Chat Completions adapter for `/v1/responses`.|
122
124
| Slow first-token latency |`stream=false` waits for full completion. | Use `stream=true` for interactive chat. |
0 commit comments