Commit e28830d

docs: Update Responses API docs for cross-provider support (#1685)
* docs: Update Responses API docs for cross-provider support

The Responses API endpoint is now available for all configured models. For OpenAI-compatible providers, Spice automatically adapts between Chat Completions and Responses API formats. The responses_api parameter now controls Chat Completions backend selection rather than Responses API availability.

Source: spiceai/spiceai#10724

* trigger CI re-check

---------

Co-authored-by: claudespice <claudespice@users.noreply.github.com>
1 parent cd4f7e8 commit e28830d

2 files changed

Lines changed: 6 additions & 4 deletions

website/docs/components/models/azure.md

Lines changed: 1 addition & 1 deletion
@@ -14,7 +14,7 @@ To use a language model hosted on Azure OpenAI, specify the `azure` path in the
 | `azure_deployment_name` | The name of the model deployment. | Model name |
 | `endpoint` | The Azure OpenAI resource endpoint, e.g., `https://resource-name.openai.azure.com`. | - |
 | `azure_entra_token` | The Azure Entra token for authentication. | - |
-| `responses_api` | `enabled` or `disabled`. Whether to enable invoking this model from the `/v1/responses` HTTP endpoint | `disabled` |
+| `responses_api` | `enabled` or `disabled`. Controls the Chat Completions backend: `enabled` proxies `/v1/chat/completions` through the backend Responses API; `disabled` uses the standard Chat Completions backend. The `/v1/responses` endpoint is available for all models regardless of this setting. | `disabled` |
 | `azure_openai_responses_tools` | Comma-separated list of OpenAI-hosted tools exposed via the Responses API for this model. These hosted tools are **not** available from the `/v1/chat/completions` HTTP endpoint. Supported tools: `code_interpreter`, `web_search`. | - |
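Since the changed row states that `/v1/responses` is available for every model regardless of `responses_api`, a client call might be built as in the sketch below. This is illustrative only: the base URL `http://localhost:8090` and the model name `my_azure_model` are assumptions, not values taken from this commit, and the `model` and `input` fields follow the general shape of an OpenAI Responses API request.

```python
import json
import urllib.request

# Assumption: the runtime's HTTP endpoint is reachable here; adjust as needed.
BASE_URL = "http://localhost:8090"


def build_responses_request(
    model: str, prompt: str, base_url: str = BASE_URL
) -> urllib.request.Request:
    """Build (but do not send) a POST request to the /v1/responses endpoint."""
    body = json.dumps({"model": model, "input": prompt}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/v1/responses",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# "my_azure_model" is a hypothetical model name used only for illustration.
request = build_responses_request("my_azure_model", "Say hello.")

# To actually send it against a running instance:
# with urllib.request.urlopen(request) as response:
#     print(json.load(response))
```

Building the request separately from sending it keeps the payload easy to inspect before pointing it at a live endpoint.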

website/docs/components/models/openai/deployment.md

Lines changed: 5 additions & 3 deletions
@@ -26,7 +26,7 @@ API keys must be sourced from a [secret store](../../secret-stores/) in producti

 ### OpenAI-Compatible Providers

-Set `endpoint` to target any OpenAI-compatible provider (Azure OpenAI, xAI, Groq, Together, on-prem vLLM, etc.). See the [OpenAI model reference](./) for provider-specific configuration examples. When using the Responses API (`responses_api: enabled`), confirm the target provider implements OpenAI's Responses API.
+Set `endpoint` to target any OpenAI-compatible provider (Azure OpenAI, xAI, Groq, Together, on-prem vLLM, etc.). See the [OpenAI model reference](./) for provider-specific configuration examples. The `/v1/responses` endpoint works with all OpenAI-compatible providers through automatic format adaptation. When setting `responses_api: enabled` (which proxies Chat Completions through the Responses API backend), confirm the target provider implements OpenAI's Responses API natively.

 ## Resilience Controls

@@ -64,9 +64,11 @@ The built-in rate controller queues and paces outbound requests to stay within t

 ### Responses API

+All configured models are registered for the `/v1/responses` endpoint. For OpenAI-compatible providers, Spice automatically adapts between Chat Completions and Responses API formats, so `/v1/responses` works even when the backend only supports `/v1/chat/completions`.
+
 | Parameter | Default | Description |
 | ---------------------- | ---------- | ---------------------------------------------------------------------------------------------- |
-| `responses_api` | `disabled` | `enabled` routes `/v1/responses` traffic through the OpenAI Responses API. |
+| `responses_api` | `disabled` | Controls the Chat Completions backend. `disabled` proxies `/v1/chat/completions` to the backend's `/v1/chat/completions`. `enabled` proxies `/v1/chat/completions` to the backend's `/v1/responses`, which can improve tool-use and reasoning for providers that natively support the Responses API. |
 | `openai_responses_tools`| - | Comma-separated list of OpenAI-hosted tools (`code_interpreter`, `web_search`) exposed via Responses. |

 Note: Responses API hosted tools are **not** available from the `/v1/chat/completions` endpoint.
@@ -118,5 +120,5 @@ Chat and Responses operations emit these [task history](../../../reference/task_
 | `401 Unauthorized` | Wrong / revoked API key. | Rotate the key, update the secret store. |
 | `429 rate_limit_exceeded` | Tier budget too low or burst exceeds concurrency. | Raise `openai_usage_tier`, reduce `max_concurrency`, or upgrade the OpenAI tier. |
 | `429 tokens_per_min` errors despite pacing | Spice rate-limits requests, not tokens. | Reduce per-request token budget; throttle via `max_concurrency`. |
-| Responses API returns `404` | Provider does not implement Responses. | Set `responses_api: disabled`; or point at a Responses-capable endpoint. |
+| Responses API returns `404` | Provider does not implement Responses natively and `responses_api: enabled` is set. | Set `responses_api: disabled` (default) to use the automatic Chat Completions adapter for `/v1/responses`. |
 | Slow first-token latency | `stream=false` waits for full completion. | Use `stream=true` for interactive chat. |
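The routing behavior these documentation changes describe can be summarized as a small decision function. The sketch below is a reading aid, not Spice's implementation: `Route` and `route` are hypothetical names, and the `/v1/responses` with `responses_api: enabled` case is inferred from the updated `404` troubleshooting row rather than stated outright in the diff.

```python
from dataclasses import dataclass

CHAT = "/v1/chat/completions"
RESPONSES = "/v1/responses"


@dataclass(frozen=True)
class Route:
    backend_endpoint: str  # endpoint called on the upstream provider
    adapted: bool          # True when formats are converted between the two APIs


def route(client_endpoint: str, responses_api: str = "disabled") -> Route:
    """Model which backend endpoint serves a client request (illustrative only)."""
    if responses_api not in ("enabled", "disabled"):
        raise ValueError("responses_api must be 'enabled' or 'disabled'")
    if client_endpoint not in (CHAT, RESPONSES):
        raise ValueError(f"unsupported endpoint: {client_endpoint}")

    # `enabled` selects the backend Responses API; `disabled` (the default)
    # selects the backend Chat Completions API.
    backend = RESPONSES if responses_api == "enabled" else CHAT
    # Adaptation is needed whenever the client and backend formats differ.
    return Route(backend_endpoint=backend, adapted=backend != client_endpoint)


for endpoint in (CHAT, RESPONSES):
    for setting in ("disabled", "enabled"):
        print(endpoint, setting, route(endpoint, setting))
```

Under this reading, the `404` case in the table corresponds to `responses_api: enabled` against a provider that serves only Chat Completions, which is why the documented fix is to fall back to the default `disabled` setting.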
