docs/source/guide/prompts_keys.md (+11 −2)
@@ -102,7 +102,7 @@ You can find all this information in the **Details** section of the deployment i
You can use your own self-hosted and fine-tuned model as long as it meets the following criteria:
- * Your server must provide [JSON mode](https://python.useinstructor.com/concepts/patching/#json-mode) for the LLM.
+ * Your server must provide [JSON mode](https://js.useinstructor.com/concepts/patching/#json-schema-mode) for the LLM. Specifically, the API must accept `response_format` with `type: json_object` and a `schema` field containing a valid JSON schema: `{"response_format": {"type": "json_object", "schema": <schema>}}`
* The server API must follow [OpenAI format](https://platform.openai.com/docs/api-reference/chat/create#chat-create-response_format).
Examples of compatible LLMs include [Ollama](https://ollama.com/) and [sglang](https://github.com/sgl-project/sglang?tab=readme-ov-file#openai-compatible-api).
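To make the required request shape concrete, a call to a compatible server could look like the sketch below; the endpoint URL, model name, and schema are illustrative placeholders.

```bash
# Illustrative sketch of the expected request shape; the endpoint, model name,
# and schema below are placeholders.
curl https://my.openai.endpoint.com/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "my-model",
    "messages": [{"role": "user", "content": "Classify the sentiment of this review: I love it!"}],
    "response_format": {
      "type": "json_object",
      "schema": {
        "type": "object",
        "properties": {"sentiment": {"type": "string"}},
        "required": ["sentiment"]
      }
    }
  }'
```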
@@ -114,7 +114,7 @@ To add a custom model, enter the following:
* An API key to access the model. An API key is tied to a specific account, but once added, access is shared within the org. (Optional)
* An auth token to access the model API. An auth token provides API access at the server level. (Optional)
- ### Example
+ ### Example with Ollama
1. Set up [Ollama](https://ollama.com/), e.g. `ollama run llama3.2`
2. [Verify your local OpenAI-compatible API is working](https://ollama.com/blog/openai-compatibility), e.g. `http://localhost:11434/v1` (a sample request is sketched below)
@@ -124,3 +124,12 @@ To add a custom model, enter the following:
- Endpoint: `https://my.openai.endpoint.com/v1` (note `v1` suffix is required)
- API key: `ollama` (default)
- Auth token: empty
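As a quick check for step 2, a minimal request against the local endpoint might look like this (assumes the `llama3.2` model pulled in step 1):

```bash
# Minimal sanity check against Ollama's local OpenAI-compatible API
# (assumes `ollama run llama3.2` from step 1).
curl http://localhost:11434/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "llama3.2",
    "messages": [{"role": "user", "content": "Say hello"}]
  }'
```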
+
+
+ ### Example with Hugging Face Inference Endpoints
+ 1. Use the [DeepSeek model](https://huggingface.co/deepseek-ai/DeepSeek-R1)
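If the Inference Endpoint exposes the OpenAI-compatible `/v1` route, a connection check might look like the following sketch; the endpoint URL and token are placeholders for your own deployment.

```bash
# Sketch only: replace the endpoint URL and HF_TOKEN with your own
# Inference Endpoint details; assumes an OpenAI-compatible /v1 route.
curl https://<your-endpoint>.endpoints.huggingface.cloud/v1/chat/completions \
  -H "Authorization: Bearer $HF_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "deepseek-ai/DeepSeek-R1",
    "messages": [{"role": "user", "content": "Say hello"}]
  }'
```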