This example demonstrates how to set up Prompts to predict image captions.

!!! note
    Prompts does not currently support image data uploaded as raw images. Only image references (HTTP URIs to images) or images imported via cloud storage are supported.
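For example, a tasks file of image references could be generated like this (a sketch: the file name and URLs are placeholders, and the `image` key should match the variable used in your label config):

```python
import json

# Each Label Studio task references an image by URL rather than uploading raw
# image data; the "image" key must match the variable in the label config
tasks = [
    {"data": {"image": "https://example.com/images/photo1.jpg"}},
    {"data": {"image": "https://example.com/images/photo2.jpg"}},
]

# Write a JSON file that can be imported into the project
with open("captioning_tasks.json", "w") as f:
    json.dump(tasks, f, indent=2)
```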
2. Create a [label config](setup) for image captioning (or Ask AI to create one for you), for example:

```xml
<View>
  <!-- … -->
```
!!! note
    Ensure you include `{image}` in your instructions. Click `image` above the instruction field to insert it.

!!! info Tip
    You can also automatically generate the instructions using the [**Enhance Prompt** action](prompts_draft#Enhance-prompt). Before you can use this action, you must at least add the variable name `{image}` and then click **Save**.

5. Run the prompt! View predictions to accept or correct.

You can [read more about evaluation metrics](prompts_draft#Evaluation-results) and ways to assess your prompt performance.

!!! info Tip
    You can change the subset of data being used (e.g. only data with Ground Truth annotations, or a small sample of records).

6. Accept the [predictions as annotations](prompts_predictions#Create-annotations-from-predictions)!

### Evaluate LLM outputs for toxicity

This example demonstrates how to set up Prompts to evaluate whether LLM-generated outputs are toxic.

For example, you can use the [jigsaw_toxicity](https://huggingface.co/datasets/tasksource/jigsaw_toxicity) dataset. See [the appendix](#Appendix-Preprocess-jigsaw-toxicity-dataset) for how to preprocess and (optionally) downsample this dataset for use with this guide.

2. Create a [label config](setup) for toxicity detection (or Ask AI to create one for you), for example:

```xml
<View>
  <!-- … -->
```
!!! note
    Ensure you include `{comment_text}` in your instructions. Click `comment_text` above the instruction field to insert it.

!!! info Tip
    You can also automatically generate the instructions using the [**Enhance Prompt** action](prompts_draft#Enhance-prompt). Before you can use this action, you must at least add the variable name `{comment_text}` and then click **Save**.

5. Run the prompt! View predictions to accept or correct.

You can [read more about evaluation metrics](prompts_draft#Evaluation-results) and ways to assess your prompt performance.

!!! info Tip
    You can change the subset of data being used (e.g. only data with Ground Truth annotations, or a small sample of records).

6. Accept the [predictions as annotations](prompts_predictions#Create-annotations-from-predictions)!

#### Appendix: Preprocess jigsaw toxicity dataset

Download the jigsaw_toxicity dataset, then downsample/format it using the following script (modify the `INPUT_PATH` and `OUTPUT_PATH` to suit your needs):
```python
# …
with open(OUTPUT_PATH, "w") as f:
    # …
```

You could also change how many records to use (or use the entire dataset by removing the sample step).
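For reference, a minimal version of such a downsample/format script might look like the following (a sketch: it assumes a CSV export of the dataset with a `comment_text` column, and the sample size and paths are placeholders):

```python
import json

import pandas as pd

INPUT_PATH = "jigsaw_toxicity.csv"   # placeholder path to the downloaded CSV
OUTPUT_PATH = "toxicity_tasks.json"  # placeholder path for the Label Studio tasks

df = pd.read_csv(INPUT_PATH)

# Downsample to keep the example fast; remove this line to use the entire dataset
df = df.sample(n=100, random_state=42)

# Label Studio tasks are a list of {"data": {...}} dicts whose keys match the
# variables in the label config (here: comment_text)
tasks = [{"data": {"comment_text": text}} for text in df["comment_text"]]

with open(OUTPUT_PATH, "w") as f:
    json.dump(tasks, f, indent=2)
```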
### Generate Synthetic Q&A Datasets

#### Overview

Synthetic datasets are generated artificially rather than collected from real-world observations. They encode characteristics similar to real data, but let you scale up data diversity or fill gaps in data volume for general-purpose applications such as model training and evaluation. Synthetic data also works well for enhancing AI systems whose inputs and outputs are open-ended human language: chatbot questions and answers, test datasets for evaluation, and rich knowledge datasets for contextual retrieval. LLMs are particularly effective at generating synthetic datasets for these use cases, letting you improve your AI system's performance by giving it more diverse data to learn from.

#### Example

Let's expand on the Q&A use case above with an example demonstrating how to use Prompts to generate synthetic user prompts for a chatbot RAG system. Given a dataset of chatbot answers, we'll generate questions that could return each answer.

1. [Create a new Label Studio project](setup_project) by importing chunks of text that would be meaningful answers from a chatbot.

You can use a preprocessed sample of the [SQuAD](https://huggingface.co/datasets/rajpurkar/squad) dataset as an example. See [the appendix](#Appendix-Preprocess-SQuAD-Q-A-dataset) for how this was generated.

2. Create a [label config](setup) for question generation (or Ask AI to create one for you), for example:

```xml
<View>
  <Header value="Context" />
  <Text name="context" value="$context" />
  <Header value="Answer" />
  <Text name="answer" value="$answer" />

  <Header value="Questions" />
  <TextArea name="question1" toName="context"
            placeholder="Enter question 1"
            rows="2"
            maxSubmissions="1" />

  <TextArea name="question2" toName="context"
            placeholder="Enter question 2"
            rows="2"
            maxSubmissions="1" />

  <TextArea name="question3" toName="context"
            placeholder="Enter question 3"
            rows="2"
            maxSubmissions="1" />
</View>
```

3. Navigate to **Prompts** from the sidebar and [create a prompt](prompts_create) for the project.

If you have not yet set up the API keys you want to use, do that now: [API keys](prompts_create#Model-provider-API-keys).

4. Add instructions to create 3 questions:

*Using the "context" below as context, come up with 3 questions ("question1", "question2", and "question3") for which the appropriate answer would be the "answer" below:*

*Context:*

*---*

*{context}*

*---*

*Answer:*

*---*

*{answer}*

*---*

!!! note
    Ensure you include `{answer}` and `{context}` in your instructions. Click `answer`/`context` above the instruction field to insert them.

!!! info Tip
    You can also automatically generate the instructions using the [**Enhance Prompt** action](prompts_draft#Enhance-prompt). Before you can use this action, you must at least add a variable name (e.g. `{context}` or `{answer}`) and then click **Save**.

5. Run the prompt! View predictions to accept or correct.

You can [read more about evaluation metrics](prompts_draft#Evaluation-results) and ways to assess your prompt performance.

!!! info Tip
    You can change the subset of data being used (e.g. only data with Ground Truth annotations, or a small sample of records).

6. Accept the [predictions as annotations](prompts_predictions#Create-annotations-from-predictions)!

#### Appendix: Preprocess SQuAD Q&A dataset

This downloads the SQuAD dataset from Hugging Face and formats it for use in Label Studio.
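A minimal version of such a script, using the Hugging Face `datasets` library, might look like this (a sketch: the split size and output path are illustrative, and the `context`/`answer` keys match the label config above):

```python
import json

from datasets import load_dataset

OUTPUT_PATH = "squad_tasks.json"  # illustrative output path

# Download a small slice of SQuAD from the Hugging Face Hub
squad = load_dataset("rajpurkar/squad", split="train[:200]")

# Format each record as a Label Studio task; the "context" and "answer" keys
# correspond to the $context and $answer variables in the label config above
tasks = [
    {"data": {"context": row["context"], "answer": row["answers"]["text"][0]}}
    for row in squad
]

with open(OUTPUT_PATH, "w") as f:
    json.dump(tasks, f, indent=2)
```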
0 commit comments