|
2 | 2 | "cells": [ |
3 | 3 | { |
4 | 4 | "cell_type": "markdown", |
5 | | - "id": "a4ac4d55", |
| 5 | + "id": "32cfb8c7", |
6 | 6 | "metadata": {}, |
7 | 7 | "source": [ |
8 | 8 | "# 🎨 Data Designer Tutorial: The Basics\n", |
|
14 | 14 | }, |
15 | 15 | { |
16 | 16 | "cell_type": "markdown", |
17 | | - "id": "9e9f3c47", |
| 17 | + "id": "988fb995", |
18 | 18 | "metadata": {}, |
19 | 19 | "source": [ |
20 | 20 | "### ⚡ Colab Setup\n", |
|
25 | 25 | { |
26 | 26 | "cell_type": "code", |
27 | 27 | "execution_count": null, |
28 | | - "id": "41b31194", |
| 28 | + "id": "789ff539", |
29 | 29 | "metadata": {}, |
30 | 30 | "outputs": [], |
31 | 31 | "source": [ |
|
35 | 35 | { |
36 | 36 | "cell_type": "code", |
37 | 37 | "execution_count": null, |
38 | | - "id": "502b3aba", |
| 38 | + "id": "7a1e9285", |
39 | 39 | "metadata": {}, |
40 | 40 | "outputs": [], |
41 | 41 | "source": [ |
|
52 | 52 | }, |
53 | 53 | { |
54 | 54 | "cell_type": "markdown", |
55 | | - "id": "8c512fbc", |
| 55 | + "id": "398ab04c", |
56 | 56 | "metadata": {}, |
57 | 57 | "source": [ |
58 | 58 | "### 📦 Import the essentials\n", |
|
63 | 63 | { |
64 | 64 | "cell_type": "code", |
65 | 65 | "execution_count": null, |
66 | | - "id": "8fae521f", |
| 66 | + "id": "ceff1922", |
67 | 67 | "metadata": {}, |
68 | 68 | "outputs": [], |
69 | 69 | "source": [ |
70 | 70 | "from data_designer.essentials import (\n", |
71 | 71 | " CategorySamplerParams,\n", |
| 72 | + " ChatCompletionInferenceParams,\n", |
72 | 73 | " DataDesigner,\n", |
73 | 74 | " DataDesignerConfigBuilder,\n", |
74 | | - " InferenceParameters,\n", |
75 | 75 | " LLMTextColumnConfig,\n", |
76 | 76 | " ModelConfig,\n", |
77 | 77 | " PersonFromFakerSamplerParams,\n", |
|
84 | 84 | }, |
85 | 85 | { |
86 | 86 | "cell_type": "markdown", |
87 | | - "id": "e71d0256", |
| 87 | + "id": "7a4f633d", |
88 | 88 | "metadata": {}, |
89 | 89 | "source": [ |
90 | 90 | "### ⚙️ Initialize the Data Designer interface\n", |
|
97 | 97 | { |
98 | 98 | "cell_type": "code", |
99 | 99 | "execution_count": null, |
100 | | - "id": "68fc7172", |
| 100 | + "id": "06c7ef7d", |
101 | 101 | "metadata": {}, |
102 | 102 | "outputs": [], |
103 | 103 | "source": [ |
|
106 | 106 | }, |
107 | 107 | { |
108 | 108 | "cell_type": "markdown", |
109 | | - "id": "9a821a27", |
| 109 | + "id": "3d2eb368", |
110 | 110 | "metadata": {}, |
111 | 111 | "source": [ |
112 | 112 | "### 🎛️ Define model configurations\n", |
|
123 | 123 | { |
124 | 124 | "cell_type": "code", |
125 | 125 | "execution_count": null, |
126 | | - "id": "a9515141", |
| 126 | + "id": "db859391", |
127 | 127 | "metadata": {}, |
128 | 128 | "outputs": [], |
129 | 129 | "source": [ |
|
144 | 144 | " alias=MODEL_ALIAS,\n", |
145 | 145 | " model=MODEL_ID,\n", |
146 | 146 | " provider=MODEL_PROVIDER,\n", |
147 | | - " inference_parameters=InferenceParameters(\n", |
| 147 | + " inference_parameters=ChatCompletionInferenceParams(\n", |
148 | 148 | " temperature=0.5,\n", |
149 | 149 | " top_p=1.0,\n", |
150 | 150 | " max_tokens=1024,\n", |
|
155 | 155 | }, |
156 | 156 | { |
157 | 157 | "cell_type": "markdown", |
158 | | - "id": "3b940ab9", |
| 158 | + "id": "9264dfaf", |
159 | 159 | "metadata": {}, |
160 | 160 | "source": [ |
161 | 161 | "### 🏗️ Initialize the Data Designer Config Builder\n", |
|
170 | 170 | { |
171 | 171 | "cell_type": "code", |
172 | 172 | "execution_count": null, |
173 | | - "id": "ec21da7e", |
| 173 | + "id": "6c78af77", |
174 | 174 | "metadata": {}, |
175 | 175 | "outputs": [], |
176 | 176 | "source": [ |
|
179 | 179 | }, |
180 | 180 | { |
181 | 181 | "cell_type": "markdown", |
182 | | - "id": "85b2324e", |
| 182 | + "id": "2ea6b7fb", |
183 | 183 | "metadata": {}, |
184 | 184 | "source": [ |
185 | 185 | "## 🎲 Getting started with sampler columns\n", |
|
196 | 196 | { |
197 | 197 | "cell_type": "code", |
198 | 198 | "execution_count": null, |
199 | | - "id": "f49f435e", |
| 199 | + "id": "aba21308", |
200 | 200 | "metadata": {}, |
201 | 201 | "outputs": [], |
202 | 202 | "source": [ |
|
205 | 205 | }, |
206 | 206 | { |
207 | 207 | "cell_type": "markdown", |
208 | | - "id": "f582b642", |
| 208 | + "id": "23620264", |
209 | 209 | "metadata": {}, |
210 | 210 | "source": [ |
211 | 211 | "Let's start designing our product review dataset by adding product category and subcategory columns.\n" |
|
214 | 214 | { |
215 | 215 | "cell_type": "code", |
216 | 216 | "execution_count": null, |
217 | | - "id": "8cfc43b1", |
| 217 | + "id": "a41a75ae", |
218 | 218 | "metadata": {}, |
219 | 219 | "outputs": [], |
220 | 220 | "source": [ |
|
295 | 295 | }, |
296 | 296 | { |
297 | 297 | "cell_type": "markdown", |
298 | | - "id": "2d0eea21", |
| 298 | + "id": "b8449bc2", |
299 | 299 | "metadata": {}, |
300 | 300 | "source": [ |
301 | 301 | "Next, let's add samplers to generate data related to the customer and their review.\n" |
|
304 | 304 | { |
305 | 305 | "cell_type": "code", |
306 | 306 | "execution_count": null, |
307 | | - "id": "b5e65724", |
| 307 | + "id": "5c115016", |
308 | 308 | "metadata": {}, |
309 | 309 | "outputs": [], |
310 | 310 | "source": [ |
|
341 | 341 | }, |
342 | 342 | { |
343 | 343 | "cell_type": "markdown", |
344 | | - "id": "e6788771", |
| 344 | + "id": "52afce00", |
345 | 345 | "metadata": {}, |
346 | 346 | "source": [ |
347 | 347 | "## 🦜 LLM-generated columns\n", |
|
356 | 356 | { |
357 | 357 | "cell_type": "code", |
358 | 358 | "execution_count": null, |
359 | | - "id": "a2705cd9", |
| 359 | + "id": "c5d8a438", |
360 | 360 | "metadata": {}, |
361 | 361 | "outputs": [], |
362 | 362 | "source": [ |
|
393 | 393 | }, |
394 | 394 | { |
395 | 395 | "cell_type": "markdown", |
396 | | - "id": "e3dd2f69", |
| 396 | + "id": "5cdba8c3", |
397 | 397 | "metadata": {}, |
398 | 398 | "source": [ |
399 | 399 | "### 🔁 Iteration is key – preview the dataset!\n", |
|
410 | 410 | { |
411 | 411 | "cell_type": "code", |
412 | 412 | "execution_count": null, |
413 | | - "id": "c6e43147", |
| 413 | + "id": "b06081ce", |
414 | 414 | "metadata": {}, |
415 | 415 | "outputs": [], |
416 | 416 | "source": [ |
|
420 | 420 | { |
421 | 421 | "cell_type": "code", |
422 | 422 | "execution_count": null, |
423 | | - "id": "fab77d01", |
| 423 | + "id": "2de2bfbd", |
424 | 424 | "metadata": {}, |
425 | 425 | "outputs": [], |
426 | 426 | "source": [ |
|
431 | 431 | { |
432 | 432 | "cell_type": "code", |
433 | 433 | "execution_count": null, |
434 | | - "id": "875ee6a6", |
| 434 | + "id": "5e71a6b4", |
435 | 435 | "metadata": {}, |
436 | 436 | "outputs": [], |
437 | 437 | "source": [ |
|
441 | 441 | }, |
442 | 442 | { |
443 | 443 | "cell_type": "markdown", |
444 | | - "id": "87b59e4b", |
| 444 | + "id": "4e601906", |
445 | 445 | "metadata": {}, |
446 | 446 | "source": [ |
447 | 447 | "### 📊 Analyze the generated data\n", |
|
454 | 454 | { |
455 | 455 | "cell_type": "code", |
456 | 456 | "execution_count": null, |
457 | | - "id": "5d347f4c", |
| 457 | + "id": "e8ac7b80", |
458 | 458 | "metadata": {}, |
459 | 459 | "outputs": [], |
460 | 460 | "source": [ |
|
464 | 464 | }, |
465 | 465 | { |
466 | 466 | "cell_type": "markdown", |
467 | | - "id": "d2fb84f2", |
| 467 | + "id": "62858e97", |
468 | 468 | "metadata": {}, |
469 | 469 | "source": [ |
470 | 470 | "### 🆙 Scale up!\n", |
|
477 | 477 | { |
478 | 478 | "cell_type": "code", |
479 | 479 | "execution_count": null, |
480 | | - "id": "71a31e85", |
| 480 | + "id": "9fb5bf0a", |
481 | 481 | "metadata": {}, |
482 | 482 | "outputs": [], |
483 | 483 | "source": [ |
|
487 | 487 | { |
488 | 488 | "cell_type": "code", |
489 | 489 | "execution_count": null, |
490 | | - "id": "501e9092", |
| 490 | + "id": "60585e66", |
491 | 491 | "metadata": {}, |
492 | 492 | "outputs": [], |
493 | 493 | "source": [ |
|
500 | 500 | { |
501 | 501 | "cell_type": "code", |
502 | 502 | "execution_count": null, |
503 | | - "id": "6f217b4a", |
| 503 | + "id": "ecde0529", |
504 | 504 | "metadata": {}, |
505 | 505 | "outputs": [], |
506 | 506 | "source": [ |
|
512 | 512 | }, |
513 | 513 | { |
514 | 514 | "cell_type": "markdown", |
515 | | - "id": "4da82b0f", |
| 515 | + "id": "1e3c3650", |
516 | 516 | "metadata": {}, |
517 | 517 | "source": [ |
518 | 518 | "## ⏭️ Next Steps\n", |
|