Commit a35046b

Darinochka and voorhs authored
feat: added tutoral (#164)
* feat: added tutoral
* add cross references
* fix codestyle

---------

Co-authored-by: voorhs <[email protected]>
1 parent e1753f5 commit a35046b

5 files changed, +105 -1 lines changed

autointent/generation/intents/_description_generation.py

+2
@@ -139,6 +139,8 @@ def generate_descriptions(
         prompt: Template for model prompt with placeholders for intent_name,
             user_utterances, and regex_patterns.
         model_name: OpenAI model identifier for generating descriptions.
+
+    See :ref:`intent_description_generation` tutorial.
     """
     samples = []
     for split in dataset.values():
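
The docstring above names three placeholders (intent_name, user_utterances, regex_patterns). As a rough illustration of how such a template could be filled, here is a minimal sketch using plain `str.format`. The `render_prompt` helper and the `"none"` fallback are hypothetical and are not taken from the library; how `generate_descriptions` actually renders its prompt is not shown in this diff.

```python
# Hypothetical stand-in for the prompt text; the library's own template
# class is PromptDescription, whose internals are not shown in this commit.
TEMPLATE = (
    "Describe intent {intent_name} with examples: {user_utterances} "
    "and patterns: {regex_patterns}"
)


def render_prompt(intent_name, utterances, regex_patterns, template=TEMPLATE):
    """Fill the three placeholders, joining list values into one string each."""
    return template.format(
        intent_name=intent_name,
        # An empty list becomes "none" so the prompt never ends in a bare colon
        # (an assumption made for this sketch).
        user_utterances=", ".join(utterances) or "none",
        regex_patterns=", ".join(regex_patterns) or "none",
    )


prompt = render_prompt(
    "book_flight",
    ["I need a flight to Paris", "book me a plane ticket"],
    [],
)
print(prompt)
```

Only the template string is parsed by `str.format`, so regex patterns that themselves contain braces can be passed as values without escaping.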

autointent/generation/utterances/_evolution/incremental_evolver.py

+1 -1
@@ -43,7 +43,7 @@ class IncrementalUtteranceEvolver(UtteranceEvolver):
     """Incremental evolutionary strategy to augmenting utterances.
 
     This method adds LLM-generated training samples until the quality
-    of linear classification on resulting dataset is rising.
+    of linear classification on resulting dataset stops rising.
 
     Args:
         generator: Generator instance for generating utterances.
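
The corrected docstring describes a stopping rule: keep adding generated samples until classification quality stops rising. The loop below is a minimal sketch of that idea only. The `generate` and `score` callables, the `max_rounds` cap, and the choice to discard the last non-improving batch are all assumptions made for illustration; the actual `IncrementalUtteranceEvolver` may differ on each point.

```python
from typing import Callable, List


def evolve_incrementally(
    seed: List[str],
    generate: Callable[[List[str]], List[str]],
    score: Callable[[List[str]], float],
    max_rounds: int = 10,
) -> List[str]:
    """Add generated samples round by round; stop once the score stops rising."""
    data = list(seed)
    best = score(data)
    for _ in range(max_rounds):
        candidate = data + generate(data)
        new_score = score(candidate)
        if new_score <= best:
            # Quality is no longer rising: drop this round's batch and stop.
            break
        data, best = candidate, new_score
    return data


# Toy stand-ins: each round adds one sample, and the score saturates at 5.
result = evolve_incrementally(
    seed=["hello"],
    generate=lambda data: [f"variant {len(data)}"],
    score=lambda data: float(min(len(data), 5)),
)
print(len(result))
```

In the toy run the dataset grows from one sample to five, after which a sixth sample no longer raises the score and the loop exits.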

docs/source/augmentation_tutorials/index.rst

+1
@@ -8,3 +8,4 @@ Data augmentation tutorials
 
    balancer
    dspy_augmentation
+   intent_description
@@ -0,0 +1,100 @@
+.. _intent_description_generation:
+
+Intent Description Generation
+#############################
+
+This documentation covers the implementation and usage of the Intent Description Generation module. It explains the function of the module, the underlying mechanisms, and provides examples of usage.
+
+The approach used in this module is based on the paper `Exploring Description-Augmented Dataless Intent Classification <https://arxiv.org/pdf/2407.17862>`_.
+
+.. contents:: Table of Contents
+    :depth: 2
+
+Overview
+--------
+
+The Intent Description Generation module is designed to automatically generate detailed and coherent descriptions of intents using large language models (LLMs). It enhances datasets by creating human-readable explanations for intents, supplemented by examples (utterances) and regex patterns.
+
+How the Module Works
+--------------------
+
+The module leverages prompt engineering to interact with LLMs, creating structured intent descriptions that are suitable for documentation, user interaction, and training purposes. Each generated description includes:
+
+- **Intent Name**: Clearly identifies the intent.
+- **Examples (User Utterances)**: Demonstrates real-world user inputs.
+- **Regex Patterns**: Highlights relevant regex patterns associated with the intent.
+
+The module uses a templated approach, defined through `PromptDescription`, to maintain consistency and clarity across descriptions.
+
+Installation
+------------
+
+Ensure you have the necessary dependencies installed:
+
+.. code-block:: bash
+
+    pip install autointent openai
+
+Usage
+-----
+
+Here's an example demonstrating how to generate intent descriptions:
+
+.. code-block:: python
+
+    import openai
+    from autointent import Dataset
+    from autointent.generation.intents import generate_descriptions
+    from autointent.generation.chat_templates import PromptDescription
+
+    client = openai.AsyncOpenAI(
+        api_key="your-api-key"
+    )
+
+    dataset = Dataset.from_hub("AutoIntent/clinc150_subset")
+
+    prompt = PromptDescription(
+        text="Describe intent {intent_name} with examples: {user_utterances} and patterns: {regex_patterns}",
+    )
+
+    enhanced_dataset = generate_descriptions(
+        dataset=dataset,
+        client=client,
+        prompt=prompt,
+        model_name="gpt4o-mini",
+    )
+
+    enhanced_dataset.to_csv("enhanced_clinc150.csv")
+
+Prompt Customization
+--------------------
+
+The `PromptDescription` can be customized to better fit specific requirements. It uses the following placeholders:
+
+- ``{intent_name}``: The name of the intent being described.
+- ``{user_utterances}``: Example utterances related to the intent.
+- ``{regex_patterns}``: Associated regular expression patterns.
+
+Adjusting the prompt allows tailoring descriptions to different contexts or detail levels.
+
+Model Selection
+---------------
+
+This module supports various LLMs available through OpenAI-compatible APIs. Configure your preferred model via the `model_name` parameter. Refer to your LLM provider’s documentation for available models.
+
+Recommended models include:
+
+- ``gpt4o-mini`` (for balanced performance and efficiency)
+- ``gpt-4`` (for maximum descriptive quality)
+
+API Integration
+---------------
+
+Ensure your OpenAI-compatible client is properly configured with an API endpoint and key:
+
+.. code-block:: python
+
+    client = openai.AsyncOpenAI(
+        base_url="your-api-base-url",
+        api_key="your-api-key"
+    )
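
The tutorial's client snippets hard-code the key and endpoint as string literals. One possible alternative, sketched below, reads them from environment variables. The variable names `OPENAI_API_KEY` and `OPENAI_BASE_URL` follow the convention of the `openai` package; the helpers `client_kwargs` and `make_client` are hypothetical names introduced for this sketch and are not part of autointent.

```python
import os
from typing import Dict, Mapping


def client_kwargs(env: Mapping[str, str] = os.environ) -> Dict[str, str]:
    """Collect AsyncOpenAI keyword arguments from environment variables."""
    # A missing key raises KeyError early instead of failing on the first request.
    kwargs = {"api_key": env["OPENAI_API_KEY"]}
    base_url = env.get("OPENAI_BASE_URL")
    if base_url:
        # Only set base_url for OpenAI-compatible providers with a custom endpoint.
        kwargs["base_url"] = base_url
    return kwargs


def make_client(env: Mapping[str, str] = os.environ):
    """Build the async client; assumes the `openai` package is installed."""
    import openai  # imported lazily so client_kwargs works without the package

    return openai.AsyncOpenAI(**client_kwargs(env))
```

A client built with `make_client()` could then be passed as the `client` argument shown in the tutorial. Separately, OpenAI's own identifier for the smaller model is spelled `gpt-4o-mini`; the `gpt4o-mini` spelling in the tutorial would presumably only resolve on a provider that defines such an alias.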

user_guides/basic_usage/03_automl.py

+1
@@ -103,6 +103,7 @@
 
 # %%
 from autointent.configs import DataConfig
+
 custom_pipeline.set_config(DataConfig(scheme="cv", n_folds=3))
 
 # %% [markdown]

0 commit comments