`contrib/hamilton/contrib/user/skrawcz/fine_tuning/README.md`

# Purpose of this module
This module shows you how to fine-tune an LLM model. This code is inspired by this [fine-tuning code](https://github.com/dagster-io/dagster_llm_finetune/tree/main).

Specifically, the code here shows Supervised Fine-Tuning (SFT) for dialogue. This approach instructs the model to be more
useful to respond directly to a question, rather than optimizing over an entire dialogue. SFT is the most common type of fine-tuning,
as the other two options, Pre-training for Completion and RLHF, require more to work. Pre-training requires more computational power,
while RLHF requires higher-quality dialogue data.
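To make the SFT setup concrete, a supervised dialogue example is just a question paired with the response the model should learn to produce. Here is a minimal sketch; the function, field names, and prompt template are illustrative, not this module's actual schema:

```python
# Illustrative only: the field names and prompt template below are made up
# for this sketch, not the schema this module actually uses.
def format_sft_example(question: str, answer: str) -> dict:
    """Turn a Q/A pair into one supervised training example."""
    prompt = f"Question: {question}\nAnswer:"
    return {"input": prompt, "target": answer}

example = format_sft_example(
    "How do I run this module?",
    "Instantiate a Hamilton driver with it and call dr.execute(...).",
)
print(example["input"])   # the text the model sees
print(example["target"])  # the text the model is trained to produce
```

During SFT, the loss is computed only on the target text, which is what makes the model better at answering directly rather than continuing a dialogue.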

This code should work on a regular CPU (in a Docker container), which will allow you to test out the code locally without
any additional setup. The specific approach this code uses is [LoRA](https://arxiv.org/abs/2106.09685) (low-rank adaptation of large language models), which
means that only a subset of the LLM's parameters are tweaked, which helps prevent over-fitting.
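Some back-of-the-envelope arithmetic shows why LoRA touches so few parameters: instead of updating a frozen d×k weight matrix W, it learns a rank-r update B·A, so only r·(d+k) values are trained for that matrix. The dimensions below are illustrative, not taken from any particular model:

```python
# Rough arithmetic sketch of why LoRA trains so few parameters.
# Dimensions are illustrative, not taken from any specific model.
d, k = 4096, 4096   # shape of one frozen weight matrix W
r = 8               # LoRA rank: W is adapted as W + B @ A, with B (d x r) and A (r x k)

full_params = d * k          # parameters full fine-tuning would update
lora_params = r * (d + k)    # parameters LoRA actually trains for this matrix

print(full_params)                # 16777216
print(lora_params)                # 65536
print(full_params // lora_params) # 256 -- LoRA trains ~0.4% of this matrix's parameters
```

The same ratio applies per adapted matrix, which is why LoRA checkpoints are small and training fits on modest hardware.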

Note: if you have issues running this on macOS, reach out; we might be able to help.

## What is fine-tuning?
Fine-tuning is when a pre-trained model, in this context a foundational model, is customized using additional data to
adjust its responses for a specific task. This is a good way to adjust an off-the-shelf, i.e. pretrained, model to provide

It shows a basic process of:

a. Loading data, tokenizing it, and setting up some tokenization parameters.

b. Splitting data into training, validation, and hold-out sets.

c. Fine-tuning the model using LoRA.
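Step b can be sketched as a simple shuffled split. The fractions and seed below are arbitrary illustrative choices; the module's own split logic may differ:

```python
import random

def split_data(rows: list, val_frac: float = 0.1, holdout_frac: float = 0.1, seed: int = 42):
    """Shuffle rows and split them into training, validation, and hold-out sets."""
    rows = rows[:]                      # don't mutate the caller's list
    random.Random(seed).shuffle(rows)   # deterministic shuffle for reproducibility
    n = len(rows)
    n_val = int(n * val_frac)
    n_holdout = int(n * holdout_frac)
    holdout = rows[:n_holdout]
    validation = rows[n_holdout:n_holdout + n_val]
    training = rows[n_holdout + n_val:]
    return training, validation, holdout

train, val, holdout = split_data(list(range(100)))
print(len(train), len(val), len(holdout))  # 80 10 10
```

The hold-out set is never touched during training or model selection; it is only used for the final inference/evaluation step.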

You would then pass in as _inputs_ to execution `"data_path"=PATH_TO_THIS_FILE`

that the transformers library supports for `AutoModelForSeq2SeqLM` models.
- Run the code.

```python
# instantiate the driver with this module however you want
result = dr.execute(
    [  # some suggested outputs
        "save_best_models",
        "hold_out_set_predictions",
        "training_and_validation_set_metrics",
        "finetuned_model_on_validation_set",
    ],
    # ...
)
```

docker run YOUR_IMAGE_NAME

- `{"start": "presaved"}`: use this if you want to load an already fine-tuned model and then just evaluate it.
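Assuming you are wiring this up with Hamilton's standard `driver.Builder` API, a flag like this is just a value in the config dict the driver is built with. A sketch (the module import name is hypothetical; this fragment needs `sf-hamilton` and the module itself to run):

```python
# Sketch: passing the "start" config flag to a Hamilton driver.
# `fine_tuning` is a hypothetical import name for this module.
from hamilton import driver
import fine_tuning

dr = (
    driver.Builder()
    .with_modules(fine_tuning)
    .with_config({"start": "presaved"})  # load an already fine-tuned model and evaluate it
    .build()
)
```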

# Limitations
The code here will likely not solve all your LLM troubles,
but it can show you how to fine-tune an LLM using parameter-efficient techniques such as LoRA.

This code is currently set up to work with the `datasets` and `transformers` libraries. It could be modified to work with other libraries.