
Commit 99261f3

Merge pull request #235 from stochasticai/dev (Dev)
2 parents: 5eac0c4 + 4587157; commit: 99261f3


44 files changed (+717, -162 lines)

.pre-commit-config.yaml (+10)

@@ -21,3 +21,13 @@ repos:
     rev: v0.19.1
     hooks:
       - id: gitlint
+  - repo: https://github.com/PyCQA/autoflake
+    rev: v2.1.1
+    hooks:
+      - id: autoflake
+        args: ["--in-place", "--remove-all-unused-imports", "--ignore-init-module-imports"]
+  - repo: https://github.com/MarcoGorelli/absolufy-imports
+    rev: v0.3.1
+    hooks:
+      - id: absolufy-imports
+        args: ["--application-directories=.:src"]

README.md (+66, -6)

@@ -35,11 +35,71 @@ With `xTuring` you can,
 
 ## 🌟 What's new?
 We are excited to announce the latest enhancements to our `xTuring` library:
-1. __`Falcon LLM` integration__ - You can use and fine-tune the _`Falcon-7B`_ model in different configurations: _off-the-shelf_, _off-the-shelf with INT8 precision_, _LoRA fine-tuning_, and _LoRA fine-tuning with INT8 precision_.
-2. __`GenericModel` wrapper__ - This new integration allows you to test and fine-tune any new model on `xTuring` without waiting for it to be integrated using class _`GenericModel`_.
+1. __`LLaMA 2` integration__ - You can use and fine-tune the _`LLaMA 2`_ model in different configurations: _off-the-shelf_, _off-the-shelf with INT8 precision_, _LoRA fine-tuning_, _LoRA fine-tuning with INT8 precision_, and _LoRA fine-tuning with INT4 precision_, using either the `GenericModel` wrapper or the `Llama2` class from `xturing.models` to test and fine-tune the model.
+```python
+from xturing.models import Llama2
+model = Llama2()
+
+## or
+from xturing.models import BaseModel
+model = BaseModel.create('llama2')
+
+```
+2. __`Evaluation`__ - Now you can evaluate any `Causal Language Model` on any dataset. The metric currently supported is [`perplexity`](https://towardsdatascience.com/perplexity-in-language-models-87a196019a94).
+```python
+# Make the necessary imports
+from xturing.datasets import InstructionDataset
+from xturing.models import BaseModel
+
+# Load the desired dataset
+dataset = InstructionDataset('../llama/alpaca_data')
+
+# Load the desired model
+model = BaseModel.create('gpt2')
+
+# Run the evaluation of the model on the dataset
+result = model.evaluate(dataset)
+
+# Print the result
+print(f"Perplexity of the evaluation: {result}")
+
+```
+3. __`INT4` Precision__ - You can now use and fine-tune any LLM with `INT4` precision using `GenericKbitModel`.
+```python
+# Make the necessary imports
+from xturing.datasets import InstructionDataset
+from xturing.models import GenericKbitModel
+
+# Load the desired dataset
+dataset = InstructionDataset('../llama/alpaca_data')
+
+# Load the desired model for INT4 fine-tuning
+model = GenericKbitModel('tiiuae/falcon-7b')
+
+# Run the fine-tuning
+model.finetune(dataset)
+```
+4. __CPU inference__ - Now you can run inference of any LLM on just your CPU. _CAUTION: the inference process may be sluggish, because CPUs lack the computational capacity required for efficient inference_.
+5. __Batch integration__ - By tweaking the `batch_size` in the `.generate()` and `.evaluate()` functions, you can expedite results. Using a `batch_size` greater than 1 typically enhances processing efficiency.
+```python
+# Make the necessary imports
+from xturing.datasets import InstructionDataset
+from xturing.models import GenericKbitModel
+
+# Load the desired dataset
+dataset = InstructionDataset('../llama/alpaca_data')
+
+# Load the desired model for INT4 fine-tuning
+model = GenericKbitModel('tiiuae/falcon-7b')
+
+# Generate outputs on desired prompts
+outputs = model.generate(dataset=dataset, batch_size=10)
+
+```
+
+An exploration of the [Llama LoRA INT4 working example](examples/int4_finetuning/LLaMA_lora_int4.ipynb) is recommended for an understanding of its application.
 
-You can check the [Falcon LoRA INT8 working example](examples/falcon/falcon_lora_int8.py) repository to see how it works.
-Also, you can check the [GenericModel working example](examples/generic/generic_model.py) repository to see how it works.
+For an extended insight, consider examining the [GenericModel working example](examples/generic/generic_model.py) available in the repository.
 
 <br>
 
@@ -170,8 +230,8 @@ model = BaseModel.load("x/distilgpt2_lora_finetuned_alpaca")
 - [x] INT4 LLaMA LoRA fine-tuning with INT4 generation
 - [x] Support for a `Generic model` wrapper
 - [x] Support for `Falcon-7B` model
-- [X] INT4 low-precision fine-tuning support
-- [ ] Evaluation of LLM models
+- [x] INT4 low-precision fine-tuning support
+- [x] Evaluation of LLM models
 - [ ] INT3, INT2, INT1 low-precision fine-tuning support
 - [ ] Support for Stable Diffusion
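The CPU inference feature in item 4 above ships without a snippet of its own. Below is a minimal sketch, assuming a machine with no GPU and reusing only calls that appear elsewhere in this commit; the model key and prompt are illustrative.

```python
# Minimal CPU-only inference sketch (illustrative; assumes no GPU is present).
from xturing.models import BaseModel

# 'gpt2' is used only because it is small enough to run comfortably on a CPU;
# any supported model key works, though larger models will be slow without a GPU.
model = BaseModel.create("gpt2")

# With CPU inference support (item 4 above), generation can run without a GPU.
outputs = model.generate(texts=["What is the capital of France?"])
print(outputs)
```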

docs/docs/intro.md (+7, -4)

@@ -39,13 +39,16 @@ You can quickly get started with xTuring by following the [Quickstart](/quicksta
 
 | Model | Examples |
 | --- | --- |
-| LLaMA | [LLaMA 7B fine-tuning on Alpaca dataset with/without LoRA and with/without INT8](https://github.com/stochasticai/xturing/tree/main/examples/llama) |
+| Bloom | [Bloom fine-tuning on Alpaca dataset with/without LoRA and with/without INT8](https://github.com/stochasticai/xturing/tree/main/examples/bloom) |
+| Cerebras-GPT | [Cerebras-GPT fine-tuning on Alpaca dataset with/without LoRA and with/without INT8](https://github.com/stochasticai/xturing/tree/main/examples/cerebras) |
+| Falcon | [Falcon 7B fine-tuning on Alpaca dataset with/without LoRA and with/without INT8](https://github.com/stochasticai/xturing/tree/main/examples/falcon) |
+| Galactica | [Galactica fine-tuning on Alpaca dataset with/without LoRA and with/without INT8](https://github.com/stochasticai/xturing/tree/main/examples/galactica) |
+| Generic Wrapper | [Any large language model fine-tuning on Alpaca dataset with/without LoRA and with/without INT8](https://github.com/stochasticai/xturing/tree/main/examples/generic) |
 | GPT-J | [GPT-J 6B LoRA fine-tuning with/without INT8 ](https://github.com/stochasticai/xturing/tree/main/examples/gptj) |
 | GPT-2 | [GPT-2 fine-tuning on Alpaca dataset with/without LoRA and with/without INT8](https://github.com/stochasticai/xturing/tree/main/examples/gpt2) |
+| LLaMA | [LLaMA 7B fine-tuning on Alpaca dataset with/without LoRA and with/without INT8](https://github.com/stochasticai/xturing/tree/main/examples/llama) |
+| LLaMA 2 | [LLaMA 2 7B fine-tuning on Alpaca dataset with/without LoRA and with/without INT8](https://github.com/stochasticai/xturing/tree/main/examples/llama2) |
 | OPT | [OPT fine-tuning on Alpaca dataset with/without LoRA and with/without INT8](https://github.com/stochasticai/xturing/tree/main/examples/opt) |
-| Cerebras-GPT | [Cerebras-GPT fine-tuning on Alpaca dataset with/without LoRA and with/without INT8](https://github.com/stochasticai/xturing/tree/main/examples/cerebras) |
-| Galactica | [Galactica fine-tuning on Alpaca dataset with/without LoRA and with/without INT8](https://github.com/stochasticai/xturing/tree/main/examples/galactica) |
-| Bloom | [Bloom fine-tuning on Alpaca dataset with/without LoRA and with/without INT8](https://github.com/stochasticai/xturing/tree/main/examples/bloom) |
 
 xTuring is licensed under [Apache 2.0](https://github.com/stochasticai/xturing/blob/main/LICENSE)
 

examples/evaluation/evaluation.py (+15)

@@ -0,0 +1,15 @@
+# Make the necessary imports
+from xturing.datasets import InstructionDataset
+from xturing.models import BaseModel
+
+# Load the desired dataset
+dataset = InstructionDataset("../llama/alpaca_data")
+
+# Load the desired model
+model = BaseModel.create("gpt2")
+
+# Run the evaluation of the model on the dataset
+result = model.evaluate(dataset)
+
+# Print the result
+print(f"Perplexity of the evaluation: {result}")

examples/int4_finetuning/LLaMA_lora_int4.ipynb (+3, -4)

@@ -31,8 +31,7 @@
 },
 "outputs": [],
 "source": [
-"!pip install xturing --upgrade\n",
-"!pip install xturing[int4] --upgrade"
+"!pip install xturing --upgrade"
 ]
 },
 {
@@ -56,15 +55,15 @@
 "outputs": [],
 "source": [
 "from xturing.datasets.instruction_dataset import InstructionDataset\n",
-"from xturing.models import BaseModel\n",
+"from xturing.models import GenericLoraKbitModel\n",
 "from pytorch_lightning.loggers import WandbLogger\n",
 "\n",
 "# Initializes WandB integration \n",
 "wandb_logger = WandbLogger()\n",
 "\n",
 "instruction_dataset = InstructionDataset(\"../llama/alpaca_data\")\n",
 "# Initializes the model\n",
-"model = BaseModel.create(\"llama_lora_int4\")"
+"model = GenericLoraKbitModel('aleksickx/llama-7b-hf')"
 ]
 },
 {
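Only the changed notebook cells appear in this hunk. For context, a minimal sketch of the updated flow, assuming the remaining cells simply fine-tune on the dataset loaded above; the `finetune` call mirrors the README examples in this commit, and the WandB logger is omitted here.

```python
# Sketch of the notebook's updated flow; only the changed cells are shown in the diff above.
from xturing.datasets.instruction_dataset import InstructionDataset
from xturing.models import GenericLoraKbitModel

# Alpaca-format instruction dataset used throughout the repository's examples
instruction_dataset = InstructionDataset("../llama/alpaca_data")

# LLaMA 7B weights wrapped for LoRA fine-tuning in low-bit (INT4) precision
model = GenericLoraKbitModel("aleksickx/llama-7b-hf")

# Fine-tune on the instruction dataset, as in the README examples
model.finetune(instruction_dataset)
```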

examples/llama2/llama2.py (+21)

@@ -0,0 +1,21 @@
+# Make the necessary imports
+from xturing.models import Llama2
+
+# Load the model
+model = Llama2()
+# Generate outputs from the model
+outputs = model.generate(texts=["How are you?"])
+# Print the generated outputs
+print(outputs)
+
+## or
+
+# Make the necessary imports
+from xturing.models import BaseModel
+
+# Load the model
+model = BaseModel.create("llama2")
+# Generate outputs from the model
+outputs = model.generate(texts=["How are you?"])
+# Print the generated outputs
+print(outputs)

examples/opt/opt_evaluate.py (+10)

@@ -0,0 +1,10 @@
+from xturing.datasets.instruction_dataset import InstructionDataset
+from xturing.models import BaseModel
+
+instruction_dataset = InstructionDataset("../examples/llama/alpaca_data")
+# Initializes the model
+model = BaseModel.create("opt")
+# Call the evaluate function
+perplexity = model.evaluate(instruction_dataset, batch_size=5)
+
+print(perplexity)

requirements-dev.txt (+2)

@@ -1,2 +1,4 @@
 pre-commit
 pytest
+autoflake
+absolufy-imports

src/xturing/cli/chat.py (-1)

@@ -1,4 +1,3 @@
-import time
 from pathlib import Path
 
 import click

src/xturing/config/finetuning_config.yaml (+31)

@@ -193,6 +193,7 @@ llama:
   num_train_epochs: 3
   optimizer_name: cpu_adam
 
+
 llama_lora:
   learning_rate: 1e-4
   weight_decay: 0.01
@@ -227,6 +228,36 @@ llama_lora_kbit:
   intra_save_freq: 200
   groupsize: 128
 
+llama2:
+  learning_rate: 5e-5
+  weight_decay: 0.01
+  num_train_epochs: 3
+  optimizer_name: cpu_adam
+
+llama2_lora:
+  learning_rate: 5e-5
+  weight_decay: 0.01
+  num_train_epochs: 3
+  optimizer_name: cpu_adam
+
+llama2_lora_int8:
+  learning_rate: 5e-5
+  weight_decay: 0.01
+  num_train_epochs: 3
+  optimizer_name: cpu_adam
+
+llama2_int8:
+  learning_rate: 5e-5
+  weight_decay: 0.01
+  num_train_epochs: 3
+  optimizer_name: cpu_adam
+
+llama2_lora_kbit:
+  learning_rate: 5e-5
+  weight_decay: 0.01
+  num_train_epochs: 3
+  optimizer_name: cpu_adam
+
 opt:
   learning_rate: 5e-5
   weight_decay: 0.01

src/xturing/config/generation_config.yaml (+29)

@@ -191,6 +191,35 @@ llama_lora_kbit:
   max_new_tokens: 256
   do_sample: false
 
+# Contrastive search
+llama2:
+  penalty_alpha: 0.6
+  top_k: 4
+  max_new_tokens: 256
+  do_sample: false
+
+# Contrastive search
+llama2_lora:
+  penalty_alpha: 0.6
+  top_k: 4
+  max_new_tokens: 256
+  do_sample: false
+
+# Greedy search
+llama2_int8:
+  max_new_tokens: 256
+  do_sample: false
+
+# Greedy search
+llama2_lora_int8:
+  max_new_tokens: 256
+  do_sample: false
+
+# Greedy search
+llama2_lora_kbit:
+  max_new_tokens: 256
+  do_sample: false
+
 # Contrastive search
 opt:
   penalty_alpha: 0.6

src/xturing/datasets/__init__.py (+7, -4)

@@ -1,7 +1,10 @@
-from .base import BaseDataset
-from .instruction_dataset import InstructionDataset, InstructionDatasetMeta
-from .text2image_dataset import Text2ImageDataset
-from .text_dataset import TextDataset, TextDatasetMeta
+from xturing.datasets.base import BaseDataset
+from xturing.datasets.instruction_dataset import (
+    InstructionDataset,
+    InstructionDatasetMeta,
+)
+from xturing.datasets.text2image_dataset import Text2ImageDataset
+from xturing.datasets.text_dataset import TextDataset, TextDatasetMeta
 
 BaseDataset.add_to_registry(TextDataset.config_name, TextDataset)
 BaseDataset.add_to_registry(InstructionDataset.config_name, InstructionDataset)

src/xturing/datasets/instruction_dataset.py (-1)

@@ -1,5 +1,4 @@
 import json
-import os
 from dataclasses import dataclass
 from pathlib import Path
 from typing import List, Optional, Union

src/xturing/engines/__init__.py (+28, -11)

@@ -1,47 +1,63 @@
-from .base import BaseEngine
-from .bloom_engine import (
+from xturing.engines.base import BaseEngine
+from xturing.engines.bloom_engine import (
     BloomEngine,
     BloomInt8Engine,
     BloomLoraEngine,
     BloomLoraInt8Engine,
 )
-from .cerebras_engine import (
+from xturing.engines.cerebras_engine import (
     CerebrasEngine,
     CerebrasInt8Engine,
     CerebrasLoraEngine,
     CerebrasLoraInt8Engine,
 )
-from .distilgpt2_engine import DistilGPT2Engine, DistilGPT2LoraEngine
-from .falcon_engine import (
+from xturing.engines.distilgpt2_engine import DistilGPT2Engine, DistilGPT2LoraEngine
+from xturing.engines.falcon_engine import (
     FalconEngine,
     FalconInt8Engine,
     FalconLoraEngine,
     FalconLoraInt8Engine,
     FalconLoraKbitEngine,
 )
-from .galactica_engine import (
+from xturing.engines.galactica_engine import (
     GalacticaEngine,
     GalacticaInt8Engine,
     GalacticaLoraEngine,
     GalacticaLoraInt8Engine,
 )
-from .generic_engine import (
+from xturing.engines.generic_engine import (
     GenericEngine,
     GenericInt8Engine,
     GenericLoraEngine,
     GenericLoraInt8Engine,
     GenericLoraKbitEngine,
 )
-from .gpt2_engine import GPT2Engine, GPT2Int8Engine, GPT2LoraEngine, GPT2LoraInt8Engine
-from .gptj_engine import GPTJEngine, GPTJInt8Engine, GPTJLoraEngine, GPTJLoraInt8Engine
-from .llama_engine import (
+from xturing.engines.gpt2_engine import (
+    GPT2Engine,
+    GPT2Int8Engine,
+    GPT2LoraEngine,
+    GPT2LoraInt8Engine,
+)
+from xturing.engines.gptj_engine import (
+    GPTJEngine,
+    GPTJInt8Engine,
+    GPTJLoraEngine,
+    GPTJLoraInt8Engine,
+)
+from xturing.engines.llama2_engine import LLama2Engine
+from xturing.engines.llama_engine import (
     LLamaEngine,
     LLamaInt8Engine,
     LlamaLoraEngine,
     LlamaLoraInt8Engine,
     LlamaLoraKbitEngine,
 )
-from .opt_engine import OPTEngine, OPTInt8Engine, OPTLoraEngine, OPTLoraInt8Engine
+from xturing.engines.opt_engine import (
+    OPTEngine,
+    OPTInt8Engine,
+    OPTLoraEngine,
+    OPTLoraInt8Engine,
+)
 
 BaseEngine.add_to_registry(BloomEngine.config_name, BloomEngine)
 BaseEngine.add_to_registry(BloomInt8Engine.config_name, BloomInt8Engine)
@@ -80,6 +96,7 @@
 BaseEngine.add_to_registry(LlamaLoraEngine.config_name, LlamaLoraEngine)
 BaseEngine.add_to_registry(LlamaLoraInt8Engine.config_name, LlamaLoraInt8Engine)
 BaseEngine.add_to_registry(LlamaLoraKbitEngine.config_name, LlamaLoraKbitEngine)
+BaseEngine.add_to_registry(LLama2Engine.config_name, LLama2Engine)
 BaseEngine.add_to_registry(OPTEngine.config_name, OPTEngine)
 BaseEngine.add_to_registry(OPTInt8Engine.config_name, OPTInt8Engine)
 BaseEngine.add_to_registry(OPTLoraEngine.config_name, OPTLoraEngine)
