Commit bd5a891

add back keras_nlp

1 parent 89da3b2 commit bd5a891

File tree

24 files changed: +7150 -0 lines changed

guides/ipynb/keras_nlp/getting_started.ipynb

Lines changed: 931 additions & 0 deletions
Large diffs are not rendered by default.

guides/ipynb/keras_nlp/transformer_pretraining.ipynb

Lines changed: 690 additions & 0 deletions
Large diffs are not rendered by default.

guides/ipynb/keras_nlp/upload.ipynb

Lines changed: 521 additions & 0 deletions
Large diffs are not rendered by default.

guides/keras_nlp/getting_started.py

Lines changed: 633 additions & 0 deletions
Large diffs are not rendered by default.

guides/keras_nlp/transformer_pretraining.py

Lines changed: 468 additions & 0 deletions
Large diffs are not rendered by default.

guides/keras_nlp/upload.py

Lines changed: 245 additions & 0 deletions
@@ -0,0 +1,245 @@
"""
Title: Uploading Models with KerasNLP
Author: [Samaneh Saadat](https://github.com/SamanehSaadat/), [Matthew Watson](https://github.com/mattdangerw/)
Date created: 2024/04/29
Last modified: 2024/04/29
Description: An introduction on how to upload a fine-tuned KerasNLP model to model hubs.
Accelerator: GPU
"""

"""
# Introduction

Fine-tuning a machine learning model can yield impressive results for specific tasks.
Uploading your fine-tuned model to a model hub allows you to share it with the broader community.
By sharing your models, you'll enhance accessibility for other researchers and developers,
making your contributions an integral part of the machine learning landscape.
This can also streamline the integration of your model into real-world applications.

This guide walks you through how to upload your fine-tuned models to popular model hubs such as
[Kaggle Models](https://www.kaggle.com/models) and [Hugging Face Hub](https://huggingface.co/models).
"""

"""
# Setup

Let's start by installing and importing all the libraries we need. We use KerasNLP for this guide.
"""

"""shell
pip install -q --upgrade keras-nlp huggingface-hub kagglehub
"""

import os

os.environ["KERAS_BACKEND"] = "jax"

import keras_nlp


"""
41+
# Data
42+
43+
We can use the IMDB reviews dataset for this guide. Let's load the dataset from `tensorflow_dataset`.
44+
"""
45+
46+
import tensorflow_datasets as tfds
47+
48+
imdb_train, imdb_test = tfds.load(
49+
"imdb_reviews",
50+
split=["train", "test"],
51+
as_supervised=True,
52+
batch_size=4,
53+
)
54+
55+
"""
56+
We only use a small subset of the training samples to make the guide run faster.
57+
However, if you need a higher quality model, consider using a larger number of training samples.
58+
"""
59+
60+
imdb_train = imdb_train.take(100)
61+
62+
"""
63+
# Task Upload
64+
65+
A `keras_nlp.models.Task`, wraps a `keras_nlp.models.Backbone` and a `keras_nlp.models.Preprocessor` to create
66+
a model that can be directly used for training, fine-tuning, and prediction for a given text problem.
67+
In this section, we explain how to create a `Task`, fine-tune and upload it to a model hub.
68+
"""
69+
70+
"""
71+
## Load Model
72+
73+
If you want to build a Causal LM based on a base model, simply call `keras_nlp.models.CausalLM.from_preset`
74+
and pass a built-in preset identifier.
75+
"""
76+
77+
causal_lm = keras_nlp.models.CausalLM.from_preset("gpt2_base_en")
78+
79+
80+
"""
81+
## Fine-tune Model
82+
83+
After loading the model, you can call `.fit()` on the model to fine-tune it.
84+
Here, we fine-tune the model on the IMDB reviews which makes the model movie domain-specific.
85+
"""
86+
87+
# Drop labels and keep the review text only for the Causal LM.
88+
imdb_train_reviews = imdb_train.map(lambda x, y: x)
89+
90+
# Fine-tune the Causal LM.
91+
causal_lm.fit(imdb_train_reviews)
92+
93+
"""
94+
## Save the Model Locally
95+
96+
To upload a model, you need to first save the model locally using `save_to_preset`.
97+
"""
98+
99+
preset_dir = "./gpt2_imdb"
100+
causal_lm.save_to_preset(preset_dir)
101+
102+
"""
103+
Let's see the saved files.
104+
"""
105+
106+
os.listdir(preset_dir)
107+
108+
"""
109+
### Load a Locally Saved Model
110+
111+
A model that is saved to a local preset can be loaded using `from_preset`.
112+
What you save in, is what you get back out.
113+
"""
114+
115+
causal_lm = keras_nlp.models.CausalLM.from_preset(preset_dir)
116+
117+
"""
118+
You can also load the `keras_nlp.models.Backbone` and `keras_nlp.models.Tokenizer` objects from this preset directory.
119+
Note that these objects are equivalent to `causal_lm.backbone` and `causal_lm.preprocessor.tokenizer` above.
120+
"""
121+
122+
backbone = keras_nlp.models.Backbone.from_preset(preset_dir)
123+
tokenizer = keras_nlp.models.Tokenizer.from_preset(preset_dir)
124+
125+
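"""
As a quick check that the preset round-trips correctly, the tokenizer loaded above can be called
directly on a raw string; the sentence below is an arbitrary choice, and the exact token IDs depend
on the GPT-2 vocabulary bundled with the preset.
"""

# Tokenize an arbitrary sample sentence into token IDs.
tokenizer("What an amazing movie!")
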
"""
126+
## Upload the Model to a Model Hub
127+
128+
After saving a preset to a directory, this directory can be uploaded to a model hub such as Kaggle or Hugging Face directly from the KerasNLP library.
129+
To upload the model to Kaggle, the URI must start with `kaggle://` and to upload to Hugging Face, it should start with `hf://`.
130+
"""
131+
"""
132+
### Upload to Kaggle
133+
"""
134+
135+
"""
136+
To upload a model to Kaggle, first, we need to authenticate with Kaggle.
137+
This can in one of the following ways:
138+
1. Set environment variables `KAGGLE_USERNAME` and `KAGGLE_KEY`.
139+
2. Provide a local `~/.kaggle/kaggle.json`.
140+
3. Call `kagglehub.login()`.
141+
142+
Let's make sure we are logged in before continuing.
143+
"""
144+
145+
import kagglehub
146+
147+
if "KAGGLE_USERNAME" not in os.environ or "KAGGLE_KEY" not in os.environ:
148+
kagglehub.login()
149+
150+
151+
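
"""
For option 1 above, the credentials can also be set programmatically before any Kaggle call is made;
the values below are placeholders for your own username and API key.

```python
os.environ["KAGGLE_USERNAME"] = "<your_kaggle_username>"  # placeholder
os.environ["KAGGLE_KEY"] = "<your_kaggle_api_key>"  # placeholder
```
"""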
"""
152+
153+
To upload a model we can use `keras_nlp.upload_preset(uri, preset_dir)` API where `uri` has the format of
154+
`kaggle://<KAGGLE_USERNAME>/<MODEL>/Keras/<VARIATION>` for uploading to Kaggle and `preset_dir` is the directory that the model is saved in.
155+
156+
Running the following uploads the model that is saved in `preset_dir` to Kaggle:
157+
"""
158+
kaggle_username = kagglehub.whoami()["username"]
159+
kaggle_uri = f"kaggle://{kaggle_username}/gpt2/keras/gpt2_imdb"
160+
keras_nlp.upload_preset(kaggle_uri, preset_dir)
161+
162+
"""
163+
### Upload to Hugging Face
164+
"""
165+
166+
"""
167+
To upload a model to Hugging Face, first, we need to authenticate with Hugging Face.
168+
This can in one of the following ways:
169+
1. Set environment variables `HF_USERNAME` and `HF_TOKEN`.
170+
2. Call `huggingface_hub.notebook_login()`.
171+
172+
Let's make sure we are logged in before coninuing.
173+
"""
174+
175+
import huggingface_hub
176+
177+
if "HF_USERNAME" not in os.environ or "HF_TOKEN" not in os.environ:
178+
huggingface_hub.notebook_login()
179+
180+
"""
181+
182+
`keras_nlp.upload_preset(uri, preset_dir)` can be used to upload a model to Hugging Face if `uri` has the format of
183+
`kaggle://<HF_USERNAME>/<MODEL>`.
184+
185+
Running the following uploads the model that is saved in `preset_dir` to Hugging Face:
186+
"""
187+
188+
hf_username = huggingface_hub.whoami()["name"]
189+
hf_uri = f"hf://{hf_username}/gpt2_imdb"
190+
keras_nlp.upload_preset(hf_uri, preset_dir)
191+
192+
193+
"""
194+
## Load a User Uploaded Model
195+
196+
After verifying that the model is uploaded to Kaggle, we can load the model by calling `from_preset`.
197+
198+
```python
199+
causal_lm = keras_nlp.models.CausalLM.from_preset(
200+
f"kaggle://{kaggle_username}/gpt2/keras/gpt2_imdb"
201+
)
202+
```
203+
204+
We can also load the model uploaded to Hugging Face by calling `from_preset`.
205+
206+
```python
207+
causal_lm = keras_nlp.models.CausalLM.from_preset(f"hf://{hf_username}/gpt2_imdb")
208+
```
209+
"""


"""
# Classifier Upload

Uploading a classifier model is similar to the Causal LM upload.
To upload a fine-tuned classifier, first save the model to a local directory using the `save_to_preset`
API, and then upload it via `keras_nlp.upload_preset`.
"""

# Load the base model.
classifier = keras_nlp.models.Classifier.from_preset(
    "bert_tiny_en_uncased", num_classes=2
)

# Fine-tune the classifier.
classifier.fit(imdb_train)

# Save the model to a local preset directory.
preset_dir = "./bert_tiny_imdb"
classifier.save_to_preset(preset_dir)

# Upload to Kaggle.
keras_nlp.upload_preset(
    f"kaggle://{kaggle_username}/bert/keras/bert_tiny_imdb", preset_dir
)

"""
After verifying that the model is uploaded to Kaggle, we can load the model by calling `from_preset`.

```python
classifier = keras_nlp.models.Classifier.from_preset(
    f"kaggle://{kaggle_username}/bert/keras/bert_tiny_imdb"
)
```
"""
