
Commit 710b010

LysandreJik and sgugger committed
Migration guide from v3.x to v4.x (#8763)
* Migration guide from v3.x to v4.x
* Better wording
* Apply suggestions from code review
  Co-authored-by: Sylvain Gugger <[email protected]>
* Sylvain's comments
* Better wording.

Co-authored-by: Sylvain Gugger <[email protected]>
1 parent 87199de commit 710b010

1 file changed: +165 −0

docs/source/migration.md (+165 lines)
@@ -1,5 +1,170 @@
# Migrating from previous packages

## Migrating from transformers `v3.x` to `v4.x`

A few changes were introduced when switching from version 3 to version 4. Below is a summary of the expected changes:

#### 1. AutoTokenizers and pipelines now use fast (rust) tokenizers by default.

The python and rust tokenizers have roughly the same API, but the rust tokenizers have a more complete feature set.

This introduces two breaking changes:
- The handling of overflowing tokens between the python and rust tokenizers is different.
- The rust tokenizers do not accept integers in the encoding methods (see the sketch below).
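
As a minimal sketch of that second point (the example string is arbitrary and the commented error is an assumption, not exact library output), the slow python tokenizers accept a list of token ids in their encoding methods, while the fast rust tokenizers only accept strings:

```py
from transformers import AutoTokenizer

# Slow (python) tokenizer: `encode` also accepts already-converted token ids.
slow_tokenizer = AutoTokenizer.from_pretrained("bert-base-cased", use_fast=False)
ids = slow_tokenizer.encode("Hello world")  # e.g. [101, ..., 102]
slow_tokenizer.encode(ids[1:-1])            # works: ints are passed through

# Fast (rust) tokenizer: the same call is rejected.
fast_tokenizer = AutoTokenizer.from_pretrained("bert-base-cased")
fast_tokenizer.encode("Hello world")        # works
# fast_tokenizer.encode(ids[1:-1])          # raises an error: no integer inputs
```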
##### How to obtain the same behavior as v3.x in v4.x

- The pipelines now contain additional features out of the box. See the [token-classification pipeline with the `grouped_entities` flag](https://huggingface.co/transformers/main_classes/pipelines.html?highlight=textclassification#tokenclassificationpipeline).
- The auto-tokenizers now return rust tokenizers. In order to obtain the python tokenizers instead, the user may use the `use_fast` flag by setting it to `False`:
In version `v3.x`:
```py
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-cased")
```
to obtain the same in version `v4.x`:
```py
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-cased", use_fast=False)
```

#### 2. SentencePiece is removed from the required dependencies

The requirement on the SentencePiece dependency has been lifted from the `setup.py`. This is done so that we may have a channel on anaconda cloud without relying on `conda-forge`. This means that the tokenizers that depend on the SentencePiece library will not be available with a standard `transformers` installation.

This includes the **slow** versions of:
- `XLNetTokenizer`
- `AlbertTokenizer`
- `CamembertTokenizer`
- `MBartTokenizer`
- `PegasusTokenizer`
- `T5Tokenizer`
- `ReformerTokenizer`
- `XLMRobertaTokenizer`

##### How to obtain the same behavior as v3.x in v4.x

In order to obtain the same behavior as version `v3.x`, you should additionally install `sentencepiece`:

In version `v3.x`:
```bash
pip install transformers
```
to obtain the same in version `v4.x`:
```bash
pip install transformers[sentencepiece]
```
or
```bash
pip install transformers sentencepiece
```
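
Without `sentencepiece`, instantiating one of the slow tokenizers listed above fails; a minimal sketch (the exact error text is an assumption):

```py
from transformers import XLNetTokenizer

# With a bare `pip install transformers`, SentencePiece-based slow
# tokenizers cannot be instantiated:
try:
    tokenizer = XLNetTokenizer.from_pretrained("xlnet-base-cased")
except ImportError as err:
    print(err)  # the message points at installing `sentencepiece`
```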
#### 3. The architecture of the repo has been updated so that each model resides in its own folder

The past and foreseeable addition of new models means that the number of files in the directory `src/transformers` keeps growing and becomes harder to navigate and understand. We made the choice to put each model and the files accompanying it in their own sub-directories.

This is a breaking change, as importing intermediary layers directly from a model's module now needs to be done via a different path.

##### How to obtain the same behavior as v3.x in v4.x

In order to obtain the same behavior as version `v3.x`, you should update the path used to access the layers.

In version `v3.x`:
```py
from transformers.modeling_bert import BertLayer
```
to obtain the same in version `v4.x`:
```py
from transformers.models.bert.modeling_bert import BertLayer
```
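
Note that the top-level public API should be unaffected by this move; only imports reaching into a model's module change. For instance:

```py
# Top-level imports keep working as in v3.x:
from transformers import BertModel, BertTokenizer

# Only "internal" paths such as intermediary layers moved:
from transformers.models.bert.modeling_bert import BertLayer
```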

#### 4. Switching the `return_dict` argument to `True` by default

The [`return_dict` argument](https://huggingface.co/transformers/main_classes/output.html) enables the return of dict-like python objects containing the model outputs, instead of the standard tuples. This object is self-documented, as keys can be used to retrieve values, while also behaving like a tuple, as users may retrieve objects by index or by slice.

This is a breaking change, as the dict-like object, unlike the previous tuple, cannot be unpacked: `value0, value1 = outputs` will not work.

##### How to obtain the same behavior as v3.x in v4.x

In order to obtain the same behavior as version `v3.x`, you should set the `return_dict` argument to `False`, either in the model configuration or during the forward pass.

In version `v3.x`:
```py
model = BertModel.from_pretrained("bert-base-cased")
outputs = model(**inputs)
```
to obtain the same in version `v4.x`:
```py
model = BertModel.from_pretrained("bert-base-cased")
outputs = model(**inputs, return_dict=False)
```
or
```py
model = BertModel.from_pretrained("bert-base-cased", return_dict=False)
outputs = model(**inputs)
```
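
Conversely, if you keep the new default, the returned object can be accessed by attribute, by key, or by index; a minimal sketch with a plain BERT encoder:

```py
from transformers import BertModel, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-cased")
model = BertModel.from_pretrained("bert-base-cased")

inputs = tokenizer("Hello world", return_tensors="pt")
outputs = model(**inputs)  # dict-like ModelOutput, the v4.x default

# The three accessors below all return the same tensor:
hidden = outputs.last_hidden_state
hidden = outputs["last_hidden_state"]
hidden = outputs[0]
```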

#### 5. Removed some deprecated attributes

Attributes that had been deprecated for at least a month have been removed. The full list of deprecated attributes can be found in [#8604](https://github.com/huggingface/transformers/pull/8604).

Here is a list of these attributes/methods/arguments and what their replacements should be:

In several models, the labels become consistent with the other models (see the sketch after this list):
- `masked_lm_labels` becomes `labels` in `AlbertForMaskedLM` and `AlbertForPreTraining`.
- `masked_lm_labels` becomes `labels` in `BertForMaskedLM` and `BertForPreTraining`.
- `masked_lm_labels` becomes `labels` in `DistilBertForMaskedLM`.
- `masked_lm_labels` becomes `labels` in `ElectraForMaskedLM`.
- `masked_lm_labels` becomes `labels` in `LongformerForMaskedLM`.
- `masked_lm_labels` becomes `labels` in `MobileBertForMaskedLM`.
- `masked_lm_labels` becomes `labels` in `RobertaForMaskedLM`.
- `lm_labels` becomes `labels` in `BartForConditionalGeneration`.
- `lm_labels` becomes `labels` in `GPT2DoubleHeadsModel`.
- `lm_labels` becomes `labels` in `OpenAIGPTDoubleHeadsModel`.
- `lm_labels` becomes `labels` in `T5ForConditionalGeneration`.
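
For example (a sketch with toy labels; any of the models above follows the same pattern):

```py
from transformers import BertForMaskedLM, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-cased")
model = BertForMaskedLM.from_pretrained("bert-base-cased")

input_ids = tokenizer("Hello [MASK]", return_tensors="pt").input_ids
label_ids = input_ids.clone()  # toy labels, for illustration only

# v3.x: outputs = model(input_ids, masked_lm_labels=label_ids)
# v4.x:
outputs = model(input_ids, labels=label_ids)
```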

In several models, the caching mechanism becomes consistent with the other models:
- `decoder_cached_states` becomes `past_key_values` in all BART-like, FSMT and T5 models.
- `decoder_past_key_values` becomes `past_key_values` in all BART-like, FSMT and T5 models.
- `past` becomes `past_key_values` in all CTRL models.
- `past` becomes `past_key_values` in all GPT-2 models.

Regarding the tokenizer classes:
- The tokenizer attribute `max_len` becomes `model_max_length`.
- The tokenizer attribute `return_lengths` becomes `return_length`.
- The tokenizer encoding argument `is_pretokenized` becomes `is_split_into_words` (see the sketch after this list).
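
A small sketch of the last rename (the input words are arbitrary):

```py
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-cased")
words = ["Hello", "world", "!"]

# v3.x: encoded = tokenizer(words, is_pretokenized=True)
# v4.x:
encoded = tokenizer(words, is_split_into_words=True)
```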

Regarding the `Trainer` class:
- The `Trainer` argument `tb_writer` is removed in favor of the callback `TensorBoardCallback(tb_writer=...)` (see the sketch after this list).
- The `Trainer` argument `prediction_loss_only` is removed in favor of the class argument `args.prediction_loss_only`.
- The `Trainer` attribute `data_collator` should be a callable.
- The `Trainer` method `_log` is deprecated in favor of `log`.
- The `Trainer` method `_training_step` is deprecated in favor of `training_step`.
- The `Trainer` method `_prediction_loop` is deprecated in favor of `prediction_loop`.
- The `Trainer` method `is_local_master` is deprecated in favor of `is_local_process_zero`.
- The `Trainer` method `is_world_master` is deprecated in favor of `is_world_process_zero`.
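
A hedged sketch of the `tb_writer` migration, assuming `model` and `training_args` are already defined (`TensorBoardCallback` lives in `transformers.integrations`):

```py
from torch.utils.tensorboard import SummaryWriter
from transformers import Trainer
from transformers.integrations import TensorBoardCallback

tb_writer = SummaryWriter(log_dir="runs/demo")

# v3.x: trainer = Trainer(model=model, args=training_args, tb_writer=tb_writer)
# v4.x: the writer is passed through the TensorBoard callback instead.
trainer = Trainer(
    model=model,
    args=training_args,
    callbacks=[TensorBoardCallback(tb_writer=tb_writer)],
)
```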

Regarding the `TFTrainer` class:
- The `TFTrainer` argument `prediction_loss_only` is removed in favor of the class argument `args.prediction_loss_only`.
- The `TFTrainer` method `_log` is deprecated in favor of `log`.
- The `TFTrainer` method `_prediction_loop` is deprecated in favor of `prediction_loop`.
- The `TFTrainer` method `_setup_wandb` is deprecated in favor of `setup_wandb`.
- The `TFTrainer` method `_run_model` is deprecated in favor of `run_model`.

Regarding the `TrainingArguments` class:
- The `TrainingArguments` argument `evaluate_during_training` is deprecated in favor of `evaluation_strategy`.
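
For example (the values are illustrative):

```py
from transformers import TrainingArguments

# v3.x: args = TrainingArguments(output_dir="out", evaluate_during_training=True)
# v4.x:
args = TrainingArguments(output_dir="out", evaluation_strategy="steps")
```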

Regarding the Transfo-XL model:
- The Transfo-XL configuration attribute `tie_weight` becomes `tie_word_embeddings`.
- The Transfo-XL modeling method `reset_length` becomes `reset_memory_length`.

Regarding pipelines:
- The `FillMaskPipeline` argument `topk` becomes `top_k`.
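
A final sketch for the pipeline rename (the mask token shown assumes a BERT-style checkpoint):

```py
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="bert-base-cased")

# v3.x: fill_mask("Paris is the [MASK] of France.", topk=3)
# v4.x:
predictions = fill_mask("Paris is the [MASK] of France.", top_k=3)
```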
## Migrating from pytorch-transformers to 🤗 Transformers

Here is a quick summary of what you should take care of when migrating from `pytorch-transformers` to 🤗 Transformers.
