# Migrating from previous packages

## Migrating from transformers `v3.x` to `v4.x`

A few changes were introduced when switching from version 3 to version 4. Below is a summary of the expected changes:

#### 1. AutoTokenizers and pipelines now use fast (rust) tokenizers by default.

The python and rust tokenizers have roughly the same API, but the rust tokenizers have a more complete feature set.

This introduces two breaking changes:
- The handling of overflowing tokens differs between the python and rust tokenizers (see the sketch after this list).
- The rust tokenizers do not accept integers in the encoding methods.
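
For example, the two tokenizer families report overflowing tokens differently. Below is a minimal sketch of that difference, assuming `bert-base-cased`; the exact keys and truncation details may vary slightly across `v4.x` releases:

```py
from transformers import AutoTokenizer

text = "a very long text " * 200

# Slow (python) tokenizer: overflowing ids are returned under a dedicated key.
slow = AutoTokenizer.from_pretrained("bert-base-cased", use_fast=False)
slow_enc = slow(text, max_length=32, truncation=True, return_overflowing_tokens=True)
print(slow_enc.keys())  # includes "overflowing_tokens"

# Fast (rust) tokenizer: the overflow is returned as extra sequences in the batch.
fast = AutoTokenizer.from_pretrained("bert-base-cased")
fast_enc = fast(text, max_length=32, truncation=True, return_overflowing_tokens=True)
print(len(fast_enc["input_ids"]))             # several chunks instead of one
print(fast_enc["overflow_to_sample_mapping"])  # maps each chunk back to its input
```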

##### How to obtain the same behavior as v3.x in v4.x

- The pipelines now contain additional features out of the box. See the [token-classification pipeline with the `grouped_entities` flag](https://huggingface.co/transformers/main_classes/pipelines.html?highlight=textclassification#tokenclassificationpipeline).
- The auto-tokenizers now return rust tokenizers. In order to obtain the python tokenizers instead, the user can set the `use_fast` flag to `False`:

In version `v3.x`:
```py
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-cased")
```
to obtain the same in version `v4.x`:
```py
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-cased", use_fast=False)
```

#### 2. SentencePiece is removed from the required dependencies

The requirement on the SentencePiece dependency has been lifted from the `setup.py`. This is done so that we may have a channel on anaconda cloud without relying on `conda-forge`. This means that the tokenizers that depend on the SentencePiece library will not be available with a standard `transformers` installation.

This includes the **slow** versions of:
- `XLNetTokenizer`
- `AlbertTokenizer`
- `CamembertTokenizer`
- `MBartTokenizer`
- `PegasusTokenizer`
- `T5Tokenizer`
- `ReformerTokenizer`
- `XLMRobertaTokenizer`

##### How to obtain the same behavior as v3.x in v4.x

In order to obtain the same behavior as version `v3.x`, you should additionally install `sentencepiece`:

In version `v3.x`:
```bash
pip install transformers
```
to obtain the same in version `v4.x`:
```bash
pip install transformers[sentencepiece]
```
or
```bash
pip install transformers sentencepiece
```
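
If `sentencepiece` is missing, instantiating one of the slow tokenizers listed above typically fails with an `ImportError` pointing at the missing dependency. A minimal sketch, assuming a `v4.x` install without the extra:

```py
from transformers import T5Tokenizer  # slow, SentencePiece-based tokenizer

try:
    tokenizer = T5Tokenizer.from_pretrained("t5-small")
except ImportError as err:
    # Raised when the sentencepiece package is not installed.
    print(err)
```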

#### 3. The architecture of the repo has been updated so that each model resides in its folder

The past and foreseeable addition of new models means that the number of files in the directory `src/transformers` keeps growing, making it harder to navigate and understand. We made the choice to put each model and the files accompanying it in their own sub-directories.

This is a breaking change: importing intermediary layers directly through a model's module now requires a different path.

##### How to obtain the same behavior as v3.x in v4.x

In order to obtain the same behavior as version `v3.x`, you should update the path used to access the layers.

In version `v3.x`:
```py
from transformers.modeling_bert import BertLayer
```
to obtain the same in version `v4.x`:
```py
from transformers.models.bert.modeling_bert import BertLayer
```
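
If your code needs to run against both major versions, one common pattern is to try the new path first and fall back to the old one; a minimal sketch:

```py
# A hedged sketch: support both the v4.x and v3.x import paths for BertLayer.
try:
    from transformers.models.bert.modeling_bert import BertLayer  # v4.x
except ImportError:
    from transformers.modeling_bert import BertLayer  # v3.x
```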

#### 4. Switching the `return_dict` argument to `True` by default

The [`return_dict` argument](https://huggingface.co/transformers/main_classes/output.html) enables the return of dict-like python objects containing the model outputs, instead of the standard tuples. This object is self-documented: keys can be used to retrieve values, and it also behaves like a tuple, so users can retrieve objects by index or by slice.

This is a breaking change because, unlike a tuple, this object cannot be unpacked: `value0, value1 = outputs` will not work.
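
A minimal sketch of what the new default output looks like, assuming `bert-base-cased` and a tokenized input:

```py
from transformers import BertModel, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-cased")
model = BertModel.from_pretrained("bert-base-cased")
inputs = tokenizer("Hello world", return_tensors="pt")

outputs = model(**inputs)              # a dict-like ModelOutput in v4.x
hidden = outputs["last_hidden_state"]  # access by key...
hidden = outputs[0]                    # ...or by index, as with the old tuples
# value0, value1 = outputs             # but tuple-style unpacking no longer works
```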

##### How to obtain the same behavior as v3.x in v4.x

In order to obtain the same behavior as version `v3.x`, you should set the `return_dict` argument to `False`, either in the model configuration or during the forward pass.

In version `v3.x`:
```py
model = BertModel.from_pretrained("bert-base-cased")
outputs = model(**inputs)
```
to obtain the same in version `v4.x`:
```py
model = BertModel.from_pretrained("bert-base-cased")
outputs = model(**inputs, return_dict=False)
```
or
```py
model = BertModel.from_pretrained("bert-base-cased", return_dict=False)
outputs = model(**inputs)
```

#### 5. Removed some deprecated attributes

Attributes that had been deprecated for at least a month have been removed. The full list of deprecated attributes can be found in [#8604](https://github.com/huggingface/transformers/pull/8604).

Here is a list of these attributes/methods/arguments and what their replacements should be:

In several models, the labels become consistent with the other models:
- `masked_lm_labels` becomes `labels` in `AlbertForMaskedLM` and `AlbertForPreTraining`.
- `masked_lm_labels` becomes `labels` in `BertForMaskedLM` and `BertForPreTraining`.
- `masked_lm_labels` becomes `labels` in `DistilBertForMaskedLM`.
- `masked_lm_labels` becomes `labels` in `ElectraForMaskedLM`.
- `masked_lm_labels` becomes `labels` in `LongformerForMaskedLM`.
- `masked_lm_labels` becomes `labels` in `MobileBertForMaskedLM`.
- `masked_lm_labels` becomes `labels` in `RobertaForMaskedLM`.
- `lm_labels` becomes `labels` in `BartForConditionalGeneration`.
- `lm_labels` becomes `labels` in `GPT2DoubleHeadsModel`.
- `lm_labels` becomes `labels` in `OpenAIGPTDoubleHeadsModel`.
- `lm_labels` becomes `labels` in `T5ForConditionalGeneration`.
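
For example, a masked-LM training step now passes the target ids through `labels`; a minimal sketch (the checkpoint and inputs below are only illustrative):

```py
from transformers import BertForMaskedLM, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-cased")
model = BertForMaskedLM.from_pretrained("bert-base-cased")
inputs = tokenizer("Paris is the capital of France.", return_tensors="pt")

# v3.x (removed): outputs = model(**inputs, masked_lm_labels=inputs["input_ids"])
outputs = model(**inputs, labels=inputs["input_ids"])  # v4.x
print(outputs.loss)
```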

In several models, the caching mechanism becomes consistent with the other models:
- `decoder_cached_states` becomes `past_key_values` in all BART-like, FSMT and T5 models.
- `decoder_past_key_values` becomes `past_key_values` in all BART-like, FSMT and T5 models.
- `past` becomes `past_key_values` in all CTRL models.
- `past` becomes `past_key_values` in all GPT-2 models.
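
A minimal sketch of the rename using GPT-2 (the prompt and the single-step continuation are only illustrative):

```py
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")

inputs = tokenizer("Hello, my name is", return_tensors="pt")
outputs = model(**inputs, use_cache=True)

# v3.x: the cache was returned and consumed as `past`
# v4.x: it is exposed as `past_key_values`
next_ids = torch.tensor([[tokenizer.eos_token_id]])
next_outputs = model(input_ids=next_ids, past_key_values=outputs.past_key_values)
```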

Regarding the tokenizer classes:
- The tokenizer attribute `max_len` becomes `model_max_length`.
- The tokenizer attribute `return_lengths` becomes `return_length`.
- The tokenizer encoding argument `is_pretokenized` becomes `is_split_into_words`.
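
For example (a minimal sketch using `bert-base-cased`):

```py
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-cased")

print(tokenizer.model_max_length)  # replaces the removed `max_len` attribute

# v3.x: tokenizer(["Hello", "world"], is_pretokenized=True)
encoding = tokenizer(["Hello", "world"], is_split_into_words=True)  # v4.x
```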

Regarding the `Trainer` class:
- The `Trainer` argument `tb_writer` is removed in favor of the callback `TensorBoardCallback(tb_writer=...)`.
- The `Trainer` argument `prediction_loss_only` is removed in favor of the class argument `args.prediction_loss_only`.
- The `Trainer` attribute `data_collator` should be a callable.
- The `Trainer` method `_log` is deprecated in favor of `log`.
- The `Trainer` method `_training_step` is deprecated in favor of `training_step`.
- The `Trainer` method `_prediction_loop` is deprecated in favor of `prediction_loop`.
- The `Trainer` method `is_local_master` is deprecated in favor of `is_local_process_zero`.
- The `Trainer` method `is_world_master` is deprecated in favor of `is_world_process_zero`.
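
A minimal sketch of the first two renames (the model, output directory and writer below are only illustrative, and `tensorboard` must be installed):

```py
from torch.utils.tensorboard import SummaryWriter
from transformers import BertForSequenceClassification, Trainer, TrainingArguments
from transformers.integrations import TensorBoardCallback

model = BertForSequenceClassification.from_pretrained("bert-base-cased")

# v3.x: Trainer(model=model, args=args, tb_writer=writer, prediction_loss_only=True)
args = TrainingArguments(output_dir="out", prediction_loss_only=True)
writer = SummaryWriter(log_dir="runs/migration")
trainer = Trainer(model=model, args=args, callbacks=[TensorBoardCallback(tb_writer=writer)])
```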

Regarding the `TFTrainer` class:
- The `TFTrainer` argument `prediction_loss_only` is removed in favor of the class argument `args.prediction_loss_only`.
- The `TFTrainer` method `_log` is deprecated in favor of `log`.
- The `TFTrainer` method `_prediction_loop` is deprecated in favor of `prediction_loop`.
- The `TFTrainer` method `_setup_wandb` is deprecated in favor of `setup_wandb`.
- The `TFTrainer` method `_run_model` is deprecated in favor of `run_model`.

Regarding the `TrainingArguments` class:
- The `TrainingArguments` argument `evaluate_during_training` is deprecated in favor of `evaluation_strategy`.
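
For example (a minimal sketch; the output directory is only illustrative):

```py
from transformers import TrainingArguments

# v3.x (deprecated): TrainingArguments(output_dir="out", evaluate_during_training=True)
args = TrainingArguments(output_dir="out", evaluation_strategy="steps")  # v4.x
```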

Regarding the Transfo-XL model:
- The Transfo-XL configuration attribute `tie_weight` becomes `tie_words_embeddings`.
- The Transfo-XL modeling method `reset_length` becomes `reset_memory_length`.

Regarding pipelines:
- The `FillMaskPipeline` argument `topk` becomes `top_k`.
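
For example (a minimal sketch using a fill-mask pipeline; the model is only illustrative):

```py
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="bert-base-cased")

# v3.x: fill_mask("Paris is the [MASK] of France.", topk=3)
print(fill_mask("Paris is the [MASK] of France.", top_k=3))  # v4.x
```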


## Migrating from pytorch-transformers to 🤗 Transformers

Here is a quick summary of what you should take care of when migrating from `pytorch-transformers` to 🤗 Transformers.