## Updates

**Mar 24, 2022**

Our MAM adapter and parallel adapter are integrated into the [adapter-transformers](https://github.com/Adapter-Hub/adapter-transformers) package (thanks to their developers!). Please check their [release blog](https://adapterhub.ml/blog/2022/03/adapter-transformers-v3-unifying-efficient-fine-tuning/) for the details. With adapter-transformers, you can easily apply the MAM adapter or the parallel adapter to a wide variety of tasks and pretrained models. For example, the code below sets up a MAM adapter on top of a pretrained model:
```python
# this is a usage example based on the adapter-transformers package, not this repo
from transformers.adapters import MAMConfig

# `model` is a pretrained model loaded through the adapter-transformers library
config = MAMConfig()
model.add_adapter("mam_adapter", config=config)
```
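Beyond adding the adapter, a minimal follow-up sketch (assuming adapter-transformers v3; the `AutoAdapterModel` class and the checkpoint name below are illustrative, not part of this repo) would activate and train it while the pretrained weights stay frozen:

```python
# minimal sketch based on adapter-transformers v3, not this repo
from transformers import AutoAdapterModel
from transformers.adapters import MAMConfig

# any checkpoint supported by adapter-transformers works here
model = AutoAdapterModel.from_pretrained("bert-base-uncased")
model.add_adapter("mam_adapter", config=MAMConfig())
model.train_adapter("mam_adapter")        # freeze pretrained weights; only the adapter is trainable
model.set_active_adapters("mam_adapter")  # route the forward pass through the MAM adapter
```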
## Dependencies
This repo is a fork of the [huggingface transformers](https://github.com/huggingface/transformers) repo (forked on June 23, 2021), and the code is tested on [PyTorch](https://pytorch.org) 1.9.0. Please follow the instructions below to install dependencies after you set up PyTorch:
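Because the repo is itself a fork of `transformers`, the setup is essentially an editable install of this repo; the sketch below is illustrative only, and the exact packages and versions are those given in the instructions that follow:

```bash
# illustrative sketch only -- assumes PyTorch 1.9.0 is already installed
git clone https://github.com/jxhe/unified-parameter-efficient-tuning.git
cd unified-parameter-efficient-tuning
pip install -e .  # installs this transformers fork in editable mode
```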
The scripts select the tuning variant through a set of shell variables; the commented-out sections below give the settings used for each baseline:

```bash
# ...
ffn_adapter_init_option="lora"
ffn_adapter_scalar="4"
ffn_bn=512 # ffn bottleneck dim

# ----- prefix tuning baseline -----
# attn_mode="prefix"
# attn_option="concat"
# attn_composition="add"
# ...
# ffn_adapter_scalar="4"
# ffn_bn=512 # ffn bottleneck dim

# ----- Houlsby Adapter -----
# attn_mode="adapter"
# attn_option="sequential"
# attn_composition="add"
# ...
# ffn_adapter_scalar="1"
# ffn_bn=200 # ffn bottleneck dim

# ----- FFN Scaled Parallel Adapter -----
# attn_mode="none"
# attn_option="parallel"
# attn_composition="add"
# ...
# ffn_adapter_scalar="4"
# ffn_bn=512 # ffn bottleneck dim

# ----- Prompt Tuning -----
# attn_mode="prompt_tuning"
# attn_option="parallel"
# attn_composition="add"
# ...
# ffn_adapter_scalar="4"
# ffn_bn=512 # ffn bottleneck dim

# ----- bitfit -----
# attn_mode="bitfit"
# attn_option="parallel"
# attn_composition="add"
# ...
# ffn_adapter_init_option="lora"
# ffn_adapter_scalar="4"
# ffn_bn=512 # ffn bottleneck dim

# ----- lora -----
# attn_mode="lora"
# attn_option="none"
# attn_composition="add"
# attn_bn=16

# # set ffn_mode to be 'lora' to use
# # lora at ffn as well

# ffn_mode="none"
# ffn_option="none"
# ffn_adapter_layernorm_option="none"
# ffn_adapter_init_option="bert"
# ffn_adapter_scalar="1"
# ffn_bn=16

# lora_alpha=32
# lora_dropout=0.1
# lora_init="lora"
```
There are more variations than those shown above; a complete explanation of these arguments is given [here](https://github.com/jxhe/unified-parameter-efficient-tuning/blob/25b44ac0e6f70e116af15cb866faa9ddc13b6c77/petl/options.py#L45) in `petl/options.py`. The results of all variants reported in the paper can be reproduced by changing these values in the scripts.
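For orientation, a hedged sketch of how these variables typically reach the training code: the run scripts forward them to the entry point as the matching command-line flags defined in `petl/options.py`. The script name and the omitted model/data/training flags below are placeholders, not the repo's exact command:

```bash
# illustrative sketch only -- the real run scripts pass many more arguments
python run_summarization.py \
    --attn_mode "${attn_mode}" \
    --attn_option "${attn_option}" \
    --attn_composition "${attn_composition}" \
    --attn_bn "${attn_bn}" \
    --ffn_mode "${ffn_mode}" \
    --ffn_option "${ffn_option}" \
    --ffn_adapter_layernorm_option "${ffn_adapter_layernorm_option}" \
    --ffn_adapter_init_option "${ffn_adapter_init_option}" \
    --ffn_adapter_scalar "${ffn_adapter_scalar}" \
    --ffn_bn "${ffn_bn}"
```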