### Search before asking

- [x] I have searched the Multimodal Maestro issues and found no similar bug report.

### Bug
The default LoRA config used in maestro is defined in maestro/maestro/trainer/models/florence_2/checkpoints.py, lines 50 to 57 (commit cecc78f).

The LoRA config used in the Roboflow "Florence-2 fine-tuning on custom dataset" notebook is the following:
```python
config = LoraConfig(
    r=8,
    lora_alpha=8,
    target_modules=["q_proj", "o_proj", "k_proj", "v_proj", "linear", "Conv2d", "lm_head", "fc2"],
    task_type="CAUSAL_LM",
    lora_dropout=0.05,
    bias="none",
    inference_mode=False,
    use_rslora=True,
    init_lora_weights="gaussian",
    revision=REVISION,
)
```
For the poker-cards-fmjio dataset, the default maestro LoRA config results in an mAP50 of 0.20, while the Roboflow notebook config results in an mAP50 of 0.52. I experimentally found a config that results in an mAP50 of 0.71. Please see the Minimal Reproducible Example below for more.
### Environment

- multimodal-maestro = 1.0.0
- OS: Ubuntu 20.04
- Python: 3.10.15
### Minimal Reproducible Example

I used four LoRA configs (the maestro default plus three variants); the results are described below.

#### Configs

**Maestro default**
```python
config = LoraConfig(
    r=8,
    lora_alpha=16,
    target_modules=["q_proj", "o_proj", "k_proj", "v_proj", "linear", "Conv2d", "lm_head", "fc2"],
    task_type="CAUSAL_LM",
    lora_dropout=0.05,
    bias="none",
)
```
**Maestro default + Gaussian init**

```python
config = LoraConfig(
    r=8,
    lora_alpha=16,
    target_modules=["q_proj", "o_proj", "k_proj", "v_proj", "linear", "Conv2d", "lm_head", "fc2"],
    task_type="CAUSAL_LM",
    lora_dropout=0.05,
    bias="none",
    init_lora_weights="gaussian",
)
```
**Roboflow notebook default**

```python
config = LoraConfig(
    r=8,
    lora_alpha=8,
    target_modules=["q_proj", "o_proj", "k_proj", "v_proj", "linear", "Conv2d", "lm_head", "fc2"],
    task_type="CAUSAL_LM",
    lora_dropout=0.05,
    bias="none",
    inference_mode=False,
    use_rslora=True,
    init_lora_weights="gaussian",
)
```
**Roboflow notebook default except lora_alpha=16**

```python
config = LoraConfig(
    r=8,
    lora_alpha=16,
    target_modules=["q_proj", "o_proj", "k_proj", "v_proj", "linear", "Conv2d", "lm_head", "fc2"],
    task_type="CAUSAL_LM",
    lora_dropout=0.05,
    bias="none",
    inference_mode=False,
    use_rslora=True,
    init_lora_weights="gaussian",
)
```
#### Metrics

I used the Roboflow notebook to run the pipeline for 10 epochs and compute the metrics, using the new supervision evaluation API as follows:

```python
import supervision as sv

# predictions and targets come from the notebook's evaluation loop
mean_average_precision = sv.metrics.MeanAveragePrecision().update(predictions, targets).compute()
map50_95 = mean_average_precision.map50_95
map50 = mean_average_precision.map50

p = sv.metrics.Precision().update(predictions, targets).compute()
precision_at_50 = p.precision_at_50

r = sv.metrics.Recall().update(predictions, targets).compute()
recall_at_50 = r.recall_at_50
```
#### Results

| Config | mAP50 | mAP50-95 | Precision@50 | Recall@50 |
|---|---|---|---|---|
| Maestro default | 0.20 | 0.18 | 0.21 | 0.14 |
| Maestro default + Gaussian init | 0.32 | 0.30 | 0.54 | 0.35 |
| Roboflow notebook default | 0.52 | 0.47 | 0.66 | 0.58 |
| Roboflow notebook default except lora_alpha=16 | 0.71 | 0.65 | 0.78 | 0.75 |
#### Conclusion

Using lora_alpha=16 in the Roboflow notebook default LoRA config results in much better performance with the same number of epochs.
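A plausible explanation (my own reasoning, not verified against the maestro source): with use_rslora=True, PEFT scales the LoRA update by lora_alpha / sqrt(r) rather than the standard lora_alpha / r, so raising lora_alpha from 8 to 16 doubles the effective adapter scale. A quick sanity check:

```python
import math

r = 8

# Effective LoRA scaling factors: standard LoRA vs. rank-stabilized LoRA (rsLoRA).
for lora_alpha in (8, 16):
    standard = lora_alpha / r            # classic LoRA scaling
    rslora = lora_alpha / math.sqrt(r)   # scaling applied when use_rslora=True
    print(f"alpha={lora_alpha}: standard={standard:.2f}, rslora={rslora:.2f}")

# alpha=8:  standard=1.00, rslora=2.83
# alpha=16: standard=2.00, rslora=5.66
```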
### Questions

- Should maestro give users control over the LoRA config? Users could then experiment with the values and find the config that works best for them. The config might be defined in TOML, JSON, or any other format, with users passing the config file path to the maestro CLI (see the sketch below).
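To illustrate, here is a minimal sketch of what that could look like. Everything here is hypothetical: the lora.toml layout, the load_lora_config helper, and the --lora-config flag are illustrations, not existing maestro features; only LoraConfig comes from peft.

```python
import tomllib  # Python 3.11+; on 3.10 use the third-party tomli package

from peft import LoraConfig


def load_lora_config(path: str) -> LoraConfig:
    """Hypothetical helper: build a LoraConfig from a user-supplied TOML file.

    Example lora.toml:
        r = 8
        lora_alpha = 16
        target_modules = ["q_proj", "o_proj", "k_proj", "v_proj", "linear", "Conv2d", "lm_head", "fc2"]
        task_type = "CAUSAL_LM"
        lora_dropout = 0.05
        bias = "none"
        use_rslora = true
        init_lora_weights = "gaussian"
    """
    with open(path, "rb") as f:
        params = tomllib.load(f)
    # LoraConfig is a dataclass, so unknown keys raise a TypeError here,
    # which doubles as basic validation of the user's file.
    return LoraConfig(**params)
```

The CLI could then accept something like --lora-config lora.toml and fall back to the built-in default config when the flag is omitted.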
### Additional

No response

### Are you willing to submit a PR?

- [x] Yes, I'd like to help by submitting a PR!