
Florence-2 | Default LoRA config produces suboptimal results | Found a better config #162

Open
@patel-zeel

Description

Search before asking

  • I have searched the Multimodal Maestro issues and found no similar bug report.

Bug

The default LoRA config used in maestro is:

config = LoraConfig(
    r=8,
    lora_alpha=16,
    lora_dropout=0.05,
    bias="none",
    target_modules=["q_proj", "o_proj", "k_proj", "v_proj", "linear", "Conv2d", "lm_head", "fc2"],
    task_type="CAUSAL_LM",
)

The LoRA config used in Roboflow's "Fine-tune Florence-2 on a custom dataset" notebook is the following:

config = LoraConfig(
    r=8,
    lora_alpha=8,
    target_modules=["q_proj", "o_proj", "k_proj", "v_proj", "linear", "Conv2d", "lm_head", "fc2"],
    task_type="CAUSAL_LM",
    lora_dropout=0.05,
    bias="none",
    inference_mode=False,
    use_rslora=True,
    init_lora_weights="gaussian",
    revision=REVISION
)

For the poker-cards-fmjio dataset, maestro's default LoRA config yields a mAP50 of 0.20, while the Roboflow notebook config yields 0.52. I experimentally found a config that reaches a mAP50 of 0.71. See the Minimal Reproducible Example below for details.

Environment

  • multimodal-maestro = 1.0.0
  • OS: Ubuntu 20.04
  • Python: 3.10.15

Minimal Reproducible Example

I ran four variants of the LoRA config; the results are described below:

Configs

Maestro default

config = LoraConfig(
  r=8,
  lora_alpha=16,
  target_modules=["q_proj", "o_proj", "k_proj", "v_proj", "linear", "Conv2d", "lm_head", "fc2"],
  task_type="CAUSAL_LM",
  lora_dropout=0.05,
  bias="none",
)

Maestro default + Gaussian init

config = LoraConfig(
  r=8,
  lora_alpha=16,
  target_modules=["q_proj", "o_proj", "k_proj", "v_proj", "linear", "Conv2d", "lm_head", "fc2"],
  task_type="CAUSAL_LM",
  lora_dropout=0.05,
  bias="none",
  init_lora_weights="gaussian",
)

Roboflow notebook default

config = LoraConfig(
  r=8,
  lora_alpha=8,
  target_modules=["q_proj", "o_proj", "k_proj", "v_proj", "linear", "Conv2d", "lm_head", "fc2"],
  task_type="CAUSAL_LM",
  lora_dropout=0.05,
  bias="none",
  inference_mode=False,
  use_rslora=True,
  init_lora_weights="gaussian",
)

Roboflow notebook default except lora_alpha=16

config = LoraConfig(
  r=8,
  lora_alpha=16,
  target_modules=["q_proj", "o_proj", "k_proj", "v_proj", "linear", "Conv2d", "lm_head", "fc2"],
  task_type="CAUSAL_LM",
  lora_dropout=0.05,
  bias="none",
  inference_mode=False,
  use_rslora=True,
  init_lora_weights="gaussian",
)

Metrics

I used the Roboflow notebook to run the pipeline for 10 epochs and compute the metrics, using the new supervision evaluation API as follows:

import supervision as sv

mean_average_precision = sv.metrics.MeanAveragePrecision().update(predictions, targets).compute()
map50_95 = mean_average_precision.map50_95
map50 = mean_average_precision.map50

p = sv.metrics.Precision().update(predictions, targets).compute()
precision_at_50 = p.precision_at_50

r = sv.metrics.Recall().update(predictions, targets).compute()
recall_at_50 = r.recall_at_50

Results

| Config | mAP50 | mAP50-95 | Precision@50 | Recall@50 |
|---|---|---|---|---|
| Maestro default | 0.20 | 0.18 | 0.21 | 0.14 |
| Maestro default + Gaussian init | 0.32 | 0.30 | 0.54 | 0.35 |
| Roboflow notebook default | 0.52 | 0.47 | 0.66 | 0.58 |
| Roboflow notebook default except `lora_alpha=16` | 0.71 | 0.65 | 0.78 | 0.75 |

Conclusion

Using lora_alpha=16 in the Roboflow notebook's default LoRA config results in much better performance for the same number of epochs.
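One plausible explanation (my assumption, based on how peft computes the adapter scaling, not something verified in maestro's code) is that rsLoRA changes the effective scaling factor from alpha / r to alpha / sqrt(r), so raising lora_alpha from 8 to 16 doubles the magnitude of the adapter update:

```python
import math

def lora_scale(alpha: float, r: int, use_rslora: bool) -> float:
    """Effective LoRA scaling factor as peft applies it to the adapter
    update: alpha / r for standard LoRA, alpha / sqrt(r) with rsLoRA."""
    return alpha / math.sqrt(r) if use_rslora else alpha / r

# Maestro default (no rsLoRA, alpha=16):
print(lora_scale(16, 8, use_rslora=False))            # 2.0
# Roboflow notebook default (rsLoRA, alpha=8):
print(round(lora_scale(8, 8, use_rslora=True), 3))    # 2.828
# Best config here (rsLoRA, alpha=16):
print(round(lora_scale(16, 8, use_rslora=True), 3))   # 5.657
```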

Questions

  1. Should maestro give users control over the LoRA config? Users could then experiment with the values and find the configuration that works best for them. The config could be defined in a TOML, JSON, or similar file whose path is passed to the maestro CLI.
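As a rough sketch of that idea (the file name, keys, and loader below are hypothetical, not an existing maestro feature), the CLI could read LoRA hyperparameters from a JSON file whose keys mirror peft.LoraConfig arguments:

```python
import json
from pathlib import Path

# Hypothetical user-supplied config; keys mirror peft.LoraConfig arguments.
EXAMPLE = """{
  "r": 8,
  "lora_alpha": 16,
  "lora_dropout": 0.05,
  "bias": "none",
  "use_rslora": true,
  "init_lora_weights": "gaussian",
  "target_modules": ["q_proj", "o_proj", "k_proj", "v_proj", "linear", "Conv2d", "lm_head", "fc2"],
  "task_type": "CAUSAL_LM"
}"""

def load_lora_kwargs(path: Path) -> dict:
    """Read LoRA hyperparameters from a user-chosen JSON file."""
    return json.loads(path.read_text())

cfg_path = Path("lora_config.json")
cfg_path.write_text(EXAMPLE)
kwargs = load_lora_kwargs(cfg_path)
# config = LoraConfig(**kwargs)  # with `from peft import LoraConfig`
print(kwargs["lora_alpha"])  # 16
```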

Additional

No response

Are you willing to submit a PR?

  • Yes I'd like to help by submitting a PR!
