How to design negative samples for Florence-2 model training? #52

Closed
@David-19940718

Description

Search before asking

  • I have searched the Multimodal Maestro issues and found no similar feature requests.

Question

Hi, @skylargivens,

We currently have a good understanding of how to create positive samples for the Florence-2 model, using a format like this:

{
  "image": "IMG_20220316_144445_jpg.rf.a79f523e54855af2323f0cfdb9a4dedc.jpg",
  "prefix": "<OD>",
  "suffix": "5 of hearts<loc_54><loc_213><loc_291><loc_598>6 of hearts<loc_205><loc_251><loc_471><loc_670>7 of hearts<loc_363><loc_309><loc_688><loc_797>8 of hearts<loc_598><loc_395><loc_973><loc_974>"
}
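For reference, an entry like the one above can be generated from pixel-space boxes. This is a minimal sketch that assumes each coordinate is quantized into 1000 bins (`<loc_0>` to `<loc_999>`) in x1, y1, x2, y2 order, relative to image width and height. The helper names `to_loc_tokens` and `build_entry` are hypothetical and are not part of Multimodal Maestro.

```python
import json


def to_loc_tokens(box, width, height, bins=1000):
    """Quantize a pixel-space (x1, y1, x2, y2) box into four <loc_N> tokens."""
    x1, y1, x2, y2 = box

    def quantize(value, size):
        # Scale into [0, bins) and clamp, so a coordinate on the image
        # edge maps to the last bin instead of overflowing to `bins`.
        return min(bins - 1, max(0, int(value * bins / size)))

    values = (
        quantize(x1, width),
        quantize(y1, height),
        quantize(x2, width),
        quantize(y2, height),
    )
    return "".join(f"<loc_{v}>" for v in values)


def build_entry(image, annotations, width, height):
    """Build one <OD> entry from a list of (label, box) pairs."""
    suffix = "".join(
        label + to_loc_tokens(box, width, height) for label, box in annotations
    )
    return {"image": image, "prefix": "<OD>", "suffix": suffix}


if __name__ == "__main__":
    entry = build_entry(
        "example.jpg",  # placeholder file name
        [("5 of hearts", (64, 128, 320, 480))],
        width=640,
        height=640,
    )
    print(json.dumps(entry))
```

The clamp matters only for boxes that touch the right or bottom edge; whether the training pipeline rounds or truncates when quantizing is an assumption here and should be checked against the processor actually used.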

However, I'm unclear on how to properly design negative samples for training. Negative samples matter because they teach the model to tell target objects apart from background and look-alike objects, which should reduce false positives. Some questions I have:

  1. Should negative samples use the same image but with incorrect object descriptions?
  2. Do we need to use completely unrelated images and descriptions?
  3. How do we handle the location tags for negative samples?
  4. What's the recommended ratio of positive to negative samples in the training set?
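To make questions 2 to 4 concrete, here is a sketch of one option I could experiment with. It rests on unconfirmed assumptions: that a negative sample is an image containing none of the target classes, that it is encoded with an empty `suffix` (so no location tags at all), and that the share of negatives is a tunable parameter. The functions `make_negative` and `mix_samples` are hypothetical, and the default `negative_fraction` is a placeholder, not a recommended ratio.

```python
import random


def make_negative(image):
    # ASSUMPTION: an empty suffix means "no objects in this image".
    # This convention is not confirmed for Florence-2 training.
    return {"image": image, "prefix": "<OD>", "suffix": ""}


def mix_samples(positives, background_images, negative_fraction=0.1, seed=0):
    """Return positives plus negatives so that negatives make up roughly
    `negative_fraction` of the combined dataset (fewer if there are not
    enough background images)."""
    if not 0 <= negative_fraction < 1:
        raise ValueError("negative_fraction must be in [0, 1)")
    rng = random.Random(seed)
    # Solve n / (p + n) = f for n, given p positives.
    wanted = round(len(positives) * negative_fraction / (1 - negative_fraction))
    chosen = rng.sample(background_images, min(wanted, len(background_images)))
    dataset = list(positives) + [make_negative(image) for image in chosen]
    rng.shuffle(dataset)
    return dataset


if __name__ == "__main__":
    positives = [
        {"image": f"pos_{i}.jpg", "prefix": "<OD>", "suffix": "card<loc_1><loc_2><loc_3><loc_4>"}
        for i in range(4)
    ]
    backgrounds = [f"bg_{i}.jpg" for i in range(10)]
    mixed = mix_samples(positives, backgrounds, negative_fraction=0.5)
    print(sum(1 for entry in mixed if entry["suffix"] == ""), "negatives of", len(mixed))
```

This covers only the "unrelated image, no annotations" variant. The other variant from question 1, the same image paired with deliberately wrong labels, is not sketched here because it is unclear whether it would help or simply teach the model incorrect outputs.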

Any guidance or best practices for creating effective negative samples would be greatly appreciated. This will help ensure we're training the Florence-2 model optimally for object detection tasks.

Additional

If there are any existing resources, documentation, or examples specifically for Florence-2 negative sample creation, please point me in that direction. Also, if there are any tools or scripts the team recommends for generating or augmenting negative samples, that information would be very helpful.

Labels

question (Further information is requested)
