After fine-tuning on a private-domain dataset, I run caption with grounding, but the predicted bbox does not cover the target and is far off.

The fine-tuning dataset is roughly in the following format:
{
    "id": "00000026",
    "conversations": [
      {
        "from": "user",
        "value": "<img>images/train/ECD0016015_B200A008C0.jpg</img>\nPlease locate the anomalous structure in the image, state which terrain type it belongs to, and summarize its geographic features."
      },
      {
        "from": "assistant",
        "value": "<ref>Anomalous geological formation in mountainous terrain</ref><box>(647,573),(799,743)</box> The anomalous structure lies within the annotated region, showing topographic irregularities, spectral variations, an increased elevation-to-base ratio, and irregular surface textures."
      }
    ]
  }
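For checking the annotations, here is a minimal sketch that extracts the `<ref>`/`<box>` pairs from an assistant `value` string like the one above. The helper names `parse_boxes` and `pixel_to_grid` are hypothetical. The rescaling helper assumes the model expects box coordinates on a 0-1000 grid relative to the image size rather than raw pixels, as in the Qwen-VL convention; the post does not say which convention the dataset uses, so treat this as something to verify rather than as the cause.

```python
import re

# Matches "<ref>label</ref><box>(x1,y1),(x2,y2)</box>" as used in the sample above.
BOX_PATTERN = re.compile(
    r"<ref>(?P<label>.*?)</ref>"
    r"<box>\((?P<x1>\d+),(?P<y1>\d+)\),\((?P<x2>\d+),(?P<y2>\d+)\)</box>"
)


def parse_boxes(text):
    """Return a list of (label, (x1, y1, x2, y2)) found in a response string."""
    return [
        (m.group("label"), tuple(int(m.group(k)) for k in ("x1", "y1", "x2", "y2")))
        for m in BOX_PATTERN.finditer(text)
    ]


def pixel_to_grid(box, width, height, grid=1000):
    """Rescale a pixel-space box to a grid x grid system (assumed convention)."""
    x1, y1, x2, y2 = box
    return (
        x1 * grid // width,
        y1 * grid // height,
        x2 * grid // width,
        y2 * grid // height,
    )


sample = (
    "<ref>Anomalous geological formation in mountainous terrain</ref>"
    "<box>(647,573),(799,743)</box> The anomalous structure lies within "
    "the annotated region."
)
boxes = parse_boxes(sample)
print(boxes)
```

If the annotations turn out to be in pixels, `pixel_to_grid(box, image_width, image_height)` would rescale them before training; if they are already on the 0-1000 grid, no conversion is needed.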
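When the fine-tuned model runs inference on this same image, the coordinates in the output box are (405,777),(544,942). They differ greatly from the grounding coordinates and do not cover the target at all, yet the generated text is correct. Is this a limitation of the model itself, or have I just not found the right parameters?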
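To put a number on the mismatch, the two boxes can be compared with intersection over union (IoU). This is only an illustrative sketch: the `iou` helper is hypothetical, and it assumes both boxes are `(x1, y1, x2, y2)` corners in the same coordinate system.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap, clamped to zero when the boxes are disjoint.
    inter_w = max(0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


ground_truth = (647, 573, 799, 743)  # box from the training sample
predicted = (405, 777, 544, 942)     # box produced at inference
score = iou(ground_truth, predicted)
print(score)  # 0.0, the boxes do not overlap at all
```

Running the same check over the whole training set would show whether the miss is systematic (for example a consistent offset or scale) or specific to a few images.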

After fine-tuning on a private-domain dataset, caption with grounding produces bboxes that miss the target by a wide margin #522
