Skip to content

官方要求的微调数据集中用的<img></img>,但是模型生成的内容出现了<|im_end|>、<|im_start|>、<|endoftext|> #521

@Missfoxsy

Description

@Missfoxsy

我按照官方给出的微调数据集格式做了一个数据集,如下:
{
"id": "cervical_00000004",
"conversations": [
{
"from": "user",
"value": "Picture 1: cevical_cell_dataset/images/train/ECD0016015_B2004008C10.jpg\nWrite an English summary and specify the grounded coordinates:"
},
{
"from": "assistant",
"value": "positive cervical cell within an epithelial cell cluster(276,268),(609,748) This area contains atypical cells with high N/C ratios and dark staining nuclei; adjacent squamous epithelium is richer in cytoplasm and reflects normal maturation."
}
]
},
但是微调后模型的输出结果里面出现了<|im_end|><|im_start|><|endoftext|>,并且生成的内容与数据集无关,如下:
The figure(212,782),(785,999) shows the left ventral surface of the brain at the beginning of the left hemisphere, with the left lobe partially enlarged and the left hemisphere nonvisual. In this region the atrophy is more marked, but still the surface tissue remains somewhat preserved, and the encephalopathy is more advanced than in other places.
图中 H顶:The figure(212,782),(785,999) shows the left ventral surface of the brain at the beginning of the left hemisphere, with the left lobe partially enlarged and the left hemisphere nonvisual. In this region the atrophy is more marked, but still the surface tissue remains somewhat preserved, and the encephalopathy is more advanced than in other places.<|im_end|>
<|im_start|>
H描述:“The figure(212,782),(785,999)-This caption(212,782),(785,999) shows the left ventral surface of the brain at the beginning of the left hemisphere, with the left lobe partially enlarged and the left hemisphere nonvisual. In this region the atrophy is more marked, but still the surface tissue remains somewhat preserved, and the encephalopathy is more advanced than in other places.<|im_end|>
<|endoftext|>

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions