Dataset seems incomplete #91

@BroJunn

Dear TinyLLaVA devs,

When I fine-tune on the LLaVA dataset, I get this error:

FileNotFoundError: [Errno 2] No such file or directory: '~/data/tinyllava/train/ocr_vqa/images/1574770225.jpg'

Even after re-downloading ocr_vqa, the same error persists. I followed all of the instructions here and downloaded ocr_vqa with the script here.
PS: The pretraining part of the LLaVA dataset worked fine.

More information about this sub-dataset is below. I hope you can suggest a fix, since this seems to be an isolated case on my side.

(base) [tum_piz8108@hkn1993 train]$ ls ocr_vqa/images/ -1 | wc -l
207572
(base) [tum_piz8108@hkn1993 train]$ find ocr_vqa/images/ -type f -name "*.jpg" | wc -l
206671
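
The two counts above differ by 901, so some directory entries do not match `*.jpg` (for example, files saved with another extension, or entries that are not regular files). Below is a minimal Python sketch for inspecting that gap, using only the standard library. The function names are hypothetical and not part of TinyLLaVA, and the sketch assumes the image directory is flat, as in the listing above. `expanduser()` is used because Python's file APIs do not expand a literal `~` in a path.

```python
from collections import Counter
from pathlib import Path


def summarize_extensions(image_dir):
    """Count directory entries by lowercased suffix ('' means no suffix)."""
    root = Path(image_dir).expanduser()
    return Counter(p.suffix.lower() for p in root.iterdir())


def non_jpg_entries(image_dir):
    """Return sorted names of entries that `find -type f -name '*.jpg'` would skip."""
    root = Path(image_dir).expanduser()
    return sorted(
        p.name
        for p in root.iterdir()
        if not (p.is_file() and p.name.endswith(".jpg"))
    )


def same_stem_files(image_dir, stem):
    """Return sorted names of entries sharing `stem`, whatever their extension."""
    root = Path(image_dir).expanduser()
    return sorted(p.name for p in root.iterdir() if p.stem == stem)
```

For example, `same_stem_files("ocr_vqa/images", "1574770225")` would show whether the image from the error exists under a different extension, and `summarize_extensions("ocr_vqa/images")` would show how the 207572 entries split by file type.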

Thanks!
