Thank you very much for your work. In your paper, you mentioned: "Finally, we construct {image, question, answer} (IQA) triples by assigning images to the annotated QA pairs of a visual entity, followed by human verification and clarification of the questions if multiple objects are present in the image." This implies that you filtered out images containing multiple objects, leaving only those with a single object.
I would like to know if you have a backup of the dataset before the filtering process, as I am specifically interested in images with multiple objects. Alternatively, could you recommend any datasets under the Knowledge-based Visual Question Answering (KB-VQA) task that include images with multiple objects?
Thank you so much!