This repository provides models for captioning chest X-ray images to assist in their examination.
| Problem | X-ray image captioning |
|---|---|
| Dataset | Indiana University chest X-ray |
| Dataset link | https://www.kaggle.com/datasets/raddar/chest-xrays-indiana-university |
| Train set | 2559 pairs of frontal chest X-rays and corresponding captions |
| Validation set | 320 pairs of frontal chest X-rays and corresponding captions |
| Test set | 320 pairs of frontal chest X-rays and corresponding captions |
| Models | 1. CNN-LSTM 2. BLIP (https://huggingface.co/Salesforce/blip-image-captioning-base) 3. GIT (https://huggingface.co/microsoft/git-large) 4. BLIP-2 (future work) |
| Evaluation metric | BLEU score (see the sketch below) |
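Each generated caption is scored against the ground-truth report with BLEU, which measures n-gram overlap between the two. A minimal sketch of computing an average sentence-level BLEU with NLTK (the example captions below are placeholders, not dataset samples):

```python
# Minimal sketch: average sentence-level BLEU over (reference, hypothesis) pairs.
from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction

smooth = SmoothingFunction().method1  # avoids zero scores on short captions

def average_bleu(references, hypotheses):
    """references, hypotheses: parallel lists of caption strings."""
    scores = []
    for ref, hyp in zip(references, hypotheses):
        # NLTK expects a list of tokenized references and one tokenized hypothesis.
        scores.append(sentence_bleu([ref.split()], hyp.split(),
                                    smoothing_function=smooth))
    return sum(scores) / len(scores)

# Placeholder captions, for illustration only.
print(average_bleu(["no acute cardiopulmonary abnormality"],
                   ["no acute cardiopulmonary disease"]))
```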
Our base model follows an encoder-decoder architecture. For the image encoder we use pretrained CheXNet, a 121-layer DenseNet trained on 112,000 chest X-ray images. To obtain text embeddings, we use the GloVe model.
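A minimal Keras sketch of such an encoder-decoder, assuming CheXNet image features are pre-extracted and GloVe vectors are loaded into `embedding_matrix`; the layer sizes are illustrative, not the exact ones used in `Base_model/Base_model.ipynb`:

```python
# Minimal Keras sketch of the CNN-LSTM base model described above; all sizes
# are illustrative, and `embedding_matrix` stands in for real GloVe vectors.
import numpy as np
from tensorflow.keras.initializers import Constant
from tensorflow.keras.layers import LSTM, Dense, Dropout, Embedding, Input, add
from tensorflow.keras.models import Model

vocab_size, max_len, embed_dim = 1500, 100, 300       # hypothetical values
embedding_matrix = np.zeros((vocab_size, embed_dim))  # load GloVe vectors here

# Image branch: pooled CheXNet (DenseNet-121) features -> dense projection.
img_in = Input(shape=(1024,))
img_feat = Dense(256, activation="relu")(Dropout(0.5)(img_in))

# Text branch: GloVe-initialized embeddings -> LSTM over the partial caption.
txt_in = Input(shape=(max_len,))
txt_emb = Embedding(vocab_size, embed_dim,
                    embeddings_initializer=Constant(embedding_matrix),
                    trainable=False)(txt_in)
txt_feat = LSTM(256)(Dropout(0.5)(txt_emb))

# Merge both branches and predict the next caption token.
merged = Dense(256, activation="relu")(add([img_feat, txt_feat]))
out = Dense(vocab_size, activation="softmax")(merged)

model = Model(inputs=[img_in, txt_in], outputs=out)
model.compile(loss="categorical_crossentropy", optimizer="adam")
```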

BLIP uses ViT-L/16 as its image encoder and a BERT model as its text encoder, and was pretrained on the COCO dataset.
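A minimal sketch of generating a caption with this checkpoint via the `transformers` library (the image path is a placeholder; fine-tuning on the X-ray pairs is done in `Blip/Blip_model.ipynb`):

```python
# Minimal sketch: captioning one image with the BLIP checkpoint linked above.
from PIL import Image
from transformers import BlipProcessor, BlipForConditionalGeneration

processor = BlipProcessor.from_pretrained("Salesforce/blip-image-captioning-base")
model = BlipForConditionalGeneration.from_pretrained("Salesforce/blip-image-captioning-base")

image = Image.open("example_xray.png").convert("RGB")  # placeholder path
inputs = processor(images=image, return_tensors="pt")
out = model.generate(**inputs, max_new_tokens=50)
print(processor.decode(out[0], skip_special_tokens=True))
```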
GIT is a Transformer decoder conditioned on both CLIP image tokens and text tokens. The model was trained with "teacher forcing" on many (image, text) pairs: its objective is simply to predict the next text token, given the image tokens and the previous text tokens. It was pretrained on the COCO dataset.
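A minimal sketch of captioning with this checkpoint (again, the image path is a placeholder; `GIT/GIT_test.ipynb` runs the model over the test set):

```python
# Minimal sketch: captioning one image with the GIT checkpoint linked above.
from PIL import Image
from transformers import AutoProcessor, AutoModelForCausalLM

processor = AutoProcessor.from_pretrained("microsoft/git-large")
model = AutoModelForCausalLM.from_pretrained("microsoft/git-large")

image = Image.open("example_xray.png").convert("RGB")  # placeholder path
pixel_values = processor(images=image, return_tensors="pt").pixel_values
ids = model.generate(pixel_values=pixel_values, max_length=50)
print(processor.batch_decode(ids, skip_special_tokens=True)[0])
```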
| Model | Notebook path |
|---|---|
| CNN-LSTM | Base_model/Base_model.ipynb |
| BLIP | Blip/Blip_model.ipynb |
| BLIP results (images with generated captions) | Blip/Blip_test.ipynb |
| GIT | GIT/GIT_test.ipynb |
| GIT results (images with generated captions) | GIT/GIT_test.ipynb |
| Model | Average BLEU |
|---|---|
| CNN-LSTM (beam search) | 0.1521 |
| CNN-LSTM (greedy search) | 0.1482 |
| BLIP | 0.205 |
| GIT | 0.212 |
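The two CNN-LSTM rows above differ only in how captions are decoded. With the Hugging Face models, the same switch is a single `generate` argument; a sketch reusing `model` and `pixel_values` from the GIT example (beam width 3 is illustrative, not necessarily the notebooks' setting):

```python
# Greedy decoding: take the single most probable token at every step.
greedy_ids = model.generate(pixel_values=pixel_values, max_length=50)

# Beam search: track the 3 best partial captions and return the best complete one.
beam_ids = model.generate(pixel_values=pixel_values, max_length=50,
                          num_beams=3, early_stopping=True)
```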
| Name | Email |
|---|---|
| Mehrab Moradzadeh | [email protected] |
| Ali Derakhsesh | [email protected] |
| Mohammad Taha Teimuri Jervakani | [email protected] |
| Abolfazl Malekahmadi | [email protected] |
| Alireza Aghaei | [email protected] |

