
teodor-cotet/ImageCaptioning


Image-Captioning using VGG for feature extraction

Uses the Flickr8k dataset (~1 GB); each photo comes with five reference descriptions.
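As a rough sketch of how those five-captions-per-photo annotations can be loaded: Flickr8k ships a `Flickr8k.token.txt` file whose lines have the form `<image>.jpg#<n>\t<caption>` with `n` from 0 to 4. The parsing below is an illustrative assumption about that format, not code from this repository.

```python
from collections import defaultdict

def load_captions(lines):
    # Group captions by image id; each Flickr8k image has five numbered captions.
    captions = defaultdict(list)
    for line in lines:
        key, caption = line.rstrip("\n").split("\t")
        image_id = key.split("#")[0]  # strip the "#<n>" caption index
        captions[image_id].append(caption)
    return dict(captions)

# Two sample lines in the assumed token-file format:
sample = [
    "1000268201.jpg#0\tA child in a pink dress climbs stairs .",
    "1000268201.jpg#1\tA girl goes into a wooden building .",
]
caps = load_captions(sample)
print(len(caps["1000268201.jpg"]))  # 2
```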

The code uses Keras with a TensorFlow backend. A VGG CNN extracts the image features, and an RNN generates the captions.
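A minimal sketch of VGG feature extraction, using the modern `tf.keras` API rather than the Keras 1.2.2 API the repo targets. Truncating VGG16 at its `fc2` layer yields a 4096-dimensional feature vector per image; `weights=None` here avoids the ImageNet download, whereas in practice you would pass `weights="imagenet"`.

```python
import numpy as np
import tensorflow as tf

# Build VGG16 and cut it off at the fc2 layer to get a feature extractor.
base = tf.keras.applications.VGG16(include_top=True, weights=None)
extractor = tf.keras.Model(inputs=base.input,
                           outputs=base.get_layer("fc2").output)

# Stand-in for a real, preprocessed 224x224 RGB photo.
image = np.random.rand(1, 224, 224, 3).astype("float32")
features = extractor.predict(image, verbose=0)
print(features.shape)  # (1, 4096)
```

These fixed-length vectors are what the caption-generating RNN conditions on.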

Beam search is not yet implemented.
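Without beam search, the usual fallback is greedy decoding: at each step, pick the single most probable next word. The toy next-word distribution below is a hypothetical stand-in for the trained RNN's `predict` call, just to illustrate the loop; beam search would instead keep the top-k partial captions at every step.

```python
import numpy as np

vocab = ["<start>", "a", "dog", "runs", "<end>"]
word_to_id = {w: i for i, w in enumerate(vocab)}

def toy_next_word_probs(seq):
    # Hypothetical stand-in for the model: deterministic toy transitions.
    transitions = {"<start>": "a", "a": "dog", "dog": "runs", "runs": "<end>"}
    probs = np.full(len(vocab), 0.01)
    probs[word_to_id[transitions[seq[-1]]]] = 0.96
    return probs / probs.sum()

def greedy_caption(max_len=10):
    # Start from <start>, repeatedly append the argmax word until <end>.
    seq = ["<start>"]
    while len(seq) < max_len:
        next_word = vocab[int(np.argmax(toy_next_word_probs(seq)))]
        seq.append(next_word)
        if next_word == "<end>":
            break
    return " ".join(seq[1:-1])  # drop the <start>/<end> markers

print(greedy_caption())  # a dog runs
```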

You can download the weights here

Examples

(Sample captioned images after epochs 1, 7, and 12.)

Dependencies

  • Keras 1.2.2
  • Tensorflow 0.12.1
  • numpy
  • matplotlib

References

[1] Vinyals, Oriol, et al. "Show and tell: A neural image caption generator." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2015.

[2] Simonyan, Karen, and Andrew Zisserman. "Very deep convolutional networks for large-scale image recognition." arXiv preprint arXiv:1409.1556 (2014).
