1 change: 1 addition & 0 deletions .gitignore
@@ -1,2 +1,3 @@
*.npz
*.pkl
*.pyc
188 changes: 0 additions & 188 deletions README.md
@@ -1,188 +0,0 @@
# SalGAN: Visual Saliency Prediction with Generative Adversarial Networks

| ![Junting Pan][JuntingPan-photo] | ![Cristian Canton Ferrer][CristianCanton-photo] | ![Kevin McGuinness][KevinMcGuinness-photo] | ![Noel O'Connor][NoelOConnor-photo] | ![Jordi Torres][JordiTorres-photo] |![Elisa Sayrol][ElisaSayrol-photo] | ![Xavier Giro-i-Nieto][XavierGiro-photo] |
|:-:|:-:|:-:|:-:|:-:|:-:|:-:|
| [Junting Pan][JuntingPan-web] | [Cristian Canton Ferrer][CristianCanton-web] | [Kevin McGuinness][KevinMcGuinness-web] | [Noel O'Connor][NoelOConnor-web] | [Jordi Torres][JordiTorres-web] | [Elisa Sayrol][ElisaSayrol-web] | [Xavier Giro-i-Nieto][XavierGiro-web] |

[JuntingPan-web]: https://www.linkedin.com/in/junting-pan
[CristianCanton-web]: https://cristiancanton.github.io/
[KevinMcGuinness-web]: https://www.insight-centre.org/users/kevin-mcguinness
[JordiTorres-web]: http://jorditorres.org
[ElisaSayrol-web]: https://imatge.upc.edu/web/people/elisa-sayrol
[NoelOConnor-web]: https://www.insight-centre.org/users/noel-oconnor
[XavierGiro-web]: https://imatge.upc.edu/web/people/xavier-giro

[JuntingPan-photo]: https://raw.githubusercontent.com/imatge-upc/saliency-2016-cvpr/master/authors/JuntingPan.jpg "Junting Pan"
[KevinMcGuinness-photo]: https://raw.githubusercontent.com/imatge-upc/saliency-salgan-2017/junting/authors/Kevin160x160%202.jpg?token=AFOjyZmLlX3ZgpkNe60Vn3ruTsq01rD9ks5YdAaiwA%3D%3D "Kevin McGuinness"
[CristianCanton-photo]: https://raw.githubusercontent.com/imatge-upc/saliency-salgan-2017/junting/authors/CristianCanton.jpg?token=AFOjyS9qMOnUPVLZpqN80ChO0R-x0SI5ks5Yc3qJwA%3D%3D "Cristian Canton"
[JordiTorres-photo]: https://raw.githubusercontent.com/imatge-upc/saliency-salgan-2017/junting/authors/JordiTorres.jpg?token=AFOjyUaOhEyX2MGayU2C4tExpQeT0jFUks5Yc3vcwA%3D%3D
[ElisaSayrol-photo]: https://raw.githubusercontent.com/imatge-upc/saliency-2016-cvpr/master/authors/ElisaSayrol.jpg "Elisa Sayrol"
[NoelOConnor-photo]: https://raw.githubusercontent.com/imatge-upc/saliency-2016-cvpr/master/authors/NoelOConnor.jpg "Noel O'Connor"
[XavierGiro-photo]: https://raw.githubusercontent.com/imatge-upc/saliency-2016-cvpr/master/authors/XavierGiro.jpg "Xavier Giro-i-Nieto"

A joint collaboration between:

| ![logo-insight] | ![logo-dcu] | ![logo-microsoft] |![logo-bsc] | ![logo-gpi] |
|:-:|:-:|:-:|:-:|:-:|
| [Insight Centre for Data Analytics][insight-web] | [Dublin City University (DCU)][dcu-web] | [Microsoft][microsoft-web]|[Barcelona Supercomputing Center][bsc-web] | [UPC Image Processing Group][gpi-web] |

[insight-web]: https://www.insight-centre.org/
[dcu-web]: http://www.dcu.ie/
[microsoft-web]: https://www.microsoft.com/en-us/research/
[bsc-web]: https://www.bsc.es/
[upc-web]: http://www.upc.edu/?set_language=en
[etsetb-web]: https://www.etsetb.upc.edu/en/
[gpi-web]: https://imatge.upc.edu/web/


[logo-insight]: https://raw.githubusercontent.com/imatge-upc/saliency-2016-cvpr/master/logos/insight.jpg "Insight Centre for Data Analytics"
[logo-dcu]: https://raw.githubusercontent.com/imatge-upc/saliency-2016-cvpr/master/logos/dcu.png "Dublin City University"
[logo-microsoft]: https://raw.githubusercontent.com/imatge-upc/saliency-salgan-2017/junting/logos/microsoft.jpg?token=AFOjyc8Q1kkjcWIP-yen0FTEo0lsWPk6ks5Yc3j4wA%3D%3D "Microsoft"
[logo-bsc]: https://raw.githubusercontent.com/imatge-upc/saliency-salgan-2017/junting/logos/bsc320x86.jpg?token=AFOjyWSHWWVvzTXnYh1DiFvH2VoWykA3ks5Yc6Q1wA%3D%3D
[logo-upc]: https://raw.githubusercontent.com/imatge-upc/saliency-2016-cvpr/master/logos/upc.jpg "Universitat Politecnica de Catalunya"
[logo-etsetb]: https://raw.githubusercontent.com/imatge-upc/saliency-2016-cvpr/master/logos/etsetb.png "ETSETB TelecomBCN"
[logo-gpi]: https://raw.githubusercontent.com/imatge-upc/saliency-2016-cvpr/master/logos/gpi.png "UPC Image Processing Group"


## Abstract

We introduce SalGAN, a deep convolutional neural network for visual saliency prediction trained with adversarial examples.
The first stage of the network consists of a generator model whose weights are learned by back-propagation computed from a binary cross entropy (BCE) loss over downsampled versions of the saliency maps. The resulting prediction is processed by a discriminator network trained to solve a binary classification task between the saliency maps generated by the generative stage and the ground truth ones. Our experiments show how adversarial training allows reaching state-of-the-art performance across different metrics when combined with a widely-used loss function like BCE.
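
As a rough illustration of the content loss described above, the following NumPy sketch computes a per-pixel binary cross entropy between a predicted and a ground-truth saliency map after downsampling both. The function names, the average-pooling downsampler, and the factor of 4 are assumptions made for this example; they are not taken from the SalGAN code.

```python
import numpy as np


def downsample(saliency_map, factor=4):
    """Average-pool a 2-D map by an integer factor (assumed downsampler)."""
    h, w = saliency_map.shape
    h, w = h - h % factor, w - w % factor  # crop so the shape divides evenly
    cropped = saliency_map[:h, :w]
    return cropped.reshape(h // factor, factor, w // factor, factor).mean(axis=(1, 3))


def bce_loss(pred, target, eps=1e-7):
    """Mean binary cross entropy between two maps with values in [0, 1]."""
    pred = np.clip(pred, eps, 1.0 - eps)  # avoid log(0)
    return float(-np.mean(target * np.log(pred) + (1.0 - target) * np.log(1.0 - pred)))


# Random stand-ins for a predicted and a ground-truth saliency map.
rng = np.random.default_rng(0)
ground_truth = rng.random((192, 256))
prediction = rng.random((192, 256))
loss = bce_loss(downsample(prediction), downsample(ground_truth))
print(loss)
```

Downsampling before the comparison reduces the number of pixels that enter the loss; the right factor, if any, depends on the training setup.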

## Slides

<center>
<iframe src="//www.slideshare.net/slideshow/embed_code/key/5cXl80Fm2c3ksg" width="595" height="485" frameborder="0" marginwidth="0" marginheight="0" scrolling="no" style="border:1px solid #CCC; border-width:1px; margin-bottom:5px; max-width: 100%;" allowfullscreen> </iframe> <div style="margin-bottom:5px"> <strong> <a href="//www.slideshare.net/xavigiro/salgan-visual-saliency-prediction-with-generative-adversarial-networks" title="SalGAN: Visual Saliency Prediction with Generative Adversarial Networks" target="_blank">SalGAN: Visual Saliency Prediction with Generative Adversarial Networks</a> </strong> from <strong><a target="_blank" href="//www.slideshare.net/xavigiro">Xavier Giro</a></strong> </div>
</center>

## Publication

Find the pre-print version of our work on [arXiv](https://arxiv.org/abs/1701.01081).

![Image of the paper](https://raw.githubusercontent.com/imatge-upc/saliency-salgan-2017/master/figs/thumbnails.jpg)

Please cite with the following Bibtex code:

```
@InProceedings{Pan_2017_SalGAN,
author = {Pan, Junting and Canton, Cristian and McGuinness, Kevin and O'Connor, Noel E. and Torres, Jordi and Sayrol, Elisa and Giro-i-Nieto, Xavier},
title = {SalGAN: Visual Saliency Prediction with Generative Adversarial Networks},
booktitle = {arXiv},
month = {January},
year = {2017}
}
```

You may also want to refer to our publication with the more human-friendly Chicago style:

*Junting Pan, Cristian Canton, Kevin McGuinness, Noel E. O'Connor, Jordi Torres, Elisa Sayrol and Xavier Giro-i-Nieto. "SalGAN: Visual Saliency Prediction with Generative Adversarial Networks." arXiv. 2017.*



## Models

The SalGAN presented in our work can be downloaded from the links provided below the figure:

SalGAN Architecture
![architecture-fig]

* [[SalGAN Generator Model (127 MB)]](https://imatge.upc.edu/web/sites/default/files/resources/1720/saliency/2017-salgan/gen_modelWeights0090.npz)
* [[SalGAN Discriminator (3.4 MB)]](https://imatge.upc.edu/web/sites/default/files/resources/1720/saliency/2017-salgan/discrim_modelWeights0090.npz)
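
The weight files above use the `.npz` extension, so they can presumably be opened as ordinary NumPy archives. The sketch below lists the arrays stored in such an archive together with their shapes. `describe_npz` and the stand-in file `demo_weights.npz` are hypothetical names introduced for this example, and the stand-in shapes say nothing about the real SalGAN weights.

```python
import os
import tempfile

import numpy as np


def describe_npz(path):
    """Return {array_name: shape} for every array stored in an .npz archive."""
    with np.load(path) as archive:
        return {name: archive[name].shape for name in archive.files}


# Stand-in archive so the example runs without downloading the real weights.
tmp_dir = tempfile.mkdtemp()
demo_path = os.path.join(tmp_dir, 'demo_weights.npz')
np.savez(demo_path, np.zeros((64, 3, 3, 3)), np.zeros(64))
shapes = describe_npz(demo_path)
print(shapes)
```

In Lasagne, arrays like these are commonly restored into a network with `lasagne.layers.set_all_param_values`; whether this repository does so is not shown here.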

[architecture-fig]: https://raw.githubusercontent.com/imatge-upc/saliency-salgan-2017/junting/figs/fullarchitecture.jpg?token=AFOjyaH8cuBFWpldWWzo_TKVB-zekfxrks5Yc4NQwA%3D%3D "SALGAN architecture"
[shallow-model]: https://imatge.upc.edu/web/sites/default/files/resources/1720/saliency/2016-cvpr/shallow_net.pickle
[deep-model]: https://imatge.upc.edu/web/sites/default/files/resources/1720/saliency/2016-cvpr/deep_net_model.caffemodel
[deep-prototxt]: https://imatge.upc.edu/web/sites/default/files/resources/1720/saliency/2016-cvpr/deep_net_deploy.prototxt

## Visual Results

![Qualitative saliency predictions](https://raw.githubusercontent.com/imatge-upc/saliency-salgan-2017/junting/figs/qualitative.jpg?token=AFOjyaO0uT7l7qGzV7IyrcSgi8ieeayTks5Yc4s2wA%3D%3D)


## Datasets

### Training
As explained in our paper, our networks were trained on the training and validation data provided by [SALICON](http://salicon.net/).

### Test
Two different datasets were used for testing:
* Test partition of [SALICON](http://salicon.net/) dataset.
* [MIT300](http://saliency.mit.edu/datasets.html).


## Software frameworks

Our paper presents two convolutional neural networks: one corresponds to the Generator (Saliency Prediction Network) and the other is the Discriminator used for adversarial training. To compute saliency maps, only the Generator is needed.
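
To make the interplay between the two networks more concrete, here is a minimal NumPy sketch of how adversarial losses of this kind are often written. It assumes the discriminator outputs a probability that a saliency map is real. The function names and the weight `alpha` (with a placeholder value) are hypothetical and are not taken from the SalGAN code.

```python
import numpy as np


def bce(prob, label, eps=1e-7):
    """Mean binary cross entropy between probabilities and 0/1 labels."""
    prob = np.clip(prob, eps, 1.0 - eps)  # avoid log(0)
    return float(-np.mean(label * np.log(prob) + (1.0 - label) * np.log(1.0 - prob)))


def discriminator_loss(d_real, d_fake):
    """Ground-truth maps should score 1, generated maps should score 0."""
    return bce(d_real, np.ones_like(d_real)) + bce(d_fake, np.zeros_like(d_fake))


def generator_loss(d_fake, content_loss, alpha=0.005):
    """Weighted content loss plus the cost of failing to fool the discriminator.

    `alpha` is a placeholder weight chosen for illustration only.
    """
    return alpha * content_loss + bce(d_fake, np.ones_like(d_fake))


d_real = np.array([0.9, 0.8])  # discriminator scores on ground-truth maps
d_fake = np.array([0.1, 0.3])  # discriminator scores on generated maps
print(discriminator_loss(d_real, d_fake))
print(generator_loss(d_fake, content_loss=0.7))
```

The generator term shrinks as the discriminator assigns higher scores to generated maps, which is what pushes the two networks against each other.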

### SalGAN on Lasagne

SalGAN is implemented in [Lasagne](https://github.com/Lasagne/Lasagne), which in turn is built on top of [Theano](http://deeplearning.net/software/theano/).
```
pip install -r https://raw.githubusercontent.com/imatge-upc/saliency-salgan-2017/junting/requirements.txt
```

### Usage

To train our model from scratch, run the following command:
```
THEANO_FLAGS=mode=FAST_RUN,device=gpu,floatX=float32,lib.cnmem=1,optimizer_including=cudnn python 02-train.py
```
To predict saliency maps, run the following command after specifying the path to your images and the path to the output saliency maps:
```
THEANO_FLAGS=mode=FAST_RUN,device=gpu,floatX=float32,lib.cnmem=1,optimizer_including=cudnn python 03-predict.py
```
With the provided model weights you should obtain the following result:

| ![Image Stimuli] | ![Saliency Map] |
|:-:|:-:|

[Image Stimuli]:https://raw.githubusercontent.com/imatge-upc/saliency-salgan-2017/master/images/i112.jpg
[Saliency Map]:https://raw.githubusercontent.com/imatge-upc/saliency-salgan-2017/master/saliency/i112.jpg

Download the pretrained VGG-16 weights from: [vgg16.pkl](https://s3.amazonaws.com/lasagne/recipes/pretrained/imagenet/vgg16.pkl)
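
The `THEANO_FLAGS` prefix used in the commands above can also be assembled programmatically, which may be convenient when launching the scripts from another Python process. The helper `build_theano_env` below is a hypothetical name introduced for this sketch; it only builds the environment dictionary and does not import Theano.

```python
import os
from collections import OrderedDict


def build_theano_env(flags, base_env=None):
    """Return a copy of the environment with THEANO_FLAGS set from `flags`."""
    env = dict(os.environ if base_env is None else base_env)
    env['THEANO_FLAGS'] = ','.join('%s=%s' % (key, value) for key, value in flags.items())
    return env


flags = OrderedDict([
    ('mode', 'FAST_RUN'),
    ('device', 'gpu'),
    ('floatX', 'float32'),
    ('lib.cnmem', 1),
    ('optimizer_including', 'cudnn'),
])
# base_env={} keeps the example deterministic; omit it to inherit os.environ.
env = build_theano_env(flags, base_env={})
print(env['THEANO_FLAGS'])
```

The resulting dictionary could then be passed as the `env` argument of `subprocess.call(['python', '02-train.py'], env=...)`, ideally built without `base_env` so that `PATH` and similar variables are inherited.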


## Acknowledgements

We would like to especially thank Albert Gil Moreno and Josep Pujal from our technical support team at the Image Processing Group at the UPC.

| ![AlbertGil-photo] | ![JosepPujal-photo] |
|:-:|:-:|
| [Albert Gil][AlbertGil-web] | [Josep Pujal][JosepPujal-web] |

[AlbertGil-photo]: https://raw.githubusercontent.com/imatge-upc/saliency-2016-cvpr/master/authors/AlbertGil.jpg "Albert Gil"
[JosepPujal-photo]: https://raw.githubusercontent.com/imatge-upc/saliency-2016-cvpr/master/authors/JosepPujal.jpg "Josep Pujal"

[AlbertGil-web]: https://imatge.upc.edu/web/people/albert-gil-moreno
[JosepPujal-web]: https://imatge.upc.edu/web/people/josep-pujal

| | |
|:--|:-:|
| We gratefully acknowledge the support of [NVIDIA Corporation](http://www.nvidia.com/content/global/global.php) with the donation of the GeForce GTX [Titan Z](http://www.nvidia.com/gtx-700-graphics-cards/gtx-titan-z/) and [Titan X](http://www.geforce.com/hardware/desktop-gpus/geforce-gtx-titan-x) used in this work. | ![logo-nvidia] |
| The Image Processing Group at the UPC is a [SGR14 Consolidated Research Group](https://imatge.upc.edu/web/projects/sgr14-image-and-video-processing-group) recognized and sponsored by the Catalan Government (Generalitat de Catalunya) through its [AGAUR](http://agaur.gencat.cat/en/inici/index.html) office. | ![logo-catalonia] |
| This work has been developed in the framework of the projects [BigGraph TEC2013-43935-R](https://imatge.upc.edu/web/projects/biggraph-heterogeneous-information-and-graph-signal-processing-big-data-era-application) and [Malegra TEC2016-75976-R](https://imatge.upc.edu/web/projects/malegra-multimodal-signal-processing-and-machine-learning-graphs), funded by the Spanish Ministerio de Economía y Competitividad and the European Regional Development Fund (ERDF). | ![logo-spain] |
| This publication has emanated from research conducted with the financial support of Science Foundation Ireland (SFI) under grant number SFI/12/RC/2289. | ![logo-ireland] |

[logo-nvidia]: https://raw.githubusercontent.com/imatge-upc/saliency-2016-cvpr/master/logos/nvidia.jpg "Logo of NVidia"
[logo-catalonia]: https://raw.githubusercontent.com/imatge-upc/saliency-2016-cvpr/master/logos/generalitat.jpg "Logo of Catalan government"
[logo-spain]: https://raw.githubusercontent.com/imatge-upc/saliency-2016-cvpr/master/logos/MEyC.png "Logo of Spanish government"
[logo-ireland]: https://raw.githubusercontent.com/imatge-upc/saliency-2016-cvpr/master/logos/sfi.png "Logo of Science Foundation Ireland"

## Contact

If you have any general questions about our work or code that may be of interest to other researchers, please use the [public issues section](https://github.com/imatge-upc/saliency-salgan-2017/issues) of this GitHub repo. Alternatively, drop us an e-mail at <mailto:[email protected]>.

<!---
Javascript code to enable Google Analytics
-->

<script>
(function(i,s,o,g,r,a,m){i['GoogleAnalyticsObject']=r;i[r]=i[r]||function(){
(i[r].q=i[r].q||[]).push(arguments)},i[r].l=1*new Date();a=s.createElement(o),
m=s.getElementsByTagName(o)[0];a.async=1;a.src=g;m.parentNode.insertBefore(a,m)
})(window,document,'script','https://www.google-analytics.com/analytics.js','ga');

ga('create', 'UA-7678045-13', 'auto');
ga('send', 'pageview');

</script>
Binary file removed authors/AlbertGil.jpg
Binary file not shown.
Binary file removed authors/CristianCanton.jpg
Binary file not shown.
Binary file removed authors/ElisaSayrol.jpg
Binary file not shown.
Binary file removed authors/JordiTorres.jpg
Binary file not shown.
Binary file removed authors/JosepPujal.jpg
Binary file not shown.
Binary file removed authors/JuntingPan.jpg
Binary file not shown.
Binary file removed authors/Kevin160x160 2.jpg
Binary file not shown.
Binary file removed authors/KevinMcGuinness.jpg
Binary file not shown.
Binary file removed authors/KevinMcGuinness.png
Binary file not shown.
Binary file removed authors/NoelOConnor.jpg
Binary file not shown.
Binary file removed authors/NoelOConnor.png
Binary file not shown.
Binary file removed authors/XavierGiro.jpg
Binary file not shown.
3 changes: 3 additions & 0 deletions data/.gitignore
@@ -0,0 +1,3 @@
*
*/
!.gitignore
Binary file removed figs/fullarchitecture.jpg
Binary file not shown.
Binary file removed figs/qualitative.jpg
Binary file not shown.
Binary file removed figs/thumbnail.jpg
Binary file not shown.
Binary file removed figs/thumbnails.jpg
Binary file not shown.
Binary file removed images/i112.jpg
Binary file not shown.
Binary file removed logos/MEyC.png
Binary file not shown.
Binary file removed logos/bsc320x86.jpg
Binary file not shown.
Binary file removed logos/cvpr2016.jpg
Binary file not shown.
Binary file removed logos/dcu.png
Binary file not shown.
Binary file removed logos/etsetb.png
Binary file not shown.
Binary file removed logos/generalitat.jpg
Binary file not shown.
Binary file removed logos/gpi.png
Binary file not shown.
Binary file removed logos/insight.jpg
Binary file not shown.
Binary file removed logos/microsoft.jpg
Diff not rendered.
Binary file removed logos/nvidia.jpg
Diff not rendered.
Binary file removed logos/sfi.png
Diff not rendered.
Binary file removed logos/upc.jpg
Diff not rendered.
Binary file removed saliency/i112.jpg
Diff not rendered.
58 changes: 58 additions & 0 deletions scripts/00-data_preparation.py
@@ -0,0 +1,58 @@
import glob
import os

import numpy as np
from PIL import Image
from PIL import ImageOps
from tqdm import tqdm

# pathToImages and pathToMaps are expected to come from constants.
from constants import *


def augment_data():
    """Write rotated and flipped copies of every training image and its mask."""
    # Base names (no directory, no extension) of all files in the image folder.
    listImgFiles = [k.split('/')[-1].split('.')[0] for k in glob.glob(os.path.join(pathToImages, '*'))]
    listFilesTrain = [k for k in listImgFiles if 'train' in k]
    for filename in tqdm(listFilesTrain):
        src_im = Image.open(os.path.join(pathToImages, filename + '.png'))
        gt_im = Image.open(os.path.join(pathToMaps, filename + 'mask.png'))
        # Rotations by multiples of 90 degrees; expand=True keeps the full frame.
        for angle in [90, 180, 270]:
            rot_im = src_im.rotate(angle, expand=True)
            rot_gt = gt_im.rotate(angle, expand=True)
            rot_im.save(os.path.join(pathToImages, filename + '_' + str(angle) + '.png'))
            rot_gt.save(os.path.join(pathToMaps, filename + '_' + str(angle) + 'mask.png'))
        # Vertical flip (top to bottom) and horizontal mirror (left to right).
        vert_im = ImageOps.flip(src_im)
        vert_gt = ImageOps.flip(gt_im)
        horz_im = ImageOps.mirror(src_im)
        horz_gt = ImageOps.mirror(gt_im)
        vert_im.save(os.path.join(pathToImages, filename + '_vert.png'))
        vert_gt.save(os.path.join(pathToMaps, filename + '_vertmask.png'))
        horz_im.save(os.path.join(pathToImages, filename + '_horz.png'))
        horz_gt.save(os.path.join(pathToMaps, filename + '_horzmask.png'))

def split_data(fraction):
    """Randomly prefix `fraction` of the samples with 'train_' and the rest with 'val_'."""
    listImgFiles = [k.split('/')[-1].split('.')[0] for k in glob.glob(os.path.join(pathToImages, '*'))]
    numSamples = len(listImgFiles)
    # np.random.choice needs an integer sample count, so cast the result of np.ceil.
    numTrain = int(np.ceil(fraction * numSamples))
    train_ind = np.random.choice(numSamples, numTrain, replace=False)
    val_ind = np.setdiff1d(np.arange(numSamples), train_ind)

    for k in train_ind:
        os.rename(os.path.join(pathToImages, listImgFiles[k] + '.png'), os.path.join(pathToImages, 'train_' + listImgFiles[k] + '.png'))
        os.rename(os.path.join(pathToMaps, listImgFiles[k] + 'mask.png'), os.path.join(pathToMaps, 'train_' + listImgFiles[k] + 'mask.png'))
    for k in val_ind:
        os.rename(os.path.join(pathToImages, listImgFiles[k] + '.png'), os.path.join(pathToImages, 'val_' + listImgFiles[k] + '.png'))
        os.rename(os.path.join(pathToMaps, listImgFiles[k] + 'mask.png'), os.path.join(pathToMaps, 'val_' + listImgFiles[k] + 'mask.png'))


def main():
    # split_data(0.8)
    augment_data()


if __name__ == "__main__":
    main()