Vincent Fortin and I are using the UTK Faces dataset for our Machine Learning I project.
Class imbalance is one of the most frequent struggles when dealing with real data. Is it better to downsample, upsample, or do nothing at all? Another approach is to generate new samples resembling the smallest class. In this project, we use Variational Autoencoders (VAEs) and Generative Adversarial Networks (GANs) to generate samples of the smallest class. Using human faces, we will determine whether a convolutional neural network (CNN) trains better with generated samples or without them.
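Before reaching for generative models, the simplest baseline for imbalance is random oversampling: duplicate minority-class samples until the classes are balanced. A minimal sketch with NumPy, using toy arrays as stand-ins for the face data:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy imbalanced dataset: 900 samples of class 0 and 100 of class 1
# (stand-ins for the majority/minority face classes).
X = rng.normal(size=(1000, 8))
y = np.array([0] * 900 + [1] * 100)

# Random oversampling: draw minority indices with replacement
# until both classes have equal counts.
minority = np.flatnonzero(y == 1)
extra = rng.choice(minority, size=900 - len(minority), replace=True)
X_bal = np.concatenate([X, X[extra]])
y_bal = np.concatenate([y, y[extra]])

print(np.bincount(y_bal))  # both classes now have 900 samples
```

Generating new samples with a VAE or GAN replaces the duplicated rows above with synthetic ones, which is the hypothesis this project tests.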
- First we trained a VAE to generate human faces.
- Then we trained a ConvNet with PyTorch, but it didn't work.
- So we tried Keras to check whether our architecture was the problem. It wasn't: we reached 90% accuracy.
- Here is the Adversarial Autoencoder. The results are very clear.
- Here is the Wasserstein GAN.
- The Softmax GAN worked out pretty well.
- The Deep Convolutional GAN worked, but its performance is quite low.
- Finally fixed the PyTorch CNN, with 92% accuracy!
- When trained on the original samples, the CNN classified the generated samples with 100% accuracy.
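The VAE step above can be sketched in PyTorch. This is a minimal fully-connected version on flattened toy inputs, not the project's convolutional architecture; the image size (64×64) and latent dimension are assumptions for illustration:

```python
import torch
from torch import nn

class VAE(nn.Module):
    """Minimal fully-connected VAE (the project's model is convolutional)."""
    def __init__(self, d_in=64 * 64, d_z=32):
        super().__init__()
        self.enc = nn.Linear(d_in, 256)
        self.mu = nn.Linear(256, d_z)
        self.logvar = nn.Linear(256, d_z)
        self.dec = nn.Sequential(nn.Linear(d_z, 256), nn.ReLU(),
                                 nn.Linear(256, d_in), nn.Sigmoid())

    def forward(self, x):
        h = torch.relu(self.enc(x))
        mu, logvar = self.mu(h), self.logvar(h)
        # Reparameterisation trick: sample z while keeping gradients.
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)
        return self.dec(z), mu, logvar

def vae_loss(x, x_hat, mu, logvar):
    # Reconstruction term + KL divergence to the standard normal prior.
    recon = nn.functional.binary_cross_entropy(x_hat, x, reduction="sum")
    kld = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
    return recon + kld

x = torch.rand(4, 64 * 64)  # stand-in batch of flattened face images
model = VAE()
x_hat, mu, logvar = model(x)
loss = vae_loss(x, x_hat, mu, logvar)
print(x_hat.shape)
```

After training, new faces are generated by sampling `z` from the prior and passing it through the decoder alone.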
- Train a TensorFlow convolutional neural network as a classifier
- Create a GAN to generate human faces
- Explore other generative methods
- Train CNNs to see if accuracy improves with the generative methods
- Fix the PyTorch CNN
- Use Keras and pydot to plot the chosen architecture
- Use generated samples as a test set to see if there is untapped information
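The basic GAN training loop behind the variants listed above (Wasserstein, Softmax, DCGAN all modify its loss) can be sketched in PyTorch. The networks and data here are tiny 1-D stand-ins, not the project's convolutional face models:

```python
import torch
from torch import nn

# Toy generator and discriminator (the project uses convolutional nets
# on 64x64 face images; dimensions here are illustrative only).
G = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 8))
D = nn.Sequential(nn.Linear(8, 32), nn.ReLU(), nn.Linear(32, 1))

opt_g = torch.optim.Adam(G.parameters(), lr=1e-3)
opt_d = torch.optim.Adam(D.parameters(), lr=1e-3)
bce = nn.BCEWithLogitsLoss()

real = torch.randn(64, 8) + 2.0  # stand-in for real samples
for _ in range(5):
    # Discriminator step: push real -> 1, fake -> 0.
    fake = G(torch.randn(64, 16)).detach()
    loss_d = (bce(D(real), torch.ones(64, 1)) +
              bce(D(fake), torch.zeros(64, 1)))
    opt_d.zero_grad(); loss_d.backward(); opt_d.step()

    # Generator step: try to make D output 1 on fakes.
    fake = G(torch.randn(64, 16))
    loss_g = bce(D(fake), torch.ones(64, 1))
    opt_g.zero_grad(); loss_g.backward(); opt_g.step()
```

Swapping the BCE objective for a critic with weight clipping or gradient penalty gives the Wasserstein variant; replacing it with a softmax cross-entropy over the batch gives the Softmax GAN.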
- Create various sample generators
- Establish a benchmark CNN classifier, trained with 10% of the female samples (the smaller class)
- Train classifiers on 10% of the female samples, augmented with samples generated by:
- VAE
- GAN
- other methods
- Compare performance and plot the results
- Determine whether the generated samples contain information not present in the original pictures
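The comparison protocol above can be sketched end to end. Everything here is a toy stand-in: a nearest-centroid classifier plays the role of the CNN, Gaussian blobs play the role of face features, and "generated" samples are drawn from the same distribution as the minority class:

```python
import numpy as np

rng = np.random.default_rng(1)

def accuracy(X_tr, y_tr, X_te, y_te):
    """Nearest-centroid classifier as a stand-in for the project's CNN."""
    c0, c1 = X_tr[y_tr == 0].mean(0), X_tr[y_tr == 1].mean(0)
    pred = (np.linalg.norm(X_te - c1, axis=1) <
            np.linalg.norm(X_te - c0, axis=1)).astype(int)
    return (pred == y_te).mean()

# Toy training data: class 1 ("female") is kept at only 10%.
X_male = rng.normal(0.0, 1.0, size=(500, 16))
X_fem  = rng.normal(1.0, 1.0, size=(50, 16))
X_gen  = rng.normal(1.0, 1.0, size=(450, 16))  # pretend VAE/GAN output

# Balanced held-out test set.
X_te = np.concatenate([rng.normal(0, 1, (200, 16)),
                       rng.normal(1, 1, (200, 16))])
y_te = np.array([0] * 200 + [1] * 200)

# Benchmark: imbalanced training set; augmented: add generated samples.
baseline = accuracy(np.concatenate([X_male, X_fem]),
                    np.array([0] * 500 + [1] * 50), X_te, y_te)
augmented = accuracy(np.concatenate([X_male, X_fem, X_gen]),
                     np.array([0] * 500 + [1] * 500), X_te, y_te)
print(baseline, augmented)
```

The project's real experiment replaces each stand-in with the actual CNN, the UTK face images, and samples from the trained generators, holding the test set fixed across both conditions.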
This is the output (generated faces) of the adversarial autoencoder.
