DALLE on FashionGen
- I trained Dall-E + VQGAN on the FashionGen dataset (https://arxiv.org/abs/1806.08317) on Google Colab and got decent results.
- Without training the VQGAN on the FashionGen dataset, DALLE is really bad at generating faces, which makes the clothing generations look extremely strange.
Text to image generation and re-ranking by CLIP
Best 16 of 48 generations ranked by CLIP
Generations from the training set (including their ground truths)
Generations based on custom prompts (without ground truths)
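For readers unfamiliar with the re-ranking step, here is a minimal sketch of how "best 16 of 48" can be selected. It assumes each generated image and the prompt have already been embedded by CLIP (for example with its image and text encoders); the function names `cosine_similarity` and `rerank` and the toy vectors are hypothetical and only illustrate the idea, not the exact code used for the results above.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def rerank(text_embedding, image_embeddings, top_k=16):
    """Return the indices of the top_k images most similar to the text, best first."""
    scores = [cosine_similarity(text_embedding, e) for e in image_embeddings]
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return order[:top_k]

# Toy stand-ins for CLIP embeddings (real ones are much higher-dimensional).
text = [1.0, 0.0]
images = [[0.0, 1.0], [1.0, 0.1], [0.5, 0.5], [-1.0, 0.0]]

# Keep the 2 best of 4 here; the setup above would use top_k=16 on 48 images.
best = rerank(text, images, top_k=2)
```

With 48 generations per prompt, calling `rerank(text, images, top_k=16)` would return the indices of the 16 images to display.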
Model specifications
VAE
Trained VQGAN for 1 epoch on Fashion-Gen dataset
Embeddings: 1024
Batch size: 5
DALLE
Trained DALLE for 1 epoch on Fashion-Gen dataset
dim = 312
text_seq_len = 80
depth = 36
heads = 12
dim_head = 64
reversible = 0
attn_types = ('full', 'axial_row', 'axial_col', 'conv_like')
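To make the DALLE settings above easier to reuse, here is a small sketch that collects them as keyword arguments and derives two quantities from them. It assumes the attention types are cycled over the layers in the listed order and that each attention block has an inner width of `heads * dim_head`; both are assumptions about the transformer implementation, and the names `dalle_kwargs`, `layer_attn_types`, and `inner_dim` are hypothetical.

```python
from collections import Counter
from itertools import cycle, islice

# Hyperparameters copied from the DALLE list above.
dalle_kwargs = dict(
    dim=312,
    text_seq_len=80,
    depth=36,
    heads=12,
    dim_head=64,
    reversible=False,  # listed as 0 above
    attn_types=('full', 'axial_row', 'axial_col', 'conv_like'),
)

# Assumption: the attention types repeat in order across the layers.
layer_attn_types = list(
    islice(cycle(dalle_kwargs['attn_types']), dalle_kwargs['depth'])
)
attn_counts = Counter(layer_attn_types)

# Assumption: each attention block projects to heads * dim_head features.
inner_dim = dalle_kwargs['heads'] * dalle_kwargs['dim_head']
```

Under these assumptions, the 36 layers split evenly into 9 layers per attention type, and the attention inner width (768) is larger than the model width (312).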
Optimization
Optimizer: Adam
Learning rate: 4.5e-4
Gradient Clipping: 0.5
Batch size: 7
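As an illustration of how the optimization settings interact, here is a dependency-free sketch of one update step: gradients are first clipped to a global L2 norm of 0.5, then Adam is applied with a learning rate of 4.5e-4. It assumes the clipping is by global norm and that Adam uses the common defaults `beta1=0.9`, `beta2=0.999`, `eps=1e-8`, none of which are stated above. The helpers `clip_grad_norm` and `adam_step` are hypothetical stand-ins for what a framework provides.

```python
import math

def clip_grad_norm(grads, max_norm=0.5):
    """Scale grads in place so their global L2 norm is at most max_norm.

    Returns the norm measured before clipping.
    """
    total_norm = math.sqrt(sum(g * g for g in grads))
    if total_norm > max_norm:
        scale = max_norm / total_norm
        for i in range(len(grads)):
            grads[i] *= scale
    return total_norm

def adam_step(params, grads, m, v, t, lr=4.5e-4,
              beta1=0.9, beta2=0.999, eps=1e-8):
    """One in-place Adam update; t is the 1-based step count."""
    for i, g in enumerate(grads):
        m[i] = beta1 * m[i] + (1 - beta1) * g        # first-moment estimate
        v[i] = beta2 * v[i] + (1 - beta2) * g * g    # second-moment estimate
        m_hat = m[i] / (1 - beta1 ** t)              # bias correction
        v_hat = v[i] / (1 - beta2 ** t)
        params[i] -= lr * m_hat / (math.sqrt(v_hat) + eps)

# Toy parameters and gradients; the gradient norm (5.0) exceeds the threshold.
params = [1.0, -2.0]
grads = [3.0, 4.0]
m, v = [0.0, 0.0], [0.0, 0.0]

norm_before = clip_grad_norm(grads, max_norm=0.5)
adam_step(params, grads, m, v, t=1)
```

On the first step, Adam moves each parameter by roughly the learning rate in the direction opposite to its gradient, so clipping mainly matters for the moment estimates carried into later steps.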