Changes from all commits (51 commits)
d20cb93
Bump flask from 0.12.2 to 1.0 in /src/helper_files
dependabot[bot] Jul 19, 2019
cff8fd6
Merge pull request #3 from ALFA-group/dependabot/pip/src/helper_files…
hembergerik Aug 14, 2019
75b41cd
Create install_ubuntu.md
hembergerik Aug 23, 2019
86a5a70
Update network_data_loader.py
hembergerik Aug 23, 2019
f7b86cf
Merge branch 'master' into eh/ORNL
jamaltoutouh Sep 5, 2019
f943b78
Merge pull request #5 from ALFA-group/eh/ORNL
jamaltoutouh Sep 5, 2019
8437179
Update install_ubuntu.md
hembergerik Sep 13, 2019
df2a4b9
Fixing Dockerfile python call
Nov 6, 2019
9ace06a
Updating PyTorch version to 0.4.1
Nov 6, 2019
119772e
Merge pull request #7 from ALFA-group/jt/main-fixes
hembergerik Nov 6, 2019
8a755cf
Adding mixture weights optimization at the end of training
Nov 6, 2019
26d281f
Fixing L2 distance computation
Nov 6, 2019
ae1c6ff
Merge pull request #9 from ALFA-group/jt/main-fixes
jamaltoutouh Nov 6, 2019
651f999
Resolving merge conflicts
Nov 6, 2019
c4f77b9
Tests and refactoring
Nov 6, 2019
ee302e7
Merge branch 'master' into jt/mixture-weight-optimization
hembergerik Nov 6, 2019
816ec65
Merge pull request #8 from ALFA-group/jt/mixture-weight-optimization
hembergerik Nov 6, 2019
0d9b3e9
LipizzanerGANTrainer fix
Nov 6, 2019
bd29073
Data dieting implementation
Nov 6, 2019
500257c
Refactoring
Nov 7, 2019
660398a
Data dieting configuration files for tests
Nov 7, 2019
c796dc0
Merge pull request #10 from ALFA-group/jt/data-dieting
hembergerik Nov 7, 2019
d50123c
Adding last version of mustangs
Nov 7, 2019
7cdee1a
Added the last version of Mustangs
Nov 7, 2019
ca0a19e
Refactoring Mustangs
Nov 7, 2019
8b80350
Merge pull request #11 from ALFA-group/jt/mustangs
hembergerik Nov 7, 2019
d51fff4
Store cells checkpoint
Nov 19, 2019
d376e82
Pull request comments
Nov 20, 2019
6940bcd
Pull request comments
Nov 20, 2019
3500988
Merge pull request #13 from ALFA-group/jt/save-checkpoints
hembergerik Nov 20, 2019
65c0bc4
Store the network source of the individuals
Nov 20, 2019
7fd70b6
Added information about the position of the neighborhoods on the grid
Nov 20, 2019
f7a5243
Review comments
Nov 20, 2019
120670a
Merge pull request #14 from ALFA-group/jt/save-checkpoints
hembergerik Nov 20, 2019
2491dbf
Merge pull request #15 from ALFA-group/jt/store-grid-topology
hembergerik Nov 22, 2019
a8a06ca
Add mnist_labels files
floresd9 Jan 17, 2020
3bdc59a
Merge pull request #18 from ALFA-group/master
floresd9 Jan 17, 2020
664b445
Add multi-label per cell functionality
floresd9 Jan 22, 2020
ab94bd4
Finish merging
floresd9 Jan 22, 2020
45d782b
Fix import sampler
floresd9 Jan 22, 2020
5bae07d
Set up df/satori branch
floresd9 Jan 23, 2020
d81a793
Merge label selection into satori
floresd9 Jan 23, 2020
0baa836
Fully merge df/label-selection
floresd9 Jan 23, 2020
5828d67
up to date transforms
floresd9 Jan 28, 2020
e5402e9
Non-square grid error handling
floresd9 Jan 31, 2020
2d0df85
Check for local host on Summit
stevenryoung Jan 31, 2020
4b228f9
Fix bug in non-square grid error handling
floresd9 Mar 4, 2020
7d092f7
Calculate scores in clients for efficiency
floresd9 May 5, 2020
73467ff
MNIST dataset
floresd9 May 5, 2020
055e18f
Final rectangle grid handling
floresd9 May 5, 2020
5322588
Do not set client finish event too soon
floresd9 Feb 4, 2021
4 changes: 2 additions & 2 deletions gan-script.sh
@@ -16,8 +16,8 @@ echo "Client PIDS:"
 cat ${PID_FILE}
 sleep 5
 
-echo "Start master on GPU 4"
-export CUDA_VISIBLE_DEVICES=4;
+echo "Start master on GPU"
+# export CUDA_VISIBLE_DEVICES=4;
 python main.py train --distributed --master -f configuration/quickstart/mnist.yml
 
 echo "Begin kill clients"
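The change above stops hard-coding the master onto GPU 4 and leaves GPU selection to the environment. A minimal Python sketch of how `CUDA_VISIBLE_DEVICES` is typically consulted (the helper below is illustrative, not part of Lipizzaner):

```python
import os

# Illustrative helper: CUDA_VISIBLE_DEVICES must be set before the CUDA
# runtime initializes; an unset value leaves all GPUs visible to the
# framework, while "4" would expose only physical GPU 4 as device 0.
def visible_gpu_ids(env=None):
    """Parse CUDA_VISIBLE_DEVICES into a list of physical GPU indices."""
    env = os.environ if env is None else env
    raw = env.get("CUDA_VISIBLE_DEVICES")
    if raw is None:
        return None  # no restriction requested
    return [int(tok) for tok in raw.split(",") if tok.strip()]

print(visible_gpu_ids({"CUDA_VISIBLE_DEVICES": "4"}))  # [4]
```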
45 changes: 45 additions & 0 deletions install_ubuntu.md
@@ -0,0 +1,45 @@
# Lipizzaner

## Setup
```
git clone https://github.com/ALFA-group/lipizzaner-gan.git
cd lipizzaner-gan/
python3 --version
sudo apt-get update
sudo apt-get install python3-venv
python3 -m venv ~/my367
source ~/my367/bin/activate
sudo apt-get install python3-dev
sudo apt-get install gcc
pip install -r ./src/helper_files/requirements.txt
```

## MNIST
```
cd src/
python main.py train --distributed --client & sleep 5;
python main.py train --distributed --client & sleep 5;
python main.py train --distributed --client & sleep 5;
python main.py train --distributed --client &
ps
wget http://0.0.0.0:5000/status
cat status
python main.py train --distributed --master -f configuration/quickstart/mnist.yml
```
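The quickstart above checks that the clients are up by fetching the master's `/status` endpoint with `wget`. As a hedged illustration, the snippet below parses a hypothetical JSON payload from such an endpoint; the field names `address`, `port`, and `busy` are assumptions, not the documented response format:

```python
import json

# Hypothetical status payload parser: counts how many clients report
# themselves as busy. The payload shape is an assumption for
# illustration only.
def summarize_status(payload: str) -> int:
    data = json.loads(payload)
    return sum(1 for client in data if client.get("busy"))

example = ('[{"address": "127.0.0.1", "port": 5000, "busy": true},'
           ' {"address": "127.0.0.1", "port": 5001, "busy": false}]')
print(summarize_status(example))  # 1
```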

## Theoretical GAN
```
cd ../theoretical_experiments
sudo apt-get install python3-tk
python gaussian_gan.py
```

## Network traffic
```
cd ../src/data/network_data/
sudo apt-get install argus-client
./collect_network_traffic.sh
python analyze_network_file.py --pcap_file=network_capture.pcap --sequence_length=30
cd ../../
python main.py train --distributed --master -f configuration/lipizzaner-gan/network_traffic.yml
```
3 changes: 1 addition & 2 deletions lipi-mnist-satori.lsf
@@ -6,9 +6,8 @@
 #BSUB -R "span[ptile=4]"
 #BSUB -gpu "num=4"
 #BSUB -q "normal"
-#BSUB -x
 
-HOME2=/nobackup/users/ehemberg
+HOME2=/nobackup/users/floresd
 PYTHON_VIRTUAL_ENVIRONMENT=lipi
 CONDA_ROOT=$HOME2/anaconda3
 source ${CONDA_ROOT}/etc/profile.d/conda.sh
2 changes: 1 addition & 1 deletion src/Dockerfile
@@ -27,4 +27,4 @@ COPY helper_files/requirements.txt ./helper_files/
 RUN pip install -r ./helper_files/requirements.txt
 
 COPY . .
-CMD [ "sh", "-c", "python3.6 train ./main.py --distributed --${role} -f ${config_file}" ]
+CMD [ "sh", "-c", "python3.6 ./main.py train --distributed --${role} -f ${config_file}" ]
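The fix swaps the argument order so that `train` follows the script name, since main.py parses it as a subcommand. A hedged sketch of a CLI with that shape (the parser below is an illustration, not Lipizzaner's actual argument handling):

```python
import argparse

# Illustrative parser: "train" is a subcommand, so it must come after
# the script name; --master/--client pick the role and -f the config.
def build_parser():
    parser = argparse.ArgumentParser(prog="main.py")
    sub = parser.add_subparsers(dest="task", required=True)
    train = sub.add_parser("train")
    train.add_argument("--distributed", action="store_true")
    role = train.add_mutually_exclusive_group(required=True)
    role.add_argument("--master", action="store_true")
    role.add_argument("--client", action="store_true")
    train.add_argument("-f", dest="config_file")
    return parser

args = build_parser().parse_args(
    ["train", "--distributed", "--master", "-f", "configuration/quickstart/mnist.yml"])
print(args.task, args.master)  # train True
```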
47 changes: 47 additions & 0 deletions src/configuration/lipizzaner-gan/mnist_labels.yml
@@ -0,0 +1,47 @@
trainer:
  name: lipizzaner_gan
  n_iterations: 200
  calculate_net_weights_dist: True
  # independent_probability, exact_proportion
  mixture_generator_samples_mode: exact_proportion
  params:
    population_size: 1
    tournament_size: 2
    n_replacements: 1
    default_adam_learning_rate: 0.0002
    # Hyperparameter mutation
    alpha: 0.0001
    mutation_probability: 0.5
    discriminator_skip_each_nth_step: 1
    mixture_sigma: 0.01
    enable_selection: True
    score:
      enabled: True
      type: fid
      score_sample_size: 1000
      cuda: True
    fitness:
      fitness_sample_size: 1000
      fitness_mode: average # worse, best, average
dataloader:
  dataset_name: mnist_labels
  use_batch: True
  batch_size: 50
  n_batches: 0
  shuffle: True
  labels:
    - 1
    - 2
    - 3
    - 4
    - 5
  labels_per_cell: 3
network:
  name: four_layer_perceptron
  loss: bceloss
master:
  calculate_score: True
  # Same amount of data as original CIFAR contains
  score_sample_size: 50000
  cuda: True
general: !include ../general.yml
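With `labels` listing 1 through 5 and `labels_per_cell: 3`, each grid cell trains on a three-label subset of the dataset. The sketch below shows one plausible way such a subset could be drawn; the sampling policy is an assumption for illustration, not Lipizzaner's actual implementation:

```python
import random

# Hypothetical label assignment: draw labels_per_cell distinct labels
# for a cell, seeded so the assignment is reproducible per cell.
def labels_for_cell(labels, labels_per_cell, seed):
    rng = random.Random(seed)
    return sorted(rng.sample(labels, labels_per_cell))

cell_labels = labels_for_cell([1, 2, 3, 4, 5], 3, seed=1)
print(cell_labels)
```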
19 changes: 19 additions & 0 deletions src/configuration/quickstart-data-dieting/general.yml
@@ -0,0 +1,19 @@
logging:
  enabled: True
  log_level: INFO
  log_server: # Fill in connection string with read/write access here
  image_format: jpg
  print_discriminator: False
losswise:
  enabled: False
  api_key: # Fill in API key
output_dir: ./output
distribution:
  auto_discover: False
  master_node:
    exit_clients_on_disconnect: True
  client_nodes:
    - address: 127.0.0.1 # Fill in IP address here
      port: 5000-5003
seed: 1
num_workers: 0 # how many subprocesses to use for data loading
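The `client_nodes` entry pairs one address with the port range `5000-5003`. A master process could expand that range into one endpoint per client roughly like this (the helper name and approach are illustrative, not Lipizzaner's code):

```python
# Hypothetical expansion of a "lo-hi" port range into per-client
# endpoints, matching the address/port layout shown in general.yml.
def expand_clients(address: str, port_range: str):
    lo, hi = (int(p) for p in port_range.split("-"))
    return [f"{address}:{port}" for port in range(lo, hi + 1)]

print(expand_clients("127.0.0.1", "5000-5003"))
# ['127.0.0.1:5000', '127.0.0.1:5001', '127.0.0.1:5002', '127.0.0.1:5003']
```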
45 changes: 45 additions & 0 deletions src/configuration/quickstart-data-dieting/mnist.yml
@@ -0,0 +1,45 @@
trainer:
  name: lipizzaner_gan
  n_iterations: 2
  calculate_net_weights_dist: True
  # independent_probability, exact_proportion
  mixture_generator_samples_mode: exact_proportion
  params:
    population_size: 1
    tournament_size: 2
    n_replacements: 1
    default_adam_learning_rate: 0.0002
    # Hyperparameter mutation
    alpha: 0.0001
    mutation_probability: 0.5
    discriminator_skip_each_nth_step: 1
    #mixture_sigma: 0.01
    enable_selection: True
    score:
      enabled: True
      type: fid
      score_sample_size: 10000
      cuda: True
    fitness:
      fitness_sample_size: 1000
      fitness_mode: average # worse, best, average
    optimize_mixture:
      es_generations: 50
      es_score_sample_size: 10000
      es_random_init: False
      mixture_sigma: 0.01
dataloader:
  dataset_name: mnist
  use_batch: True
  batch_size: 100
  n_batches: 0
  shuffle: True
  sampling_ratio: 0.5
network:
  name: four_layer_perceptron
  loss: bceloss
master:
  calculate_score: True
  score_sample_size: 50000
  cuda: True
general: !include general.yml
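The `sampling_ratio: 0.5` key is what enables data dieting: each cell trains on a random half of the dataset. A minimal sketch of such index subsampling, under the assumption that samples are drawn uniformly without replacement (the helper is illustrative, not Lipizzaner's sampler):

```python
import random

# Hypothetical data-dieting sampler: keep a random fraction of the
# training indices, as suggested by sampling_ratio above.
def diet_indices(n_samples, sampling_ratio, seed=1):
    rng = random.Random(seed)
    k = int(n_samples * sampling_ratio)
    return sorted(rng.sample(range(n_samples), k))

subset = diet_indices(10, 0.5)
print(len(subset))  # 5
```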
19 changes: 19 additions & 0 deletions src/configuration/quickstart-weights-optimization/general.yml
@@ -0,0 +1,19 @@
logging:
  enabled: True
  log_level: INFO
  log_server: # Fill in connection string with read/write access here
  image_format: jpg
  print_discriminator: False
losswise:
  enabled: False
  api_key: # Fill in API key
output_dir: ./output
distribution:
  auto_discover: False
  master_node:
    exit_clients_on_disconnect: True
  client_nodes:
    - address: 127.0.0.1 # Fill in IP address here
      port: 5000-5003
seed: 1
num_workers: 0 # how many subprocesses to use for data loading
44 changes: 44 additions & 0 deletions src/configuration/quickstart-weights-optimization/mnist.yml
@@ -0,0 +1,44 @@
trainer:
  name: lipizzaner_gan
  n_iterations: 2
  calculate_net_weights_dist: True
  # independent_probability, exact_proportion
  mixture_generator_samples_mode: exact_proportion
  params:
    population_size: 1
    tournament_size: 2
    n_replacements: 1
    default_adam_learning_rate: 0.0002
    # Hyperparameter mutation
    alpha: 0.0001
    mutation_probability: 0.5
    discriminator_skip_each_nth_step: 1
    #mixture_sigma: 0.01
    enable_selection: True
    score:
      enabled: True
      type: fid
      score_sample_size: 10000
      cuda: True
    fitness:
      fitness_sample_size: 1000
      fitness_mode: average # worse, best, average
    optimize_mixture:
      es_generations: 50
      es_score_sample_size: 10000
      es_random_init: False
      mixture_sigma: 0.01
dataloader:
  dataset_name: mnist
  use_batch: True
  batch_size: 100
  n_batches: 10
  shuffle: True
network:
  name: four_layer_perceptron
  loss: bceloss
master:
  calculate_score: True
  score_sample_size: 50000
  cuda: True
general: !include general.yml
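The `optimize_mixture` block configures an evolution-strategy (ES) search over the generator mixture weights after training. The loop below sketches a (1+1)-ES consistent with the `es_generations` and `mixture_sigma` keys; the score function is a toy stand-in for FID, and the whole implementation is illustrative rather than Lipizzaner's actual optimizer:

```python
import random

# Illustrative (1+1)-ES: perturb the mixture weights with Gaussian
# noise of scale sigma, renormalize to a valid mixture, and keep the
# mutant only if the score improves (lower is better, as for FID).
def optimize_mixture(weights, score_fn, generations=50, sigma=0.01, seed=1):
    rng = random.Random(seed)
    best, best_score = list(weights), score_fn(weights)
    for _ in range(generations):
        mutant = [max(w + rng.gauss(0.0, sigma), 0.0) for w in best]
        total = sum(mutant) or 1.0
        mutant = [w / total for w in mutant]
        s = score_fn(mutant)
        if s < best_score:
            best, best_score = mutant, s
    return best

# Toy score: squared distance from uniform weights, standing in for FID.
score = lambda w: sum((x - 0.25) ** 2 for x in w)
result = optimize_mixture([0.4, 0.3, 0.2, 0.1], score)
print(abs(sum(result) - 1.0) < 1e-9)  # True
```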
13 changes: 9 additions & 4 deletions src/configuration/quickstart/mnist.yml
@@ -1,12 +1,12 @@
 trainer:
   name: lipizzaner_gan
-  n_iterations: 5
+  n_iterations: 200
   calculate_net_weights_dist: True
   # independent_probability, exact_proportion
   mixture_generator_samples_mode: exact_proportion
   params:
     population_size: 1
-    tournament_size: 1
+    tournament_size: 2
     n_replacements: 1
     default_adam_learning_rate: 0.0002
     # Hyperparameter mutation
@@ -19,14 +19,19 @@ trainer:
       enabled: True
       type: fid
       score_sample_size: 1000
-      cuda: False
+      cuda: True
     fitness:
       fitness_sample_size: 1000
       fitness_mode: average # worse, best, average
+    optimize_mixture:
+      es_generations: 10
+      es_score_sample_size: 10000
+      es_random_init: False
+      mixture_sigma: 0.01
 dataloader:
   dataset_name: mnist
   use_batch: True
-  batch_size: 400
+  batch_size: 100
   n_batches: 0
   shuffle: True
 network:
46 changes: 46 additions & 0 deletions src/configuration/quickstart/mnist_labels.yml
@@ -0,0 +1,46 @@
trainer:
  name: lipizzaner_gan
  n_iterations: 5
  calculate_net_weights_dist: True
  # independent_probability, exact_proportion
  mixture_generator_samples_mode: exact_proportion
  params:
    population_size: 1
    tournament_size: 2
    n_replacements: 1
    default_adam_learning_rate: 0.0002
    # Hyperparameter mutation
    alpha: 0.0001
    mutation_probability: 0.5
    discriminator_skip_each_nth_step: 1
    mixture_sigma: 0.01
    enable_selection: True
    score:
      enabled: True
      type: fid
      score_sample_size: 1000
      cuda: True
    fitness:
      fitness_sample_size: 1000
      fitness_mode: average # worse, best, average
dataloader:
  dataset_name: mnist_labels
  use_batch: True
  batch_size: 400
  n_batches: 0
  shuffle: True
  labels:
    - 1
    - 2
    - 3
    - 4
    - 5
  labels_per_cell: 3
network:
  name: four_layer_perceptron
  loss: bceloss
master:
  calculate_score: True
  score_sample_size: 50000
  cuda: True
general: !include general.yml
11 changes: 11 additions & 0 deletions src/configuration/tests/README.md
@@ -0,0 +1,11 @@
### Configuration files to test new Lipizzaner functionalities

- **checkpointing**: A new feature that stores the current status of each client node (cell) in its output folder. The stored information includes the *genome* (the network), the current iteration, the learning rate, and everything else needed to resume the experiment from the given checkpoint.
- **weights-optimization**: Tests the optimization of the mixture weights at the end of the training process. It basically calls that function without performing any training epochs.
- **data-dieting**: Allows selecting the portion of the training dataset used in each cell to train the networks. The samples of this reduced dataset are picked at random.
- **mustangs**: Applies Mustangs, the idea of randomly picking a loss function from three alternatives (BCE, MSE, and the heuristic loss). It is introduced in **Spatial evolutionary generative adversarial networks**:

Jamal Toutouh, Erik Hemberg, and Una-May O'Reilly. 2019. Spatial evolutionary generative adversarial networks. In Proceedings of the Genetic and Evolutionary Computation Conference (GECCO '19), Manuel López-Ibáñez (Ed.). ACM, New York, NY, USA, 472-480. DOI: https://doi.org/10.1145/3321707.3321860
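The Mustangs idea described above can be sketched as follows; the loss names are stand-in strings rather than Lipizzaner's actual loss classes:

```python
import random

# Illustrative Mustangs step: at each training step, draw one of the
# three candidate losses uniformly at random.
LOSSES = ["bce", "mse", "heuristic"]

def pick_loss(rng):
    return rng.choice(LOSSES)

rng = random.Random(7)
draws = [pick_loss(rng) for _ in range(100)]
print(sorted(set(draws)))
```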


20 changes: 20 additions & 0 deletions src/configuration/tests/checkpointing/general.yml
@@ -0,0 +1,20 @@
logging:
  enabled: True
  log_level: INFO
  log_server: # Fill in connection string with read/write access here
  image_format: jpg
  print_discriminator: False
losswise:
  enabled: False
  api_key: # Fill in API key
output_dir: ./output
distribution:
  auto_discover: False
  master_node:
    exit_clients_on_disconnect: True
  client_nodes:
    - address: 127.0.0.1 # Fill in IP address here
      port: 5000-5003
seed: 1
num_workers: 0 # How many subprocesses to use for data loading
checkpoint_period: 2 # Number of iterations between checkpoints (0: no checkpoints are stored)
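Assuming `checkpoint_period` means "store a checkpoint every N iterations, with 0 disabling checkpointing", the gating logic reduces to a one-line check (the helper below is an illustration of that reading, not Lipizzaner's code):

```python
# Illustrative checkpoint gate: True on every checkpoint_period-th
# iteration; a period of 0 disables checkpointing entirely.
def should_checkpoint(iteration, checkpoint_period):
    return checkpoint_period > 0 and iteration % checkpoint_period == 0

print([i for i in range(1, 7) if should_checkpoint(i, 2)])  # [2, 4, 6]
```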