This repository was archived by the owner on Jul 1, 2024. It is now read-only.

Commit 77e2ab8

sandeepkrishnamurthy-dev and roywei authored and committed
Add MXNet Backend (#59)
* Adding MXNet backend template. Adding all basic Variable and Tensor operations (#1)
* add activation functions
* add activation functions
* fix some legacy
* fix some legacy
* cross entropy
* cross entropy
* fix name scoping introduced in 2.0
* fix name scoping introduced in 2.0
* Add dropout, l2_normalization, random_normal/uniform/binomial (#2)
* remove the logic for hacking RNN
* remove the logic for hacking RNN
* add pooling with utils
* add pooling with utils
* minor
* lint and name scope fix
* fix access protected var
* fix add neighbor, removed __eq__ in KerasSymbol
* fix eval function, unittest for placeholder and variable
* add unittests
* fix bug
* fix bug
* fix
* add some temporary fixes in mxnet backend. undo change to the pytest.ini
* mxnet_backend graph fix, layer support (#3)
* add activation functions
* fix some legacy
* cross entropy
* fix name scoping introduced in 2.0
* Add dropout, l2_normalization, random_normal/uniform/binomial (#2)
* remove the logic for hacking RNN
* add pooling with utils
* add activation functions
* fix some legacy
* cross entropy
* fix name scoping introduced in 2.0
* remove the logic for hacking RNN
* add pooling with utils
* minor
* lint and name scope fix
* fix access protected var
* fix add neighbor, removed __eq__ in KerasSymbol
* fix eval function, unittest for placeholder and variable
* add unittests
* fix bug
* fix bug
* fix
* add some temporary fixes in mxnet backend. undo change to the pytest.ini
* Keras function not working is a known issue, add skip in the test
* fix random_uniform/constant
* fix legacy randomize methods
* Fix MXNet backend operator bugs. Enabled Keras backend tests
* add bias
* Add Amazon copyrights to License (#6)
* fix
* fix
* fix backend for mlp
* fix context management, add optimizers
* minor change
* undo changes on example
* fix eval
* minor cleanup
* fix some property usage
* fixing AlphaDropout, not finished yet
* add mx model instantiate
* modifies training model construct logic, fix some tests. fix reshape layer.
* minor fix
* fix bias_add
* more fix on Dense and bias_add
* In progress commit
* fix comment
* small fix
* remove pytest.skip in conv3d. But it failed with theano backend in my workspace though.
* Add conv2d and in_topk operator for mxnet backend (#11)
* Skip BatchDot tests for Theano backend. (#12)
* BatchDot, Basic Batchnorm, Fix BiasAdd, Fix Conv2D, CodeCleanup (#14)
* Fix Conv2d shape issues and enable Conv2D UTs
* Remove redundant mxnet only unit tests
* Adding batch_dot, remove deconv, code comments and cleanup
* Remove buggy conv1d implementation
* Fix CR comments. Fix lint check issues
* Move mxnet specific code from keras engine to mxnet_backend. (#15)
* Move MXNet optimizers from keras optimizers to mxnet backend (#16)
* Fix bug in reshape. Minor rename to avoid local conflicts
* Bug fixes and enable/skip all Keras tests for mxnet backend (#21)
* test results - 374 passed, 235 skipped in 114.44 seconds
* fix/skip keras tests - tests/integration_tests, tests/keras/applications
* fix/skip keras tests - tests/keras/engine/test_topology
* fix/skip keras tests - tests/keras/engine/test_training
* fix/skip keras tests - tests/keras/legacy/
* fix/skip keras tests - tests/keras/preprocessing
* fix/skip keras tests - tests/keras/utils/
* Fix CR comments
* Fix issues in zero_padding. Fix/Enable tests/layers/convolutional_test
* Add momentum to batchnorm. Enable/skip tests in layers/core, local, merge, noise, normalization
* Skip RNN tests in keras/tests/layers/recurrent_test, wrappers_test
* Fix bug in spatial padding, enable/skip tests in loss, optimizers, callback, loss_weighting, model_saving
* Fix mxnet backend multi-gpu training (#31): fixing bug for mxnet backend to use multiple GPUs
* Fix performance issue - Batchnormalization, Conv operator (#35)
* Fix default axis for batchnorm layer for channels_first data_format
* Performance improvement by avoiding kernel transpose in conv operation for channels_first format
* Fix model - architecture, weights and both, load and save. (#36)
* Prepare initial version of mxnet related documentation in keras (#38)
* Skip failing unit tests for unsupported functionality in mxnet backend
* Fix pep tests reported by CI
* Use pytest module skip, revert kernel_shape logic
* remove data_format param from bias_add API
* Allow Predict() without compile for mxnet backend and enable tests. contributor - roywei@
* Fix bug - mxnet backend should not override keras config data_format to channels_first. Only warn of low performance
* Conv3d() operator implementation for Keras2.0 using MXNet backend (#40)
* conv3d implementation for keras2.0 as MXNet backend
* conv3d implementation/testing for keras2.0 using MXNet backend
* keeping -n option in pytest.ini file
* fixed comments given by Sandeep
* Add Conv1D support for MXNet backend (#44)
* Add Conv1D support for MXNet backend
* Fix CR comments
* Conv2d transpose (#47)
* add conv2d_transpose
* conv2d transpose for both channels, enabled test case
* add detailed comments and examples, fix style issue
* enable test case in topology
* Enable performance optimization for conv operators with MXNet backend. Make MXNet default backend with this branch (#48)
* Fix conv kernel shape bug for TF backend. (#50)
* Add support for keras multi_gpu_model() API with MXNet backend (#49) (usage sketch after this list)
* Add support for keras multi_gpu_model() API with MXNet backend. Autoset GPU0 context on GPU machine
* Fix typo
* Add SAME padding mode support for pooling operator. (#51)
* Add rnn() operator for MXNet backend with unrolling and masking feature (#46)
* Adding rnn() operator in Keras2.0 with MXNet as backend with unroll=True and Masking=True/False and enabled relevant testcases. Also, modified a couple of operators.
* Modified comments
* Added comments to a method
* Enable categorical crossentropy testcases and made minor changes
* Modified message
* nit
* Added detailed description of handling variable length input in RNN
* Skip conv2d_transpose and conv3d_transpose test-case for MXNet backend and minor changes in rnn()
* Adamax and NAdam optimizer for MXNet backend (#54)
* Add Adamax optimizer for MXNet backend
* Fix lr and adamax params
* Add Nadam optimizer for mxnet backend
* Add Conv3d transpose (#52)
* conv3d transpose, enabled test case
* update kernel shape
* replace conv2d_transpose conv3d_transpose with convnd_transpose
* update value errors with MXNet Backend info, fix typo
* add check for conv3d transpose only supports gpu with cudnn
* update context check
* disable conv3d transpose test
* fix typo in comment
* Rebase to latest Keras - April 3, 2018
* Add build badges
* Fix multi_gpu API bug for CPU. Fix PEP. (#64)
* Fix multi_gpu API bug for CPU. Fix PEP.
* fix embedding layer bug (#61)
* fix embedding bug
* addressed comments, enabled more test cases
* add keras test
* reduce line length
* fix style, add blank lines
* Benchmark (#55)
* add conv2d_transpose
* conv2d transpose for both channels, enabled test case
* add detailed comments and examples, fix style issue
* add benchmark scripts for resnet and imagenet data
* combine scripts
* fix args
* fix num of gpus
* update log
* multi_gpu_model only support tf
* add benchmark scripts for synthetic data
* update readme and scripts
* add mxnet training result table
* update on readme
* add cifar10 dataset and enable various resnet layers
* fix compile for mxnet multiple gpu
* update callbacks
* update synthetic data script, add credits
* undo new line
* update readme, addressed pr comments
* update readme
* benchmark scripts style fix (#66)
* style fix
* remove unused import, fix line too long
* addressed pr comments
* Added keras util API for conversion of data tensor from channels_last to channels_first using MXNet backend (#65)
* Added keras util API for conversion of data tensor from channels_last to channels_first using MXNet backend
* Modified comments
* Addressed review comments and made the API more generic across backends
* Removed shape check
* Modified comments
* Added edge cases
* moved helper method as nested
* Added RNN benchmark scripts (#69)
* Added RNN benchmark scripts
* Fixed new line in bash script
* Removed different backend code and modified comments
* Removed spacing
* Automated the wikiText2 download script
* Added dataset_util functionality to have more flexible code
* Added minor comments
* modified minor comments
* Fixed the multi-gpu context (#68)
* Update benchmark result (#70)
* update benchmark result
* update result
* simplify folder structure
* add image result
* add note
* add note
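Several of the squashed items above (e.g. #49, #64, #68) add support for Keras's `multi_gpu_model()` API on the MXNet backend. A minimal usage sketch is shown below, assuming a GPU machine with the MXNet backend configured; the ResNet50 model, SGD optimizer, and random data are illustrative placeholders rather than anything taken from this commit.

```python
# Hedged sketch of the multi_gpu_model() workflow enabled for the MXNet backend.
# Assumes image_data_format == "channels_first" and a machine with 4 GPUs.
import numpy as np
from keras.applications.resnet50 import ResNet50
from keras.utils import multi_gpu_model

# Build a single-device model (placeholder choice of architecture).
model = ResNet50(weights=None, input_shape=(3, 224, 224), classes=1000)

# Replicate the model across 4 GPUs; each batch is split across the replicas.
parallel_model = multi_gpu_model(model, gpus=4)
parallel_model.compile(loss='categorical_crossentropy', optimizer='sgd')

# Dummy channels_first images and labels, just to show the call pattern.
x = np.random.random((128, 3, 224, 224))
y = np.random.random((128, 1000))
parallel_model.fit(x, y, epochs=1, batch_size=128)
```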
1 parent ef13db0 commit 77e2ab8


69 files changed: +7165 −177 lines

.travis.yml

Lines changed: 13 additions & 2 deletions

@@ -21,6 +21,10 @@ matrix:
       env: KERAS_BACKEND=cntk PYTHONWARNINGS=ignore
     - python: 3.6
       env: KERAS_BACKEND=cntk PYTHONWARNINGS=ignore
+    - python: 2.7
+      env: KERAS_BACKEND=mxnet PYTHONWARNINGS=ignore
+    - python: 3.6
+      env: KERAS_BACKEND=mxnet PYTHONWARNINGS=ignore
 install:
   # code below is taken from http://conda.pydata.org/docs/travis.html
   # We do this conditionally because it saves us some downloading if the
@@ -38,7 +42,7 @@ install:
   # Useful for debugging any issues with conda
   - conda info -a

-  - conda create -q -n test-environment python=$TRAVIS_PYTHON_VERSION pytest pandas
+  - conda create -q -n test-environment python=$TRAVIS_PYTHON_VERSION nose scipy matplotlib pandas pytest h5py
   - source activate test-environment
   - pip install --only-binary=numpy,scipy numpy nose scipy matplotlib h5py theano
   - conda install mkl mkl-service
@@ -57,7 +61,11 @@ install:

   # install TensorFlow (CPU version).
   - pip install tensorflow
-
+
+  # install Apache MXNet (CPU version).
+  - pip install mxnet
+  - pip install --upgrade numpy
+
   # install cntk
   - if [[ "$TRAVIS_PYTHON_VERSION" == "2.7" ]]; then
       pip install https://cntk.ai/PythonWheel/CPU-Only/cntk-2.3.1-cp27-cp27mu-linux_x86_64.whl;
@@ -78,6 +86,9 @@ install:
   - if [[ "$KERAS_BACKEND" != "cntk" ]]; then
       echo ' keras/backend/cntk_backend.py' >> .coveragerc;
     fi
+  - if [[ "$KERAS_BACKEND" != "mxnet" ]]; then
+      echo ' keras/backend/mxnet_backend.py' >> .coveragerc;
+    fi

   # detect whether core files are changed or not
   - export CORE_CHANGED=False;
LICENSE

Lines changed: 8 additions & 0 deletions

@@ -12,6 +12,14 @@ All contributions by Microsoft:
 Copyright (c) 2017 - 2018, Microsoft, Inc.
 All rights reserved.

+All contributions by Amazon:
+Copyright (c) 2017 Amazon.com, Inc. or its affiliates
+All rights reserved.
+
+All contributions by Amazon:
+Copyright (c) 2017 Amazon.com, Inc. or its affiliates
+All rights reserved.
+
 All other contributions:
 Copyright (c) 2015 - 2018, the respective contributors.
 All rights reserved.

README.md

Lines changed: 7 additions & 3 deletions

@@ -2,12 +2,15 @@

 ![Keras logo](https://s3.amazonaws.com/keras.io/img/keras-logo-2018-large-1200.png)

-[![Build Status](https://travis-ci.org/keras-team/keras.svg?branch=master)](https://travis-ci.org/keras-team/keras)
+| ubuntu/python-2.7 | ubuntu/python-3.5 |
+|---------|---------|
+| ![Python3 Build Status](https://codebuild.us-east-1.amazonaws.com/badges?uuid=eyJlbmNyeXB0ZWREYXRhIjoidHBzRFVlMG5SMGFQRTVzMUhxejNIK2dZRU1kb3p2c0JIbTVObDZtdDgxYThYdjRCZlg0RGF1eCsrSUtGQmgwYkFkZzJaT1BrdHpqcVJqcWE2aSt6QmRnPSIsIml2UGFyYW1ldGVyU3BlYyI6IklPMmRORld4TDYrdWNrWDciLCJtYXRlcmlhbFNldFNlcmlhbCI6MX0%3D&branch=master) | ![Python2 Build Status](https://codebuild.us-east-1.amazonaws.com/badges?uuid=eyJlbmNyeXB0ZWREYXRhIjoibHFOTlladW1VK050SFBST1N0UUtNOGdOV24vM25hVUJDQVVKNitvSFpXTFZ4RzlvUXppdHU4RytRR3hLdk1nSDd2VHlTSlZ5ZTlCUC9GdWdscHZRRFBNPSIsIml2UGFyYW1ldGVyU3BlYyI6IjZrQksycy9aWWV5QXh1MkoiLCJtYXRlcmlhbFNldFNlcmlhbCI6MX0%3D&branch=master) |
 [![license](https://img.shields.io/github/license/mashape/apistatus.svg?maxAge=2592000)](https://github.com/keras-team/keras/blob/master/LICENSE)

 ## You have just found Keras.

-Keras is a high-level neural networks API, written in Python and capable of running on top of [TensorFlow](https://github.com/tensorflow/tensorflow), [CNTK](https://github.com/Microsoft/cntk), or [Theano](https://github.com/Theano/Theano). It was developed with a focus on enabling fast experimentation. *Being able to go from idea to result with the least possible delay is key to doing good research.*
+Keras is a high-level neural networks API, written in Python and capable of running on top of [TensorFlow](https://github.com/tensorflow/tensorflow), [CNTK](https://github.com/Microsoft/cntk), [Apache MXNet](https://github.com/apache/incubator-mxnet/), or [Theano](https://github.com/Theano/Theano). It was developed with a focus on enabling fast experimentation. *Being able to go from idea to result with the least possible delay is key to doing good research.*

 Use Keras if you need a deep learning library that:

@@ -117,6 +120,7 @@ Before installing Keras, please install one of its backend engines: TensorFlow,
 - [TensorFlow installation instructions](https://www.tensorflow.org/install/).
 - [Theano installation instructions](http://deeplearning.net/software/theano/install.html#install).
 - [CNTK installation instructions](https://docs.microsoft.com/en-us/cognitive-toolkit/setup-cntk-on-your-machine).
+- [MXNet installation instructions](http://mxnet.incubator.apache.org/install/index.html).

 You may also consider installing the following **optional dependencies**:

@@ -155,7 +159,7 @@ sudo python setup.py install
 ------------------


-## Using a different backend than TensorFlow
+## Switching from TensorFlow to CNTK, MXNet or Theano

 By default, Keras will use TensorFlow as its tensor manipulation library. [Follow these instructions](https://keras.io/backend/) to configure the Keras backend.
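The backend switch described in the README change above relies on the standard Keras configuration. A minimal sketch of selecting the MXNet backend programmatically; the `KERAS_BACKEND` environment variable and `~/.keras/keras.json` are the standard Keras mechanisms, and `channels_first` is the data format this commit recommends for MXNet performance.

```python
# Minimal sketch: select the MXNet backend before importing Keras.
# Equivalent to setting "backend": "mxnet" in ~/.keras/keras.json.
import os
os.environ["KERAS_BACKEND"] = "mxnet"

from keras import backend as K

print(K.backend())            # expected: "mxnet"
# "image_data_format" can also be set in ~/.keras/keras.json;
# channels_first is the recommended format for MXNet performance.
print(K.image_data_format())
```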

benchmark/README.md

Lines changed: 131 additions & 0 deletions

@@ -0,0 +1,131 @@
# Keras Benchmarks

## Overview
The benchmark module aims to provide a performance comparison across Keras backends using various models and
datasets on CPU, single-GPU, and multi-GPU machines.
Currently supported backends: TensorFlow, Apache MXNet.

## Setup
To install the MXNet backend, refer to
[Installation](https://github.com/awslabs/keras-apache-mxnet/wiki/Installation#1-install-keras-with-apache-mxnet-backend).

To switch between backends, refer to
[configure Keras backend](https://github.com/awslabs/keras-apache-mxnet/wiki/Installation#2-configure-keras-backend).

## CNN Benchmarks
We provide benchmark scripts to run on the CIFAR-10, ImageNet, and Synthetic (randomly generated) datasets.

### CIFAR-10 Dataset
The [CIFAR-10](https://www.cs.toronto.edu/~kriz/cifar.html) dataset has 60000 32x32 color images in 10 classes.
The [training scripts](https://github.com/awslabs/keras-apache-mxnet/blob/master/benchmark/image-classification/benchmark_resnet.py)
will automatically download the dataset; you need to provide the dataset name, ResNet version
(1 or 2), number of layers (20, 56, or 110), and number of GPUs to use.

Example Usage:

`python benchmark_resnet.py --dataset cifar10 --version 1 --layers 56 --gpus 4`

### ImageNet Dataset
First, download the ImageNet dataset from [here](http://image-net.org/download); there are 1.4 million images in total
with 1000 classes, each class in its own subfolder. In this script, each image is processed to size 256x256.

Since the ImageNet dataset is too large to fit into memory, there are two training modes for such data:
[`train_on_batch`](https://keras.io/models/sequential/#train_on_batch) and
[`fit_generator`](https://keras.io/models/sequential/#fit_generator);
we recommend `train_on_batch` since it is more efficient on multiple GPUs.
(Refer to the [Keras documentation](https://keras.io/getting-started/faq/#how-can-i-use-keras-with-datasets-that-dont-fit-in-memory)
and Keras issues [#9502](https://github.com/keras-team/keras/issues/9502),
[#9204](https://github.com/keras-team/keras/issues/9204), and [#9647](https://github.com/keras-team/keras/issues/9647).)
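A minimal sketch of the two training modes; the tiny stand-in model, directory path, and batch size below are placeholders (the benchmark itself trains ResNet50V1 through `benchmark_resnet.py`).

```python
# Sketch of the two Keras training modes above for data that does not fit in memory.
from keras.models import Sequential
from keras.layers import Conv2D, GlobalAveragePooling2D, Dense
from keras.preprocessing.image import ImageDataGenerator

# Tiny stand-in model; shape assumes channels_last (use (3, 256, 256) with channels_first).
model = Sequential([
    Conv2D(16, (3, 3), activation='relu', input_shape=(256, 256, 3)),
    GlobalAveragePooling2D(),
    Dense(1000, activation='softmax'),
])
model.compile(loss='categorical_crossentropy', optimizer='sgd')

# One subfolder per class, as described above; the path is a placeholder.
datagen = ImageDataGenerator(rescale=1. / 255)
train_gen = datagen.flow_from_directory('/data/imagenet/train/',
                                        target_size=(256, 256), batch_size=32)

# Mode 1: fit_generator -- Keras drives the generator.
model.fit_generator(train_gen, steps_per_epoch=len(train_gen), epochs=1)

# Mode 2: train_on_batch -- manual loop over batches (recommended above for multi-GPU runs).
for _ in range(len(train_gen)):
    x_batch, y_batch = next(train_gen)
    model.train_on_batch(x_batch, y_batch)
```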
Compared to CIFAR-10, you need to provide two additional parameters: the training mode and the path to the ImageNet dataset.

Example usage:

`python benchmark_resnet.py --dataset imagenet --version 1 --layers 56 --gpus 4 --train_mode train_on_batch --data_path home/ubuntu/imagenet/train/`

### Synthetic Dataset
We used benchmark scripts from the
[TensorFlow Benchmark](https://github.com/tensorflow/benchmarks/tree/keras-benchmarks/scripts/keras_benchmarks)
official repo, modified slightly for our use case.

Directly run the shell script to launch the benchmark; provide one of the configurations in config.json and whether
you want to benchmark inference speed (True or False).

Example Usage:

`sh run_<backend-type>_backend.sh gpu_config False`

### CNN Benchmark Results
Here we list the MXNet backend training speed on CIFAR-10, ImageNet, and Synthetic Data using the
ResNet50V1 model, on CPU and on 1, 4, and 8 GPUs, using AWS instances.
Hardware specifications of the instances can be found [here](https://aws.amazon.com/ec2/instance-types/).

For more detailed benchmark results, please refer to [CNN results](https://github.com/awslabs/keras-apache-mxnet/tree/keras2_mxnet_backend/benchmark/benchmark_result/CNN_result.md).

|||
| ------ | ------ |
| Keras Version | 2.1.5 |
| MXNet Version | 1.1.0 |
| Data Format | Channel first |

| Instance | GPU used | Package | CIFAR-10 | ImageNet | Synthetic Data |
| ------ | ------ | ------ | ------ | ------ | ------ |
| C5.18xLarge | 0 | mxnet-mkl | 87 | N/A | 9 |
| P3.8xLarge | 1 | mxnet-cu90 | N/A | 165 | 229 |
| P3.8xLarge | 4 | mxnet-cu90 | 1792 | 538 | 728 |
| P3.16xLarge | 8 | mxnet-cu90 | 1618 | 728 | 963 |

![MXNet backend training speed](https://github.com/roywei/keras/blob/benchmark_result/benchmark/benchmark_result/mxnet_backend_training_speed.png)

Note: the X-axis is the number of GPUs used; the Y-axis is training speed (images/second).

## RNN Benchmarks

We provide benchmark scripts to run on the Synthetic (randomly generated), Nietzsche, and WikiText-2 character-level datasets.

Directly run the shell script to launch the benchmark; provide one of the configurations in config.json and whether you want to benchmark inference speed (True or False).

Example Usage:

`sh run_<backend-type>_backend.sh gpu_config False`

### Synthetic Dataset

We used benchmark scripts from the [TensorFlow Benchmark](https://github.com/tensorflow/benchmarks/tree/keras-benchmarks/scripts/keras_benchmarks) official repo, modified slightly for our use case.

### Nietzsche Dataset

We used the official Keras LSTM example script [lstm_text_generation.py](https://github.com/keras-team/keras/blob/master/examples/lstm_text_generation.py), modified slightly for our use case.

### WikiText-2 Dataset

We used the official WikiText-2 character-level dataset from this [link](https://einstein.ai/research/the-wikitext-long-term-dependency-language-modeling-dataset).

The `lstm_text_generation_wikitext2.py` script uses a copy of the dataset hosted on an S3 bucket at this [link](https://s3.amazonaws.com/research.metamind.io/wikitext/wikitext-2-raw-v1.zip) (the WikiText-2 raw character-level data).

### RNN Benchmark Results

Here we list the results on the Synthetic, Nietzsche, and WikiText-2 datasets using a Sequential model (LSTM) on Amazon AWS C5.xLarge (CPU) and P3.8xLarge (1 and 4 GPUs) instances with the MXNet backend. Batch size is 128. For more details about the instance configurations, please refer to [P3](https://aws.amazon.com/ec2/instance-types/p3/) and [C5](https://aws.amazon.com/ec2/instance-types/c5/).

| Instance | GPUs | Data Set | Speed/Epoch <br />(Lower is better) |
| ---------- | ---- | ---------- | ----------------------------------- |
| C5.xLarge | 0 | Synthetic | 91 sec - 2ms/step |
| P3.8xLarge | 1 | Synthetic | 13 sec - 264us/step |
| P3.8xLarge | 4 | Synthetic | 12 sec - 241us/step |
| C5.xLarge | 0 | Nietzsche | 352 sec - 2ms/step |
| P3.8xLarge | 1 | Nietzsche | 53 sec - 265us/step |
| P3.8xLarge | 4 | Nietzsche | 47 sec - 236us/step |
| C5.xLarge | 0 | WikiText-2 | 6410 sec - 2ms/step |
| P3.8xLarge | 1 | WikiText-2 | 882 sec - 264us/step |
| P3.8xLarge | 4 | WikiText-2 | 794 sec - 235us/step |

## Credits

Synthetic Data scripts modified from
[TensorFlow Benchmarks](https://github.com/tensorflow/benchmarks/tree/keras-benchmarks)

## Reference
[1] [TensorFlow Benchmarks](https://github.com/tensorflow/benchmarks/tree/keras-benchmarks)

benchmark/__init__.py

Whitespace-only changes.
benchmark/benchmark_result/CNN_result.md

Lines changed: 92 additions & 0 deletions

@@ -0,0 +1,92 @@
# Detailed CNN Benchmark Results
## CIFAR-10 Dataset
### Configuration
|||
|---|---|
| Data Set | [CIFAR-10](https://www.cs.toronto.edu/~kriz/cifar.html) |
| Keras Version | 2.1.5 |
| TensorFlow Version | 1.7.0 |
| MXNet Version | 1.1.0 |
| Training Method | [`fit`](https://keras.io/models/model/#fit) |
| Training Scripts | [Simple CNN Script](https://github.com/awslabs/keras-apache-mxnet/blob/master/examples/CIFAR-10_cnn.py), [ResNet Script](https://github.com/awslabs/keras-apache-mxnet/blob/master/benchmark/image-classification/benchmark_resnet.py) |

### Results

| Instance Type | GPU used | Model | Backend | Package | Batch Size | Data Format | Speed (images/s) |
| ------ | ------ | ------ | ------ | ------ | ------ | ------ | ------ |
| C5.xLarge | 0 | Simple CNN | MXNet | mxnet-mkl | 32 | channel last | 253 |
| C5.xLarge | 0 | Simple CNN | MXNet | mxnet-mkl | 32 | channel first | 223 |
| C5.xLarge | 0 | Simple CNN | TensorFlow | tensorflow | 32 | channel last | 309 |
| C5.xLarge | 0 | Simple CNN | TensorFlow | tensorflow | 32 | channel first | 101 |
| C5.18xLarge | 0 | Simple CNN | MXNet | mxnet-mkl | 32 | channel last | 845 |
| C5.18xLarge | 0 | Simple CNN | MXNet | mxnet-mkl | 32 | channel first | 936 |
| C5.18xLarge | 0 | ResNet50V1 | TensorFlow | tensorflow | 32 | channel last | 59 |
| C5.18xLarge | 0 | ResNet50V1 | TensorFlow | tensorflow | 32 | channel first | 41 |
| C5.18xLarge | 0 | ResNet50V1 | MXNet | mxnet-mkl | 32 | channel last | 48 |
| C5.18xLarge | 0 | ResNet50V1 | MXNet | mxnet-mkl | 32 | channel first | 87 |
| P3.8xLarge | 4 | ResNet50V1 | TensorFlow | tensorflow-gpu | 128 | channel last | 1020 |
| P3.8xLarge | 4 | ResNet50V1 | MXNet | mxnet-cu90 | 128 | channel first | 1792 |
| P3.8xLarge | 8 | ResNet50V1 | TensorFlow | tensorflow-gpu | 256 | channel last | 962 |
| P3.16xLarge | 8 | ResNet50V1 | MXNet | mxnet-cu90 | 256 | channel first | 1618 |

## ImageNet Dataset

### Configuration
|||
|---|---|
| Data Set | [ImageNet](http://image-net.org) |
| Model | ResNet50V1 |
| Keras Version | 2.1.3 |
| TensorFlow Version | 1.6.0rc1 |
| MXNet Version | 1.1.0 |
| Training Method | [`train_on_batch`](https://keras.io/models/sequential/#train_on_batch), [`fit_generator`](https://keras.io/models/sequential/#fit_generator) |
| Training Scripts | [ResNet Script](https://github.com/awslabs/keras-apache-mxnet/blob/master/benchmark/image-classification/benchmark_resnet.py) |

### Results

| Instance | GPU used | Backend | Package | Method | Batch Size | Data Format | Speed (images/s) |
| ------ | ------ | ------ | ------ | ------ | ------ | ------ | ------ |
| P3.8xLarge | 1 | TensorFlow | tensorflow-gpu | `train_on_batch` | 32 | channel last | 50 |
| P3.8xLarge | 1 | MXNet | mxnet-cu90 | `train_on_batch` | 32 | channel first | 165 |
| P3.8xLarge | 4 | TensorFlow | tensorflow-gpu | `train_on_batch` | 128 | channel last | 162 |
| P3.8xLarge | 4 | MXNet | mxnet-cu90 | `train_on_batch` | 128 | channel first | 538 |
| P3.16xLarge | 8 | TensorFlow | tensorflow-gpu | `train_on_batch` | 256 | channel last | 212 |
| P3.16xLarge | 8 | MXNet | mxnet-cu90 | `train_on_batch` | 256 | channel first | 728 |
| P3.8xLarge | 1 | TensorFlow | tensorflow-gpu | `fit_generator` | 32 | channel last | 53 |
| P3.8xLarge | 1 | MXNet | mxnet-cu90 | `fit_generator` | 32 | channel first | 73 |
| P3.8xLarge | 4 | TensorFlow | tensorflow-gpu | `fit_generator` | 128 | channel last | 173 |
| P3.8xLarge | 4 | MXNet | mxnet-cu90 | `fit_generator` | 128 | channel first | 197 |

## Synthetic Dataset

### Configuration
|||
|---|---|
| Data Set | Random 256x256 color images, 1000 classes |
| Model | ResNet50V1 |
| Keras Version | 2.1.3 |
| TensorFlow Version | 1.6.0rc1 |
| MXNet Version | 1.1.0 |
| Training Method | [`fit`](https://keras.io/models/model/#fit) |
| Training Scripts | [ResNet Script](https://github.com/awslabs/keras-apache-mxnet/tree/keras2_mxnet_backend/benchmark/synthetic) |

### Results

| Instance | GPU used | Backend | Package | Batch Size | Data Format | Speed (images/s) |
| ------ | ------ | ------ | ------ | ------ | ------ | ------ |
| C5.18xLarge | 0 | TensorFlow | tensorflow | 32 | channel first | 4 |
| C5.18xLarge | 0 | MXNet | mxnet-mkl | 32 | channel first | 9 |
| P3.8xLarge | 1 | TensorFlow | tensorflow-gpu | 32 | channel first | 198 |
| P3.8xLarge | 1 | MXNet | mxnet-cu90 | 32 | channel first | 229 |
| P3.8xLarge | 4 | TensorFlow | tensorflow-gpu | 128 | channel first | 448 |
| P3.8xLarge | 4 | MXNet | mxnet-cu90 | 128 | channel first | 728 |
| P3.16xLarge | 8 | TensorFlow | tensorflow-gpu | 256 | channel first | 346 |
| P3.16xLarge | 8 | MXNet | mxnet-cu90 | 256 | channel first | 963 |
| C5.18xLarge | 0 | TensorFlow | tensorflow | 32 | channel last | 4 |
| C5.18xLarge | 0 | MXNet | mxnet-mkl | 32 | channel last | 3 |
| P3.8xLarge | 1 | TensorFlow | tensorflow-gpu | 32 | channel last | 164 |
| P3.8xLarge | 1 | MXNet | mxnet-cu90 | 32 | channel last | 18 |
| P3.8xLarge | 4 | TensorFlow | tensorflow-gpu | 128 | channel last | 409 |
| P3.8xLarge | 4 | MXNet | mxnet-cu90 | 128 | channel last | 73 |
| P3.16xLarge | 8 | TensorFlow | tensorflow-gpu | 256 | channel last | 164 |
| P3.16xLarge | 8 | MXNet | mxnet-cu90 | 256 | channel last | 18 |
Binary image file (9.61 KB) — content not rendered.

benchmark/scripts/__init__.py

Whitespace-only changes.
