Commit 622aa88

graham63, benson31, and bvanessen authored
Added documentation for operator layers (#2211)
* Added documentation for operator layers
* Update docs/operators.rst (Co-authored-by: Tom Benson <[email protected]>)
* Update docs/operators.rst (Co-authored-by: Tom Benson <[email protected]>)
* Update docs/operators.rst (Co-authored-by: Tom Benson <[email protected]>)
* Cleaned up layer documentation, added Python layer names in code tags
* fixed typo in callbacks example
* Documented full list of operators
* Added python front end names to operator table. Improved some descriptions/math examples
* Fixed code references in operator_layer.rst
* Cleaned up FIXMEs
* Update docs/layers/loss_layers.rst

---------

Co-authored-by: Tom Benson <[email protected]>
Co-authored-by: Brian C. Van Essen <[email protected]>
1 parent c74d1cf commit 622aa88

16 files changed: +1553 −273 lines

docs/callbacks.rst

Lines changed: 1 addition & 1 deletion
@@ -64,7 +64,7 @@ Profobuf (Advanced)
    callback {
      timer {
      }
-     print_atatistics {
+     print_statistics {
        batch_interval: 5
      }
      save_model {

docs/conf.py

Lines changed: 1 addition & 1 deletion
@@ -63,7 +63,7 @@
 #
 # This is also used if you do content translation via gettext catalogs.
 # Usually you set "language" from the command line for these cases.
-language = None
+language = 'en'

 # List of patterns, relative to source directory, that match files and
 # directories to ignore when looking for source files.
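Likely motivation (an assumption, not stated in the commit): Sphinx 5 and later warn when ``language`` is ``None`` and fall back to ``'en'``, so setting the value explicitly silences the build warning without changing behavior.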

docs/index.rst

Lines changed: 6 additions & 0 deletions
@@ -48,6 +48,12 @@ Users are advised to view `the Doxygen API Documentation

    layers

+.. toctree::
+   :maxdepth: 2
+   :caption: LBANN Operators
+
+   operators
+
 .. toctree::
    :maxdepth: 1
    :caption: Data Ingestion

docs/layers.rst

Lines changed: 1 addition & 1 deletion
@@ -200,7 +200,7 @@ LBANN Layers List
    :maxdepth: 2

    I/O Layers <layers/io_layers>
-   Operator Layers <layers/operator_layers>
+   Operator Layer <layers/operator_layer>
    Transform Layers <layers/transform_layers>
    Learning Layers <layers/learning_layers>
    Loss Layers <layers/loss_layers>

docs/layers/activation_layers.rst

Lines changed: 14 additions & 4 deletions
@@ -28,7 +28,9 @@ ________________________________________
 Elu
 ----------------------------------------

-Exponential linear unit
+The :python:`Elu` layer is similar to :python:`Relu`, but it can
+produce negative outputs, which shift the mean of the activations
+toward 0.

 .. math::

@@ -59,7 +61,7 @@ ________________________________________
 Identity
 ----------------------------------------

-Output the input tensor
+The :python:`Identity` layer outputs the input tensor.

 This layer is very cheap since it just involves setting up tensor
 views.
@@ -77,6 +79,10 @@ ________________________________________
 LeakyRelu
 ----------------------------------------

+:python:`LeakyRelu` modifies the :python:`Relu` function to allow for
+a small, non-zero gradient when the unit is saturated and not
+active.
+
 .. math::

   \text{LeakyReLU}(x; \alpha) =
@@ -106,7 +112,7 @@ ________________________________________
 LogSoftmax
 ----------------------------------------

-Logarithm of softmax function
+:python:`LogSoftmax` is the logarithm of the softmax function.

 .. math::

@@ -125,7 +131,8 @@ ________________________________________
 Relu
 ----------------------------------------

-Rectified linear unit
+The :python:`Relu` layer outputs the input directly if it is positive
+and otherwise outputs zero.

 .. math::

@@ -144,6 +151,9 @@ ________________________________________
 Softmax
 ----------------------------------------

+The :python:`Softmax` layer turns a vector of K real values into a
+vector of K real values that sum to 1.
+
 .. math::

    \text{softmax}(x)_i = \frac{e^{x_i}}{\sum_j e^{x_j}}
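As a quick cross-check of the formulas in this file, here is a minimal NumPy sketch of the documented activation math. It is illustrative only, not LBANN's implementation, and the ``alpha`` defaults are assumptions:

.. code-block:: python

   import numpy as np

   def elu(x, alpha=1.0):
       # Negative inputs map to alpha * (exp(x) - 1), which pulls the
       # mean activation toward zero.
       return np.where(x > 0, x, alpha * (np.exp(x) - 1.0))

   def leaky_relu(x, alpha=0.01):
       # A small slope alpha keeps a non-zero gradient for negative inputs.
       return np.where(x > 0, x, alpha * x)

   def relu(x):
       # Pass positive inputs through unchanged; clamp the rest to zero.
       return np.maximum(x, 0.0)

   def softmax(x):
       # Subtract the max for numerical stability; the output sums to 1.
       e = np.exp(x - np.max(x))
       return e / e.sum()

   def log_softmax(x):
       # Equivalent to log(softmax(x)), computed without underflow.
       z = x - np.max(x)
       return z - np.log(np.exp(z).sum())

   x = np.array([-2.0, -0.5, 0.0, 1.5])
   assert np.isclose(softmax(x).sum(), 1.0)
   assert np.allclose(log_softmax(x), np.log(softmax(x)))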

docs/layers/image_layers.rst

Lines changed: 7 additions & 5 deletions
@@ -26,7 +26,8 @@ ________________________________________
 BilinearResize
 ----------------------------------------

-Resize image with bilinear interpolation
+The :python:`BilinearResize` layer resizes an image with bilinear
+interpolation.

 Expects a 3D input tensor, which is interpreted as an image in CHW
 format. Gradients are not propagated during backprop.
@@ -48,11 +49,11 @@ ________________________________________
 CompositeImageTransformation
 ----------------------------------------

-Rotate a image clockwise around its center, then shear , then
-translate
+The :python:`CompositeImageTransformation` layer rotates an image
+clockwise around its center, then shears it, then translates it.

 Expects 4 inputs: a 3D image tensor in CHW format, a scalar rotation
-angle, a tensor for (X,Y) shear factor, a tensor for (X,Y) translate.
+angle, a tensor for (X,Y) shear factor, a tensor for (X,Y) translate.

 Arguments: None
@@ -67,7 +68,8 @@ ________________________________________
 Rotation
 ----------------------------------------

-Rotate a image clockwise around its center
+The :python:`Rotation` layer rotates an image clockwise around its
+center.

 Expects two inputs: a 3D image tensor in CHW format and a scalar
 rotation angle.
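To illustrate the interpolation scheme, here is a minimal NumPy sketch of bilinear resizing on a CHW tensor. The corner-aligned sampling grid is an assumption for the sketch; LBANN's kernel may use a different sampling convention:

.. code-block:: python

   import numpy as np

   def bilinear_resize(img, out_h, out_w):
       # img is a CHW tensor; sample each output pixel from the four
       # nearest input pixels, weighted by fractional offsets.
       c, in_h, in_w = img.shape
       ys = np.linspace(0, in_h - 1, out_h)
       xs = np.linspace(0, in_w - 1, out_w)
       y0 = np.floor(ys).astype(int); y1 = np.minimum(y0 + 1, in_h - 1)
       x0 = np.floor(xs).astype(int); x1 = np.minimum(x0 + 1, in_w - 1)
       wy = (ys - y0)[None, :, None]  # fractional row offsets
       wx = (xs - x0)[None, None, :]  # fractional column offsets
       top = img[:, y0][:, :, x0] * (1 - wx) + img[:, y0][:, :, x1] * wx
       bot = img[:, y1][:, :, x0] * (1 - wx) + img[:, y1][:, :, x1] * wx
       return top * (1 - wy) + bot * wy

   img = np.arange(2 * 4 * 4, dtype=float).reshape(2, 4, 4)
   print(bilinear_resize(img, 8, 8).shape)  # (2, 8, 8)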

docs/layers/io_layers.rst

Lines changed: 1 addition & 1 deletion
@@ -22,7 +22,7 @@ ________________________________________
 Input
 ---------------------------

-The input layer is a data tensor from data reader
+The :python:`Input` layer is a data tensor from the data reader.

 Arguments:

docs/layers/learning_layers.rst

Lines changed: 12 additions & 11 deletions
@@ -31,8 +31,8 @@ ________________________________________
 ChannelwiseFullyConnected
 ----------------------------------------

-The ChannelwiseFullyConnected layer applies an affine transformation
-to tensor channels.
+The :python:`ChannelwiseFullyConnected` layer applies an affine
+transformation to tensor channels.

 The input tensor is sliced along the first tensor dimension (the
 "channel" dimension for image data in CHW format) and the same affine
@@ -78,9 +78,8 @@ ________________________________________
 ChannelwiseScaleBias
 ----------------------------------------

-The ChannelwiseScaleBias layer applies per-channel scale and bias.
-
-The input tensor is sliced along the first tensor dimension (the
+The :python:`ChannelwiseScaleBias` layer applies per-channel scale and
+bias. The input tensor is sliced along the first tensor dimension (the
 "channel" dimension, assuming image data in CHW format) and scale and
 bias terms are applied independently to each slice. More precisely,
 given input and output tensors
@@ -110,7 +109,7 @@ ________________________________________
 Convolution
 ----------------------------------------

-The Convolution layer applies convolution (more precisely,
+The :python:`Convolution` layer applies convolution (more precisely,
 cross-correlation) to the input tensor. This is primarily optimized
 for image data in CHW format.
@@ -189,7 +188,8 @@ ________________________________________
 Deconvolution
 ----------------------------------------

-This operation is the transpose of standard deep learning convolution.
+The :python:`Deconvolution` layer is the transpose of standard deep
+learning convolution.

 Pedantic comments: this operation is commonly called "deconvolution"
 in the deep learning community, but it is not a true deconvolution.
@@ -236,7 +236,7 @@ ________________________________________
 Embedding
 ----------------------------------------

-The Embedding layer is a lookup table to embedding vectors.
+The :python:`Embedding` layer is a lookup table to embedding vectors.

 Takes a scalar input, interprets it as an index, and outputs the
 corresponding vector. The number of embedding vectors and the size of
@@ -268,7 +268,8 @@ ________________________________________
 EntrywiseScaleBias
 ----------------------------------------

-The EntrywiseScaleBias layer applies entry-wise scale and bias.
+The :python:`EntrywiseScaleBias` layer applies entry-wise scale and
+bias.

 Scale and bias terms are applied independently to each tensor
 entry. More precisely, given input, output, scale, and bias tensors
@@ -297,7 +298,7 @@ ________________________________________
 FullyConnected
 ----------------------------------------

-Affine transformation
+The :python:`FullyConnected` layer is an affine transformation.

 Flattens the input tensor, multiplies with a weights matrix, and
 optionally applies an entry-wise bias. Following a row-vector
@@ -337,7 +338,7 @@ ________________________________________
 GRU
 ----------------------------------------

-Stacked gated recurrent unit
+The :python:`GRU` layer is a stacked gated recurrent unit.

 Expects two inputs: a 2D input sequence (
 :math:`\text{sequence_length}\times\text{input_size}`) and a 2D
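The :python:`FullyConnected` description above reduces to a one-line affine map. A minimal NumPy sketch follows; it uses the column-vector form ``y = W x + b`` (the docs' exact row-vector convention may transpose this), and the shapes are illustrative:

.. code-block:: python

   import numpy as np

   def fully_connected(x, weights, bias=None):
       # Flatten the input tensor, then apply the affine map y = W x + b.
       x = x.reshape(-1)
       y = weights @ x
       if bias is not None:
           y = y + bias
       return y

   rng = np.random.default_rng(0)
   x = rng.standard_normal((3, 4, 4))  # any input shape; it is flattened
   W = rng.standard_normal((10, 48))   # 10 output neurons, 3*4*4 = 48 inputs
   b = rng.standard_normal(10)
   print(fully_connected(x, W, b).shape)  # (10,)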

docs/layers/loss_layers.rst

Lines changed: 15 additions & 11 deletions
@@ -29,7 +29,7 @@ ________________________________________
 CategoricalAccuracy
 ----------------------------------------

-The Categorical Accuracy Layer is a 0-1 loss function.
+The :python:`CategoricalAccuracy` layer is a 0-1 loss function.

 Requires two inputs, which are respectively interpreted as prediction
 scores and as a one-hot label vector. The output is one if the top
@@ -52,7 +52,8 @@ ________________________________________
 CrossEntropy
 ----------------------------------------

-Cross entropy between probability vectors.
+The :python:`CrossEntropy` layer measures the cross entropy between
+two probability vectors.

 Given a predicted distribution :math:`y` and ground truth distribution
 :math:`\hat{y}`,
@@ -76,7 +77,7 @@ ________________________________________
 L1Norm
 ----------------------------------------

-L1 vector norm
+The :python:`L1Norm` layer is the L1 norm of a vector.

 .. math::

@@ -95,7 +96,7 @@ ________________________________________
 L2Norm2
 ----------------------------------------

-Square of L2 vector norm
+The :python:`L2Norm2` layer is the square of the L2 vector norm.

 .. math::

@@ -114,7 +115,8 @@ ________________________________________
 MeanAbsoluteError
 ----------------------------------------

-Given a prediction :math:`y` and ground truth :math:`\hat{y}`,
+The :python:`MeanAbsoluteError` layer, given a prediction :math:`y`
+and ground truth :math:`\hat{y}`, computes:

 .. math::

@@ -134,7 +136,8 @@ ________________________________________
 MeanSquaredError
 ----------------------------------------

-Given a prediction :math:`y` and ground truth :math:`\hat{y}`,
+The :python:`MeanSquaredError` layer, given a prediction :math:`y`
+and ground truth :math:`\hat{y}`, computes:

 .. math::

@@ -154,11 +157,12 @@ ________________________________________
 TopKCategoricalAccuracy
 ----------------------------------------

-Requires two inputs, which are respectively interpreted as prediction
-scores and as a one-hot label vector. The output is one if the
-corresponding label matches one of the top-k prediction scores and is
-otherwise zero. Ties in the top-k prediction scores are broken in
-favor of entries with smaller indices.
+The :python:`TopKCategoricalAccuracy` layer requires two inputs, which
+are respectively interpreted as prediction scores and as a one-hot
+label vector. The output is one if the corresponding label matches one
+of the top-k prediction scores and is otherwise zero. Ties in the
+top-k prediction scores are broken in favor of entries with smaller
+indices.

 Arguments:
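The loss definitions in this file are short enough to restate as a NumPy reference sketch. This is illustrative only, not LBANN's implementation, and it assumes dense probability/label vectors:

.. code-block:: python

   import numpy as np

   def cross_entropy(y, y_hat):
       # CE(y, y_hat) = -sum_i y_hat_i * log(y_i)
       return -np.sum(y_hat * np.log(y))

   def l1_norm(x):
       return np.sum(np.abs(x))

   def l2_norm2(x):
       # Square of the L2 norm, i.e. sum of squared entries.
       return np.sum(x * x)

   def mean_absolute_error(y, y_hat):
       return np.mean(np.abs(y - y_hat))

   def mean_squared_error(y, y_hat):
       return np.mean((y - y_hat) ** 2)

   def top_k_categorical_accuracy(scores, one_hot_label, k):
       # Stable sort so ties favor smaller indices, then keep the k best.
       order = np.argsort(-scores, kind="stable")
       label = int(np.argmax(one_hot_label))
       return 1.0 if label in order[:k] else 0.0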

docs/layers/math_layers.rst

Lines changed: 9 additions & 9 deletions
@@ -24,14 +24,14 @@ ________________________________________
 DFTAbs
 ----------------------------------------

-Absolute value of discrete Fourier transform. One-, two-, or
-three-dimensional data is allowed. The implementation is meant to be
-as flexible as possible. We use FFTW for the CPU implementation;
-whichever types your implementation of FFTW supports will be supported
-in this layer at runtime. The GPU implementation uses cuFFT on NVIDIA
-GPUs and will support float and double at runtime (assuming CUDA
-support is enabled). A future implementation will support rocFFT for
-AMD GPUs.
+The :python:`DFTAbs` layer computes the absolute value of the discrete
+Fourier transform. One-, two-, or three-dimensional data is
+allowed. The implementation is meant to be as flexible as possible. We
+use FFTW for the CPU implementation; whichever types your
+implementation of FFTW supports will be supported in this layer at
+runtime. The GPU implementation uses cuFFT on NVIDIA GPUs and will
+support float and double at runtime (assuming CUDA support is
+enabled). A future implementation will support rocFFT for AMD GPUs.

 Currently, LBANN only supports outputting the same type that is used
 as input. As such, in forward propagation, this will do a DFT and then
@@ -52,7 +52,7 @@ ________________________________________
 MatMul
 ----------------------------------------

-The MatMul layer performs Matrix multiplication.
+The :python:`MatMul` layer performs matrix multiplication.

 Performs matrix product of two 2D input tensors. If the input tensors
 are 3D, then matrix products are computed independently over the first
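A rough NumPy analogue of the two forward computations described above (LBANN itself dispatches to FFTW or cuFFT rather than NumPy's FFT; this sketch only mirrors the math):

.. code-block:: python

   import numpy as np

   def dft_abs(x):
       # Absolute value of the N-dimensional discrete Fourier transform.
       return np.abs(np.fft.fftn(x))

   def matmul(a, b):
       # 2D inputs: ordinary matrix product. 3D inputs: independent
       # products over the leading (batch) dimension, as np.matmul does.
       return np.matmul(a, b)

   x = np.random.default_rng(0).standard_normal((4, 4))
   print(dft_abs(x).shape)    # (4, 4)
   a = np.ones((5, 2, 3)); b = np.ones((5, 3, 4))
   print(matmul(a, b).shape)  # (5, 2, 4)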
