Commit 622aa88

graham63, benson31, and bvanessen authored
Added documentation for operator layers (#2211)
* Added documentation for operator layers
* Update docs/operators.rst (Co-authored-by: Tom Benson <[email protected]>)
* Update docs/operators.rst (Co-authored-by: Tom Benson <[email protected]>)
* Update docs/operators.rst (Co-authored-by: Tom Benson <[email protected]>)
* Cleaned up layer documentation, added Python layer names in code tags
* fixed typo in callbacks example
* Documented full list of operators
* Added python front end names to operator table. Improved some descriptions/math examples
* Fixed code references in operator_layer.rst
* Cleaned up FIXMEs
* Update docs/layers/loss_layers.rst

---------

Co-authored-by: Tom Benson <[email protected]>
Co-authored-by: Brian C. Van Essen <[email protected]>
1 parent c74d1cf commit 622aa88

16 files changed: +1553 −273 lines

docs/callbacks.rst

Lines changed: 1 addition & 1 deletion
@@ -64,7 +64,7 @@ Profobuf (Advanced)
    callback {
      timer {
      }
-     print_atatistics {
+     print_statistics {
        batch_interval: 5
      }
      save_model {

docs/conf.py

Lines changed: 1 addition & 1 deletion
@@ -63,7 +63,7 @@
 #
 # This is also used if you do content translation via gettext catalogs.
 # Usually you set "language" from the command line for these cases.
-language = None
+language = 'en'

 # List of patterns, relative to source directory, that match files and
 # directories to ignore when looking for source files.
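Likely motivation (an assumption, not stated in the commit): Sphinx 5 and later warn when ``language`` is ``None`` and fall back to ``'en'``, so setting the value explicitly silences the build warning without changing behavior.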

docs/index.rst

Lines changed: 6 additions & 0 deletions
@@ -48,6 +48,12 @@ Users are advised to view `the Doxygen API Documentation

    layers

+.. toctree::
+   :maxdepth: 2
+   :caption: LBANN Operators
+
+   operators
+
 .. toctree::
    :maxdepth: 1
    :caption: Data Ingestion

docs/layers.rst

Lines changed: 1 addition & 1 deletion
@@ -200,7 +200,7 @@ LBANN Layers List
    :maxdepth: 2

    I/O Layers <layers/io_layers>
-   Operator Layers <layers/operator_layers>
+   Operator Layer <layers/operator_layer>
    Transform Layers <layers/transform_layers>
    Learning Layers <layers/learning_layers>
    Loss Layers <layers/loss_layers>

docs/layers/activation_layers.rst

Lines changed: 14 additions & 4 deletions
@@ -28,7 +28,9 @@ ________________________________________
 Elu
 ----------------------------------------

-Exponential linear unit
+The :python:`Elu` layer is similar to :python:`Relu`, but it can
+produce negative outputs, which shift the mean of the activations
+toward 0.

 .. math::

@@ -59,7 +61,7 @@ ________________________________________
 Identity
 ----------------------------------------

-Output the input tensor
+The :python:`Identity` layer outputs the input tensor.

 This layer is very cheap since it just involves setting up tensor
 views.
@@ -77,6 +79,10 @@ ________________________________________
 LeakyRelu
 ----------------------------------------

+:python:`LeakyRelu` modifies the :python:`Relu` function to allow for
+a small, non-zero gradient when the unit is saturated and not
+active.
+
 .. math::

   \text{LeakyReLU}(x; \alpha) =
@@ -106,7 +112,7 @@ ________________________________________
 LogSoftmax
 ----------------------------------------

-Logarithm of softmax function
+:python:`LogSoftmax` is the logarithm of the softmax function.

 .. math::

@@ -125,7 +131,8 @@ ________________________________________
 Relu
 ----------------------------------------

-Rectified linear unit
+The :python:`Relu` layer outputs the input directly if it is positive
+and otherwise outputs zero.

 .. math::

@@ -144,6 +151,9 @@ ________________________________________
 Softmax
 ----------------------------------------

+The :python:`Softmax` layer turns a vector of K real values into a
+vector of K real values that sum to 1.
+
 .. math::

    \text{softmax}(x)_i = \frac{e^{x_i}}{\sum_j e^{x_j}}
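As a quick cross-check of the formulas in this file, here is a minimal NumPy sketch of the documented activation math. It is illustrative only, not LBANN's implementation, and the ``alpha`` defaults are assumptions:

.. code-block:: python

   import numpy as np

   def elu(x, alpha=1.0):
       # Negative inputs map to alpha * (exp(x) - 1), which pulls the
       # mean activation toward zero.
       return np.where(x > 0, x, alpha * (np.exp(x) - 1.0))

   def leaky_relu(x, alpha=0.01):
       # A small slope alpha keeps a non-zero gradient for negative inputs.
       return np.where(x > 0, x, alpha * x)

   def relu(x):
       # Pass positive inputs through unchanged; clamp the rest to zero.
       return np.maximum(x, 0.0)

   def softmax(x):
       # Subtract the max for numerical stability; the output sums to 1.
       e = np.exp(x - np.max(x))
       return e / e.sum()

   def log_softmax(x):
       # Equivalent to log(softmax(x)), computed without underflow.
       z = x - np.max(x)
       return z - np.log(np.exp(z).sum())

   x = np.array([-2.0, -0.5, 0.0, 1.5])
   assert np.isclose(softmax(x).sum(), 1.0)
   assert np.allclose(log_softmax(x), np.log(softmax(x)))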

docs/layers/image_layers.rst

Lines changed: 7 additions & 5 deletions
@@ -26,7 +26,8 @@ ________________________________________
 BilinearResize
 ----------------------------------------

-Resize image with bilinear interpolation
+The :python:`BilinearResize` layer resizes an image with bilinear
+interpolation.

 Expects a 3D input tensor, which is interpreted as an image in CHW
 format. Gradients are not propagated during backprop.
@@ -48,11 +49,11 @@ ________________________________________
 CompositeImageTransformation
 ----------------------------------------

-Rotate a image clockwise around its center, then shear , then
-translate
+The :python:`CompositeImageTransformation` layer rotates an image
+clockwise around its center, then shears it, then translates it.

 Expects 4 inputs: a 3D image tensor in CHW format, a scalar rotation
-angle, a tensor for (X,Y) shear factor, a tensor for (X,Y) translate.
+angle, a tensor for (X,Y) shear factor, a tensor for (X,Y) translate.

 Arguments: None
@@ -67,7 +68,8 @@ ________________________________________
 Rotation
 ----------------------------------------

-Rotate a image clockwise around its center
+The :python:`Rotation` layer rotates an image clockwise around its
+center.

 Expects two inputs: a 3D image tensor in CHW format and a scalar
 rotation angle.
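To illustrate the interpolation scheme, here is a minimal NumPy sketch of bilinear resizing on a CHW tensor. The corner-aligned sampling grid is an assumption for the sketch; LBANN's kernel may use a different sampling convention:

.. code-block:: python

   import numpy as np

   def bilinear_resize(img, out_h, out_w):
       # img is a CHW tensor; sample each output pixel from the four
       # nearest input pixels, weighted by fractional offsets.
       c, in_h, in_w = img.shape
       ys = np.linspace(0, in_h - 1, out_h)
       xs = np.linspace(0, in_w - 1, out_w)
       y0 = np.floor(ys).astype(int); y1 = np.minimum(y0 + 1, in_h - 1)
       x0 = np.floor(xs).astype(int); x1 = np.minimum(x0 + 1, in_w - 1)
       wy = (ys - y0)[None, :, None]  # fractional row offsets
       wx = (xs - x0)[None, None, :]  # fractional column offsets
       top = img[:, y0][:, :, x0] * (1 - wx) + img[:, y0][:, :, x1] * wx
       bot = img[:, y1][:, :, x0] * (1 - wx) + img[:, y1][:, :, x1] * wx
       return top * (1 - wy) + bot * wy

   img = np.arange(2 * 4 * 4, dtype=float).reshape(2, 4, 4)
   print(bilinear_resize(img, 8, 8).shape)  # (2, 8, 8)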

docs/layers/io_layers.rst

Lines changed: 1 addition & 1 deletion
@@ -22,7 +22,7 @@ ________________________________________
 Input
 ---------------------------

-The input layer is a data tensor from data reader
+The :python:`Input` layer is a data tensor from the data reader.

 Arguments:

docs/layers/learning_layers.rst

Lines changed: 12 additions & 11 deletions
@@ -31,8 +31,8 @@ ________________________________________
 ChannelwiseFullyConnected
 ----------------------------------------

-The ChannelwiseFullyConnected layer applies an affine transformation
-to tensor channels.
+The :python:`ChannelwiseFullyConnected` layer applies an affine
+transformation to tensor channels.

 The input tensor is sliced along the first tensor dimension (the
 "channel" dimension for image data in CHW format) and the same affine
@@ -78,9 +78,8 @@ ________________________________________
 ChannelwiseScaleBias
 ----------------------------------------

-The ChannelwiseScaleBias layer applies per-channel scale and bias.
-
-The input tensor is sliced along the first tensor dimension (the
+The :python:`ChannelwiseScaleBias` layer applies per-channel scale and
+bias. The input tensor is sliced along the first tensor dimension (the
 "channel" dimension, assuming image data in CHW format) and scale and
 bias terms are applied independently to each slice. More precisely,
 given input and output tensors
@@ -110,7 +109,7 @@ ________________________________________
 Convolution
 ----------------------------------------

-The Convolution layer applies convolution (more precisely,
+The :python:`Convolution` layer applies convolution (more precisely,
 cross-correlation) to the input tensor. This is primarily optimized
 for image data in CHW format.
@@ -189,7 +188,8 @@ ________________________________________
 Deconvolution
 ----------------------------------------

-This operation is the transpose of standard deep learning convolution.
+The :python:`Deconvolution` layer is the transpose of standard deep
+learning convolution.

 Pedantic comments: this operation is commonly called "deconvolution"
 in the deep learning community, but it is not a true deconvolution.
@@ -236,7 +236,7 @@ ________________________________________
 Embedding
 ----------------------------------------

-The Embedding layer is a lookup table to embedding vectors.
+The :python:`Embedding` layer is a lookup table to embedding vectors.

 Takes a scalar input, interprets it as an index, and outputs the
 corresponding vector. The number of embedding vectors and the size of
@@ -268,7 +268,8 @@ ________________________________________
 EntrywiseScaleBias
 ----------------------------------------

-The EntrywiseScaleBias layer applies entry-wise scale and bias.
+The :python:`EntrywiseScaleBias` layer applies entry-wise scale and
+bias.

 Scale and bias terms are applied independently to each tensor
 entry. More precisely, given input, output, scale, and bias tensors
@@ -297,7 +298,7 @@ ________________________________________
 FullyConnected
 ----------------------------------------

-Affine transformation
+The :python:`FullyConnected` layer is an affine transformation.

 Flattens the input tensor, multiplies with a weights matrix, and
 optionally applies an entry-wise bias. Following a row-vector
@@ -337,7 +338,7 @@ ________________________________________
 GRU
 ----------------------------------------

-Stacked gated recurrent unit
+The :python:`GRU` layer is a stacked gated recurrent unit.

 Expects two inputs: a 2D input sequence (
 :math:`\text{sequence_length}\times\text{input_size}`) and a 2D
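The :python:`FullyConnected` description above reduces to a one-line affine map. A minimal NumPy sketch follows; it uses the column-vector form ``y = W x + b`` (the docs' exact row-vector convention may transpose this), and the shapes are illustrative:

.. code-block:: python

   import numpy as np

   def fully_connected(x, weights, bias=None):
       # Flatten the input tensor, then apply the affine map y = W x + b.
       x = x.reshape(-1)
       y = weights @ x
       if bias is not None:
           y = y + bias
       return y

   rng = np.random.default_rng(0)
   x = rng.standard_normal((3, 4, 4))  # any input shape; it is flattened
   W = rng.standard_normal((10, 48))   # 10 output neurons, 3*4*4 = 48 inputs
   b = rng.standard_normal(10)
   print(fully_connected(x, W, b).shape)  # (10,)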

docs/layers/loss_layers.rst

Lines changed: 15 additions & 11 deletions
@@ -29,7 +29,7 @@ ________________________________________
 CategoricalAccuracy
 ----------------------------------------

-The Categorical Accuracy Layer is a 0-1 loss function.
+The :python:`CategoricalAccuracy` layer is a 0-1 loss function.

 Requires two inputs, which are respectively interpreted as prediction
 scores and as a one-hot label vector. The output is one if the top
@@ -52,7 +52,8 @@ ________________________________________
 CrossEntropy
 ----------------------------------------

-Cross entropy between probability vectors.
+The :python:`CrossEntropy` layer measures the cross entropy between
+two probability vectors.

 Given a predicted distribution :math:`y` and ground truth distribution
 :math:`\hat{y}`,
@@ -76,7 +77,7 @@ ________________________________________
 L1Norm
 ----------------------------------------

-L1 vector norm
+The :python:`L1Norm` layer is the L1 norm of a vector.

 .. math::

@@ -95,7 +96,7 @@ ________________________________________
 L2Norm2
 ----------------------------------------

-Square of L2 vector norm
+The :python:`L2Norm2` layer is the square of the L2 vector norm.

 .. math::

@@ -114,7 +115,8 @@ ________________________________________
 MeanAbsoluteError
 ----------------------------------------

-Given a prediction :math:`y` and ground truth :math:`\hat{y}`,
+The :python:`MeanAbsoluteError` layer, given a prediction :math:`y`
+and ground truth :math:`\hat{y}`, computes:

 .. math::

@@ -134,7 +136,8 @@ ________________________________________
 MeanSquaredError
 ----------------------------------------

-Given a prediction :math:`y` and ground truth :math:`\hat{y}`,
+The :python:`MeanSquaredError` layer, given a prediction :math:`y`
+and ground truth :math:`\hat{y}`, computes:

 .. math::

@@ -154,11 +157,12 @@ ________________________________________
 TopKCategoricalAccuracy
 ----------------------------------------

-Requires two inputs, which are respectively interpreted as prediction
-scores and as a one-hot label vector. The output is one if the
-corresponding label matches one of the top-k prediction scores and is
-otherwise zero. Ties in the top-k prediction scores are broken in
-favor of entries with smaller indices.
+The :python:`TopKCategoricalAccuracy` layer requires two inputs, which
+are respectively interpreted as prediction scores and as a one-hot
+label vector. The output is one if the corresponding label matches one
+of the top-k prediction scores and is otherwise zero. Ties in the
+top-k prediction scores are broken in favor of entries with smaller
+indices.

 Arguments:
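The loss definitions in this file are short enough to restate as a NumPy reference sketch. This is illustrative only, not LBANN's implementation, and it assumes dense probability/label vectors:

.. code-block:: python

   import numpy as np

   def cross_entropy(y, y_hat):
       # CE(y, y_hat) = -sum_i y_hat_i * log(y_i)
       return -np.sum(y_hat * np.log(y))

   def l1_norm(x):
       return np.sum(np.abs(x))

   def l2_norm2(x):
       # Square of the L2 norm, i.e. sum of squared entries.
       return np.sum(x * x)

   def mean_absolute_error(y, y_hat):
       return np.mean(np.abs(y - y_hat))

   def mean_squared_error(y, y_hat):
       return np.mean((y - y_hat) ** 2)

   def top_k_categorical_accuracy(scores, one_hot_label, k):
       # Stable sort so ties favor smaller indices, then keep the k best.
       order = np.argsort(-scores, kind="stable")
       label = int(np.argmax(one_hot_label))
       return 1.0 if label in order[:k] else 0.0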

docs/layers/math_layers.rst

Lines changed: 9 additions & 9 deletions
@@ -24,14 +24,14 @@ ________________________________________
 DFTAbs
 ----------------------------------------

-Absolute value of discrete Fourier transform. One-, two-, or
-three-dimensional data is allowed. The implementation is meant to be
-as flexible as possible. We use FFTW for the CPU implementation;
-whichever types your implementation of FFTW supports will be supported
-in this layer at runtime. The GPU implementation uses cuFFT on NVIDIA
-GPUs and will support float and double at runtime (assuming CUDA
-support is enabled). A future implementation will support rocFFT for
-AMD GPUs.
+The :python:`DFTAbs` layer computes the absolute value of the discrete
+Fourier transform. One-, two-, or three-dimensional data is
+allowed. The implementation is meant to be as flexible as possible. We
+use FFTW for the CPU implementation; whichever types your
+implementation of FFTW supports will be supported in this layer at
+runtime. The GPU implementation uses cuFFT on NVIDIA GPUs and will
+support float and double at runtime (assuming CUDA support is
+enabled). A future implementation will support rocFFT for AMD GPUs.

 Currently, LBANN only supports outputting the same type that is used
 as input. As such, in forward propagation, this will do a DFT and then
@@ -52,7 +52,7 @@ ________________________________________
 MatMul
 ----------------------------------------

-The MatMul layer performs Matrix multiplication.
+The :python:`MatMul` layer performs matrix multiplication.

 Performs matrix product of two 2D input tensors. If the input tensors
 are 3D, then matrix products are computed independently over the first
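A rough NumPy analogue of the two forward computations described above (LBANN itself dispatches to FFTW or cuFFT rather than NumPy's FFT; this sketch only mirrors the math):

.. code-block:: python

   import numpy as np

   def dft_abs(x):
       # Absolute value of the N-dimensional discrete Fourier transform.
       return np.abs(np.fft.fftn(x))

   def matmul(a, b):
       # 2D inputs: ordinary matrix product. 3D inputs: independent
       # products over the leading (batch) dimension, as np.matmul does.
       return np.matmul(a, b)

   x = np.random.default_rng(0).standard_normal((4, 4))
   print(dft_abs(x).shape)    # (4, 4)
   a = np.ones((5, 2, 3)); b = np.ones((5, 3, 4))
   print(matmul(a, b).shape)  # (5, 2, 4)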
