Convolution
Convolution() computes the convolution of a weight matrix with an image. There is a simplified syntax for 2D convolutions and a more advanced syntax for N-dimensional convolutions.
The 2D convolution syntax is:
Convolution(w, image,
            kernelWidth, kernelHeight,
            outputChannels,
            horizontalStride, verticalStride,
            zeroPadding=false, maxTempMemSizeInSamples=0, imageLayout="cudnn" /* or "HWC" */)
where:
- w - the convolution weight matrix; it has the dimensions [outputChannels, kernelWidth * kernelHeight * inputChannels].
- image - the input image.
- kernelWidth - width of the kernel
- kernelHeight - height of the kernel
- outputChannels - number of output channels (feature maps)
- horizontalStride - stride in the horizontal direction
- verticalStride - stride in the vertical direction
- zeroPadding - [named optional] specifies whether the sides of the image should be padded with zeros. Default is false.
- maxTempMemSizeInSamples - [named optional] maximum amount of auxiliary memory (in samples) that should be reserved to perform convolution operations. Some convolution engines (e.g. cuDNN and GEMM-based engines) can benefit from using a workspace, as it may improve performance, but it may sometimes lead to higher memory utilization. Default is 0, which means the same as the number of input samples.
- imageLayout - [named optional] the storage format of each image. By default it is HWC, which means each image is stored as [channel, width, height] in column-major order. If you use cuDNN to speed up training, you should set it to cudnn, which means each image is stored as [width, height, channel]. Note that the cudnn layout works on both GPU and CPU, so it is recommended by default.
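For instance (illustrative numbers, not taken from this page), a convolution over a 3-channel input with a 5 x 5 kernel and 64 output channels needs a weight matrix of dimensions [64, 5 * 5 * 3] = [64, 75]:
W = LearnableParameter (64, 75, init="gaussian", initValueScale=0.0043)  # [outputChannels, kernelWidth * kernelHeight * inputChannels]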
Example (ConvReLULayer NDL macro):
ConvReLULayer(inp, outMap, inWCount, kW, kH, hStride, vStride, wScale, bValue) =
[
W = LearnableParameter (outMap, inWCount, init="gaussian", initValueScale=wScale)
b = ImageParameter (1, 1, outMap, init="fixedValue", value=bValue, imageLayout="$imageLayout$")
c = Convolution (W, inp, kW, kH, outMap, hStride, vStride, zeroPadding=true, imageLayout="$imageLayout$")
y = RectifiedLinear (c + b)
].y
Note: if you are using the deprecated NDLNetworkBuilder, the optional imageLayout parameter defaults to "HWC" instead, and there should be no trailing .y in the example.
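For illustration, a minimal sketch of how the macro above might be invoked (the input node name featScaled, the concrete parameter values, and the assumption that $imageLayout$ is defined elsewhere in the configuration are all illustrative, not part of this page). A 5 x 5 kernel over a 3-channel input gives inWCount = 5 * 5 * 3 = 75, with 64 output feature maps and stride 1:
# inp=featScaled, outMap=64, inWCount=75, kW=5, kH=5, hStride=1, vStride=1, wScale=0.0043, bValue=0
conv1 = ConvReLULayer (featScaled, 64, 75, 5, 5, 1, 1, 0.0043, 0)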
N-dimensional convolution allows you to create convolutions with arbitrary dimensions, strides, sharing, and padding. The syntax is:
Convolution(w, input,
{kernel dimensions},
mapCount = {map dimensions},
stride = {stride dimensions},
sharing = {sharing flags},
autoPadding = {padding flags (boolean)},
lowerPad = {lower padding (int)},
upperPad = {upper padding (int)},
maxTempMemSizeInSamples = 0,
imageLayout = "cudnn")
where:
- w - the convolution weight matrix; it has the dimensions [kernelCount, kernelDimensionsProduct].
- input - the convolution input
- kernel dimensions - dimensions of the kernel
- mapCount - [named] number of output feature maps
- stride - [named, optional, default is 1] stride dimensions
- sharing - [named, optional, default is true] sharing flags for each input dimension
- autoPadding - [named, optional, default is true] automatic padding flags for each input dimension
- lowerPad - [named, optional, default is 0] precise lower padding for each input dimension
- upperPad - [named, optional, default is 0] precise upper padding for each input dimension
- maxTempMemSizeInSamples - [named optional] maximum amount of auxiliary memory (in samples) that should be reserved to perform convolution operations. Some convolution engines (e.g. cuDNN and GEMM-based engines) can benefit from using a workspace, as it may improve performance, but it may sometimes lead to higher memory utilization. Default is 0, which means the same as the number of input samples.
- imageLayout - [named optional] the storage format of each image. The only supported value is cudnn, which means each image is stored as [width, height, channel] (column-major notation).
All dimension arrays are colon-separated. Note: if you use the deprecated NDLNetworkBuilder, they must be comma-separated instead.
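As a minimal sketch of the N-dimensional form (assuming BrainScript notation, where dimension arrays are written as parenthesized colon-separated lists; the concrete numbers and node names are illustrative assumptions): the same 2D convolution as above, expressed as a 3-dimensional kernel spanning 5 x 5 pixels and all 3 input channels, with 64 output feature maps:
W = LearnableParameter (64, 75, init="gaussian", initValueScale=0.0043)  # [kernelCount, kernelDimensionsProduct] = [64, 5 * 5 * 3]
c = Convolution (W, featScaled,
                 (5:5:3),                          # kernel covers 5 x 5 pixels and all 3 input channels
                 mapCount = 64,
                 stride = (1:1:3),                 # step one pixel spatially; channel stride equals the number of input channels
                 sharing = (true:true:true),
                 autoPadding = (true:true:false),  # zero-pad the spatial dimensions only
                 imageLayout = "cudnn")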