Convolution
Frank Seide edited this page Jul 27, 2016
Convolution() computes the convolution of a weight matrix with an image or tensor. This operation is used in image-processing and language-processing applications.
Convolution() supports arbitrary dimensions, strides, sharing, and padding. The syntax is:
Convolution(w, input,
{kernel dimensions},
mapCount = {map dimensions},
stride = {stride dimensions},
sharing = {sharing flags},
autoPadding = {padding flags (boolean)},
lowerPad = {lower padding (int)},
upperPad = {upper padding (int)},
maxTempMemSizeInSamples = 0,
imageLayout = "cudnn")
Where:

- `w` - convolution weight matrix; it has dimensions `[mapCount, kernelDimensionsProduct]`
- `input` - convolution input
- `kernel dimensions` - dimensions of the kernel
- `mapCount` - [named, optional, default is 0] depth of the feature map. 0 means to use the row dimension of `w`
- `stride` - [named, optional, default is 1] stride dimensions
- `sharing` - [named, optional, default is true] sharing flags for each input dimension
- `autoPadding` - [named, optional, default is true] automatic padding flags for each input dimension
- `lowerPad` - [named, optional, default is 0] precise lower padding for each input dimension
- `upperPad` - [named, optional, default is 0] precise upper padding for each input dimension
- `maxTempMemSizeInSamples` - [named, optional, default is 0] maximum amount of auxiliary memory (in samples) reserved for performing convolution operations. Some convolution engines (e.g. cuDNN and GEMM-based engines) can benefit from a workspace, which may improve performance but can also increase memory utilization. The default of 0 means the same as the number of input samples.
All values of the form {...} must actually be given as a colon-separated sequence of values, e.g. (5:5) for the kernel dimensions. (If you use the deprecated NDLNetworkBuilder, these must be comma-separated and enclosed in { } instead.)
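To illustrate how `stride` and `autoPadding` interact, here is a minimal sketch (plain Python, not CNTK code) of the usual output-dimension arithmetic: with automatic padding, the output covers every stride position ("same"-style padding); without it, only full kernel placements count ("valid"). The exact formula is an assumption based on standard convolution conventions, not taken from this page.

```python
import math

def conv_output_dim(input_dim, kernel_dim, stride, auto_pad):
    """Output size along one dimension of a convolution.

    auto_pad=True: pad so that every stride position produces an output
    ("same"-style padding). auto_pad=False: no padding ("valid").
    """
    if auto_pad:
        return math.ceil(input_dim / stride)
    return (input_dim - kernel_dim) // stride + 1

# A 28-wide input with a 5-wide kernel:
print(conv_output_dim(28, 5, 1, True))   # padded, stride 1 -> 28
print(conv_output_dim(28, 5, 1, False))  # unpadded, stride 1 -> 24
print(conv_output_dim(28, 5, 2, True))   # padded, stride 2 -> 14
```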
Example (ConvReLULayer NDL macro):
ConvReLULayer(inp, outMap, inWCount, kW, kH, hStride, vStride, wScale, bValue) =
[
W = LearnableParameter (outMap, inWCount, init="gaussian", initValueScale=wScale)
b = ImageParameter (1, 1, outMap, init="fixedValue", value=bValue)
c = Convolution (W, inp, (kW:kH), stride=(hStride:vStride), autoPadding=true)
y = RectifiedLinear (c + b)
].y
Note: If you are using the deprecated NDLNetworkBuilder, there should be no trailing .y in the example.
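To see why the weight matrix `W` above has shape `[outMap, inWCount]` (i.e. `[mapCount, kernelWidth * kernelHeight * inputChannels]`), note that convolution can be computed as a matrix product of `w` with an "im2col" matrix whose columns are the flattened kernel-sized patches of the input. The NumPy sketch below illustrates this for stride 1 without padding; the exact patch-flattening order used internally by CNTK is an assumption here.

```python
import numpy as np

def conv2d_via_im2col(w, image, kW, kH):
    """w: [mapCount, kW*kH*C]; image: [H, W, C]; stride 1, no padding."""
    H, W, C = image.shape
    outH, outW = H - kH + 1, W - kW + 1
    cols = np.empty((kW * kH * C, outH * outW))
    for y in range(outH):
        for x in range(outW):
            # One receptive field, flattened into a column.
            cols[:, y * outW + x] = image[y:y+kH, x:x+kW, :].ravel()
    out = w @ cols                    # [mapCount, outH*outW]
    return out.reshape(w.shape[0], outH, outW)

rng = np.random.default_rng(0)
img = rng.standard_normal((6, 6, 3))
w = rng.standard_normal((4, 3 * 3 * 3))   # mapCount=4, 3x3 kernel, 3 channels
print(conv2d_via_im2col(w, img, 3, 3).shape)  # (4, 4, 4)
```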
The 2D convolution syntax is:
Convolution(w, image,
kernelWidth, kernelHeight,
horizontalStride, verticalStride,
zeroPadding=false, maxTempMemSizeInSamples=0, imageLayout="cudnn" /* or "HWC"*/ )
where:

- `w` - convolution weight matrix; it has dimensions `[mapCount, kernelWidth * kernelHeight * inputChannels]`
- `image` - the input image
- `mapCount` - depth of the output feature map (number of output channels)
- `kernelWidth` - width of the kernel
- `kernelHeight` - height of the kernel
- `horizontalStride` - stride in the horizontal direction
- `verticalStride` - stride in the vertical direction
- `zeroPadding` - [named, optional, default is false] specifies whether the sides of the image should be padded with zeros
- `maxTempMemSizeInSamples` - [named, optional, default is 0] maximum amount of auxiliary memory (in samples) reserved for performing convolution operations. Some convolution engines (e.g. cuDNN and GEMM-based engines) can benefit from a workspace, which may improve performance but can also increase memory utilization. The default of 0 means the same as the number of input samples.
- `imageLayout` - [named, optional, default is `"HWC"`] the storage format of each image. With `"HWC"`, each image is stored as `[channel, width, height]` in column-major order. If you use cuDNN to speed up training, you should set it to `"cudnn"`, which means each image is stored as `[width, height, channel]`. Note that the `"cudnn"` layout works on both GPU and CPU, so it is recommended by default.
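The difference between the two layouts can be sketched with NumPy (an illustration of interleaved vs. planar channel storage, not CNTK's internal code): in the legacy `"HWC"` layout the channel index varies fastest (channels are interleaved per pixel), while in the `"cudnn"` layout each channel plane is contiguous, and converting between them is a transpose.

```python
import numpy as np

# Interleaved-channel layout: channel varies fastest per pixel.
hwc = np.arange(3 * 4 * 2).reshape(3, 4, 2)   # [height, width, channel]

# Planar layout: each channel plane is contiguous in memory.
chw = hwc.transpose(2, 0, 1)                   # [channel, height, width]

print(hwc.shape, chw.shape)  # (3, 4, 2) (2, 3, 4)
# Same pixel value, addressed in either layout:
print(hwc[2, 3, 1] == chw[1, 2, 3])  # True
```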