torch (version 0.1.1)

nn_conv2d: Conv2D module

Description

Applies a 2D convolution over an input signal composed of several input planes.

Usage

nn_conv2d(
  in_channels,
  out_channels,
  kernel_size,
  stride = 1,
  padding = 0,
  dilation = 1,
  groups = 1,
  bias = TRUE,
  padding_mode = "zeros"
)

Arguments

in_channels

(int): Number of channels in the input image

out_channels

(int): Number of channels produced by the convolution

kernel_size

(int or tuple): Size of the convolving kernel

stride

(int or tuple, optional): Stride of the convolution. Default: 1

padding

(int or tuple, optional): Zero-padding added to both sides of the input. Default: 0

dilation

(int or tuple, optional): Spacing between kernel elements. Default: 1

groups

(int, optional): Number of blocked connections from input channels to output channels. Default: 1

bias

(bool, optional): If TRUE, adds a learnable bias to the output. Default: TRUE

padding_mode

(string, optional): 'zeros', 'reflect', 'replicate' or 'circular'. Default: 'zeros'

Shape

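  • Input: $(N, C_{in}, H_{in}, W_{in})$

  • Output: $(N, C_{out}, H_{out}, W_{out})$ where

    $$H_{out} = \left\lfloor \frac{H_{in} + 2 \times \text{padding}[0] - \text{dilation}[0] \times (\text{kernel\_size}[0] - 1) - 1}{\text{stride}[0]} + 1 \right\rfloor$$

    $$W_{out} = \left\lfloor \frac{W_{in} + 2 \times \text{padding}[1] - \text{dilation}[1] \times (\text{kernel\_size}[1] - 1) - 1}{\text{stride}[1]} + 1 \right\rfloor$$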
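
The output-size formula can be checked without building a module. The following is a minimal base R sketch; `conv2d_output_size` is a hypothetical helper written for illustration, not a function exported by torch, and it assumes scalar arguments should be recycled to both spatial dimensions.

```r
# Hypothetical helper (not part of torch): spatial output size of a 2D convolution.
# input_size is c(H_in, W_in); the other arguments may be a single number or a pair.
conv2d_output_size <- function(input_size, kernel_size, stride = 1,
                               padding = 0, dilation = 1) {
  # Recycle scalars so each setting has one value for height and one for width
  kernel_size <- rep_len(kernel_size, 2)
  stride <- rep_len(stride, 2)
  padding <- rep_len(padding, 2)
  dilation <- rep_len(dilation, 2)

  # Direct transcription of the H_out / W_out formula
  floor((input_size + 2 * padding - dilation * (kernel_size - 1) - 1) / stride + 1)
}

# Settings taken from the last module in the Examples section
out_hw <- conv2d_output_size(c(50, 100), kernel_size = c(3, 5), stride = c(2, 1),
                             padding = c(4, 2), dilation = c(3, 1))
print(out_hw)
```

With these settings the helper returns a height of 26 and a width of 100, so a (20, 16, 50, 100) input would be expected to produce a (20, 33, 26, 100) output.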

Attributes

  • weight (Tensor): the learnable weights of the module of shape $(\text{out\_channels}, \frac{\text{in\_channels}}{\text{groups}}, \text{kernel\_size}[0], \text{kernel\_size}[1])$. The values of these weights are sampled from $\mathcal{U}(-\sqrt{k}, \sqrt{k})$ where $k = \frac{\text{groups}}{C_{in} \prod_{i=0}^{1} \text{kernel\_size}[i]}$

  • bias (Tensor): the learnable bias of the module of shape (out_channels). If bias is TRUE, then the values of these weights are sampled from $\mathcal{U}(-\sqrt{k}, \sqrt{k})$ where $k = \frac{\text{groups}}{C_{in} \prod_{i=0}^{1} \text{kernel\_size}[i]}$
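
As a rough check of the shapes and the initialization range described above, the sketch below computes them in base R. `conv2d_param_info` is a hypothetical name used only for this illustration; it does not inspect a real module, it only evaluates the stated formulas.

```r
# Hypothetical helper (not part of torch): parameter shapes and the uniform
# initialization bound sqrt(k) implied by the formulas above.
conv2d_param_info <- function(in_channels, out_channels, kernel_size, groups = 1) {
  stopifnot(in_channels %% groups == 0, out_channels %% groups == 0)
  kernel_size <- rep_len(kernel_size, 2)

  # k = groups / (C_in * kernel_size[0] * kernel_size[1])
  k <- groups / (in_channels * prod(kernel_size))

  list(
    weight_shape = c(out_channels, in_channels / groups, kernel_size),
    bias_shape = out_channels,
    bound = sqrt(k) # values are drawn from U(-bound, bound)
  )
}

info <- conv2d_param_info(16, 33, c(3, 5))
print(info)
```

For 16 input channels, 33 output channels, and a 3 x 5 kernel, the weight shape is (33, 16, 3, 5) and k is 1/240.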

Details

In the simplest case, the output value of the layer with input size $(N, C_{in}, H, W)$ and output $(N, C_{out}, H_{out}, W_{out})$ can be precisely described as:

$$\text{out}(N_i, C_{out_j}) = \text{bias}(C_{out_j}) + \sum_{k=0}^{C_{in} - 1} \text{weight}(C_{out_j}, k) \star \text{input}(N_i, k)$$

where $\star$ is the valid 2D cross-correlation operator, $N$ is the batch size, $C$ denotes the number of channels, $H$ is the height of the input planes in pixels, and $W$ is the width in pixels.
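
To make the sum of cross-correlations concrete, here is a deliberately slow base R sketch for one sample and one output channel, with stride 1, no padding, and no dilation. Both function names are hypothetical and are not part of torch; the array layout (channels first, then height, then width) is an assumption chosen to mirror the notation above.

```r
# Hypothetical helper: valid 2D cross-correlation of one plane x with one kernel w.
# The kernel is slid over x without flipping and without padding.
xcorr2d_valid <- function(x, w) {
  out_h <- nrow(x) - nrow(w) + 1
  out_w <- ncol(x) - ncol(w) + 1
  out <- matrix(0, out_h, out_w)
  for (i in seq_len(out_h)) {
    for (j in seq_len(out_w)) {
      patch <- x[i:(i + nrow(w) - 1), j:(j + ncol(w) - 1), drop = FALSE]
      out[i, j] <- sum(patch * w)
    }
  }
  out
}

# Hypothetical helper: out(N_i, C_out_j) for a single sample and output channel.
# input has dim c(C_in, H, W); weight has dim c(C_in, kH, kW).
conv2d_single <- function(input, weight, bias = 0) {
  # Extract channel k as a matrix even if a spatial dimension has length 1
  plane <- function(a, k) matrix(a[k, , ], nrow = dim(a)[2])
  planes <- lapply(seq_len(dim(input)[1]), function(k) {
    xcorr2d_valid(plane(input, k), plane(weight, k))
  })
  # bias + sum over input channels of the per-channel cross-correlations
  bias + Reduce(`+`, planes)
}

input <- array(1:18, dim = c(2, 3, 3))  # 2 channels, 3 x 3 pixels
weight <- array(1, dim = c(2, 2, 2))    # all-ones 2 x 2 kernel per channel
result <- conv2d_single(input, weight, bias = 0)
print(result)
```

With an all-ones kernel, each output value is simply the sum of the corresponding 2 x 2 window across both channels, which makes the result easy to verify by hand.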

  • stride controls the stride for the cross-correlation, a single number or a tuple.

  • padding controls the amount of implicit zero-padding: padding points are added on both sides of each dimension.

  • dilation controls the spacing between the kernel points; also known as the à trous algorithm. It is harder to describe, but this link has a nice visualization of what dilation does.

  • groups controls the connections between inputs and outputs. in_channels and out_channels must both be divisible by groups. For example,

    • At groups=1, all inputs are convolved to all outputs.

    • At groups=2, the operation becomes equivalent to having two conv layers side by side, each seeing half the input channels, and producing half the output channels, and both subsequently concatenated.

    • At groups = in_channels, each input channel is convolved with its own set of filters, of size $\frac{\text{out\_channels}}{\text{in\_channels}}$.
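
The effect of groups can be illustrated by listing which input channels each output channel sees. The sketch below is base R only; `group_connections` is a hypothetical helper, and it assumes that channels are assigned to groups in contiguous blocks, which matches the "side by side" description above.

```r
# Hypothetical helper (not part of torch): for each output channel, the
# one-based indices of the input channels it is connected to.
group_connections <- function(in_channels, out_channels, groups) {
  stopifnot(in_channels %% groups == 0, out_channels %% groups == 0)
  in_per_group <- in_channels / groups
  out_per_group <- out_channels / groups

  lapply(seq_len(out_channels), function(j) {
    g <- (j - 1) %/% out_per_group            # zero-based group of output channel j
    g * in_per_group + seq_len(in_per_group)  # input channels belonging to that group
  })
}

# groups = 2: output channels 1-3 see inputs 1-2, output channels 4-6 see inputs 3-4
conns <- group_connections(in_channels = 4, out_channels = 6, groups = 2)
str(conns)

# groups = in_channels: each input channel feeds out_channels / in_channels filters
depthwise <- group_connections(in_channels = 4, out_channels = 8, groups = 4)
```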

The parameters kernel_size, stride, padding, dilation can either be:

  • a single int -- in which case the same value is used for the height and width dimension

  • a tuple of two ints -- in which case, the first int is used for the height dimension, and the second int for the width dimension

Examples

library(torch)

if (torch_is_installed()) {
  # Square kernel and equal stride
  m <- nn_conv2d(16, 33, 3, stride = 2)
  # Non-square kernel, unequal stride, and padding
  m <- nn_conv2d(16, 33, c(3, 5), stride = c(2, 1), padding = c(4, 2))
  # Non-square kernel, unequal stride, padding, and dilation
  m <- nn_conv2d(16, 33, c(3, 5), stride = c(2, 1), padding = c(4, 2), dilation = c(3, 1))
  # Batch of 20 inputs, each with 16 channels of 50 x 100 pixels
  input <- torch_randn(20, 16, 50, 100)
  # Only the last module assigned to m is applied
  output <- m(input)
}