Convolutional Neural Networks dimension


convolution operation: the same convolution core (also called a kernel or filter) is shared across every position of the input.

output size is $(n-f+1)\times (n-f+1)$, where input size is $n\times n$ and convolution core size is $f\times f$.
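As a quick check of the size rule, here is a minimal pure-Python sketch of a single-channel convolution with no padding and a step of 1. The function name `conv2d_valid` and the toy values are assumptions made for illustration, not part of any library.

```python
def conv2d_valid(image, core):
    """Slide one f x f core over an n x n image (no padding, step of 1)."""
    n, f = len(image), len(core)
    size = n - f + 1  # output side length
    out = [[0.0] * size for _ in range(size)]
    for i in range(size):
        for j in range(size):
            # The same core weights are reused at every position.
            out[i][j] = sum(
                image[i + a][j + b] * core[a][b]
                for a in range(f)
                for b in range(f)
            )
    return out


image = [[float(6 * r + c) for c in range(6)] for r in range(6)]  # 6 x 6 input
core = [[1.0] * 3 for _ in range(3)]                              # 3 x 3 core
result = conv2d_valid(image, core)
print(len(result), len(result[0]))  # 4 4
```

With $n = 6$ and $f = 3$ the result has $6 - 3 + 1 = 4$ rows and columns.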

[Figure: convolution operation]

terms: channels, strides, padding

  • channels: in the following picture, the input is a $6\times 6\times 3$ RGB picture and we use a $3\times 3\times 3$ convolution core. The last 3 is the channel number of the convolution core, which must be the same as the channel number of the input picture. During convolution, we multiply the input by the core element by element across all channels and add the products together, so the output size is $4\times 4\times 1$.

[Figure: a $6\times 6\times 3$ input convolved with one $3\times 3\times 3$ core]

We usually use not just one convolution core but several, to extract multiple features from the input. The following image shows the case with two cores, so the output size is $4\times 4\times 2$. The next convolution layer then has 2 input channels.

[Figures: convolution with two cores]

output size: $(n-f+1)\times (n-f+1)\times C_o$, where $C_o$ is the number of convolution cores (the output channel number).
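The channel rule can be sketched in pure Python as well. This is only an illustration: `conv_multi`, the nested-list layout (height, width, channel), and the constant toy values are assumptions chosen to keep the example short.

```python
def conv_multi(image, cores):
    """image: n x n x C_in nested lists; cores: list of f x f x C_in cores."""
    n, f, c_in = len(image), len(cores[0]), len(image[0][0])
    size = n - f + 1
    out = [[[0.0] * len(cores) for _ in range(size)] for _ in range(size)]
    for k, core in enumerate(cores):  # one output channel per core
        for i in range(size):
            for j in range(size):
                # Multiply over the f x f window and all input channels, then sum.
                out[i][j][k] = sum(
                    image[i + a][j + b][ch] * core[a][b][ch]
                    for a in range(f)
                    for b in range(f)
                    for ch in range(c_in)
                )
    return out


image = [[[1.0, 2.0, 3.0] for _ in range(6)] for _ in range(6)]  # 6 x 6 x 3
ones = [[[1.0] * 3 for _ in range(3)] for _ in range(3)]         # 3 x 3 x 3
halves = [[[0.5] * 3 for _ in range(3)] for _ in range(3)]       # 3 x 3 x 3
out = conv_multi(image, [ones, halves])
print(len(out), len(out[0]), len(out[0][0]))  # 4 4 2
```

Each core collapses the 3 input channels into a single number per position, so the output depth equals the number of cores, not the number of input channels.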

  • padding: add data (usually zeros) to the border of the input, often to make sure the output size is the same as the input size.

    [Figure: padding with $p = 1$]

    output size: $(n-f+2p+1)\times (n-f+2p+1)\times C_o$, where $p$ is the padding length on one side, e.g. $p = 1$ in the last picture.

    if you want the output size to be the same as the input size, set $p = (f-1)/2$ (this needs an odd $f$), since then $n-f+2p+1 = n$.

  • strides: the stride is the step length by which the convolution core moves across the input.

[Figures: convolution with strides]

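output size: $\left(\left\lfloor \frac{n-f+2p}{s} \right\rfloor + 1\right)\times \left(\left\lfloor \frac{n-f+2p}{s} \right\rfloor + 1\right)\times C_o$, where $s$ is the stride length.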
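Padding and stride can be folded into one small helper. The sketch below assumes the common floor-based rule $\lfloor (n-f+2p)/s \rfloor + 1$ for the side length. The name `conv_output_size` is hypothetical.

```python
def conv_output_size(n, f, p=0, s=1):
    """Side length of the output: floor((n - f + 2p) / s) + 1."""
    return (n - f + 2 * p) // s + 1


print(conv_output_size(6, 3))            # 4: no padding, stride 1
print(conv_output_size(6, 3, p=1))       # 6: p = (f - 1) / 2 keeps the input size
print(conv_output_size(7, 3, s=2))       # 3: stride 2
print(conv_output_size(6, 3, p=1, s=2))  # 3: padding and stride together
```

The integer division matters when the core does not fit evenly: positions where the core would run past the (padded) border are simply dropped.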

demo

[Figure: demo CNN]

So let's get the demo's output size:

