Refresher on CNN Activation Map
There are plenty of explanations elsewhere, so here I'd like to share an example question of the kind you might get in an interview setting.
For a 32 by 32 by 3 input image, if we apply 10 convolution filters of size 5 by 5 with stride 1 and padding of 2, what is the size of the output activation map volume?
Here are some tips for readers’ reference:
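To calculate the output activation map volume size of a convolutional neural network (CNN) layer from a given input image size, filter size, stride, and padding, you can use the following formula along each spatial dimension:
$$\text{output size} = \frac{W - F + 2P}{S} + 1$$
where W is the input width (or height), F is the filter size, P is the padding, and S is the stride.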
Let’s apply this formula to the given values:
- Input image size: 32 by 32 by 3
- Filter size: 5 by 5
- Stride: 1
- Padding: 2
We then have:
$$\text{output size} = \frac{32 - 5 + 2 \cdot 2}{1} + 1 = 32$$
So the resulting output activation map will have a size of 32 by 32 for each filter. Since we have 10 filters, the final output activation map volume size would be 32 by 32 by 10.
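For readers who want to check the arithmetic, here is a minimal Python sketch of the same calculation. The function names `conv_output_size` and `conv_output_volume` are made up for illustration and do not come from any library. The sketch also assumes the filter must fit the padded input exactly and raises an error otherwise, whereas deep learning frameworks usually round down instead.

```python
def conv_output_size(input_size: int, filter_size: int,
                     stride: int = 1, padding: int = 0) -> int:
    """Output size along one spatial dimension: (W - F + 2P) / S + 1."""
    span = input_size - filter_size + 2 * padding
    # Assumption of this sketch: reject settings where the filter
    # does not tile the padded input exactly.
    if span < 0 or span % stride != 0:
        raise ValueError("filter, stride and padding do not fit the input")
    return span // stride + 1


def conv_output_volume(height: int, width: int, num_filters: int,
                       filter_size: int, stride: int = 1,
                       padding: int = 0) -> tuple:
    """Output volume as (height, width, depth).

    The depth equals the number of filters. The input depth
    (3 in the example) does not change the output size.
    """
    out_h = conv_output_size(height, filter_size, stride, padding)
    out_w = conv_output_size(width, filter_size, stride, padding)
    return (out_h, out_w, num_filters)


# The interview question: 32x32x3 input, 10 filters of 5x5, stride 1, padding 2.
print(conv_output_volume(32, 32, 10, filter_size=5, stride=1, padding=2))
# prints (32, 32, 10)
```

Setting `padding=0` in the same call gives 28 by 28 by 10, which shows why a padding of 2 is the usual choice for keeping the spatial size unchanged with 5 by 5 filters at stride 1.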
Let’s check out the explanation by Serena Yeung from Stanford:
Happy practicing!