Convolutional Neural Network
What is a CNN (Convolutional Neural Network)?
It is a Neural Network that uses convolution operations, typically to process images.
A convolution is an operation that combines two functions into a third: it shows how the shape of one function is modified by the other.
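To make the definition concrete, here is a minimal sketch of the discrete, one-dimensional version of a convolution. The arrays `signal` and `kernel` and the helper `convolve_1d` are made-up names for illustration only; NumPy's built-in `np.convolve` computes the same thing.

```python
import numpy as np

def convolve_1d(f, g):
    """Discrete convolution: (f * g)[n] = sum over i of f[i] * g[n - i]."""
    out = np.zeros(len(f) + len(g) - 1)
    for i in range(len(f)):
        for j in range(len(g)):
            # f[i] * g[j] contributes to output position n = i + j
            out[i + j] += f[i] * g[j]
    return out

signal = np.array([1.0, 2.0, 3.0])   # hypothetical first function
kernel = np.array([0.0, 1.0, 0.5])   # hypothetical second function

result = convolve_1d(signal, kernel)
print(result)  # [0.  1.  2.5 4.  1.5]
```

The output is the signal reshaped by the kernel: each input value is copied once at full weight and once more, one step later, at half weight.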
First, the Machine turns the image into an array of numbers, one per pixel: for a pure black-and-white image these are 0s and 1s.
Like a human, the Machine looks for specific features to recognize the image.
We apply a Feature Detector (also called a Filter or Kernel) to the Input Image and get a Feature Map (also called a Convolved Feature or Activation Map), which records where in the Input Image that feature appears and how strongly it matches.
Multiple Feature Maps, one per Feature Detector, make up the Convolutional Layer.
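The steps above can be sketched with NumPy. This is only an illustration under some assumptions: the 5x5 binary image, the two 3x3 detectors and the helper `feature_map` are invented for the example, and the detector is slid over the image without flipping it, which is how most deep-learning libraries implement a "convolution".

```python
import numpy as np

def feature_map(image, detector, stride=1):
    """Slide the detector over the image and sum the element-wise products."""
    ih, iw = image.shape
    kh, kw = detector.shape
    oh = (ih - kh) // stride + 1
    ow = (iw - kw) // stride + 1
    out = np.zeros((oh, ow))
    for r in range(oh):
        for c in range(ow):
            top, left = r * stride, c * stride
            patch = image[top:top + kh, left:left + kw]
            out[r, c] = np.sum(patch * detector)
    return out

# Hypothetical black-and-white image (0s and 1s) containing a diagonal line
image = np.array([
    [0, 0, 0, 0, 0],
    [0, 1, 0, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 0, 1, 0],
    [0, 0, 0, 0, 0],
])

# Two hypothetical Feature Detectors: a diagonal and an anti-diagonal
diagonal = np.array([[1, 0, 0], [0, 1, 0], [0, 0, 1]])
anti_diagonal = np.array([[0, 0, 1], [0, 1, 0], [1, 0, 0]])

fmap = feature_map(image, diagonal)
print(fmap)
# [[2. 0. 0.]
#  [0. 3. 0.]
#  [0. 0. 2.]]

# One Feature Map per detector, stacked together, forms the Convolutional Layer output
conv_layer = np.stack([fmap, feature_map(image, anti_diagonal)])
print(conv_layer.shape)  # (2, 3, 3)
```

The highest value (3) sits in the centre of the Feature Map, where the diagonal detector matches the image exactly.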
We apply a Rectifier Function ('relu') to increase non-linearity in our CNN, because images themselves are highly non-linear (different objects in the image, background details, transitions between pixels).
Applying the Feature Detector is a linear operation, so on its own it risks producing something linear; the 'relu' function breaks this linearity.
For example, if a Feature Map contains a linear progression from white to black, and black corresponds to the negative values, 'relu' sets all of those negative values to zero, so the progression is no longer a straight line.
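A minimal sketch of the rectifier, assuming (as an illustration only) that the dark end of the progression shows up as negative values in the Feature Map. The `ramp` array is made up for the example.

```python
import numpy as np

def relu(x):
    """Rectifier: keep positive values, replace negative values with 0."""
    return np.maximum(x, 0)

# Hypothetical linear progression in a Feature Map, from negative to positive
ramp = np.linspace(-2, 2, 5)
rectified = relu(ramp)

print(ramp)       # [-2. -1.  0.  1.  2.]
print(rectified)  # [0. 0. 0. 1. 2.]
```

Before the rectifier the values lie on a straight line; afterwards the negative half is flat at zero, so the progression is no longer linear.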
Max Pooling, Pooled Feature Maps, Flattening & Dense/Full Connection
Max Pooling (Down Sampling) gives us:
- Spatial Invariance: the Neural Network doesn't care exactly where in the image it finds a Feature, or whether the Feature is slightly distorted, so it can still recognize that Feature in new, similar images. This gives us some flexibility
- Feature preservation: if the image is slightly rotated or shifted, the Pooled Feature Map still keeps the feature's strongest values
- Size reduction (75% with a 2x2 pool and a stride of 2): fewer values go into the rest of the Neural Network, which cuts the number of parameters and helps prevent overfitting
We apply Max Pooling (2x2 with a stride of 2) to our Feature Map:
- Place a 2x2 window on the top-left corner of the Feature Map
- Take the largest number inside the window and record it in the Pooled Feature Map
- Move the window by the stride and repeat until the whole Feature Map is covered
What is a stride?
A stride is how many columns (when moving right) or rows (when moving down) the window jumps at each step.
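The Max Pooling steps and the stride can be sketched as follows. The 4x4 Feature Map and the helper `max_pool` are invented for the example, and this sketch assumes that windows which would hang over the edge are simply dropped.

```python
import numpy as np

def max_pool(fmap, size=2, stride=2):
    """Record the largest value of each size x size window, jumping by `stride`."""
    h, w = fmap.shape
    oh = (h - size) // stride + 1
    ow = (w - size) // stride + 1
    pooled = np.zeros((oh, ow), dtype=fmap.dtype)
    for r in range(oh):
        for c in range(ow):
            top, left = r * stride, c * stride   # the stride sets the jump
            window = fmap[top:top + size, left:left + size]
            pooled[r, c] = window.max()
    return pooled

# Hypothetical 4x4 Feature Map
fmap = np.array([
    [1, 0, 2, 3],
    [4, 6, 6, 8],
    [3, 1, 1, 0],
    [1, 2, 2, 4],
])

pooled = max_pool(fmap)
print(pooled)
# [[6 8]
#  [3 4]]

reduction = 1 - pooled.size / fmap.size
print(reduction)  # 0.75
```

The 16 values of the Feature Map shrink to 4, which is the 75% reduction you get from a 2x2 pool with a stride of 2.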
Flattening the Pooled Feature Map
We take the Pooled Feature Map and Flatten() it into a one-dimensional array, so we can use it as the input to a fully connected Neural Network.
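As a rough illustration, flattening can be done with a NumPy reshape. The two 2x2 Pooled Feature Maps below are made-up values; a `Flatten` layer in a library such as Keras does the equivalent for each image in a batch.

```python
import numpy as np

# Two hypothetical 2x2 Pooled Feature Maps stacked together: shape (2, 2, 2)
pooled_maps = np.array([
    [[6, 8],
     [3, 4]],
    [[1, 0],
     [2, 5]],
])

# Read the values row by row, map by map, into one long array
flat = pooled_maps.reshape(-1)

print(flat)        # [6 8 3 4 1 0 2 5]
print(flat.shape)  # (8,)
```

Each of the 8 values then becomes one input to the fully connected part of the network.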