Machine Learning & Deep Learning Fundamentals with deeplizard

Learnable Parameters in a Convolutional Neural Network (CNN) explained

April 28, 2018

What’s going on everyone? Last time, we learned about learnable parameters in a fully connected network of dense layers. Now, we’re going to talk about these parameters in the scenario where our network is a convolutional neural network, or CNN.

We’ll first start out by discussing what the learnable parameters within a convolutional neural network are, and then see how the total number of learnable parameters within a CNN is calculated. And after we see how this is done, we’ll illustrate the calculation using a simple convolutional neural network.

[Image: a simple neural network with 3 layers]

What are the learnable parameters in a CNN?

Alright, what are the learnable parameters in a CNN? Well, it turns out that, generally, they’re the same parameters we saw in a standard fully connected network. That is, the weights and biases. But we have to consider how, architecturally, the two types of networks are different, and how that’s going to affect our calculation. Let’s explore that now.

How the number of learnable parameters is calculated

So, just as with a standard network, with a CNN, we’ll calculate the number of parameters per layer, and then we’ll sum up the parameters in each layer to get the total number of learnable parameters in the entire network.

// pseudocode: sum the learnable parameters across all layers
let sum = 0;
network.layers.forEach(function(layer) {
    sum += layer.getLearnableParameters().length;
});

For a dense layer, this is what we determined would tell us the number of learnable parameters:

inputs * outputs + biases
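
For reference, here’s that dense-layer rule as a quick Python sketch (dense_params is just an illustrative name of my own, and we’ll reuse it below):

def dense_params(inputs, nodes):
    # one weight per input-to-node connection, plus one bias per node
    return inputs * nodes + nodes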

Now, let’s consider what a convolutional layer has that a dense layer doesn’t.

A convolutional layer has filters, also known as kernels. As the architects of our network, we determine how many filters are in a convolutional layer, as well as how large these filters are, and we need to consider these things in our calculation.

With this in mind, we’ll modify our formula for determining the number of learnable parameters in a convolutional layer.

So, what is the input going to be for a given convolutional layer? Well, that’s going to depend on what type of layer the previous layer was.

  • If the previous layer was a dense layer, the input to the conv layer is just the number of nodes in the previous dense layer.
  • If the previous layer was a convolutional layer, the input will be the number of filters from that previous convolutional layer.

Now, what’s the output of a convolutional layer?

  • With a dense layer, it was just the number of nodes.
  • With a convolutional layer, the output will be the number of filters times the size of the filters.

We’ll see this illustrated in just a sec. Finally, the number of biases? Well, that’ll just be equal to the number of filters in the layer.

So overall, we have the same general setup for the number of learnable parameters in the layer being calculated as the number of inputs times the number of outputs plus the number of biases.

inputs * outputs + biases

It’s just that with a convolutional layer, the inputs and outputs themselves take the number of filters and the size of the filters into account. Let’s check ourselves by seeing this calculation in action with a simple CNN.
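
Before we do, here’s the convolutional version of the rule as a minimal Python sketch, following the framing above (the conv_params helper and its parameter names are my own, just for illustration):

def conv_params(inputs, num_filters, filter_size):
    # outputs: the number of filters times the size of each (square) filter
    outputs = num_filters * filter_size * filter_size
    # one bias per filter
    biases = num_filters
    return inputs * outputs + biases

Note that inputs * (num_filters * filter_size * filter_size) + num_filters is algebraically the same count as the more common formulation (filter_size * filter_size * inputs + 1) * num_filters, so the two framings always agree.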

Calculating the number of learnable parameters in a CNN

Suppose we have a CNN made up of an input layer, two hidden convolutional layers, and a dense output layer.

  • input layer
  • hidden convolutional layer
  • hidden convolutional layer
  • dense output layer

Our input layer is made up of input data from images of size 20x20x3, where 20x20 specifies the width and height of the images, and 3 specifies the number of channels. The three channels indicate that our images are in RGB color scale, and these three channels will represent the input features in this layer.

Our first convolutional layer is made up of 2 filters of size 3x3. Our second convolutional layer is made up of 3 filters of size 3x3. And our output layer is a dense layer with 2 nodes.

We’ll assume that the network contains bias terms and that we’re using zero padding throughout the network to maintain the dimensions of the images. Check the zero padding video if you’re unfamiliar with this concept.

  • input layer - images of size 20x20x3
  • hidden convolutional layer - 2 filters of size 3x3
  • hidden convolutional layer - 3 filters of size 3x3
  • dense output layer - 2 nodes

Input layer

Now, the same rule applies here for the input layer that we talked about last time. The input layer has no learnable parameters since it just contains the input data.

Conv layer 1

Moving on to the first hidden convolutional layer, how many inputs do we have coming into this layer? We have 3 from our input layer. How many outputs? Well, let’s see. Remember, the number of outputs is the number of filters times the filter size. So we have two filters, each of size 3x3. So 2*3*3 = 18. Multiplying our three inputs by our 18 outputs, we have 54 weights. Now how many biases? Just two, since the number of biases is equal to the number of filters. So that gives us 56 total learnable parameters in this layer.
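
Checking that arithmetic with the conv_params sketch from earlier:

# 3 input channels coming in, 2 filters of size 3x3
conv1 = conv_params(inputs=3, num_filters=2, filter_size=3)
print(conv1)  # 3 * 18 + 2 = 56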

Conv layer 2

Now let’s move to our next convolutional layer. How many inputs are coming in to this layer? We have two from the number of filters in the previous layer. How many outputs? Well, we have three filters, again of size 3x3. So that’s 3*3*3 = 27 outputs. Multiplying our two inputs by the 27 outputs, we have 54 weights in this layer. Adding three bias terms from the three filters, we have 57 learnable parameters in this layer.
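
And the same check for this layer:

# 2 inputs (the filters from the previous conv layer), 3 filters of size 3x3
conv2 = conv_params(inputs=2, num_filters=3, filter_size=3)
print(conv2)  # 2 * 27 + 3 = 57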

Output layer

Onto the output layer. How many inputs? We may think just three, right, since that’s the number of filters in the last convolutional layer? But that’s not quite right. If you’ve followed the Keras series, you know that before passing output from a convolutional layer to a dense layer, we have to flatten the output. The number of flattened inputs is the dimensions of the data from the conv layer multiplied by the number of filters in that layer. In our case, the data is image data.

Since we’re assuming that this network uses zero padding, the dimensions of our images of size 20x20 haven’t changed by the time we get to this layer. So multiplying 20x20 = 400 by the three filters gives us a total of 1200 inputs coming in to our output layer.

Now, since this output layer is a dense layer, the number of outputs is just equal to the number of nodes in this layer, so we have two outputs. Multiplying 1200*2 gives us 2400 weights. Adding in our two biases from this layer, we have 2402 learnable parameters in this layer.
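
In code, reusing the dense_params sketch from earlier:

# zero padding keeps the images at 20x20, and the last conv layer has 3 filters
flattened = 20 * 20 * 3  # 1200 inputs into the dense output layer
output = dense_params(inputs=flattened, nodes=2)
print(output)  # 1200 * 2 + 2 = 2402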

The result

Summing up the parameters from all the layers gives us a total of 2515 learnable parameters within the entire network.
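
Putting it all together:

# the input layer contributes 0 learnable parameters
total = conv1 + conv2 + output
print(total)  # 56 + 57 + 2402 = 2515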

So we can see that the process for determining the number of learnable parameters in a convolutional network is generally the same as a standard fully connected network, but we have to do a little extra work by considering some extras, like the number of channels being used in image data, the number of filters, the filter sizes, and flattening convolutional output.
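
As a quick sanity check, here’s a minimal sketch of this exact network in Keras (assuming the TensorFlow Keras API; the article’s own implementation comes later in the Keras series). model.summary() reports the same counts we calculated by hand:

from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential([
    keras.Input(shape=(20, 20, 3)),                   # 20x20 RGB images
    layers.Conv2D(2, kernel_size=3, padding='same'),  # (3*3*3 + 1) * 2 = 56 params
    layers.Conv2D(3, kernel_size=3, padding='same'),  # (3*3*2 + 1) * 3 = 57 params
    layers.Flatten(),                                 # 20 * 20 * 3 = 1200 inputs
    layers.Dense(2),                                  # 1200 * 2 + 2 = 2402 params
])
model.summary()  # Total params: 2515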

Next up in the Keras series

We’ll be implementing this in code using Keras in the Keras series, so be sure to check that out as well, and in the meantime, let me know your thoughts. See ya soon.
