Machine Learning & Deep Learning Fundamentals

with deeplizard.

Learnable Parameters in a Convolutional Neural Network (CNN) explained

April 28, 2018


Learnable parameters in a CNN

What’s going on everyone? Last time, we learned about learnable parameters in a fully connected network of dense layers. Now, we’re going to talk about these parameters in the scenario when our network is a convolutional neural network, or CNN.

We’ll first start out by discussing what the learnable parameters within a convolutional neural network are, and then see how the total number of learnable parameters within a CNN is calculated. And after we see how this is done, we’ll illustrate the calculation using a simple convolutional neural network.

[Image: a simple neural network with 3 layers]

What are the learnable parameters in a CNN?

Alright, what are the learnable parameters in a CNN? Well, it turns out, that generally, they’re the same parameters we saw in a standard fully connected network. That is, the weights and biases. But, we have to consider how, architecturally, the two types of networks are different, and how that’s going to affect our calculation. Let’s explore that now.

How the number of learnable parameters is calculated

So, just as with a standard network, with a CNN we'll calculate the number of parameters per layer, and then we'll sum up the parameters in each layer to get the total number of learnable parameters in the entire network.

// pseudocode: sum the learnable parameters across all layers
let sum = 0;
network.layers.forEach(function(layer) {
    sum += layer.getLearnableParameters().length;
});

For a dense layer, this is what we determined would tell us the number of learnable parameters:

inputs * outputs + biases
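
As a quick sketch in the same JavaScript-style pseudocode used above (the function name is ours, just for illustration), this is all the dense formula amounts to:

```javascript
// Sketch: learnable parameters in a dense layer.
// Each input connects to each output node (weights),
// and each output node gets one bias.
function denseLayerParams(inputs, outputs) {
  const weights = inputs * outputs;
  const biases = outputs;
  return weights + biases;
}

console.log(denseLayerParams(3, 5)); // 3*5 weights + 5 biases = 20
```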

Now, let’s consider what a convolutional layer has that a dense layer doesn’t.

A convolutional layer has filters, also known as kernels. As the architects of our network, we determine how many filters are in a convolutional layer as well as how large these filters are, and we need to consider both of these things in our calculation.

With this in mind, we’ll modify our formula for determining the number of learnable parameters in a convolutional layer.

So, what is the input going to be for a given convolutional layer? Well, that's going to depend on what type of layer the previous layer was.

  • If the previous layer was a dense layer, the input to the conv layer is just the number of nodes in the previous dense layer.
  • If the previous layer was a convolutional layer, the input will be the number of filters from that previous convolutional layer.
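
These two rules can be sketched as a small helper (the layer objects here are hypothetical, just to illustrate the idea):

```javascript
// Sketch: the inputs to a conv layer depend on the previous layer's type.
function convLayerInputs(previousLayer) {
  if (previousLayer.type === 'dense') {
    return previousLayer.nodes;   // rule 1: nodes in the previous dense layer
  }
  return previousLayer.filters;   // rule 2: filters in the previous conv layer
}

console.log(convLayerInputs({ type: 'conv', filters: 2 })); // 2
```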

Now, what’s the output of a convolutional layer?

  • With a dense layer, it was just the number of nodes.
  • With a convolutional layer, the output will be the number of filters times the size of the filters.

We’ll see this illustrated in just a sec. Finally, the number of biases, well that’ll just be equal to the number of filters in the layer.

So overall, we have the same general setup for the number of learnable parameters in the layer being calculated as the number of inputs times the number of outputs plus the number of biases.

inputs * outputs + biases

It's just that with a convolutional layer, the inputs and outputs themselves take the number of filters and the size of the filters into account. Let's check ourselves by seeing this calculation in action with a simple CNN.
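
Putting the pieces together, here's a minimal sketch of the conv layer formula (the function name is ours; filter size is passed as height and width):

```javascript
// Sketch: learnable parameters in a conv layer, per the formula above.
// outputs = filters * filter size, and there is one bias per filter.
function convLayerParams(inputs, filters, filterHeight, filterWidth) {
  const outputs = filters * filterHeight * filterWidth;
  const biases = filters;
  return inputs * outputs + biases;
}

console.log(convLayerParams(3, 2, 3, 3)); // 3*18 + 2 = 56
```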

Calculating the number of learnable parameters in a CNN

Suppose we have a CNN made up of an input layer, two hidden convolutional layers, and a dense output layer.

  • input layer
  • hidden convolutional layer
  • hidden convolutional layer
  • dense output layer

Our input layer is made up of input data from images of size 20x20x3, where 20x20 specifies the width and height of the images, and 3 specifies the number of channels. The three channels indicate that our images are in RGB color scale, and these three channels will represent the input features in this layer.

Our first convolutional layer is made up of 2 filters of size 3x3. Our second convolutional layer is made up of 3 filters of size 3x3. And our output layer is a dense layer with 2 nodes.

We’ll assume that the network contains bias terms and that we’re using zero padding throughout the network to maintain the dimensions of the images. Check the zero padding video if you’re unfamiliar with this concept.

  • input layer - images of size 20x20x3
  • hidden convolutional layer - 2 filters of size 3x3
  • hidden convolutional layer - 3 filters of size 3x3
  • dense output layer - 2 nodes

Input layer

Now, the same rule applies here for the input layer that we talked about last time. The input layer has no learnable parameters since it just contains the input data.

Conv layer 1

Moving on to the first hidden convolutional layer, how many inputs do we have coming into this layer? We have 3 from our input layer. How many outputs? Well, let’s see. Remember, the number of outputs is the number of filters times the filter size. So we have two filters, each of size 3x3. So 2*3*3 = 18. Multiplying our three inputs by our 18 outputs, we have 54 weights. Now how many biases? Just two, since the number of biases is equal to the number of filters. So that gives us 56 total learnable parameters in this layer.
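
Spelled out as arithmetic, the conv layer 1 count looks like this:

```javascript
// Conv layer 1: 3 inputs (RGB channels), 2 filters of size 3x3.
const inputs = 3;
const outputs = 2 * 3 * 3;        // 2 filters * 3 * 3 = 18
const weights = inputs * outputs; // 3 * 18 = 54
const biases = 2;                 // one bias per filter
console.log(weights + biases);    // 56
```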

Conv layer 2

Now let's move to our next convolutional layer. How many inputs are coming into this layer? We have two, from the number of filters in the previous layer. How many outputs? Well, we have three filters, again of size 3x3. So that's 3*3*3 = 27 outputs. Multiplying our two inputs by the 27 outputs, we have 54 weights in this layer. Adding three bias terms from the three filters, we have 57 learnable parameters in this layer.

Output layer

Onto the output layer. How many inputs? We may think just three, right, since that's the number of filters in the last convolutional layer? But that's not quite right. If you've followed the Keras series, you know that before passing output from a convolutional layer to a dense layer, we have to flatten the output by multiplying the dimensions of the data from the conv layer by the number of filters in that layer. In our case, the data is image data.

Since we're assuming that this network uses zero padding, the dimensions of our images of size 20x20 haven't changed by the time we get to this layer. So multiplying 20x20 by the three filters gives us a total of 1200 inputs coming into our output layer.

Now, since this output layer is a dense layer, the number of outputs is just equal to the number of nodes in this layer, so we have two outputs. Multiplying 1200*2 gives us 2400 weights. Adding in our two biases from this layer, we have 2402 learnable parameters in this layer.
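
And the output layer arithmetic:

```javascript
// Output layer: flattened conv output feeding 2 dense nodes.
const flattenedInputs = 20 * 20 * 3; // zero padding keeps 20x20; 3 filters -> 1200
const weights = flattenedInputs * 2; // 1200 * 2 nodes = 2400
const biases = 2;                    // one bias per node
console.log(weights + biases);       // 2402
```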

The result

Summing up the parameters from all the layers gives us a total of 2515 learnable parameters within the entire network.
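
As a final check, here's the whole network's count in one small script, using the simplified formula from this post (the function name is ours):

```javascript
// Total learnable parameters for the example CNN:
// input -> conv (2 filters, 3x3) -> conv (3 filters, 3x3) -> dense (2 nodes).
function convParams(inputs, filters, filterH, filterW) {
  return inputs * filters * filterH * filterW + filters;
}

const conv1 = convParams(3, 2, 3, 3); // 56
const conv2 = convParams(2, 3, 3, 3); // 57
const dense = 20 * 20 * 3 * 2 + 2;    // 1200 flattened inputs * 2 nodes + 2 biases = 2402
const total = conv1 + conv2 + dense;

console.log(total); // 2515
```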

So we can see that the process for determining the number of learnable parameters in a convolutional network is generally the same as a standard fully connected network, but we have to do a little extra work by considering some extras, like the number of channels being used in image data, the number of filters, the filter sizes, and flattening convolutional output.

Next up in the Keras series

We'll be implementing this in code using Keras in the Keras series, so be sure to check that out as well, and in the meantime, let me know your thoughts. See ya soon.
