### Learnable parameters in a CNN

What’s going on everyone? Last time, we learned about learnable parameters in a fully connected network of dense layers. Now, we’re going to talk about these parameters in the scenario when our network is a convolutional neural network, or CNN.

We’ll first start out by discussing what the learnable parameters within a convolutional neural network are, and then see how the total number of learnable parameters within a CNN is calculated. And after we see how this is done, we’ll illustrate the calculation using a simple convolutional neural network.

### What are the learnable parameters in a CNN?

Alright, what are the learnable parameters in a CNN? Well, it turns out, that generally, they’re the same parameters we saw in a standard fully connected network. That is, the weights and biases. But, we have to consider how, architecturally, the two types of networks are different, and how that’s going to affect our calculation. Let’s explore that now.

### How the number of learnable parameters is calculated

So, just as with a standard network, with a CNN, we’ll calculate the number of parameters per layer, and then we’ll sum up the parameters in each layer to get the total number of learnable parameters in the entire network.

```js
// pseudocode: sum the learnable parameters across all layers
let sum = 0;
network.layers.forEach(function (layer) {
  sum += layer.getLearnableParameters().length;
});
```

For a dense layer, we determined that the number of learnable parameters is the number of inputs times the number of outputs, plus the number of biases.
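As a quick sketch, that dense-layer count can be written as a tiny Python helper (the function name here is just for illustration):

```python
# Learnable parameters in a dense layer:
# (number of inputs * number of nodes) + number of biases,
# where there is one bias per node.
def dense_layer_params(n_inputs, n_nodes):
    weights = n_inputs * n_nodes
    biases = n_nodes
    return weights + biases

# e.g. a dense layer with 4 inputs and 3 nodes
print(dense_layer_params(4, 3))  # 15
```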

Now, let’s consider what a convolutional layer has that a dense layer doesn’t.

A convolutional layer has filters, also known as kernels. As the architects of our network, we determine how many filters are in a convolutional layer as well as how large these filters are, and we need to consider these things in our calculation.

With this in mind, we’ll modify our formula for determining the number of learnable parameters in a convolutional layer.

So, what is the input going to be for a given convolutional layer? Well, that’s going to depend on what type of layer the previous layer was.

- If the previous layer was a dense layer, the input to the conv layer is just the number of nodes in the previous dense layer.
- If the previous layer was a convolutional layer, the input will be the number of filters from that previous convolutional layer.

Now, what’s the output of a convolutional layer?

- With a dense layer, it was just the number of nodes.
- With a convolutional layer, the output will be the number of filters times the size of the filters.

We’ll see this illustrated in just a sec. Finally, the number of biases, well that’ll just be equal to the number of filters in the layer.

So overall, we have the same general setup for the number of learnable parameters in the layer, calculated as the number of inputs times the number of outputs, plus the number of biases.

With a convolutional layer, though, the inputs and outputs themselves account for the number of filters and the size of the filters. Let’s check ourselves by seeing this calculation in action with a simple CNN.
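Under the convention described above, where a conv layer’s output count is its number of filters times its filter size, a hypothetical helper might look like this:

```python
# Learnable parameters in a convolutional layer, following the
# convention in this post: inputs * (filters * filter size) + biases,
# with one bias per filter.
def conv_layer_params(n_inputs, n_filters, filter_h, filter_w):
    outputs = n_filters * filter_h * filter_w
    weights = n_inputs * outputs
    biases = n_filters
    return weights + biases

# e.g. 3 inputs into a layer of two 3x3 filters
print(conv_layer_params(3, 2, 3, 3))  # 56
```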

### Calculating the number of learnable parameters in a CNN

Suppose we have a CNN made up of an input layer, two hidden convolutional layers, and a dense output layer.

- input layer
- hidden convolutional layer
- hidden convolutional layer
- dense output layer

Our input layer is made up of input data from images of size `20x20x3`, where `20x20` specifies the width and height of the images, and `3` specifies the number of channels. The three channels indicate that our images are in RGB color scale, and these three channels will represent the input features in this layer.

Our first convolutional layer is made up of `2` filters of size `3x3`. Our second convolutional layer is made up of `3` filters of size `3x3`. And our output layer is a dense layer with `2` nodes.

We’ll assume that the network contains bias terms and that we’re using zero padding throughout the network to maintain the dimensions of the images. Check the zero padding video if you’re unfamiliar with this concept.

- input layer - images of size `20x20x3`
- hidden convolutional layer - `2` filters of size `3x3`
- hidden convolutional layer - `3` filters of size `3x3`
- dense output layer - `2` nodes

#### Input layer

Now, the same rule applies here for the input layer that we talked about last time. The input layer has no learnable parameters since it just contains the input data.

#### Conv layer 1

Moving on to the first hidden convolutional layer, how many inputs do we have coming into this layer? We have `3` from our input layer. How many outputs? Well, let’s see. Remember, the number of outputs is the number of filters times the filter size. We have two filters, each of size `3x3`, so `2*3*3 = 18`. Multiplying our three inputs by our `18` outputs, we have `54` weights. Now, how many biases? Just two, since the number of biases is equal to the number of filters. That gives us `56` total learnable parameters in this layer.

#### Conv layer 2

Now let’s move to our next convolutional layer. How many inputs are coming in to this layer? We have two, from the number of filters in the previous layer. How many outputs? Well, we have three filters, again of size `3x3`, so that’s `3*3*3 = 27` outputs. Multiplying our two inputs by the `27` outputs, we have `54` weights in this layer. Adding three bias terms from the three filters, we have `57` learnable parameters in this layer.

#### Output layer

Onto the output layer. How many inputs? We may think just three, since that’s the number of filters in the last convolutional layer, but that’s not quite right. If you’ve followed the Keras series, you know that before passing output from a convolutional layer to a dense layer, we have to flatten the output by multiplying the dimensions of the data from the conv layer by the number of filters in that layer. In our case, the data is image data.

Since we’re assuming that this network uses zero padding, the dimensions of our images of size `20x20` haven’t changed by the time we get to this layer. So multiplying `20x20` by the three filters gives us a total of `1200` inputs coming into our output layer.
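That flattening step amounts to one multiplication, shown here with the shape values from this example:

```python
# Flattened input to the dense layer: image height * width * number of
# filters from the last conv layer (dimensions preserved by zero padding)
height, width, n_filters = 20, 20, 3
flattened_inputs = height * width * n_filters
print(flattened_inputs)  # 1200
```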

Now, since this output layer is a dense layer, the number of outputs is just equal to the number of nodes in this layer, so we have two outputs. Multiplying `1200*2` gives us `2400` weights. Adding in our two biases from this layer, we have `2402` learnable parameters in this layer.

#### The result

Summing up the parameters from all the layers gives us a total of `2515` learnable parameters within the entire network.
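Putting the whole walkthrough together, here’s a minimal Python sketch (the helper names are just for illustration) that reproduces the per-layer counts and the total:

```python
# Per-layer learnable parameter counts for the example network,
# using the conventions from this post.

def conv_params(n_inputs, n_filters, filter_size):
    # inputs * (filters * filter height * filter width) + one bias per filter
    return n_inputs * (n_filters * filter_size * filter_size) + n_filters

def dense_params(n_inputs, n_nodes):
    # inputs * nodes + one bias per node
    return n_inputs * n_nodes + n_nodes

conv1 = conv_params(n_inputs=3, n_filters=2, filter_size=3)  # 56
conv2 = conv_params(n_inputs=2, n_filters=3, filter_size=3)  # 57
dense = dense_params(n_inputs=20 * 20 * 3, n_nodes=2)        # 2402

total = conv1 + conv2 + dense
print(total)  # 2515
```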

So we can see that the process for determining the number of learnable parameters in a convolutional network is generally the same as for a standard fully connected network, but we have to do a little extra work by considering things like the number of channels in our image data, the number of filters, the filter sizes, and the flattening of convolutional output.

### Next up in the Keras series

We’ll be implementing this in code using Keras in the Keras series, so be sure to check that out as well, and in the meantime, let me know your thoughts. See ya soon.