# Neural Network Programming - Deep Learning with PyTorch

Deep Learning Course 3 of 5 - Level: Intermediate

## CNN Training with Code Example - Neural Network Programming Course

### CNN Training Process

Welcome to this neural network programming series with PyTorch. In this episode, we will learn the steps needed to train a convolutional neural network. So far in this series, we learned about Tensors, and we've learned all about PyTorch neural networks. We are now ready to begin the training process.

• Prepare the data
• Build the model
• Train the model
• Calculate the loss, the gradient, and update the weights
• Analyze the model's results

### Training: What we do after the forward pass

During training, we do a forward pass, but then what? We'll suppose we get a batch and pass it forward through the network. Once the output is obtained, we compare the predicted output to the actual labels, and once we know how close the predicted values are from the actual labels, we tweak the weights inside the network in such a way that the values the network predicts move closer to the true values (labels).

All of this is for a single batch, and we repeat this process for every batch until we have covered every sample in our training set. After we've completed this process for all of the batches and passed over every sample in our training set, we say that an epoch is complete. We use the word epoch to represent a time period in which our entire training set has been covered.

During the entire training process, we do as many epochs as necessary to reach our desired level of accuracy. With this, we have the following steps:

1. Get batch from the training set.
2. Pass batch to network.
3. Calculate the loss (difference between the predicted values and the true values).
4. Calculate the gradient of the loss function w.r.t the network's weights.
5. Update the weights using the gradients to reduce the loss.
6. Repeat steps 1-5 until one epoch is completed.
7. Repeat steps 1-6 for as many epochs required to reach the minimum loss.

We already know exactly how to do steps `1` and `2`. If you've already covered the deep learning fundamentals series, then you know that we use a loss function to perform step `3`, and you know that we use backpropagation and an optimization algorithm to perform step `4` and `5`. Steps `6` and `7` are just standard Python loops (the training loop). Let's see how this is done in code.
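The seven steps above can be sketched as a training loop. This is only a preview sketch: the `Linear` stand-in network and the fake batches below replace the `Network` class and `train_loader` built earlier in the series, so the loop structure is runnable on its own.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim

# Stand-ins so the sketch is self-contained; in the series these
# come from the Network class and the Fashion-MNIST train_loader.
network = nn.Linear(784, 10)
train_loader = [
    (torch.randn(100, 784), torch.randint(0, 10, (100,)))
    for _ in range(3)
]  # three fake batches of 100 samples

optimizer = optim.Adam(network.parameters(), lr=0.01)

for epoch in range(2):                         # step 7: repeat for several epochs
    for images, labels in train_loader:        # steps 1 & 6: get each batch
        preds = network(images)                # step 2: pass batch to network
        loss = F.cross_entropy(preds, labels)  # step 3: calculate the loss
        optimizer.zero_grad()                  # clear gradients from the previous batch
        loss.backward()                        # step 4: calculate the gradients
        optimizer.step()                       # step 5: update the weights
```

Note the `optimizer.zero_grad()` call: PyTorch accumulates gradients across `backward()` calls, so they must be cleared before each batch.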

### The Training Process

Since we disabled PyTorch's gradient tracking feature in a previous episode, we need to be sure to turn it back on (it is on by default).

```
> torch.set_grad_enabled(True)
<torch.autograd.grad_mode.set_grad_enabled object at 0x15b22d012b0>
```

#### Preparing for the Forward Pass

We already know how to get a batch and pass it forward through the network. Let's see what we do after the forward pass is complete.

We'll begin by:

1. Creating an instance of our `Network` class.
2. Creating a data loader that provides batches of size `100` from our training set.
3. Unpacking the images and labels from one of these batches.
```
> network = Network()

> batch = next(iter(train_loader)) # Getting a batch
> images, labels = batch
```

Next, we are ready to pass our batch of images forward through the network and obtain the output predictions. Once we have the prediction tensor, we can use the predictions and the true labels to calculate the loss.

#### Calculating the loss

To do this, we will use the `cross_entropy()` loss function that is available in PyTorch's `nn.functional` API. Once we have the loss, we can print it, and also check the number of correct predictions using the function we created in a previous post.

```
> preds = network(images)
> loss = F.cross_entropy(preds, labels) # Calculating the loss

> loss.item()
2.307542085647583

> get_num_correct(preds, labels)
9
```
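The `get_num_correct()` helper comes from an earlier post and isn't shown here; a minimal sketch of how such a function can be defined:

```python
import torch

def get_num_correct(preds, labels):
    # Count predictions whose highest-scoring class matches the label
    return preds.argmax(dim=1).eq(labels).sum().item()
```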

The `cross_entropy()` function returned a scalar-valued tensor, so we used the `item()` method to print the loss as a Python number. We got `9` out of `100` correct, and since we have `10` prediction classes, this is what we'd expect by guessing at random.
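We can sanity-check the initial loss value the same way. With `10` classes and an untrained network that is effectively guessing uniformly, each correct class gets probability `1/10`, so the expected cross entropy loss is `-ln(1/10)`:

```python
import math

# Expected cross entropy loss for uniform guessing over 10 classes
expected_loss = -math.log(1 / 10)
print(expected_loss)  # ≈ 2.3026, close to the 2.3075 we observed above
```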

Calculating the gradients is very easy using PyTorch. Since our network is a PyTorch `nn.Module`, PyTorch has created a computation graph under the hood. As our tensor flowed forward through our network, all of the computations were added to the graph. The computation graph is then used by PyTorch to calculate the gradients of the loss function with respect to the network's weights.

Before we calculate the gradients, let's verify that we currently have no gradients inside our `conv1` layer. The gradients are tensors that are accessible in the `grad` (short for gradient) attribute of the weight tensor of each layer.

```
> network.conv1.weight.grad
None
```

To calculate the gradients, we call the `backward()` method on the loss tensor, like so:

```
> loss.backward() # Calculating the gradients
```

Now, the gradients of the loss function have been stored in the `grad` attribute of each weight tensor.

```
> network.conv1.weight.grad.shape
torch.Size([6, 1, 5, 5])
```
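Note that each gradient tensor has exactly the same shape as the weight tensor it belongs to, since there is one partial derivative per weight. We can verify this with a standalone `conv` layer matching `conv1` (six `5x5` filters over one input channel, an assumption based on the shape shown above):

```python
import torch
import torch.nn as nn

# A conv layer with the same shape as conv1: 1 input channel, 6 filters of size 5x5
conv = nn.Conv2d(in_channels=1, out_channels=6, kernel_size=5)

# Forward pass on a fake 28x28 grayscale image, then backward on a scalar
out = conv(torch.randn(1, 1, 28, 28)).sum()
out.backward()

print(conv.weight.grad.shape)  # torch.Size([6, 1, 5, 5]), same as conv.weight.shape
```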

These gradients are used by the optimizer to update the respective weights. To create our optimizer, we use the `torch.optim` package that has many optimization algorithm implementations that we can use. We'll use `Adam` for our example.

#### Updating the Weights

To the `Adam` class constructor, we pass the network parameters (this is how the optimizer is able to access the gradients), and we pass the learning rate.

Finally, all we have to do to update the weights is to tell the optimizer to use the gradients to step in the direction of the loss function's minimum.

```
optimizer = optim.Adam(network.parameters(), lr=0.01)
optimizer.step() # Updating the weights
```

When the `step()` function is called, the optimizer updates the weights using the gradients that are stored in the network's parameters. This means that we should expect our loss to be reduced if we pass the same batch through the network again. Checking this, we can see that this is indeed the case:

```
> preds = network(images)
> loss = F.cross_entropy(preds, labels)

> loss.item()
2.262690782546997

> get_num_correct(preds, labels)
15
```

### Train Using a Single Batch

We can summarize the code for training with a single batch in the following way:

```
network = Network()
optimizer = optim.Adam(network.parameters(), lr=0.01)

batch = next(iter(train_loader)) # Get Batch
images, labels = batch

preds = network(images) # Pass Batch
loss = F.cross_entropy(preds, labels) # Calculate Loss

loss.backward() # Calculate Gradients
optimizer.step() # Update Weights

print('loss1:', loss.item())
preds = network(images)
loss = F.cross_entropy(preds, labels)
print('loss2:', loss.item())
```

#### Output:

```
loss1: 2.3034827709198
loss2: 2.2825052738189697
```

### Building the Training Loop is Next

We should now have a good understanding of the training process. In the next episode, we'll see how these ideas are extended by completing the process by constructing the training loop. See you in the next one!
