Neural Network Programming - Deep Learning with PyTorch

with deeplizard.

CNN Training Loop Explained - Neural Network Code Project

June 30, 2019

CNN Training Loop - Teach a Neural Network

Welcome to this neural network programming series. In this episode, we will learn how to build the training loop for a convolutional neural network using Python and PyTorch.

Without further ado, let's get started.

In the last episode, we learned that training is an iterative process, and that to train a neural network we build what is called the training loop.

  • Prepare the data
  • Build the model
  • Train the model
    • Build the training loop
  • Analyze the model's results

Training with a single batch

We can summarize the code for training with a single batch in the following way:

import torch
import torch.optim as optim
import torch.nn.functional as F

# Network and train_set are defined in earlier episodes of this series
network = Network()

train_loader = torch.utils.data.DataLoader(train_set, batch_size=100)
optimizer = optim.Adam(network.parameters(), lr=0.01)

batch = next(iter(train_loader)) # Get Batch
images, labels = batch

preds = network(images) # Pass Batch
loss = F.cross_entropy(preds, labels) # Calculate Loss

loss.backward() # Calculate Gradients
optimizer.step() # Update Weights

print('loss1:', loss.item())
preds = network(images)
loss = F.cross_entropy(preds, labels)
print('loss2:', loss.item())

Output:

loss1: 2.3034827709198
loss2: 2.2825052738189697

Notice that loss2 is lower than loss1, which tells us the single weight update improved the network's predictions on this batch. You'll also notice that we get different results each time we run this code. This is because the model is recreated at the top each time, and we know from previous posts that the model's weights are randomly initialized.
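If you want repeatable numbers while experimenting, one option (not used in this episode) is to fix PyTorch's random seed before creating the network, so the weight initialization is the same on every run:

torch.manual_seed(50) # any fixed seed makes the random weight initialization repeatable
network = Network()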

Let's see now how we can modify this code to train using all of the batches and thus, the entire training set.

Training with all batches (single epoch)

Now, to train with all of the batches available inside our data loader, we need to make a few changes and add one additional line of code:

network = Network()

train_loader = torch.utils.data.DataLoader(train_set, batch_size=100)
optimizer = optim.Adam(network.parameters(), lr=0.01)

total_loss = 0
total_correct = 0

for batch in train_loader: # Get Batch
    images, labels = batch 

    preds = network(images) # Pass Batch
    loss = F.cross_entropy(preds, labels) # Calculate Loss

    optimizer.zero_grad()
    loss.backward() # Calculate Gradients
    optimizer.step() # Update Weights

    total_loss += loss.item()
    total_correct += get_num_correct(preds, labels)
    
print(
    "epoch:", 0, 
    "total_correct:", total_correct, 
    "loss:", total_loss
)

Instead of getting a single batch from our data loader, we'll create a for loop that will iterate over all of the batches.

Since we have 60,000 samples in our training set, we will have 60,000 / 100 = 600 iterations. For this reason, we'll remove the print statement from within the loop, and instead keep track of the total loss and the total number of correct predictions, printing them at the end.
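The loop also uses the get_num_correct() helper to count correct predictions. As a reminder, a minimal version of this helper (defined in an earlier episode of the series) looks like this:

def get_num_correct(preds, labels):
    # Count how many of the batch's predicted classes match the labels
    return preds.argmax(dim=1).eq(labels).sum().item()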

Something to notice about these 600 iterations is that our weights will be updated 600 times by the end of the loop. If we raise the batch_size, this number goes down, and if we lower the batch_size, it goes up.
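We can check this directly, since the length of a DataLoader is the number of batches (and therefore weight updates) it yields per epoch. An illustrative check, using the same train_set:

for bs in [100, 1000, 10000]:
    loader = torch.utils.data.DataLoader(train_set, batch_size=bs)
    print(bs, len(loader)) # 600, 60, and 6 weight updates per epoch, respectively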

Finally, whenever we call the backward() method on our loss tensor, the new gradients are calculated and added to (not overwritten in) the grad attributes of our network's parameters. For this reason, we need to zero out these gradients before calculating the gradients for the next batch. We can do this with a method called zero_grad() that comes with the optimizer, which is why optimizer.zero_grad() appears in the loop before loss.backward().
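To see why the zeroing matters, here is a small illustrative check (not part of the training loop) showing that gradients accumulate across backward() calls until they are cleared:

# Illustrative sketch: gradients accumulate across backward() calls until cleared
network = Network()
images, labels = next(iter(train_loader))

F.cross_entropy(network(images), labels).backward()
w = next(network.parameters()) # first weight tensor of the network
first_grad = w.grad.clone()

F.cross_entropy(network(images), labels).backward() # no zero_grad() in between
print(torch.allclose(w.grad, first_grad * 2)) # True: the new gradients were added, not replaced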

We are ready to run this code. This time the code will take longer because the loop is working on 600 batches.

epoch: 0 total_correct: 42104 loss: 476.6809593439102

We get the results, and we can see that the total number of correct predictions out of 60,000 was 42,104.

> total_correct / len(train_set)
0.7017333333333333

That's pretty good after only one epoch (a single full pass over the data). Even though we did only one epoch, we still have to keep in mind that the weights were updated 600 times, and this number depends on our batch size. If we made our batch_size larger, say 10,000, the weights would only be updated 6 times, and the results wouldn't be quite as good.

Training with multiple epochs

To do multiple epochs, all we have to do is put this code into a for loop. We'll also add the epoch number to the print statement.

network = Network()

train_loader = torch.utils.data.DataLoader(train_set, batch_size=100)
optimizer = optim.Adam(network.parameters(), lr=0.01)

for epoch in range(10):
    
    total_loss = 0
    total_correct = 0
    
    for batch in train_loader: # Get Batch
        images, labels = batch 

        preds = network(images) # Pass Batch
        loss = F.cross_entropy(preds, labels) # Calculate Loss

        optimizer.zero_grad()
        loss.backward() # Calculate Gradients
        optimizer.step() # Update Weights

        total_loss += loss.item()
        total_correct += get_num_correct(preds, labels)

    print(
        "epoch", epoch, 
        "total_correct:", total_correct, 
        "loss:", total_loss
    )

After running this code, we get the results for each epoch:

epoch 0 total_correct: 43301 loss: 447.59147948026657
epoch 1 total_correct: 49565 loss: 284.43429669737816
epoch 2 total_correct: 51063 loss: 244.08825492858887
epoch 3 total_correct: 51955 loss: 220.5841210782528
epoch 4 total_correct: 52551 loss: 204.73878084123135
epoch 5 total_correct: 52914 loss: 193.1240530461073
epoch 6 total_correct: 53195 loss: 184.50964668393135
epoch 7 total_correct: 53445 loss: 177.78808392584324
epoch 8 total_correct: 53629 loss: 171.81662507355213
epoch 9 total_correct: 53819 loss: 166.2412590533495

We can see that the number of correct predictions goes up and the loss goes down with each epoch. By the tenth epoch, the network is getting 53,819 / 60,000 ≈ 90% of the training samples correct.

Complete Training Loop

Putting all of this together, we can pull the network, optimizer, and the train_loader out of the training loop cell.

network = Network()
optimizer = optim.Adam(network.parameters(), lr=0.01)
train_loader = torch.utils.data.DataLoader(
    train_set
    ,batch_size=100
    ,shuffle=True
)

This makes it so that we can rerun the training loop cell without resetting the network's weights. Note that we've also passed shuffle=True to the DataLoader, so the samples are drawn in a different order each epoch.

for epoch in range(10):

    total_loss = 0
    total_correct = 0

    for batch in train_loader: # Get Batch
        images, labels = batch 

        preds = network(images) # Pass Batch
        loss = F.cross_entropy(preds, labels) # Calculate Loss

        optimizer.zero_grad()
        loss.backward() # Calculate Gradients
        optimizer.step() # Update Weights

        total_loss += loss.item()
        total_correct += get_num_correct(preds, labels)

    print(
        "epoch", epoch, 
        "total_correct:", total_correct, 
        "loss:", total_loss
    )

Visualizing the results is next

We should now have a good understanding of training loops and how we build them using PyTorch. The cool thing about PyTorch is that we can debug the training loop code just as we did with the forward() function.
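For example, you can pause execution inside the loop and inspect the tensors interactively. One illustrative way to do this is with Python's built-in debugger:

import pdb

for batch in train_loader:
    images, labels = batch
    preds = network(images)
    loss = F.cross_entropy(preds, labels)
    pdb.set_trace() # execution pauses here; inspect images.shape, preds, loss, etc.
    break # stop after one batch while debugging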

In the next post, we'll see how we can get the predictions for every sample in the training set and use those predictions to create a confusion matrix. See you in the next one!
