Build PyTorch CNN - Object Oriented Neural Networks
Building neural networks with PyTorch
Welcome back to this series on neural network programming with PyTorch. In this post, we will begin building our first convolutional neural network (CNN) using PyTorch. Without further ado, let's get started.
Bird's eye view of the process
From a high-level perspective or bird's eye view of our deep learning project, we prepared our data, and now, we are ready to build our model.
- Prepare the data
- Build the model
- Train the model
- Analyze the model's results
When we say model, we mean our network. The words model and network mean the same thing. What we ultimately want our network to do is model, or approximate, a function that maps image inputs to the correct output class.
To build neural networks in PyTorch, we extend the torch.nn.Module PyTorch class. This means we need to utilize a little bit of object oriented programming (OOP) in Python.
We'll do a quick OOP review in this post to cover the details needed for working with PyTorch neural networks, but if you find that you need more, the Python docs have an overview tutorial here.
To build a convolutional neural network, we need to have a general understanding of how CNNs work and what components are used to build CNNs. This deep learning fundamentals series is a good prerequisite for this series, so I highly recommend you cover that one if you haven't already. If you just want a crash course on CNNs, these are the specific posts to see:
- Convolutional Neural Networks (CNNs) explained
- Visualizing Convolutional Filters from a CNN
- Zero Padding in Convolutional Neural Networks explained
- Max Pooling in Convolutional Neural Networks explained
- Learnable Parameters in a Convolutional Neural Network (CNN) explained
Let's jump in now with a quick object oriented programming review.
Quick object oriented programming review
When we're writing programs or building software, there are two key components, code and data. With object oriented programming, we orient our program design and structure around objects.
Objects are defined in code using classes. A class defines the object's specification or spec, which specifies what data and code each object of the class should have.
When we create an object of a class, we call the object an instance of the class, and all instances of a given class have two core components:
- Methods (code)
- Attributes (data)
The methods represent the code, while the attributes represent the data, and so the methods and attributes are defined by the class.
In a given program, many objects (that is, instances of a given class) can exist simultaneously, and all of the instances will have the same available attributes and the same available methods. They are uniform from this perspective.
The difference between objects of the same class is the values contained within the object for each attribute. Each object has its own attribute values. These values determine the internal state of the object. The code and data of each object are said to be encapsulated within the object.
Let's build a simple lizard class to demonstrate how classes encapsulate data and code:
class Lizard: #class declaration
    def __init__(self, name): #class constructor (code)
        self.name = name #attribute (data)

    def set_name(self, name): #method declaration (code)
        self.name = name #method implementation (code)
The first line declares the class and specifies the class name, which in this case is Lizard.
The second line defines a special method called the class constructor. Class constructors are called when a new instance of the class is created. As parameters, we have self and name.
The self parameter gives us the ability to create attribute values that are stored or encapsulated within the object. When we call this constructor or any of the other methods, we don't pass the self parameter. Python does this for us automatically.
Argument values for any other parameter are supplied by the caller, and the values that come in to the method can be used in a calculation or saved and accessed later using self.
After we're done with the constructor, we can create any number of specialized methods, like this one here that allows a caller to change the name value that was stored in self. All we have to do here is call the method and pass a new value for the name. Let's see this in action.
> lizard = Lizard('deep')
We create an object instance of the class by specifying the class name and passing the constructor arguments. The constructor will receive these arguments and the constructor code will run saving the passed name.
We can then access the name and print it, and also call the set_name() method to change the name. Multiple Lizard instances can exist inside a program, and each one will contain its own data.
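As a small illustration of the point above, the sketch below creates two Lizard instances and shows that changing one leaves the other untouched. The second instance and its name 'learning' are made up for this example.

```python
class Lizard:
    def __init__(self, name):
        self.name = name          # attribute (data) stored on the instance

    def set_name(self, name):
        self.name = name          # overwrite this instance's stored name


first = Lizard('deep')
second = Lizard('learning')       # hypothetical second instance

print(first.name)                 # deep
first.set_name('lizard')          # self is passed automatically by Python
print(first.name)                 # lizard
print(second.name)                # learning (unchanged)
```

Both objects share the same methods, but each one carries its own value for name.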
From an object oriented standpoint, the important part about this setup is that the attributes and the methods are organized and contained within an object.
Let's switch gears now and look at how object oriented programming fits in with PyTorch.
To build neural networks in PyTorch, we use the torch.nn package, which is PyTorch's neural network (nn) library. We typically import the package like so:
import torch.nn as nn
This allows us to access the neural network package using the nn alias. So from now on, if we say nn, we mean torch.nn. PyTorch's neural network library contains all of the typical components needed to build neural networks.
The primary component we'll need to build a neural network is a layer, and so, as we might expect, PyTorch's neural network library contains classes that aid us in constructing layers.
As we know, deep neural networks are built using multiple layers. This is what makes the network deep. Each layer in a neural network has two primary components:
- A transformation (code)
- A collection of weights (data)
Like many things in life, this fact makes layers great candidates to be represented as objects using OOP. OOP is short for object oriented programming.
In fact, this is the case with PyTorch. Within the nn package, there is a class called Module, and it is the base class for all neural network modules, which include layers.
This means that all of the layers in PyTorch extend the nn.Module class and inherit all of PyTorch's built-in functionality within the nn.Module class. In OOP this concept is known as inheritance.
Even neural networks extend the nn.Module class. This makes sense because neural networks themselves can be thought of as one big layer (if needed, let that sink in over time).
Neural networks and layers in PyTorch extend the nn.Module class. This means that we must extend the nn.Module class when building a new layer or neural network in PyTorch.
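The inheritance relationship described above can be checked directly. This is only a quick sanity check, and the layer sizes passed to nn.Linear are arbitrary values chosen for the example.

```python
import torch.nn as nn

# Built-in layer classes are subclasses of nn.Module.
print(issubclass(nn.Linear, nn.Module))   # True
print(issubclass(nn.Conv2d, nn.Module))   # True

# An instance of a layer is therefore also an nn.Module instance.
layer = nn.Linear(in_features=4, out_features=2)  # arbitrary sizes
print(isinstance(layer, nn.Module))       # True
```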
nn.Modules have a forward() method
When we pass a tensor to our network as input, the tensor flows forward through each layer transformation until the tensor reaches the output layer. This process of a tensor flowing forward through the network is known as a forward pass.
Each layer has its own transformation (code) and the tensor passes forward through each layer. The composition of all the individual layer forward passes defines the overall forward pass transformation for the network.
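To make the idea of composition concrete, here is a minimal sketch with two linear layers. The layer sizes and the random input are assumptions chosen only for illustration; they are not the network built later in this post.

```python
import torch
import torch.nn as nn

# Two stand-in layers; the sizes are arbitrary.
layer1 = nn.Linear(in_features=4, out_features=3)
layer2 = nn.Linear(in_features=3, out_features=2)

t = torch.rand(1, 4)              # a batch with one sample of 4 features

# Pass the tensor forward one layer at a time...
hidden = layer1(t)
output = layer2(hidden)

# ...which is the same as composing the two layer transformations.
composed = layer2(layer1(t))

print(output.shape)                       # torch.Size([1, 2])
print(torch.allclose(output, composed))   # True
```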
The goal of the overall transformation is to transform or map the input to the correct prediction output class. During the training process, the layer weights (data) are updated in such a way that the mapping adjusts to bring the output closer to the correct prediction.
What this all means is that every PyTorch nn.Module has a forward() method, and so when we are building layers and networks, we must provide an implementation of the forward() method. The forward method is the actual transformation.
When we implement the forward() method of our nn.Module subclass, we will typically use functions from the nn.functional package. This package provides us with many neural network operations that we can use for building layers. In fact, many of the nn.Module layer classes use nn.functional functions to perform their operations.
The nn.functional package contains functions that subclasses of nn.Module use for implementing their forward() methods. Later, we'll see an example of this by looking at the PyTorch source code of the nn.Conv2d convolutional layer class.
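One way to see the relationship between a layer class and nn.functional without reading the source is to apply the functional operation by hand, using the layer's own weights. The sketch below assumes a 28 x 28 single-channel input, which is an illustrative choice, and relies on the layer's default stride, padding, and dilation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

conv = nn.Conv2d(in_channels=1, out_channels=6, kernel_size=5)
t = torch.rand(1, 1, 28, 28)      # assumed input: batch of one 28x28 image

# Calling the layer object runs its forward() method.
via_module = conv(t)

# The same operation through the functional API, using the weights
# (data) that the layer object holds.
via_functional = F.conv2d(t, conv.weight, conv.bias)

print(via_module.shape)                            # torch.Size([1, 6, 24, 24])
print(torch.allclose(via_module, via_functional))  # True
```

The layer object supplies the data (weight and bias), while the functional call supplies the transformation.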
Building a neural network in PyTorch
We now have enough information to provide an outline for building neural networks in PyTorch. The steps are as follows:
- Extend the nn.Module base class.
- Define layers as class attributes.
- Implement the forward() method.
More detailed version:
- Create a neural network class that extends the nn.Module base class.
- In the class constructor, define the network's layers as class attributes using pre-built layers from torch.nn.
- Use the network's layer attributes as well as operations from the nn.functional API to define the network's forward pass.
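The steps above can be sketched in a few lines. The class name TinyNetwork, the single linear layer, and its sizes are hypothetical and exist only to show the shape of the pattern; this is not the CNN this series builds.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class TinyNetwork(nn.Module):                 # step 1: extend nn.Module
    def __init__(self):
        super().__init__()
        # step 2: define layers as attributes (arbitrary sizes)
        self.fc = nn.Linear(in_features=4, out_features=2)

    def forward(self, t):                     # step 3: implement forward()
        t = self.fc(t)                        # layer attribute
        t = F.relu(t)                         # nn.functional operation
        return t


network = TinyNetwork()
t = torch.rand(3, 4)                          # batch of 3 samples, 4 features
out = network(t)                              # calling the object runs forward()
print(out.shape)                              # torch.Size([3, 2])
```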
Like we did with the Lizard class example, let's create a simple class to represent a neural network.
class Network:
    def __init__(self):
        self.layer = None
    def forward(self, t):
        t = self.layer(t)
        return t
This gives us a simple network class that has a single dummy layer inside the constructor and a dummy implementation for the forward function.
The implementation for the forward() function takes in a tensor t and transforms it using the dummy layer. After the tensor is transformed, the new tensor is returned.
This is a good start, but the class hasn't yet extended the nn.Module class. To make our Network class extend nn.Module, we must do two additional things:
- Specify the nn.Module class in parentheses on line 1.
- Insert a call to the super class constructor on line 3 inside the constructor.
This gives us:
class Network(nn.Module): # line 1
    def __init__(self):
        super().__init__() # line 3
        self.layer = None

    def forward(self, t):
        t = self.layer(t)
        return t
These changes transform our simple neural network into a PyTorch neural network because we are now extending PyTorch's nn.Module base class.
With this, we are done! Now we have a Network class that has all of the functionality of the PyTorch nn.Module class.
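The call to the super class constructor is more than a formality. As a hedged sketch of why it matters, the two hypothetical classes below (MissingSuper and WithSuper, neither from this post) differ only in that call. Without it, nn.Module has not set up its internal bookkeeping, and assigning a real layer as an attribute raises an AttributeError.

```python
import torch.nn as nn


class MissingSuper(nn.Module):    # hypothetical: forgets super().__init__()
    def __init__(self):
        self.layer = nn.Linear(in_features=4, out_features=2)


class WithSuper(nn.Module):       # hypothetical: calls the super constructor
    def __init__(self):
        super().__init__()
        self.layer = nn.Linear(in_features=4, out_features=2)


try:
    MissingSuper()
    failed = False
except AttributeError as error:
    failed = True
    print(error)                  # PyTorch's message about the missing call

network = WithSuper()
# With the super call in place, the layer is registered as a child module.
registered = [name for name, _ in network.named_children()]
print(registered)                 # ['layer']
```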
Define the network's layers as class attributes
At the moment, our Network class has a single dummy layer as an attribute. Let's replace this now with some real layers that come pre-built for us from PyTorch's nn library. We're building a CNN, so the two types of layers we'll use are linear layers and convolutional layers.
class Network(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(in_channels=1, out_channels=6, kernel_size=5)
        self.conv2 = nn.Conv2d(in_channels=6, out_channels=12, kernel_size=5)

        self.fc1 = nn.Linear(in_features=12 * 4 * 4, out_features=120)
        self.fc2 = nn.Linear(in_features=120, out_features=60)
        self.out = nn.Linear(in_features=60, out_features=10)

    def forward(self, t):
        # implement the forward pass
        return t
Alright. At this point, we have a Python class called Network that extends PyTorch's nn.Module class. Inside of our Network class, we have five layers that are defined as attributes. We have two convolutional layers, self.conv1 and self.conv2, and three linear layers, self.fc1, self.fc2, and self.out.
We used the abbreviation fc in fc1 and fc2 because linear layers are also called fully connected layers. A third name we may sometimes hear is dense. So linear, dense, and fully connected are all ways to refer to the same type of layer. PyTorch uses the word linear, hence the nn.Linear class name.
We used the name out for the last linear layer because the last layer in the network is the output layer.
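To get a feel for the numbers in these layer definitions, the sketch below inspects the weight shapes and traces one tensor through the two convolutional layers. The 28 x 28 single-channel input and the 2 x 2 max pooling after each convolution are assumptions made for this example; this post does not specify them. Under those assumptions, the result has the 12 * 4 * 4 features that fc1 expects.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

conv1 = nn.Conv2d(in_channels=1, out_channels=6, kernel_size=5)
conv2 = nn.Conv2d(in_channels=6, out_channels=12, kernel_size=5)
fc1 = nn.Linear(in_features=12 * 4 * 4, out_features=120)

# Each layer object stores its weights (data) as a tensor.
print(conv1.weight.shape)         # torch.Size([6, 1, 5, 5])
print(fc1.weight.shape)           # torch.Size([120, 192])

# ASSUMPTION: one 28x28 grayscale image, 2x2 max pooling after each conv.
t = torch.rand(1, 1, 28, 28)
t = F.max_pool2d(conv1(t), kernel_size=2, stride=2)   # 28 -> 24 -> 12
t = F.max_pool2d(conv2(t), kernel_size=2, stride=2)   # 12 -> 8 -> 4
print(t.shape)                    # torch.Size([1, 12, 4, 4])

flat = t.reshape(1, -1)           # flatten to 12 * 4 * 4 = 192 features
print(fc1(flat).shape)            # torch.Size([1, 120])
```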
We should now have a good idea about how to get started building neural networks in PyTorch using the torch.nn library. In the next post we'll investigate the different types of parameters of our layers and gain an understanding of how they are chosen. I'll see you in the next one.