What is an artificial neural network?
In the previous post, we defined deep learning as a sub-field of machine learning that uses algorithms inspired by the structure and function of the brain's neural networks. For this reason, the models used in deep learning are called artificial neural networks (ANNs).
Let’s give a definition for an artificial neural network: a computing system made up of a collection of connected units called neurons. The connected neural units form the so-called network. Each connection between neurons transmits a signal from one neuron to the other. The receiving neuron processes the signal and then signals the downstream neurons connected to it within the network. Note that neurons are also commonly referred to as nodes.
Nodes are organized into what we call layers. At the highest level, there are three types of layers in every ANN:
- Input layer
- Hidden layers
- Output layer
Different layers perform different kinds of transformations on their inputs. Layers positioned between the input and output layers are known as hidden layers. Data flows through the network starting at the input layer and moving through the hidden layers until the output layer is reached. This is known as a forward pass through the network.
Let’s consider the number of nodes contained in each type of layer:
- Input layer - One node for each component of the input data.
- Hidden layers - Arbitrarily chosen number of nodes for each hidden layer.
- Output layer - One node for each of the possible desired outputs.
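To make these counts concrete, here is a minimal sketch, using the hypothetical height/weight example that appears later in this post, of how the layer sizes follow from the data:

```python
import numpy as np

# Hypothetical dataset: each sample has 2 components (e.g. height, weight),
# and there are 2 possible output classes.
samples = np.array([[1.70, 65.0],
                    [1.82, 90.0]])
classes = ["underweight", "overweight"]

input_nodes = samples.shape[1]   # one node per component of the input -> 2
output_nodes = len(classes)      # one node per possible desired output -> 2
hidden_nodes = 3                 # arbitrarily chosen

print(input_nodes, hidden_nodes, output_nodes)  # 2 3 2
```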
Now that we have a general idea of the definition and structure of an ANN, let’s have a look at how these ideas can be illustrated.
Visualizing an artificial neural network
I think this illustration does a pretty good job of showing what we just covered:
This ANN has three layers total. The layer on the left is the input layer, the layer on the right is the output layer, and the layer in the middle is the hidden layer. Remember that each layer is composed of neurons, or nodes. Here, the nodes are depicted as circles, so let’s consider how many nodes are in each layer of this network.
Number of nodes in each layer:
- Input layer (left): 2 nodes
- Hidden layer (middle): 3 nodes
- Output layer (right): 2 nodes
Since this network has two nodes in the input layer, each input to this network must have two dimensions, for example, height and weight.
Since this network has two nodes in the output layer, there are two possible outputs for every input that is passed forward (left to right) through the network. For example, overweight and underweight could be the two output classes. Note that the output classes are also known as the prediction classes.
Now that we have this working knowledge, let’s see how we can build an ANN in code using Keras.
Keras sequential model
In Keras, we can build what is called a sequential model. Keras defines a sequential model as a linear stack of layers. This is what we might expect, as we have just learned that neurons are organized into layers.
This sequential model is Keras’ implementation of an artificial neural network. Let’s see now how a very simple sequential model is built using Keras.
First we import the required Keras classes.
```python
from keras.models import Sequential
from keras.layers import Dense, Activation
```
Then, we create a variable called `model`, and we set it equal to an instance of a `Sequential` object.

```python
model = Sequential(layers)
```
To the constructor, we pass an array of `Dense` objects. Each of these `Dense` objects is actually a layer.

```python
layers = [
    Dense(3, input_shape=(2,), activation='relu'),
    Dense(2, activation='softmax')
]
```
The name `Dense` indicates that these layers are of type `Dense`. `Dense` is one particular type of layer, but there are many other types that we will see as we continue our deep learning journey.
For now, just understand that dense is the most basic kind of layer in an ANN and that each output of a dense layer is computed using every input to the layer.
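The fact that each output uses every input can be sketched as a matrix multiplication. Here is a minimal NumPy illustration (the weight values are made up for the example):

```python
import numpy as np

# A dense layer with 3 inputs and 2 outputs: the weight matrix has one
# entry for every (input, output) pair, so each output uses every input.
x = np.array([1.0, 2.0, 3.0])    # 3 inputs (e.g. hidden-layer values)
W = np.array([[0.1, 0.4],
              [0.2, 0.5],
              [0.3, 0.6]])       # shape: (3 inputs, 2 outputs)
b = np.array([0.0, 0.0])         # one bias per output

y = x @ W + b                    # each y[j] sums over all of the x[i]
print(y)                         # [1.4 3.2]
```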
Looking at the arrows in our image (in the above section) coming from the hidden layer to the output layer, we can see that each node in the hidden layer is connected to all nodes in the output layer. This is how we know that the output layer in the image is a dense layer. This same logic applies to the hidden layer.
The first parameter passed to each `Dense` layer's constructor tells us how many neurons that layer should have. The `input_shape=(2,)` parameter tells us how many neurons our input layer has, so in our case, we have two.
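Putting the pieces together, the complete model definition might look like this (assuming a working Keras installation):

```python
from keras.models import Sequential
from keras.layers import Dense

# The 2-3-2 network from the image: 2 inputs, a 3-node hidden layer,
# and a 2-node output layer.
layers = [
    Dense(3, input_shape=(2,), activation='relu'),
    Dense(2, activation='softmax')
]
model = Sequential(layers)
model.summary()  # prints the layer stack and parameter counts
```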
Lastly, we have the `activation` parameter, which specifies a so-called activation function.
More on this in future posts. For now, just know that an activation function is a non-linear function that typically follows a dense layer.
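As a preview, the two activation functions named in the layer definitions above can be sketched in a few lines of NumPy. These are simplified definitions for illustration, not the library implementations:

```python
import numpy as np

def relu(x):
    # relu passes positive values through and zeroes out negatives
    return np.maximum(0.0, x)

def softmax(x):
    # softmax turns raw scores into probabilities that sum to 1
    e = np.exp(x - np.max(x))   # subtract the max for numerical stability
    return e / e.sum()

print(relu(np.array([-1.0, 2.0])))     # [0. 2.]
print(softmax(np.array([1.0, 1.0])))   # [0.5 0.5]
```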
This gives us an example of a very basic model definition in Keras. For a more in-depth look at Keras, be sure to check out the Keras series. For now, I hope you have an idea of what an ANN is and how you can build one using Keras. I hope you found this one helpful. See ya in the next one!