Train an Artificial Neural Network with TensorFlow's Keras API
In this episode, we'll demonstrate how to train an artificial neural network using the Keras API integrated within TensorFlow.
In the previous episode, we went through the steps to build a simple network, and now we'll focus on training it using data we generated in an even earlier episode.
You should have the following modules imported from last time.
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Activation, Dense
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.metrics import categorical_crossentropy
Recall that this is the model we built previously.
model = Sequential([
    Dense(units=16, input_shape=(1,), activation='relu'),
    Dense(units=32, activation='relu'),
    Dense(units=2, activation='softmax')
])
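Before compiling, it can be handy to sanity-check the architecture we just defined. One quick optional check (not part of the original walkthrough) is to print the layer summary:
model.summary()  # prints each layer's output shape and parameter count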
Compiling the model
The first thing we need to do to get the model ready for training is call the compile() function on it.
model.compile(optimizer=Adam(learning_rate=0.0001), loss='sparse_categorical_crossentropy', metrics=['accuracy'])
This function configures the model for training and expects a number of parameters. First, we specify the optimizer Adam. Adam accepts an optional learning_rate parameter, which we'll set to 0.0001.
Adam optimization is a stochastic gradient descent (SGD) method, and you can learn more about SGD, learning rates, what it actually means to train a network, or any other underlying deep learning concepts in the Deep Learning Fundamentals course.
The next parameter we specify is loss. We'll be using sparse_categorical_crossentropy, given that our labels are in integer format.
Note that when we have only two classes, we could instead configure our output layer to have only one output, rather than two, and use binary_crossentropy as our loss, rather than sparse_categorical_crossentropy. Both options work equally well and achieve equivalent results. With binary_crossentropy, however, the last layer would need to use sigmoid, rather than softmax, as its activation function.
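As a rough sketch of what that alternative setup could look like (not something we run in this episode, and model_binary is just an illustrative name), assuming the same integer labels with values 0 and 1:
# Alternative two-class setup: one sigmoid output unit with binary crossentropy
model_binary = Sequential([
    Dense(units=16, input_shape=(1,), activation='relu'),
    Dense(units=32, activation='relu'),
    Dense(units=1, activation='sigmoid')  # single output unit instead of a two-unit softmax
])
model_binary.compile(optimizer=Adam(learning_rate=0.0001), loss='binary_crossentropy', metrics=['accuracy'])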
Moving on, the last parameter we specify in compile() is metrics. This parameter expects a list of metrics that we'd like to be evaluated by the model during training and testing. We'll set this to a list that contains the string 'accuracy'.
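As an aside, the string 'accuracy' is shorthand that Keras resolves to an accuracy metric appropriate for the chosen loss. If you prefer to be explicit, an equivalent compile call (a sketch, not required for this episode) could pass a metric object instead:
model.compile(optimizer=Adam(learning_rate=0.0001), loss='sparse_categorical_crossentropy', metrics=[tf.keras.metrics.SparseCategoricalAccuracy()])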
Training the model
Now that the model is compiled, we can train it using the fit() function.
model.fit(x=scaled_train_samples, y=train_labels, batch_size=10, epochs=30, verbose=2)
The first item that we pass in to the fit() function is the training set x. Recall from a previous episode, we created the training set and gave it the name scaled_train_samples.
The next parameter that we set is the labels for the training set y, which we previously gave the name train_labels.
We then specify the batch_size. Again, the concept of batch size is covered in detail in the Deep Learning Fundamentals course.
Next, we specify how many epochs we want to run. We set this to 30. Note that an epoch is a single pass of all the data through the network.
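With the 2100 training samples we'll see referenced in the output below and a batch_size of 10, each epoch consists of 2100 / 10 = 210 weight updates, so 30 epochs amount to 6300 updates in total.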
Lastly, we specify verbose=2. This just specifies how much output to the console we want to see during each epoch of training. The verbosity levels range from 0 (silent) to 2; with verbose=2, we get one summary line per epoch.
When we call fit() on the model, the model trains, and we get this output.
Train on 2100 samples
Epoch 1/30
2100/2100 - 1s - loss: 0.6288 - accuracy: 0.5662
Epoch 2/30
2100/2100 - 0s - loss: 0.6009 - accuracy: 0.6429
Epoch 3/30
2100/2100 - 0s - loss: 0.5679 - accuracy: 0.7186
Epoch 4/30
2100/2100 - 0s - loss: 0.5301 - accuracy: 0.7833
Epoch 5/30
2100/2100 - 0s - loss: 0.4887 - accuracy: 0.8205
Epoch 6/30
2100/2100 - 0s - loss: 0.4490 - accuracy: 0.8519
Epoch 7/30
2100/2100 - 0s - loss: 0.4097 - accuracy: 0.8776
Epoch 8/30
2100/2100 - 0s - loss: 0.3765 - accuracy: 0.8890
Epoch 9/30
2100/2100 - 0s - loss: 0.3499 - accuracy: 0.9081
Epoch 10/30
2100/2100 - 0s - loss: 0.3291 - accuracy: 0.9124
Epoch 11/30
2100/2100 - 0s - loss: 0.3132 - accuracy: 0.9219
Epoch 12/30
2100/2100 - 0s - loss: 0.3008 - accuracy: 0.9238
Epoch 13/30
2100/2100 - 0s - loss: 0.2909 - accuracy: 0.9238
Epoch 14/30
2100/2100 - 0s - loss: 0.2836 - accuracy: 0.9348
Epoch 15/30
2100/2100 - 0s - loss: 0.2777 - accuracy: 0.9314
Epoch 16/30
2100/2100 - 0s - loss: 0.2730 - accuracy: 0.9324
Epoch 17/30
2100/2100 - 0s - loss: 0.2692 - accuracy: 0.9367
Epoch 18/30
2100/2100 - 0s - loss: 0.2661 - accuracy: 0.9333
Epoch 19/30
2100/2100 - 0s - loss: 0.2636 - accuracy: 0.9381
Epoch 20/30
2100/2100 - 0s - loss: 0.2611 - accuracy: 0.9386
Epoch 21/30
2100/2100 - 0s - loss: 0.2592 - accuracy: 0.9371
Epoch 22/30
2100/2100 - 0s - loss: 0.2574 - accuracy: 0.9376
Epoch 23/30
2100/2100 - 0s - loss: 0.2558 - accuracy: 0.9376
Epoch 24/30
2100/2100 - 0s - loss: 0.2544 - accuracy: 0.9371
Epoch 25/30
2100/2100 - 0s - loss: 0.2529 - accuracy: 0.9362
Epoch 26/30
2100/2100 - 0s - loss: 0.2517 - accuracy: 0.9405
Epoch 27/30
2100/2100 - 0s - loss: 0.2507 - accuracy: 0.9381
Epoch 28/30
2100/2100 - 0s - loss: 0.2495 - accuracy: 0.9410
Epoch 29/30
2100/2100 - 0s - loss: 0.2486 - accuracy: 0.9381
Epoch 30/30
2100/2100 - 0s - loss: 0.2476 - accuracy: 0.9414
We can see corresponding output for each of the 30 epochs. Judging by the loss and accuracy, we can see that both metrics steadily improve over time, with accuracy reaching about 94% and loss steadily decreasing to about 0.25.
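If we want to inspect these numbers programmatically rather than reading them off the console, fit() returns a History object. A minimal sketch of how we could have captured the result of the fit() call above:
# fit() returns a History object whose history dict maps each metric name
# to a list of per-epoch values.
history = model.fit(x=scaled_train_samples, y=train_labels, batch_size=10, epochs=30, verbose=2)
print(history.history['loss'][-1])      # loss from the final epoch
print(history.history['accuracy'][-1])  # accuracy from the final epoch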
Note that although this is a very simple model trained on simple data, without much effort, we were able to reach pretty good results in a relatively short amount of time. In subsequent episodes, we'll demo more complex models as well as more complex data, but hopefully you've become encouraged by how easily we were able to get started with tf.keras.