Reproducible results with Keras
In this episode, we're going to show how we can achieve reproducible results from an artificial neural network using Keras.
You may have noticed that when you train the same model multiple times, at different points in time, you may get varying results from each training run in regard to the loss and accuracy metrics, or to your model's predictions.
This is because when we train a model, its weights are first initialized with random numbers and are then dynamically updated during training via gradient descent.
This random initialization happens each time we train, so each separate training run starts off from a different set of random values for the weights.
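To see this concretely, here's a minimal sketch (the layer sizes are arbitrary choices just for illustration) that builds the same tiny architecture twice and compares the randomly initialized weights:
import tensorflow as tf

# Build the same architecture twice; with no seed set, each model's
# weights are initialized with different random values.
def build_model():
    model = tf.keras.Sequential([tf.keras.layers.Dense(4)])
    model.build(input_shape=(None, 3))
    return model

weights_a = build_model().get_weights()[0]
weights_b = build_model().get_weights()[0]
print((weights_a == weights_b).all())  # Typically False: different initializations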
Let's discuss an example. Suppose we create a model and train it today, and then use that model to make predictions for image classification.
The model may tell us that it is 98% certain that the last image we passed it was an image of a dog. Since the image was indeed a dog, we think this is great and then close our program without saving a copy of our model on disk.
The next day, we open our program again, and we still have the code in place for the architecture of our model. We then compile the model and train it on the exact same data as we did yesterday for the same number of epochs.
We then give it the same image to predict on, but this time, it tells us that it's only 95% certain that our image is of a dog, whereas yesterday, it was 98%.
This illustrates varying results that we may get due to the random weight initialization that occurs when we train our model on the exact same training data.
This variation is fine, and it's expected due to the random nature of weight initialization, as well as other configurations that are random by nature involved with our network, like dropout, for example. Recall, dropout drops out nodes at random from a specified layer.
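As a quick illustration of that randomness, here's a small sketch (the 20% rate and input shape are arbitrary choices) showing a Dropout layer zeroing a random subset of its inputs during training:
import tensorflow as tf

# In training mode, Dropout zeroes a random fraction (here 20%) of its
# inputs and scales the surviving values up to compensate.
dropout = tf.keras.layers.Dropout(0.2)
x = tf.ones((1, 10))
print(dropout(x, training=True))  # a random subset of entries is zeroed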
Although this variation is expected, there are times when we want our model to reproduce the exact same results regardless of when we train it, as long as we're training on the same data, of course.
We may desire this type of reproducibility for a class assignment or a live presentation so that we can be prepared for the exact results our model will yield ahead of time. Perhaps we may even desire this reproducibility just for testing purposes during the development phase of our model.
Regardless of the reason for wanting to achieve reproducible results, we're now going to show how to achieve this reproducibility for a Keras model.
Reproducibility with random seeds
Essentially, what we need to do is strip out the randomness that occurs during training. We can do this by setting a random seed to any given number before we build and train our model.
By setting a random seed, we're forcing the "random" initialization of the weights to be generated based upon the seed we set. Then, going forward, as long as we're using the same random seed, we can ensure that all the random variables in our model will always be generated in the exact same manner.
If we didn't set the random seed, then each time we trained our model, the random variables would be generated differently.
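Here's a minimal sketch of this idea using NumPy alone (the seed value 42 is an arbitrary choice): seeding the generator before drawing numbers makes the draws repeatable.
import numpy as np

# Seeding the generator before drawing makes the "random" draws repeatable.
np.random.seed(42)
first_run = np.random.rand(3)

np.random.seed(42)
second_run = np.random.rand(3)

print((first_run == second_run).all())  # True: identical sequences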
For Keras, we need to set a random seed for any random numbers generated by Python, NumPy, or TensorFlow. To do this, we'll set the random seed for each of these libraries separately.
We want to be sure to set our random seeds right at the start of the program, before we run any other model-related code.
Let's see what this looks like in code.
We first import numpy, tensorflow, and the Python library random.
import numpy as np
import tensorflow as tf
import random as rn
As a quick note, before we set the random seeds, the Keras documentation lets us know that the piece of code below is necessary for reproducibility with certain hash-based algorithms, so we put this in directly underneath our import statements.
import os
os.environ['PYTHONHASHSEED'] = '0'
Also per the Keras documentation, note that when running code on a GPU, some operations have non-deterministic outputs because GPUs run many operations in parallel, so the order of execution is not always guaranteed. We can avoid these non-deterministic operations by forcing the code to run on the CPU, simply by running the following line.
os.environ['CUDA_VISIBLE_DEVICES'] = ''
Next, we set our random seed for numpy.
np.random.seed(37)
I've specified 37 for my random seed, but you can use any int you'd like.
Then, we specify the random seed for Python using the random library.
rn.seed(1254)
Finally, we do the same thing for TensorFlow.
tf.random.set_seed(89)
As previously mentioned, all of this code needs to be at the start of your program. Once all of these seeds have been set, you can proceed with creating and training your model.
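Putting it all together, here's a minimal sketch of how the top of such a program might look (the seed values are the arbitrary ones used above). One design note: we set the environment variables before importing TensorFlow, since CUDA_VISIBLE_DEVICES is read when TensorFlow initializes.
# Set environment variables before importing TensorFlow.
import os
os.environ['PYTHONHASHSEED'] = '0'
os.environ['CUDA_VISIBLE_DEVICES'] = ''  # force CPU to avoid GPU non-determinism

import numpy as np
import tensorflow as tf
import random as rn

# Seed all three sources of randomness.
np.random.seed(37)
rn.seed(1254)
tf.random.set_seed(89)

# ...proceed with building, compiling, and training the model as usual.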
Note that the Keras documentation has changed since the production of the video portion of this episode. The documentation no longer states that it is required to force TensorFlow to use a single thread using the corresponding code shown in the video to obtain reproducible results.
That's all there is to it for getting reproducible results from our Keras model!
Hopefully you now understand the intuition behind the randomness involved with training and how this may affect our ability to get reproducible results from our model, as well as how you can force reproducibility when needed.