### Training Loop Run Builder - Neural Network Experimentation

Welcome to this neural network programming series. In this episode, we’ll code a `RunBuilder` class that will allow us to generate multiple runs with varying parameters.

Without further ado, let’s get started.

### Using the `RunBuilder` Class

The purpose of this episode, and of the last couple of episodes in this series, is to put ourselves in a position to efficiently experiment with the training process we’ve constructed. For this reason, we’re going to expand on something we touched on in the episode on hyperparameter experimentation. We’re going to make what we saw there a bit cleaner.

We’re going to build a class called `RunBuilder`. But before we look at how to build the class, let’s see what it will allow us to do. We’ll start with our imports.

```python
from collections import OrderedDict
from collections import namedtuple
from itertools import product
```

We’re importing `OrderedDict` and `namedtuple` from `collections`, and we’re importing a function called `product` from `itertools`. This `product()` function is the one we saw last time that computes a Cartesian product given multiple list inputs.
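As a quick refresher, here’s a minimal sketch of what `product()` returns for two small placeholder lists (the values here are just for illustration):

```python
from itertools import product

# product() yields tuples from the Cartesian product of its inputs
pairs = list(product([1, 2], ['a', 'b']))
print(pairs)  # [(1, 'a'), (1, 'b'), (2, 'a'), (2, 'b')]
```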

Alright. This is the `RunBuilder` class that will build sets of parameters that define our runs. We’ll see how it works after we see how to use it.

```python
class RunBuilder():
    @staticmethod
    def get_runs(params):

        Run = namedtuple('Run', params.keys())

        runs = []
        for v in product(*params.values()):
            runs.append(Run(*v))

        return runs
```

The main thing to note about using this class is that it has a static method called `get_runs()`. This method builds and returns the runs based on the parameters we pass in.

Let’s define some parameters now.

```python
params = OrderedDict(
    lr = [.01, .001]
    ,batch_size = [1000, 10000]
)
```

Here, we’ve defined a set of parameters and values inside a dictionary. We have a set of learning rates and a set of batch sizes we want to try out. When we say try out, we mean that we want to do a training run for each learning rate and each batch size in the dictionary.

To get these runs, we just call the `get_runs()` function of the `RunBuilder` class, passing in the parameters we’d like to use.

```
> runs = RunBuilder.get_runs(params)
> runs

[
    Run(lr=0.01, batch_size=1000),
    Run(lr=0.01, batch_size=10000),
    Run(lr=0.001, batch_size=1000),
    Run(lr=0.001, batch_size=10000)
]
```

Great, we can see that the `RunBuilder` class has built and returned a list of four runs. Each of these runs has a learning rate and a batch size that defines the run.

We can access an individual run by indexing into the list like so:

```
> run = runs[0]
> run

Run(lr=0.01, batch_size=1000)
```

Notice the string representation of the run output. This string representation was automatically generated for us by the `Run` tuple class, and it can be used to uniquely identify the run if we want to write run statistics out to disk for TensorBoard or any other visualization program.
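For example, interpolating a run into an f-string yields a name we could use for a file or log directory (a minimal sketch; the `comment` variable is our own naming choice):

```python
from collections import namedtuple

# Build a run the same way RunBuilder does internally
Run = namedtuple('Run', ['lr', 'batch_size'])
run = Run(lr=0.01, batch_size=1000)

# The namedtuple repr doubles as a unique, human-readable run identifier
comment = f'-{run}'
print(comment)  # -Run(lr=0.01, batch_size=1000)
```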

Additionally, because the run object is a `tuple` with named attributes, we can access the values using dot notation like so:

```
> print(run.lr, run.batch_size)
0.01 1000
```

Finally, since the list of runs is a Python iterable, we can iterate over the runs cleanly like so:

```python
for run in runs:
    print(run, run.lr, run.batch_size)
```

Output:

```
Run(lr=0.01, batch_size=1000) 0.01 1000
Run(lr=0.01, batch_size=10000) 0.01 10000
Run(lr=0.001, batch_size=1000) 0.001 1000
Run(lr=0.001, batch_size=10000) 0.001 10000
```

To try additional values, we just add them to the original parameter lists, and to try a new type of parameter, we just add a new entry to the dictionary. The new parameter and its values automatically become available to be consumed inside the run, and the string output for the run updates as well.

Two parameters:

```python
params = OrderedDict(
    lr = [.01, .001]
    ,batch_size = [1000, 10000]
)
runs = RunBuilder.get_runs(params)
runs
```

Output:

```
[
    Run(lr=0.01, batch_size=1000),
    Run(lr=0.01, batch_size=10000),
    Run(lr=0.001, batch_size=1000),
    Run(lr=0.001, batch_size=10000)
]
```

Three parameters:

```python
params = OrderedDict(
    lr = [.01, .001]
    ,batch_size = [1000, 10000]
    ,device = ["cuda", "cpu"]
)
runs = RunBuilder.get_runs(params)
runs
```

Output:

```
[
    Run(lr=0.01, batch_size=1000, device='cuda'),
    Run(lr=0.01, batch_size=1000, device='cpu'),
    Run(lr=0.01, batch_size=10000, device='cuda'),
    Run(lr=0.01, batch_size=10000, device='cpu'),
    Run(lr=0.001, batch_size=1000, device='cuda'),
    Run(lr=0.001, batch_size=1000, device='cpu'),
    Run(lr=0.001, batch_size=10000, device='cuda'),
    Run(lr=0.001, batch_size=10000, device='cpu')
]
```

This functionality will allow us to have greater control as we experiment with different values during training.

Let’s see how to build this `RunBuilder` class.

### Coding the `RunBuilder` Class

The first thing we need to have is a dictionary of parameters and values we’d like to try.

```python
params = OrderedDict(
    lr = [.01, .001]
    ,batch_size = [1000, 10000]
)
```

Next, we get a list of keys from the dictionary.

```
> params.keys()
odict_keys(['lr', 'batch_size'])
```

Then, we get a list of values from the dictionary.

```
> params.values()
odict_values([[0.01, 0.001], [1000, 10000]])
```

Once we have both of these, we inspect their output to make sure we understand them. Then, we use these keys and values for what comes next. We’ll start with the keys.

```python
Run = namedtuple('Run', params.keys())
```

This line creates a new `tuple` subclass called `Run` that has named fields. This `Run` class is used to encapsulate the data for each of our runs. The field names of this class are set by the list of names passed to the constructor. First, we pass the class name. Then, we pass the field names, and in our case, we pass the list of keys from our dictionary.
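As a quick sanity check (using the same keys as our dictionary), we can inspect the generated class’s `_fields` attribute to confirm which field names it picked up:

```python
from collections import OrderedDict, namedtuple

params = OrderedDict(lr=[.01, .001], batch_size=[1000, 10000])

# The dictionary keys become the named fields of the Run class
Run = namedtuple('Run', params.keys())
print(Run._fields)  # ('lr', 'batch_size')
```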

Now that we have a class for our runs, we are ready to create some.

```python
runs = []
for v in product(*params.values()):
    runs.append(Run(*v))
```

First, we create a list called `runs`. Then, we use the `product()` function from `itertools` to create the Cartesian product using the values for each parameter inside our dictionary. This gives us a set of ordered pairs that define our runs. We iterate over these, adding a run to the `runs` list for each one.

For each value in the Cartesian product, we have an ordered tuple. The Cartesian product gives us every ordered pair, so we have all possible ordered pairs of learning rates and batch sizes. When we pass the `tuple` to the `Run` constructor, we use the `*` operator to tell the constructor to accept the tuple values as arguments as opposed to the `tuple` itself.
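To see the difference the `*` operator makes, compare unpacking the tuple against passing it as a single argument (a minimal sketch with example values):

```python
from collections import namedtuple

Run = namedtuple('Run', ['lr', 'batch_size'])
v = (0.01, 1000)

# Unpacking: each element of v becomes a separate positional argument
run = Run(*v)  # equivalent to Run(0.01, 1000)
print(run)     # Run(lr=0.01, batch_size=1000)

# Without the *, Run(v) would bind the whole tuple to lr and raise a
# TypeError for the missing batch_size argument
```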

Finally, we wrap this code in our `RunBuilder` class.

```python
class RunBuilder():
    @staticmethod
    def get_runs(params):

        Run = namedtuple('Run', params.keys())

        runs = []
        for v in product(*params.values()):
            runs.append(Run(*v))

        return runs
```

Since the `get_runs()` method is static, we can call it using the class itself. We don’t need an instance of the class.
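As a small illustration of this (with a toy class of our own, not part of the `RunBuilder` code), a `@staticmethod` can be called on the class itself because it doesn’t depend on any instance state:

```python
class Greeter():
    @staticmethod
    def greet(name):
        # No self parameter: the method uses no instance state
        return f'Hello, {name}'

# Called directly on the class; no Greeter() instance is needed
print(Greeter.greet('runs'))  # Hello, runs
```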

Now, this allows us to update our training code in the following way:

Before:

```python
for lr, batch_size, shuffle in product(*param_values):
    comment = f' batch_size={batch_size} lr={lr} shuffle={shuffle}'

    # Training process given the set of parameters
```

After:

```python
for run in RunBuilder.get_runs(params):
    comment = f'-{run}'

    # Training process given the set of parameters
```

### What is a Cartesian Product?

Do you know about the Cartesian product? Like many things in life, the Cartesian product is a mathematical concept. The Cartesian product is a binary operation. The operation takes two sets as arguments and returns a third set as an output. Let's look at a general mathematical example.

Suppose that \(X\) is a set.

Suppose that \(Y\) is a set.

The Cartesian product between two sets is denoted as \(X \times Y\). The Cartesian product between the set \(X\) and the set \(Y\) is defined to be the set of all ordered pairs \((x,y)\) such that \(x \in X\) and \(y \in Y\). This can be expressed in the following way:

\[ X \times Y = \{\,(x,y) \mid x \in X \text{ and } y \in Y\,\} \]

This way of expressing the output of the Cartesian product is called set builder notation. It is cool. So \(X \times Y\) is the set of all ordered pairs \((x,y)\) such that \(x \in X\) and \(y \in Y\).

To compute \(X \times Y\), we do the following:

For every \(x \in X\) and for every \(y \in Y\), we collect the corresponding pair \((x,y)\). The resulting collection gives us the set of all ordered pairs \((x,y)\) such that \(x \in X\) and \(y \in Y\).

Here is a concrete example expressed in Python:

```python
X = {1,2,3}
Y = {1,2,3}

{ (x,y) for x in X for y in Y }
```

Output:

```
{(1, 1), (1, 2), (1, 3), (2, 1), (2, 2), (2, 3), (3, 1), (3, 2), (3, 3)}
```

Notice how powerful the set comprehension is. It covers all cases. Maybe you noticed that the same result can be achieved using for-loop iteration like so:

```python
X = {1,2,3}
Y = {1,2,3}

cartesian_product = set()
for x in X:
    for y in Y:
        cartesian_product.add((x,y))
cartesian_product
```

Output:

```
{(1, 1), (1, 2), (1, 3), (2, 1), (2, 2), (2, 3), (3, 1), (3, 2), (3, 3)}
```

### Wrapping Up

Alright, now we know how this works and we can use it going forward.