TensorFlow.js - Deep Learning with JavaScript

with deeplizard.

Broadcasting Explained - Tensors for Deep Learning and Neural Networks

July 13, 2018 by


Broadcasting for tensors & deep learning

What’s up, guys? In this post, we’ll learn about broadcasting and illustrate its importance and major convenience when it comes to tensor operations, so let’s get to it!


Over the last couple of posts, we’ve immersed ourselves in tensors, and hopefully now, we have a good understanding of how to work with, transform, and operate on them.

If you recall, a couple posts back, I mentioned the term “broadcasting” and said that we would later make use of it to vastly simplify our VGG16 preprocessing code. That’s exactly what we’ll be doing in this post!

Code preview

Before we get into the details about what broadcasting is, though, let’s get a sneak peek of what our transformed code will look like once we’ve introduced broadcasting.

Because I’m using Git for source management, I can see the diff between our original predict.js file and the modified version of the file that uses broadcasting.

git-diff-predict.js.png

On the left, we have our original predict.js file. Within the click() event, recall this is where we transformed our image into a tensor. Then, the rest of this code was all created to do the appropriate preprocessing for VGG16 where we centered and reversed the RGB values.

Now, on the right, this is our new and improved predict.js file that makes use of broadcasting in place of all of the explicit one-by-one tensor operations on the left.

So, everything in red on the left has now been replaced with what’s shown in green on the right.

That’s a pretty massive reduction of code. Before we show how this happened, we need to understand what broadcasting is.

Broadcasting describes how tensors with different shapes are treated during arithmetic operations.

Broadcasting Example 1: Same shapes

For example, it might be relatively easy to look at these two rank-2 tensors and figure out what the sum of them would be.

Tensor 1:

[[1, 2, 3],]

rank: 2
shape: (1,3)

Tensor 2:

[[4, 5, 6],]

rank: 2
shape: (1,3)

They have the same shape, so we just take the element-wise sum of the two tensors, where we calculate the sum element-by-element, and our resulting tensor looks like this.

Tensor 1 + Tensor 2:

    [[1, 2, 3],]
+ 
    [[4, 5, 6],]
-------------------- 
    [[5, 7, 9],]
    rank: 2
    shape: (1,3)

Now, since these two tensors have the same shape, (1, 3), no broadcasting is happening here. Remember, broadcasting comes into play when we have tensors with different shapes.
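To make the element-wise sum concrete, here is a minimal sketch in plain JavaScript, with nested arrays standing in for tensors. The helper name addSameShape is made up for this illustration and isn't part of TensorFlow.js.

```javascript
// Hypothetical helper: element-wise sum of two rank-2 nested arrays
// that already have the same shape. No broadcasting is involved.
function addSameShape(a, b) {
    if (a.length !== b.length || a[0].length !== b[0].length) {
        throw new Error("Shapes differ, so a plain element-wise sum doesn't apply.");
    }
    return a.map((row, i) => row.map((value, j) => value + b[i][j]));
}

const t1 = [[1, 2, 3]];   // shape (1,3)
const t2 = [[4, 5, 6]];   // shape (1,3)
const sum = addSameShape(t1, t2);
console.log(sum);         // [[5, 7, 9]]
```

The shape check at the top is exactly the situation broadcasting is meant to relax.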

Broadcasting Example 2: Same rank, different shapes

Alright, so what would happen if our two rank-2 tensors looked like this, and we wanted to sum them?

Tensor 1:

[[1, 2, 3],]

rank: 2
shape: (1,3)

Tensor 2:

[[4],
 [5],
 [6]]

rank: 2
shape: (3,1)

We have one tensor with shape (1, 3), and the other with shape (3, 1). Well, here is where broadcasting will come into play.

Before we cover how this is done, go ahead and pause and see just intuitively, what comes to mind as the resulting tensor from adding these two together. Give it a go, write it down, and keep what you write handy because we’ll circle back around to what you wrote later.

Alright, we’re first going to look at the result, and then we’ll go over how we arrived there.

Our result from summing these two tensors is this (3, 3) tensor.

Tensor 1 + Tensor 2:

    [[1, 2, 3],]
+ 
    [[4],
     [5],
     [6]]
-------------------- 
    [[5, 6, 7],
     [6, 7, 8],
     [7, 8, 9]]
    rank: 2
    shape: (3,3)

Here’s how broadcasting works.

We have two tensors with different shapes. The goal of broadcasting is to make the tensors have the same shape so we can perform element-wise operations on them.

First, we have to see if the operation we’re trying to do is even possible between the given tensors. Based on the tensors’ original shapes, there may not be a way to reshape them to force them to be compatible, and if we can’t do that, then we can’t use broadcasting.

Step 1: Determine if tensors are compatible

The rule to see if broadcasting can be used is this.

We compare the shapes of the two tensors, starting at their last dimensions and working backwards. Our goal is to determine whether each dimension between the two tensors’ shapes is compatible.

The dimensions are compatible when either:

  • They’re equal to each other.
  • One of them is 1.

In our example, we have shapes (3, 1) and (1, 3). So we first compare the last dimensions.

Comparing the last dimensions of the two shapes, we have a 1 and a 3. Are these compatible? Well, let’s check the rule.

Are they equal to each other? No, 1 doesn’t equal 3.

Is one of them 1? Yes.

Great, the last dimensions are compatible. Working our way to the front, for the next dimension, we have a 3 and a 1. Similar story, just switched order, right? So, are these compatible? Yes, again, because one of them is 1.

Ok, that’s the first step. We’ve confirmed each dimension between the two shapes is compatible.

If, however, while comparing the dimensions, we found that at least one pair of dimensions wasn’t compatible, then we would cease our efforts there because the arithmetic operation would not be possible between the two tensors.

Now, since we’ve confirmed that our two tensors are compatible, we can sum them and use broadcasting to do it.
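The compatibility check we just walked through can be sketched in a few lines of plain JavaScript. The function name areCompatible is hypothetical, and this version assumes both shapes have the same rank, like the two in this example.

```javascript
// Hypothetical helper: applies the two-part rule to every pair of
// dimensions, starting at the last dimension and working backwards.
// Assumes both shapes have the same rank.
function areCompatible(shapeA, shapeB) {
    for (let i = shapeA.length - 1; i >= 0; i--) {
        const a = shapeA[i];
        const b = shapeB[i];
        const dimsOk = a === b || a === 1 || b === 1;
        if (!dimsOk) {
            return false;   // one incompatible pair is enough to stop
        }
    }
    return true;
}

const compatible = areCompatible([1, 3], [3, 1]);     // true: each pair contains a 1
const incompatible = areCompatible([2, 3], [3, 3]);   // false: 2 vs 3
console.log(compatible, incompatible);
```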

Step 2: Determine the shape of the resulting tensor

When we sum two tensors, the result of this sum will be a new tensor. Our next step is to find out the shape of this resulting tensor. We do that by, again, comparing the shapes of the original tensors.

Let’s see exactly how this is done.

Comparing the shape of (1, 3) to (3, 1), we first calculate the max of the last dimension.

The max of 3 and 1 is 3. 3 will be the last dimension of the shape of the resulting tensor.

Moving on to the next dimension, again, the max of 1 and 3 is 3. So, 3 will be the next dimension of the shape of the resulting tensor.

We’ve now stepped through each dimension of the shapes of the original tensors. We can conclude that the resulting tensor will have shape (3, 3).
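As a quick sketch of this step in plain JavaScript, taking the max of each pair of dimensions is a one-liner. The name resultShape is made up here, and the function assumes the two shapes have the same rank and have already passed the compatibility check.

```javascript
// Hypothetical helper: for two compatible shapes of equal rank, the
// resulting shape is the max of each pair of dimensions.
function resultShape(shapeA, shapeB) {
    return shapeA.map((dim, i) => Math.max(dim, shapeB[i]));
}

const shape = resultShape([1, 3], [3, 1]);
console.log(shape);   // [3, 3]
```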

The original tensors of shape (1, 3) and (3, 1) will now be expanded to shape (3, 3) in order to do the element-wise operation.

Broadcasting can be thought of as copying the existing values within the original tensor and expanding that tensor with these copies until it reaches the required shape.

The values in our (1, 3) tensor will now be broadcast to this (3, 3) tensor.

Tensor 1 broadcast to shape (3,3):

Before:
    [[1, 2, 3],]

After:
    [[1, 2, 3],
     [1, 2, 3],
     [1, 2, 3]]

The values in our (3, 1) tensor will now be broadcast to this (3, 3) tensor.

Tensor 2 broadcast to shape (3,3):

Before:
    [[4],
     [5],
     [6]]

After:
    [[4, 4, 4],
     [5, 5, 5],
     [6, 6, 6]]

We can now easily take the element-wise sum of these two to get this resulting (3, 3) tensor.

    [[1, 2, 3],
     [1, 2, 3],
     [1, 2, 3]]
+
    [[4, 4, 4],
     [5, 5, 5],
     [6, 6, 6]]
-------------------- 
    [[5, 6, 7],
     [6, 7, 8],
     [7, 8, 9]] 

Let’s do another example.

Broadcasting Example 3: Different ranks

What if we wanted to multiply this rank-2 tensor of shape (1, 3) with this rank-0 tensor, better known as a scalar?

Tensor 1:

[[1, 2, 3],]

rank: 2
shape: (1,3)

Tensor 2:

5

rank: 0
shape: ()

We can do this since there’s nothing in the rules preventing us from operating on two tensors of different ranks. Let’s see.

Step 1: Determine if tensors are compatible

We first compare the last dimensions of the two shapes.

When we’re in a situation where the ranks of the two tensors aren’t the same, like what we have here, then we simply substitute a one in for the missing dimensions of the lower-ranked tensor.

In our example, we substitute a one for both missing dimensions in the scalar’s shape, making it now have shape (1, 1).

Then, we ask, are the dimensions compatible? And the answer will always be yes in this type of scenario since one of them will always be a one.
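The substitution of ones can be sketched like this in plain JavaScript. The name padShape is hypothetical. Note that the ones go at the front of the shape, since we compare dimensions starting from the last one.

```javascript
// Hypothetical helper: pads a shape with leading ones until it has
// the requested rank. Shapes that already have that rank are returned as a copy.
function padShape(shape, rank) {
    const missing = Math.max(0, rank - shape.length);
    return new Array(missing).fill(1).concat(shape);
}

const scalarShape = padShape([], 2);     // [1, 1]
const vectorShape = padShape([3], 3);    // [1, 1, 3]
console.log(scalarShape, vectorShape);
```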

Step 2: Determine the shape of the resulting tensor

Alright, all the dimensions are compatible, so what will the resulting tensor look like from multiplying these two together? Again, go ahead and pause here and try yourself before getting the answer.

Well, the max of 3 and 1 is 3, and the max of 1 and 1 is 1. So our resulting tensor will be of shape (1, 3).

Our first tensor is already this shape, so it gets left alone. Our second tensor is now expanded to this shape by broadcasting its value like this.

Tensor 2 broadcast to shape (1,3):

Before:
    5

After:
    [[5, 5, 5],]

Now, we can do our element-wise multiplication to get this resulting (1, 3) tensor.

Tensor 1 x Tensor 2:

    [[1, 2, 3],]
x 
    5
-------------------- 
    [[5, 10, 15],]
    rank: 2
    shape: (1,3)
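In plain JavaScript, with a nested array standing in for the tensor, the scalar case can be sketched as follows. The helper multiplyByScalar is made up for this illustration.

```javascript
// Hypothetical helper: multiplies every element of a rank-2 nested
// array by a scalar. This gives the same result as broadcasting the
// scalar to the array's shape and then multiplying element-wise.
function multiplyByScalar(t, scalar) {
    return t.map(row => row.map(value => value * scalar));
}

const product = multiplyByScalar([[1, 2, 3]], 5);
console.log(product);   // [[5, 10, 15]]
```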

Let’s do one more example.

Broadcasting Example 4: Different ranks… again

What if we wanted to sum this rank-3 tensor of shape (1, 2, 3) and this rank-2 tensor of shape (3, 3)?

Tensor 1:

[[[1, 2, 3],
  [4, 5, 6]]]

rank: 3
shape: (1,2,3)

Tensor 2:

[[1, 1, 1],
 [2, 2, 2],
 [3, 3, 3]]

rank: 2
shape: (3,3)

Before covering any of the incremental steps, go ahead and give it a shot yourself and see what you find out.

Step 1: Determine if tensors are compatible

Alright, the deal with these two tensors is that we can’t operate on them. Why?

"Error: Operands could not be broadcast together with shapes 1,2,3 and 3,3."

😳😳😳

Comparing the second-to-last dimensions of the shapes, they’re not equal to each other, and neither one of them is one, so we stop there.
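Putting both steps together, here is a rough plain JavaScript sketch that handles different ranks and fails the same way on these two shapes. The function broadcastShapes is a hypothetical stand-in written for this post, not the TensorFlow.js implementation, and its error text is simply modeled on the message above.

```javascript
// Hypothetical helper: treats missing leading dimensions as 1, checks
// each pair of dimensions from last to first, and returns the resulting
// shape, or throws if any pair is incompatible.
function broadcastShapes(shapeA, shapeB) {
    const rank = Math.max(shapeA.length, shapeB.length);
    const result = new Array(rank);
    for (let i = 1; i <= rank; i++) {
        // Index from the end; a missing dimension counts as 1.
        const a = i <= shapeA.length ? shapeA[shapeA.length - i] : 1;
        const b = i <= shapeB.length ? shapeB[shapeB.length - i] : 1;
        if (a !== b && a !== 1 && b !== 1) {
            throw new Error(
                `Operands could not be broadcast together with shapes ${shapeA} and ${shapeB}.`
            );
        }
        result[rank - i] = Math.max(a, b);
    }
    return result;
}

let errorMessage = null;
try {
    broadcastShapes([1, 2, 3], [3, 3]);
} catch (e) {
    errorMessage = e.message;   // fails on the second-to-last pair: 2 vs 3
}
console.log(errorMessage);
console.log(broadcastShapes([1, 2, 3], [1, 3]));   // [1, 2, 3]
```

Swapping the (3, 3) tensor for one of shape (1, 3) makes every pair compatible, which is why the second call succeeds.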

We should now have a grip on broadcasting. Let’s go see how we’re able to make use of it in our VGG16 preprocessing code.

Use broadcasting in code

git-diff-predict.js.png

First, we can see we’re changing our meanImageNetRGB object into a rank-1 tensor, which makes sense, right? Because we’re going to be making use of broadcasting, which is going to require us to work with tensors, not arbitrary JavaScript objects.

meanImageNetRGB Before:

let meanImageNetRGB = {
    red: 123.68,
    green: 116.779,
    blue: 103.939
};

meanImageNetRGB After:

let meanImageNetRGB = tf.tensor1d([123.68, 116.779, 103.939]);

Alright, now get a load of this remaining code.

let indices = [
    tf.tensor1d([0], "int32"),
    tf.tensor1d([1], "int32"),
    tf.tensor1d([2], "int32")
];

let centeredRGB = {
    red: tf.gather(tensor, indices[0], 2)
    .sub(tf.scalar(meanImageNetRGB.red))
    .reshape([50176]),
    green: tf.gather(tensor, indices[1], 2)
    .sub(tf.scalar(meanImageNetRGB.green))
    .reshape([50176]),
    blue: tf.gather(tensor, indices[2], 2)
    .sub(tf.scalar(meanImageNetRGB.blue))
    .reshape([50176])
};

let processedTensor = tf.stack([centeredRGB.red, 
    centeredRGB.green, centeredRGB.blue], 1)
    .reshape([224, 224, 3])
    .reverse(2)
    .expandDims();

All of this code was written to handle the centering and reversing of the RGB values. The centering has now been replaced with a single sub() call, which simply subtracts the meanImageNetRGB tensor from the original tensor. The reverse() and expandDims() calls stay chained on the end, just as before.

let processedTensor = tensor.sub(meanImageNetRGB)
    .reverse(2)
    .expandDims();

Ok, so why does this work, and where is the broadcasting? Let’s see.

Our original tensor is a rank-3 tensor of shape (224, 224, 3).

Our meanImageNetRGB tensor is a rank-1 tensor of shape (3).

Our objective is to subtract each mean RGB value from the matching RGB value along the last axis of the original tensor, which is axis 2 when counting from zero.

From what we’ve now learned about broadcasting, we can do this really easily.

We compare the dimensions of the shapes from each tensor and confirm they’re compatible. The last dimensions are compatible because they’re equal to each other. The next two dimensions are compatible because we substitute a one in for the missing dimensions in our rank-1 tensor, making it now have shape (1, 1, 3).

Taking the max across each dimension, our resulting tensor will be of shape (224, 224, 3).

Our original tensor already has that shape, so we leave it alone. Our rank-1 tensor will be expanded to shape (224, 224, 3) by copying its three values to each of the 224 x 224 pixel positions.

Now we can easily do the element-wise subtraction between these two tensors.
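To see what this subtraction amounts to, here is a small plain JavaScript sketch with nested arrays in place of tensors. The helper subtractChannelMeans, the tiny 1 x 2 image, and the mean values are all made up for illustration. They are not the ImageNet means or part of predict.js.

```javascript
// Hypothetical helper: subtracts a length-3 array of channel means from
// every pixel of a [height][width][3] nested array. This mirrors what
// broadcasting a shape (3) tensor against shape (height, width, 3) does.
function subtractChannelMeans(image, means) {
    return image.map(row =>
        row.map(pixel => pixel.map((value, c) => value - means[c]))
    );
}

// A tiny 1 x 2 "image" stands in for the 224 x 224 one.
const tinyImage = [[[10, 20, 30], [40, 50, 60]]];
const channelMeans = [1, 2, 3];   // illustrative values only
const centered = subtractChannelMeans(tinyImage, channelMeans);
console.log(centered);   // [[[9, 18, 27], [39, 48, 57]]]
```

Every pixel sees the same three means, which is the copying that broadcasting does for us.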

Condensing our code

Now, actually, if we wanted to make this code even more concise, then rather than creating two tensor objects, our original one and our preprocessed one, we can chain all these calls together to condense these two separate tensors into one.

We would first need to bring our meanImageNetRGB definition above our tensor definition. Then, we’d move our sub(), reverse(), and expandDims() calls up and chain them onto the original tensor.

let meanImageNetRGB = tf.tensor1d([123.68, 116.779, 103.939]);
let tensor = tf.fromPixels(image)
    .resizeNearestNeighbor([224, 224])
    .toFloat()
    .sub(meanImageNetRGB)
    .reverse(2)
    .expandDims();

Lastly, we replace the reference to processedTensor with just tensor.

let predictions = await model.predict(tensor).data();

And that’s it!

Reflection

So, if you took the time to truly understand the tensor operations we went through step-by-step in the last couple posts, then you should now be pretty blown away by how much easier broadcasting can make our lives and our code.

Given this, I want to hear from you! Let me know in the comments what you think about this. Did you follow? Do you see the value in broadcasting?

Oh, also, remember all those times I asked you to pause the video and record your answers to the examples we were going through? Let me know what ya got! And don’t be embarrassed if you were wrong. I was wrong when I tried to figure out examples like those when I was first learning broadcasting, so no shame!

Let me know, and I’ll see ya in the next video!
