Tensors for Deep Learning - Broadcasting and Element-wise Operations with PyTorch

Element-wise tensor operations for deep learning

Welcome back to this series on neural network programming. In this post, we'll be expanding our knowledge beyond reshaping operations by learning about element-wise operations.

Without further ado, let's get started.

Reshaping operations
Element-wise operations
Reduction operations
Access operations

What does element-wise mean?

Element-wise operations are extremely common operations with tensors in neural network programming. Let's lead this discussion off with a definition of an element-wise operation.

An element-wise operation is an operation between two tensors that operates on corresponding elements within the respective tensors.

An element-wise operation operates on corresponding elements between tensors.

Two elements are said to be corresponding if the two elements occupy the same position within the tensor. The position is determined by the indexes used to locate each element.

Suppose we have the following two tensors:

> t1 = torch.tensor([
    [1,2],
    [3,4]
], dtype=torch.float32)

> t2 = torch.tensor([
    [9,8],
    [7,6]
], dtype=torch.float32)

Both of these tensors are rank-2 tensors with a shape of 2 x 2.

This means that we have two axes that both have a length of two elements each. The elements of the first axis are arrays and the elements of the second axis are numbers.

# Example of the first axis
> print(t1[0])
tensor([1., 2.])

# Example of the second axis
> print(t1[0][0])
tensor(1.)

This is the kind of thing we are used to seeing in this series by now. Alright, let's build on this.

We know that two elements are said to be corresponding if the two elements occupy the same position within the tensor, and the position is determined by the indexes used to locate each element. Let's see an example of corresponding elements.

> t1[0][0]
tensor(1.)

> t2[0][0]
tensor(9.)

This allows us to see that the corresponding element for the 1 in t1 is the 9 in t2.

The correspondence is defined by the indexes. This is important because it reveals an important feature of element-wise operations. We can deduce that tensors must have the same number of elements in order to perform an element-wise operation.

We'll go ahead and make this statement more restrictive. Two tensors must have the same shape in order to perform element-wise operations on them.
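
To see why the same-shape statement is more restrictive than the same-number-of-elements statement, here is a small sketch. The tensor t3 is a hypothetical reshaped copy of t2 introduced only for this illustration.

```python
import torch

t1 = torch.tensor([[1, 2], [3, 4]], dtype=torch.float32)
t2 = torch.tensor([[9, 8], [7, 6]], dtype=torch.float32)

# t1 and t2 agree on both element count and shape.
same_count = t1.numel() == t2.numel()
same_shape = t1.shape == t2.shape
print(same_count, same_shape)

# t3 (hypothetical) holds the same four values as t2, laid out as 1 x 4.
t3 = t2.reshape(1, 4)
count_matches = t1.numel() == t3.numel()
shape_matches = t1.shape == t3.shape
print(count_matches, shape_matches)
```

The element counts still match for t3, but the shapes do not, so there is no longer a one-to-one correspondence by index.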

Addition is an element-wise operation

Let's look at our first element-wise operation, addition. Don't worry. It will get more interesting.

> t1 + t2
tensor([[10., 10.],
        [10., 10.]])

This allows us to see that addition between tensors is an element-wise operation. Each pair of elements in corresponding locations is added together to produce a new tensor of the same shape.

So, addition is an element-wise operation, and in fact, all the arithmetic operations, add, subtract, multiply, and divide are element-wise operations.
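
One way to convince ourselves of this is to do the addition by hand, one index pair at a time, and compare it with the built-in result. This is only a sketch for intuition. The variable name manual is made up here, and PyTorch does not implement addition with a Python loop.

```python
import torch

t1 = torch.tensor([[1, 2], [3, 4]], dtype=torch.float32)
t2 = torch.tensor([[9, 8], [7, 6]], dtype=torch.float32)

# Build the result position by position using the shared indexes.
manual = torch.empty_like(t1)
for i in range(t1.shape[0]):
    for j in range(t1.shape[1]):
        manual[i, j] = t1[i, j] + t2[i, j]

print(manual)
print(torch.equal(manual, t1 + t2))
```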

Arithmetic operations are element-wise operations

Operations we commonly see with tensors are arithmetic operations using scalar values. There are two ways we can do this:

(1) Using these symbolic operations:

> print(t1 + 2)
tensor([[3., 4.],
        [5., 6.]])

> print(t1 - 2)
tensor([[-1.,  0.],
        [ 1.,  2.]])

> print(t1 * 2)
tensor([[2., 4.],
        [6., 8.]])

> print(t1 / 2)
tensor([[0.5000, 1.0000],
        [1.5000, 2.0000]])

or equivalently, (2) these built-in tensor object methods:

> print(t1.add(2))
tensor([[3., 4.],
        [5., 6.]])

> print(t1.sub(2))
tensor([[-1.,  0.],
        [ 1.,  2.]])

> print(t1.mul(2))
tensor([[2., 4.],
        [6., 8.]])

> print(t1.div(2))
tensor([[0.5000, 1.0000],
        [1.5000, 2.0000]])

Both of these options work the same. We can see that in both cases, the scalar value, 2, is applied to each element with the corresponding arithmetic operation.
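
If we'd like to check the equivalence rather than compare printed output by eye, a quick sketch like the following does the job. The dictionary name pairs is just a label chosen for this example.

```python
import torch

t1 = torch.tensor([[1, 2], [3, 4]], dtype=torch.float32)

# Pair each operator form with its method form.
pairs = {
    "add": (t1 + 2, t1.add(2)),
    "sub": (t1 - 2, t1.sub(2)),
    "mul": (t1 * 2, t1.mul(2)),
    "div": (t1 / 2, t1.div(2)),
}

for name, (via_operator, via_method) in pairs.items():
    # torch.equal is True when shapes and all values match.
    print(name, torch.equal(via_operator, via_method))
```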

Something seems to be wrong here. These examples are breaking the rule we established that said element-wise operations operate on tensors of the same shape.

Scalar values are rank-0 tensors, which means they have no axes and an empty shape, while our tensor t1 is a rank-2 tensor of shape 2 x 2.
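
We can inspect the mismatch directly. In this sketch the scalar is wrapped in a tensor so that its rank and shape can be printed. The variable name scalar is an assumption made for the example.

```python
import torch

scalar = torch.tensor(2.0)
t1 = torch.tensor([[1, 2], [3, 4]], dtype=torch.float32)

# dim() gives the rank, shape gives the length of each axis.
print(scalar.dim(), scalar.shape)
print(t1.dim(), t1.shape)
```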

So how does this fit in? Let's break it down.

The first solution that may come to mind is that the operation is simply using the single scalar value and operating on each element within the tensor.

This logic kind of works. However, it's a bit misleading, and it breaks down in more general situations where we're not using a scalar.

To think about these operations differently, we need to introduce the concept of tensor broadcasting or broadcasting.

Broadcasting tensors

Broadcasting describes how tensors with different shapes are treated during element-wise operations.

Broadcasting is the concept whose implementation allows us to add scalars to higher dimensional tensors.

Let's think about the t1 + 2 operation. Here, the scalar-valued tensor is broadcast to the shape of t1, and then the element-wise operation is carried out.

We can see what the broadcast scalar value looks like using NumPy's broadcast_to() function (with NumPy imported as np):

> np.broadcast_to(2, t1.shape)
array([[2, 2],
       [2, 2]])

This means the scalar value is transformed into a rank-2 tensor just like t1, and just like that, the shapes match and the element-wise rule of having the same shape is back in play. This is all under the hood of course.

This piece of code here paints the picture so to speak. This

> t1 + 2
tensor([[3., 4.],
        [5., 6.]])

is really this:

> t1 + torch.tensor(
    np.broadcast_to(2, t1.shape)
    ,dtype=torch.float32
)
tensor([[3., 4.],
        [5., 6.]])
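
The same picture can be painted without leaving PyTorch. The sketch below uses the tensor method expand() as a stand-in for the NumPy call above. This is an alternative chosen for illustration and not a claim about what happens internally.

```python
import torch

t1 = torch.tensor([[1, 2], [3, 4]], dtype=torch.float32)
scalar = torch.tensor(2.0)

# expand() returns a view of the scalar with the target shape, without copying data.
expanded = scalar.expand(t1.shape)
print(expanded)

# Adding the expanded tensor gives the same result as adding the plain scalar.
print(torch.equal(t1 + expanded, t1 + 2))
```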

At this point, you may be thinking that this seems convoluted, so let's look at a trickier example to drive the point home.

Trickier example of broadcasting

Suppose we have the following two tensors.

> t1 = torch.tensor([
    [1,1],
    [1,1]
], dtype=torch.float32)

> t2 = torch.tensor([2,4], dtype=torch.float32)

What will be the result of this element-wise addition operation? Is it even possible given the same shape rule for element-wise operations?

# t1 + t2 ???????

> t1.shape
torch.Size([2, 2])

> t2.shape
torch.Size([2])

Even though these two tensors have differing shapes, the element-wise operation is possible, and broadcasting is what makes the operation possible. The lower rank tensor t2 will be transformed via broadcasting to match the shape of the higher rank tensor t1, and the element-wise operation will be performed as usual.

The concept of broadcasting is the key to understanding how this operation will be carried out. As before, we can check the broadcast transformation using NumPy's broadcast_to() function.

> np.broadcast_to(t2.numpy(), t1.shape)
array([[2., 4.],
       [2., 4.]], dtype=float32)

> t1 + t2
tensor([[3., 5.],
        [3., 5.]])

After broadcasting, the addition operation between these two tensors is a regular element-wise operation between tensors of the same shape.
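
Here is a sketch of the same check done with torch.broadcast_tensors(), along with a case where broadcasting is not possible. The tensor t3 is a hypothetical length-3 tensor added only to show the failure case.

```python
import torch

t1 = torch.ones(2, 2)
t2 = torch.tensor([2, 4], dtype=torch.float32)

# broadcast_tensors returns views of its inputs expanded to a common shape.
b1, b2 = torch.broadcast_tensors(t1, t2)
print(b2)
print(torch.equal(b1 + b2, t1 + t2))

# A length-3 tensor cannot be stretched to line up with a 2 x 2 tensor.
t3 = torch.tensor([1, 2, 3], dtype=torch.float32)
try:
    t1 + t3
    compatible = True
except RuntimeError:
    compatible = False
print(compatible)
```

So broadcasting relaxes the same-shape rule, but only for shapes that can be made to line up.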

Broadcasting is a more advanced topic than the basic element-wise operations, so don't worry if it takes longer to get comfortable with the idea.

Understanding element-wise operations and the same shape requirement provides a basis for the concept of broadcasting and why it is used.

When do we actually use broadcasting? We often need to use broadcasting when we are preprocessing our data, and especially during normalization routines.
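
As a rough sketch of what that can look like, suppose we have a small made-up dataset with three samples and two features. The values and variable names below are hypothetical and chosen so the arithmetic is easy to follow.

```python
import torch

# Three samples (rows), two features (columns). Made-up values.
data = torch.tensor([
    [1., 10.],
    [2., 20.],
    [3., 30.]
])

# Per-feature statistics have shape [2], while data has shape [3, 2].
mean = data.mean(dim=0)
std = data.std(dim=0)

# Broadcasting stretches mean and std across the rows of data.
normalized = (data - mean) / std
print(normalized)
```

Each column ends up with a mean of zero and a standard deviation of one, and we never had to build 3 x 2 versions of mean and std ourselves.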

There is a post in the TensorFlow.js section of the Keras course that covers broadcasting in greater detail. There is a practical example, and the algorithm for determining how a particular tensor is broadcast is also covered, so check that out for a deeper discussion on broadcasting.

Don't worry about not knowing TensorFlow.js. It's not a requirement, and I highly recommend the content there on broadcasting.

Comparison Operations are Element-wise

Comparison operations are also element-wise operations.

For a given comparison operation between two tensors, a new tensor of the same shape is returned with each element containing either a torch.bool value of True or False.

Behavior Change in PyTorch Version 1.2.0

The dtype returned by comparison operations changed from torch.uint8 to torch.bool (#21113).

Version 1.1:

> torch.tensor([1, 2, 3]) < torch.tensor([3, 1, 2])
tensor([1, 0, 0], dtype=torch.uint8)

Version 1.2:

> torch.tensor([1, 2, 3]) < torch.tensor([3, 1, 2])
tensor([True, False, False])

Relevant links:

Release Notes: https://github.com/pytorch/pytorch/releases/tag/v1.2.0
Pull Request: https://github.com/pytorch/pytorch/pull/21113

The examples below show output for PyTorch version 1.2.0 and greater.

Element-wise Comparison Operation Examples

Suppose we have the following tensor:

> t = torch.tensor([
    [0,5,0],
    [6,0,7],
    [0,8,0]
], dtype=torch.float32)

Let's check out some of these comparison operations.

> t.eq(0)
tensor([[True, False, True],
        [False, True, False],
        [True, False, True]])

> t.ge(0)
tensor([[True, True, True],
        [True, True, True],
        [True, True, True]])

> t.gt(0)
tensor([[False, True, False],
        [True, False, True],
        [False, True, False]])

> t.lt(0)
tensor([[False, False, False],
        [False, False, False],
        [False, False, False]])

> t.le(7)
tensor([[True, True, True],
        [True, True, True],
        [True, False, True]])

Thinking about these operations from a broadcasting perspective, we can see that the last one, t.le(7), is really this:

> t <= torch.tensor(
    np.broadcast_to(7, t.shape)
    ,dtype=torch.float32
)

tensor([[True, True, True],
        [True, True, True],
        [True, False, True]])

and equivalently this:

> t <= torch.tensor([
    [7,7,7],
    [7,7,7],
    [7,7,7]
], dtype=torch.float32)

tensor([[True, True, True],
        [True, True, True],
        [True, False, True]])
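
Since the result of a comparison is itself a tensor, we can put it to work. The sketch below counts the zeros in t and pulls out the positive values with a boolean mask. It assumes PyTorch 1.2.0 or greater, and the variable names are made up for the example.

```python
import torch

t = torch.tensor([
    [0, 5, 0],
    [6, 0, 7],
    [0, 8, 0]
], dtype=torch.float32)

zero_mask = t.eq(0)
print(zero_mask.dtype)

# Summing a boolean tensor counts the True values.
num_zeros = zero_mask.sum().item()
print(num_zeros)

# Indexing with a boolean tensor keeps the elements where the mask is True.
positives = t[t.gt(0)]
print(positives)
```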

Element-wise Operations using Functions

With element-wise operations that are functions, it's fine to assume that the function is applied to each element of the tensor.

Here are some examples:

> t.abs() 
tensor([[0., 5., 0.],
        [6., 0., 7.],
        [0., 8., 0.]])

> t.sqrt()
tensor([[0.0000, 2.2361, 0.0000],
        [2.4495, 0.0000, 2.6458],
        [0.0000, 2.8284, 0.0000]])

> t.neg()
tensor([[-0., -5., -0.],
        [-6., -0., -7.],
        [-0., -8., -0.]])

> t.neg().abs()
tensor([[0., 5., 0.],
        [6., 0., 7.],
        [0., 8., 0.]])
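
These methods also have function forms in the torch namespace. Here is a quick sketch that checks the two forms agree and that the shape is preserved.

```python
import torch

t = torch.tensor([
    [0, 5, 0],
    [6, 0, 7],
    [0, 8, 0]
], dtype=torch.float32)

# Function form versus method form.
abs_match = torch.equal(torch.abs(t), t.abs())
sqrt_match = torch.equal(torch.sqrt(t), t.sqrt())
neg_match = torch.equal(torch.neg(t), t.neg())
print(abs_match, sqrt_match, neg_match)

# The function is applied to each element, so the output shape matches the input.
shape_kept = t.sqrt().shape == t.shape
print(shape_kept)
```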

Some terminology

There are some other ways to refer to element-wise operations, so I just wanted to mention that all of these mean the same thing:

Element-wise
Component-wise
Point-wise

Just keep this in mind if you encounter any of these terms in the wild.

Wrapping up

Now, we should have a good understanding of element-wise operations and how they are applied to tensor operations for neural networks and deep learning. In the next post, we will be covering the last two of the four categories of tensor operations, reduction operations and access operations:

Reshaping operations
Element-wise operations
Reduction operations
Access operations

See you in the next one!

Resources

deeplizard on broadcasting: https://deeplizard.com/learn/video/6_33ulFDuCg
Jeremy on broadcasting: https://youtu.be/PGC0UxakTvM?t=3141
fast.ai: http://www.fast.ai/
