Neural Network Programming - Deep Learning with PyTorch

with deeplizard.

Stack vs Concat in PyTorch, TensorFlow & NumPy - Deep Learning Tensor Ops

July 18, 2019 by deeplizard


Tensor Ops for Deep Learning: Concatenate vs Stack

Welcome to this neural network programming series. In this episode, we will dissect the difference between concatenating and stacking tensors together. We’ll look at three examples, one with PyTorch, one with TensorFlow, and one with NumPy.

Without further ado, let’s get started.

Existing vs New Axes

The difference between stacking and concatenating tensors can be described in a single sentence, so here goes.

Concatenating joins a sequence of tensors along an existing axis, and stacking joins a sequence of tensors along a new axis.

And that’s all there is to it!

This is the difference between stacking and concatenating. However, the description here is kind of tricky, so let’s look at some examples to get a handle on exactly what this means. We’ll look at stacking and concatenating in PyTorch, TensorFlow, and NumPy. Let’s do it.

For the most part, concatenating along an existing axis of a tensor is pretty straightforward. The confusion usually arises when we want to concat along a new axis. For this, we stack. Another way of saying that we stack is to say that we create a new axis and then concat on that axis.

Join Method    Where
Concatenate    Along an existing axis
Stack          Along a new axis

For this reason, let’s be sure we know how to create a new axis for a given tensor, and then we’ll start stacking and concatenating.

How to Add or Insert an Axis into a Tensor

To demonstrate this idea of adding an axis, we’ll use PyTorch.

import torch
t1 = torch.tensor([1,1,1])

Here, we’re importing PyTorch and creating a simple tensor that has a single axis of length three. Now, to add an axis to a tensor in PyTorch, we use the unsqueeze() function. Note that this is the opposite of squeezing.

> t1.unsqueeze(dim=0)
tensor([[1, 1, 1]])

Here, we are adding an axis, a.k.a. a dimension, at index zero of this tensor. This gives us a tensor with a shape of 1 x 3. When we say index zero of the tensor, we mean the first index of the tensor's shape.

Now, we can also add an axis at the second index of this tensor.

> t1.unsqueeze(dim=1)
tensor([[1],
        [1],
        [1]])

This gives us a tensor with a shape of 3 x 1. Adding axes like this changes the way the data is organized inside the tensor, but it does not change the data itself. Basically, we are just reshaping the tensor. We can see that by checking the shape of each one of these.

> print(t1.shape)
> print(t1.unsqueeze(dim=0).shape)
> print(t1.unsqueeze(dim=1).shape)
torch.Size([3])
torch.Size([1, 3])
torch.Size([3, 1])
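
Since unsqueezing is the opposite of squeezing, we can undo the insertion by squeezing the same axis back out. Here is a quick sketch of that round trip:

> t1.unsqueeze(dim=0).squeeze(dim=0)
tensor([1, 1, 1])

> t1.unsqueeze(dim=0).squeeze(dim=0).shape
torch.Size([3])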

Now, thinking back about concatenating versus stacking: when we concat, we are joining a sequence of tensors along an existing axis. This means that we are extending the length of an existing axis.

When we stack, we are creating a new axis that didn’t exist before. This happens across all the tensors in our sequence, and then we concat along this new axis.

Let’s see how this is done in PyTorch.

Stack vs Cat in PyTorch

With PyTorch, the two functions we use for these operations are stack() and cat(). Let’s create a sequence of tensors.

import torch

t1 = torch.tensor([1,1,1])
t2 = torch.tensor([2,2,2])
t3 = torch.tensor([3,3,3])

Now, let’s concatenate these with one another. Notice that each of these tensors has a single axis. This means that the result of the cat function will also have a single axis. This is because when we concatenate, we do it along an existing axis. Notice that in this example, the only existing axis is the first axis.

> torch.cat(
    (t1,t2,t3)
    ,dim=0
)
tensor([1, 1, 1, 2, 2, 2, 3, 3, 3])

Alright, so we took three single-axis tensors, each having an axis length of three, and now we have a single tensor with an axis length of nine.

Now, let’s stack these tensors along a new axis that we’ll insert. We’ll insert an axis at the first index. Note that this insertion happens implicitly, under the hood, inside the stack function.

> torch.stack(
    (t1,t2,t3)
    ,dim=0
)
tensor([[1, 1, 1],
        [2, 2, 2],
        [3, 3, 3]])

This gives us a new tensor that has a shape of 3 x 3. Notice how the three tensors are concatenated along the first axis of this tensor. Note that we can also insert the new axis explicitly and perform the concatenation directly.

To see that this statement is true, let’s add a new axis of length one to all of our tensors by unsqueezing them, and then cat along the first axis.

> torch.cat(
    (
         t1.unsqueeze(0)
        ,t2.unsqueeze(0)
        ,t3.unsqueeze(0)
    )
    ,dim=0
)
tensor([[1, 1, 1],
        [2, 2, 2],
        [3, 3, 3]])

In this case, we can see that we get the same result that we got by stacking. However, the call to stack was much cleaner because the new axis insertion was handled by the stack function.

Concatenation happens along an existing axis.

Note that we cannot concat this sequence of tensors along the second axis because there currently is no second axis in existence, so in this case, stacking is our only option.
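
As a quick sanity check (a sketch; the exact wording of the error may vary across PyTorch versions), attempting the concat along dim=1 anyway fails, because that dimension is out of range for these single-axis tensors:

> torch.cat(
    (t1,t2,t3)
    ,dim=1
)
IndexError: Dimension out of range (expected to be in range of [-1, 0], but got 1)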

Let’s try stacking along the second axis.

> torch.stack(
    (t1,t2,t3)
    ,dim=1
)
tensor([[1, 2, 3],
        [1, 2, 3],
        [1, 2, 3]])

Alright, we stacked with respect to the second axis, and this is the result. Again, we can get the same result by inserting the new axis manually and concatenating along it.

> torch.cat(
    (
         t1.unsqueeze(1)
        ,t2.unsqueeze(1)
        ,t3.unsqueeze(1)
    )
    ,dim=1
)
tensor([[1, 2, 3],
        [1, 2, 3],
        [1, 2, 3]])

To understand this result, think back to what it looked like when we inserted a new axis at the end of the tensor. Now, we just do that to all of our tensors, and then we can cat them along the second axis. Inspecting the unsqueezed outputs helps make this concrete.

> t1.unsqueeze(1)
tensor([[1],
        [1],
        [1]])

> t2.unsqueeze(1)
tensor([[2],
        [2],
        [2]])
        
> t3.unsqueeze(1)
tensor([[3],
        [3],
        [3]])
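
To sum up the PyTorch behavior in one line (a sketch of the relationship, assuming a sequence of same-shaped tensors named tensors and a new axis index d), stacking along a dimension is the same as unsqueezing each tensor at that dimension and then concatenating along it:

# equivalent ways to join same-shaped tensors along a new axis d
torch.stack(tensors, dim=d)
torch.cat([t.unsqueeze(d) for t in tensors], dim=d)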

Stack vs Concat in TensorFlow

Let's work with TensorFlow now.

import tensorflow as tf

t1 = tf.constant([1,1,1])
t2 = tf.constant([2,2,2])
t3 = tf.constant([3,3,3])

Here, we have imported TensorFlow and created three tensors using the tf.constant() function. Now, let's concatenate these tensors with one another. To do this in TensorFlow, we use the tf.concat() function, and instead of specifying a dim (like with PyTorch), we specify an axis. These two mean the same thing.

> tf.concat(
    (t1,t2,t3)
    ,axis=0
)
<tf.Tensor: id=4, shape=(9,), dtype=int32, numpy=array([1, 1, 1, 2, 2, 2, 3, 3, 3])>

Here, the result is the same as when we did it with PyTorch. Alright, let's stack them now.

> tf.stack(
    (t1,t2,t3)
    ,axis=0
)
<tf.Tensor: id=6, shape=(3, 3), dtype=int32, numpy=
array([[1, 1, 1],
       [2, 2, 2],
       [3, 3, 3]])>

Again, the results are the same as the PyTorch results. Now, we'll concatenate these after manually inserting the new dimension.

> tf.concat(
    (
         tf.expand_dims(t1, 0)
        ,tf.expand_dims(t2, 0)
        ,tf.expand_dims(t3, 0)
    )
    ,axis=0
)
<tf.Tensor: id=15, shape=(3, 3), dtype=int32, numpy=
array([[1, 1, 1],
       [2, 2, 2],
       [3, 3, 3]])>

The difference between this TensorFlow code and the PyTorch call is that the cat() function is here called concat(). Additionally, we use the expand_dims() function to add an axis, as opposed to the unsqueeze() function.

Unsqueezing and expanding dims mean the same thing.
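
To make that parallel concrete (a small sketch; the exact shape repr can vary by TensorFlow version), expand_dims() produces the same shapes we saw from unsqueeze() earlier:

> tf.expand_dims(t1, 0).shape
TensorShape([1, 3])

> tf.expand_dims(t1, 1).shape
TensorShape([3, 1])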

Alright, let's stack with respect to the second axis.

> tf.stack(
    (t1,t2,t3)
    ,axis=1
)
<tf.Tensor: id=17, shape=(3, 3), dtype=int32, numpy=
array([[1, 2, 3],
       [1, 2, 3],
       [1, 2, 3]])>

And in the manual axis insertion way.

> tf.concat(
    (
         tf.expand_dims(t1, 1)
        ,tf.expand_dims(t2, 1)
        ,tf.expand_dims(t3, 1)
    )
    ,axis=1
)
<tf.Tensor: id=26, shape=(3, 3), dtype=int32, numpy=
array([[1, 2, 3],
       [1, 2, 3],
       [1, 2, 3]])>

Observe that these results are consistent with PyTorch.

Stack vs Concatenate in NumPy

Let's work with NumPy now.

import numpy as np

t1 = np.array([1,1,1])
t2 = np.array([2,2,2])
t3 = np.array([3,3,3])

Here, we've created our three tensors. Now, let's concatenate them with one another.

> np.concatenate(
    (t1,t2,t3)
    ,axis=0
)
array([1, 1, 1, 2, 2, 2, 3, 3, 3])

Alright, this gives us what we expect. Note that, like TensorFlow, NumPy also uses the axis parameter name, but here, we are also seeing another naming variation. NumPy uses the full word concatenate as the function name.

Library       Function Name
PyTorch       cat()
TensorFlow    concat()
NumPy         concatenate()

Okay, let's stack now.

> np.stack(
    (t1,t2,t3)
    ,axis=0
)
array([[1, 1, 1],
       [2, 2, 2],
       [3, 3, 3]])

As expected, the result is a rank-2 tensor with a shape of 3 x 3. Now, we'll try the manual way.

> np.concatenate(
    (
         np.expand_dims(t1, 0)
        ,np.expand_dims(t2, 0)
        ,np.expand_dims(t3, 0)
    )
    ,axis=0
)
array([[1, 1, 1],
       [2, 2, 2],
       [3, 3, 3]])

Note that the result is the same as when we used the stack() function. Additionally, observe that NumPy also uses the term expand dims for the function name.
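
As a small aside (a sketch), NumPy also lets us add an axis by indexing with np.newaxis, which gives the same shape as expand_dims():

> np.expand_dims(t1, 0).shape
(1, 3)

> t1[np.newaxis, :].shape
(1, 3)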

Now, we'll finish this off by stacking using the second axis.

> np.stack(
    (t1,t2,t3)
    ,axis=1
)
array([[1, 2, 3],
       [1, 2, 3],
       [1, 2, 3]])

And, with manual insertion.

> np.concatenate(
    (
         np.expand_dims(t1, 1)
        ,np.expand_dims(t2, 1)
        ,np.expand_dims(t3, 1)
    )
    ,axis=1
)
array([[1, 2, 3],
       [1, 2, 3],
       [1, 2, 3]])
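
Before moving on, here is a quick sketch that verifies programmatically that the stacked result and the manually expanded-then-concatenated result are identical:

> np.array_equal(
    np.stack((t1,t2,t3), axis=1)
    ,np.concatenate(
        (np.expand_dims(t1,1), np.expand_dims(t2,1), np.expand_dims(t3,1))
        ,axis=1
    )
)
True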

Stack or Concat: Real-Life Examples

Here are three concrete examples that we can encounter in real life. Let’s decide when we need to stack and when we need to concat.

Joining Images into a Single Batch

Suppose we have three individual images as tensors. Each image tensor has three dimensions: a channel axis, a height axis, and a width axis. Note that each of these tensors is separate from the others. Now, assume that our task is to join these tensors together to form a single batch tensor of three images.

Do we concat or do we stack?

Well, notice that in this example, there are only three dimensions in existence, and for a batch, we need four dimensions. This means that the answer is to stack the tensors along a new axis. This new axis will be the batch axis. This will give us a single tensor with four dimensions by adding one for the batch.

Note that if we join these three along any of the existing dimensions, we would be messing up either the channels, the height, or the width. We don’t want to mess our data up like that.

import torch
t1 = torch.zeros(3,28,28)
t2 = torch.zeros(3,28,28)
t3 = torch.zeros(3,28,28)

torch.stack(
    (t1,t2,t3)
    ,dim=0
).shape

## output ##
torch.Size([3, 3, 28, 28])
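
For contrast (a quick sketch), concatenating the same image tensors along the existing channel axis would just pile up the channels rather than create a batch:

torch.cat(
    (t1,t2,t3)
    ,dim=0
).shape

## output ##
torch.Size([9, 28, 28])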

Joining Batches into a Single Batch

Now, suppose we have the same three images as before, but this time the images already have a dimension for the batch. This actually means we have three batches of size one. Assume that it is our task to obtain a single batch of three images.

Do we concat or stack?

Well, notice how there is an existing dimension that we can concat on. This means that we concat these along the batch dimension. In this case there is no need to stack.

Here is a code example of this:

import torch
t1 = torch.zeros(1,3,28,28)
t2 = torch.zeros(1,3,28,28)
t3 = torch.zeros(1,3,28,28)
torch.cat(
    (t1,t2,t3)
    ,dim=0
).shape

## output ##
torch.Size([3, 3, 28, 28])

Let’s see a third example. This one is harder, or at least more advanced. You’ll see why.

Joining Images with an Existing Batch

Suppose we have the same three separate image tensors. Only, this time, we already have a batch tensor. Assume our task is to join these three separate images with the batch.

Do we concat or do we stack?

Well, notice how the batch axis already exists inside the batch tensor. However, for the images, there is no batch axis in existence. This means neither of these will work. To join with stack or cat, we need the tensors to have matching shapes. So then, are we stuck? Is this impossible?

It is indeed possible. It’s actually a very common task. The answer is to first stack and then to concat.

We first stack the three image tensors with respect to the first dimension. This creates a new batch dimension of length three. Then, we can concat this new tensor with the batch tensor.

Let's see an example of this in code:

import torch
batch = torch.zeros(3,3,28,28)
t1 = torch.zeros(3,28,28)
t2 = torch.zeros(3,28,28)
t3 = torch.zeros(3,28,28)
torch.cat(
    (
        batch
        ,torch.stack(
            (t1,t2,t3)
            ,dim=0
        )
    )
    ,dim=0
).shape

## output ##
torch.Size([6, 3, 28, 28])

Equivalently, we can unsqueeze each image individually and concat everything in one call:

import torch
batch = torch.zeros(3,3,28,28)
t1 = torch.zeros(3,28,28)
t2 = torch.zeros(3,28,28)
t3 = torch.zeros(3,28,28)
torch.cat(
    (
        batch
        ,t1.unsqueeze(0)
        ,t2.unsqueeze(0)
        ,t3.unsqueeze(0)
    )
    ,dim=0
).shape

## output ##
torch.Size([6, 3, 28, 28])

I hope this helps and you get it now.
