## Tensors Explained - Data Structures of Deep Learning

### Introducing tensors for deep learning

Welcome back to this series on neural network programming with PyTorch. In this post, we will kick off section two of the series, which is all about tensors.

We'll talk tensors, terminology, and look at tensor indexes. This will give us the knowledge we need to look at some fundamental tensor attributes that are used in deep learning. Without further ado, let's get started.

### What is a tensor?

The inputs, outputs, and transformations within neural networks are all represented using tensors, and as a result, neural network programming utilizes tensors heavily.

The concept of a tensor is a mathematical generalization of other more specific concepts. Let's look at some specific instances of tensors.

#### Specific instances of tensors

Each of the following examples is a specific instance of the more general concept of a tensor:

- number
- scalar
- array
- vector
- 2d-array
- matrix

Let's organize the above list of example tensors into two groups:

- number, array, 2d-array
- scalar, vector, matrix

The first group of three terms (number, array, 2d-array) are terms that are typically used in computer science, while the second group (scalar, vector, matrix) are terms that are typically used in mathematics.

We often see this kind of thing where different areas of study use different words for the same concept. In deep learning, we usually just refer to all of these as tensors.

Let's investigate these terms further. The terms in each group correspond to one another as we move from left to right. To show this correspondence, we can reshape our list of terms to get three groups of two terms each:

- number, scalar
- array, vector
- 2d-array, matrix

#### Indexes required to access an element

The relationship within each of these pairs is that both elements require the same number of indexes to refer to a specific element within the data structure.

| Indexes required | Computer science | Mathematics |
| --- | --- | --- |
| \(0\) | number | scalar |
| \(1\) | array | vector |
| \(2\) | 2d-array | matrix |

For example, suppose we have this array:

```python
a = [1, 2, 3, 4]
```

Now, suppose we want to access (refer to) the number \(3\) in this data structure. We can do it using a single index like so:

```python
a[2]  # 3
```

This logic works the same for a vector.

As another example, suppose we have this 2d-array:

```python
dd = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9]
]
```

Now, suppose we want to access (refer to) the number \(3\) in this data structure. In this case, we need two indexes to locate the specific element.

```python
dd[0][2]  # 3
```

This logic works the same for a matrix.

Note that if we have a number or scalar, we don't need an index; we can just refer to the number or scalar directly.

This gives us the working knowledge we need, so we are now ready to generalize.
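To put the three cases side by side, here is a minimal sketch in plain Python (nested lists only, no library assumed):

```python
# A number (scalar): zero indexes -- we refer to the value directly.
s = 7

# An array (vector): one index is required.
a = [1, 2, 3, 4]
assert a[2] == 3

# A 2d-array (matrix): two indexes are required.
dd = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
assert dd[0][2] == 3
```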

### Tensors are generalizations

Let's look at what happens when there are more than two indexes required to access (refer to) a specific element within these data structures we have been considering.

When more than two indexes are required to access a specific element, we stop giving specific names to the structures, and we begin using more general language.

#### Mathematics

In mathematics, we stop using words like scalar, vector, and matrix, and we start using the word *tensor* or *nd-tensor*. The \(n\) tells us the number of indexes required to access a specific element within the structure.

#### Computer science

In computer science, we stop using words like number, array, and 2d-array, and we start using the term *multidimensional array* or *nd-array*. The \(n\) tells us the number of indexes required to access a specific element within the structure.

| Indexes required | Computer science | Mathematics |
| --- | --- | --- |
| \(n\) | nd-array | nd-tensor |

Let's make this clear. For practical purposes in neural network programming, tensors and nd-arrays are one and the same.

So tensors are multidimensional arrays or nd-arrays for short. The reason we say a tensor is a generalization is because we use the word tensor for all values of \(n\) like so:

- A scalar is a \(0\) dimensional tensor
- A vector is a \(1\) dimensional tensor
- A matrix is a \(2\) dimensional tensor
- An nd-array is an \(n\) dimensional tensor

Tensors allow us to drop these specific terms and just use an \(n\) to identify the number of dimensions we are working with.
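As a quick sketch of the \(n = 3\) case, a 3d-array needs three indexes to reach a single element (the values below are made up purely for illustration):

```python
# A 3d-array (3d-tensor): three indexes are required
# to refer to a single element.
t = [
    [[1, 2], [3, 4]],
    [[5, 6], [7, 8]],
]
assert t[1][0][1] == 6  # one index per dimension: n = 3
```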

One thing to note about the dimension of a tensor is that it differs from what we mean when we refer to the dimension of a vector in a vector space. The dimension of a tensor does not tell us how many components exist within the tensor.

If we have a three dimensional vector from three dimensional Euclidean space, we have an ordered triple with three components.

A three dimensional tensor, however, can have many more than three components. Our two dimensional tensor `dd`, for example, has nine components.

```python
dd = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9]
]
```
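A quick way to check this in plain Python is to count the entries in each row of `dd` (with a tensor library you would read the element count off the tensor itself, but counting by hand makes the distinction explicit):

```python
dd = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

# The number of dimensions is 2 (two indexes needed),
# but the number of components is 3 * 3 = 9.
num_components = sum(len(row) for row in dd)
assert num_components == 9
```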

### Wrapping up

In the next post, we'll cover the concepts of rank, axes, and shape, and we'll see how to determine the number of components contained within a tensor. These are the fundamental attributes of tensors that we use in deep learning.

Keep indexes in mind as we go over these concepts because indexes give us a concrete way of thinking about tensor related concepts. I'll see you in the next one!
