PyTorch study notes (1) - PyTorch basics

Preface


For more article code details, please check the blogger’s personal website: https://www.iwtmbtly.com/


Import the required libraries and files:

>>> import torch
>>> import numpy as np

1. What is Tensor

In deep learning, everything from the way data is organized to the internal parameters of a model is represented and processed with a data structure called a tensor.

Tensor is a very basic concept in deep learning frameworks and one of the most important ideas in PyTorch and TensorFlow. It is a structure for storing and processing data.

Recall the several data representations we currently know:

  • A scalar (Scalar) is a quantity with only magnitude and no direction, such as 1.8, e, or 10.
  • A vector (Vector) is a quantity with both magnitude and direction, such as (1, 2, 3, 4).
  • A matrix (Matrix) is a quantity obtained by putting several vectors together, such as [(1, 2, 3), (4, 5, 6)].

To help you better understand scalars, vectors, and matrices, here is a schematic diagram:

[Figure: schematic of a scalar, a vector, and a matrix]

It is not hard to see that these data representations are actually related: scalars can be combined into vectors, and vectors can be combined into matrices. So, can we describe all of them with a single, unified form of data?

The answer is yes. In PyTorch, this unified data form is called a Tensor. Given the relationship between scalars, vectors, and matrices, you might think of them as Tensors of different "dimensions". That statement is correct, but not entirely.

It is not entirely correct because, for Tensors, we prefer the term Rank to describe this "dimension": a scalar is a Tensor of rank 0, a vector is a Tensor of rank 1, and a matrix is a Tensor of rank 2. There are also Tensors with rank greater than 2. Of course, calling it the dimension is not wrong either, and many people usually do.
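To make the idea of rank concrete, here is a small sketch using the dim() method, which returns the rank of a tensor:

>>> torch.tensor(1.8).dim()                      # scalar: rank 0
0
>>> torch.tensor([1, 2, 3]).dim()                # vector: rank 1
1
>>> torch.tensor([[1, 2, 3], [4, 5, 6]]).dim()   # matrix: rank 2
2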

Now that we have covered what a Tensor means, let's look at the types of Tensor and how to create one.

2. The type, creation and conversion of Tensor

Under different deep learning frameworks, Tensors behave similarly and are used in similar ways. In this lesson, we will learn with PyTorch's usage as the example.

Tensor type

In PyTorch, Tensor supports many data types. Here are some of the more commonly used ones:

[Table: commonly used Tensor data types]

Generally speaking, torch.float32, torch.float64, torch.uint8, and torch.int64 are used most often, but this is not absolute; choose according to the actual situation.

Tensor creation

PyTorch makes working with Tensors very convenient: you can create a Tensor of any shape in many different ways, and each of them is simple. Let's take a look.

Create directly

First, let's look at direct creation, which is also the simplest way. It uses the torch.tensor function shown below.

torch.tensor(data, dtype=None, device=None, requires_grad=False)

For example:

>>> A = torch.tensor([x for x in range(6)])
>>> A
tensor([0, 1, 2, 3, 4, 5])

Combined with the code, let's see what the parameters mean.

Let's go through them from left to right. The first is data, the data we want to turn into a Tensor. PyTorch accepts a list, tuple, NumPy array, scalar, and other types, and converts them into a tensor.

Then there is dtype, which declares the type of Tensor you want back. For the specific types, refer to the Tensor types listed in the table above.

Then there is device, which specifies the device the returned Tensor should be placed on. For now you don't need to worry about it; the default is fine.

The last parameter is requires_grad, which indicates whether this Tensor needs to keep gradient information during computation. In PyTorch, only when a Tensor has requires_grad set to True will gradients be computed for it and for the Tensors derived from it; the gradient values are then stored in the Tensor's grad attribute, which makes it convenient for the optimizer to update the parameters.

Note that requires_grad should be set flexibly. During training it should be True, so that gradients can be computed and the parameters updated. During validation or testing, our goal is to check the generalization ability of the current model, so requires_grad should be set to False to avoid the parameters being updated automatically according to the loss.
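As a minimal sketch of the point above, the example below creates a Tensor with requires_grad=True, computes a value from it, and reads the gradient back from the grad attribute (the numbers are just for illustration):

>>> x = torch.tensor([2.0, 3.0], requires_grad=True)
>>> y = (x * x).sum()   # y is computed from x, so gradients can flow back to x
>>> y.backward()
>>> x.grad              # dy/dx = 2x
tensor([4., 6.])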

Created from NumPy

In practical applications, we often use NumPy in the data-processing stage, and once the data is processed we need a Tensor to pass it into a PyTorch deep learning model. For this, PyTorch provides a function that converts a NumPy array into a Tensor:

torch.from_numpy(ndarray)

For example:

>>> A = torch.from_numpy(np.array([x for x in range(6)]))
>>> A
tensor([0, 1, 2, 3, 4, 5])

Sometimes, during model development, we need a Tensor of a specific form, such as all zeros or all ones. We could create it this way: first generate a NumPy array full of zeros, then convert it into a Tensor. But that is rather cumbersome, because it means importing an extra package (NumPy) and writing more code, which increases the chance of errors.
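For illustration, the roundabout route just described would look like the sketch below (note the extra dependency on NumPy):

>>> A = torch.from_numpy(np.zeros((2, 3)))   # build the zeros in NumPy, then convert
>>> A
tensor([[0., 0., 0.],
        [0., 0., 0.]], dtype=torch.float64)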

But don't worry, PyTorch provides simpler built-in methods; let's look at them next.

Create a special form of Tensor

Let's look at the following commonly used functions, all of which come up frequently when building PyTorch models.

Create a zero-matrix Tensor: as the name implies, a zero matrix is a matrix in which all elements are 0.

torch.zeros(size, dtype=None...)

For example:

>>> A = torch.zeros((3,4), dtype=float)
>>> A
tensor([[0., 0., 0., 0.],
        [0., 0., 0., 0.],
        [0., 0., 0., 0.]], dtype=torch.float64)

Among these, the size and dtype parameters are used most often. size is a sequence of integers defining the shape of the output tensor.

You may have noticed the ellipsis in the parameter list; it means torch.zeros has more parameters. Since we are introducing the concept of the zero matrix here, the shape is what matters most; the other parameters (such as requires_grad, mentioned above) are not relevant to this, so we won't pay attention to them at this stage.

Create an identity-matrix Tensor: an identity matrix is a matrix whose elements on the main diagonal are all 1 and whose other elements are all 0.

torch.eye(n, dtype=None...)

For example:

>>> A = torch.eye(6)
>>> A
tensor([[1., 0., 0., 0., 0., 0.],
        [0., 1., 0., 0., 0., 0.],
        [0., 0., 1., 0., 0., 0.],
        [0., 0., 0., 1., 0., 0.],
        [0., 0., 0., 0., 1., 0.],
        [0., 0., 0., 0., 0., 1.]])

Create an all-ones matrix Tensor: an all-ones matrix, as the name implies, is a matrix in which all elements are 1.

torch.ones(size, dtype=None...)

For example:

>>> A = torch.ones((6,6))
>>> A
tensor([[1., 1., 1., 1., 1., 1.],
        [1., 1., 1., 1., 1., 1.],
        [1., 1., 1., 1., 1., 1.],
        [1., 1., 1., 1., 1., 1.],
        [1., 1., 1., 1., 1., 1.],
        [1., 1., 1., 1., 1., 1.]])

Create a random matrix Tensor: There are several commonly used random matrix creation methods in PyTorch, as follows.

torch.rand(size)
torch.randn(size)
torch.normal(mean, std, size)
torch.randint(low, high, size)

Each of these methods has different usages, and you can use them flexibly according to your own needs.

  • torch.rand generates a random floating-point Tensor of the specified shape, with values uniformly distributed on the interval [0, 1).

  • torch.randn generates a random floating-point Tensor of the specified shape, with values drawn from the standard normal distribution (mean 0, variance 1).

  • torch.normal generates a random floating-point Tensor of the specified shape, with the mean and standard deviation given explicitly.

  • torch.randint generates a Tensor of random integers, uniformly drawn from [low, high) (see the sketch below).
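Here is a quick sketch of each of them; the outputs are omitted because the values are random:

>>> torch.rand(2, 3)                 # uniform on [0, 1)
>>> torch.randn(2, 3)                # standard normal: mean 0, std 1
>>> torch.normal(0.0, 2.0, (2, 3))   # normal with mean 0.0 and std 2.0
>>> torch.randint(0, 10, (2, 3))     # integers uniformly drawn from [0, 10)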

Tensor conversion

In actual projects, we deal with many data types, such as int, list, NumPy arrays, and so on. To let data flow freely between all stages, conversion between these types and Tensor is very important. Next, let's look at how int, list, and NumPy arrays convert to and from Tensor.

Conversion between int and Tensor:

>>> a = torch.Tensor([1])
>>> b = a.item()
>>> b
1.0

We convert a number (scalar) into a Tensor with torch.Tensor, and then convert the Tensor back into a number with the item() function, which turns a single-element Tensor into a Python number.

Conversion of list and tensor:

>>> a = [1, 2, 3]
>>> b = torch.Tensor(a)
>>> c = b.numpy().tolist()
>>> c
[1.0, 2.0, 3.0]

Here, for a list a, we again use torch.Tensor directly to convert it into a Tensor. The reverse direction takes one more step: we first convert the Tensor into a NumPy array, then use the tolist() function to get a list.

Conversion between NumPy and Tensor:

With the previous two examples, can you imagine how to convert NumPy to Tensor? Yes, we can still use torch.Tensor, isn't it very convenient?
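A short sketch of the round trip (torch.from_numpy(a) would work for the first step as well):

>>> a = np.array([1.0, 2.0, 3.0])
>>> b = torch.Tensor(a)   # NumPy -> Tensor
>>> c = b.numpy()         # Tensor -> NumPy
>>> c
array([1., 2., 3.], dtype=float32)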

Conversion between CPU and GPU Tensor:

CPU->GPU: data.cuda()
GPU->CPU: data.cpu()
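A minimal sketch, assuming a machine with a CUDA-capable GPU (the availability check keeps it safe on CPU-only machines):

>>> data = torch.ones(2, 2)
>>> if torch.cuda.is_available():
...     data_gpu = data.cuda()       # CPU -> GPU
...     data_back = data_gpu.cpu()   # GPU -> CPU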

3. Common operations of Tensor

Ok, so far we have learned about the types of Tensor, how to create Tensors, and how to convert between Tensors and some common data types. Tensor also has a number of other commonly used operations, such as getting the shape, transposing dimensions, reshaping, and adding or removing dimensions. Let's look at these next.

Get the shape

In designing a deep learning network, we need to know the state of a Tensor at all times, which includes getting its shape.

To get the shape of a Tensor, we can use either shape or size(). The difference is that shape is an attribute of a tensor, while size() is a method.

>>> a = torch.zeros(2, 3, 5)
>>> a.shape
torch.Size([2, 3, 5])
>>> a.size()
torch.Size([2, 3, 5])

[Figure: a Tensor of shape [2, 3, 5]]

Knowing the shape of a Tensor, we can work out how many elements it contains by multiplying the sizes of all its dimensions; for example, Tensor a above contains 2 × 3 × 5 = 30 elements. Since doing this by hand is a bit cumbersome, PyTorch provides the numel() function to count the elements directly.

>>> a.numel()
30

Matrix transpose (dimension conversion)

PyTorch has two functions, permute() and transpose(), that transpose a matrix or, more generally, exchange data between dimensions. We use them, for example, when adapting the output shape of a convolutional layer, reordering channels, or reshaping data for a fully connected layer.

Among them, the permute function can transpose a matrix of any dimensionality, but it is called as a method on a tensor, i.e. tensor.permute(). Let's look at the code first:

>>> x = torch.rand(2,3,5)
>>> x.shape
torch.Size([2, 3, 5])
>>> x = x.permute(2,1,0)
>>> x.shape
torch.Size([5, 3, 2])

[Figure: permuting a Tensor from shape [2, 3, 5] to [5, 3, 2]]

Notice that the original Tensor has shape [2, 3, 5]. In permute we list, for each new position, which original dimension goes there. In x.permute(2, 1, 0), the 2 means the original dimension 2 now comes first; the 1 means the original dimension 1 stays in position 1; the 0 means the original dimension 0 moves to position 2. The shape therefore becomes [5, 3, 2].

The other function, transpose, differs from permute in that it can only swap two dimensions at a time, i.e. exchange the data of two dimensions. Let's look at the code:

>>> x.shape
torch.Size([5, 3, 2])
>>> x = x.transpose(1,0)
>>> x.shape
torch.Size([3, 5, 2])

It should be noted that the data produced by transpose or permute is no longer contiguous in memory. What does that mean?

Still following the previous example, the tensor returned by torch.rand(2,3,5) is laid out contiguously in memory. After transpose or permute, such as transpose(1,0), the underlying memory does not change; the data only "looks" as if dimensions 0 and 1 have been exchanged (the current dimension 0 is the original dimension 1), so the Tensor becomes non-contiguous.

Then you may ask: so what if it is not contiguous? It doesn't seem to matter, right? If you think so, you are being careless. Let's continue with the shape transformation of Tensor; once you have seen it, you will know the consequences of a non-contiguous Tensor.
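You can also check contiguity yourself with the is_contiguous() method; a small sketch:

>>> x = torch.rand(2, 3, 5)
>>> x.is_contiguous()
True
>>> y = x.permute(2, 1, 0)
>>> y.is_contiguous()
False
>>> y.contiguous().is_contiguous()   # contiguous() copies the data into a contiguous layout
True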

Shape transformation

There are two commonly used functions for changing shapes in PyTorch, namely view and reshape. Let's take a look at the view first.

>>> x = torch.randn(4, 4)
>>> x.shape
torch.Size([4, 4])
>>> x = x.view(2,8)
>>> x.shape
torch.Size([2, 8])

We first declare a Tensor with a size of [4, 4], and then modify it to a Tensor with a shape of [2, 8] through the view function. Let's continue with the x just now, and do one more operation, the code is as follows:

>>> x = x.permute(1,0)
>>> x.shape
torch.Size([8, 2])
>>> x.view(4, 4)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
RuntimeError: view size is not compatible with input tensor's size and stride (at least one dimension spans across two contiguous subspaces). Use .reshape(...) instead.

Combined with the code, we can see that using permute we exchanged the data of dimensions 0 and 1 and obtained a Tensor of shape [8, 2]. When we then call view on this new Tensor, an error is suddenly raised. Why? Because view cannot handle a Tensor whose memory is not contiguous.

What should we do at this time? We can use another function, reshape:

>>> x = x.reshape(4, 4)
>>> x.shape
torch.Size([4, 4])
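Alternatively, you can first make the Tensor contiguous with contiguous() and then call view; a sketch:

>>> x = torch.randn(4, 4).permute(1, 0)
>>> x.is_contiguous()
False
>>> x.contiguous().view(4, 4).shape   # a contiguous copy can be viewed as usual
torch.Size([4, 4])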

Increase or decrease dimension

Sometimes we need to add or remove certain dimensions of a Tensor, such as removing or adding the channel dimension of an image. PyTorch provides the squeeze() and unsqueeze() functions for this.

Let's look at squeeze() first. If the dimension specified by dim has size 1, that dimension is removed; if its size is not 1, the original Tensor is returned unchanged. To make this easier to understand, here are some examples.

>>> x = torch.rand(2,1,3)
>>> x.shape
torch.Size([2, 1, 3])
>>> y = x.squeeze(1)
>>> y.shape
torch.Size([2, 3])
>>> z = y.squeeze(1)
>>> z.shape
torch.Size([2, 3])

Combined with the code, we can see that we created a new Tensor of shape [2, 1, 3] and then removed dimension 1 to get y. The squeeze succeeded because dimension 1 has size 1. However, when we try to squeeze dimension 1 of y again, nothing happens: dimension 1 of y now has size 3, so squeeze cannot remove it.

unsqueeze(): this function expands the dimensions by inserting a dimension of size 1 at the specified position. Let's also look at a code example.

>>> x = torch.rand(2,1,3)
>>> y = x.unsqueeze(2)
>>> y.shape
torch.Size([2, 1, 1, 3])

Here we create a new Tensor of shape [2, 1, 3], and then insert a new dimension of size 1 at position 2, obtaining a tensor of shape [2, 1, 1, 3].
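A common use of unsqueeze is adding a batch dimension to a single image before feeding it to a model; a sketch (the 3x28x28 image size is made up for illustration):

>>> img = torch.rand(3, 28, 28)   # a single 3-channel image
>>> batch = img.unsqueeze(0)      # insert a batch dimension of size 1 at position 0
>>> batch.shape
torch.Size([1, 3, 28, 28])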

4. Connection operation of Tensor

In project development, the data entering a given layer of a deep learning model may come from several different sources and therefore needs to be combined; the operation that combines it is called concatenation (connection).

cat

The connection operation function is as follows:

torch.cat(tensors, dim = 0, out = None)

cat is short for concatenate, that is, splicing or joining. This function has two important parameters that you need to master.

The first parameter is tensors, which is easy to understand: it is the sequence of Tensors we want to concatenate.

The second parameter is dim. Recall the definition of a Tensor: its rank can vary. For example, two 3-dimensional Tensors can be concatenated in several different ways (as shown in the figure below), and the dim parameter specifies which one.

[Figure: concatenating two 3-dimensional Tensors along different dimensions]

The three-dimensional picture above may look rather abstract, so let's start with the simple two-dimensional case. We first declare two 3x3 matrices. The code is as follows:

>>> A=torch.ones(3,3)
>>> B=2*torch.ones(3,3)
>>> A
tensor([[1., 1., 1.],
        [1., 1., 1.],
        [1., 1., 1.]])
>>> B
tensor([[2., 2., 2.],
        [2., 2., 2.],
        [2., 2., 2.]])

Let's first look at the case of dim=0, what is the result of splicing:

>>> C=torch.cat((A,B),0)
>>> C
tensor([[1., 1., 1.],
        [1., 1., 1.],
        [1., 1., 1.],
        [2., 2., 2.],
        [2., 2., 2.],
        [2., 2., 2.]])

You will find that the two matrices are concatenated in the "row" direction.

Let's take a look at what happens when dim=1:

>>> D=torch.cat((A,B),1)
>>> D
tensor([[1., 1., 1., 2., 2., 2.],
        [1., 1., 1., 2., 2., 2.],
        [1., 1., 1., 2., 2., 2.]])

Obviously, the two matrices are concatenated along the "column" direction. So what if the Tensors are three-dimensional or even higher-dimensional? The principle is the same: whatever the value of dim, the two Tensors are joined along that dimension.
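As a sketch with two 3-dimensional Tensors, you can see how the resulting shape changes with dim:

>>> A = torch.ones(2, 3, 3)
>>> B = 2 * torch.ones(2, 3, 3)
>>> torch.cat((A, B), 0).shape   # joined along dimension 0
torch.Size([4, 3, 3])
>>> torch.cat((A, B), 2).shape   # joined along dimension 2
torch.Size([2, 3, 6])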

Seeing this, you may ask: cat joins multiple Tensors along existing dimensions, so what should you do if you want to join them along a new dimension? This is where the stack function comes in.

stack

To deepen your understanding, let's look at a concrete example. Suppose we have two two-dimensional matrix Tensors and "stack" them together to form a three-dimensional Tensor, as shown below:

[Figure: stacking two 2-dimensional Tensors into a 3-dimensional Tensor]

This means the original rank was 2 and has now become 3: a three-dimensional structure with one added dimension. Note how this differs from cat: in the cat schematic the Tensors were 3-dimensional before and after, whereas here we go from 2 dimensions to 3.

In real image-algorithm development, we sometimes need to combine several single-channel Tensors (2-dimensional) into a multi-channel result (3-dimensional). This dimension-adding way of joining is what we call stack.

The stack function is defined as follows:

torch.stack(inputs, dim=0)

Here, inputs is the sequence of Tensors to be stacked, and dim is the position of the newly created dimension.

How to use the stack? Let's look at an example together:

>>> A=torch.arange(0,4)
>>> A
tensor([0, 1, 2, 3])
>>> B=torch.arange(5,9)
>>> B
tensor([5, 6, 7, 8])
>>> C=torch.stack((A,B),0)
>>> C
tensor([[0, 1, 2, 3],
        [5, 6, 7, 8]])
>>> D=torch.stack((A,B),1)
>>> D
tensor([[0, 5],
        [1, 6],
        [2, 7],
        [3, 8]])

Combined with the code, we can see that we first construct two rank-1 vectors A and B with 4 elements each. Then we create a new dimension at dim=0, the "row" direction, so the rank becomes 2 and we get C. For D, we create the new dimension at dim=1, the "column" direction.
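As a sketch of the image use case mentioned above (the 32x32 size is made up), stacking three single-channel Tensors gives a 3-channel one:

>>> r = torch.rand(32, 32)            # three single-channel "images"
>>> g = torch.rand(32, 32)
>>> b = torch.rand(32, 32)
>>> img = torch.stack((r, g, b), 0)   # stack along a new channel dimension
>>> img.shape
torch.Size([3, 32, 32])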

5. Segmentation operations of Tensor

After learning the connection operation, let's take a look at the inverse operation of the connection: segmentation.

Segmentation is the inverse of concatenation. From what we have just seen, you can probably guess that there are several different ways to slice a Tensor up. That's right: there are three main segmentation operations: chunk, split, and unbind.

At first glance, there are quite a few, but in fact, they have their own characteristics and are suitable for different usage scenarios. Let us take a look together.

chunk

The role of chunk is to divide a Tensor into a given number of pieces, as evenly as possible, along the declared dim.

For example, suppose we have a feature map with 32 channels and need to divide it evenly into 4 groups of 8 channels each; this split can be done with the chunk function, which is defined as follows:

torch.chunk(input, chunks, dim=0)

Let's take a look at the three parameters involved in the function one by one:

The first is input, which represents the Tensor to be chunked.

Next is chunks, which is the number of pieces to split into, not the size of each piece. Note that chunks must be an integer.

Finally, dim, think about what this parameter means? Yes, that is according to which dimension to chunk.

Still the same as before, let's get an intuitive feel through a few code examples. We start with a simple 1D vector:

>>> A=torch.tensor([x for x in range(1,11)])
>>> B = torch.chunk(A, 2, 0)
>>> B
(tensor([1, 2, 3, 4, 5]), tensor([ 6,  7,  8,  9, 10]))

Here we use the chunk function to split the original 10-element Tensor A into two vectors of length 5 each. (Note that B is a tuple containing the two resulting pieces.)

So what will happen if the chunk parameter is not divisible? Let's look down:

>>> B = torch.chunk(A, 3, 0)
>>> B
(tensor([1, 2, 3, 4]), tensor([5, 6, 7, 8]), tensor([ 9, 10]))

We find that the 10-element Tensor A is split into three vectors of lengths 4, 4, and 2. Why is it split this way? Wouldn't 3, 3, 4 be more even?

To answer that, we have to find the rule. Let's look at a bigger example and make A 17 elements long.

>>> A=torch.tensor([x for x in range(1, 18)])
>>> B = torch.chunk(A, 4, 0)
>>> B
(tensor([1, 2, 3, 4, 5]), tensor([ 6,  7,  8,  9, 10]), tensor([11, 12, 13, 14, 15]), tensor([16, 17]))

The 17-element Tensor A is split into four vectors of lengths 5, 5, 5, and 2. Now the rule becomes clear: to compute the size of each piece, chunk divides the length by the number of chunks and rounds up.

In the example above, 17 / 4 = 4.25, which rounds up to 5, so chunk produces vectors of length 5 one by one and puts whatever remains into the last vector (here of length 2).

So what happens if chunks is larger than the length of the dimension being split? Let's try it; the code is as follows:

>>> A=torch.tensor([1,2,3])
>>> B = torch.chunk(A, 5, 0)
>>> B
(tensor([1]), tensor([2]), tensor([3]))

Obviously, the Tensor can then only be divided into vectors of length 1, and we get fewer pieces than requested.

From this, we can infer the two-dimensional situation. Let's take another example to see the situation of the two-dimensional matrix Tensor:

>>> A=torch.ones(4,4)
>>> A
tensor([[1., 1., 1., 1.],
        [1., 1., 1., 1.],
        [1., 1., 1., 1.],
        [1., 1., 1., 1.]])
>>> B = torch.chunk(A, 2, 0)
>>> B
(tensor([[1., 1., 1., 1.],
        [1., 1., 1., 1.]]), 
 tensor([[1., 1., 1., 1.],
        [1., 1., 1., 1.]]))

Just as with cat, the dim parameter here specifies along which dimension to split.
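Going back to the 32-channel example mentioned at the start of this subsection, a sketch (the 8x8 feature-map size is made up):

>>> feat = torch.rand(32, 8, 8)        # a feature map with 32 channels
>>> groups = torch.chunk(feat, 4, 0)   # split the channel dimension into 4 groups
>>> len(groups), groups[0].shape
(4, torch.Size([8, 8, 8]))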

The chunk function we just introduced splits by "number of pieces"; what if you want to split by "size of each piece"? PyTorch also provides a corresponding method, called split.

split

The function of split is defined as follows. As before, let's take a look at the parameters involved here.

torch.split(tensor, split_size_or_sections, dim=0)

The first is tensor, which is the Tensor to be divided.

Then there is the split_size_or_sections parameter. When it is an integer, the tensor is cut into blocks of that size along dim; when it is a list, the tensor is cut into blocks whose sizes match the elements of the list.

The last is also dim, which defines which dimension to split.

Similarly, let's take a few examples to see the specific operation of split. The first is the case where split_size_or_sections is an integer:

>>> A=torch.rand(4,4)
>>> A
tensor([[0.3480, 0.4864, 0.2071, 0.6849],
        [0.5214, 0.6468, 0.9263, 0.6596],
        [0.8465, 0.0030, 0.9253, 0.3342],
        [0.7469, 0.4648, 0.8200, 0.4193]])
>>> B=torch.split(A, 2, 0)
>>> B
(tensor([[0.3480, 0.4864, 0.2071, 0.6849],
        [0.5214, 0.6468, 0.9263, 0.6596]]), tensor([[0.8465, 0.0030, 0.9253, 0.3342],
        [0.7469, 0.4648, 0.8200, 0.4193]]))

In this example, the original 4x4 Tensor A is split along dimension 0, the "row" direction, with 2 rows per group, giving two Tensors of size 2x4.

So the question is: what happens if split_size_or_sections does not evenly divide the size along that dimension? Let's modify the code slightly:

>>> C=torch.split(A, 3, 0)
>>> C
(tensor([[0.3480, 0.4864, 0.2071, 0.6849],
        [0.5214, 0.6468, 0.9263, 0.6596],
        [0.8465, 0.0030, 0.9253, 0.3342]]), tensor([[0.7469, 0.4648, 0.8200, 0.4193]]))

From this code we can see that PyTorch fills each piece up to split_size_or_sections along dim for as long as it can; whatever is left at the end is returned together as the last piece.

Next, let's look at the case where split_size_or_sections is a list. As mentioned above, the tensor is then cut into blocks whose sizes match the elements of the list. Here is the corresponding code:

>>> A=torch.rand(5,4)
>>> A
tensor([[0.0871, 0.9387, 0.2978, 0.8540],
        [0.4216, 0.5009, 0.6090, 0.1782],
        [0.7486, 0.6665, 0.3248, 0.9010],
        [0.0457, 0.1507, 0.5208, 0.3595],
        [0.4709, 0.9482, 0.0524, 0.0906]])
>>> B=torch.split(A,(2,3),0)
>>> B
(tensor([[0.0871, 0.9387, 0.2978, 0.8540],
        [0.4216, 0.5009, 0.6090, 0.1782]]), tensor([[0.7486, 0.6665, 0.3248, 0.9010],
        [0.0457, 0.1507, 0.5208, 0.3595],
        [0.4709, 0.9482, 0.0524, 0.0906]]))

How should we read this code? It is straightforward: Tensor A is split along dimension 0, and the resulting pieces have sizes 2 (rows) and 3 (rows) along that dimension.

unbind

From the previous functions we know how to split into a fixed number of pieces or into pieces of a fixed size. Now imagine a scenario: we have a Tensor holding a 3-channel image and want to get the data of each channel one by one. How should we do it?

If we use chunk, we need to set chunks to 3; if we use split, we need to set split_size_or_sections to 1.

Both achieve the goal, but when the number of channels is large, fetching the pieces one by one becomes tedious. In that case you need another function, unbind, which is defined as follows:

torch.unbind(input, dim=0)

Among them, input represents the Tensor to be processed, and dim is still the same as the previous function, representing the direction of the slice.

Let's understand with examples:

>>> A=torch.arange(16).view(4,4)
>>> A
tensor([[ 0,  1,  2,  3],
        [ 4,  5,  6,  7],
        [ 8,  9, 10, 11],
        [12, 13, 14, 15]])
>>> b=torch.unbind(A, 0)
>>> b
(tensor([0, 1, 2, 3]), tensor([4, 5, 6, 7]), tensor([ 8,  9, 10, 11]), tensor([12, 13, 14, 15]))

In this example, we first create a 4x4 two-dimensional matrix Tensor and then unbind it along dimension 0, the "row" direction; because the matrix has 4 rows, we get 4 results.

Next, let's take a look: If we split from the first dimension, that is, the "column" direction, what will be the result:

>>> b=torch.unbind(A, 1)
>>> b
(tensor([ 0,  4,  8, 12]), tensor([ 1,  5,  9, 13]), tensor([ 2,  6, 10, 14]), tensor([ 3,  7, 11, 15]))

As you can see, the Tensor is taken apart along the "column" direction. So unbind is a splitting method that also reduces the rank: each result is what remains after the split dimension is removed.
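Returning to the 3-channel image scenario above, a sketch (the 3x4x4 size is just for illustration):

>>> img = torch.rand(3, 4, 4)        # a 3-channel "image"
>>> r, g, b = torch.unbind(img, 0)   # one Tensor per channel
>>> r.shape
torch.Size([4, 4])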

6. Index operation of Tensor

Have you noticed that in the chunk and split operations we just covered, we split the whole Tensor and get back all the pieces? But sometimes we only need part of the data. How do we do that? A natural idea is to tell the Tensor directly which parts we want; this is called an indexing operation.

There are many ways to index: some are ready-made APIs and some are user-defined operations. The two most commonly used are index_select and masked_select; let's look at each of them.

index_select

Here you need the index_select function, which is defined as follows:

torch.index_select(tensor, dim, index)

The tensor and dim here are the same as in the previous functions, so I won't repeat them. Let's focus on index, which indicates which positions along the dim dimension to select. Note that index must be a torch.Tensor.

Still the same as before, let's look at a few sample codes:

>>> A=torch.arange(16).view(4,4)
>>> A
tensor([[ 0,  1,  2,  3],
        [ 4,  5,  6,  7],
        [ 8,  9, 10, 11],
        [12, 13, 14, 15]])
>>> B=torch.index_select(A,0,torch.tensor([1,3]))
>>> B
tensor([[ 4,  5,  6,  7],
        [12, 13, 14, 15]])
>>> C=torch.index_select(A,1,torch.tensor([0,3]))
>>> C
tensor([[ 0,  3],
        [ 4,  7],
        [ 8, 11],
        [12, 15]])

In this example, we first create a 4x4 matrix Tensor A. Then, along dimension 0, we select rows 1 and 3 and get Tensor B, of size 2x4. Then, along dimension 1, we select columns 0 and 3 of Tensor A and get Tensor C, of size 4x2.

Isn't that simple?

masked_select

The index_select function just introduced extracts data based on given indices. But sometimes we also want to select by some condition, for example extracting the parameters in a certain layer of a deep learning network whose values are greater than 0.

At this time, you need to use the masked_select function provided by PyTorch. Let's first look at its definition:

torch.masked_select(input, mask, out=None)

Here we only need to care about the first two parameters, input and mask.

input is the Tensor to be processed. mask is the mask tensor, a Boolean tensor marking which elements satisfy the condition. Note that mask does not need to have the same shape as input, but the two must be broadcastable with each other.

Still a little foggy? Let me give an example; after you read it, it will be clear at once.

Have you ever wondered what happens if we compare a Tensor with a number? For example, in the following code we randomly generate a 5-element Tensor A:

>>> A=torch.rand(5)
>>> A
tensor([0.7484, 0.7311, 0.6890, 0.0034, 0.3469])
>>> B=A>0.3
>>> B
tensor([ True,  True,  True, False,  True])

In this code, we compare A with 0.3 and get a new Tensor whose entries indicate whether the corresponding values in A are greater than 0.3.

For example, the first value, 0.7484, is greater than 0.3, so it is True; the fourth value, 0.0034, is less than 0.3, so it is False.

This new Tensor is exactly a mask tensor: each of its entries records whether a condition holds.

Then, we continue to write a piece of code to see what the result of the selection based on mask B is:

>>> C=torch.masked_select(A, B)
>>> C
tensor([0.7484, 0.7311, 0.6890, 0.3469])

You will find that C contains exactly the elements of A at the positions where B is True.

Now you should see what masked_select does: we build a mask tensor from the condition we want to filter by, and then use it to extract data from the Tensor.

According to this idea, the above example can be simplified to:

>>> A=torch.rand(5)
>>> A
tensor([0.8546, 0.7931, 0.5147, 0.7306, 0.2706])
>>> C=torch.masked_select(A, A>0.3)
>>> C
tensor([0.8546, 0.7931, 0.5147, 0.7306])

Simple, isn't it?

Let's do a little exercise:

Now there is a Tensor, as follows:

>>> A=torch.tensor([[4,5,7], [3,9,8],[2,3,4]])
>>> A
tensor([[4, 5, 7],
        [3, 9, 8],
        [2, 3, 4]])

We want to extract the first element of the first row, the first and second elements of the second row, and the last element of the third row. How do we do it?

Obviously, it can be done through masked_select():

>>> B = torch.tensor([[1,0,0],[1,1,0],[0,0,1]])
>>> C = B==1
>>> C
tensor([[ True, False, False],
        [ True,  True, False],
        [False, False,  True]])
>>> D = torch.masked_select(A, C)
>>> D
tensor([4, 3, 9, 4])
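Alternatively, the Boolean mask can be written directly instead of being built from an integer matrix; a sketch continuing the session above:

>>> C = torch.tensor([[True, False, False],
...                   [True,  True, False],
...                   [False, False,  True]])
>>> torch.masked_select(A, C)
tensor([4, 3, 9, 4])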

Tensor has many more functions and usages, which are summarized in the table below. When using them, look up the relevant parameter lists flexibly as needed:

[Table: summary of common Tensor functions and their usage]

Original article: blog.csdn.net/qq_43300880/article/details/125092378