[1] PyTorch for deep learning - basic concepts such as tensor size, storage offset and stride


Basic concepts of deep learning

Deep learning is a learning method for artificial neural networks: it processes information by mimicking the way the human brain learns. A deep learning network has many layers, and each successive layer can learn more abstract concepts. This approach works well in speech recognition, computer vision, natural language processing, and other fields.

Deep learning has many applications, and they often involve taking data in one form (such as images or text) and producing data in another form (such as labels, numbers, or more text).

From this perspective, deep learning consists of building a system that transforms data from one representation to another. This transformation is driven by extracting commonalities from a series of samples that reflect the desired mapping.

The first step of this process is to convert the input to floating-point numbers. Since the network processes information as floating-point numbers, we need to encode real-world data into a form the network understands, and then decode the output back into a form we can understand and use for some purpose.
[Figure: real-world data is encoded as floating-point numbers, transformed through intermediate representations, and decoded back into a usable output]
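A minimal sketch of this encode/decode idea, assuming a made-up 28x28 8-bit grayscale image and a made-up 10-class score vector (neither comes from any real model):

import numpy as np
import torch

# a hypothetical 28x28 8-bit grayscale image (pixel values 0..255)
img = np.random.randint(0, 256, size=(28, 28), dtype=np.uint8)

# encode: convert the integer pixels to 32-bit floats in [0, 1] for the network
x = torch.from_numpy(img).float() / 255.0

# decode: suppose the network outputs a vector of 10 scores, one per class;
# argmax turns the floating-point output back into a human-readable label index
scores = torch.randn(10)        # stand-in for a network's output
label = int(scores.argmax())
print(x.dtype, x.shape, label)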

The transformation from one data form to another is usually learned layer by layer by a deep neural network, which means we can treat the data transformed between layers as a series of intermediate representations. Taking image recognition as an example, shallow representations can be features (such as detected edges) or textures (such as hair), and deeper representations can capture more complex structures (such as ears, noses, or eyes).

Typically, this intermediate representation is a collection of floating-point numbers that characterize the input and capture structure in the data, helping to describe how the input maps to the output of the neural network. Collections of these floating point numbers and their manipulations are at the heart of modern AI.

These intermediate representations (such as the one shown in the second step of the figure above) are the result of combining the input with the weights of the previous layer's neurons, and each intermediate representation is unique to its input.

Before starting to convert data into floating-point input, we must understand how PyTorch processes and stores data (inputs, intermediate representations, and outputs). This leads to PyTorch's fundamental data structure: the tensor.

Basic concepts of tensors

Tensors in PyTorch are multidimensional arrays, similar to NumPy arrays. The tensor is the fundamental data structure in PyTorch and can represent scalars, vectors, and matrices.
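A quick sketch of those three cases:

import torch

s = torch.tensor(3.14)             # scalar: a 0-dimensional tensor
v = torch.tensor([1.0, 2.0, 3.0])  # vector: a 1-dimensional tensor
m = torch.tensor([[1.0, 2.0],
                  [3.0, 4.0]])     # matrix: a 2-dimensional tensor

print(s.ndim, v.ndim, m.ndim)      # 0 1 2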

Advantages of tensors:

  • Supports GPU computing, which can accelerate model training.
  • Easy to use and extend: PyTorch provides rich APIs and tools for developers to create and use neural network models.
  • Provides automatic differentiation, which makes backpropagation easy to compute.

PyTorch is mainly used for deep learning and can be applied to image classification, speech recognition, natural language processing, and tasks in many other fields.

For those coming from mathematics, physics, or engineering, the word tensor is tied up with the concepts of space, frames of reference, and transformations between them. For everyone else, a tensor is simply a vector or matrix generalized to an arbitrary number of dimensions, as shown in the figure below. Another name for the same concept is multidimensional array.

The dimensionality of a tensor corresponds to the number of indices used to index a scalar value in the tensor.

[Figure: tensors generalize scalars, vectors, and matrices to an arbitrary number of dimensions]

For example, a two-dimensional tensor containing four numbers uses two indices (row and column) to locate a scalar value. The number of dimensions of this tensor therefore equals the number of indices needed to reach one of its scalar values: both are 2.
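A small sketch of this correspondence between dimensionality and the number of indices:

import torch

t = torch.tensor([[1.0, 2.0],
                  [3.0, 4.0]])  # a 2D tensor holding four numbers

print(t.ndim)          # 2: two indices are needed to reach a single value
print(t[1, 0])         # row index 1, column index 0 -> tensor(3.)
print(float(t[1, 0]))  # 3.0 as a plain Python float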

Compared with NumPy arrays, PyTorch tensors have some more powerful features, such as the ability to perform fast operations on the GPU, distribute operations across multiple devices or machines, and keep track of the computation graph that created them. All of these features are important for implementing a modern deep learning library.
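A hedged sketch of two of these features, GPU placement and computation-graph tracking (distributed execution needs a multi-process setup and is not shown here):

import torch

# run on the GPU only if one is available
device = 'cuda' if torch.cuda.is_available() else 'cpu'
a = torch.ones(3, device=device)

# requires_grad=True asks PyTorch to record operations on x for backpropagation
x = torch.tensor([2.0, 3.0], requires_grad=True)
y = (x * x).sum()
y.backward()               # populate x.grad by traversing the recorded graph
print(a.device, x.grad)    # x.grad is dy/dx = 2 * x = tensor([4., 6.])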

Basic operations on tensors

A tensor is an array: a data structure that stores a collection of numbers which can be accessed individually with a single index or with multiple indices. This is similar to how multidimensional NumPy arrays are indexed in Python. If you have a background in linear algebra, you already know what a matrix is and how its related operations work, so these concepts should feel familiar.

import torch

# create a one-dimensional tensor of size 3 filled with 1.0
# (device='cuda' places it on the GPU; drop the argument to stay on the CPU)
a = torch.ones(3, device='cuda')
print(a)
print(a[1])         # 0-based indexing returns a 0-dimensional tensor
print(float(a[1]))  # convert that element to a Python float
a[2] = 2.0          # assign a new value through the index
print(a)

[Output of the code above]
After importing the torch module, we call a function that creates a one-dimensional tensor of size 3 and fills it with the value 1.0. You can access elements using a 0-based index, and you can assign new values to them.

Although on the surface this example doesn't look much different from a Python list, it is actually quite different. A Python list or tuple of numbers is a collection of Python objects that are individually allocated in memory, as shown on the left in the figure below. A PyTorch tensor or NumPy array, however, is (usually) a view over a contiguous block of memory holding unboxed C numeric types, in this case 32-bit floats (4 bytes each), not Python objects, as shown on the right of the figure below. Thus, a 1D tensor containing 1 million floats requires 4 million contiguous bytes of storage, plus a small overhead for metadata (size, data type, and so on).

[Figure: a Python list of individually allocated (boxed) objects on the left versus a tensor/array view over contiguous, unboxed C floats on the right]
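A rough check of the byte count claimed above (sizes reported by PyTorch itself; the small metadata overhead is not measured here):

import torch

t = torch.ones(1_000_000)               # one million 32-bit floats
print(t.element_size())                 # 4 bytes per element
print(t.element_size() * t.nelement())  # 4000000 bytes of raw data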

# using torch.zeros is one way to get an array of the appropriate size
points = torch.zeros(6)
print(points)
# overwrite those zeros with the desired values
points[0] = 1.0
points[1] = 4.0
points[2] = 2.0
points[3] = 1.0
points[4] = 3.0
points[5] = 5.0
print(points)
points = torch.tensor([1.0, 4.0, 2.0, 1.0, 3.0, 5.0])
print(points)
# get the coordinates of the first point
print(float(points[0]), float(points[1]))

[Output of the code above]

# create a 5x5 tensor filled with ones
point = torch.ones(5, 5)
print(point.shape)   # get its size
point[0][0] = 2.0
# use indexing to access the tensor
print(point[0][0])   # get the element in the first row and first column

[Output of the code above]

>>> points = torch.FloatTensor([[1.0, 4.0], [2.0, 1.0], [3.0, 5.0]])
>>> points
tensor([[1., 4.],
        [2., 1.],
        [3., 5.]])
>>> points[0, 1]
tensor(4.)
>>> points[0]
tensor([1., 4.])

Note that the output is another tensor: a 1D tensor of size 2 containing the values in the first row of points. Does this output copy the values into a newly allocated block of memory and wrap the new memory in a new tensor object? No, because that would not be efficient, especially with millions of points. Instead, the output is a view of the same underlying data, restricted to the first row.
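One way to convince yourself of this is a sketch using data_ptr(), which returns the memory address of a tensor's first element:

import torch

points = torch.FloatTensor([[1.0, 4.0], [2.0, 1.0], [3.0, 5.0]])
row = points[0]

# both tensors start at the same address, so no data was copied
print(row.data_ptr() == points.data_ptr())        # True

# points[1] starts 2 floats (8 bytes) further into the same memory block
print(points[1].data_ptr() - points.data_ptr())   # 8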

Tensors and storage

Values are allocated in contiguous blocks of memory managed by torch.Storage instances. A storage is a one-dimensional array of numeric data: a contiguous block of memory holding numbers of a given type (for example float or int32). A PyTorch Tensor is a view over such a storage, which it indexes using an offset and per-dimension strides.

Multiple tensors can index the same storage even if they index it differently, as shown in the figure below. In fact, when you got points[0] in the last code snippet of the previous section, you got another tensor that indexes the same storage as points; it just does not cover that storage in its entirety, and it has a different dimensionality (1D versus 2D). Since the underlying memory is allocated only once, regardless of the size of the data managed by the Storage instance, different tensor views onto that data can be created quickly.

[Figure: several tensors acting as different views over the same underlying storage]

points = torch.tensor([[1.0, 4.0], [2.0, 1.0], [3.0, 5.0]])
print(points.storage())
points_storage = points.storage()
print(points_storage[0])
print(points.storage()[1])

[Output of the code above]
The storage of a two-dimensional tensor cannot be indexed with two indices: a storage is always one-dimensional, independent of the dimensionality of any tensor that refers to it.

Consequently, changing a value in the storage also changes the content of the tensor that refers to it:

points = torch.tensor([[1.0, 4.0], [2.0, 1.0], [3.0, 5.0]])
points_storage = points.storage()
points_storage[0] = 100.0  # reassign the value through the storage
print(points)
print(points)

[Output of the code above]

Size, storage offset and stride

In addition to its storage, a tensor relies on a few pieces of information that, together, unambiguously define it: its size, its storage offset, and its stride.

  • The size (or, in NumPy parlance, the shape) is a tuple indicating how many elements the tensor has along each dimension.
  • The storage offset is the index in the storage corresponding to the first element of the tensor.
  • The stride is the number of elements in the storage that must be skipped to move to the next element along each dimension.

[Figure: the relationship between a tensor's size, storage offset, and stride relative to its storage]

points = torch.tensor([[1.0, 4.0], [2.0, 1.0], [3.0, 5.0]])
second_point = points[2]
print(second_point.storage_offset())  # prints 4

Why 4? Because points[2] selects the row [3.0, 5.0], and 3.0 sits at index 4 in the underlying storage (the two rows before it occupy storage indices 0 through 3).

points = torch.tensor([[1.0, 4.0], [2.0, 1.0], [3.0, 5.0]])
print(points)
second_point = points[2]
print(second_point.storage_offset())  # prints 4
print(second_point)
print(points.size())
print(second_point.size())  # prints torch.Size([2]): 2 elements, not 2 rows
# second_point is a one-dimensional tensor containing 2 elements,
# whereas points contains 3 elements (rows), each of which itself holds 2 elements

The tensor's size information is the same as that contained in the tensor object's shape attribute.

The stride is a tuple giving the number of elements in the storage that must be skipped when the index is incremented by 1 in each dimension. In the example above, the stride of the points tensor is (2, 1).

Accessing a 2D tensor with indices i and j is therefore equivalent to accessing the element at storage_offset + stride[0] * i + stride[1] * j in the storage. The offset is usually zero, but it may be positive if the tensor is a view over a storage that holds a larger tensor.
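A small sketch that verifies this formula directly against the storage, picking the element points[2, 1] as an example:

import torch

points = torch.tensor([[1.0, 4.0], [2.0, 1.0], [3.0, 5.0]])
storage = points.storage()
offset = points.storage_offset()   # 0 here
stride = points.stride()           # (2, 1) for this layout

i, j = 2, 1                        # points[2, 1] == 5.0
flat_index = offset + stride[0] * i + stride[1] * j
print(points[i, j].item(), storage[flat_index])   # both print 5.0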

This indirection between the Tensor and the Storage makes some operations, such as transposing or extracting a subtensor, cheap: they do not require reallocating memory, but merely allocate a new Tensor object with a different size, storage offset, or stride.

We have just seen a subtensor being extracted by indexing a specific point, and we have seen the storage offset increase. Now let's see what happens to the size and stride:

points = torch.tensor([[1.0, 4.0], [2.0, 1.0], [3.0, 5.0]])
second_point = points[1]
print(second_point.size())
print(second_point.storage_offset())
print(second_point.stride())

[Output of the code above]
The result is that the subtensor has one fewer dimension (as we would expect), while still indexing the same storage as the original points tensor.

Let's look at how stride() is computed. Going back to the definition: the stride is the number of elements in the storage that must be skipped to get to the next element along each dimension.

The official documentation describes it like this: stride is the step necessary to jump from one element to the next in the specified dimension dim. When no argument is passed, a tuple of all strides is returned; otherwise, an integer is returned as the stride in the specified dimension dim.

>>> b
tensor([[0, 1, 2],
        [3, 9, 5]])
>>> b.stride()
(3, 1)
>>> b.stride(0)
3
>>> b.stride(1)
1

The 3 means that, along dimension 0, jumping from one row [0, 1, 2] to the next row [3, 9, 5] requires skipping 3 elements in storage; equivalently, going from b[0][j] to b[1][j] moves 3 positions forward in the storage. The 1 means that, along dimension 1, going from one element (say 0) to the next (say 1) within the row [0, 1, 2] requires skipping just 1 element.

Changing the subtensor also affects the original tensor. When we use an index to modify a particular value, we usually do not want the change to propagate back to the original tensor; in that case, we can clone the subtensor first.

clone operation

points = torch.tensor([[1.0, 4.0], [2.0, 1.0], [3.0, 5.0]])
second_point = points[1]
second_point[0] = 10.0
print(points)   # the change shows up in points as well, since second_point is a view

points = torch.tensor([[1.0, 4.0], [2.0, 1.0], [3.0, 5.0]])
second_point = points[1].clone()  # clone copies the data into a new tensor with its own storage
second_point[0] = 10.0
print(points)   # points is unchanged this time

transpose operation

points = torch.tensor([[1.0, 4.0], [2.0, 1.0], [3.0, 5.0]])
print(points)

points_t = points.t()
print(points_t)

# verify that the two tensors share the same storage:
print(id(points.storage()) == id(points_t.storage()))

# they differ only in size and stride
print(points.stride())
# in the original layout, the gap from 1.0 to 4.0 (next column) is 1 element and the gap
# from 1.0 to 2.0 (next row) is 2 elements; these are storage indices, before any transpose
print(points_t.stride())
print(points_t.storage())  # despite the transpose, the order of elements in storage is unchanged

[Output of the code above]
The above result tells us that incrementing the first index in points by 1 (i.e., going from points[0,0] to points[1,0]) skips two elements along the storage, while incrementing the second index (from points[0,0] to points[0,1]) skips one element along the storage.

In other words, storage stores the elements of the points tensor row by row.

You can transpose points into points_t, as shown in the figure below: the transpose changes the order of the elements in the stride. Afterwards, incrementing the row (the first index of the tensor) skips 1 element along the storage, just as points did when moving along a column, which is exactly the definition of a transpose. No new memory is allocated in this process: the transpose is achieved simply by creating a new Tensor instance whose stride ordering differs from that of the original tensor.

[Figure: points and its transpose points_t share one storage; only the strides are swapped]
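A sketch of that idea using as_strided, which builds a view with an explicitly chosen size and stride (shown only to illustrate the mechanism; in practice you would simply call .t()):

import torch

points = torch.tensor([[1.0, 4.0], [2.0, 1.0], [3.0, 5.0]])
points_t = points.t()

# build the transpose "by hand": same storage, with size and stride swapped
manual_t = points.as_strided(size=(2, 3), stride=(1, 2))

print(torch.equal(manual_t, points_t))           # True: same values, same shape
print(manual_t.data_ptr() == points.data_ptr())  # True: no data was copied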

Transposing in PyTorch is not limited to matrices (i.e., two-dimensional arrays). You can transpose a multidimensional array by specifying the two dimensions whose size and stride should be swapped, as in the following 3D example:

some_tensor = torch.ones(3, 4, 5)
print(some_tensor)
a = some_tensor.shape, some_tensor.stride()
print(a)
some_tensor_t = some_tensor.transpose(0, 2)  # swap dimension 0 and dimension 2
print(some_tensor_t)
b = some_tensor_t.shape, some_tensor_t.stride()
print(b)
import torch
a = torch.Tensor([[[1, 2, 3], [2, 3, 4]], [[3, 4, 5], [4, 5, 6]]])
b = a.transpose(1, 2)
c = a.transpose(2, 1)
print(a.shape)
print(b.shape)
print(c.shape)

print(a)
print(b)
print(c)
Output:
torch.Size([2, 2, 3])
torch.Size([2, 3, 2])
torch.Size([2, 3, 2])
tensor([[[1., 2., 3.],
         [2., 3., 4.]],
 
        [[3., 4., 5.],
         [4., 5., 6.]]])
tensor([[[1., 2.],
         [2., 3.],
         [3., 4.]],
 
        [[3., 4.],
         [4., 5.],
         [5., 6.]]])
tensor([[[1., 2.],
         [2., 3.],
         [3., 4.]],
 
        [[3., 4.],
         [4., 5.],
         [5., 6.]]])

If this is not entirely clear: transpose(1, 2) and transpose(2, 1) are equivalent; both swap the two dimensions. What does swapping dimensions mean here? Take a simple example:

>>> x = torch.randn(2, 3)
>>> x
tensor([[ 1.0028, -0.9893,  0.5809],
        [-0.1669,  0.7299,  0.4942]])

>>> torch.transpose(x, 0, 1)
tensor([[ 1.0028, -0.1669],
        [-0.9893,  0.7299],
        [ 0.5809,  0.4942]])

We can see that the initial tensor is an array containing 2 rows, each of which contains 3 numbers. After the dimensions are swapped, the outer array contains 3 rows, and each of these rows contains 2 numbers.

contiguous method

A tensor whose values are laid out in storage starting from the rightmost dimension (for example, a 2D tensor stored in storage row by row) is called a contiguous tensor. Contiguous tensors are convenient because you can visit their elements efficiently and in order, instead of jumping around in the storage.

(Improving data locality improves performance because of how memory access works in modern CPUs; in other words, contiguous tensors benefit from the principle of locality.)

The contiguous method produces a new contiguous tensor from a non-contiguous one. The content of the tensor stays the same, but the stride changes, and so does the storage:

>>> points.is_contiguous(), points_t.is_contiguous()
(True, False)

You can get a new contiguous tensor from a non-contiguous one using the contiguous method; the content stays the same, while the stride changes, and so does the storage. First, here is the non-contiguous transposed tensor together with its storage and stride:

>>> points = torch.tensor([[1.0, 4.0], [2.0, 1.0], [3.0, 5.0]])
>>> points_t = points.t()
>>> points_t
tensor([[1., 2., 3.],
        [4., 1., 5.]])
>>> points_t.storage()
 1.0
 4.0
 2.0
 1.0
 3.0
 5.0
[torch.FloatStorage of size 6]
>>> points_t.stride()
(1, 2)

Now change the situation by calling contiguous: the content remains the same, but the stride and the storage change:

>>> points_t_cont = points_t.contiguous()
>>> points_t_cont
tensor([[1., 2., 3.],
        [4., 1., 5.]])
>>> points_t_cont.stride()
(3, 1)
>>> points_t_cont.storage()
 1.0
 2.0
 3.0
 4.0
 1.0
 5.0
[torch.FloatStorage of size 6]

The new storage reorganizes the elements so that the tensor is stored row by row, and the stride has been changed to reflect the new layout.

[Figure: contiguous() copies the transposed tensor's elements into a new, row-major storage]
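One practical consequence, shown as a sketch (the exact error text may vary between PyTorch versions): view() needs a layout compatible with a simple reshape, so it fails on the transposed tensor but succeeds after contiguous(); reshape() would perform the copy for you when needed.

import torch

points = torch.tensor([[1.0, 4.0], [2.0, 1.0], [3.0, 5.0]])
points_t = points.t()

try:
    flat = points_t.view(-1)   # the transposed strides are not compatible with view
except RuntimeError as err:
    print('view failed:', err)

flat = points_t.contiguous().view(-1)  # works after copying into a contiguous layout
print(flat)                            # tensor([1., 2., 3., 4., 1., 5.])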

A closing word

Maybe time will dilute everything, but remember the good things and keep moving forward.
