torch THTensorApply.h and Tensor computation

1. First of all, it is necessary to understand how arrays are stored in the C language (see the "c storage" link): a multi-dimensional array is laid out in memory as a single one-dimensional block.
2. Then it is necessary to understand how arrays are accessed. Torch's C layer checks whether the underlying memory is contiguous, and contiguous regions are accessed in bulk to save time. Whether the memory is contiguous is determined by two parameters: stride and size.
Consider two arrays a and b, where a is a view taken from b:
a[3][3][3][3][3][2][3] —> stride = {1296, 432, 144, 36, 12, 3, 1}, size = {3, 3, 3, 3, 3, 2, 3}
b[3][3][3][4][3][4][3] —> stride = {1296, 432, 144, 36, 12, 3, 1}, size = {3, 3, 3, 4, 3, 4, 3}
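To make the stride/size rule concrete, here is a minimal C sketch (illustrative only, not Torch's actual code; is_contiguous is an assumed name) that checks contiguity by accumulating the expected stride from the innermost dimension outwards, using the a and b above:

```c
#include <stdio.h>

/* A view is contiguous iff every stride equals the product of all sizes
 * to its right, i.e. the stride a plain C array of that shape would have. */
static int is_contiguous(const long *size, const long *stride, int ndim) {
    long expected = 1;
    for (int d = ndim - 1; d >= 0; d--) {
        if (stride[d] != expected)
            return 0;
        expected *= size[d];
    }
    return 1;
}

int main(void) {
    long size_a[7] = {3, 3, 3, 3, 3, 2, 3};
    long size_b[7] = {3, 3, 3, 4, 3, 4, 3};
    long stride[7] = {1296, 432, 144, 36, 12, 3, 1}; /* a shares b's strides */

    printf("a contiguous: %d\n", is_contiguous(size_a, stride, 7)); /* 0 */
    printf("b contiguous: %d\n", is_contiguous(size_b, stride, 7)); /* 1 */
    return 0;
}
```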

Notice that a's sizes break in the middle, at the two dimensions where b has extent 4. Arrays in the C language are stored in one-dimensional form (as can be seen from the first link), and a and b are still the same piece of memory; a simply leaves some positions in the middle unused. This makes a non-contiguous.

For access to non-contiguous arrays, Torch's C layer works block by block: for the a above, it merges the first three dimensions, the two middle dimensions, and the last two dimensions respectively, resulting in the table below.

          group 1   group 2   group 3
counter         0         0         0
sizes          27         9         6
strides       144        12         1

This table describes how the merged view of a maps onto memory. Array a is broken where its extents were cut down from b's (the dimensions of extent 3 and 2 that correspond to b's 4s), so the trailing [2][3] is used as one group, the middle [3][3] as another group, and the leading [3][3][3] as the first group. The strides, however, are still calculated the way they are for b, which matches how the memory is actually stored: storage is always one-dimensional.
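A minimal sketch of this grouping step (illustrative only, not the actual THTensorApply.h macro; merge_dims is an assumed name): fold a dimension into the current group whenever its stride equals the group's stride times the group's size, otherwise start a new group.

```c
#include <stdio.h>

/* Collapse runs of mutually contiguous dimensions into merged groups.
 * Groups are produced innermost-first. Returns the number of groups. */
static int merge_dims(const long *size, const long *stride, int ndim,
                      long *msize, long *mstride) {
    int g = 0;
    msize[g] = size[ndim - 1];
    mstride[g] = stride[ndim - 1];
    for (int d = ndim - 2; d >= 0; d--) {
        if (stride[d] == mstride[g] * msize[g]) {
            msize[g] *= size[d];          /* still contiguous: fold into group */
        } else {
            g++;                          /* break: start a new group */
            msize[g] = size[d];
            mstride[g] = stride[d];
        }
    }
    return g + 1;
}

int main(void) {
    long size[7]   = {3, 3, 3, 3, 3, 2, 3};
    long stride[7] = {1296, 432, 144, 36, 12, 3, 1};
    long msize[7], mstride[7];
    int n = merge_dims(size, stride, 7, msize, mstride);
    for (int g = n - 1; g >= 0; g--)      /* print outermost group first */
        printf("size %ld  stride %ld\n", msize[g], mstride[g]);
    /* prints: size 27 stride 144 / size 9 stride 12 / size 6 stride 1 */
    return 0;
}
```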
Describing the last three dimensions of a in memory:
000 001 002 010 011 012 xxxxxx 100 101 102 110 111 112 xxxxxx 200 201 202 210 211 212 xxxxxx
Each x marks an element of b that a skips, so a traversal of a must only touch the positions that are not x. Link 2 introduces how memory arrays can be accessed efficiently, and the principle used by the THTensorApply file is similar: merge consecutive contiguous dimensions together and reduce the number of inner loops.
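A minimal sketch of that counter-based traversal (again illustrative, not the real TH_TENSOR_APPLY macro; apply_fn and count_elem are assumed names): the inner loop runs straight through the contiguous innermost group, while an odometer-like counter advances the outer groups, so a needs only three merged loops instead of seven nested ones.

```c
#include <stdio.h>

static long visited = 0;
static void count_elem(float *x) { (void)x; visited++; }

/* Walk every element of a non-contiguous view described by merged
 * sizes/strides (innermost group first), assuming ngroups <= 8. */
static void apply_fn(float *data, const long *msize, const long *mstride,
                     int ngroups, void (*fn)(float *)) {
    long counter[8] = {0};                      /* one slot per outer group */
    long total_outer = 1;
    for (int g = 1; g < ngroups; g++)
        total_outer *= msize[g];

    for (long it = 0; it < total_outer; it++) {
        long offset = 0;                        /* start of the current block */
        for (int g = 1; g < ngroups; g++)
            offset += counter[g] * mstride[g];

        for (long i = 0; i < msize[0]; i++)     /* contiguous inner loop */
            fn(data + offset + i * mstride[0]);

        for (int g = 1; g < ngroups; g++) {     /* odometer-style increment */
            if (++counter[g] < msize[g]) break;
            counter[g] = 0;
        }
    }
}

int main(void) {
    static float data[3888];                    /* b's full storage: 3*3*3*4*3*4*3 */
    long msize[3]   = {6, 9, 27};               /* merged sizes, innermost first */
    long mstride[3] = {1, 12, 144};             /* merged strides, innermost first */
    apply_fn(data, msize, mstride, 3, count_elem);
    printf("visited %ld elements of a\n", visited); /* 1458 = 3*3*3*3*3*2*3 */
    return 0;
}
```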
