Instant-NGP manuscript

Instant-NGP is a paper published by NVIDIA in 2022. Its full title is "Instant Neural Graphics Primitives with a Multiresolution Hash Encoding".

What does this paper propose?

It proposes hash-encoding the input so that a very small network can learn a high-quality representation.

This figure, taken from the paper, shows that an image can be represented clearly after only a short period of training.

So how are these input features obtained? Through a multiresolution hash encoding, which is easiest to follow alongside its parameters:

L is the number of resolution levels. Only two levels are drawn in the figure, L=0 and L=1, but in practice there are 16 levels. Each level has its own hash table; the number of entries is T (on the order of 2^14 to 2^20), and each entry is a feature vector of dimension F, here F=2.

Nmin and Nmax are likewise fixed values in the code: the minimum (coarsest) and maximum (finest) grid resolutions. These two values determine how many cells the image is divided into at each level.
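As a sketch of how the per-level resolution follows from these values, assuming the paper's formula N_l = ⌊N_min · b^l⌋ with geometric growth factor b (the Nmin = 16 and Nmax = 512 defaults below are illustrative, not taken from this article):

```python
import math

def level_resolutions(n_min=16, n_max=512, num_levels=16):
    """Grid resolution at each level, N_l = floor(n_min * b**l)."""
    # (n_max/n_min) ** (l / (num_levels - 1)) equals the paper's growth
    # factor b raised to the l-th power, with b chosen so that the last
    # level lands exactly on n_max
    scale = n_max / n_min
    return [int(math.floor(n_min * scale ** (l / (num_levels - 1))))
            for l in range(num_levels)]

print(level_resolutions())  # coarsest level is n_min, finest is n_max
```

The resolutions grow geometrically, so a few coarse levels cover global structure while the fine levels capture detail.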

Now let's look at how the hash feature is obtained. First, we determine which grid cell the point falls into; at the two levels shown, the same point falls into two different cells.

At level L=1, the hash feature is computed as follows. First, hash the cell's four vertices, obtaining hash indices such as 0, 2, 3, 6. Then look up the corresponding feature vectors in that level's hash table, and finally bilinearly interpolate them to obtain the hash feature of the point.
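A minimal 2D sketch of this lookup. The per-dimension primes 1 and 2654435761 are the ones used by the paper's XOR spatial hash; the randomly initialized table stands in for the trainable parameters:

```python
import numpy as np

PRIMES = (1, 2654435761)  # per-dimension primes from the paper's spatial hash

def spatial_hash(ix, iy, table_size):
    # XOR the prime-multiplied integer coordinates, then wrap to the table
    return ((ix * PRIMES[0]) ^ (iy * PRIMES[1])) % table_size

def lookup_2d(x, y, table, resolution):
    # scale the point in [0, 1)^2 to grid coordinates at this level
    gx, gy = x * resolution, y * resolution
    x0, y0 = int(gx), int(gy)          # lower-left vertex of the cell
    tx, ty = gx - x0, gy - y0          # fractional position inside the cell
    # hash the four cell vertices and fetch their F-dimensional features
    f00 = table[spatial_hash(x0,     y0,     len(table))]
    f10 = table[spatial_hash(x0 + 1, y0,     len(table))]
    f01 = table[spatial_hash(x0,     y0 + 1, len(table))]
    f11 = table[spatial_hash(x0 + 1, y0 + 1, len(table))]
    # bilinear interpolation gives the point's feature at this level
    return (f00 * (1 - tx) * (1 - ty) + f10 * tx * (1 - ty)
            + f01 * (1 - tx) * ty + f11 * tx * ty)

rng = np.random.default_rng(0)
table = rng.standard_normal((2**14, 2)) * 1e-4  # T = 2^14 entries, F = 2
feat = lookup_2d(0.3, 0.7, table, resolution=16)
```

When the query lands exactly on a vertex, the interpolation reduces to a single table entry; in the real implementation the entries are trainable parameters updated by backpropagation.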

Each level yields a hash feature of dimension F=2, so after the full multiresolution hash encoding we obtain a 16×2 hash feature. Auxiliary features are concatenated to it, and the result is fed into the small hash network.
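Putting the pieces together, a rough sketch of the concatenation and the small network. The widths here (two hidden layers of 64, a 16-dimensional auxiliary input) and the random untrained weights are assumptions for illustration only:

```python
import numpy as np

L_LEVELS, F_DIM = 16, 2      # 16 levels x 2 features = 32-dim hash encoding
AUX_DIM = 16                 # illustrative width for the auxiliary inputs

rng = np.random.default_rng(0)

def tiny_mlp(x, hidden=64, out_dim=3):
    """A deliberately small ReLU MLP standing in for the hash network."""
    w1 = rng.standard_normal((x.shape[-1], hidden)) * 0.1
    w2 = rng.standard_normal((hidden, hidden)) * 0.1
    w3 = rng.standard_normal((hidden, out_dim)) * 0.1
    h = np.maximum(x @ w1, 0.0)
    h = np.maximum(h @ w2, 0.0)
    return h @ w3

hash_feature = rng.standard_normal(L_LEVELS * F_DIM)  # stand-in for the encoding
aux = rng.standard_normal(AUX_DIM)                    # e.g. encoded view direction
output = tiny_mlp(np.concatenate([hash_feature, aux]))
```

Because most of the representational capacity lives in the hash tables, the network itself can stay this small, which is where the training speed comes from.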

At a relatively coarse level, if the number of grid vertices is less than T (e.g. below 2^14), every vertex gets its own entry and no hash collisions occur.
At finer levels, such as the maximum resolution, the total number of grid vertices may exceed the table length, so different vertices can index to the same entry: a hash collision. However, the authors deliberately do not handle collisions in the code, because during training the neural network itself can resolve them,
relying on gradients to optimize the hash-table parameters.
When two points collide on the same entry: of the roughly n^3 samples in the volume, only the roughly n^2 samples on object surfaces carry meaningful values, so the gradients at surface points are valuable while the gradients at empty points are negligible. The surface point's gradient therefore dominates the shared entry and effectively resolves the conflict.

That is to say, points on the surface drive the network updates more strongly.
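A toy numeric illustration of this point (not the paper's code): one shared table entry receives gradients from a "surface" sample with a large loss weight and an "empty" sample with a near-zero weight, and plain gradient descent drives the entry toward the value the surface sample needs:

```python
# One hash-table entry f shared (collided) by two training points.
w_surface, t_surface = 1.0, 1.0   # surface sample: large gradient weight
w_empty, t_empty = 0.01, 0.0      # empty-space sample: tiny gradient weight

f, lr = 0.0, 0.1
for _ in range(500):
    # gradient of w_s*(f - t_s)^2 + w_e*(f - t_e)^2 w.r.t. f
    grad = 2 * w_surface * (f - t_surface) + 2 * w_empty * (f - t_empty)
    f -= lr * grad

# f converges to the weighted average, dominated by the surface sample
print(round(f, 4))  # close to 1/1.01 ≈ 0.9901
```

The entry ends up almost exactly where the surface sample wants it, which is the mechanism the paper relies on in place of explicit collision handling.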

For example, compare T = 2^14 with T = 2^19: the per-iteration time differs little, but quality is noticeably better. Raising T further to 2^21 makes the table much longer, so each iteration takes longer while quality improves further; the table size trades speed for quality.

Two of the plots correspond to two different datasets.

The other plot is the experiment on F, the feature dimension.

This table compares their method against three other methods across eight scenes of the dataset; a gold highlight marks the best result.

The main takeaway is that its training speed far exceeds that of the other methods.


Origin blog.csdn.net/pjm616/article/details/131301367